68.4 Weak References: weakref Module
Right, let’s talk about weak references. You’re probably used to the idea that when you assign an object to a variable, you’re creating a strong reference. You’re essentially telling the garbage collector (GC), “Hey, I’m using this, hands off!” A weak reference, on the other hand, is like telling the GC, “I’d like to know where this object is, but if you need to clean it up, go right ahead. Don’t mind me.”
Think of it as keeping the address of a bakery written on a sticky note instead of chaining yourself to the front door. If the bakery goes out of business (gets garbage collected), your sticky note becomes useless, but you weren’t the reason it stayed open.
Why You’d Even Bother
The primary use case is for creating caches. You want to keep a bunch of large objects in memory only as long as something else in your program is actually using them. The moment the main program says, “I’m done with this,” you want your cache to gracefully let it go and get collected, rather than stubbornly keeping it alive and causing a memory leak. Mappings that hold large objects are the classic example.
The Basic Incantation: weakref.ref
The core tool is weakref.ref. You create a weak reference to an object. To get the actual object back, you call the reference. If the object is still alive, you get it. If it’s been collected, you get None. Simple, but a bit clunky.
import weakref
class ExpensiveObject:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(f"{self.name} is getting shredded by the GC. Farewell!")
# Create our expensive thing
obj = ExpensiveObject("My Amazing Data")
# Create a weak reference to it
weak_obj = weakref.ref(obj)
# Right now, the object is alive
print(f"Alive: {weak_obj()}") # Output: Alive: <__main__.ExpensiveObject object at 0x...>
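# Now, let's delete the strong reference
del obj
# Output: My Amazing Data is getting shredded by the GC. Farewell!
# In CPython, reference counting frees the object right here, the moment the
# last strong reference goes away. The explicit collect() below only matters
# on implementations without reference counting (PyPy, for example).
import gc; gc.collect()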
# Our weak reference now points to nothing
print(f"Dead: {weak_obj()}") # Output: Dead: None
See? The __del__ method ran, proving the object was destroyed, and our weak reference politely returned None.
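The clunky part is that every use of a weak reference needs a liveness check. A minimal sketch of the usual idiom follows: call the reference exactly once, store the result in a local variable, and test that for None. The Document class and the describe helper are hypothetical names invented for this illustration.

```python
import gc
import weakref


class Document:
    """Hypothetical stand-in for an expensive object."""

    def __init__(self, title):
        self.title = title


def describe(ref):
    # Dereference exactly once and keep the result in a local variable.
    # The local is a strong reference, so the object cannot vanish
    # between the None check and the attribute access.
    target = ref()
    if target is None:
        return "gone"
    return f"alive: {target.title}"


doc = Document("Quarterly Report")
doc_ref = weakref.ref(doc)

before = describe(doc_ref)

del doc        # last strong reference goes away
gc.collect()   # only needed on implementations without reference counting

after = describe(doc_ref)

print(before)
print(after)
```

Writing `if ref() is not None: ref().title` instead would call the reference twice, and the object could in principle disappear between the two calls.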
The Useful Stuff: WeakValueDictionary and Friends
Manually calling weakref.ref everywhere is for masochists. In practice, you’ll almost always use one of the weak collection types, and WeakValueDictionary is the superstar. It’s a dict-like mapping that holds weak references to its values. The keys are strong references, but once the last strong reference to a value disappears, the whole entry (key included) is dropped from the dictionary.
class Image:
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return f"Image({self.name})"
# Our in-memory cache of loaded images
cache = weakref.WeakValueDictionary()
def get_image(name):
    # Try to get the image from the cache
    image = cache.get(name)
    if image is None:
        # Simulate the expensive operation of loading it
        print(f"Cache MISS! Loading '{name}' from disk (this is expensive).")
        image = Image(name)
        cache[name] = image # Store a WEAK reference
    else:
        print(f"Cache HIT! Found '{name}'.")
    return image
# Let's use it
img_1 = get_image("cat.jpg") # Cache MISS! ...
img_2 = get_image("cat.jpg") # Cache HIT! Found 'cat.jpg'.
# These are the same object, which is good
print(img_1 is img_2) # Output: True
# Now, the magic. Delete all strong references...
del img_1
del img_2
# CPython has already freed the image by now; collect() makes the demo portable
gc.collect()
# The cache is now empty! The entry was automatically removed.
print(f"Cache contents: {list(cache.keys())}") # Output: Cache contents: []
This is beautiful. The cache did its job without ever becoming a memory hog. The designers got this one right.
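As for the "friends": WeakKeyDictionary flips the arrangement, holding its keys weakly and its values strongly (WeakSet is the set-shaped sibling). Here is a rough sketch of one common use, attaching extra data to objects without touching their class. The Widget class and the click_counts mapping are made up for the example.

```python
import gc
import weakref


class Widget:
    """Hypothetical object we want to annotate from the outside."""

    def __init__(self, name):
        self.name = name


# Keys are held weakly, values strongly.
click_counts = weakref.WeakKeyDictionary()

button = Widget("ok_button")
click_counts[button] = 0
click_counts[button] += 3

count_while_alive = click_counts[button]
size_while_alive = len(click_counts)

# Drop the only strong reference to the key...
del button
gc.collect()

# ...and the entry disappears along with it.
size_after = len(click_counts)

print(count_while_alive, size_while_alive, size_after)
```

One thing to watch: if a value holds a strong reference back to its own key, the key can never die, and the entry stays forever.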
The Gotchas (Because Of Course There Are Gotchas)
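Not everything can be weakly referenced. The most glaring exceptions are list and dict: you can’t create a weakref.ref to a plain list. Why? Supporting weak references costs every instance an extra pointer-sized slot (__weakref__), and the usual explanation is that CPython leaves it off these ubiquitous built-in types to keep them small. Instances of ordinary Python classes get the slot by default, so if you need a weak reference to a container, subclass it. It’s annoying, but it works for list and dict. In CPython it does not work for tuple or int, which can’t be weakly referenced even through a subclass.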
my_list = [1, 2, 3]
try:
    weak_list = weakref.ref(my_list)
except TypeError as e:
    print(f"Told you so: {e}") # Output: Told you so: cannot create weak reference to 'list' object
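The subclassing workaround is short enough to show in full. In this sketch, WeakableList is a hypothetical name; the subclass adds no behaviour at all and exists only so that its instances can be weakly referenced.

```python
import gc
import weakref


class WeakableList(list):
    """Hypothetical empty subclass; its instances support weak references."""


items = WeakableList([1, 2, 3])
items_ref = weakref.ref(items)   # no TypeError this time

# It still behaves like a regular list.
still_a_list = items_ref() == [1, 2, 3]

del items
gc.collect()

gone = items_ref() is None

print(still_a_list, gone)
```

Note that a subclass that defines __slots__ loses this ability again unless '__weakref__' is listed among the slots.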
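Also, don’t expect weakrefs to fix reference cycles you already have. A weak reference doesn’t break an existing cycle; it just avoids creating a new one. If object A has a strong reference to object B, and object B has a weak reference back to A, that’s fine: A is freed as soon as its last outside reference goes away. But if they both hold strong references to each other, a weakref somewhere else is irrelevant. Reference counting alone will never free the pair, so they linger until the cyclic garbage collector gets around to them, and if that collector is disabled you’ve got a genuine leak.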
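The "strong one way, weak the other" arrangement is the classic shape for parent/child structures. A minimal sketch, assuming a hypothetical Node class in which parents own their children strongly and children point back weakly:

```python
import gc
import weakref


class Node:
    """Hypothetical tree node: strong links down, weak link up."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []         # strong references, parent -> child
        self._parent_ref = None    # weak reference, child -> parent
        if parent is not None:
            self._parent_ref = weakref.ref(parent)
            parent.children.append(self)

    @property
    def parent(self):
        # None for the root, or if the parent has already been collected.
        if self._parent_ref is None:
            return None
        return self._parent_ref()


root = Node("root")
leaf = Node("leaf", parent=root)
parent_name = leaf.parent.name

root_probe = weakref.ref(root)   # lets us observe when root dies

del root       # the child's back-reference does not keep root alive
gc.collect()   # only needed on implementations without reference counting

root_gone = root_probe() is None
leaf_parent = leaf.parent

print(parent_name, root_gone, leaf_parent)
```

Because the upward link is weak, holding on to a leaf never pins the rest of the tree in memory.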
The Final Boss: weakref.finalize
This one is for when you need a callback, a last rites ceremony for your object. weakref.finalize lets you register a function to be called when the object is garbage collected. This is incredibly useful for cleaning up external resources (like temporary files or network connections) that the Python GC knows nothing about.
def cleanup_callback(resource_id):
    print(f"[CLEANUP] Freeing external resource: {resource_id}")
class ResourceHolder:
    def __init__(self, resource_id):
        self.resource_id = resource_id
        # Set up the finalizer to call our callback when 'self' is collected
        self._finalizer = weakref.finalize(self, cleanup_callback, resource_id)
# Let's see it in action
holder = ResourceHolder("TempFile_123")
# ... use the resource ...
del holder # The object becomes collectable
gc.collect()
# Output: [CLEANUP] Freeing external resource: TempFile_123
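The beauty of finalize is that it holds only a weak reference to the object, so it doesn’t prevent the object from being collected. That only holds if the callback and its arguments don’t reference the object themselves: pass resource_id, not self or a bound method, or the object will never be collected. It’s the right way to do this. Don’t rely on __del__ for resource cleanup; it isn’t guaranteed to run for objects still alive at interpreter shutdown, and the module globals it depends on may already be gone by then. finalize is more predictable: by default, a finalizer that is still alive also gets called when the interpreter exits.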
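A finalizer object is also callable, and it runs its callback at most once, which makes it a convenient backbone for an explicit close() method. The sketch below assumes a hypothetical Connection class and a stand-in release function that just records what it freed; only weakref.finalize, calling the finalizer, and its alive attribute are real API.

```python
import weakref

released = []  # records freed resources, purely for demonstration


def release(resource_id):
    # Stand-in for real cleanup work (closing a socket, deleting a file).
    released.append(resource_id)


class Connection:
    """Hypothetical resource wrapper with an explicit close()."""

    def __init__(self, resource_id):
        self.resource_id = resource_id
        # Pass only the data the callback needs, never self or a bound method.
        self._finalizer = weakref.finalize(self, release, resource_id)

    def close(self):
        # Calling the finalizer runs the callback now and marks it dead.
        self._finalizer()

    @property
    def closed(self):
        return not self._finalizer.alive


conn = Connection("socket_42")
closed_before = conn.closed

conn.close()
conn.close()   # second call is a no-op: the finalizer is already dead
closed_after = conn.closed

del conn       # nothing more happens; cleanup already ran exactly once

print(closed_before, closed_after, released)
```

This gives callers deterministic cleanup when they remember to call close(), with garbage collection as the safety net when they don’t.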
So, to sum up: use WeakValueDictionary for caches, remember you can’t weakly ref basic lists and dicts, and use finalize for cleanup instead of __del__. Do this, and you’ll manage memory like a pro, not a barbarian.