2.7 weakref—Impermanent References to Objects

Purpose: Refer to an "expensive" object, but allow its memory to be reclaimed by the garbage collector if there are no other nonweak references.

Python Version: 2.1 and later

The weakref module supports weak references to objects. A normal reference increments the reference count on the object and prevents it from being garbage collected. This is not always desirable, either when a circular reference might be present or when building a cache of objects that should be deleted when memory is needed. A weak reference is a handle to an object that does not keep it from being cleaned up automatically.

2.7.1 References

Weak references to objects are managed through the ref class. To retrieve the original object, call the reference object.

    import weakref

    class ExpensiveObject(object):
        def __del__(self):
            print '(Deleting %s)' % self

    obj = ExpensiveObject()
    r = weakref.ref(obj)

    print 'obj:', obj
    print 'ref:', r
    print 'r():', r()

    print 'deleting obj'
    del obj
    print 'r():', r()

In this case, since obj is deleted before the second call to the reference, the ref returns None.

    $ python weakref_ref.py
    obj: <__main__.ExpensiveObject object at 0x100da5750>
    ref:
    r(): <__main__.ExpensiveObject object at 0x100da5750>
    deleting obj
    (Deleting <__main__.ExpensiveObject object at 0x100da5750>)
    r(): None

2.7.2 Reference Callbacks

The ref constructor accepts an optional callback function to invoke when the referenced object is deleted.

    import weakref

    class ExpensiveObject(object):
        def __del__(self):
            print '(Deleting %s)' % self

    def callback(reference):
        """Invoked when referenced object is deleted"""
        print 'callback(', reference, ')'

    obj = ExpensiveObject()
    r = weakref.ref(obj, callback)

    print 'obj:', obj
    print 'ref:', r
    print 'r():', r()

    print 'deleting obj'
    del obj
    print 'r():', r()

The callback receives the reference object as an argument after the reference is "dead" and no longer refers to the original object.
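As a minimal sketch of the callback behavior just described, the following variation records what the callback sees instead of printing it. It is written so that it also runs under Python 3, and it assumes CPython's reference counting, so the object is reclaimed as soon as the last regular reference goes away. The events list is a hypothetical name introduced only for this illustration.

```python
import gc
import weakref


class ExpensiveObject(object):
    pass


# Hypothetical helper list: one entry per callback invocation.
events = []


def callback(reference):
    # The reference is already dead here, so calling it returns None.
    events.append(reference() is None)


obj = ExpensiveObject()
r = weakref.ref(obj, callback)

del obj        # drop the only regular reference
gc.collect()   # not needed on CPython, but harmless

print(events)
```

The weak reference object r has to stay alive for the callback to fire; if r itself is discarded first, the callback is never invoked.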
One use for this feature is to remove the weak reference object from a cache.

    $ python weakref_ref_callback.py
    obj: <__main__.ExpensiveObject object at 0x100da1950>
    ref:
    r(): <__main__.ExpensiveObject object at 0x100da1950>
    deleting obj
    callback( )
    (Deleting <__main__.ExpensiveObject object at 0x100da1950>)
    r(): None

2.7.3 Proxies

It is sometimes more convenient to use a proxy, rather than a weak reference. Proxies can be used as though they were the original object and do not need to be called before the object is accessible. That means they can be passed to a library that does not know it is receiving a reference instead of the real object.

    import weakref

    class ExpensiveObject(object):
        def __init__(self, name):
            self.name = name
        def __del__(self):
            print '(Deleting %s)' % self

    obj = ExpensiveObject('My Object')
    r = weakref.ref(obj)
    p = weakref.proxy(obj)

    print 'via obj:', obj.name
    print 'via ref:', r().name
    print 'via proxy:', p.name
    del obj
    print 'via proxy:', p.name

If the proxy is accessed after the referent object is removed, a ReferenceError exception is raised.

    $ python weakref_proxy.py
    via obj: My Object
    via ref: My Object
    via proxy: My Object
    (Deleting <__main__.ExpensiveObject object at 0x100da27d0>)
    via proxy:
    Traceback (most recent call last):
      File "weakref_proxy.py", line 26, in <module>
        print 'via proxy:', p.name
    ReferenceError: weakly-referenced object no longer exists

2.7.4 Cyclic References

One use for weak references is to allow cyclic references without preventing garbage collection. This example illustrates the difference between using regular objects and proxies when a graph includes a cycle.

The Graph class in weakref_graph.py accepts any object given to it as the "next" node in the sequence. For the sake of brevity, this implementation supports a single outgoing reference from each node, which is of limited use generally, but makes it easy to create cycles for these examples.
The function demo() is a utility function to exercise the Graph class by creating a cycle and then removing various references.

    import gc
    from pprint import pprint
    import weakref

    class Graph(object):
        def __init__(self, name):
            self.name = name
            self.other = None
        def set_next(self, other):
            print '%s.set_next(%r)' % (self.name, other)
            self.other = other
        def all_nodes(self):
            "Generate the nodes in the graph sequence."
            yield self
            n = self.other
            while n and n.name != self.name:
                yield n
                n = n.other
            if n is self:
                yield n
            return
        def __str__(self):
            return '->'.join(n.name for n in self.all_nodes())
        def __repr__(self):
            return '<%s at 0x%x name=%s>' % (self.__class__.__name__,
                                             id(self), self.name)
        def __del__(self):
            print '(Deleting %s)' % self.name
            self.set_next(None)

    def collect_and_show_garbage():
        "Show what garbage is present."
        print 'Collecting...'
        n = gc.collect()
        print 'Unreachable objects:', n
        print 'Garbage:',
        pprint(gc.garbage)

    def demo(graph_factory):
        print 'Set up graph:'
        one = graph_factory('one')
        two = graph_factory('two')
        three = graph_factory('three')
        one.set_next(two)
        two.set_next(three)
        three.set_next(one)

        print
        print 'Graph:'
        print str(one)
        collect_and_show_garbage()

        print
        three = None
        two = None
        print 'After 2 references removed:'
        print str(one)
        collect_and_show_garbage()

        print
        print 'Removing last reference:'
        one = None
        collect_and_show_garbage()

This example uses the gc module to help debug the leak. The DEBUG_LEAK flag causes gc to print information about objects that cannot be seen, other than through the reference the garbage collector has to them.
    import gc
    from pprint import pprint
    import weakref

    from weakref_graph import Graph, demo, collect_and_show_garbage

    gc.set_debug(gc.DEBUG_LEAK)

    print 'Setting up the cycle'
    print
    demo(Graph)

    print
    print 'Breaking the cycle and cleaning up garbage'
    print
    gc.garbage[0].set_next(None)
    while gc.garbage:
        del gc.garbage[0]
    print
    collect_and_show_garbage()

Even after deleting the local references to the Graph instances in demo(), the graphs all show up in the garbage list and cannot be collected. Several dictionaries are also found in the garbage list. They are the __dict__ values from the Graph instances and contain the attributes for those objects. The graphs can be forcibly deleted, since the program knows what they are.

Enabling unbuffered I/O by passing the -u option to the interpreter ensures that the output from the print statements in this example program (written to standard output) and the debug output from gc (written to standard error) are interleaved correctly.

    $ python -u weakref_cycle.py
    Setting up the cycle

    Set up graph:
    one.set_next()
    two.set_next()
    three.set_next()

    Graph:
    one->two->three->one
    Collecting...
    Unreachable objects: 0
    Garbage:[]

    After 2 references removed:
    one->two->three->one
    Collecting...
    Unreachable objects: 0
    Garbage:[]

    Removing last reference:
    Collecting...
    gc: uncollectable
    gc: uncollectable
    gc: uncollectable
    gc: uncollectable
    gc: uncollectable
    gc: uncollectable
    Unreachable objects: 6
    Garbage:[, , , {'name': 'one', 'other': }, {'name': 'two', 'other': }, {'name': 'three', 'other': }]

    Breaking the cycle and cleaning up garbage

    one.set_next(None)
    (Deleting two)
    two.set_next(None)
    (Deleting three)
    three.set_next(None)
    (Deleting one)
    one.set_next(None)

    Collecting...
    Unreachable objects: 0
    Garbage:[]

The next step is to create a more intelligent WeakGraph class that knows how to avoid creating cycles with regular references by using weak references when a cycle is detected.
    import gc
    from pprint import pprint
    import weakref

    from weakref_graph import Graph, demo

    class WeakGraph(Graph):
        def set_next(self, other):
            if other is not None:
                # See if we should replace the reference
                # to other with a weakref.
                if self in other.all_nodes():
                    other = weakref.proxy(other)
            super(WeakGraph, self).set_next(other)
            return

    demo(WeakGraph)

Since the WeakGraph instances use proxies to refer to objects that have already been seen, the cycle is broken as demo() removes all local references to the objects, and the garbage collector can delete them.

    $ python weakref_weakgraph.py
    Set up graph:
    one.set_next()
    two.set_next()
    three.set_next( )

    Graph:
    one->two->three
    Collecting...
    Unreachable objects: 0
    Garbage:[]

    After 2 references removed:
    one->two->three
    Collecting...
    Unreachable objects: 0
    Garbage:[]

    Removing last reference:
    (Deleting one)
    one.set_next(None)
    (Deleting two)
    two.set_next(None)
    (Deleting three)
    three.set_next(None)
    Collecting...
    Unreachable objects: 0
    Garbage:[]

2.7.5 Caching Objects

The ref and proxy classes are considered "low level." While they are useful for maintaining weak references to individual objects and allowing cycles to be garbage collected, the WeakKeyDictionary and WeakValueDictionary provide a more appropriate API for creating a cache of several objects.

The WeakValueDictionary uses weak references to the values it holds, allowing them to be garbage collected when other code is not actually using them. Using explicit calls to the garbage collector illustrates the difference between memory handling with a regular dictionary and WeakValueDictionary.
    import gc
    from pprint import pprint
    import weakref

    gc.set_debug(gc.DEBUG_LEAK)

    class ExpensiveObject(object):
        def __init__(self, name):
            self.name = name
        def __repr__(self):
            return 'ExpensiveObject(%s)' % self.name
        def __del__(self):
            print ' (Deleting %s)' % self

    def demo(cache_factory):
        # hold objects so any weak references
        # are not removed immediately
        all_refs = {}
        # create the cache using the factory
        print 'CACHE TYPE:', cache_factory
        cache = cache_factory()
        for name in [ 'one', 'two', 'three' ]:
            o = ExpensiveObject(name)
            cache[name] = o
            all_refs[name] = o
            del o # decref

        print ' all_refs =',
        pprint(all_refs)
        print '\n Before, cache contains:', cache.keys()
        for name, value in cache.items():
            print ' %s = %s' % (name, value)
            del value # decref

        # Remove all references to the objects except the cache
        print '\n Cleanup:'
        del all_refs
        gc.collect()

        print '\n After, cache contains:', cache.keys()
        for name, value in cache.items():
            print ' %s = %s' % (name, value)
        print ' demo returning'
        return

    demo(dict)
    print

    demo(weakref.WeakValueDictionary)

Any loop variables that refer to the values being cached must be cleared explicitly so the reference count of the object is decremented. Otherwise, the garbage collector would not remove the objects, and they would remain in the cache. Similarly, the all_refs variable holds references to the objects to prevent them from being garbage collected prematurely.
    $ python weakref_valuedict.py
    CACHE TYPE: <type 'dict'>
    all_refs ={'one': ExpensiveObject(one),
     'three': ExpensiveObject(three),
     'two': ExpensiveObject(two)}

    Before, cache contains: ['three', 'two', 'one']
    three = ExpensiveObject(three)
    two = ExpensiveObject(two)
    one = ExpensiveObject(one)

    Cleanup:

    After, cache contains: ['three', 'two', 'one']
    three = ExpensiveObject(three)
    two = ExpensiveObject(two)
    one = ExpensiveObject(one)
    demo returning
    (Deleting ExpensiveObject(three))
    (Deleting ExpensiveObject(two))
    (Deleting ExpensiveObject(one))

    CACHE TYPE: weakref.WeakValueDictionary
    all_refs ={'one': ExpensiveObject(one),
     'three': ExpensiveObject(three),
     'two': ExpensiveObject(two)}

    Before, cache contains: ['three', 'two', 'one']
    three = ExpensiveObject(three)
    two = ExpensiveObject(two)
    one = ExpensiveObject(one)

    Cleanup:
    (Deleting ExpensiveObject(three))
    (Deleting ExpensiveObject(two))
    (Deleting ExpensiveObject(one))

    After, cache contains: []
    demo returning

The WeakKeyDictionary works similarly, but it uses weak references for the keys instead of the values in the dictionary.

WARNING

The library documentation for weakref contains this warning:

    Caution: Because a WeakValueDictionary is built on top of a Python dictionary, it must not change size when iterating over it. This can be difficult to ensure for a WeakValueDictionary because actions performed by the program during iteration may cause items in the dictionary to vanish "by magic" (as a side effect of garbage collection).

See Also:

weakref (http://docs.python.org/lib/module-weakref.html)
    Standard library documentation for this module.

gc
    The gc module is the interface to the interpreter's garbage collector.
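The WeakKeyDictionary behavior mentioned in section 2.7.5 can be sketched along the same lines as the WeakValueDictionary demo. This is only an illustration, not one of the book's listings: it uses the print() function so it runs under Python 3, it assumes CPython's reference counting, and the names notes, one, and two are hypothetical.

```python
import gc
import weakref


class ExpensiveObject(object):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return 'ExpensiveObject(%s)' % self.name


# Keys are held weakly; the values are ordinary (strong) references.
notes = weakref.WeakKeyDictionary()

one = ExpensiveObject('one')
two = ExpensiveObject('two')
notes[one] = 'first'
notes[two] = 'second'

before = sorted(notes.values())

# Dropping the last regular reference to a key removes its entry.
del one
gc.collect()

after = sorted(notes.values())

print('before:', before)
print('after: ', after)
```

A mapping like this is a reasonable fit for attaching extra data to objects owned elsewhere, since the entry disappears along with the object it describes.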