Reference Counting and Garbage Collection
All objects are reference-counted. An object’s reference count is increased whenever it’s assigned to a new name or placed in a container such as a list, tuple, or dictionary, as shown here:
a = 37          # Creates an object with value 37
b = a           # Increases reference count on 37
c = []
c.append(b)     # Increases reference count on 37
This example creates a single object containing the value 37. a is merely a name that refers to the newly created object. When b is assigned a, b becomes a new name for the same object and the object’s reference count increases. Likewise, when you place b into a list, the object’s reference count increases again. Throughout the example, only one object contains 37. All other operations are simply creating new references to the object.
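As a quick check of the claim that only one object is involved, the following sketch compares the names and the list element using the is operator and id(). It is an illustration only; the variable same_object is introduced here and is not part of the original example.

```python
a = 37
b = a            # b is a second name for the same object
c = []
c.append(b)      # the list now holds a third reference

# Identity checks: every reference points at one and the same object.
same_object = (a is b) and (c[0] is a)
print(same_object)
print(id(a) == id(b) == id(c[0]))
```

Because b and c[0] were produced by assignment and append rather than by creating a new value, the identity test holds regardless of how the interpreter stores integers.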
An object’s reference count is decreased by the del statement or whenever a reference goes out of scope (or is reassigned). Here’s an example:
del a           # Decrease reference count of 37
b = 42          # Decrease reference count of 37
c[0] = 2.0      # Decrease reference count of 37
The current reference count of an object can be obtained using the sys.getrefcount() function. For example:
>>> a = 37
>>> import sys
>>> sys.getrefcount(a)
7
>>>
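In many cases, the reference count is much higher than you might guess. For immutable data such as numbers and strings, the interpreter aggressively shares objects between different parts of the program in order to conserve memory. The reported value also includes the temporary reference created by passing the object to sys.getrefcount() itself.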
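Because shared immutable objects such as 37 report counts that are hard to interpret, changes in the count are easier to observe on a freshly created object that nothing else refers to. The following is a minimal sketch that assumes CPython; the class Token and the function refcount_deltas are hypothetical names introduced only for this illustration. It reports how the count changes relative to a baseline rather than relying on absolute values.

```python
import sys

class Token:
    """Hypothetical placeholder class; any user-defined object would do."""

def refcount_deltas():
    obj = Token()                        # fresh object, not shared with anything else
    base = sys.getrefcount(obj)          # baseline (includes the call's temporary reference)

    alias = obj                          # new name for the same object
    after_alias = sys.getrefcount(obj)

    box = [obj]                          # placing it in a container adds another reference
    after_list = sys.getrefcount(obj)

    del alias                            # removing the name drops one reference
    box[0] = None                        # overwriting the list slot drops another
    after_release = sys.getrefcount(obj)

    # Report each measurement relative to the baseline.
    return (after_alias - base, after_list - base, after_release - base)

print(refcount_deltas())
```

Measuring differences instead of absolute counts keeps the example independent of the extra references the interpreter itself may hold.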
When an object’s reference count reaches zero, it is garbage-collected. However, in some cases a circular dependency may exist among a collection of objects that are no longer in use. Here’s an example:
a = {}
b = {}
a['b'] = b      # a contains reference to b
b['a'] = a      # b contains reference to a
del a
del b
In this example, the del statements decrease the reference counts of the two dictionaries and destroy the names a and b that referred to them. However, because each object contains a reference to the other, neither reference count drops to zero and the objects remain allocated (which, without further action, would be a memory leak). To address this problem, the interpreter periodically executes a cycle detector that searches for cycles of inaccessible objects and deletes them. The detector is triggered as the interpreter allocates more and more memory during execution. The exact behavior can be fine-tuned and controlled using functions in the gc module (see Chapter 13, “Python Runtime Services”).
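The effect of the cycle detector can be observed with a weak reference, which tracks an object without keeping it alive. The following is a minimal sketch that assumes CPython. Plain dictionaries cannot be weakly referenced, so a small hypothetical class Node stands in for the dictionaries above; cycle_is_collected is likewise a name invented for this illustration. Automatic collection is switched off during the demonstration so that only the explicit gc.collect() call can reclaim the cycle.

```python
import gc
import weakref

class Node:
    """Hypothetical container used only for this illustration."""
    def __init__(self):
        self.other = None

def cycle_is_collected():
    was_enabled = gc.isenabled()
    gc.disable()                     # keep the automatic detector out of the demo
    try:
        a = Node()
        b = Node()
        a.other = b                  # a contains reference to b
        b.other = a                  # b contains reference to a

        probe = weakref.ref(a)       # weak reference: does not keep a alive
        del a
        del b                        # names are gone, but the cycle remains

        alive_before = probe() is not None   # cycle still allocated
        gc.collect()                         # run the cycle detector explicitly
        alive_after = probe() is not None    # cycle has been reclaimed
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()              # restore the previous setting

print(cycle_is_collected())
```

In normal programs an explicit gc.collect() call is rarely needed; it is used here only to make the moment of collection deterministic.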