# Garbage Collection: Why, When, and How?

• Print
From the author of

### From the author of 

Tracing or Reference Counting

## Tracing or Reference Counting

The problem of cycles in a reference counting system can be solved by the addition of a cycle detector. When you decrease the reference count of an object without freeing it, then it's possible that it's part of a cycle. You can then run the cycle detector, which walks all references from the object to see if it returns to the start, and find out if the object should be freed.

The opposite approach is tracing, which involves starting at some known non-garbage objects (typically those stored in globals or on the stack), finding all reachable objects, and then deleting everything else.

One of the most interesting papers that I've ever read is David F. Bacon's "A Unified Theory of Garbage Collection." This paper proposes an abstract general algorithm for garbage collection, and it shows that existing algorithms can be reduced to special cases of the general formulation.

The earliest garbage collection algorithm comes from Lisp, all the way back in 1960, and is a simple tracing collector. It starts with a set of root objects (known to be live—typically in globals or on the stack), and then follows every pointer that they contain, marking it as live. Finally, it sweeps away all unmarked objects.

A reference-counting collector works the other way around. It finds a dead object (one that has a reference count of 0) and then follows pointers from it, decreasing reference counts, and then recursively collecting all newly dead objects.

Most real collectors use some combination of the two approaches. The main drawback of reference counting is the cost of incrementing and decrementing the reference count for every assignment. This is especially an issue for on-stack assignments, where something that was a move between registers now requires an atomic operation (often locking the bus, and costing a hundred or so cycles).

Reference counting collectors often treat the stack as special, and they defer collection of objects that have a zero reference count until after they've checked that there are no pointers on the stack. You can view this last phase as a simple tracing collector that runs on a small subset of objects. Tracing collectors in the general case are reference counting collectors that lazily calculate a 1-bit reference count. They walk the object graph finding the reference count (which is either 0 or 1, using saturation semantics for increment) and delete everything with a reference count of 0.