Collection and Concurrency
Concurrency greatly increases the complexity of memory management. Consider the trivial case of an accessor method that returns an instance variable of an object. In a single-threaded environment, it can just load the value and return it. If the caller subsequently replaces the value of that instance variable, then the caller is responsible for ensuring that the returned value is not deallocated. In Objective-C, you typically do this by increasing the reference count; in C++, by copying the returned value.
In a concurrent environment, another thread may replace the instance variable between the accessor loading the value and the caller using it. If you're managing memory manually, dealing with this issue is very difficult. In a tracing environment, it's trivial for the programmer: the collector automatically ensures that the object is not deallocated while either thread has a reference to it. To do so, the collector has to be able to track exactly which objects are live. This is easy in a reference-counted environment, because the reference count of each object is usually maintained using atomic increment and decrement operations. When an object's count hits zero, the object is deallocated.
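The retain/release bookkeeping described above can be sketched as follows. This is a minimal illustration, not how any particular runtime implements it: the RefCounted class and its on_dealloc callback are hypothetical names, and because Python exposes no user-level atomic integers, a lock stands in for the atomic increment and decrement instructions a real runtime would use.

```python
import threading


class RefCounted:
    """Hypothetical object with a manually managed reference count."""

    def __init__(self, on_dealloc):
        self._count = 1                # the creator holds the first reference
        self._lock = threading.Lock()  # assumption: stands in for atomic inc/dec
        self._on_dealloc = on_dealloc

    def retain(self):
        # Take an additional reference.
        with self._lock:
            self._count += 1
        return self

    def release(self):
        # Drop a reference; whoever drops the last one triggers deallocation.
        with self._lock:
            self._count -= 1
            dead = self._count == 0
        if dead:
            self._on_dealloc(self)
```

Because the decrement and the test against zero happen as one indivisible step, exactly one thread observes the count reaching zero, no matter how the retains and releases from different threads interleave.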
In a tracing environment, tracking which objects are live is a bit more difficult. It's possible for the only reference to an object to be in the registers of a thread. Simple collectors use a stop-the-world strategy: they pause every thread, force each one to write its registers to its stack using a mechanism like setjmp(), and then scan the stacks for possible pointers.
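A toy model of that root scan might look like the sketch below. Everything here is simulated and the names (spill_registers, find_roots) are hypothetical: stacks and registers are plain lists of integer "words", the heap is a dictionary keyed by address, and any word that matches a heap address is treated as a possible pointer.

```python
def spill_registers(stack, registers):
    """Simulate a paused thread writing its registers to its stack."""
    return stack + registers


def find_roots(threads, heap):
    """Scan every paused thread's stack for words that look like heap pointers.

    threads: list of (stack, registers) pairs, each a list of integer words.
    heap: dict mapping allocated addresses to objects.
    """
    roots = set()
    for stack, registers in threads:
        for word in spill_registers(stack, registers):
            if word in heap:  # looks like a pointer into the heap
                roots.add(word)
    return roots


# An object referenced only from a register (0x2000) is still found,
# because the registers were spilled before the scan.
heap = {0x1000: "a", 0x2000: "b", 0x3000: "c"}
threads = [([7, 0x1000], [0x2000]), ([42], [])]
roots = find_roots(threads, heap)
```

In this model the object at 0x3000 is not a root, so unless something reachable points to it, it is a candidate for collection.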
This approach quickly runs into Amdahl's Law. No matter how many cores you have, your program is executing single-threaded code during this phase, so the pause limits how much speedup extra cores can deliver. Tracing collectors are also difficult to use in real-time environments, because threads can be interrupted for an indeterminate amount of time while the collector runs. Incremental techniques can help alleviate this problem, as can simply turning off the collector in a latency-sensitive section.
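To put rough numbers on this, here is a small sketch of Amdahl's Law. The function name and the 10% serial fraction are assumptions chosen purely for illustration; real collectors spend very different proportions of time in stop-the-world phases.

```python
def amdahl_speedup(serial_fraction, cores):
    """Upper bound on speedup when serial_fraction of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


# Assumed figure: 10% of run time is spent in a single-threaded collection phase.
speedup_on_8_cores = amdahl_speedup(0.10, 8)         # about 4.7x, not 8x
speedup_on_many_cores = amdahl_speedup(0.10, 10**6)  # approaches, but never reaches, 10x
```

Under that assumption, the serial phase caps the achievable speedup at 10x regardless of how many cores are added.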
The mark phase of a traditional mark-and-sweep collector can run in parallel fairly easily. For example, every running thread can start marking from its own roots (its stack and registers), stopping once it reaches an object that's already marked.
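One way to picture such a parallel mark phase is the sketch below. It is an illustration under simplifying assumptions, not a production design: the object graph is a dictionary from object name to the names it references, parallel_mark is a hypothetical function, and a lock guards the shared mark set where a real collector would typically set a mark bit atomically.

```python
import threading


def parallel_mark(graph, roots_per_thread):
    """Mark everything reachable from any thread's roots, one marker per thread."""
    marked = set()
    lock = threading.Lock()  # assumption: stands in for an atomic mark bit

    def try_mark(obj):
        # Returns True only for the first thread to reach obj.
        with lock:
            if obj in marked:
                return False
            marked.add(obj)
            return True

    def mark_from(roots):
        worklist = list(roots)
        while worklist:
            obj = worklist.pop()
            # Stop at objects another marker (or this one) already visited.
            if try_mark(obj):
                worklist.extend(graph.get(obj, ()))

    workers = [threading.Thread(target=mark_from, args=(roots,))
               for roots in roots_per_thread]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return marked
```

Only the thread that wins the race to mark an object goes on to trace that object's references, so shared subgraphs are traversed once and every reachable object still ends up marked.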
In some cases, you can skip the stop-the-world phase. One of the optimizations that Apple's Autozone collector implements is a per-object flag indicating that a reference to the object has been stored in the heap. This behavior is possible because every heap assignment in Objective-C with garbage collection calls a write barrier function in the Objective-C runtime. If a reference to an object hasn't been stored in heap or static (global) memory, the only possible place for pointers to it is the stack of the thread that allocated it.
When running a quick collection, Autozone just scans the current thread's stack for temporary objects. A quick incremental collection won't affect other threads, but you need a more complete collection for objects that last a bit longer.
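The idea can be sketched in a few lines. This is a simplified model, not Autozone's actual code or API: Obj, write_barrier, and quick_collect are hypothetical names, the "stack" is a plain list of object references, and the barrier does nothing except record that the stored object has escaped to the heap.

```python
class Obj:
    """Hypothetical heap object carrying an 'escaped to the heap' flag."""

    def __init__(self, name):
        self.name = name
        self.escaped = False  # set once a reference is stored in heap or global memory
        self.fields = {}


def write_barrier(owner, field, value):
    """Stand-in for the runtime's write barrier, called on every heap assignment."""
    value.escaped = True
    owner.fields[field] = value


def quick_collect(allocated_by_thread, stack):
    """Thread-local collection: only the allocating thread's own stack is scanned."""
    on_stack = {id(obj) for obj in stack}
    survivors, freed = [], []
    for obj in allocated_by_thread:
        if obj.escaped or id(obj) in on_stack:
            survivors.append(obj)  # escaped objects are left to a full collection
        else:
            freed.append(obj)      # never escaped and not on the stack: unreachable
    return survivors, freed
```

An object that was never passed through the barrier cannot be referenced from another thread's stack or from the heap, which is why no other thread has to be paused to decide whether it is garbage.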