This chapter is from the book

Garbage Collector Internals

The CLR GC is a highly efficient, scalable, and reliable automatic memory manager. Much time and effort went into researching the optimal behavioral characteristics of the GC. Before delving into the details of the CLR GC, it is important to define what the GC is and to state the assumptions that were made during its design and implementation. Let's begin by looking at some of the key assumptions.

  • The CLR GC assumes that everything is garbage unless told otherwise; that is, it is ready to collect every object on the managed heap unless something indicates that the object is still in use. In essence, it implements a reference tracking scheme for all live objects in the system (we will define what live means shortly) where objects without any references to them are considered garbage and can be collected.
  • The CLR GC assumes that most objects on the managed heap will be short lived (or ephemeral). In other words, the GC attempts to collect short-lived objects more often than long-lived objects, operating under the assumption that if an object has been around for a while, chances are it will be around for a little longer and there is no need to attempt to collect it as frequently.
  • The CLR GC tracks an object's age via the use of generations. Young objects are placed in generation 0 and older objects in generations 1 and 2. As an object grows older, it is promoted from one generation to the next. As such, a generation can be said to define the age of an object.

Based upon the assumptions above, we can arrive at a definition of the CLR GC: It is a reference tracking and generational garbage collector.

Let's look at each of the parts of the definition more concretely and begin with how generations define the age of an object.

Generations

The CLR GC defines three generations very innovatively called generation 0, generation 1, and generation 2. Each generation contains objects of a certain age: generation 0 contains newly allocated objects and generation 2 contains the oldest objects. An object moves from one generation to the next by surviving a garbage collection, meaning that the object was still being referenced (or was still rooted) at the time of the collection. Each of the generations can be garbage collected at any time, but the frequency of garbage collections depends on the generation. Remember from the previous section that one of the assumptions the CLR makes is that most objects are going to be short lived (i.e., never leave generation 0). Due to that assumption, generation 0 is collected far more frequently than generation 2 in the hope of pruning these short-lived objects sooner. Figure 5-5 shows the overall algorithm for how the generations are garbage collected.

Figure 5-5 High-level overview of generational garbage collection algorithm

In Figure 5-5, we can see that a garbage collection is triggered when a new allocation request is made and the budget for generation 0 has been exceeded. When that happens, the garbage collector collects all objects in generation 0 that have no roots associated with them and promotes all objects with roots to generation 1. Much in the same way that generation 0 has a budget defined, so does generation 1; if, as part of promoting objects from generation 0 to generation 1, that budget is exceeded, the GC repeats the process by collecting objects with no roots in generation 1 and promoting objects with roots to generation 2. The process repeats itself for generation 2. If, while promoting to generation 2, the GC cannot collect any objects and the budget for generation 2 is exceeded, the CLR heap manager tries to allocate another segment that will hold generation 2 objects. If the creation of a new segment fails, an OutOfMemoryException is thrown. The CLR heap manager also releases segments that are no longer in use; we will discuss this process in more detail later in the chapter.
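The difference in collection frequency between generations can also be observed from managed code. The following is a minimal sketch rather than part of the book's sample code: it allocates a large number of short-lived arrays and uses GC.CollectionCount to report how many collections each generation saw in the meantime. The GcSketches namespace, the CollectionCounts class, and the allocation sizes are illustrative assumptions, and the actual counts depend on the runtime version and GC configuration.

```csharp
using System;

namespace GcSketches
{
    static class CollectionCounts
    {
        // Storing each array in a static field makes the previous array
        // unreachable, so every iteration produces short-lived garbage.
        private static byte[] sink;

        // Allocates 'iterations' short-lived arrays and returns, per generation,
        // how many collections occurred while doing so.
        public static int[] Measure(int iterations)
        {
            int generations = GC.MaxGeneration + 1;
            int[] before = new int[generations];
            for (int g = 0; g < generations; g++)
                before[g] = GC.CollectionCount(g);

            for (int i = 0; i < iterations; i++)
                sink = new byte[1024];

            int[] delta = new int[generations];
            for (int g = 0; g < generations; g++)
                delta[g] = GC.CollectionCount(g) - before[g];
            return delta;
        }

        static void Main()
        {
            int[] counts = Measure(200000);
            for (int g = 0; g < counts.Length; g++)
                Console.WriteLine("Generation {0} collections: {1}", g, counts[g]);
        }
    }
}
```

On a typical run, the count for generation 0 should be noticeably higher than the count for generation 2, which is what the generational assumption predicts for a workload made up almost entirely of short-lived objects.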

Let's take a practical look at how an object is collected and promoted. Listing 5-2 shows the source code behind the application we will use to illustrate the generational concepts.

Listing 5-2. Example source code to illustrate generational concepts

using System;
using System.Text;
using System.Runtime.Remoting;


namespace Advanced.NET.Debugging.Chapter5
{
    class Name
    {
        private string first;
        private string last;


        public string First { get { return first; } }
        public string Last { get { return last; } }


        public Name(string f, string l)
        {
            first = f; last = l;
        }
    }
    class Gen
    {
        static void Main(string[] args)
        {
            Name n1 = new Name("Mario", "Hewardt");
            Name n2 = new Name("Gemma", "Hewardt");


            Console.WriteLine("Allocated objects");


            Console.WriteLine("Press any key to invoke GC");
            Console.ReadKey();


            n1 = null;
            GC.Collect();


            Console.WriteLine("Press any key to invoke GC");
            Console.ReadKey();


            GC.Collect();


            Console.WriteLine("Press any key to exit");
            Console.ReadKey();
        }
    }
}

The source code and binary for Listing 5-2 can be found in the following folders:

  • Source code: C:\ADND\Chapter5\Gen
  • Binary: C:\ADNDBin\05Gen.exe

In Listing 5-2, we have defined a simple type called Name. In the Main method, we create two instances of the Name type, both of which end up in generation 0 as new allocations. After the user responds to the first Press any key to invoke GC prompt, we set the n1 reference to null, which means that the object it referred to can be garbage collected because it no longer has any roots. Next, the garbage collection occurs, collecting the object formerly referenced by n1 and promoting n2 to generation 1. Finally, the last garbage collection promotes n2 to generation 2 because it is still rooted.
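Before turning to the debugger, the same promotion can be checked from code by asking the runtime which generation an object currently belongs to. The sketch below is an illustration under stated assumptions, not the book's sample: the GcSketches namespace and the GenerationProbe class are hypothetical names, and the only runtime calls used are GC.GetGeneration and GC.Collect.

```csharp
using System;

namespace GcSketches
{
    static class GenerationProbe
    {
        // Returns the generation of 'o' before any forced collection and
        // after each of two forced collections. The object stays rooted
        // throughout because the parameter is used after each collection.
        public static int[] Observe(object o)
        {
            int[] seen = new int[3];
            seen[0] = GC.GetGeneration(o);
            GC.Collect();
            seen[1] = GC.GetGeneration(o);
            GC.Collect();
            seen[2] = GC.GetGeneration(o);
            return seen;
        }

        static void Main()
        {
            object survivor = new object();
            int[] seen = Observe(survivor);
            Console.WriteLine("Before any collection:   generation {0}", seen[0]);
            Console.WriteLine("After first GC.Collect:  generation {0}", seen[1]);
            Console.WriteLine("After second GC.Collect: generation {0}", seen[2]);
        }
    }
}
```

A rooted object is typically reported in generation 0, then 1, then 2, mirroring what happens to n2 in Listing 5-2, although the runtime does not guarantee that exact sequence.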

Let's run the application under the debugger and see how we can verify our theories on how n1 and n2 are collected and promoted. When the application is running under the debugger, resume execution until the first Press any key to invoke GC prompt. At that point, we need to break execution and find the addresses of the two object instances, which can easily be done via the ClrStack command as shown in the following:

0:000> !ClrStack -a
OS Thread Id: 0x1c0c (0)
ESP       EIP
0028f3b4 77709a94 [NDirectMethodFrameSlim: 0028f3b4]
 Microsoft.Win32.Win32Native.ReadConsoleInput(IntPtr, InputRecord ByRef, Int32,
Int32 ByRef)
0028f3cc 793e8f28 System.Console.ReadKey(Boolean)
    PARAMETERS:
        intercept = 0x00000000
    LOCALS:
        <no data>
        0x0028f3dc = 0x00000001
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>


0028f40c 793e8e33 System.Console.ReadKey()
0028f410 003000f3 Advanced.NET.Debugging.Chapter5.Gen.Main(System.String[])
    PARAMETERS:
        args = 0x01c55818
    LOCALS:
        <CLR reg> = 0x01da5938 
        <CLR reg> = 0x01da5948


0028f65c 79e7c74b [GCFrame: 0028f65c]

The addresses of the two objects on the managed heap are 0x01da5938 and 0x01da5948. How can we figure out which generation objects on the managed heap belong to? The answer to that lies in understanding the correlation between managed heap segments and generations. As previously discussed, each managed heap consists of one or more segments where the objects reside. Furthermore, part of the segment(s) is dedicated to a given generation. Figure 5-6 shows an example of a hypothetical managed heap segment.

Figure 5-6 Hypothetical managed heap segment

In Figure 5-6, the managed heap segment is divided into three generations, each with its own starting address managed by the CLR heap manager. Generations 0 and 1 are part of a single segment known as the ephemeral segment, where short-lived objects live. Because the GC operates under the assumption that most objects are short lived, most objects are not expected to live past generation 0 or, at most, generation 1. Objects that live in generation 2 are the oldest objects and are collected very infrequently. Generation 2 can also be part of the ephemeral segment, even though generation 2 is not collected as often. By looking at an object's address and knowing the address ranges for each of the generations, we can find out which generation an object belongs to. How do we know what the generational starting addresses for the CLR heap manager are? The answer lies in a command called eeheap. The eeheap command displays various memory statistics of data consumed by internal CLR data structures. By default, eeheap displays verbose data, meaning that information related to the GC as well as the loader is displayed. To display information only about the GC, the -gc switch can be used. Let's run the command in our existing debug session and see what we get:

0:004> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x01da1018
generation 1 starts at 0x01da100c
generation 2 starts at 0x01da1000
ephemeral segment allocation context: none
 segment    begin allocated     size
002c7db0 790d8620  790f7d8c 0x0001f76c(128876)
01da0000 01da1000  01da8010 0x00007010(28688)
Large object heap starts at 0x02da1000
 segment    begin allocated     size
02da0000 02da1000  02da3250 0x00002250(8784)
Total Size   0x289cc(166348)
------------------------------
GC Heap Size   0x289cc(166348)

Part of the output shows clearly the starting addresses of each of the generations. If we look at the object addresses in the debug session of our sample application, we can see the following:

<CLR reg> =  0x01da5938
<CLR reg> =  0x01da5948

Both of these addresses corresponding to our objects fall within the address range of generation 0 (starting at 0x01da1018), hence we can conclude that both of them live within the realm of that generation. This makes perfect sense because we are currently in the code flow where the objects were just allocated and we are pending a garbage collection. If we resume execution of the application and subsequently break execution again the next time we see the Press any key to invoke GC, we should see some difference in which generation the objects belong to. If we look at the source code, we can see that prior to invoking a garbage collection, we set the n1 reference to null, which in essence makes the object rootless and one that should be garbage collected. Furthermore, n2 is still rooted and as such should be promoted to generation 1 during the garbage collection. Let's take a look by following the same process as earlier: find the object addresses, use the eeheap command to find the generational address ranges, and see which generation the object falls into:

0:000> !ClrStack -a
OS Thread Id: 0x1910 (0)
ESP       EIP
0021f394 77709a94 [NDirectMethodFrameSlim: 0021f394]
 Microsoft.Win32.Win32Native.ReadConsoleInput(IntPtr, InputRecord
ByRef, Int32, Int32 ByRef)
0021f3ac 793e8f28 System.Console.ReadKey(Boolean)
    PARAMETERS:
        intercept = 0x00000000
    LOCALS:
        <no data>
        0x0021f3bc = 0x00000001
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
0021f3ec 793e8e33 System.Console.ReadKey()
0021f3f0 01690111 Advanced.NET.Debugging.Chapter5.Gen.Main(System.String[])
    PARAMETERS:
        args = 0x01da5818
    LOCALS:
        <CLR reg> = 0x00000000
        
        <CLR reg> = 0x01da5948


0021f644 79e7c74b [GCFrame: 0021f644]
0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x01da6c00
generation 1 starts at 0x01da100c
generation 2 starts at 0x01da1000
ephemeral segment allocation context: none
 segment    begin allocated     size
002c7db0 790d8620  790f7d8c 0x0001f76c(128876)
01da0000 01da1000  01da8c0c 0x00007c0c(31756)
Large object heap starts at 0x02da1000
 segment    begin allocated     size
02da0000 02da1000  02da3240 0x00002240(8768)
Total Size   0x295b8(169400)
------------------------------
GC Heap Size   0x295b8(169400)

The most interesting part of the output is in the eeheap command output. We can see now that the generational address ranges have changed slightly. More specifically, the starting address of generation 0 has changed from 0x01da1018 to 0x01da6c00, which in essence implies that generation 1 has become bigger (because the starting address of generation 1 remains unchanged). If we correlate the address of our n2 object (0x01da5948) with the generational address ranges that the eeheap command displayed, we can see that the n2 object falls into generation 1. Again, this is fully expected because n2 previously lived in generation 0 and was still rooted at the time of the garbage collection, thereby promoting the object to the next generation. I will leave it as an exercise for you to see what happens on the final garbage collection in the sample application.
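The address comparison we just did by hand can be written down as a small helper. This is only a sketch of the arithmetic, assuming a single ephemeral segment laid out as in the eeheap output above (generation 2 at the lowest addresses, then generation 1, then generation 0); the GcSketches namespace, the GenerationLookup class, and the GenerationOf method are hypothetical names and not part of SOS or the CLR.

```csharp
using System;

namespace GcSketches
{
    static class GenerationLookup
    {
        // Maps an object address to a generation for ONE ephemeral segment.
        // 'allocatedEnd' is the value in the segment's "allocated" column.
        // Returns -1 if the address lies outside the segment's allocated range.
        public static int GenerationOf(uint address, uint gen0Start, uint gen1Start,
                                       uint gen2Start, uint allocatedEnd)
        {
            if (address < gen2Start || address >= allocatedEnd) return -1;
            if (address >= gen0Start) return 0;
            if (address >= gen1Start) return 1;
            return 2;
        }

        static void Main()
        {
            uint n2 = 0x01da5948;

            // Start addresses from the first eeheap output (before the first GC.Collect).
            Console.WriteLine(GenerationOf(n2, 0x01da1018, 0x01da100c, 0x01da1000, 0x01da8010));

            // Start addresses from the second eeheap output (after the first GC.Collect).
            Console.WriteLine(GenerationOf(n2, 0x01da6c00, 0x01da100c, 0x01da1000, 0x01da8c0c));
        }
    }
}
```

With the values taken from the two debugger sessions, the program prints 0 and then 1, matching the conclusion that n2 moved from generation 0 to generation 1.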

Although the SOS debugger extension provides the means of finding out which generation any given object belongs to, it is a somewhat tedious process because it requires that addresses be checked against potentially changing generational addresses within any given managed heap segment. Furthermore, there is no concrete way to list all the objects that fall into any given generation, making it hard to get an overall picture of the per-generation utilization. Fortunately, the SOSEX extension comes to the rescue with a command named dumpgen. With the dumpgen command, you can easily get a list of all objects that belong to the generation specified as an argument to the command. For example, using the same sample application as shown in Listing 5-2, here is the output when running dumpgen:

0:000>  !dumpgen 0
01da6c00              12 **** FREE ****
01da6c0c              68 System.Char[]
2 objects, 80 bytes
0:000>  !dumpgen 1
01da100c              12 **** FREE ****
01da1018              12 **** FREE ****
01da1024              72 System.OutOfMemoryException
01da106c              72 System.StackOverflowException
01da10b4              72 System.ExecutionEngineException
01da10fc              72 System.Threading.ThreadAbortException
01da1144              72 System.Threading.ThreadAbortException
01da118c              12 System.Object
01da1198              28 System.SharedStatics
01da11b4             100 System.AppDomain
...
...
...
01da5948               16 Advanced.NET.Debugging.Chapter5.Name
01da5958              28 Microsoft.Win32.Win32Native+InputRecord
01da5974              12 System.Object
01da5980              20 Microsoft.Win32.SafeHandles.SafeFileHandle
01da5994              36 System.IO.__ConsoleStream
01da59b8              28 System.IO.Stream+NullStream
...
...
...

We can see that there aren't a lot of objects in generation 0; instead, we have a ton of objects in generation 1, including our n2 instance at address 0x01da5948. The dumpgen command really makes life easier when looking at generation-specific data.

So far, we have discussed how objects live in managed heap segments divided into generations and how these objects are either garbage collected or promoted to the next generation, depending on whether they are still referenced (or still rooted). One question that still remains is what it means for an object to be rooted. The next section introduces the notion of roots, which are at the heart of the decision-making process the GC uses to determine if an object can be collected.

Roots

One of the most fundamental aspects of a garbage collection is that of being able to determine which objects are still being referenced and which objects are not and can be considered for garbage collection. Contrary to popular belief, the GC itself does not implement the logic for detecting which objects are still being referenced; rather, it uses other components in the CLR that have far more knowledge about the lifetimes of the objects. The CLR uses the following components to determine which objects are still referenced:

  • Just In Time compiler. The JIT compiler is the component responsible for translating IL to machine code and has detailed knowledge of which local variables were considered active at any given point in time. The JIT compiler maintains this information in a table that it subsequently references when the GC asks for objects that are still considered to be alive.
  • Stack walker. This comes into play when unmanaged calls are made to the execution engine. During these calls, it is imperative that any managed objects used during the call also be part of the reference tracking system.
  • Handle table. The CLR maintains a set of handle tables on a per application domain basis that can contain, for example, pointers to pinned reference types on the managed heap. During a GC inquiry, these handle tables are probed for live references to objects on the managed heap.
  • Finalize queue. We will discuss the notion of object finalizers shortly, but for the time being, view objects with finalizers as objects that can be considered dead from an application's perspective but still need to be kept alive for cleanup purposes.
  • Object members. Any object referenced by a member of an object that is reachable through one of the above categories is itself considered to be referenced.
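To make the handle table category concrete, the following sketch creates an object whose only strong reference is a handle allocated with GCHandle.Alloc and checks that it survives a forced collection. It is a minimal illustration, assuming the hypothetical GcSketches namespace and HandleRoot class; a WeakReference is used purely as a probe because it does not itself keep the object alive.

```csharp
using System;
using System.Runtime.InteropServices;

namespace GcSketches
{
    static class HandleRoot
    {
        // Creates an object that is rooted only by a Normal (strong) GC handle,
        // forces a collection, and reports whether the object is still alive.
        public static bool SurvivesWhileHandleIsHeld()
        {
            GCHandle strong = GCHandle.Alloc(new object(), GCHandleType.Normal);
            WeakReference probe = new WeakReference(strong.Target);
            try
            {
                GC.Collect();
                return probe.IsAlive;
            }
            finally
            {
                // Freeing the handle removes the root; the object becomes
                // eligible for collection once nothing else references it.
                strong.Free();
            }
        }

        static void Main()
        {
            Console.WriteLine("Alive while handle is held: {0}", SurvivesWhileHandleIsHeld());
        }
    }
}
```

Because the handle is a strong one, the program prints True: the handle table entry acts as a root even though no local variable or static field refers to the object.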

During the probing phase, the GC marks every object it finds to be rooted. When all components have been probed, the GC goes ahead with the garbage collection, reclaiming the objects that were not marked and promoting all objects that are still considered rooted. An interesting question with regard to roots is this: given the address of an object on the managed heap, is it possible to see whether the object is rooted and, if so, what the reference chain to the object is? Again, we turn to the SOS extension and a command named gcroot. The gcroot command uses a technique similar to the one the GC uses to determine whether an object is alive. Let's take a look at some sample code. Listing 5-3 shows the source code of an application that defines a set of types and references to those types at various scopes.

Listing 5-3. Sample application to illustrate object roots

using System;
using System.Text;
using System.Threading;


namespace Advanced.NET.Debugging.Chapter5
{
    class Name
    {
        private string first;
        private string last;


        public string First { get { return first; } }
        public string Last { get { return last; } }


        public Name(string f, string l)
        {
            first = f; last = l;
        }
    }


    class Roots
    {
        public static Name CompleteName = new Name("First", "Last");


        private Thread thread;
        private bool shouldExit;


        static void Main(string[] args)
        {
            Roots r = new Roots();
            r.Run();
        }


        public void Run()
        {
            shouldExit = false;


            Name n1 = CompleteName;


            thread = new Thread(this.Worker);
            thread.Start(n1);


            Thread.Sleep(1000);


            Console.WriteLine("Press any key to exit");
            Console.ReadKey();


            shouldExit = true;
        }


        public void Worker(Object o)
        {
            Name n1 = (Name)o;
            Console.WriteLine("Thread started {0}, {1}",
                              n1.First,
                              n1.Last);


            while (true)
            {
                // Do work
                Thread.Sleep(500);
                if (shouldExit)
                    break;
            }
        }
    }
}

The source code and binary for Listing 5-3 can be found in the following folders:

  • Source code: C:\ADND\Chapter5\Roots
  • Binary: C:\ADNDBin\05Roots.exe

The source code in Listing 5-3 declares a static instance of the Name type. The main part of the application declares a reference to the static instance in the Run method as well as starts up a thread passing the reference to the newly created thread. The method that the new thread executes uses the reference passed to it until the user hits any key, at which point both the worker thread and the application terminate. The object we are interested in tracking for this exercise is the CompleteName static field. From the source code, we can glean the following characteristics about CompleteName:

  • We have a static reference to the object instance at the Roots class level serving as our first root to the object.
  • In the Run method, we assign a local variable reference (n1) to the object instance serving as our second root. The n1 local variable is not used after the thread has started and is subject to becoming invalid even before the end of the method scope (in retail builds). In debug builds, the reference is guaranteed to remain valid until the end of the scope is reached.
  • In the Run method, we pass the local variable reference n1 to the thread method during thread startup serving as our third root.
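The point about local variables becoming invalid before the end of their scope can be explored with a short experiment. The sketch below is an assumption-laden illustration (the GcSketches namespace and LocalLifetime class are hypothetical): one method lets the JIT compiler decide when the local stops being a root, and the other uses GC.KeepAlive to extend the local's lifetime past the collection.

```csharp
using System;

namespace GcSketches
{
    static class LocalLifetime
    {
        // The local is never used after the probe is created, so an optimized
        // (retail) build is free to stop reporting it as a root before the
        // collection runs. The result therefore depends on the build flavor.
        public static bool WithoutKeepAlive()
        {
            object local = new object();
            WeakReference probe = new WeakReference(local);
            GC.Collect();
            return probe.IsAlive;
        }

        // GC.KeepAlive references the local after the collection, so the object
        // is treated as rooted at least up to that call.
        public static bool WithKeepAlive()
        {
            object local = new object();
            WeakReference probe = new WeakReference(local);
            GC.Collect();
            bool alive = probe.IsAlive;
            GC.KeepAlive(local);
            return alive;
        }

        static void Main()
        {
            Console.WriteLine("Without KeepAlive: {0}", WithoutKeepAlive());
            Console.WriteLine("With KeepAlive:    {0}", WithKeepAlive());
        }
    }
}
```

The second line always reports True. The first line usually reports True in debug builds and may report False in retail builds, which is the behavior described for the n1 local variable in the Run method.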

Let's run the application under the debugger and manually break execution when the Press any key to exit prompt is displayed. The first thing we need to find is the address of the object we are interested in (dumping the object for good measure), followed by running the gcroot command on that address:

0:005> ~0s
eax=002cef9c ebx=002cef94 ecx=792274ec edx=79ec9058 esi=002cedf0 edi=00000000
eip=77709a94 esp=002ceda0 ebp=002cedc0 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
77709a94 c3              ret
0:000> !ClrStack -a
OS Thread Id: 0x2358 (0)
ESP       EIP
002cef6c 77709a94 [NDirectMethodFrameSlim: 002cef6c]
 Microsoft.Win32.Win32Native.ReadConsoleInput(IntPtr, InputRecord ByRef, Int32,
Int32 ByRef)
002cef84 793e8f28 System.Console.ReadKey(Boolean)
    PARAMETERS:
        intercept = 0x00000000
    LOCALS:
        <no data>
        0x002cef94 = 0x00000001
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>


002cefc4 793e8e33 System.Console.ReadKey()
002cefc8 00890212 Advanced.NET.Debugging.Chapter5.Roots.Run()
    PARAMETERS:
        this = 0x01c758e0
    LOCALS:
        <CLR reg> = 0x01c758d0


002cefe8 0089013f Advanced.NET.Debugging.Chapter5.Roots.Main(System.String[])
    PARAMETERS:
        args = 0x01c75888
    LOCALS:
        <CLR reg> = 0x01c758e0


002cf208 79e7c74b [GCFrame: 002cf208]
0:000> !do 0x01c758d0
Name: Advanced.NET.Debugging.Chapter5.Name
MethodTable: 001b311c
EEClass: 001b13a0
Size: 16(0x10) bytes
 (C:\ADNDBin\05Roots.exe)
Fields:
      MT    Field   Offset                 Type VT     Attr     Value Name
790fd8c4  4000001        4        System.String  0 instance 01c75898 first
790fd8c4  4000002        8         System.String 0 instance 01c758b4 last
0:000> !gcroot 0x01c758d0
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 2358
ESP:2cefbc:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)
Scan Thread 1 OSTHread 1630
Scan Thread 3 OSTHread 254c
ESP:47df428:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)
ESP:47df42c:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)
ESP:47df438:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)
ESP:47df4d0:Root:01c75984(System.Threading.ThreadHelper)->
01c758d0(Advanced.NET.Debugging.Chapter5.Name)
ESP:47df4d8:Root:01c75984(System.Threading.ThreadHelper)->
01c758d0(Advanced.NET.Debugging.Chapter5.Name)
ESP:47df4f4:Root:01c75984(System.Threading.ThreadHelper)->
01c758d0(Advanced.NET.Debugging.Chapter5.Name)
ESP:47df500:Root:01c75984(System.Threading.ThreadHelper)->
01c758d0(Advanced.NET.Debugging.Chapter5.Name)
ESP:47df5c0:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)->
01c758d0(Advanced.NET.Debugging.Chapter5.Name)
ESP:47df5c4:Root:01c75998(System.Threading.ParameterizedThreadStart)->
01c75984(System.Threading.ThreadHelper)
ESP:47df754:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)->
01c75984(System.Threading.ThreadHelper)
ESP:47df758:Root:01c75998(System.Threading.ParameterizedThreadStart)->
01c75984(System.Threading.ThreadHelper)
ESP:47df764:Root:01c75998(System.Threading.ParameterizedThreadStart)->
01c75984(System.Threading.ThreadHelper)
ESP:47df76c:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)->
01c75984(System.Threading.ThreadHelper)
DOMAIN(0037FCF8):HANDLE(Pinned):a13fc:Root:02c71010(System.Object[])->
01c758d0(Advanced.NET.Debugging.Chapter5.Name)

As you can see from the gcroot output, the command scans a number of different sources to find and build the reference chain to the object specified. Regardless of the source, the output of the GCRoot command follows this general format:

<root>-><reference 1>-><reference 2>-><reference X>-><object>

Depending on the source probed, each of the elements takes on a slightly different format as shown.

  • Local variables on a thread's stack. The root element typically looks like the following: <stack register>:<stack pointer>:Root:<object>. The stack register depends on the architecture. For example, on x86 machines it shows as ESP and on x64 machines it shows as RSP. The stack pointer shows the location on the stack where the object is rooted, and the object address is the address of the object that is holding a reference to the next object in the reference chain. Let's take a look at an example:
    ESP:47df428:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)
    We can see that there is a local variable located on stack (ESP) location 0x047df428. Furthermore, the output tells us that this constitutes a root to the object at address 0x01c758d0, which is a reference to the Advanced.NET.Debugging.Chapter5.Name type.
  • Handle tables. All handle tables are scanned as part of GCRoot execution looking for references to the specified object. If a reference is found, the output of the command takes on the following general syntax: DOMAIN(<address>):HANDLE(<type>):<handleaddress>:Root:<object>. The domain address field indicates the address of the application domain to which the handle reference belongs. The handle type specifies the type of the handle. The possible handle types are Weak, WeakTrackResurrection, Normal, and Pinned.

    Next is the handle address, which is the address to the handle itself. Please keep in mind that the handle type is a value type and if you want to dump out the contents you must use the DumpVC command rather than DumpObj. Finally, the root object address is shown. Let's take a look at an example:

    DOMAIN(002EFCD8):HANDLE(Pinned):2813fc:Root:02c81010
    (System.Object[])->01c858d0(Advanced.NET.Debugging.
    Chapter5.Name)

    The preceding output indicates that the object at address 0x01c858d0 is rooted by an object that resides in the handle table corresponding to the application domain with address 0x002efcd8. Furthermore, the address of the handle value holding the reference is located at address 0x002813fc and the type of the handle value is pinned. Lastly, the actual object that holds the reference is at address 0x02c81010, which is of type System.Object[].

  • F-reachable queue. The f-reachable queue is scanned to see if there are any references to the specified object. If a root reference to the object is found on the f-reachable queue, it will be displayed in the following general format: Finalizer queue:Root:<object address>(<object type>). The first part of the output indicates that the source of the root is the f-reachable queue. Next, the address of the referenced object is displayed, followed by the object type. What follows is an example of the output of GCRoot when run against an object that is on the f-reachable queue:
    Finalizer queue:Root:01d15750(Advanced.NET.Debugging.Chapter5.Name)
    In the preceding output, we can see that the object at address 0x01d15750 of type Advanced.NET.Debugging.Chapter5.Name is rooted by the f-reachable queue.
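  • Object members. The last source of output for the GCRoot command is an object that is referenced by a member of an object found through any of the preceding categories; such references appear as additional -> links in the reference chain.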

One potential problem with gcroot and local variables is that its output may not always be accurate; it can produce false positives. To convince ourselves that the stack locations listed in the output are accurate, we have to manually inspect each stack location and correlate it with the source code to see whether the local variable is in fact still referencing the object. For example, assume we have the following very simple function:

   public void Run()
   {
       Name n1 = new Name("A", "B");


       Console.WriteLine("Press any key to exit");
       Console.ReadKey();
   }

In the source code, we have a simple instance of the Name class assigned to the n1 local variable. If we ran the GCRoot command on the n1 reference, we would expect to only see one reference on the thread stack:

0:000> !GCRoot 0x01e9580c
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 1638
ESP:1df29c:Root:01e9580c(Advanced.NET.Debugging.Chapter5.Name)

ESP:1df2a0:Root:01e9580c(Advanced.NET.Debugging.Chapter5.Name)
Scan Thread 2 OSTHread 14ac

The output shows that thread 0 apparently has two references to the object on the thread stack. How is this possible? The GCRoot command works by assuming that every value on the stack that looks like an address may be the address of an object, and it tries to verify this assumption by using various metadata. As a result, references that are (or previously were) present on the stack are treated as first-class references to the object and listed in the output of GCRoot. If you suspect that the output of GCRoot is incorrect insofar as thread stacks are concerned, the best approach is to use the U command to unassemble the code for the stack frames and correlate the stack locations in the GCRoot output with the unassembled code to see which references are truly valid.

Finalization

The garbage collection mechanism described so far assumes that objects that are collected do not require any special cleanup code. At times, objects that encapsulate other resources require that these resources be cleaned up as part of object destruction. A great example is an object that wraps an underlying native resource such as a file handle. Without explicit cleanup code, the memory behind the managed object is cleaned up by the GC, but the underlying handle that the object encapsulates is not (because the GC has no special knowledge of native handles). The net result is naturally a resource leak. To provide a proper cleanup mechanism, the CLR introduces what are known as finalizers. A finalizer can be compared to a destructor in the native C++ world. Whenever an object is freed (or garbage collected), the destructor (or finalizer) is run. In C#, a finalizer is declared very similarly to a C++ destructor by using the ~<class name>() notation. An example is shown in the following listing:

public class MyClass
{
    ...

    ~MyClass()
    {
        // Cleanup code
    }
}

When the class is compiled into IL, the finalizer gets translated into a method called Finalize. The key thing about objects with finalizers is that the garbage collector treats them a little differently than other objects. Because the garbage collector is in fact an automatic memory manager, it also has the responsibility of making sure that any finalization code an object may have is executed before the object's memory is reclaimed. To keep tabs on which objects have finalizers, the garbage collector maintains a queue called the finalization queue. Objects that are created on the managed heap and contain finalizers are automatically placed on the finalization queue during creation. Please note that the finalization queue does not contain objects that are considered garbage; rather, it contains all objects with finalizers that are alive on the managed heap.

When an object with a finalizer becomes rootless and a garbage collection occurs, the GC moves the object to a different queue known as the f-reachable queue. This queue contains all objects with finalizers that are considered to be garbage and need to have their finalizers executed. The f-reachable queue is itself treated as a root for the objects on it, meaning that those objects are still alive. It is important to note that the finalizer code for the objects on the f-reachable queue is not executed as part of the garbage collection phase. Instead, each .NET process contains a special thread known as the finalization thread. The finalization thread wakes up, at the request of the GC, and checks the state of the f-reachable queue. If there are any objects on the f-reachable queue, the finalization thread picks them up one by one and executes their finalize methods.

When the garbage collection finishes, objects with finalizers are on the f-reachable queue (rooted and alive) until the finalization thread executes the finalize methods. At that point, the object is removed from the f-reachable queue, is considered rootless, and can be truly reclaimed by the garbage collector. The next time a garbage collection is started, the objects are collected. Figure 5-7 illustrates an example of the finalization process.


Figure 5-7 Example of finalization process

Step 1 in Figure 5-7 consists of allocating Obj D and Obj E, both of which contain finalize methods. As part of the allocation, the objects are placed on the managed heap as well as on the finalization queue to indicate that the objects need to be finalized when no longer in use. In step 2, Obj D and Obj E have both become rootless when a garbage collection occurs. At that point, both objects are moved from the finalization queue to the f-reachable queue to indicate that the finalize methods are now ready to be run. At some point in the future (nondeterministic), step 3 is executed and the finalizer thread wakes up and starts running the finalize methods for both of the objects. Even after the finalizer has finished, both objects are still rooted on the f-reachable queue. Lastly, in step 4, another garbage collection occurs and the objects are removed from the f-reachable queue (no longer rooted) and then collected from the managed heap by the garbage collector.
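The four steps can also be observed from managed code, without a debugger. The following is a minimal sketch, not part of the book's sample code: the Tracked and FinalizationDemo types are hypothetical names introduced for illustration. It relies on a long weak reference (created with trackResurrection set to true), which keeps tracking an object after it has been queued for finalization, and on GC.WaitForPendingFinalizers to wait for the finalization thread.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical type with a finalizer that records where it ran.
class Tracked
{
    public static volatile bool Finalized;
    public static volatile int FinalizerThreadId;

    ~Tracked()
    {
        FinalizerThreadId = Thread.CurrentThread.ManagedThreadId;
        Finalized = true;
    }
}

static class FinalizationDemo
{
    // Allocate in a separate, non-inlined method so that no local variable
    // in the caller keeps the object rooted.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static WeakReference Allocate()
    {
        // Step 1: the new object is also registered on the finalization queue.
        // trackResurrection = true creates a long weak reference.
        return new WeakReference(new Tracked(), true);
    }

    public static void Run()
    {
        WeakReference weak = Allocate();

        // Step 2: the rootless object is moved to the f-reachable queue.
        GC.Collect();

        // Step 3: block until the finalization thread has drained the queue.
        GC.WaitForPendingFinalizers();
        Console.WriteLine("Finalizer ran: {0}", Tracked.Finalized);
        Console.WriteLine("Ran on a different thread: {0}",
            Tracked.FinalizerThreadId != Thread.CurrentThread.ManagedThreadId);

        // Step 4: a second collection reclaims the memory.
        GC.Collect();
        Console.WriteLine("Still on the heap: {0}", weak.IsAlive);
    }

    static void Main()
    {
        Run();
    }
}
```

Calling GC.WaitForPendingFinalizers is done here only to make the demonstration repeatable; in a real application the finalization thread runs whenever the CLR schedules it.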

An interesting aspect of having a dedicated thread execute the finalize methods is that the CLR makes no guarantees about when the thread wakes up and runs. As such, it may take some time before an object with a finalizer is actually cleaned up. When dealing with objects that wrap scarce resources, it may not be feasible to wait that long for the resource to be reclaimed. In such situations, it is best to implement an explicit and deterministic cleanup pattern such as the IDisposable and/or Close patterns. Finally, having a dedicated thread also means that you have no control over the state of that thread, and finalizer code that makes assumptions about that state can break your application.
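As an illustration of deterministic cleanup, here is a minimal sketch of the IDisposable pattern with the finalizer kept as a safety net. The NativeBuffer and DisposeDemo names are hypothetical, and unmanaged memory from Marshal.AllocHGlobal stands in for whatever scarce native resource a real class would own.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
sealed class NativeBuffer : IDisposable
{
    private IntPtr buffer;

    public NativeBuffer(int size)
    {
        buffer = Marshal.AllocHGlobal(size);
    }

    public bool IsReleased
    {
        get { return buffer == IntPtr.Zero; }
    }

    // Deterministic cleanup: callers release the resource as soon as they
    // are done with it instead of waiting for the finalization thread.
    public void Dispose()
    {
        Release();
        // The resource is already released, so the finalizer no longer
        // needs to run for this instance.
        GC.SuppressFinalize(this);
    }

    // Safety net in case Dispose is never called.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(buffer);
            buffer = IntPtr.Zero;
        }
    }
}

static class DisposeDemo
{
    static void Main()
    {
        // The using statement calls Dispose when the block exits,
        // even if an exception is thrown inside it.
        using (NativeBuffer b = new NativeBuffer(256))
        {
            Console.WriteLine("Released inside the block: {0}", b.IsReleased);
        }
    }
}
```

Because Dispose calls GC.SuppressFinalize, an instance that has been disposed does not need its finalizer to run, which avoids the extra collection described above.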

Let's take a look at a concrete example of an object with a finalize method and see if we can track the object during a garbage collection. Listing 5-4 shows the source code of the application we will be utilizing.

Listing 5-4. Simple object with a finalize method

using System;
using System.Text;
using System.Runtime.InteropServices;


namespace Advanced.NET.Debugging.Chapter5
{
    class NativeEvent
    {
        private IntPtr nativeHandle;


        public IntPtr NativeHandle { get { return nativeHandle; } }


        public NativeEvent(string name)
        {
            nativeHandle = CreateEvent(IntPtr.Zero,
                                       false,
                                       true,
                                       name);
        }


        ~NativeEvent()
        {
            if(nativeHandle!=IntPtr.Zero)
            {
                CloseHandle(nativeHandle);
                nativeHandle=IntPtr.Zero;
            }
        }

        [DllImport("kernel32.dll")]
        static extern IntPtr CreateEvent(IntPtr lpEventAttributes,
                                         bool bManualReset,
                                         bool bInitialState,
                                         string lpName);


        [DllImport("kernel32.dll")]
        [return: MarshalAs(UnmanagedType.Bool)]
        static extern bool CloseHandle(IntPtr hObject);
    }


    class Finalize
    {
        static void Main(string[] args)
        {
            Finalize f = new Finalize();
            f.Run();
        }


        public void Run()
        {
            NativeEvent nEvent = new NativeEvent("MyNewEvent");


            //
            // Use nEvent
            //


            nEvent = null;


            Console.WriteLine("Press any key to GC");
            Console.ReadKey();


            GC.Collect();


            Console.WriteLine("Press any key to GC");
            Console.ReadKey();


            GC.Collect();


            Console.WriteLine("Press any key to exit");
            Console.ReadKey();
        }


    }
}

The source code and binary for Listing 5-4 can be found in the following folders:

  • Source code: C:\ADND\Chapter5\Finalize
  • Binary: C:\ADNDBin\05Finalize.exe

The source code in Listing 5-4 declares a type called NativeEvent that simply wraps the creation of a Windows event using the .NET interoperability services. Because the net result of creating a native event is a handle, the handle must be closed during object destruction to avoid a handle leak in the application. The closing of the handle is implemented in the NativeEvent finalize method. The main part of the application is implemented in the Finalize class. More specifically, the Run method creates an instance of the NativeEvent class, sets the local variable reference to null (indicating that the object can be garbage collected), and then forces a couple of garbage collections.

What do we expect to happen to the NativeEvent instance at the point of the first garbage collection? From our previous discussion, we expect that prior to the garbage collection, the object is in the finalization queue. When the garbage collection occurs, the object is deemed rootless and moved to the f-reachable queue, which keeps a reference to the object so that the finalization thread can run the Finalize method. It's important to remember that the finalization thread does not run during the garbage collection, but rather out of band at some later time. When the Finalize method has run, the object can be fully collected during the next garbage collection.

Let's see if we can use the debuggers to verify this theory. Run 05Finalize.exe under the debugger and break execution when the first Press any key to GC prompt appears. When we have broken into the debugger, we can use the FinalizeQueue command to show the state of the finalizable objects in the process:

0:004> !FinalizeQueue
SyncBlocks to be cleaned up: 0
MTA Interfaces to be released: 0
STA Interfaces to be released: 0
----------------------------------
generation 0 has 6 finalizable objects (003d3160->003d3178)
generation 1 has 0 finalizable objects (003d3160->003d3160)
generation 2 has 0 finalizable objects (003d3160->003d3160)
Ready for finalization 0 objects (003d3178->003d3178)
Statistics:
      MT    Count    TotalSize Class Name
00123128        1           12 Advanced.NET.Debugging.Chapter5.NativeEvent
7911c9c8        1           20 Microsoft.Win32.SafeHandles.SafePEFileHandle
791037c0        1           20 Microsoft.Win32.SafeHandles.SafeFileMappingHandle
79103764        1           20 Microsoft.Win32.SafeHandles.SafeViewOfFileHandle
79101444        1           20 Microsoft.Win32.SafeHandles.SafeFileHandle
790fe704        1           56 System.Threading.Thread
Total 6 objects

There are several pieces of useful information in the output. First, the finalization queues for each generation are shown. In this particular case, generation 0 has 6 finalizable objects and generations 1 and 2 have none. For each of the finalization queues, the FinalizeQueue command also shows the address range of the queue itself for that particular generation. For example, generation 0's finalization queue starts at address 0x003d3160 and ends at address 0x003d3178. We can use the dd command to dump the queue as shown here:

0:004> dd 003d3160 l6
003d3160  01fc1df0 01fc5090 01fc5964 01fc5998
003d3170  01fc683c 01fc6850

The elements in the queue can be looked at further by using the do command. If we want to look at the object at address 0x01fc5964 in more detail, we would use the command shown here:

0:004> !do 01fc5964

Name: Advanced.NET.Debugging.Chapter5.NativeEvent
MethodTable: 00123128
EEClass: 00121804
Size: 12(0xc) bytes
 (C:\ADNDBin\05Finalize.exe)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
791016bc  4000001        4        System.IntPtr  1 instance      1f0 nativeHandle

The next piece of useful information from the FinalizeQueue command is the f-reachable queue, which is shown in the following output:

Ready for finalization 0 objects (003d3178->003d3178)

The output indicates that at this point there are no objects that are ready to be finalized. This makes perfect sense because a garbage collection has not yet occurred.

The final piece of output in the FinalizeQueue command is the statistics section, which shows a summarized list of all objects in either the finalization queue or the f-reachable queue.

Before we resume execution, we need to discuss the magic finalization thread that exists in all managed processes. What does the stack trace of this thread look like? To find the answer, use the ~*kn command to display the stack traces of all the threads in the process including frame numbers. In the output, one thread in particular looks interesting:

      2  Id: 1a10.c10 Suspend: 1 Teb: 7ffdd000 Unfrozen
 # ChildEBP RetAddr
00 011cf604 77709254 ntdll!KiFastSystemCallRet
01 011cf608 7618c244 ntdll!ZwWaitForSingleObject+0xc
02 011cf678 79e789c6 KERNEL32!WaitForSingleObjectEx+0xbe
03 011cf6bc 79e7898f mscorwks!PEImage::LoadImage+0x1af
04 011cf70c 79e78944 mscorwks!CLREvent::WaitEx+0x117
05 011cf720 79ef2220 mscorwks!CLREvent::Wait+0x17
06 011cf73c 79fb997b mscorwks!WKS::WaitForFinalizerEvent+0x4a
07 011cf750 79ef3207 mscorwks!WKS::GCHeap::FinalizerThreadWorker+0x79
08 011cf764 79ef31a3 mscorwks!Thread::DoADCallBack+0x32a
09 011cf7f8 79ef30c3 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3
0a 011cf834 79fb9643 mscorwks!Thread::ShouldChangeAbortToUnload+0x30a
0b 011cf85c 79fb960d mscorwks!ManagedThreadBase_NoADTransition+0x32
0c 011cf86c 79fba09b mscorwks!ManagedThreadBase::FinalizerBase+0xd
0d 011cf8a4 79f95a2e mscorwks!WKS::GCHeap::FinalizerThreadStart+0xbb
0e 011cf93c 76184911 mscorwks!Thread::intermediateThreadProc+0x49
0f 011cf948 776ee4b6 KERNEL32!BaseThreadInitThunk+0xe
10 011cf988 776ee489 ntdll!__RtlUserThreadStart+0x23
11 011cf9a0 00000000 ntdll!_RtlUserThreadStart+0x1b

Frames 6 and 7 in the stack trace indicate that in fact this is the finalizer thread for the process. Frame 6 in particular shows that the thread is currently waiting for finalizer events (or objects that need to be finalized). Let's set a breakpoint on the return address of frame 6 (0x79fb997b), which will trigger any time the finalizer thread is awakened to perform work:

bp 79fb997b

When the breakpoint is set, resume execution and press any key to trigger the first garbage collection. You'll notice that a breakpoint is hit, as shown in the following:

0:003> g
 Breakpoint 0 hit
eax=00000001 ebx=00000001 ecx=7618c42d edx=77709a94 esi=00000000 edi=00493a48
eip=79fb997b esp=00b7f768 ebp=00b7f770 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202
mscorwks!WKS::GCHeap::FinalizerThreadWorker+0x79:
79fb997b 3bde           cmp     ebx,esi

The breakpoint corresponds to the finalizer thread breakpoint set earlier and indicates that the finalizer is ready to execute the Finalize methods on the objects in the f-reachable queue. How do we find out what objects are in the f-reachable queue? You guessed it: by using the FinalizeQueue command:

0:002> !FinalizeQueue
SyncBlocks to be cleaned up: 0
MTA Interfaces to be released: 0
STA Interfaces to be released: 0
----------------------------------
generation 0 has 0 finalizable objects (003d3170->003d3170)
generation 1 has 4 finalizable objects (003d3160->003d3170)
generation 2 has 0 finalizable objects (003d3160->003d3160)
Ready for finalization 2 objects (003d3170->003d3178)
Statistics:
      MT    Count    TotalSize Class Name
00123128        1           12 Advanced.NET.Debugging.Chapter5.NativeEvent
7911c9c8        1           20 Microsoft.Win32.SafeHandles.SafePEFileHandle
791037c0        1           20 Microsoft.Win32.SafeHandles.SafeFileMappingHandle
79103764        1           20 Microsoft.Win32.SafeHandles.SafeViewOfFileHandle
79101444        1           20 Microsoft.Win32.SafeHandles.SafeFileHandle
790fe704        1           56 System.Threading.Thread

This time, the output states that there are two objects in the f-reachable queue, starting at address 0x003d3170, whose Finalize methods the finalization thread is about to execute. If we dump out the contents of the f-reachable queue and each of the objects, we can see the following:

0:002> dd 003d3170 l2
003d3170  01fc5090 01fc5964
0:002> !do 01fc5090
Name: Microsoft.Win32.SafeHandles.SafePEFileHandle
MethodTable: 7911c9c8
EEClass: 791fb61c
Size: 20(0x14) bytes
 (C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
791016bc  40005c1        4        System.IntPtr  1 instance    3eab28 handle
79102290  40005c2        8         System.Int32  1 instance        4 _state
7910be50  40005c3        c       System.Boolean  1 instance        1 _ownsHandle
7910be50  40005c4        d       System.Boolean  1 instance        1 _fullyInitialized
0:002> !do 01fc5964
Name: Advanced.NET.Debugging.Chapter5.NativeEvent
MethodTable: 00123128
EEClass: 00121804
Size: 12(0xc) bytes
 (C:\ADNDBin\05Finalize.exe)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
791016bc  4000001        4        System.IntPtr  1 instance      1f0 nativeHandle

The first object is of type SafePEFileHandle and the second object is of type NativeEvent, which happens to be the object we are interested in. If we resume execution, the finalizer thread executes the Finalize method of our NativeEvent class. What happens to the objects on the f-reachable queue after finalization has completed? Well, the objects are removed from the f-reachable queue, which renders them rootless; they will be collected during the next garbage collection.

This concludes our discussion of finalization. As you can see, there is a lot of work being done under the hood whenever a finalizable type comes into play. Not only does the CLR need additional data structures (such as the finalization queue and f-reachable queue), but it also spins up a dedicated thread to run the Finalize methods for each object that is being collected. Furthermore, an object with a Finalize method is not collected in a single garbage collection but requires at least two. In essence, this means that objects with Finalize methods always survive their first collection and are promoted at least once (for example, from generation 0 to generation 1) before they are truly dead, which makes them far more expensive to work with.

Reclaiming GC Memory

We have discussed the GC in quite a bit of detail. We now know exactly what the GC does when an object is considered garbage. The one missing piece of information is what the GC does with the memory that becomes available after an object is garbage collected. Does the memory get put on some sort of free list and then reused when another allocation request arrives? Does the memory get freed? Is fragmentation ever a problem on the managed heap? The answer is a combination of all three. If a collection that occurs in generations 0 and 1 leaves a gap on the managed heap, the garbage collector compacts all live objects so that they reside next to each other and coalesces any free blocks on the managed heap into a larger block that is located after the last live object (starting at the current allocation pointer). Figure 5-8 shows an example of the compacting and coalescing.


Figure 5-8 Garbage collection compacting and coalescing phase

In Figure 5-8, the initial state of the managed heap contains five rooted objects (A through E). At some point during execution, objects B and D become rootless and are candidates to be reclaimed during a garbage collection. When the garbage collection occurs, the memory occupied by objects B and D is reclaimed, which leads to gaps on the managed heap. To remove these gaps, the garbage collector compacts the remaining live objects (Obj A, C, and E) and coalesces the two free blocks (used to hold Obj B and D) into one free block. Lastly, the current allocation pointer is updated as a result of the compacting and coalescing.
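The compacting and coalescing step can be modeled in a few lines. The following is a toy sketch under simplifying assumptions, not the CLR's actual algorithm: the Block type, the CompactionSketch class, and the sizes and addresses are all made up for illustration, and the step where the real garbage collector updates every reference to a moved object is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of one object on a simplified managed heap.
class Block
{
    public string Name;
    public int Address;
    public int Size;
    public bool Live;
}

static class CompactionSketch
{
    // Five objects laid out back to back; B and D have become rootless.
    public static List<Block> SampleHeap()
    {
        return new List<Block>
        {
            new Block { Name = "A", Address = 0x00, Size = 0x10, Live = true },
            new Block { Name = "B", Address = 0x10, Size = 0x20, Live = false },
            new Block { Name = "C", Address = 0x30, Size = 0x10, Live = true },
            new Block { Name = "D", Address = 0x40, Size = 0x30, Live = false },
            new Block { Name = "E", Address = 0x70, Size = 0x20, Live = true }
        };
    }

    // Slides every live block toward the start of the heap, drops the dead
    // blocks, and returns the new allocation pointer. All reclaimed space
    // ends up as one free region after the last live block.
    public static int Compact(List<Block> heap, int heapStart)
    {
        int next = heapStart;
        foreach (Block b in heap)
        {
            if (!b.Live)
            {
                continue;
            }
            b.Address = next;   // the object moves down into the gap
            next += b.Size;
        }
        heap.RemoveAll(b => !b.Live);
        return next;
    }

    static void Main()
    {
        List<Block> heap = SampleHeap();
        int allocationPointer = Compact(heap, 0x00);

        foreach (Block b in heap)
        {
            Console.WriteLine("Obj {0} at 0x{1:X2}", b.Name, b.Address);
        }
        Console.WriteLine("Allocation pointer: 0x{0:X2}", allocationPointer);
    }
}
```

In this model, the space freed by B and D is not tracked as separate blocks at all; it simply becomes part of the free region that begins at the returned allocation pointer.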

The ephemeral segment contains both generation 0 and generation 1 (and also part of generation 2), but generation 2 can consist of multiple managed heap segments. As more and more objects make it to generation 2, the need to grow generation 2 also increases. The way that the CLR heap manager grows generation 2 is by allocating more segments. When objects in generation 2 are collected, the CLR heap manager decommits memory in the segments, and when a segment is no longer needed, it is entirely freed. In certain situations and allocation patterns, generation 2 grows and shrinks quite frequently, leading to a large number of calls to allocate and free virtual memory (the VirtualAlloc and VirtualFree APIs). Two common drawbacks of this approach are that these calls can be expensive, because each requires a transition to kernel mode, and that they can fragment the virtual memory address space. As such, CLR 2.0 introduces a feature called VM hoarding, which does not free segments but instead keeps them on a standby list that can be reused when more memory is required. To utilize the VM hoarding feature, the CLR host itself must specify that it wants to use the feature.

Because the cost of a compaction is directly proportional to the size of the object (the bigger the object, the costlier the compaction), the garbage collector introduces another type of heap called the large object heap (LOH). Objects that are large enough to severely hurt the performance of a compaction are placed on the LOH, which we will discuss next.

Large Object Heap

The large object heap (LOH) consists of objects that are greater than or equal to 85,000 bytes in size. The decision to separate objects of that size into their own heap stems from the fact that during the compacting phase of a garbage collection, the cost of compacting an object is directly proportional to the size of the object being compacted. Rather than having large objects on the standard heap eating up garbage collection time during compaction, the LOH was created. The LOH is best viewed as an extension of generation 2: it is only collected as part of a generation 2 collection, that is, during a full garbage collection. Because compacting large objects is very expensive, the GC avoids compacting the LOH altogether and instead uses a process known as sweeping, which maintains a free list to keep track of available memory in the LOH segment(s). Figure 5-9 shows an example of a LOH with two segments.


Figure 5-9 LOH example

Please note that although the LOH does not perform any compaction, it does do coalescing of adjacent free blocks. That is, if you ever end up with two free adjacent blocks, the GC coalesces those blocks into a larger block and adds it to the free list (while also removing the two smaller blocks).
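The view of the LOH as an extension of generation 2 is also visible from managed code. The following is a minimal sketch (the LohDemo class and its helper method are hypothetical names) that uses GC.GetGeneration to ask which generation a freshly allocated array belongs to. On the CLR, an array large enough to land on the LOH is reported as generation 2 right after allocation, whereas a small array normally starts out in generation 0.

```csharp
using System;

static class LohDemo
{
    // Allocates a byte array of the given length and returns the generation
    // the GC reports for it immediately after allocation.
    public static int GenerationOfNewArray(int length)
    {
        byte[] data = new byte[length];
        int generation = GC.GetGeneration(data);
        GC.KeepAlive(data);
        return generation;
    }

    static void Main()
    {
        // Well below the 85,000-byte threshold.
        Console.WriteLine("byte[1000]   -> generation {0}",
            GenerationOfNewArray(1000));

        // Above the threshold, so the array is allocated on the LOH.
        Console.WriteLine("byte[100000] -> generation {0}",
            GenerationOfNewArray(100000));
    }
}
```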

To find out the current state of the LOH in the debugger, we can again use the eeheap -gc command, which includes details on the LOH:

0:004> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x01fc6c18
generation 1 starts at 0x01fc100c
generation 2 starts at 0x01fc1000
ephemeral segment allocation context: none
 segment    begin allocated     size
00308030 790d8620  790f7d8c 0x0001f76c(128876)
01fc0000 01fc1000  01fc8c24 0x00007c24(31780)
Large object heap starts at 0x02fc1000
 segment    begin allocated     size
02fc0000 02fc1000  02fc3240 0x00002240(8768)
Total Size   0x295d0(169424)
------------------------------
GC Heap Size   0x295d0(169424)

The LOH section in the command output shows the starting point of the LOH as well as per-segment information: the segment, the start and end addresses of the segment, and the total size of the segment. In the preceding example, we can see that the LOH has one segment (0x02fc0000) starting at address 0x02fc1000 and ending at 0x02fc3240 with a total size of 0x00002240. The last piece of information, Total Size, is the combined size of all the segments listed, including the LOH segment. One interesting question related to the LOH is how its contents can be dumped. There are a couple of options, both of which revolve around DumpHeap command switches. The first switch of interest is the -min switch, which tells the DumpHeap command that you are only interested in objects of at least the specified size. Because we know that LOH objects are greater than or equal to 85,000 bytes in size, we can use the following command:

0:004> !DumpHeap -min 85000
 Address       MT     Size
02c53250 7912dae8    100016
total 1 objects
Statistics:
      MT    Count    TotalSize Class Name
7912dae8        1       100016 System.Byte[]

Here, we can see that there is one object of size 100016 on the LOH. You can convince yourself that the object is in fact on the LOH by looking at its address. If the address falls within the address range of one of the LOH segments, the object must be located on the LOH (with the exception of free objects, which can reside in both the small object heap (SOH) and the LOH).

The next option we have is to specify a starting address for the DumpHeap command. If we specify the starting address of the LOH, we can ask the command to dump out all objects on the LOH. The switch to use is the -startAtLowerBound switch, which takes the address as a parameter. Using the same LOH as earlier, the following command can be used:

0:004> !DumpHeap -startAtLowerBound 02c51000
 Address       MT     Size
02c51000 002a6360       16 Free
02c51010 7912d8f8     4096
02c52010 002a6360       16 Free
02c52020 7912d8f8     4096
02c53020 002a6360       16 Free
02c53030 7912d8f8      528
02c53240 002a6360       16 Free
02c53250 7912dae8   100016
02c6b900 002a6360       16 Free
total 9 objects
Statistics:
      MT    Count    TotalSize  Class Name
002a6360        5           80      Free
7912d8f8        3         8720   System.Object[]
7912dae8        1       100016   System.Byte[]
Total 9 objects

Again, we see the object of size 100016, but even more interesting is that we see objects that are smaller than 85,000 bytes on the LOH. What are these objects and how did they end up on the LOH? The answer is that these small objects are placed there by the CLR heap manager, which uses them for its own purposes. Generally speaking, you can expect to see a select few objects smaller than 85,000 bytes on the LOH that are used exclusively by the GC.

Let's take a look at a small sample application that allocates a single large object of size 100,000 bytes (see Listing 5-5). We will then use the debuggers to see if we can locate the object on the LOH and see what happens when the object is collected.

Listing 5-5. Sample application demonstrating LOH

using System;
using System.Text;
using System.Runtime.InteropServices;


namespace Advanced.NET.Debugging.Chapter5
{
    class LOH
    {
        static void Main(string[] args)
        {
            LOH l = new LOH();
            l.Run();
        }


        public void Run()
        {
            byte[] b = null;
            Console.WriteLine("Press any key to allocate on LOH");
            Console.ReadKey();


            b = new byte[100000];


            Console.WriteLine("Press any key to GC");
            Console.ReadKey();


            b = null;
            GC.Collect();


            Console.WriteLine("Press any key to exit");
            Console.ReadKey();
        }


    }
}

The source code and binary for Listing 5-5 can be found in the following folders:

  • Source code: C:\ADND\Chapter5\LOH
  • Binary: C:\ADNDBin\05LOH.exe

Let's run the application in the debugger and break execution when the Press any key to allocate on LOH prompt is displayed. At this point, we haven't yet created our big allocation, but it never hurts to take a look at the LOH to see what, if anything, is already on it:

0:004> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x01f01018
generation 1 starts at 0x01f0100c
generation 2 starts at 0x01f01000
ephemeral segment allocation context: none
 segment    begin allocated     size
004a8008 790d8620  790f7d8c 0x0001f76c(128876)
01f00000 01f01000  01f5c334 0x0005b334(373556)
Large object heap starts at 0x02f01000
 segment    begin allocated     size
02f00000 02f01000  02f03250 0x00002250(8784)
Total Size   0x7ccf0(511216)
------------------------------
GC Heap Size   0x7ccf0(511216)
0:004> !dumpheap -startatlowerbound 02f01000
 Address       MT      Size
02f01000 00496360        16 Free
02f01010 7912d8f8      4096
02f02010 00496360        16 Free
02f02020 7912d8f8      4096
02f03020 00496360        16 Free
02f03030 7912d8f8       528
02f03240 00496360        16 Free
total 7 objects
Statistics:
      MT    Count    TotalSize Class Name
00496360        4           64      Free
7912d8f8        3         8720 System.Object[]
Total 7 objects

We start by finding the starting point of the LOH by using the eeheap command. The starting point in this case is 0x02f01000. Then, we feed the starting address to the dumpheap command using the -startatlowerbound switch to output all objects on the LOH. In the output, we can see that the only objects on the LOH are the mysterious object arrays that are smaller than 85,000 bytes. Next, resume execution and again manually break execution when the Press any key to GC prompt is shown.

We issue the same dumpheap command as before to see if we can spot our 100KB allocation:

0:003> !dumpheap -startatlowerbound 02f01000
 Address       MT    Size
02f01000 00496360      16 Free
02f01010 7912d8f8    4096
02f02010 00496360      16 Free
02f02020 7912d8f8    4096
02f03020 00496360      16 Free
02f03030 7912d8f8     528
02f03240 00496360      16 Free
02f03250 7912dae8  100016
02f1b900 00496360      16 Free
total 9 objects
Statistics:
      MT    Count    TotalSize Class Name
00496360        5           80      Free
7912d8f8        3         8720 System.Object[]
7912dae8        1       100016 System.Byte[]
Total 9 objects

We can see that our allocation is stored at address 0x02f03250 on the LOH. Next, we resume execution until we see the Press any key to exit prompt. At this point, a garbage collection has occurred, so let's see what the LOH looks like by using the same dumpheap command again:

0:003> !dumpheap -startatlowerbound 02f01000
 Address       MT    Size
02f01000 00496360      16 Free
02f01010 7912d8f8    4096
02f02010 00496360      16 Free
02f02020 7912d8f8    4096
02f03020 00496360      16 Free
02f03030 7912d8f8     528
total 6 objects
Statistics:
      MT    Count    TotalSize Class Name
00496360        3           48      Free
7912d8f8        3         8720 System.Object[]

This time, we can see that the byte array at address 0x02f03250 is no longer on the LOH; the collection has reclaimed the memory it occupied.

Pinning

As we saw in the Reclaiming GC Memory section, the garbage collector employs a technique known as compaction to reduce fragmentation on the GC heap. When a compaction occurs, objects may end up moving around on the heap so that they can be placed together, thereby avoiding gaps. As part of the object move, because the address of the object changes, all references to the object are also updated. This works well assuming all references to the object are contained within the CLR, but quite often it is necessary for .NET applications to work outside the boundary of the CLR by using the interoperability services (such as platform invocation or COM interoperability). If a reference to a managed object is passed to an underlying native API, the object might be moved while the native API is reading from and/or writing to the memory, causing serious problems because the CLR cannot notify the native API of the address change. Figure 5-10 illustrates the problem.


Figure 5-10 Interoperability services and GC compaction problem

From the flow in Figure 5-10, we can see that the initial state of the managed heap includes five objects starting with Obj A at address 0x02000000. At a certain point, a platform invocation call to an asynchronous native API is required. Furthermore, the address of Obj C (0x02000090) needs to be passed to the API. Upon successfully calling the asynchronous native API, a garbage collection occurs causing Obj A and Obj B to be collected. This leaves a gap of two free objects on the managed heap and the garbage collector dutifully rectifies the problem by compacting the managed heap and therefore moving Obj C to address 0x02000000. It also coalesces the two free blocks and places them at the end of the heap. After the garbage collection has finished, the asynchronous API call we made earlier decides to write to the address initially passed to it (0x02000090), which originally held Obj C. As you can see, with the asynchronous API writing to that address, we will experience a managed heap corruption as the memory is no longer occupied by Obj C.

Because the invocation of native code is such a common task, a solution had to be devised that allowed for safe invocation in light of a compacting garbage collector. The solution is called pinning and refers to the capability to pin specific objects on the managed heap. When an object is pinned, the garbage collector will not move the object for any reason until the object is unpinned. If Obj C in Figure 5-10 had been pinned prior to invoking the asynchronous native API, the managed heap corruption would not have occurred, because the garbage collector would not have moved Obj C during the compaction phase.
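As a quick illustration of the guarantee that pinning provides, here is a minimal sketch (the PinningSketch class and its method are hypothetical names) that pins a buffer with GCHandle, forces a garbage collection, and checks that the address reported by AddrOfPinnedObject has not changed. That address is the kind of pointer that would be handed to a native API.

```csharp
using System;
using System.Runtime.InteropServices;

static class PinningSketch
{
    // Pins a managed buffer, forces a collection, and reports whether the
    // buffer is still at the same address afterwards.
    public static bool AddressStableAcrossGC()
    {
        byte[] buffer = new byte[256];
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            // This is the address a native API would read from or write to.
            IntPtr before = handle.AddrOfPinnedObject();

            // A collection, compacting or not, must leave the pinned
            // object where it is.
            GC.Collect();

            IntPtr after = handle.AddrOfPinnedObject();
            return before == after;
        }
        finally
        {
            // Unpin as soon as the native side is done with the memory.
            handle.Free();
        }
    }

    static void Main()
    {
        Console.WriteLine("Address unchanged: {0}", AddressStableAcrossGC());
    }
}
```

For an asynchronous native call such as the one in Figure 5-10, the handle would have to stay allocated until the native operation has completed, not just until the call returns. The try/finally block in the sketch only guards against leaking the pin in the synchronous case.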

Let's take a look at an example of a simple application that performs pinning and see what it looks like in the debugger. Listing 5-6 shows the source code of the application.

Listing 5-6. Sample application using pinning

using System;
using System.Text;
using System.Runtime.InteropServices;


namespace Advanced.NET.Debugging.Chapter5
{
    class Pinning
    {
        static void Main(string[] args)
        {
            Pinning p = new Pinning();
            p.Run();
        }
        public void Run()
        {
            SByte[] b1 = null;
            SByte[] b2 = null;
            SByte[] b3 = null;
            Console.WriteLine("Press any key to alloc");
            Console.ReadKey();


            b1 = new SByte[100];
            b2 = new SByte[200];
            b3 = new SByte[300];


            GCHandle h1 = GCHandle.Alloc(b1, GCHandleType.Pinned);
            GCHandle h2 = GCHandle.Alloc(b2, GCHandleType.Pinned);
            GCHandle h3 = GCHandle.Alloc(b3, GCHandleType.Pinned);


            Console.WriteLine("Press any key to GC");
            Console.ReadKey();


            GC.Collect();


            Console.WriteLine("Press any key to exit");
            Console.ReadKey();


            h1.Free(); h2.Free(); h3.Free();
        }


    }
}

The source code and binary for Listing 5-6 can be found in the following folders:

  • Source code: C:\ADND\Chapter5\Pinning
  • Binary: C:\ADNDBin\05Pinning.exe

The sample application shown in Listing 5-6 illustrates how to use the GCHandle type to pin objects. The Run method declares three arrays of the SByte type and creates GCHandles for each of the allocations specifying that the objects be pinned. The application then forces a garbage collection and exits. Let's run the application under the debugger and see if we can track the allocated memory and how it gets pinned.

Resume execution of the application until you see the Press any key to GC prompt. At this point, we manually break execution and use a command called GCHandles. The GCHandles command displays a list of all the handles available in the process:

0:004> !GCHandles
GC Handle Statistics:
Strong Handles: 15
Pinned Handles: 7
Async Pinned Handles: 0
Ref Count Handles: 0
Weak Long Handles: 0
Weak Short Handles: 1
Other Handles: 0
Statistics:
      MT    Count    TotalSize Class Name
790fd0f0        1           12 System.Object
790feba4        1           28 System.SharedStatics
790fcc48        2           48 System.Reflection.Assembly
790fe17c        1           72 System.ExecutionEngineException
790fe0e0        1           72 System.StackOverflowException
790fe044        1           72 System.OutOfMemoryException
790fed00        1          100 System.AppDomain
790fe704        2          112 System.Threading.Thread
79100a18        4          144 System.Security.PermissionSet
790fe284        2          144 System.Threading.ThreadAbortException
7912ee44        3          636 System.SByte[]
7912d8f8        4          8736 System.Object[]
Total 23 objects

The GCHandles command walks the handle tables, looks for all the different types of handles (strong, weak, pinned, etc.), and displays a summary of the results as well as a Statistics section with information on each object type found. In the preceding output, we can see that we have 15 strong handles, 7 pinned handles, and 1 weak short handle. In addition, in the Statistics section, we can see the three SByte arrays that we allocated and pinned.

The GCHandles command provides a good overview of the handle activity in any given process, but if further information is required, such as which kind of handle refers to each of the objects summarized in the Statistics section, we have to use an additional command called objsize. One of the functions of the objsize command is to output the size of the object passed in as an argument. If no arguments are specified, it scans all the referenced objects in the process and outputs the size as well as other useful information:

0:004> !objsize
Scan Thread 0 OSTHread 2558
ESP:2fed54:  sizeof(01d9599c)  =          20 (        0x14) bytes
 (Microsoft.Win32.SafeHandles.SafeFileHandle)
ESP:2fee18: sizeof(01d96d9c) =          312 (           0x138) bytes (System.SByte[])
ESP:2fee20: sizeof(01d96c58) =          112 (            0x70) bytes (System.SByte[])
ESP:2fee24: sizeof(01d96cc8) =          212 (            0xd4) bytes (System.SByte[])
ESP:2fee30: sizeof(01d958b4)  =          12 (            0xc) bytes
 (Advanced.NET.Debugging.Chapter5.Pinning)
...
...
...
Scan Thread 2 OSTHread 2c80
DOMAIN(004DFD10):HANDLE(Strong):1c119c: sizeof(01d958a4) =
           16 (         0x10) bytes (System.Object[])
...
...
...
DOMAIN(004DFD10):HANDLE(WeakSh):1c12fc: sizeof(01d91de8) =
          56 (        0x38) bytes (System.Threading.Thread)
DOMAIN(004DFD10):HANDLE(Pinned):1c13e4: sizeof(01d96d9c) =
          312 (         0x138) bytes (System.SByte[])
DOMAIN(004DFD10):HANDLE(Pinned):1c13e8: sizeof(01d96cc8) =
          212 (          0xd4) bytes (System.SByte[])
DOMAIN(004DFD10):HANDLE(Pinned):1c13ec: sizeof(01d96c58) =
          112 (        0x70) bytes (System.SByte[])
DOMAIN(004DFD10):HANDLE(Pinned):1c13f0: sizeof(02d93030) =
          708 (       0x2c4) bytes (System.Object[])
DOMAIN(004DFD10):HANDLE(Pinned):1c13f4: sizeof(02d92020) =
          4276 (       0x10b4) bytes (System.Object[])
DOMAIN(004DFD10):HANDLE(Pinned):1c13f8: sizeof(01d9118c) =
          12 (         0xc) bytes (System.Object)
DOMAIN(004DFD10):HANDLE(Pinned):1c13fc: sizeof(02d91010) =
          19332 (      0x4b84) bytes (System.Object[])

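The output has been abbreviated, but it clearly shows that our SByte arrays have been pinned, as indicated by HANDLE(Pinned). The reported sizes of 112, 212, and 312 bytes are the 100, 200, and 300 array elements plus 12 bytes of per-object overhead in this 32-bit process, and together they account for the 636 bytes that GCHandles listed for System.SByte[].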
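
The effect of pinning can also be observed from managed code, without a debugger. The following is a minimal sketch, not part of the sample application: the class name PinnedAddressSketch and the method AddressIsStableAcrossGC are hypothetical names chosen for illustration. It pins an array, records the address reported by GCHandle.AddrOfPinnedObject, forces a collection, and compares the address afterward. Because a pinned object is not moved, the two addresses are expected to be equal.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration; not part of Listing 5-6.
static class PinnedAddressSketch
{
    // Pins the buffer, forces a garbage collection, and reports whether
    // the address of the pinned buffer is the same before and after.
    public static bool AddressIsStableAcrossGC(sbyte[] buffer)
    {
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr before = handle.AddrOfPinnedObject();
            GC.Collect();
            IntPtr after = handle.AddrOfPinnedObject();
            return before == after;
        }
        finally
        {
            // Always release the handle; a leaked pinned handle keeps the
            // object immovable and contributes to fragmentation.
            handle.Free();
        }
    }

    static void Main()
    {
        Console.WriteLine(AddressIsStableAcrossGC(new sbyte[100]));
    }
}
```

Wrapping the work in try/finally is a design choice in this sketch: it keeps the pin as short as possible and guarantees the handle is freed even if the code between Alloc and Free throws.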

Although pinning solves the problem of objects moving during native code invocations, it presents a problem of its own to the garbage collector: fragmentation, which is one of the problems that compaction is meant to solve. If there are a lot of interleaved pinned objects on the managed heap, situations may occur where there isn't enough contiguous free space available. Figure 5-11 shows a hypothetical example of a fragmented managed heap due to excessive pinning.

Figure 5-11 Hypothetical example of a fragmented managed heap

In the layout illustrated in Figure 5-11, several small free blocks are interleaved with live objects (Obj A through D). If a garbage collection occurs, the layout of the managed heap remains unchanged. The reason is simple: the garbage collector cannot compact the heap because all of the live objects are pinned and hence not movable, and because the free blocks are not adjacent, it cannot coalesce them either. Even though free blocks are available, a memory allocation request may in fact fail if the requested size is greater than 32 bytes, because no single free block is large enough to hold it. We will take a look at a real-world managed heap fragmentation problem in detail later in the chapter.
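
The point that total free space is not the same as usable free space can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the free block sizes are hypothetical values (they are not taken from Figure 5-11), the names FragmentationSketch and CanAllocate are invented for this example, and the model ignores how the CLR allocator actually works, including its ability to request more memory from the operating system.

```csharp
using System;
using System.Linq;

// Toy model for illustration only; not how the CLR allocator is implemented.
static class FragmentationSketch
{
    // Hypothetical sizes, in bytes, of free blocks that are separated by
    // pinned objects and therefore cannot be merged into one larger block.
    public static readonly int[] FreeBlocks = { 16, 32, 24 };

    // A request fits only if a single free block is at least as large as it;
    // the sum of all free blocks is irrelevant when they are not adjacent.
    public static bool CanAllocate(int[] freeBlocks, int requestBytes)
    {
        return freeBlocks.Any(block => block >= requestBytes);
    }

    static void Main()
    {
        Console.WriteLine("Total free bytes: " + FreeBlocks.Sum());
        Console.WriteLine("32-byte request fits: " + CanAllocate(FreeBlocks, 32));
        Console.WriteLine("40-byte request fits: " + CanAllocate(FreeBlocks, 40));
    }
}
```

In this model 72 bytes are free in total, yet a 40-byte request cannot be placed because the largest single block is 32 bytes.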

Garbage Collection Modes

The last topic we will discuss is the set of modes that the garbage collector can run in. There are three primary modes of operation:

  • Nonconcurrent workstation
  • Concurrent workstation
  • Server

We've already discussed the general difference between server and workstation: server mode creates one heap and one GC thread per processor, and all garbage collection related activities for a heap are performed by the dedicated GC thread on the processor it is assigned to. What we haven't discussed is the notion of concurrent and nonconcurrent garbage collections. In the nonconcurrent workstation mode, the garbage collector suspends all managed threads for the entire duration of the garbage collection and resumes them only when the collection is finished. This works fine if responsiveness is not a priority, but for applications such as GUI applications, quick response times are critical. The concurrent workstation mode was introduced for those cases: during a garbage collection, the managed threads are not suspended for the entire duration but are allowed to wake up periodically and do work before being suspended again so that the garbage collector can continue. This increases the responsiveness of the application but can make the garbage collection itself slightly slower.
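
A process can inspect which mode it is running in through the System.Runtime.GCSettings class. The following is a rough sketch: the class name GcModeSketch and the method DescribeGcMode are hypothetical, and the mapping from GCLatencyMode values to the three modes named above is a simplification made for this example (Batch disables concurrency, Interactive enables it, and other latency modes are reported as-is).

```csharp
using System;
using System.Runtime;

// Hypothetical helper for illustration; the mode names mirror the list above.
static class GcModeSketch
{
    // In the .NET Framework the mode is normally selected in the application
    // configuration file, under <configuration><runtime>, for example:
    //   <gcServer enabled="true"/>        selects the server GC
    //   <gcConcurrent enabled="false"/>   disables concurrent collection
    public static string DescribeGcMode()
    {
        if (GCSettings.IsServerGC)
        {
            return "Server";
        }

        switch (GCSettings.LatencyMode)
        {
            case GCLatencyMode.Batch:
                return "Nonconcurrent workstation";
            case GCLatencyMode.Interactive:
                return "Concurrent workstation";
            default:
                // Other latency modes exist; report them without guessing.
                return "Workstation (" + GCSettings.LatencyMode + ")";
        }
    }

    static void Main()
    {
        Console.WriteLine(DescribeGcMode());
    }
}
```

The result depends on the machine and on the configuration of the host process, so no particular output should be expected.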
