Exploring the CLR

Garbage Collection

Dynamically allocated memory is the bane of most programmers' existence. Although you need heap memory to write most non-trivial applications, managing this memory correctly is an error-prone nightmare. Only the most careful and diligent programmers get it right, and failure to do so results in applications that leak memory (because the programmer forgot to free memory that is no longer being used) or that crash sporadically (because the programmer deallocated memory that the application still needs). These two bugs are the two biggest time-wasters in software development, both for programmers and for testers, who spend countless hours finding and documenting them. They also take a huge toll on the productivity of end users, who waste countless hours dealing with the results: brittle software that crashes or gradually eats up all the memory on a client until it or some other application crashes.

The basic idea of garbage collection is that programmers should be able to allocate as much memory as they see fit (within reason of course), and, when they are not using a block of memory, the system should simply reclaim it. In this model, there is no need for delete, dealloc, or release functions.

The managed heap in the CLR uses garbage collection to automatically free memory that is no longer being used. As a programmer using the .NET Framework, you don't have to do anything special to use the managed heap. You simply allocate instances of objects from the managed heap using the new operator in your language of choice. When a request for more memory cannot be satisfied with the available memory, the CLR's garbage collection algorithm will run (or you can explicitly run the garbage collection algorithm by calling the Collect method on the System.GC class). The garbage collector figures out which blocks of memory are no longer being used by your application, frees that memory, and compacts the used memory into a contiguous block. The rest of this section explains how the .NET Framework's garbage collection algorithm works.

The first premise that you must accept before you can understand garbage collection is that, in order for a block of memory to be used (now or in the future) by an application, that memory must be reachable through a pointer/reference, that is, at least one pointer/reference must point to it. If there are no pointers/references pointing to a block of memory, it can no longer be used by an application. COM took advantage of this fact to implement its life cycle management scheme. With COM, each object was responsible for maintaining a count of the references that currently point to it by implementing the AddRef and Release methods in the IUnknown interface. Consumers of a COM object use the AddRef method in IUnknown to increment the reference count and the Release method to decrement the reference count. The object is supposed to delete itself when its reference count goes to zero. The problem with this approach is that it is manual. In order for COM reference counting to work correctly, component developers must implement the IUnknown interface correctly, and consumers of those components must use the interface correctly.

Garbage collection takes us error-prone developers out of the memory management process. The CLR determines if a block of memory can still be used or not by first assuming that all memory is garbage. It then starts at the roots of the application and builds a graph of all the objects that are reachable from the roots. The roots of an application include static and global pointers, local variables, method parameters on the stack, and even CPU registers. If an object is not part of this graph, it is unreachable from any reference/pointer within the application and is therefore garbage. The garbage collector then compacts all the nongarbage objects by shifting them down in memory using the memcpy function so that no gaps are in the heap.

This process is best illustrated by Figure 3–15. In the scenario illustrated by this picture, a reference to Object 1 is currently loaded in a CPU register, and Object 1 contains a pointer to Object 7. Objects 3 and 4 are pointed to by stack pointers (either parameters to a method or local variables of a method). Object 5 is referenced by a static object pointer. Object 5 also contains a pointer to Object 3. Objects 2 and 6 are not currently pointed to by any of the roots or by any objects reachable from the roots, so they are garbage. The garbage collector will start at the roots and build a graph, and Objects 2 and 6 will not be in that graph. The garbage collector will then remove all the gaps in memory and set the next object pointer, which contains the address of the next available block of memory, to the address just after Object 7, as shown in Figure 3–16.

Figure 3–15 The heap prior to garbage collection.


Figure 3–16 After garbage collection.
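The graph-building step in this scenario can be sketched in a few lines of C#. This is only a toy model of the heap in Figure 3–15, not how the CLR is implemented: the MarkSketch class, its Mark method, and the use of integer ids to stand in for objects are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the heap in Figure 3-15; not the CLR's implementation.
public static class MarkSketch
{
    // Returns the ids of all objects reachable from the roots.
    // Anything not in the returned set is garbage.
    public static SortedSet<int> Mark(
        IEnumerable<int> roots,
        IDictionary<int, int[]> references)
    {
        var reachable = new SortedSet<int>();
        var pending = new Stack<int>(roots);
        while (pending.Count > 0)
        {
            int id = pending.Pop();
            if (!reachable.Add(id))
                continue; // already visited

            int[] targets;
            if (references.TryGetValue(id, out targets))
                foreach (int target in targets)
                    pending.Push(target);
        }
        return reachable;
    }

    // The scenario from the figure: Object 1 points to Object 7,
    // Object 5 points to Object 3, and the roots are 1, 3, 4, and 5.
    public static SortedSet<int> FigureScenario()
    {
        var references = new Dictionary<int, int[]>
        {
            { 1, new[] { 7 } },
            { 5, new[] { 3 } }
        };
        return Mark(new[] { 1, 3, 4, 5 }, references);
    }
}
```

Calling MarkSketch.FigureScenario() yields the surviving objects in address order, which is also the order in which a compacting collector would slide them down in memory. Objects 2 and 6 never appear because nothing reachable refers to them.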


For performance reasons, the garbage collector may elect not to compact memory if most of the objects survive the collection. The CLR also maintains, for performance reasons, a separate managed heap for large objects. Objects in this heap are garbage collected like the regular heap, but, to avoid copying large objects, the CLR does not compact this heap.

So far so good, but there's actually a lot more to the garbage collection algorithm than this simple explanation. First, many objects contain cleanup logic that must be run when the object is destroyed. For instance, if a business object holds a connection to a database, you may want the connection to be closed when the object is destroyed. Most object-oriented programming languages support the notion of a destructor, which is a method that gets called automatically when the object is destroyed. Cleanup logic, such as closing a database connection or freeing any other resource used by the object, is usually placed in this method. The cleanup method in the .NET Framework is called Finalize, and the process is called Finalization. Curiously, the C# language uses the same destructor syntax as C++, and it also refers to its cleanup method as a destructor. C# destructors also will automatically call the destructor of their base class. The Finalize method in other .NET programming languages does not do this. The destructor for a C# class is declared as follows:

public class Manager : Employee
{
  public Manager(int id,string name,decimal salary,
    decimal bonus) : base(id,name,salary)
  {
    this.mBonus=bonus;
  }
  public override decimal GetSalary()
  {
    return base.GetSalary()+mBonus;
  }
  ~Manager()
  {
    // This destructor will also call the destructor 
    // in its base class.
    MessageBox.Show(
      "Finalize method called in Manager.");
  }
  private decimal mBonus;
}

Therefore, in this case, where I have a Manager class that inherits from an Employee class, the destructor for the Employee class will be called immediately after the destructor for the Manager class.

You can use this new knowledge of MSIL and ildasm to see what is really going on behind the scenes when you create a destructor. Here is the (slightly simplified) MSIL code for the destructor in the Manager class:

void Finalize() 
{
 .try
 {
  IL_0000: ldstr  "Finalize method called in Manager."
  IL_0005: call  MessageBox::Show(string)
  IL_000a: pop
  IL_000b: leave.s  IL_0014
 } // end .try
 finally
 {
  IL_000d: ldarg.0
  IL_000e: call  gctest.Employee::Finalize()
  IL_0013: endfinally
 } // end handler
 IL_0014: ret
} // end of method Manager::Finalize

Notice that the method is actually called Finalize in the generated MSIL code. The logic in the Manager destructor displays a message box, and then it calls the Finalize method in the Employee base class of the Manager class. The base class call sits in a finally block, so it runs even if the body of the destructor throws an exception. In beta 1 and 2 of the .NET Framework, you had to override the Finalize method in the System.Object class to implement a cleanup method in C#. The destructor syntax and terminology is unique to the release version of C#. Visual Basic .NET still uses a Finalize method in the release version of the .NET Framework. The following code shows how you would implement a Finalize method in Visual Basic .NET:

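Public Class Class1
  Sub New()
    MessageBox.Show("Constructor called")
  End Sub
  Protected Overrides Sub Finalize()
    Try
      MessageBox.Show("Destructor called")
    Finally
      ' Unlike a C# destructor, Finalize in Visual Basic .NET
      ' does not call the base class Finalize automatically.
      MyBase.Finalize()
    End Try
  End Sub
End Class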

NOTE

Even though Microsoft in the release version of .NET decided to use the term destructor with C#, I prefer a term that I saw in the .NET Framework SDK docs, finalize destructor, and this is the term that I will use throughout the rest of this chapter.

In order to implement finalize destructors, the CLR maintains a pair of queues called the Finalization queue and the Freachable queue. The Finalization queue contains a list of all nongarbage objects that have finalize destructors. The Freachable queue contains a list of garbage objects that are waiting for a special runtime thread to execute their finalize destructors. When an object that has a finalize destructor is instantiated, a pointer to that object is inserted into the Finalization queue; this indicates to the CLR that the object will require finalization when it is destroyed. If the garbage collector determines that such an object is garbage, the pointer to the object is removed from the Finalization queue and appended to the Freachable queue; this indicates that the object is no longer being used and is waiting for the special thread to run its finalize destructor. The CLR does not run the finalize destructor immediately because poorly written finalize destructors may take a long time to execute, which would make the garbage collection itself take an unacceptably long time.

An object that has a finalize destructor therefore survives a garbage collection in a sort of zombie state, even though the garbage collector has determined that the object is garbage. After the garbage collector runs, the object is referenced by the Freachable queue, which is considered to be a root. At some point, a special thread in the CLR wakes up and starts calling the finalize destructors on all of the objects in the Freachable queue. After this thread calls the finalize destructor on an object, it removes the reference to the object from the Freachable queue. Now the object is truly garbage, and, the next time the garbage collector runs, the memory occupied by the object will be reclaimed.
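The following sketch makes the sequence described above observable from code. The Tracked and FinalizationDemo classes are hypothetical names invented for this illustration; GC.Collect and GC.WaitForPendingFinalizers are real methods on the System.GC class. The sketch assumes that the runtime tracks references precisely, so that the object really is unreachable once the Allocate method returns.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical class with a finalize destructor that counts its runs.
public class Tracked
{
    public static int FinalizedCount;

    ~Tracked()
    {
        // Runs on the CLR's finalizer thread, not on the thread
        // that created the object.
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class FinalizationDemo
{
    // Allocate in a separate, non-inlined method so that no local
    // variable in the caller keeps the object reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Allocate()
    {
        new Tracked();
    }

    // Returns how many finalize destructors ran during this call.
    public static int Run()
    {
        int before = Volatile.Read(ref Tracked.FinalizedCount);
        Allocate();
        GC.Collect();                  // object moves to the Freachable queue
        GC.WaitForPendingFinalizers(); // finalizer thread drains that queue
        GC.Collect();                  // only now can the memory be reclaimed
        return Volatile.Read(ref Tracked.FinalizedCount) - before;
    }
}
```

The two calls to GC.Collect mirror the two collections that an object with a finalize destructor needs. Forcing collections like this is useful for experiments, but it is not something you would normally do in production code.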

There are a few key points to glean from this explanation: (1) Finalize destructors are expensive. An object that has a finalize destructor requires two garbage collections before its memory can be reclaimed. Therefore, think carefully before you add one to your classes. (2) You should not make any assumptions about the thread that your finalize destructor will run on. It will run on a dedicated thread provided by the CLR, so you need to avoid accessing thread-local resources in a finalize destructor. (3) The time at which a finalize destructor runs is nondeterministic. The CLR will not call the finalize destructor on an object until (a) the garbage collector runs, (b) the object is determined to be garbage, and (c) the special thread assigned to executing finalize destructors gets to the object. The finalize destructor may run at any time between the removal of the last reference to the object and application shutdown.

This is totally different from what most developers are used to. With languages like C++, the destructor is called for a stack object as soon as it goes out of scope; the destructor is called for a heap object when you use the "delete" operator on the object. Because you cannot know when a finalize destructor will run, it is unwise to leave the cleanup or reclamation of scarce resources to a finalize destructor. For instance, in most cases in the .NET Framework, it is a bad idea to close a database connection in a finalize destructor, even though this is a common thing that people (myself included) did in C++. If your object uses scarce resources like database connections, you should instead put the logic to close the database connection in a Dispose or Close method. By convention, you should use a Close method if the object may be used again after the call to the Close method. You should use Dispose if the object will not be used again.

The System namespace contains an IDisposable interface that declares a Dispose method, and you should implement this interface to provide your Dispose method. The recommended semantics for this method are as follows: A Dispose method should release all resources owned by the object on which it is called. It should also remove the object from the Finalization queue so that its destructor will not be called. Therefore, if you had an Employee class that used a database connection, you would implement the IDisposable interface as follows:

using System;

public class Employee : IDisposable
{
  public Employee(int id,string name,decimal salary)
  {
    this.mName=name;
    this.mID=id;
    this.mSalary=salary;
  }
  public int ID
  {
    get { return mID; }
    set { mID=value; }
  }
  public virtual void Dispose()
  {
    Dispose(true);
    // Cleanup is done, so remove this object from
    // the Finalization queue.
    GC.SuppressFinalize(this);
  }
  protected virtual void Dispose(bool disposing)
  {
    if (!disposed)
    {
      if (disposing)
      {
        // Called from Dispose(): clean up managed
        // resources here.
      }
      // Always release unmanaged resources:
      // close the database connection here...
    }
    disposed = true;
  }
  public string Name
  {
    get { return mName; }
    set { mName=value; }
  }
  public virtual decimal GetSalary()
  {
    return mSalary;
  }
  ~Employee()
  {
    Dispose(false);
  }
  private string mName;
  private int mID;
  private decimal mSalary;
  private bool disposed = false;
}

This code shows the recommended design pattern for implementing the IDisposable interface. There are a number of reasons why this code is so complicated. First, remember that once you implement the IDisposable interface, your class must be able to handle two different cleanup scenarios. One is where the finalize destructor is called by the garbage collector. In this case you need to close the database connection (or free any other unmanaged, that is, non-garbage-collected, resources), but there is no need to clean up managed resources because the garbage collector will do that for you. The other scenario is where the user has explicitly called the Dispose method. In this scenario you should close the database connection (or free any other unmanaged resources) and also clean up any managed resources, if necessary.

Microsoft recommends that you put both the managed and unmanaged cleanup logic in a protected, virtual method called Dispose that takes a Boolean parameter; this method is an overload of the Dispose method from the IDisposable interface. If you call this method with "true" specified for the parameter, it should clean up both the managed and unmanaged resources; if you pass in "false", it should clean up just the unmanaged resources. You should call this method with "false" from the destructor (the garbage collector will handle the cleanup of managed resources). You should call it with "true" from the IDisposable.Dispose method, because the call there is not made within the context of a garbage collection. The IDisposable.Dispose method should also call the SuppressFinalize method on the GC class. The SuppressFinalize method removes the object from the Finalization queue, so its finalize destructor will not be called. The finalization call is no longer necessary because the object has already been disposed.
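From the consumer's side, the usual way to call Dispose in C# is the using statement, which calls Dispose when the block exits, even if an exception is thrown. The sketch below applies the same pattern to a hypothetical TrackedResource class (the class, its ReleaseCount and IsDisposed members, and the DisposeDemo helper are assumptions made for illustration) and shows why the disposed flag matters: a second call to Dispose must be harmless.

```csharp
using System;

// Hypothetical resource holder used only for illustration.
public class TrackedResource : IDisposable
{
    private bool disposed = false;

    // Counts how many times the cleanup logic actually ran.
    public int ReleaseCount { get; private set; }
    public bool IsDisposed { get { return disposed; } }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed)
            return; // already cleaned up; do nothing

        if (disposing)
        {
            // Clean up managed resources here.
        }
        // Release unmanaged resources here.
        ReleaseCount++;
        disposed = true;
    }

    ~TrackedResource()
    {
        Dispose(false);
    }
}

public static class DisposeDemo
{
    public static TrackedResource UseAndDispose()
    {
        TrackedResource resource = new TrackedResource();
        using (resource)
        {
            // Work with the resource...
        } // Dispose() is called here.

        resource.Dispose(); // a second call has no further effect
        return resource;
    }
}
```

After DisposeDemo.UseAndDispose() returns, the cleanup logic has run exactly once, and because of the SuppressFinalize call the finalize destructor will not run it again.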

The GC class in the System namespace contains methods for interacting with the garbage collector, including methods for removing an object from the Finalization queue (SuppressFinalize) and re-adding an object to the Finalization queue (ReRegisterForFinalize). You will typically only call ReRegisterForFinalize if you decide to resurrect an object during its Finalize method. Remember I mentioned that an object with a finalize destructor will exist in a zombie state after the garbage collector has determined that it is garbage. The garbage collector will remove the object's entry from the Finalization queue and add it to the Freachable queue. When the runtime thread in the CLR executes the finalize destructor, the destructor could resurrect the object by assigning the object's "this" pointer to a global or static variable. The garbage collector will not collect the object the next time it runs because the object will be reachable from a root. Of course, the object is in a weird state now because its finalize destructor has been called. Even if you reinitialize the object's state, you still have a problem: because its entry has been removed from the Finalization queue, its finalize destructor will not be called again. You can remedy this situation by calling ReRegisterForFinalize, which adds the object's entry back to the Finalization queue. In almost all cases, resurrecting an object like this is a bad idea, and it should be avoided.

The CLR determines when to run the garbage collector, but the GC class contains a method called Collect that lets you explicitly start a garbage collection at a particular time. There are two forms of this method. The first takes no parameters:

GC.Collect();

The other form of the method takes an integer parameter, which is the highest generation that you want to collect (generation 0 through the specified generation are collected):

int gen=0;
GC.Collect(gen);

NOTE

The .NET garbage collector is highly optimized and in most cases you are better off letting it decide when to perform a garbage collection rather than trying to do it manually.

Generations are a technique that the garbage collector uses to optimize its speed. The basic ideas underlying generations are that (1) it is faster to compact a portion of the managed heap than the entire heap, (2) newer objects tend to have shorter lifetimes, and (3) older objects tend to have longer lifetimes. Of course, these three points aren't always true, but research has found them to be true for most applications. To take advantage of these ideas, the garbage collector in the CLR places all new objects in generation 0. When the garbage collector runs, any objects that survive the collection are promoted to generation 1. Any new objects that are created after the garbage collection go into generation 0. When another garbage collection needs to occur, the garbage collector has two choices: (1) it can collect only generation 0, or (2) it can collect generations 0 and 1 (actually, there are three choices because there is also a generation 2, which I will talk about shortly). In most circumstances, the garbage collector will only attempt to collect generation 0. The exact algorithm that the garbage collector uses to decide whether to collect only generation 0 or generations 0 and 1 is an implementation detail, but, in general, the garbage collector will only collect generation 1 if a collection of generation 0 does not free up enough memory to satisfy a memory allocation request. Any generation 1 objects that survive a collection of generations 0 and 1 are promoted to generation 2. There are again more heuristics built into the garbage collector that determine when it will run a collection on all three generations. Any objects that survive a collection of all three generations remain in generation 2 because the garbage collector currently only supports three generations (0, 1, and 2). Microsoft does seem to be leaving the door open to supporting more generations in the future, because the GC class in the System namespace has a MaxGeneration property that you can use to determine the highest generation number. This property currently returns 2, but this may change in the future.
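You can watch promotion happen with the GetGeneration method and the MaxGeneration property of the GC class, both of which are real members of System.GC. The GenerationDemo class below is a hypothetical helper written for this illustration. It deliberately reports whatever the runtime returns rather than assuming particular generation numbers, because the promotion policy is up to the garbage collector.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of one object at creation, after a first
    // forced collection, and after a second forced collection, followed
    // by the highest generation the runtime supports.
    public static int[] Observe()
    {
        object o = new object();
        int atCreation = GC.GetGeneration(o);

        GC.Collect();
        int afterFirst = GC.GetGeneration(o);

        GC.Collect();
        int afterSecond = GC.GetGeneration(o);

        GC.KeepAlive(o); // keep o reachable until this point
        return new[] { atCreation, afterFirst, afterSecond, GC.MaxGeneration };
    }
}
```

On the runtimes I would expect most readers to use, a small object starts in generation 0 and moves up one generation each time it survives a collection until it reaches MaxGeneration, but treat that as an observation about current behavior rather than a guarantee.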

The last garbage collector-related topic that I will discuss is weak references. Weak references give you a way to maintain a reference to an object while still allowing the garbage collector to collect the object if a collection occurs. Normal references are called strong references because, if the garbage collector runs while you are holding a strong reference to an object, the object will not be collected. In order to use the object pointed to by a weak reference, you must first obtain a strong reference from the weak reference. If the garbage collector has collected the object, the conversion from a weak reference to a strong reference will fail, so you will have to re-create the object. Weak references are good for objects that take up a lot of memory but that can be re-created when needed. A good example is a directory tree for a file system. A directory tree can be extremely large and may take a lot of time to re-create. For performance reasons, you may like to keep this tree in memory, but requiring the system to keep it in memory puts a lot of memory pressure on your system. With a weak reference, you can keep the directory tree in memory but still allow the garbage collector to reclaim the memory used by the tree if it needs to. Let's look at some code that should make this much clearer.

You can create a weak reference on an object using the code shown in the cmdCreateWeak_Click method that follows. Notice that I first check to see that the object has not been collected using the IsAlive property on the WeakReference class before I attempt to use the Manager object.

public class Form1 : System.Windows.Forms.Form
{
  private WeakReference wkRef;
  // Other code omitted from this class.
  //
  private void cmdCreateWeak_Click(object sender,
    System.EventArgs e)
  {
    Manager mgr=new Manager(1,"Alan Gordon",500,100);
    wkRef=new WeakReference(mgr);
  } 
  private void cmdUseWeak_Click(object sender,
    System.EventArgs e)
  {
    // Take a strong reference before using the object.
    // Target is null if the object has been collected.
    Manager aManager=null;
    if (wkRef != null && wkRef.IsAlive)
      aManager=wkRef.Target as Manager;
    if (aManager != null)
    {
      MessageBox.Show("The object is alive");
      // Use the manager object
      // 
    }
    else
      MessageBox.Show(
        "The manager has been collected");
  }
}
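The re-create-on-demand idea from the directory tree example can be wrapped in a small helper. The WeakCache class below is a hypothetical sketch, not part of the .NET Framework; it relies only on the WeakReference class and its Target property. The important detail is that it copies Target into a local variable first, because that local is a strong reference that keeps the object alive while you use it.

```csharp
using System;

// Hypothetical helper: caches an expensive object behind a weak reference
// and re-creates it if the garbage collector has reclaimed it.
public class WeakCache
{
    private WeakReference weakRef;          // null until the first call to Get
    private readonly Func<object> factory;  // builds the expensive object

    // Counts how many times the object had to be (re-)created.
    public int CreateCount { get; private set; }

    public WeakCache(Func<object> factory)
    {
        this.factory = factory;
    }

    public object Get()
    {
        // Obtain a strong reference; Target is null once the object
        // has been collected.
        object target = (weakRef != null) ? weakRef.Target : null;
        if (target == null)
        {
            target = factory();
            CreateCount++;
            weakRef = new WeakReference(target);
        }
        return target;
    }
}
```

A caller would construct it with something like new WeakCache(() => BuildDirectoryTree()), where BuildDirectoryTree is a placeholder for whatever expensive construction you need. As long as some caller still holds the object returned by Get, later calls return that same object; once all strong references are gone, the next call after a collection rebuilds it.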

You can specify a Boolean trackResurrection parameter in the WeakReference constructor as follows:

wkRef=new WeakReference(mgr,true);

If you specify false for this second parameter (the default), the WeakReference will not track the underlying object (that is, the IsAlive property will return false) after its finalize destructor has run. This is called a short weak reference. If you specify true for the trackResurrection parameter, the WeakReference will continue to track the object while it exists in the zombie state after its finalize destructor has run, but before a second garbage collection has finished off the object. This is called a long weak reference. Essentially, the trackResurrection parameter determines whether you can use the WeakReference to resurrect an object whose Finalize method has been run.
