Home > Articles > Programming > C#

.NET Reference Guide

Hosted by

Toggle Open Guide Table of ContentsGuide Contents

Close Table of ContentsGuide Contents

Close Table of Contents

A Generic BinaryHeap Class

Last updated Mar 14, 2003.

In Sorting a Large Text File, I implemented a binary heap to speed the process of selecting the smallest item from a group of items. In the article, I implemented a binary heap of integers and, later, a binary heap of strings. A binary heap, however, is a handy data structure to have, though, for a number of different applications. It should be easily instantiated to work with any data type. That is, it's a perfect candidate for a generic implementation.

You may want to re-familiarize yourself with the heap data structure and binary heap in particular before continuing.

The goal of this project is to create a generic BinaryHeap class that implements minimal useful functionality.

The interface

In order to be useful a heap has to support two operations: insert an item, and remove the root item. The addition of a constructor or two will give us a minimal, but functional, binary heap. The minimal class interface would be:

public class BinaryHeap<T>
{
    public BinaryHeap(IComparer<T> comp);
    public Insert(T item);
    public T RemoveRoot();
}

The IComparer interface passed to the constructor contains the comparison function that's used to order items in the heap. The only difference between a min-heap (where items are sorted in ascending order and the root item is the minimum item on the heap) and a max-heap (descending order, maximum item at the root) is the comparison function.

I've elected to add a default constructor for the common case of wanting a min-heap that uses the default comparison for a given type. That is:

public BinaryHeap()
 : this(Comparer<T>.Default)
{
}

That allows you to write:

var heap = new BinaryHeap<int>();

Rather than having to write:

var heap = new BinaryHeap<int>(Comparer<int>.Default);

I've also implemented the IEnumerable<T> interface, and the following methods and properties, which make the BinaryHeap a bit easier to work with:

  • The Count property simply returns the number of items currently in the collection.
  • The Clear method removes all items from the collection.
  • The TrimExcess method helps reduce memory usage by decreasing the capacity of the collection to the current Count, if that value is less than 90% of the current capacity. This can save a lot of memory if you have a mostly empty collection. It's common to follow a call to Clear with a call to TrimExcess, which will reset the collection to the default number of items.
  • The Peek method lets you examine the root item, without actually removing it from the heap.

The code

Implementation of the generic BinaryHeap class turns out to be a relatively small amount of code: about 120 lines. A little less than half of that code manages the heap itself (the Insert and RemoveRoot methods). The rest of the code is comments, the constructors, and the convenience methods mentioned above.

public class BinaryHeap<T> : IEnumerable<T>
{
    private IComparer<T> Comparer;
    private List<T> Items = new List<T>();
    public BinaryHeap()
        : this(Comparer<T>.Default)
    {
    }
    public BinaryHeap(IComparer<T> comp)
    {
        Comparer = comp;
    }
    /// <summary>

    /// Get a count of the number of items in the collection.
    /// </summary>
    public int Count
    {
        get { return Items.Count; }
    }
    /// <summary>
    /// Removes all items from the collection.
    /// </summary>
    public void Clear()
    {
        Items.Clear();
    }
    /// <summary>
    /// Sets the capacity to the actual number of elements in the BinaryHeap,
    /// if that number is less than a threshold value.
    /// </summary>

    /// <remarks>
    /// The current threshold value is 90% (.NET 3.5), but might change in a future release.
    /// </remarks>
    public void TrimExcess()
    {
        Items.TrimExcess();
    }
    /// <summary>
    /// Inserts an item onto the heap.
    /// </summary>
    /// <param name="newItem">The item to be inserted.</param>

    public void Insert(T newItem)
    {
        int i = Count;
        Items.Add(newItem);
        while (i > 0 && Comparer.Compare(Items[(i - 1) / 2], newItem) > 0)
        {
            Items[i] = Items[(i - 1) / 2];
            i = (i - 1) / 2;
        }
        Items[i] = newItem;
    }
    /// <summary>
    /// Return the root item from the collection, without removing it.
    /// </summary>
    /// <returns>Returns the item at the root of the heap.</returns>

    public T Peek()
    {
        if (Items.Count == 0)
        {
            throw new InvalidOperationException("The heap is empty.");
        }
        return Items[0];
    }
    /// <summary>
    /// Removes and returns the root item from the collection.
    /// </summary>
    /// <returns>Returns the item at the root of the heap.</returns>
    public T RemoveRoot()
    {
        if (Items.Count == 0)
        {
            throw new InvalidOperationException("The heap is empty.");
        }
        // Get the first item
        T rslt = Items[0];
        // Get the last item and bubble it down.
        T tmp = Items[Items.Count - 1];
        Items.RemoveAt(Items.Count - 1);
        if (Items.Count > 0)
        {
            int i = 0;
            while (i < Items.Count / 2)
            {
                int j = (2 * i) + 1;
                if ((j < Items.Count - 1) && (Comparer.Compare(Items[j], Items[j + 1]) > 0))
                {
                    ++j;
                }
                if (Comparer.Compare(Items[j], tmp) >= 0)
                {
                    break;
                }
                Items[i] = Items[j];
                i = j;
            }
            Items[i] = tmp;
        }
        return rslt;
    }
    IEnumerator<T> IEnumerable<T>.GetEnumerator()
    {
        foreach (var i in Items)
        {
            yield return i;
        }
    }
    public IEnumerator GetEnumerator()
    {
        return GetEnumerator();
    }
}

Final thoughts

Minimal as it is, this binary heap implementation is quite useful for a number of different things. It can serve as the intermediate data structure for a large file sort, and you can also use it to implement a priority queue. You can also use a heap to select, in linear time, the top N elements in an unordered list. Also, many graph algorithms rely on heaps to provide better performance.

This implementation could benefit from a few improvements. As written, the is no way to remove an arbitrary item from the heap. If the item isn't the root, you can't access it. It would be useful to have a Remove(T item) function that will find and remove an item, while maintaining the heap property. Also, it would be useful to implement a ChangeKey method that allows you to change an item's key and re-order the heap appropriately. It might also be useful to have a Merge method that combines two heaps into one. Merge could be implemented as an extension method.

The implementation above is a classic min-heap. There is also something called a min-max heap, which isn't actually a heap in that the tree doesn't meet the requirements of the heap property, but it acts like a binary heap that gives you very fast access to the lowest and highest items, and provides performance that is very close to that of min-heap.

Finally, there are many different ways to implement a heap. A binary heap is convenient because it provides reasonably good performance overall while being small and easy to implement. Other implementations have their strengths and weaknesses: faster for some operations and slower for others, and are variously more difficult to implement. If your application is highly dependent on the performance of a particular heap operation, you might be interested in investigating one of the many heap variants.