Home > Articles > Programming > C#

.NET Reference Guide

Hosted by

Toggle Open Guide Table of ContentsGuide Contents

Close Table of ContentsGuide Contents

Close Table of Contents

The Cost of Locks

Last updated Mar 14, 2003.

Locks aren’t free. There are definite costs associated with adding Monitor locks to your code. There are performance costs associated with acquiring and releasing locks as well as with other threads blocking while waiting to acquire a lock. In addition, the addition of locks complicates your code and multiple locks can make your program susceptible to deadlocks if you’re not careful.

Let’s take a closer look first at the performance cost of acquiring and releasing a lock. The program below allocates a very large array (100,000,000 integers) and then times how long it takes to write a value to each item in the array—first by doing it without locks, and then by acquiring a lock on the array before each access. The program then computes the time difference and the amount of time it takes for each lock.

[C#]

class Program
{
  const int ArraySize = 100 * 1000 * 1000;
  static int[] bigArray = new int[ArraySize];

  [STAThread]
  static void Main(string[] args)
  {
    // Ensures that the array is allocated and initialized
    DoWithoutLocks();

    long nolockTicks;
    long lockTicks;
    Stopwatch sw = new Stopwatch();

    // Time a run without locks
    sw.Start();
    DoWithoutLocks();
    sw.Stop();
    nolockTicks = sw.ElapsedTicks;

    sw.Reset();

    // Time a run with locks
    sw.Start();
    DoWithLocks();
    sw.Stop();
    lockTicks = sw.ElapsedTicks;

    long noLocksMs = (nolockTicks * 1000) / Stopwatch.Frequency;
    long withLocksMs = (lockTicks * 1000) / Stopwatch.Frequency;
    Console.WriteLine("With locks  = {0} milliseconds", withLocksMs);
    Console.WriteLine("Without locks = {0} milliseconds", noLocksMs);
    Console.WriteLine("Difference  = {0} milliseconds", withLocksMs - noLocksMs);

    // Compute differences and per-lock time
    long tickDiff = lockTicks - nolockTicks;
    long msDiff = (tickDiff * 1000)/Stopwatch.Frequency;
    double lockms = (double)msDiff / ArraySize;

    Console.WriteLine("{0} locks requires {1} milliseconds", ArraySize, msDiff);
    Console.WriteLine("Lock requires {0} microseconds", 1000 * lockms);

    Console.WriteLine("Press Enter...);
    Console.ReadLine();
  }

  static void DoWithoutLocks()
  {
    for (int i = 0; i < ArraySize; i++)
    {
      bigArray[i] = i;
    }
  }

  static void DoWithLocks()
  {
    for (int i = 0; i < ArraySize; i++)
    {
      lock (bigArray)
      {
        bigArray[i] = i;
      }
    }
  }
}

[Visual Basic]

Module Module1
  Const ArraySize As Integer = 100 * 1000 * 1000
  Dim bigArray(ArraySize) As Integer

  Sub Main()
    ’ Ensures that the array is allocated and initialized
    DoWithoutLocks()

    Dim nolockTicks As Long
    Dim lockTicks As Long
    Dim sw As New Stopwatch()

    ’ Time a run without locks
    sw.Start()
    DoWithoutLocks()
    sw.Stop()
    nolockTicks = sw.ElapsedTicks

    sw.Reset()

    ’ Time a run with locks
    sw.Start()
    DoWithLocks()
    sw.Stop()
    lockTicks = sw.ElapsedTicks

    Dim noLocksMs As Long = (nolockTicks * 1000) / Stopwatch.Frequency
    Dim withLocksMs As Long = (lockTicks * 1000) / Stopwatch.Frequency
    Console.WriteLine("With locks  = {0} milliseconds", withLocksMs)
    Console.WriteLine("Without locks = {0} milliseconds", noLocksMs)
    Console.WriteLine("Difference  = {0} milliseconds", withLocksMs - noLocksMs)

    ’ Compute differences and per-lock time
    Dim tickDiff As Long = lockTicks - nolockTicks
    Dim msDiff As Long = (tickDiff * 1000) / Stopwatch.Frequency
    Dim lockms As Double = msDiff / ArraySize

    Console.WriteLine("{0} locks requires {1} milliseconds", ArraySize, msDiff)
    Console.WriteLine("Lock requires {0} microseconds", 1000 * lockms)

    Console.WriteLine("Press Enter...)
    Console.ReadLine()

  End Sub

  Sub DoWithoutLocks()
    Dim i As Integer
    For i = 0 To ArraySize - 1
      bigArray(i) = i
    Next
  End Sub

  Sub DoWithLocks()
    Dim i As Integer
    For i = 0 To ArraySize - 1
      SyncLock (bigArray)
        bigArray(i) = i
      End SyncLock
    Next
  End Sub
End Module

Here’s the output from a sample run of the C# version:

With locks  = 5734 milliseconds
Without locks = 724 milliseconds
Difference  = 5010 milliseconds
100000000 locks requires 5009 milliseconds
Lock requires 0.05009 microseconds

Many runs of that program produce approximately the same values: about 0.050 microseconds per lock in the C# version, and about 0.051 microseconds per lock in the Visual Basic version. Both are compiled in Release mode.

So a lock takes 50 nanoseconds (0.05 microseconds) on my machine: a 2.4 GHz Pentium Core 2 Duo. 50 nanoseconds doesn’t seem like much by itself, but performing 100 million locks takes five full seconds. 100 million might seem like a lot, but it’s not really a huge number when you’re talking about multiple threads continually accessing a central data structure.

The above numbers were gathered by having just one thread access the data structure so that we could see how much time it takes to acquire and release a lock. But we don’t use locks in single-threaded programs. The real cost of a lock is how much it affects throughput when multiple threads are trying to access the data structure. Calculating that takes a few changes to the test program.

In particular, we need to spawn two threads that concurrently go through the loop, and the main program has to wait for both threads to complete before continuing. This is a perfect application in which to use the asynchronous programming model.

Using the APM doesn’t require very many modifications to the code. First, we have to define a delegate that we can call asynchronously. Insert this line of code just below the declaration of the bigArray:

[C#]

delegate void TestDelegate();

[Visual Basic]

Delegate Sub TestDelegate()

After that, it’s a simple matter of replacing the code that calls the test function with code that invokes two asynchronous delegates and waits on their completion. Comment out the calls to DoWithoutLocks and DoWithLocks, and replace those lines with this code:

[C#]

// Replace the call to DoWithoutLocks with this code
TestDelegate t1 = new TestDelegate(DoWithoutLocks);
TestDelegate t2 = new TestDelegate(DoWithoutLocks);
IAsyncResult ia1 = t1.BeginInvoke(null, null);
IAsyncResult ia2 = t2.BeginInvoke(null, null);
t1.EndInvoke(ia1);
t2.EndInvoke(ia2);

// Replace the call to DoWithLocks with this code
t1 = new TestDelegate(DoWithLocks);
t2 = new TestDelegate(DoWithLocks);
ia1 = t1.BeginInvoke(null, null);
ia2 = t2.BeginInvoke(null, null);
t1.EndInvoke(ia1);
t2.EndInvoke(ia2);

[Visual Basic]

’ Replace the call to DoWithoutLocks with this code
Dim t1 As New TestDelegate(AddressOf DoWithoutLocks)
Dim t2 As New TestDelegate(AddressOf DoWithoutLocks)
Dim ia1 As IAsyncResult = t1.BeginInvoke(Nothing, Nothing)
Dim ia2 As IAsyncResult = t2.BeginInvoke(Nothing, Nothing)
t1.EndInvoke(ia1)
t2.EndInvoke(ia2)

’ Replace the call to DoWithLocks with this code
t1 = New TestDelegate(AddressOf DoWithLocks)
t2 = New TestDelegate(AddressOf DoWithLocks)
ia1 = t1.BeginInvoke(Nothing, Nothing)
ia2 = t2.BeginInvoke(Nothing, Nothing)
t1.EndInvoke(ia1)
t2.EndInvoke(ia2)

A sample run of the C# version provides this output:

With locks  = 13081 milliseconds
Without locks = 741 milliseconds
Difference  = 12340 milliseconds
100000000 locks requires 12340 milliseconds
Lock requires 0.1234 microseconds

Again, the Visual Basic program was very slightly slower.

Since I have a dual core machine, the "without locks" time for two threads is almost identical to the "without locks" time for a single thread: 741 milliseconds as opposed to 724 milliseconds. The "with locks" time, however, is quite different, as is the per-lock time, which increased to 120 nanoseconds. Here, we’re measuring not only the time required to create the locking object, but also the time that a thread spends waiting for the other thread to release the lock.

Things look even worse with three threads. The "without locks" time increases to about 1,700 milliseconds because with three threads and only two cores, there’s a lot of thread context switching going on. The "with locks" times are about what you’d expect: 180 nanoseconds per lock.

Granted, this is a bit of an extreme case. If all you were doing was processing an array or list, then it doesn’t make sense to use multiple threads. In a real program, some threads will be waiting on I/O or computing results before going to the central data structure. Still, with many threads running it’s quite likely that one or more will be waiting for a lock.

As you can see from the example above, locks can be very expensive if they’re used improperly. In cases where updates to the central data structure take a long time, it’s often better to have threads post update requests to a queue that a separate thread then processes. Updates are delayed a bit, but the individual worker threads spend very little time blocked: just the amount of time it takes to lock the queue and enqueue an item.

Using Monitor locks is the easiest way to synchronize access to shared data, and it’s very effective. However, the exclusive nature of the lock carries with it some significant drawbacks. The lock can become a bottleneck that makes your multi-threaded program perform worse than an equivalent single-threaded program. Fortunately, there are other synchronization primitives that go a long way towards alleviating those problems.