Why We Dispose Things

Pop quiz: Why do we use IDisposable?

If you said something like “To allow us to clean up unmanaged resources”, I have good news: most other people make the same mistake.

The correct answer is “To make our code faster and more predictable.”

Don’t believe me? Let me try to convince you.

Consider the following:

void Main()
{
     var thing = new Thing();
     GC.Collect();                  // It doesn't matter what you do
     GC.WaitForFullGCComplete();    // Or how long you wait
     GC.WaitForPendingFinalizers(); // Thing will never release its resource
}

public class Thing : IDisposable {
      public object FakeResource = new object();
      public void Dispose() {
            // Do not implement the Disposable pattern this way!
           FakeResource = null;
           "Resource released!".Dump();     // This never happens
     }
}

It doesn’t matter how thoroughly you implement IDisposable. If somebody using your code fails to call the Dispose() method (or wrap your object in a using block), your resources will never be released. If you’re looking to ensure your resources are released, you should implement a finalizer:

void Main()
{
     var thing = new Thing();
}

public class Thing {
     public object FakeResource = new object();
     ~Thing() {
           FakeResource = null;
           "Resource released!".Dump();
     }
}

This guarantees that our resource will be released – however, it doesn’t guarantee when it will be released. In fact, when I ran that code, the first two runs didn’t print anything, and the third run printed the message twice. The fourth run printed the message twice again (LINQPad doesn’t unload the app domain between runs, so we see the finalizers from earlier runs completing during later runs.)

What you should see from this is that IDisposable isn’t for disposing resources. One of the uses of IDisposable is, however, to provide some control over when those resources are released. A basic pattern you might use is this one:

public class Thing : IDisposable {
     public object FakeResource = new object();
     ~Thing() {
           releaseResources();
     }
     public void Dispose() {
           releaseResources();    // This still isn't the full pattern you should be using
     }
      private void releaseResources() {
           if (FakeResource != null) {
                FakeResource = null;
                 "Resource released!".Dump();
           }
     }
}

Now, if a Thing is wrapped in a using block, or Dispose() is called, the resource will be released immediately. If the caller fails to ensure Dispose() is called, the resource will still be released by the finalizer.

Hopefully you can see that a finalizer is what we should be using to ensure resources are released, and IDisposable gives us a way to control when that happens. This is what I meant about predictability, and it also improves our stability: if resources are cleaned up in a timely fashion, our system is less likely to run out of limited resources under heavy load. If we rely on the finalizer, we guarantee that the resource will be released, but it’s possible for large numbers of objects to be waiting to be finalized, while hanging onto resources which won’t be used again.

Performance

I promised that IDisposable can also make code run faster, and to do that we need to understand a little bit about the garbage collector.

In the CLR, our heap has three different generations, numbered 0, 1, and 2. Objects are initially allocated on the gen 0 heap, and are moved up to the gen 1 and 2 heaps as they last longer.

The garbage collector needs to make a fast decision about every object, and so every time it encounters an object during a collection, it does one of two things: collect the object, or promote it to the next generation. This means that if your object survives a single gen 0 garbage collection, it will be moved onto the gen 1 heap by copying the memory and updating all references to the object. If it survives a gen 1 garbage collection, it is again moved – it is copied to the gen 2 heap, and all references are updated again.

The other thing you need to understand is how finalizers get called. When the garbage collector encounters an object which needs to be finalized, it has to put it on a queue and leave it uncollected until the finalizer has run – but remember that the garbage collector can only do two things: collect or promote. This means that the object has to be promoted to the next generation, just to give it time for the finalizer to be run.

Let’s look at some numbers again. The following code has a simple finalizer which just adds to some counts: the total number of objects finalized, and the number which reached the later generation heaps.

void Main()
{
     Thing.Gen1Count = 0;
     Thing.Gen2Count = 0;
     Thing.FinaliseCount = 0;
     for (int repeatCycles = 0; repeatCycles < 1000000; repeatCycles++) {
            var n = new Thing();
     }
     GC.Collect();
     GC.WaitForPendingFinalizers();
     ("Total finalizers run:" + Thing.FinaliseCount).Dump();
     ("Objects which were finalized in gen1:" + Thing.Gen1Count).Dump();
     ("Objects which were finalized in gen2:" + Thing.Gen2Count).Dump();
}

public class Thing {
     public static int FinaliseCount;
     public static int Gen1Count;
     public static int Gen2Count;
     ~Thing() {
           finalize();
     }
     private void finalize() {
           FinaliseCount += 1;
            var gen = GC.GetGeneration( this);
            if (gen == 1) Gen1Count++;
            if (gen == 2) Gen2Count++;
     }
}

After running this a few times, it’s quite clear that the performance is all over the place. I got run-times ranging from 0.5 seconds up to 1.1 seconds. A typical output looks like this:

Total finalizers run: 999999
Objects which were finalized in gen1: 118362
Objects which were finalized in gen2: 881637

As you can see, most objects go through two promotions before they are collected, incurring a significant overhead.

With a few changes, we can significantly improve this situation.

void Main()
{
     Thing.Gen1Count = 0;
     Thing.Gen2Count = 0;
     Thing.FinaliseCount = 0;
     for (int repeatCycles = 0; repeatCycles < 1000000; repeatCycles++) {
           var n = new Thing();
           n.Dispose(); // This is new - we could also have used a using block
     }
     GC.Collect();
     GC.WaitForPendingFinalizers();
     ("Total finalizers run: " + Thing.FinaliseCount).Dump();
     ("Objects which were finalized in gen1: " + Thing.Gen1Count).Dump();
     ("Objects which were finalized in gen2: " + Thing.Gen2Count).Dump();
}

public class Thing : IDisposable {
     public static int FinaliseCount;
     public static int Gen1Count;
     public static int Gen2Count;
     public void Dispose() {
           finalise();
           GC.SuppressFinalize(this); // If we can perform finalization now, we can tell the GC not to bother
     }
     ~Thing() {
           finalise();
     }
     private void finalise() {
           FinaliseCount += 1;
           var gen = GC.GetGeneration(this);
           if (gen == 1) Gen1Count++;
           if (gen == 2) Gen2Count++;
     }
}

The changes I have made is to make Thing implement IDisposable, make the Dispose() method call GC.SuppressFinalize(this), and make the main loop call Dispose(). That tells the garbage collector that the object has already finished disposing of any resources it uses, and it can be collected immediately (instead of being promoted and placed on the finalizer queue).

The code now runs in a very consistent 0.2 seconds – less than half the original – and the output looks like this:

Total finalizers run: 1000000
Objects which were finalized in gen1: 0
Objects which were finalized in gen2: 0

As you can see, the finalizers now all run while the object is still in gen 0. Measuring using the Windows Performance Monitor tells a similar story: in the version which uses only the finalizer, the monitor records numerous promotions and an increase in both gen 1 and 2 heap sizes. We don’t see that happening when we use the Dispose() method to suppress the finalizer.

So there you have it. Finalizers are for guaranteeing your resources get released. IDisposable is for making your code faster and more predictable.

Nerdhold Coder

Performance

Leave a Reply Cancel reply

Tinkerings of a C# Coder