In the post about perfmon I’ve briefly mentioned the fact that you can utilize GC.Collect to find memory leaks.

By combining some custom code, WinDbg and SOS you can track most (if not all) of your managed memory leaks without buying a memory profiler.

The methodology is very simple and very easy to implement and use.
It involves taking serveral memory dumps at specific points and anaylzing the objects on the managed heap.

Preparing the data for analysis
Before we can start analyzing we need to prepare the data by doing the following things:

  • Prepare a call to GC.Collect. The exact code should look like this:
    Just to be sure that everything is clean, call this code 3 times. This will ensure that everything, including all pending finalizers (objects that awaits for their finalizer to be called), are really gone.

    If this is a WinForms application, just add a button that will call the above code.
    If this is a Web application (ASP.NET or ASP.NET WebService) add a page called Collect.aspx with this code (If you don’t want to compile this into a code behind assembly, just download this page and place it in the root folder of the application).

  • Run the application and reach to a point before the area of code you wish analyze and take a memory dump using adplus.vbs -hang -p [process id].
    This dump will be used to show us what objects were already on the heap before we started the operation.
  • Run the operation we want to check for memory leaks and after it ends take another dump using the same command line as we used above.
    This dump will be used to show us what objects were added to the heap during the operation we are checking.
  • Run the Collect code (either by pressing the button you have prepared in the WinForms application or access the Collect.aspx page if this is a web application) and take another memory dump.
    This dump will be used to show us what objects are left on the heap after the operation has ended and after we called GC.Collect a few times to make sure all garbage was collected. All objects that cannot be accounted for (explained why they are still alive) are leaks and should be handled.

How to analyze the data

We first need to get some statistics on each one of the dumps to get a hold of the number of objects and their types that we currently have in the heap. To do that, we need to run the command: !dumpheap -stat on each one of the dumps and get a statistical view of all types of objects and the number of living instances.

We then need to compare between the 2nd and 3rd dumps to see if some of the objects did not decrease in number. If they haven’t (and we know they should have) these are our leaking objects.

We can find who is referencing them by using the !dumpheap -type [Object Type] (where [Object Type] is the namesapce and class name of the objects we want to check).

Then, we need to take one of the addresses (take the upper ones, they are the oldest ones) and run !gcroot [Object Address] on them to see who is referencing them.

Continue until you can account for all the objects in the 3rd dump and check if they are really supposed to be there after we called GC.Collect & GC.WaitForPendingFinalizers 3 times. Objects that should not be there should be traced to find who is referencing them and find out why they are being referenced.

That’s it. Quite simple, saves money but doesn’t produce nice graphs like the other tools 🙂

In a server application even the smallest leak can grow fast to become a major issue.

In a pure unmanaged world, finding these leaks can be challanging but there are more than a few ways of doing so (perhaps I shall discuss this in a later post), but in .NET, with the help of our faithful friends WinDbg and the SOS extension, we can easily find and eliminate these leaks in no time if we stick to a proven and useful methodology.

As you already know, in pure managed code there are no memory leaks in the traditional manner, you can’t forget to free something since we are in a garbage collected environment. What you can do, however, is to forget to unreference an object in which case the GC will never consider it garbage and will never collect it.

Common situations include having some static variable (Shared variable for our VB.NET audiance) that is a collection of some type (usually a Hashtable) that you add various objects to it.
Since its a static variable it is rooted and will never get unreferenced and if we don’t explicitally remember to remove the objects we have added to that collection we have a managed memory leak.

So how can we overcome this problem?

Well… there are three ways.

  1. You can go the money way which involves buying one of the various .NET memory profilers available today (I can personally recommed on the fine and very easy to use .NET Memory Profiler by SciTech. Its an excellent tool and you get a 14 days fully functional evaluation and the new version also have a command line and an instrumentation API that lets you intiate the profiler in certain places in the code without a user intervention).
  2. You can go the programming way by implementing your own memory profiler using the .NET profiling API which involves implementing a COM interface and handling all the plumbing yourself (yet another post material for future posts 😉 and its even very interesting to do).
  3. Using WinDbg and the SOS extension in a certain methodology that will guarentee success.

Obviously, we are going to talk about option #3 in this post.

The tools and commands we are going to use are:

  • adplus.vbs – a sophisticated VBScript that was written by some fine SIE engineers at Microsoft Support that instruments the use of cdb and the process of taking memory dumps automatically. It comes with the Debugging Tools for Windows installation (the WinDbg installation).
  • WinDbg (duh!)
  • !dumpheap – An SOS extension command used to list the objects allocated on the managed heap.
  • !gcroot – An SOS extension command to find who is the object that is referncing the object we are currently checking.
  • !dumpobject – An SOS extension command to investigate a certain object and see what are its fields and who they reference.

Before we can start to work on WinDbg we need a set of dumps taken in a predefined and continous interval. If the memory leak is small we will use a bigger interval (i.e. 20min) if the leak is big we will use small interval (i.e. 1min).

To take the dump we will use the following adplus.vbs command line:

([UPDATE] 6/16/2005: Thanks to some anonymous reader that pointed my mistake. I WAS reffering for the use of -hang and NOT -crash option, which is good for other situations)

adplus.vbs -hang -p [process Id] -quiet

  • -hang tells adplus.vbs to attach the debugged process, take a memory dump and detach (this will only work in Windows 2000 and above, since the detach option is only availabe in Windows 2000 and above).
  • Replace [process Id] with the PID of the process we want to take its dump
  • -quiet will silence unnecessay message boxes that adplus might pop in certain cases.

After we have a series of dumps (usually 3 will suffice to show a trend) we can get down and dirty with WinDbg.

TIP: To save your WinDbg session that will contain all the commands you have executed and all the outputs you have received use the .logopen [file name] command after openning the memory dump and before starting to run any other command

Since our problem is memory increasing and its a pure managed application it means we are allocating objects and some object that is still alive is holding a reference to them making the GC not consider them garbage and therefore not collecting them.

To find them we will use the following set of commands:

  1. Run !dumpheap -stat to get a statistical view of all objects currently allocated in the heap. If the case is indeed objects that are allocated and never gets free you will find that certain classes will have a steady increasing amount of live instances over the course of the series of dump taken and this should point to the objects that we should take a closer look at. Click here to see a Sample Output.
    In our case, we see that we have 4500 instances of type MemoryLeakSample.MyObject which should turn a red light in our head since this is a good candidate of a leaking object.
  2. Run !dumpheap -type MemoryLeakSample.MyObject where “MemoryLeakSample.MyObject” is the full namespace and class name of the class type that we in the step above. This should give us a list of all the instances sorted by their addresses. Since we are in a GC environment and the .NET GC compacts the heap, the instance with the smallest address will be our oldest instance and we should focus on it. Click here to see a Sample Output.
  3. Run !gcroot 0x04ab8348 where 0x04ab8348 is the address of the oldest object (the first one in the list) that we saw earlier and we will get a full list of references showing who is holding who. In our case, since we have a static hashtable we will see something like “HANDLE(Strong):931c0:Root ….” which means that the parent (and true parent that is at the top of the reference list) is rooted, meaning it will never get freed unless the AppDomain will unload (in a multiple AppDomain scenario) or the process will die. Click here to see a Sample Output.

Now all we have left to do, is to find out where in the code we have defined this static variable and either NOT use a static variable (the best option, but not always possible) or make sure we clean this collection.

This methodology can be used to find not only rooted objects, but a big increase in instances of a ceratin type over a period of time and see who is holding them.

The best thing about this methodology is that you can do it at a customer’s site without a problem and finding 80% of the problem without having your symbols and/or your code.

Happy leak hunt!