Archive for the 'Managed Debugging' Category

In the post about perfmon I’ve briefly mentioned the fact that you can utilize GC.Collect to find memory leaks.

By combining some custom code, WinDbg and SOS you can track most (if not all) of your managed memory leaks without buying a memory profiler.

The methodology is very simple and very easy to implement and use.
It involves taking serveral memory dumps at specific points and anaylzing the objects on the managed heap.

Preparing the data for analysis
Before we can start analyzing we need to prepare the data by doing the following things:

  • Prepare a call to GC.Collect. The exact code should look like this:
    Just to be sure that everything is clean, call this code 3 times. This will ensure that everything, including all pending finalizers (objects that awaits for their finalizer to be called), are really gone.

    If this is a WinForms application, just add a button that will call the above code.
    If this is a Web application (ASP.NET or ASP.NET WebService) add a page called Collect.aspx with this code (If you don’t want to compile this into a code behind assembly, just download this page and place it in the root folder of the application).

  • Run the application and reach to a point before the area of code you wish analyze and take a memory dump using adplus.vbs -hang -p [process id].
    This dump will be used to show us what objects were already on the heap before we started the operation.
  • Run the operation we want to check for memory leaks and after it ends take another dump using the same command line as we used above.
    This dump will be used to show us what objects were added to the heap during the operation we are checking.
  • Run the Collect code (either by pressing the button you have prepared in the WinForms application or access the Collect.aspx page if this is a web application) and take another memory dump.
    This dump will be used to show us what objects are left on the heap after the operation has ended and after we called GC.Collect a few times to make sure all garbage was collected. All objects that cannot be accounted for (explained why they are still alive) are leaks and should be handled.

How to analyze the data

We first need to get some statistics on each one of the dumps to get a hold of the number of objects and their types that we currently have in the heap. To do that, we need to run the command: !dumpheap -stat on each one of the dumps and get a statistical view of all types of objects and the number of living instances.

We then need to compare between the 2nd and 3rd dumps to see if some of the objects did not decrease in number. If they haven’t (and we know they should have) these are our leaking objects.

We can find who is referencing them by using the !dumpheap -type [Object Type] (where [Object Type] is the namesapce and class name of the objects we want to check).

Then, we need to take one of the addresses (take the upper ones, they are the oldest ones) and run !gcroot [Object Address] on them to see who is referencing them.

Continue until you can account for all the objects in the 3rd dump and check if they are really supposed to be there after we called GC.Collect & GC.WaitForPendingFinalizers 3 times. Objects that should not be there should be traced to find who is referencing them and find out why they are being referenced.

That’s it. Quite simple, saves money but doesn’t produce nice graphs like the other tools 🙂

Thanks to Matt Adamson for pointing out to me that DebugDiag is now in release candidate stage (RC1) and is available for testing.

In a few words, DebugDiag is a very sophisiticated scriptable and extendable tool that enables you to debug server applications (mainly ones that are related to IIS, but not only limited to them).

Some of its features include scheduled dump taking using DbgSvc (or whatever action you required) as well as advanced memory leak checks using LeakTrack that eventually generates web reports.

I used it when I was in a Microsoft lab a while back mainly to use its scheduled dump taking. I configured it to take dumps every X minutes and it was great.
One of the SIE engineers also used its LeakTrack to find some memory problems in our application.

It a cool tool and a great addition to your debugging tools arsenal.

You can find all the necessary instructions on how to get it from here.
You will find there the tool itself and a PowerPoint presentation about the tool and how to use it.

Not all memory leaks in .NET applications specifically relate to objects that are rooted or being referenced by rooted objects. There are other things that might produce the same behavior (memory increase) and we are going to talk about one of them.

What is the Finalizer Thread?
The finalizer thread is a specialized thread that the Common Language Runtime (CLR) creates in every process that is running .NET code. It is responsible to run the Finalize method (for the sake of simplicity, at this point, think of the finalize method as some kind of a destructor) for objects that implement it.

Who needs finalization?
Objects that needs finalization are usually objects that access unmanaged resources such as files, unmanaged memory and so on. Finalization is used to make sure these resources are closed or discarded to avoid actual memory leaks.

How does finalization works?
(NOTE: please email me if I’ve got something wrong or in accurate at this stage 😉 )

Objects that implement the finalize method, upon their creation, are being placed in a finalization queue. When no one references these objects (determined when the GC runs) they are moved to a special queue called FReachable queue (which means Finalization Reachable queue) which the finalizer thread iterates on and calls the finalize method of each of the objects there.
After the finalize method of an object is called it is read to be collected by the GC.

So how can the finalizer thread get blocked?
Finalizer thread block occurs when the finalizer thread calls to a finalize method of a certain object.
The implementation of that finalize method is dependat on a resource (its a general term for the sake of the general definition) that is blocked.
If that resource will not be freed and available for our finalize method, the finalizer thread will be blocked and none of the objects that are in the FReachable queue will get GCed.

For example:

  1. The implementation of the Finalize method contains code that requires a certain lock of a synchronization object (Critical Section, Mutex, Semaphore, etc) and that synchronization object is already blocked and is not getting freed (see previous post on Identifying Deadlocks in Managed Code for help on resolving this issue).
  2. The object that is being finalized is an Single Threaded Apratment (STA) COM object. Since STA COM objects have thread affinity, in order to call the destructor of that COM object we have to switch to the STA thread that created that object. If, for some reason, that thread is blocked, the finalizer thread will also get blocked due to that.

Symptoms of a blocked Finalizer thread

  • Memory is increasing
  • Possbile deadlock in the system (depends on a lot of factors, but it can happen)

How can we tell our Finalizer thread is blocked?

The technique for finding if the finalizer thread is blocked is quite easy and involves a few easy steps.

First of all, we need to take a series of dumps at fixed intervals (i.e. 5min apart). When we have the dumps we need to run the following set of commands from the SOS extension on all dumps and compare results (to find out how to load the SOS extension and set the symbols path look at this previous post):

  1. !FinalizeQueue – This dumps the finalization queue (not the FReachable queue). If you’ll see that the total number of object is increasing from dump to dump, we can start to suspect that our finalizer thread is blocked.
    NOTE: I say “suspect” because its still not certain at this point that the finalizer thread is blocked. This situation can also mean that the finalization of some of the objects takes a long time and the rate of objects being allocated vs. the rate of objects being finalized is in favor of the allocated objects, meaning, we are allocating objects that needs finalization faster than we are collecting them.
  2. !threads – This command will show us all .NET threads in the current process. The finalizer thread is marked at the end of the line by the text (finalizer) (surprisingly enough 😉 ). At the beginging of the line that has the (finalizer) text at the end of it you will see the thread’s index. We will need it to run the next command.
  3. ~[Finalizer Thread Index]k (i.e. ~24k)- This will dump the native call stack of the finalizer thread. If you will see in all dumps that the last function in the call stack (the top most) is something like ZwWaitForSingleObject or ZwWaitForMultipleObjects it means something is blocking our thread, usually a sync object.
    NOTE: If the call stack contains a call to a function named GetToSTA it means that the call to ZwWaitForSingleObject is there because we are tring to switch to the STA thread that created the STA objects and we are waiting to switch to it. This means that the STA thread is blocked for some reason.
  4. ~[Finalizer Thread Index]e!clrstack (i.e. ~24e!clrstack) – Dump the .NET call stack just to verify that if we don’t see a call to ZwWaitForSingleObject or ZwWaitForMultipleObjects we might be blocked due to a call to the .NET Monitor class (a native .NET implementation for a Critical Section). If we do see a call to Monitor.Enter it means we have some kind of a managed deadlock.

How to resolve a blocked finalizer thread?

If we are talking about a block that is being created due to a synchronization object (usually a critical section) we need to address it as a deadlock.

Managed Deadlock
To resolve a mangaed deadlock refer to the previous post on Identifying
Deadlock in Managed Code.

Unmanaged Deadlock
I’ll give a crash course to find unmanaged deadlocks, specifically Critical Sections since this is mainly a .NET debugging blog.

There is a Microsoft extension called sieextpub.dll (you can download it from here that is mainly focused at resolving COM issues. It has some useful commands for synchronization objects as well that we will use.

  • run the !critlist command. It will list all locked critical sections and the thread that is owning them (This is similar to running !locks but runs a lot faster).
  • Run ~[Thread ID]k (i.e. ~[22]k) to get the call stack of the owning thread.
  • If the problem is not with a critical section, but with another synchornization object such as a Mutex or a Semaphore, you can run !waitlist to see all the handles that every thread is blocking on. These handles should appear as parameters to functions such as WaitForSingleObject and WaitForMultipleObjects. We can use some native WinDbg commands to find these handles from the calls stack of the finalizer thread. For example:WaitForSingleObject:
    The handle itself will be shown in the call stack (using the kb or kv command, not just k) here:
    09b2f608 77ab4494 00000594 ffffffff 00000100 KERNEL32!WaitForSingleObject+0xf

    Compare this number to the number shown in the output of !waitlist and you will see who is the thread that is blocking the handle.

    Here the handle doesn’t appear directly in the kb command output since the function receives an array of handles. To find it we will need to find to parameters:
    00f4ff34 791d25d5 00000003 00f4ff58 00000000 KERNEL32!WaitForMultipleObjects+0x17

    The first bold one is the count of element in the passed array. The second one is the pointer to the array. To dump the array on the screen we will need to run the following command:
    dc 00f4ff58 – This will output the memory at that address and the output will look something like this:00f4ff58 00000700 00000708 000006d4 81b35acc ………….Z..
    00f4ff68 03a50008 b6788cb0 00000000 00000003 ……x………
    00f4ff78 00000000 ffffffff 00000000 00f4ff4c …………L…
    00f4ff88 8042f639 00f4ffdc 7920fd39 791d25e0 9.B…..9. y.%.y
    00f4ff98 ffffffff 00f4ffb4 791d254c 00000000 ……..L%.y….
    00f4ffa8 00dfcdf4 00000000 791d4d50 00f4ffec ……..PM.y….
    00f4ffb8 7c57b388 03a50008 00dfcdf4 00000000 ..W…………
    00f4ffc8 03a50008 7ff9f000 00dfc8ac 00f4ffc0 …………….

    We need to relate to the bolded number since we earlier saw that we have 3 handles to watch for. We should look to see if they appear in the output of the !waitlist to see who is blocking on them.

STA COM issues

STA COM issues are usualy caused due to the thread in which the STA object was created is blocked. To quickly find to which thread the finalizer thread is trying to switch to in order to call the destructor of the STA COM object we will use the sieextpub.dll that we mentioned above and call the !comcalls command which will simply show us the two important figures, the index of the thread that is trying to switch and the index of thread that we are trying to switch to.

Best resolution way – Avoidance

The best way of resolving blocked finalizer thread is avoiding it.
The first rule of thumb is AVOID FINALIZATION. If you don’t have to use it DON’T.

The only case implementing the finalize method is important is in a case where your class holds some private members that are native resources or that call native resources (SqlConnection for example, various Streams like FileStream and so on) in which case it is best to use the IDisposable pattern. You can find a lot of information on the internet about that but I’ll just point out a few interesting links such as this (from Eric Gunnerson’s blog. He is the Visual C# PM), and this excellent post (from Peter Provost’s blog).

In regards to STA COM object, the minute you don’t need them call Marshal.ReleaseComObject. This function wil guarentee that the COM object will get released the minute you call this function and NOT when the next GC will occur.

This was a long post, but I think it worthwhile. It a very important not to block the finalizer thread. I know that having only one without some watchdogging mechanisms is not a good design and Microsoft are aware of that.

Some additional resources for extra reading: – An excellent post by Maoni. – An excellent article on Garbage Collection in .NET. Specifically read the “Forcing an Object to Clean Up” section which talks about the finalization of objects and a few other things.,1759,1785503,00.asp – An excellent article on DevSource about .NET memory management and Finalization

Setting a break point in native code using WinDbg is easy.
You just run the bp command with the address of the place in memory where you want to place the breakpoint.

Setting a breakpoint in Managed code is a bit trickier.

First of all, WinDbg is a native debugger. Until a method is JITed (JIT as in Just In Time Compilation) it doesn’t even have native code.
The managed assembly might be loaded and mapped, but only after the function in JITed and its native code is generated and loaded it has a place in memory that WinDbg can set a breakpoint on to.

An exception to this are Ahead of Time (AOT) compiled assemblies (assemblies which have a native image generated by ngen.exe).

The steps for setting a breakpoint in Managed code are very simple:

  1. Find the method table handle of the class.
  2. Find the method description handle of the method
    we want to put a breakpoint on.
  3. Place a breakpoint on the virtual address of method we want
    to break into (this will only work on already JITed method whether its from AOT
    or they were called once to actually get JITed).

OK, now that we understand the logical steps on how to do it, what do we actually do? (I’m using the same executable from the previous post).

  1. !name2ee [Assembly name (including extension] [Class Full Namespace]. For example: !name2ee SyncBlkDeadLock.exe SyncBlkDeadLock.Form1. That is the class on which we want to place a breakpoint in one of its methods. The output will look like this.
  2. !dumpmt -md [MethodTable handle that we got from the previous command]. For Example: !dumpmt -md 0x00a8543c. The output will look like this.
  3. !dumpmd [MethodDesc handle that we got from the previous command]. For Example: !dumpmd 0x00a853d8. This is the handle for the method
    SyncBlkDeadLock.Form1.Thread1Handler().The output will look like this.
  4. In the field “Method VA” we now have th method Virtual Address and we can set a breakpoint on that address.

Quite easy, when you know what you are looking for 🙂

Remember, that this only sets a breakpoint on the function entry address, so whenever this function is called it will break (unless you point some kind of a conditional breakpoint).


What is a deadlock?

(from the first entry when running the google query “define:deadlock“)
“This is an inter-blocking that occurs when two processes want to access at shared variables mutually locked. For example, let A and B two locks and P1 and P2 two processes: P1: lock A P2: lock B P1: lock B (so P1 is blocked by P2) P2: lock A (so P2 is blocked by P1) Process P1 is blocked because it is waiting for the unlocking of B variable by P2. However P2 also needs the A variable to finish its computation and free B. So we have a deadlock”

What are the sympotms of a deadlock?
Usually you will see that the application stops responding for some reason and the CPU (if you’ll open up the Task Manager) is at 0% most of the time (if this is the only major application running on this machien).

What to do?
So… how can we find a deadlock, specifically in a production environment without a debugger or development envrionment? (just the kind of problem we like 🙂 )

What we want to find out is who is holding the locks and who is waiting on them.

Below are a few easy steps to figure this out.

  1. Attach WinDbg to the relevant process (F6 and select the process).
  2. make sure the symbols paths are OK by calling .sympath. If you don’t have the symbols for .NET and the rest of windows just type “.symfix c:\symbols” this will put a path to Microsoft’s public symbols server and will download all relevant symbols into c:\symbols. If you have symbols of your own in some folder you can use .sympath with the += flag (like this .sympath += c:\mypath) to add the additional path.
  3. type “.load clr10\sos.dll” to load the SOS extension (it might already be loaded, but it won’t do any harm calling it again).
  4. Run the command “!SyncBlk.

You can see a sample output here.

As you can see, we have two locks, since we have 2 resources that are locked. Each resource is locked by a different thread and each one of these thread is trying to acquire a lock on the other resource.

In the “ThreadID” column we can see the ID (and near it its Index in the current debug session) of the thread that is the owner of this lock. Under each lock we can see the list the indices of the waiting threads.

Now all we have left is to use the “!clrstack command on each of the waiting threads and the locking threads to see where are they in the code and determine why is the deadlock happening and figure out a way to avoid this situation.

To run the !clrstack command on a specific thread just use the following command “~3e!clrstack“. This command displays the CLR stack of the thread whose index is 3.

You can see a sample output of !clrstack (specifically ~4e!clrstack) here

While debugging this I had the debug symbols available, that is why WinDbg was able to give me an exact line number of where this is happening. If you don’t have the debug symbols of your code you will only see the function name which is good enough in most cases.

Can’t do a live debug? The problem is at a remote customer’s site?

If you are not able to live debug this situation because this happens at a customer site and you cannot get there or unable to have direct access to the machine, you can perform all the steps above on a dump taken using the “adplus.vbs” script.

Just tell the customer to download WinDbg from here and tell him to reproduce the problem. When he does reproduce it tell him to run the following command line:

adplus.vbs -p [process Id] -hang

(Replace [process Id] with the process Id of the application).

And tell him to send the dump back to you. You will be able to run the same commands as above and figure out where the deadlock happens.

Download sample code used in this post.

Happy Deadlock Hunt!