Archive for June, 2005

Thanks to Matt Adamson for pointing out to me that DebugDiag is now in release candidate stage (RC1) and is available for testing.

In a few words, DebugDiag is a very sophisticated, scriptable and extendable tool that enables you to debug server applications (mainly ones related to IIS, but not limited to them).

Some of its features include scheduled dump taking using DbgSvc (or whatever action you require) as well as advanced memory leak checks using LeakTrack, which eventually generates web reports.

I used it when I was in a Microsoft lab a while back, mainly for its scheduled dump taking. I configured it to take dumps every X minutes and it was great.
One of the SIE engineers also used its LeakTrack to find some memory problems in our application.

It's a cool tool and a great addition to your debugging tools arsenal.

You can find all the necessary instructions on how to get it from here.
There you will find the tool itself and a PowerPoint presentation about the tool and how to use it.

In this post we will talk about the Performance Monitor (a.k.a. PerfMon) that comes with all recent Windows versions (including Windows 2000, Windows XP and Windows 2003).

You can find the PerfMon in your Start Menu -> Control Panel -> Administrative Tools -> Performance.

Although it is not a debugger, some of the issues we have previously discussed, such as memory issues, a blocked finalizer thread and deadlocks, have many manifestations that can be easily monitored and found using PerfMon.

In fact, PerfMon can provide a first means of identifying such problems as well as an additional "second opinion" that can endorse a certain diagnosis.

I will go through some of the most interesting Performance Objects and list the most interesting counters that belong to them. Afterwards, I will show, for each common problem, the counters that can indicate it.

The Counters

.NET CLR Exceptions:
# of Exceptions Thrown / sec – Counts the number of managed exceptions and of unmanaged exceptions that were translated into managed exceptions, such as a null pointer reference that is translated into a System.NullReferenceException.

This counter is a good indicator of performance problems caused by too many exceptions being thrown during the course of the application. This usually indicates a more serious problem of bad implementation, such as using exceptions as a means of handling normal program flow.
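
As a hypothetical illustration of that anti-pattern (the class, method names and values below are mine, not taken from any real application), compare a parse routine that drives this counter up with one that keeps the failure case in the normal program flow:

using System;

class ExceptionFlowDemo
{
    // Bad: every malformed value throws, driving "# of Exceptions Thrown / sec" up.
    static int ParseOrZeroSlow(string text)
    {
        try { return int.Parse(text); }
        catch (FormatException) { return 0; }
    }

    // Better: the failure case is handled without throwing at all.
    // (int.TryParse is available from .NET 2.0 onwards.)
    static int ParseOrZeroFast(string text)
    {
        int value;
        return int.TryParse(text, out value) ? value : 0;
    }

    static void Main()
    {
        Console.WriteLine(ParseOrZeroSlow("not a number")); // throws and swallows internally
        Console.WriteLine(ParseOrZeroFast("not a number")); // no exception at all
    }
}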

.NET CLR Interop:
# of CCWs – The number of COM Callable Wrappers (CCWs) currently alive. A CCW is a .NET proxy that is used to wrap .NET objects that are referenced from unmanaged COM.

If you see memory growing (or not getting freed when it should have been) and this counter increasing, it might suggest that unmanaged code is holding on to some of your managed objects, and that this is the cause of memory not being freed or increasing.

.NET CLR LocksAndThreads:
Contention Rate / sec – The rate at which threads in the runtime attempt to acquire a managed lock unsuccessfully.

Managed locks are acquired using either the "lock" statement in C# or by calling System.Threading.Monitor.Enter.

If this number is increasing it means we have a bottleneck in our code. That area of the code is synchronized, so only one thread at a time can enter it, yet it is being "hammered" by multiple threads that are all trying to get into that piece of code.

We need to find that piece of code and see how we can avoid this situation in order to resolve the bottleneck.
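
The kind of code that produces this counter pattern usually looks roughly like the following sketch (class and method names are made up for illustration): one coarse-grained lock that every thread must pass through, with the expensive work done while holding it. Shrinking the locked region so that only the shared update is protected is typically the first fix.

using System;
using System.Threading;

class ContentionDemo
{
    static readonly object SyncRoot = new object();
    static long _total;

    // Eight threads hammering a single coarse lock will show up as a high
    // Contention Rate / sec and a non-zero Current Queue Length.
    static void Worker()
    {
        for (int i = 0; i < 100000; i++)
        {
            lock (SyncRoot)   // compiles down to Monitor.Enter / Monitor.Exit
            {
                // The "expensive" work is done while holding the lock; moving it
                // outside and only protecting the shared update reduces contention.
                _total += ExpensiveCalculation(i);
            }
        }
    }

    static long ExpensiveCalculation(int i)
    {
        return i * 2;   // stand-in for real work
    }

    static void Main()
    {
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(new ThreadStart(Worker));
            threads[i].Start();
        }
        foreach (Thread t in threads)
            t.Join();
        Console.WriteLine(_total);
    }
}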

Current Queue Length – This counter displays the total number of threads currently waiting to acquire some managed lock in the application. It only shows the last observed value.

This counter is similar to Contention Rate / sec, but it shows the number of threads waiting to acquire a managed lock at a given point in time, not just the failed acquisition attempts. It will also outline possible bottlenecks in code that is being accessed by multiple threads many times.

If the Current Queue Length value is almost equal to Threads Count and % Processor Time is at a fixed value, it might also indicate that we have a CPU spin issue.

If % Processor Time is 0 (or almost 0) and the application is not responding it means that we have a deadlock.

.NET CLR Memory:
# Bytes in all Heaps – This counter is the sum of four other counters: Gen 0 Heap Size, Gen 1 Heap Size, Gen 2 Heap Size and Large Object Heap Size. It displays the current memory allocated, in bytes, on the GC (managed) heaps.

If this counter keeps rising it indicates that we have a managed leak: some managed objects are always being referenced and are never collected.

# GC Handles – Displays the current number of GC handles in use. GC handles are handles to resources external to the CLR and the managed environment. They occupy small amounts of memory on the GC heap, but potentially expensive unmanaged resources hide behind them.

If this counter keeps growing along with Private Bytes (we will talk about this counter later on), it means that we are probably referencing unmanaged resources and not releasing them, causing a memory leak.
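
A minimal, deliberately broken sketch (the names are mine) of the kind of code that makes this counter climb: GC handles are allocated to keep buffers alive for some external caller but are never freed, so both the handles and the memory behind them pile up.

using System;
using System.Runtime.InteropServices;

class GCHandleLeakDemo
{
    static void LeakHandles()
    {
        for (int i = 0; i < 1000; i++)
        {
            byte[] buffer = new byte[256];
            // A strong GCHandle keeps 'buffer' alive independently of any
            // managed reference to it.
            GCHandle handle = GCHandle.Alloc(buffer);
            // BUG: handle.Free() is never called, so "# GC Handles" grows by
            // one and the 256-byte buffer can never be collected.
        }
    }

    static void Main()
    {
        LeakHandles();
        Console.WriteLine("Leaked 1000 GC handles - watch the counter climb.");
    }
}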

# Induced GC – Indicates the number of times GC.Collect was explicitly called.

Calling GC.Collect is not a good practice in production code. It is usually useful for finding memory leaks (good material for another method of finding memory leaks, but we will talk about that in a later post) while debugging or developing, but you should NEVER call it in production code; let the GC tune itself.

# of Pinned Objects – This counter shows the number of pinned objects encountered since the last GC. A pinned object is an object that the GC cannot move in memory.

Pinned objects are usually objects that were passed as pointers to unmanaged code and are pinned so that the Garbage Collector will not move them in memory while it compacts the heap; otherwise it would cause unexpected behavior in the unmanaged code and might even lead to memory corruption.

Generally, this number shouldn't be that high if you don't call into unmanaged code too much.
If it is increasing, it might suggest that we are pinning objects when passing them to unmanaged code and never releasing the pins, or that we explicitly pinned objects and forgot to unpin them.
If this counter is increasing and the Virtual Bytes counter (we will talk about this counter later on) is also increasing, it means that we pin objects too much and the GC is not able to effectively compact the heap, forcing it to reserve additional virtual memory so the GC heap can grow and accommodate the requested allocations.
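
For reference, this is roughly what pinning looks like in code (a small sketch, not taken from any real interop layer). The GCHandle form is the one that is easy to leak, because the pin lasts until Free() is called; the C# fixed statement only pins for the duration of the block.

using System;
using System.Runtime.InteropServices;

class PinningDemo
{
    static void Main()
    {
        byte[] buffer = new byte[4096];

        // Pinning with a GCHandle: the GC cannot move 'buffer' until Free() runs.
        GCHandle pin = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr address = pin.AddrOfPinnedObject();
            // ... pass 'address' to unmanaged code here ...
            Console.WriteLine("Buffer pinned at 0x{0:x}", address.ToInt64());
        }
        finally
        {
            pin.Free();   // forgetting this leaves the object pinned forever
        }

        // The 'fixed' statement (requires compiling with /unsafe) pins only for
        // the duration of the block, so it cannot be forgotten the same way.
        unsafe
        {
            fixed (byte* p = buffer)
            {
                p[0] = 42;
            }
        }
    }
}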

# of Sink Blocks in use – This counter displays the current number of sync blocks in use. Sync blocks are per-object data structures allocated for storing synchronization information. Sync blocks hold weak references to managed objects and need to be scanned by the Garbage Collector. Sync blocks are not limited to storing synchronization information and can also store COM interop metadata.

This counter was designed to indicate performance problems that occur due to excessive usage of synchronization primitives. If this counter keeps increasing we should probably take a look at all the places where we use synchronization objects and see if they are truly needed. Combined with the Current Queue Length and Contention Rate / sec counters, it will also show us that we probably have some synchronization bottlenecks in our application that should be addressed to improve performance.

# Total committed Bytes – This counter shows the total amount of virtual memory (in bytes) currently committed by the Garbage Collector.

This counter actually shows us the total amount of virtual memory that is actually being used by the GC heap at a given point in time. If # Total reserved Bytes is significantly larger than this counter, it means that the GC keeps growing segments and reserving more memory.

This indicates one of two things:

  1. The GC is having problems compacting the heap due to a large number of small pinned objects, or a small number of pinned objects that take up a lot of space (usually large arrays that are being passed to unmanaged code).
  2. We are leaking in the managed sense of the word, meaning we have a lot of objects that were supposed to die but something is still holding a reference to them.


# Total reserved Bytes – This counter shows the total amount of virtual memory (in bytes) currently reserved (not committed) by the Garbage Collector.

In addition to what I've mentioned above for # Total committed Bytes, this counter, when increasing over a long period of time, might also suggest fragmentation of the virtual address space. This situation is usually common in a mixed application that has a lot of managed and unmanaged code tangled together, and it will effectively limit your application's total lifetime before it commits application suicide.

NOTE: Virtual address space fragmentation may also occur naturally (in unmanaged code or in a mixed managed/unmanaged application) due to the nature of your application's memory allocation profile, so not every increase in reserved bytes indicates this problem.

Finalization Survivors – This counter displays the number of managed objects that survive a collection because they are waiting to be finalized. This counter updates at the end of the GC and displays the number of survivors at that specific GC.

This counter will indicate if we have too many objects that need finalization. Having too many finalizable objects is usually not a good idea since it requires at least 2 GCs before they are truly collected. It might also indicate that the finalizer thread (I talked about it in the last post) has a lot of work to do, and if we are not careful in the implementation of the finalizable objects we might reach a blocked finalizer thread issue.

Gen 2 Heap Size – Indicates the size in bytes of generation 2 objects.

If this number keeps growing it indicates that we have too many objects that manage to survive and reach generation 2. Since Gen 2 collections are not as frequent as Gen 0 and Gen 1 collections, this memory will stick around for a while, burdening the application's memory.

Large Object Heap – All objects greater than about 85 KB are allocated on the large object heap, mainly for performance reasons. The major difference between the Large Object Heap and the other heaps (Gen 0, Gen 1 and Gen 2) is that it is never compacted and is only collected during Gen 2 collections.

If this counter keeps growing it indicates that we are allocating too many large objects. Doing so may lead to memory fragmentation, because these objects are never compacted and only get collected in Gen 2 collections (so they burden the GC heap). It might also indicate that someone is still referencing these objects and they are not being collected at all.
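
A tiny sketch of that second case (the names are made up): large buffers kept alive by a static collection go straight to the Large Object Heap and stay there.

using System;
using System.Collections;

class LargeObjectDemo
{
    // Static root: everything added here stays alive for the life of the AppDomain.
    static ArrayList _cache = new ArrayList();

    static void Main()
    {
        for (int i = 0; i < 100; i++)
        {
            // Anything of roughly 85,000 bytes or more is allocated on the
            // Large Object Heap, so this grows the counter by ~100 KB per loop.
            byte[] big = new byte[100 * 1024];
            _cache.Add(big);   // still referenced, so it is never collected
        }
        Console.WriteLine("The Large Object Heap counter should now show ~10 MB of live data.");
    }
}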

Process:

Virtual Bytes – Indicates the current size (in bytes) of the allocated (reserved and committed) virtual memory.

Private Bytes – Indicates the current size (in bytes) of the allocated (committed) virtual memory. This memory cannot be shared with other processes.

Threads Count – Shows the number of threads currently active in the current process.

Processor:

% Processor Time – The percentage of elapsed time that the processor spends executing non-idle threads.

Counters Indicating Common Problems

Below is a list of common problems and the counters that might indicate them.

Remember that these are only indicators, and in some cases when they indicate a problem it might just be the application's normal behavior.

Memory Leaks Indicators:

  • # bytes in all Heaps increasing
  • Gen 2 Heap Size increasing
  • # GC handles increasing
  • # of Pinned Objects increasing
  • # total committed Bytes increasing
  • # total reserved Bytes increasing
  • Large Object Heap increasing

Virtual Address Space Fragmentation Indicators:

  • # total reserved Bytes significantly larger than # total committed Bytes
  • # of Pinned Objects increasing
  • # GC handles increasing
  • # bytes in all heaps always increasing.

CPU Spin Indicators:

  • Current Queue Length is very close to Threads Count and stays that way for a long time.
  • % Processor Time is continuously at a fixed level for a long period of time (as long as the Current Queue Length is at the same value).

Managed Deadlock Indicators:

  • Current Queue Length is very close to Threads Count and stays that way for a long time.
  • % Processor Time is 0 (or close to 0) (as long as the Current Queue Length is at the same value) and the application stopped responding.

Blocked Finalizer Thread Indicators:

  • # bytes in all heaps increasing
  • Private Bytes increasing
  • Virtual Bytes increasing

As I've mentioned above, these are all indicators and might not actually tell you whether a certain problem is occurring. To actually prove the problem you need to debug (either with a full-blown development environment, live debugging with WinDbg, or a post mortem debug using a memory dump).

To conclude, PerfMon is a great tool that comes built into Windows to better monitor our application for common problems. Combining PerfMon with an additional technique, such as taking subsequent memory dumps using adplus.vbs at fixed intervals, can give a better indication and usually point you to the cause of the problem in no time.

PerfMon – Don't leave home without it (well, you can't because it's always there 😉 ).

Not all memory leaks in .NET applications specifically relate to objects that are rooted or being referenced by rooted objects. There are other things that might produce the same behavior (memory increase) and we are going to talk about one of them.

What is the Finalizer Thread?
The finalizer thread is a specialized thread that the Common Language Runtime (CLR) creates in every process that is running .NET code. It is responsible for running the Finalize method (for the sake of simplicity, at this point, think of the Finalize method as some kind of destructor) of objects that implement it.

Who needs finalization?
Objects that need finalization are usually objects that access unmanaged resources such as files, unmanaged memory and so on. Finalization is used to make sure these resources are closed or discarded to avoid actual memory leaks.

How does finalization work?
(NOTE: please email me if I've got something wrong or inaccurate at this stage 😉 )

Objects that implement the Finalize method are placed in a finalization queue upon their creation. When no one references these objects anymore (determined when the GC runs), they are moved to a special queue called the FReachable queue (Finalization Reachable queue), which the finalizer thread iterates over, calling the Finalize method of each object there.
After the Finalize method of an object has been called, the object is ready to be collected by the GC.
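
A small sketch of the lifecycle described above (the class is made up; the IntPtr stands in for a real unmanaged handle):

using System;

class NativeResourceHolder
{
    IntPtr _handle;   // stands in for a real unmanaged handle

    public NativeResourceHolder()
    {
        // Because the class has a finalizer, every new instance is also
        // registered in the finalization queue at construction time.
        _handle = new IntPtr(1);
    }

    ~NativeResourceHolder()
    {
        // Runs on the finalizer thread after the GC has moved the dead
        // instance to the FReachable queue.
        _handle = IntPtr.Zero;
    }
}

class Program
{
    static void Main()
    {
        new NativeResourceHolder();        // becomes garbage immediately
        GC.Collect();                      // dead instance moves to the FReachable queue
        GC.WaitForPendingFinalizers();     // finalizer thread runs ~NativeResourceHolder
        GC.Collect();                      // only now can its memory actually be reclaimed
    }
}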

So how can the finalizer thread get blocked?
Finalizer thread blocking occurs when the finalizer thread calls the Finalize method of a certain object, and the implementation of that Finalize method depends on a resource (a general term for the sake of the general definition) that is blocked.
If that resource is not freed and made available to our Finalize method, the finalizer thread will be blocked and none of the objects in the FReachable queue will get GCed.

For example:

  1. The implementation of the Finalize method contains code that requires a lock on a synchronization object (Critical Section, Mutex, Semaphore, etc.) and that synchronization object is already held and is not getting freed (see the previous post on Identifying Deadlocks in Managed Code for help on resolving this issue; a small contrived sketch of this case appears after this list).
  2. The object being finalized is a Single Threaded Apartment (STA) COM object. Since STA COM objects have thread affinity, in order to call the destructor of that COM object we have to switch to the STA thread that created it. If, for some reason, that thread is blocked, the finalizer thread will also get blocked.
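
Here is the contrived sketch of case #1 mentioned above (class names are mine): the main thread takes a lock and never lets go, the finalizer needs that same lock, so GC.WaitForPendingFinalizers hangs forever and every other finalizable object backs up behind it.

using System;
using System.Threading;

class BlockyFinalizer
{
    public static readonly object SharedLock = new object();

    ~BlockyFinalizer()
    {
        // If another thread holds SharedLock and never releases it, the
        // finalizer thread blocks right here and no further finalizers run.
        lock (SharedLock)
        {
            // cleanup that "needs" the lock
        }
    }
}

class Program
{
    static void Main()
    {
        // The main thread takes the lock and deliberately never releases it.
        Monitor.Enter(BlockyFinalizer.SharedLock);

        new BlockyFinalizer();             // create a finalizable instance and drop it
        GC.Collect();
        GC.WaitForPendingFinalizers();     // hangs: the finalizer thread is stuck on SharedLock
    }
}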

Symptoms of a blocked Finalizer thread

  • Memory is increasing
  • Possible deadlock in the system (depends on a lot of factors, but it can happen)

How can we tell our Finalizer thread is blocked?

The technique for finding out whether the finalizer thread is blocked is quite simple and involves a few easy steps.

First of all, we need to take a series of dumps at fixed intervals (e.g. 5 minutes apart). When we have the dumps we need to run the following set of commands from the SOS extension on all of them and compare the results (to find out how to load the SOS extension and set the symbols path, look at this previous post):

  1. !FinalizeQueue – This dumps the finalization queue (not the FReachable queue). If you see that the total number of objects is increasing from dump to dump, we can start to suspect that our finalizer thread is blocked.
    NOTE: I say "suspect" because it's still not certain at this point that the finalizer thread is blocked. This situation can also mean that the finalization of some of the objects takes a long time and the rate of objects being allocated vs. the rate of objects being finalized is in favor of the allocated objects, meaning we are allocating objects that need finalization faster than we are collecting them.
  2. !threads – This command will show us all .NET threads in the current process. The finalizer thread is marked at the end of its line by the text (finalizer) (surprisingly enough 😉 ). At the beginning of the line that has the (finalizer) text at its end you will see the thread's index. We will need it to run the next command.
  3. ~[Finalizer Thread Index]k (i.e. ~24k) – This will dump the native call stack of the finalizer thread. If you see in all dumps that the last function in the call stack (the top-most one) is something like ZwWaitForSingleObject or ZwWaitForMultipleObjects, it means something is blocking our thread, usually a sync object.
    NOTE: If the call stack contains a call to a function named GetToSTA, it means that the call to ZwWaitForSingleObject is there because we are trying to switch to the STA thread that created the STA object and we are waiting for that switch. This means that the STA thread is blocked for some reason.
  4. ~[Finalizer Thread Index]e!clrstack (i.e. ~24e!clrstack) – Dumps the .NET call stack. Even if we don't see a call to ZwWaitForSingleObject or ZwWaitForMultipleObjects, we might still be blocked due to a call to the .NET Monitor class (the .NET equivalent of a Critical Section). If we do see a call to Monitor.Enter it means we have some kind of a managed deadlock.

How to resolve a blocked finalizer thread?

If we are talking about a block that is being created due to a synchronization object (usually a critical section) we need to address it as a deadlock.

Managed Deadlock
To resolve a managed deadlock refer to the previous post on Identifying Deadlocks in Managed Code.

Unmanaged Deadlock
I'll give a crash course in finding unmanaged deadlocks, specifically ones involving Critical Sections, since this is mainly a .NET debugging blog.

There is a Microsoft extension called sieextpub.dll (you can download it from here) that is mainly focused on resolving COM issues. It also has some useful commands for synchronization objects that we will use.

  • Run the !critlist command. It will list all locked critical sections and the threads that own them (this is similar to running !locks but runs a lot faster).
  • Run ~[Thread ID]k (i.e. ~[22]k) to get the call stack of the owning thread.
  • If the problem is not with a critical section but with another synchronization object, such as a Mutex or a Semaphore, you can run !waitlist to see all the handles that each thread is blocked on. These handles should appear as parameters to functions such as WaitForSingleObject and WaitForMultipleObjects. We can use some native WinDbg commands to find these handles in the call stack of the finalizer thread. For example:
    WaitForSingleObject:
    The handle itself will be shown in the call stack (using the kb or kv command, not just k). It is the first parameter, 00000594 in this example:
    09b2f608 77ab4494 00000594 ffffffff 00000100 KERNEL32!WaitForSingleObject+0xf

    Compare this number to the number shown in the output of !waitlist and you will see who is the thread that is blocking the handle.

    WaitForMultipleObjects:
    Here the handle doesn't appear directly in the kb command output since the function receives an array of handles. To find it we will need two parameters:
    00f4ff34 791d25d5 00000003 00f4ff58 00000000 KERNEL32!WaitForMultipleObjects+0x17

    The first parameter (00000003) is the count of elements in the passed array. The second (00f4ff58) is the pointer to the array. To dump the array on the screen we will need to run the following command:
    dc 00f4ff58 – This will output the memory at that address and the output will look something like this:
    00f4ff58 00000700 00000708 000006d4 81b35acc ………….Z..
    00f4ff68 03a50008 b6788cb0 00000000 00000003 ……x………
    00f4ff78 00000000 ffffffff 00000000 00f4ff4c …………L…
    00f4ff88 8042f639 00f4ffdc 7920fd39 791d25e0 9.B…..9. y.%.y
    00f4ff98 ffffffff 00f4ffb4 791d254c 00000000 ……..L%.y….
    00f4ffa8 00dfcdf4 00000000 791d4d50 00f4ffec ……..PM.y….
    00f4ffb8 7c57b388 03a50008 00dfcdf4 00000000 ..W…………
    00f4ffc8 03a50008 7ff9f000 00dfc8ac 00f4ffc0 …………….

    We need to look at the first three values in the dump (00000700, 00000708 and 000006d4), since we saw earlier that we have 3 handles to watch for. We should check whether they appear in the output of !waitlist to see who is blocking on them.

STA COM issues

STA COM issues are usually caused by the thread in which the STA object was created being blocked. To quickly find which thread the finalizer thread is trying to switch to in order to call the destructor of the STA COM object, we will use the sieextpub.dll extension mentioned above and call the !comcalls command, which will simply show us the two important figures: the index of the thread that is trying to switch and the index of the thread that it is trying to switch to.

Best resolution way – Avoidance

The best way of resolving a blocked finalizer thread is avoiding it in the first place.
The first rule of thumb is AVOID FINALIZATION. If you don't have to use it, DON'T.

The only case where implementing the Finalize method is important is when your class holds private members that are native resources, or that wrap native resources (SqlConnection for example, or the various streams like FileStream and so on), in which case it is best to use the IDisposable pattern. You can find a lot of information on the internet about that, but I'll just point out a few interesting links such as this (from Eric Gunnerson's blog; he is the Visual C# PM) and this excellent post (from Peter Provost's blog).
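
The shape of the pattern, roughly (a minimal sketch with made-up member names, not a drop-in implementation; see the links above for the full treatment):

using System;

class ResourceWrapper : IDisposable
{
    IntPtr _nativeHandle;   // stands in for an unmanaged resource
    bool _disposed;

    public void Dispose()
    {
        Dispose(true);
        // The cleanup already ran, so tell the GC to skip the finalizer;
        // the object can now die in a single collection.
        GC.SuppressFinalize(this);
    }

    ~ResourceWrapper()
    {
        // Safety net only: runs on the finalizer thread if the caller
        // forgot to call Dispose().
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;
        if (disposing)
        {
            // release other managed IDisposable members here
        }
        // release the unmanaged resource on both paths
        _nativeHandle = IntPtr.Zero;
        _disposed = true;
    }
}

class Program
{
    static void Main()
    {
        using (ResourceWrapper w = new ResourceWrapper())
        {
            // use w; Dispose() runs deterministically at the end of the block
        }
    }
}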

In regard to STA COM objects, the minute you don't need them call Marshal.ReleaseComObject. This function guarantees that the COM object will be released the minute you call it and NOT when the next GC occurs.
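
In code, that usually means wrapping the usage in try/finally (the ProgID below is only a commonly available example used for illustration; substitute whatever COM object your interop layer actually creates):

using System;
using System.Runtime.InteropServices;

class ComReleaseDemo
{
    static void Main()
    {
        // "Scripting.FileSystemObject" is used here purely as an example COM class.
        Type fsoType = Type.GetTypeFromProgID("Scripting.FileSystemObject");
        object fso = Activator.CreateInstance(fsoType);
        try
        {
            // ... use the COM object through its interop wrapper ...
        }
        finally
        {
            // Releases the underlying COM object right now, instead of leaving
            // the wrapper for the finalizer thread and the next GC.
            Marshal.ReleaseComObject(fso);
        }
    }
}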

This was a long post, but I think it was worthwhile. It is very important not to block the finalizer thread. I know that having only one finalizer thread, without some watchdogging mechanism, is not a good design, and Microsoft is aware of that.

Some additional resources for extra reading:

http://blogs.msdn.com/maoni/archive/2004/11/4.aspx – An excellent post by Maoni.

http://msdn.microsoft.com/msdnmag/issues/1100/GCI/ – An excellent article on Garbage Collection in .NET. Specifically read the “Forcing an Object to Clean Up” section which talks about the finalization of objects and a few other things.

http://www.devsource.com/article2/0,1759,1785503,00.asp – An excellent article on DevSource about .NET memory management and Finalization

CPU spinning occurs when a certain thread gets into an infinite loop or runs a computation-intensive operation that takes a long time. In both cases, the thread is "hogging" the CPU and the other threads of the same process get far less chance to run than they usually would.

The technique used to find out what is causing the CPU to spin is not specific to .NET, since both .NET code and native code can cause the CPU to spin. The only difference is that instead of looking at the native stack (using one of the "k" commands) we will use the !clrstack command to see the managed stack of the thread that is causing the problem.

This technique is very useful in live debugging, but can also be used in a post mortem situation using a memory dump taken by, for example, the customer.

In a live debugging situation, when we attach to the relevant process while the CPU is hogged by it, it may take the debugger up to 30 seconds before it can forcibly break in.


Finding the rebellious thread
To find the rebellious thread, all we need to do is run a command called !runaway.
The command will show us all the threads running in the process and how much CPU time each of them has actually consumed.

The thread with the highest CPU time is usually the thread that is causing the spinning.

Of course, to actually be able to see this you will have to let the application run for a bit at full CPU utilization so that the rebellious thread pops up at the top of the !runaway list.

Showing the thread’s stack
The thread ID shown in the !runaway command list is its true operating system ID and NOT its index (like you usually see and use with the "~" command).

To dump its stack we need to use the following syntax:
~~[Thread ID]e!clrstack
For example: ~~[2234]e!clrstack

The reason that we add the “e” is because !clrstack is a command that is not native to the debugger and it is implemented in a debugger extension (the SOS.dll extension).


If this is an unmanaged thread, we will use the following syntax:
~~[Thread ID]k
For example: ~~[2234]k

Of course, you can use any of the k commands that are available (kb, kv, etc.).


It’s that simple. No development environment needed. Nothing.
Just 2 commands and you can find the problematic code in no time.

Happy spin fighting!

In a server application even the smallest leak can grow fast to become a major issue.

In a pure unmanaged world, finding these leaks can be challenging, but there are more than a few ways of doing so (perhaps I shall discuss this in a later post). In .NET, however, with the help of our faithful friends WinDbg and the SOS extension, we can easily find and eliminate these leaks in no time if we stick to a proven and useful methodology.

As you already know, in pure managed code there are no memory leaks in the traditional sense; you can't forget to free something, since we are in a garbage collected environment. What you can do, however, is forget to unreference an object, in which case the GC will never consider it garbage and will never collect it.

A common situation involves having a static variable (a Shared variable for our VB.NET audience) that is a collection of some type (usually a Hashtable) to which you add various objects.
Since it's a static variable it is rooted and will never get unreferenced, and if we don't explicitly remember to remove the objects we have added to that collection, we have a managed memory leak.
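
A minimal sketch of that situation (the class and member names are made up): the static Hashtable is rooted, so every buffer added to it stays alive until someone remembers to remove it.

using System;
using System.Collections;

class SessionCache
{
    // Static (Shared in VB.NET) collection: it is rooted for the lifetime of
    // the AppDomain, so everything added here stays reachable.
    static Hashtable _items = new Hashtable();

    public static void Add(string key, object value)
    {
        _items[key] = value;
    }

    // The fix is either not to use a static collection at all, or to make
    // sure every Add has a matching Remove.
    public static void Remove(string key)
    {
        _items.Remove(key);
    }
}

class Program
{
    static void Main()
    {
        for (int i = 0; i < 10000; i++)
        {
            SessionCache.Add("request-" + i, new byte[1024]);
            // BUG: SessionCache.Remove is never called, so 10,000 buffers
            // stay rooted and "# Bytes in all Heaps" keeps growing.
        }
        Console.WriteLine("Leaked roughly 10 MB through a rooted Hashtable.");
    }
}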

So how can we overcome this problem?

Well… there are three ways.

  1. You can go the money way, which involves buying one of the various .NET memory profilers available today (I can personally recommend the fine and very easy to use .NET Memory Profiler by SciTech. It's an excellent tool, you get a 14-day fully functional evaluation, and the new version also has a command line and an instrumentation API that lets you initiate the profiler at certain places in the code without user intervention).
  2. You can go the programming way by implementing your own memory profiler using the .NET profiling API, which involves implementing a COM interface and handling all the plumbing yourself (yet another post material for future posts 😉 and it's even very interesting to do).
  3. You can use WinDbg and the SOS extension with a certain methodology that will guarantee success.

Obviously, we are going to talk about option #3 in this post.

The tools and commands we are going to use are:

  • adplus.vbs – a sophisticated VBScript, written by some fine SIE engineers at Microsoft Support, that automates the use of cdb and the process of taking memory dumps. It comes with the Debugging Tools for Windows installation (the WinDbg installation).
  • WinDbg (duh!)
  • !dumpheap – An SOS extension command used to list the objects allocated on the managed heap.
  • !gcroot – An SOS extension command used to find which objects are referencing the object we are currently checking.
  • !dumpobject – An SOS extension command used to investigate a certain object and see what its fields are and what they reference.

Before we can start working in WinDbg we need a set of dumps taken at a predefined, fixed interval. If the memory leak is small we will use a bigger interval (e.g. 20 minutes); if the leak is big we will use a small interval (e.g. 1 minute).

To take the dump we will use the following adplus.vbs command line:

([UPDATE] 6/16/2005: Thanks to the anonymous reader who pointed out my mistake. I WAS referring to the use of the -hang option and NOT the -crash option, which is good for other situations.)

adplus.vbs -hang -p [process Id] -quiet

  • -hang tells adplus.vbs to attach to the debugged process, take a memory dump and detach (this will only work on Windows 2000 and above, since the detach option is only available there).
  • Replace [process Id] with the PID of the process whose dump we want to take.
  • -quiet silences unnecessary message boxes that adplus might pop up in certain cases.

After we have a series of dumps (usually 3 will suffice to show a trend) we can get down and dirty with WinDbg.

TIP: To save your WinDbg session, which will contain all the commands you have executed and all the output you have received, use the .logopen [file name] command after opening the memory dump and before running any other command.

Since our problem is memory increasing in a pure managed application, it means we are allocating objects and some object that is still alive is holding a reference to them, making the GC not consider them garbage and therefore not collect them.

To find them we will use the following set of commands:

  1. Run !dumpheap -stat to get a statistical view of all objects currently allocated on the heap. If the case is indeed objects that are allocated and never get freed, you will find that certain classes have a steadily increasing number of live instances over the course of the series of dumps, and this should point to the objects that we should take a closer look at. Click here to see a Sample Output.
    In our case, we see that we have 4500 instances of type MemoryLeakSample.MyObject, which should turn on a red light in our head since this is a good candidate for a leaking object.
  2. Run !dumpheap -type MemoryLeakSample.MyObject, where "MemoryLeakSample.MyObject" is the full namespace and class name of the type we found in the step above. This should give us a list of all the instances sorted by their addresses. Since we are in a GC environment and the .NET GC compacts the heap, the instance with the smallest address will be our oldest instance and we should focus on it. Click here to see a Sample Output.
  3. Run !gcroot 0x04ab8348, where 0x04ab8348 is the address of the oldest object (the first one in the list) that we saw earlier, and we will get a full list of references showing who is holding whom. In our case, since we have a static Hashtable, we will see something like "HANDLE(Strong):931c0:Root …." which means that the parent (the true parent at the top of the reference list) is rooted, meaning it will never get freed unless the AppDomain unloads (in a multiple AppDomain scenario) or the process dies. Click here to see a Sample Output.

Now all we have left to do is find out where in the code we defined this static variable and either NOT use a static variable (the best option, but not always possible) or make sure we clean up this collection.

This methodology can be used to find not only rooted objects, but also a big increase in instances of a certain type over a period of time, and to see who is holding them.

The best thing about this methodology is that you can apply it at a customer's site without a problem and find 80% of the problem without having your symbols and/or your code.

Happy leak hunt!

Setting a breakpoint in native code using WinDbg is easy.
You just run the bp command with the address of the place in memory where you want to place the breakpoint.


Setting a breakpoint in Managed code is a bit trickier.


First of all, WinDbg is a native debugger. Until a method is JITed (JIT as in Just In Time compilation) it doesn't even have native code.
The managed assembly might be loaded and mapped, but only after the function is JITed and its native code is generated and loaded does it have a place in memory on which WinDbg can set a breakpoint.

An exception to this is Ahead of Time (AOT) compiled assemblies (assemblies that have a native image generated by ngen.exe).

The steps for setting a breakpoint in Managed code are very simple:

  1. Find the method table handle of the class.
  2. Find the method description handle of the method we want to put a breakpoint on.
  3. Place a breakpoint on the virtual address of the method we want to break into (this will only work on methods that are already JITed, whether via AOT or because they were called at least once and actually got JITed).

OK, now that we understand the logical steps on how to do it, what do we actually do? (I’m using the same executable from the previous post).

  1. !name2ee [Assembly name (including extension)] [Class full namespace]. For example: !name2ee SyncBlkDeadLock.exe SyncBlkDeadLock.Form1. That is the class in which we want to place a breakpoint on one of its methods. The output will look like this.
  2. !dumpmt -md [MethodTable handle that we got from the previous command]. For example: !dumpmt -md 0x00a8543c. The output will look like this.
  3. !dumpmd [MethodDesc handle that we got from the previous command]. For example: !dumpmd 0x00a853d8. This is the handle for the method SyncBlkDeadLock.Form1.Thread1Handler(). The output will look like this.
  4. In the field "Method VA" we now have the method's Virtual Address and we can set a breakpoint on that address.

Quite easy, when you know what you are looking for 🙂

Remember that this only sets a breakpoint on the function's entry address, so whenever this function is called it will break (unless you set some kind of conditional breakpoint).

Enjoy!

What is a deadlock?

(from the first entry when running the google query “define:deadlock“)
“This is an inter-blocking that occurs when two processes want to access at shared variables mutually locked. For example, let A and B two locks and P1 and P2 two processes: P1: lock A P2: lock B P1: lock B (so P1 is blocked by P2) P2: lock A (so P2 is blocked by P1) Process P1 is blocked because it is waiting for the unlocking of B variable by P2. However P2 also needs the A variable to finish its computation and free B. So we have a deadlock”
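
Translated into C#, the same scenario looks roughly like the following contrived sketch (acquiring the two locks in the same order on both threads would avoid the deadlock):

using System;
using System.Threading;

class DeadlockDemo
{
    static readonly object LockA = new object();
    static readonly object LockB = new object();

    static void Process1()
    {
        lock (LockA)               // P1: lock A
        {
            Thread.Sleep(100);     // give the other thread time to grab B
            lock (LockB)           // P1: lock B (blocked, P2 owns it)
            {
            }
        }
    }

    static void Process2()
    {
        lock (LockB)               // P2: lock B
        {
            Thread.Sleep(100);
            lock (LockA)           // P2: lock A (blocked, P1 owns it) -> deadlock
            {
            }
        }
    }

    static void Main()
    {
        Thread t1 = new Thread(new ThreadStart(Process1));
        Thread t2 = new Thread(new ThreadStart(Process2));
        t1.Start();
        t2.Start();
        t1.Join();   // never returns: the two threads are waiting on each other
        t2.Join();
    }
}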

What are the symptoms of a deadlock?
Usually you will see that the application stops responding for some reason and the CPU (if you open up the Task Manager) is at 0% most of the time (if this is the only major application running on the machine).

What to do?
So… how can we find a deadlock, specifically in a production environment without a debugger or a development environment? (Just the kind of problem we like 🙂 )

What we want to find out is who is holding the locks and who is waiting on them.

Below are a few easy steps to figure this out.

  1. Attach WinDbg to the relevant process (F6 and select the process).
  2. Make sure the symbol paths are OK by calling .sympath. If you don't have the symbols for .NET and the rest of Windows, just type ".symfix c:\symbols"; this will point to Microsoft's public symbol server and will download all relevant symbols into c:\symbols. If you have symbols of your own in some folder you can use .sympath with the += flag (like this: .sympath += c:\mypath) to add the additional path.
  3. Type ".load clr10\sos.dll" to load the SOS extension (it might already be loaded, but it won't do any harm calling it again).
  4. Run the "!SyncBlk" command.

You can see a sample output here.

As you can see, we have two locks, since we have 2 resources that are locked. Each resource is locked by a different thread and each of these threads is trying to acquire a lock on the other resource.

In the "ThreadID" column we can see the ID (and near it its index in the current debug session) of the thread that owns the lock. Under each lock we can see the list of indices of the waiting threads.

Now all we have left is to use the "!clrstack" command on each of the waiting threads and the locking threads to see where they are in the code, determine why the deadlock is happening, and figure out a way to avoid this situation.

To run the !clrstack command on a specific thread just use the following command “~3e!clrstack“. This command displays the CLR stack of the thread whose index is 3.

You can see a sample output of !clrstack (specifically ~4e!clrstack) here

While debugging this I had the debug symbols available, that is why WinDbg was able to give me an exact line number of where this is happening. If you don’t have the debug symbols of your code you will only see the function name which is good enough in most cases.

Can’t do a live debug? The problem is at a remote customer’s site?

If you are not able to live debug this situation because it happens at a customer site and you cannot get there or are unable to have direct access to the machine, you can perform all the steps above on a dump taken using the "adplus.vbs" script.

Just tell the customer to download WinDbg from here and tell him to reproduce the problem. When he does reproduce it tell him to run the following command line:

adplus.vbs -p [process Id] -hang

(Replace [process Id] with the process Id of the application).

And tell him to send the dump back to you. You will be able to run the same commands as above and figure out where the deadlock happens.

Download sample code used in this post.

Happy Deadlock Hunt!

WinDbg is the tool we will mostly use here to do both production debugging and post mortem debugging.

I've come across some interesting links that will bring you up to speed with WinDbg:
A word for WinDbg
A word for WinDbg 2

They are really good and it will help if you’ll get familiar with the tool before we dive into a lot of .NET debugging fun.

Hello Everyone.
My name is Eran Sandler and I've been messing around with .NET for the past 5+ years (since a little after the first PDC at which it was introduced to the world).

I wanted to share with the world some of the experience I've gathered over the last couple of years regarding .NET debugging, specifically production debugging and post mortem debugging, since there is little to no information available on the web (not to mention not a whole lot of tools to do it with).

So what is Production Debugging?
In my own terms, production debugging is the ability to debug your application in a production environment without being able to install a fully fledged development environment. This means NO Visual Studio or other IDE at your disposal, NO code compilation on the machine, and a lot of other NOs that you can figure out yourself.

This type of debugging is well suited to situations where you need to debug some tricky problem in a production environment and there is no way (or ability) to install a development environment.
A good example of this situation is debugging on a customer's production environment.

OK, so what is Post Mortem Debugging?
I'm not sure it's the right term for it (and maybe I mixed it up with some other term), but basically it's the ability to investigate a problem after it occurred.
We are capable of doing that using a tool that takes a snapshot of all the memory being used at the moment by a given process. This gives us an image of the process at a certain point in time, enabling us to investigate various things like the objects alive at that moment, what the various threads are doing, and so on.

Post mortem debugging is very useful if you can't reach a customer's site (or don't want to) and you need more information to better understand and solve a problem.
Another good use of post mortem debugging is the ability to take a few sequential snapshots of a certain process and examine how some information changes over time, for example, checking the virtual memory fragmentation that is happening over time in a certain process.


By now you are probably wondering if this is actually possible. Well… it is!
Microsoft even supplies the tools (Debugging Tools for Windows, which contains WinDbg and friends, and there is also CorDbg, the managed .NET debugger that comes with the .NET Framework SDK), and it's even quite easy, but the main problem is the lack of documentation and places on the net to actually learn this stuff.

I will try to give some tips, pointers and tutorials on how to do various things I've picked up along the way, and hopefully we will have enough information here to help everyone.

In the next couple of posts I'm going to show the basics of how to use WinDbg (that's the main tool we'll use) with .NET and we will learn some cool stuff.

So… hold on to your horses because we are going on a crazy debugging ride 🙂