Though I’ve briefly mentioned in a previous post that WinDbg has a built-in semi-scripting language for instrumenting it a bit, I didn’t explain much about it, so here is a nice little trick that you can do with it.

As part of this semi-scripting language there are a few flow-control commands that are very useful for performing a specific operation on multiple items, such as object addresses.
One such flow-control command is “.foreach”, which runs a command and, for each line of output that it returns (more precisely, each whitespace-separated token), invokes another command on that specific value.

Let’s say you want to view the full content of all of the string objects in your application that are in Generation 2.

The long way to perform this is to dump all Generation 2 string objects using “!dumpheap”, save the list of addresses on the side and run “!dumpobj” on each and every one of the addresses we got.

The quick way to do it would be to use the “.foreach” command like this:
.foreach (obj { !dh -type System.String -gen 2 -short }) { !do ${obj} }

What we are doing here is running “!dh -type System.String -gen 2 -short” which returns all of the Generation 2 String objects currently alive on the heap (!dh is the short form of !dumpheap).

Mind the “-short” parameter. It is essential for this to work, since “-short” tells the “!dumpheap” command to return only the object addresses, one per line, instead of the normal output of “!dumpheap”.

NOTE: “-short” was only introduced in newer versions of the SOS.dll that comes with WinDbg, so be sure to grab the latest version if you want to use it.

The second part of the “.foreach” command uses the obj variable, which holds each line that we received from the “!dumpheap” command; we expand it by using the “${obj}” syntax.

This will run “!do” (the short form of “!dumpobj”) on each and every one of the addresses returned by the “!dumpheap” command.
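To make the mechanics concrete, here is a minimal Python sketch (not WinDbg syntax) of what “.foreach” does with the “-short” output: it splits the inner command's output into tokens and substitutes each one into the outer command. The function name expand_foreach and the sample addresses are made up for illustration.

```python
def expand_foreach(inner_output, template, var="obj"):
    """Mimic .foreach: substitute each whitespace-separated token into the template."""
    placeholder = "${" + var + "}"
    return [template.replace(placeholder, token) for token in inner_output.split()]


# Hypothetical "-short" output: one object address per line.
short_output = "00a1b2c4\n00a1b2f0\n00a1b33c\n"

for command in expand_foreach(short_output, "!do ${obj}"):
    print(command)
# prints:
# !do 00a1b2c4
# !do 00a1b2f0
# !do 00a1b33c
```

This also shows why “-short” matters: with the normal “!dumpheap” output, every column of every row would become a token and “!do” would be run on method table addresses and sizes as well.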

Just a small and useful tip for debugging… Enjoy!

Dan McKinley emailed me to tell me about a WinDbg extension he wrote that interrogates CLR information. In the process of writing it he also documented its creation, thus producing a tutorial on how to create a WinDbg extension.

It’s all written in a nice and tidy post on his blog.

What is most interesting about his extension is that it interrogates CLR information, so it should be a good basis for people writing WinDbg extensions that perform operations on CLR information.

If you are thinking about writing a WinDbg extension, that’s a good place to start.

I got the following question from a reader named “Bysza” in the form of a comment on the post about “How to tackle memory issues in a server application”:

“I’m using this approach for some time now, but I not always succeed in finding roots to objects.
I’m looking for a method to find objects that hold reference to my live object which sould be already gone.

Could you help me in that matter?”

If I understand the question correctly, you want to be able to find who is holding your live objects.

If that is the case, it’s quite simple.
What you need to do is either take a memory dump (I explained a bit on how to use it in this post) or attach WinDbg to the relevant process.

After you have loaded the dump or attached to the process, perform the following operations:

  1. Run !dumpheap -type [Part of the full namespace of the object], for example:
    !dumpheap -type MyNamespace.MyObject
    This will get you a list of the live objects of that type.
  2. Take the address from the first column (that’s the address of the object) of the relevant object and run the command !gcroot [Object Address], for example:
    !gcroot 0x235025a
    This will show you the chain of references up to the root object that is holding a reference to your object.
    This chain contains the addresses of each object in the chain, so you will be able to know exactly who is holding whom and why.

If a rooted object is directly holding your object you need to check one of the following things:

  • Do you have some kind of a static (Shared in VB.NET) object in the application? If you do, this object is rooted, and therefore every object it references will never die until the application’s process is closed. Usually people have some kind of a static collection in which they store various objects, and in a lot of cases they forget to remove the objects from the collection.
  • Do you specifically pin objects? Do you call GCHandle.Alloc ? If so, be sure to release those handles or make sure that the objects being pinned do not hold references to objects you want to release.

I hope this helps a bit. If not, post a comment and we will continue the discussion there.

If any of you have questions about issues raised in this blog or from personal experience, please write me an email at eran.sandler at gmail.com or leave a comment on one of the posts. Also, if any of you have comments about how you implemented or used some of the things we have talked about here, I would love to hear about them.

I’m going to be helping out in the Visual Studio 2005 Launch Event here in Israel in a side session titled (and it’s a translation from Hebrew): “Error detection in off-road terrain and combining your application with Windows Error Reporting”.

Sounds a bit long for a title, doesn’t it? 😉

There are a few such side sessions which will be going on at the same time as other “Main” sessions. The main idea behind the side sessions is to have a smaller, more intimate session (~50+ people) that will allow better interaction between the lecturers and the people coming to hear the lecture.

We will be talking about how to do some production debugging at a customer site and how to handle memory dumps for post mortem debugging when you can’t have access to the client’s site. Gadi Meir of Idag (who is leading the session) will also talk about Windows Error Reporting (WER) and how to make your program work with it.
I will be talking about the managed areas of production and post mortem debugging as well as some samples on how to use, instrument and extend MDbg (which I talked about quite a bit in previous posts). Mainly these are topics that I’m usually covering here on the blog.

For some reason, since I’m a second chair in the presentation (and there are a lot of other second chairs for the other side sessions) I was not mentioned in the invitation.

That’s not nice, is it? (Remind me to have a small chat with the organizers. The second chairs should also be mentioned!)

Oh well… at least it’s going to be a cool event.
All of you in Israel – be sure to register here.

See you there and “Get Ready to Rock” (god, these marketing guys need someone to help them a bit… 😉 )

If you want to catch me for a quick talk/question/whatever I will be hanging around the place where the session will be conducted and probably all around the place, so be sure to catch me!

After further experimentation it looks like writing a WinDbg extension that will let one write WinDbg extensions in .NET is actually feasible. If you remember, I’ve talked about it in my previous post on the subject.

I could have gone the easy way: write some code in .NET exposed as a COM object and make my WinDbg extension call that managed code, which would do all the rest, but that would be too easy and a lot less fun 🙂

Therefore, I’ve decided to gain some experience and knowledge in the .NET CLR hosting API.

I’m using the hosting API to bind and start the CLR, call the rest of my .NET code and pass the various interfaces that WinDbg is passing me back to the .NET code.

The challenge is to convert the definitions of the various interfaces used to interact with WinDbg to .NET. It’s a bit of a tedious job, but someone has to do it… don’t you think?! 🙂

Anyhow, once I finish the part about the CLR hosting APIs I’ll write a long post about this subject and what I’ve learned while using this API, so stay tuned!

Oh, one last thing. In that previous post I tinkered with the idea of using some other way of instrumenting WinDbg instead of using its archaic and strange scripting language (actually that’s WinDbg’s command line interface turned into a scripting language).

Anyhow, technically I can enable any language, compile the script at runtime and run the assembly, but it’s a bit annoying. There is also the option of using IronPython or some other .NET dynamic language (maybe JScript .NET).

What do you think? Would you like to be able to script WinDbg in your favorite .NET scripting language?

I don’t know how many of you are writing WinDbg extensions, but it’s not that trivial for most people, and it’s certainly not as robust and nice as writing a plugin for MDbg or instrumenting it.

For managed debugging MDbg (which can help you debug both 1.1 and 2.0 frameworks) should be more than enough for all of your managed debugging needs but, AFAIK, MDbg does not handle dump files which are VERY important for post mortem debugging or for performance tuning/debugging in an environment that simply cannot be slowed down due to an attached debugger.

The only way of writing WinDbg extensions, at the moment, is to write C/C++ code. There are a lot of samples that come with the WinDbg installation, but it is not as robust or easy as writing it in, say, C#.

In addition to that, WinDbg has a built-in proprietary mini scripting language that you can use to instrument it a bit, but it’s not standard, lacks documentation (at least public documentation) and is not that easy to use (at least for most people).

What if you could write a WinDbg plugin in C# or in any one of your favorite .NET languages?
What if you could use JavaScript or a Monad like script? Or perhaps some C# and/or VB.NET as your “scripting language” for WinDbg?

I’m playing with the idea of writing a WinDbg extension that will enable writing WinDbg extensions in .NET.
Perhaps it will have a feature to run a special script that is not written in WinDbg’s strange scripting language.

What do you think?

Have you encountered a situation in which you needed to instrument WinDbg but didn’t know its “scripting language” or simply couldn’t do that certain thing you wanted to do?

Did you ever need some kind of functionality that is missing in WinDbg and didn’t want to spend too much time playing around with C/C++ to write it down?

I’ve started writing a prototype to see if this is feasible and it seems it is (at least for the moment 😉 ). I’ll try to find a bit more time to finish it and I’ll post my findings (and hopefully the code if it actually works).

A few posts ago I talked about a situation in which the finalizer thread is being blocked by a switch to an STA apartment when we are disposing an STA COM object without calling Marshal.ReleaseComObject.

In that post, I suggested using the sieextpub.dll debugger extension and calling its !comcalls command, which shows us which COM call is headed to which thread.

I just got an email from Herve Chapalain (Herve, please let me know if you have a link that you want me to put here, or your email address) about another way of finding the STA thread we are trying to switch to without using sieextpub.dll.

The method involves understanding the memory layout of a structure called OXIDEntry which, as far as I know (please correct me if I’m wrong), is an internal struct and does not appear in public symbols.

Fortunately, Herve is familiar with its memory layout and gave us the necessary offsets to use.

We first need to see the native stack of the finalizer thread (or, if we are not dealing with a blocked finalizer thread and just want to find a deadlock regardless of .NET, of the thread that is blocked) using one of the k commands.

The output should look something like this.
Notice that the call stack contains the GetToSTA call which means we do want to switch to the right STA thread that our object was created in.

We then take the 3rd parameter of the function “ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall” (you will see it in the call stack).
In Herve’s sample the address is 0016f6f4.

Next, we use the dd command to dump the memory at that address and we should be getting this:

0:005> dd 0016f6f4

0016f6f4 774a2328 774a22f8 00000003 0000002a
0016f704 001f5790 00000000 0016eec0 0016f978
0016f714 0017a120 01cca3e8 774a1930 00070005
0016f724 00000003 00000e54 00000000 004c0002
0016f734 000c01e0 0016f4d8 00150178 000201f5
0016f744 000801ee 00000000 0000000c 00000005
0016f754 00000000 00000000 0016f3b8 00171040
0016f764 0016eec0 0000b400 0e54086c c6a4049a

We need to take the 3rd value on the second line (0016eec0) and run dd on it as well (this is, probably, a pointer inside the OXIDEntry structure):

0:005> dd 0016eec0
0016eec0 0016ef38 775c6fa0 0000086c 00000e54
0016eed0 d3b9b1cc ca2488ab b5ad2644 88cf2210
0016eee0 b5ad2644 88cf2210 0000b400 0e54086c
0016eef0 c6a4049a 49cd531e 00000103 004a02e6
0016ef00 0015a580 00000000 00000000 00000000
0016ef10 00000001 ffffffff 0016ee08 0150803c
0016ef20 00000006 00000000 00000000 00000000
0016ef30 00000000 00070005 0016efb0 0016eec0

Now we have the process ID, which is 86c, and the thread ID, which is e54 (the 3rd and 4th values on the first line).

This means we are trying to move to thread e54 because that is where our STA COM object was created. In this sample, e54 is the main thread.
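If you save the debugger output to a text file, the two lookups above can be scripted. The following Python sketch parses dd output and reads the process and thread IDs from the 3rd and 4th values of the first line. The positions are taken from the sample in this post, not from a documented layout, so treat them as an assumption; the function names are made up for illustration.

```python
def parse_dd(output):
    """Return a list of (address, [dword, ...]) tuples, one per line of dd output."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        try:
            values = [int(field, 16) for field in fields]
        except ValueError:
            continue  # skip prompt lines such as "0:005> dd 0016eec0"
        if len(values) >= 2:
            rows.append((values[0], values[1:]))
    return rows


def pid_tid_from_dump(output):
    """Assumption from the sample: PID and TID are the 3rd and 4th dwords of the first line."""
    _address, dwords = parse_dd(output)[0]
    return dwords[2], dwords[3]


sample = """0:005> dd 0016eec0
0016eec0 0016ef38 775c6fa0 0000086c 00000e54
0016eed0 d3b9b1cc ca2488ab b5ad2644 88cf2210
"""

pid, tid = pid_tid_from_dump(sample)
print(f"process id: {pid:x}, thread id: {tid:x}")
# prints: process id: 86c, thread id: e54
```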

It’s a bit trickier, but if you know your offsets you can find it in no time and it’s very useful if you forgot to download sieextpub.dll or you are not able to install it at a customer site due to various limitations.

Thanks again to Herve for sending this information.

In the post about perfmon I’ve briefly mentioned the fact that you can utilize GC.Collect to find memory leaks.

By combining some custom code, WinDbg and SOS you can track most (if not all) of your managed memory leaks without buying a memory profiler.

The methodology is very simple and very easy to implement and use.
It involves taking several memory dumps at specific points and analyzing the objects on the managed heap.

Preparing the data for analysis
Before we can start analyzing we need to prepare the data by doing the following things:

  • Prepare a call to GC.Collect. The exact code should look like this:
    GC.Collect();
    GC.WaitForPendingFinalizers();
    Just to be sure that everything is clean, call this code 3 times. This will ensure that everything, including all objects with pending finalizers (objects that are waiting for their finalizer to be called), is really gone.

    If this is a WinForms application, just add a button that will call the above code.
    If this is a Web application (ASP.NET or ASP.NET WebService) add a page called Collect.aspx with this code (If you don’t want to compile this into a code behind assembly, just download this page and place it in the root folder of the application).

  • Run the application until it reaches a point just before the area of code you wish to analyze and take a memory dump using adplus.vbs -hang -p [process id].
    This dump will be used to show us what objects were already on the heap before we started the operation.
  • Run the operation we want to check for memory leaks and after it ends take another dump using the same command line as we used above.
    This dump will be used to show us what objects were added to the heap during the operation we are checking.
  • Run the Collect code (either by pressing the button you have prepared in the WinForms application or by accessing the Collect.aspx page if this is a web application) and take another memory dump.
    This dump will be used to show us what objects are left on the heap after the operation has ended and after we called GC.Collect a few times to make sure all garbage was collected. All objects that cannot be accounted for (explained why they are still alive) are leaks and should be handled.

How to analyze the data

We first need to get some statistics on each one of the dumps to get a hold of the number of objects and their types that we currently have in the heap. To do that, we need to run the command: !dumpheap -stat on each one of the dumps and get a statistical view of all types of objects and the number of living instances.

We then need to compare between the 2nd and 3rd dumps to see if some of the objects did not decrease in number. If they haven’t (and we know they should have) these are our leaking objects.

To find out who is referencing them, first list the instances with !dumpheap -type [Object Type] (where [Object Type] is the namespace and class name of the objects we want to check).

Then, we need to take one of the addresses (take the upper ones, they are the oldest ones) and run !gcroot [Object Address] on them to see who is referencing them.

Continue until you can account for all the objects in the 3rd dump and check if they are really supposed to be there after we called GC.Collect & GC.WaitForPendingFinalizers 3 times. Objects that should not be there should be traced to find who is referencing them and find out why they are being referenced.
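Comparing the statistics by eye gets tedious on a large heap. Here is a rough Python sketch that compares the !dumpheap -stat output of the 2nd and 3rd dumps and lists the types whose instance count did not decrease. It assumes each data row has the columns MT, Count, TotalSize and Class Name in that order, which may differ between SOS versions; the sample rows and function names are made up for illustration.

```python
def parse_dumpheap_stat(output):
    """Map class name -> instance count from '!dumpheap -stat' text (assumed column order)."""
    counts = {}
    for line in output.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # blank lines, "Total N objects", etc.
        try:
            int(parts[0], 16)      # method table address
            count = int(parts[1])  # number of instances
            int(parts[2])          # total size in bytes
        except ValueError:
            continue               # header row or other text
        name = parts[3].strip()
        counts[name] = counts.get(name, 0) + count
    return counts


def not_decreased(after_operation, after_collect):
    """Types that have at least as many instances in dump 3 as in dump 2."""
    suspects = {}
    for name, count_after_collect in after_collect.items():
        count_before = after_operation.get(name, 0)
        if count_after_collect >= count_before:
            suspects[name] = (count_before, count_after_collect)
    return suspects


dump2 = """      MT    Count TotalSize Class Name
00a1c2d4       50      1200 MyNamespace.MyObject
79b94638      120      4800 System.String
Total 170 objects
"""
dump3 = """      MT    Count TotalSize Class Name
00a1c2d4       50      1200 MyNamespace.MyObject
79b94638       40      1600 System.String
Total 90 objects
"""

print(not_decreased(parse_dumpheap_stat(dump2), parse_dumpheap_stat(dump3)))
# prints: {'MyNamespace.MyObject': (50, 50)}
```

The result is only a list of candidates. Types that are legitimately long-lived will show up too, so each one still has to be checked with !gcroot as described above.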

That’s it. Quite simple, saves money but doesn’t produce nice graphs like the other tools 🙂

Thanks to the good people at ShinyStat, who provide me with free statistics for my blog, I am able to see all referrers including those that come from search engines and even see the keywords they have used to search and find this blog.

I’ve looked just now at the statistics and saw one combination of keywords that interested me. It was something like this: “!dumpheap parameters”

This means that someone was looking for some help on the parameters of the !dumpheap command, so I thought I’d add some information about that.

What is !dumpheap?

!dumpheap is a command from the SOS extension that dumps the content of the managed heap.
You can get all the addresses and some additional information (as we will see) on all managed objects currently alive on the heap.

Starting with the last two versions of WinDbg, SOS was actually replaced by PSSCOR, which has a good help system. For most commands you can simply type “!help commandName”, for example “!help dumpheap”, and you will get detailed help on the parameters and how to use them.

!dumpheap Parameters:

  • -stat – Outputs only the statistical summary of all types of objects on the heap, their count and their own size (without references)
  • -nostrings – Exclude the output of strings (when not using -stat).
  • -gen X – Outputs only objects that belong to generation X, where X can have the following values: for 1.1 – 0, 1, 2, and 3 for large objects (objects larger than 85 KB, not counting the objects they reference). For 1.0 everything is the same, except that you use -1 instead of 3.
  • -min X – Ignores objects that are less than X (where X is a number in bytes).
  • -max X – Ignores objects that are larger than X (where X is a number in bytes).
  • -mt MethodTable – List only those objects with the MethodTable given.
  • -type TYPE – List only those objects whose type name is a substring match of TYPE.
  • -cache – Saves the objects in an internal cache for later use (helps to speed up things instead of rescanning the heap all over again).
  • -l X – Prints out only X items from each heap instead of all the objects.
  • -short – Prints out only the object address. Useful for combining with the .foreach command.
  • -fix START END – Use the given START and END addresses and scan the heap only between these addresses.

NOTE: If I remember correctly, -cache, -nostrings and -short are all new parameters that were added in the last two versions of SOS (previously PSSCOR); the rest have been available for quite a while in most versions of SOS.

-short, the wonder parameter

Just to show you how powerful the -short option is, let’s say that you want to print the content of all gen 2 objects that you currently have. Prior to the -short parameter you had to run “!dumpheap -gen 2”, copy the output to Notepad, parse it to leave only the object addresses, and only then could you either run !do manually on each address or use .foreach with the /f option.

Today, with -short all you need to do is run the following command line:
.foreach ( obj { !dumpheap -gen 2 -short } ) { !do ${obj} }

Quite useful, isn’t it…

Not all memory leaks in .NET applications specifically relate to objects that are rooted or being referenced by rooted objects. There are other things that might produce the same behavior (memory increase) and we are going to talk about one of them.

What is the Finalizer Thread?
The finalizer thread is a specialized thread that the Common Language Runtime (CLR) creates in every process that is running .NET code. It is responsible for running the Finalize method (for the sake of simplicity, at this point, think of the finalize method as some kind of a destructor) for objects that implement it.

Who needs finalization?
Objects that need finalization are usually objects that access unmanaged resources such as files, unmanaged memory and so on. Finalization is used to make sure these resources are closed or discarded to avoid actual memory leaks.

How does finalization work?
(NOTE: please email me if I’ve got something wrong or inaccurate at this stage 😉 )

Objects that implement the finalize method, upon their creation, are being placed in a finalization queue. When no one references these objects (determined when the GC runs) they are moved to a special queue called FReachable queue (which means Finalization Reachable queue) which the finalizer thread iterates on and calls the finalize method of each of the objects there.
After the finalize method of an object is called, it is ready to be collected by the GC.

So how can the finalizer thread get blocked?
A finalizer thread block occurs when the finalizer thread calls the finalize method of a certain object
and the implementation of that finalize method depends on a resource (a general term, for the sake of a general definition) that is blocked.
If that resource is not freed and made available to our finalize method, the finalizer thread will be blocked and none of the objects that are in the FReachable queue will get GCed.

For example:

  1. The implementation of the Finalize method contains code that needs to acquire a synchronization object (critical section, mutex, semaphore, etc.) and that synchronization object is already held and is not getting released (see the previous post on Identifying Deadlocks in Managed Code for help on resolving this issue).
  2. The object that is being finalized is a Single Threaded Apartment (STA) COM object. Since STA COM objects have thread affinity, in order to call the destructor of that COM object we have to switch to the STA thread that created that object. If, for some reason, that thread is blocked, the finalizer thread will also get blocked due to that.
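To illustrate why a single stuck Finalize call holds up everything behind it, here is a toy Python model of the two queues described above. It is only a simulation under the simplified description in this post, not how the CLR is implemented; the class and method names are made up for illustration.

```python
from collections import deque


class ToyFinalizer:
    """Toy model: a finalization queue, an FReachable queue and one finalizer thread."""

    def __init__(self):
        self.finalization_queue = []  # finalizable objects that are still referenced
        self.freachable = deque()     # unreferenced objects waiting for Finalize
        self.finalized = []           # objects whose Finalize has completed

    def register(self, name):
        """Called when a finalizable object is created."""
        self.finalization_queue.append(name)

    def collect(self, unreferenced):
        """A GC run: move objects nobody references any more to the FReachable queue."""
        for name in list(self.finalization_queue):
            if name in unreferenced:
                self.finalization_queue.remove(name)
                self.freachable.append(name)

    def run_finalizer_thread(self, blocked=()):
        """Finalize in order; return False at the first object whose Finalize would block."""
        while self.freachable:
            if self.freachable[0] in blocked:
                return False  # stuck here, everything behind it waits too
            self.finalized.append(self.freachable.popleft())
        return True


model = ToyFinalizer()
for name in ("a", "b", "c"):
    model.register(name)
model.collect({"a", "b", "c"})
model.run_finalizer_thread(blocked={"b"})
print(model.finalized, list(model.freachable))
# prints: ['a'] ['b', 'c']
```

Object "c" has nothing wrong with it, yet it stays in the queue (and in memory) simply because "b" is ahead of it.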

Symptoms of a blocked Finalizer thread

  • Memory is increasing
  • Possible deadlock in the system (depends on a lot of factors, but it can happen)

How can we tell our Finalizer thread is blocked?

The technique for finding if the finalizer thread is blocked is quite easy and involves a few easy steps.

First of all, we need to take a series of dumps at fixed intervals (e.g. 5 minutes apart). When we have the dumps, we need to run the following set of commands from the SOS extension on all dumps and compare results (to find out how to load the SOS extension and set the symbols path look at this previous post):

  1. !FinalizeQueue – This dumps the finalization queue (not the FReachable queue). If you see that the total number of objects is increasing from dump to dump, we can start to suspect that our finalizer thread is blocked.
    NOTE: I say “suspect” because it’s still not certain at this point that the finalizer thread is blocked. This situation can also mean that the finalization of some of the objects takes a long time and that we are allocating objects that need finalization faster than we are finalizing them.
  2. !threads – This command will show us all .NET threads in the current process. The finalizer thread is marked at the end of its line by the text (finalizer) (surprisingly enough 😉 ). At the beginning of that line you will see the thread’s index. We will need it to run the next command.
  3. ~[Finalizer Thread Index]k (e.g. ~24k) – This will dump the native call stack of the finalizer thread. If you see in all dumps that the last function in the call stack (the topmost) is something like ZwWaitForSingleObject or ZwWaitForMultipleObjects, it means something is blocking our thread, usually a sync object.
    NOTE: If the call stack contains a call to a function named GetToSTA, it means that the call to ZwWaitForSingleObject is there because we are trying to switch to the STA thread that created the STA object and we are waiting to switch to it. This means that the STA thread is blocked for some reason.
  4. ~[Finalizer Thread Index]e!clrstack (e.g. ~24e!clrstack) – This dumps the .NET call stack. If we don’t see a call to ZwWaitForSingleObject or ZwWaitForMultipleObjects in the native stack, we might still be blocked due to a call to the .NET Monitor class (a native .NET implementation of a critical section). If we do see a call to Monitor.Enter, it means we have some kind of a managed deadlock.
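The decision logic of the first and third steps can be summarized in a few lines of Python. This is only a sketch of the heuristic described above: it assumes you have already copied, by hand, the total object count reported by !FinalizeQueue and the topmost frame of the finalizer thread's native stack from each dump. The numbers, frame strings and function names below are made up for illustration.

```python
WAIT_FUNCTIONS = ("ZwWaitForSingleObject", "ZwWaitForMultipleObjects")


def queue_is_growing(counts):
    """True if the finalizable object count increases from every dump to the next."""
    return len(counts) >= 2 and all(b > a for a, b in zip(counts, counts[1:]))


def always_waiting(top_frames):
    """True if the topmost native frame is a wait function in every dump."""
    return bool(top_frames) and all(
        any(wait in frame for wait in WAIT_FUNCTIONS) for frame in top_frames
    )


def assess(counts, top_frames):
    if not queue_is_growing(counts):
        return "no sign of a blocked finalizer"
    if always_waiting(top_frames):
        return "finalizer thread is probably blocked"
    return "queue is growing; finalizers may just be slow"


# Hypothetical values copied from three dumps taken 5 minutes apart.
counts = [1200, 2950, 4410]
frames = ["ntdll!ZwWaitForSingleObject+0xb"] * 3
print(assess(counts, frames))
# prints: finalizer thread is probably blocked
```

A growing queue with a finalizer thread that is busy rather than waiting points to the slow-finalizer case from the NOTE in step 1, which is why the two checks are kept separate.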

How to resolve a blocked finalizer thread?

If we are talking about a block that is being created due to a synchronization object (usually a critical section) we need to address it as a deadlock.

Managed Deadlock
To resolve a managed deadlock refer to the previous post on Identifying Deadlocks in Managed Code.

Unmanaged Deadlock
I’ll give only a crash course on finding unmanaged deadlocks, specifically ones involving critical sections, since this is mainly a .NET debugging blog.

There is a Microsoft extension called sieextpub.dll (you can download it from here) that is mainly focused on resolving COM issues. It also has some useful commands for synchronization objects that we will use.

  • Run the !critlist command. It will list all locked critical sections and the threads that own them (this is similar to running !locks but runs a lot faster).
  • Run ~[Thread ID]k (i.e. ~[22]k) to get the call stack of the owning thread.
  • If the problem is not with a critical section, but with another synchronization object such as a mutex or a semaphore, you can run !waitlist to see all the handles that every thread is blocking on. These handles should appear as parameters to functions such as WaitForSingleObject and WaitForMultipleObjects. We can use some native WinDbg commands to find these handles from the call stack of the finalizer thread. For example:
    WaitForSingleObject:
    The handle itself will be shown in the call stack (using the kb or kv command, not just k) as the first argument (00000594 here):
    09b2f608 77ab4494 00000594 ffffffff 00000100 KERNEL32!WaitForSingleObject+0xf

    Compare this number to the numbers shown in the output of !waitlist and you will see which thread is blocking on the handle.

    WaitForMultipleObjects:
    Here the handle doesn’t appear directly in the kb command output since the function receives an array of handles. To find it we will need to find two parameters:
    00f4ff34 791d25d5 00000003 00f4ff58 00000000 KERNEL32!WaitForMultipleObjects+0x17

    The first argument (00000003) is the count of elements in the passed array. The second one (00f4ff58) is the pointer to the array. To dump the array on the screen we will need to run the following command:
    dc 00f4ff58 – This will output the memory at that address and the output will look something like this:
    00f4ff58 00000700 00000708 000006d4 81b35acc .............Z..
    00f4ff68 03a50008 b6788cb0 00000000 00000003 ......x.........
    00f4ff78 00000000 ffffffff 00000000 00f4ff4c ............L...
    00f4ff88 8042f639 00f4ffdc 7920fd39 791d25e0 9.B.....9. y.%.y
    00f4ff98 ffffffff 00f4ffb4 791d254c 00000000 ........L%.y....
    00f4ffa8 00dfcdf4 00000000 791d4d50 00f4ffec ........PM.y....
    00f4ffb8 7c57b388 03a50008 00dfcdf4 00000000 ..W|............
    00f4ffc8 03a50008 7ff9f000 00dfc8ac 00f4ffc0 ................

    We need the first three values (00000700, 00000708 and 000006d4), since we saw earlier that we have 3 handles to watch for. We should look to see if they appear in the output of !waitlist to see who is blocking on them.
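Picking the handles out of the dump can also be scripted. Below is a small Python sketch that takes the text of the dc output and the element count from the call stack and returns the handle values. It assumes the usual dc layout of an address, four dwords and an ASCII column per line; the function name is made up for illustration.

```python
def handles_from_dc(dc_output, count):
    """Return the first `count` dwords of a dc dump as handle values."""
    dwords = []
    for line in dc_output.splitlines():
        fields = line.split()
        # fields[0] is the address, fields[1:5] are the four dwords,
        # anything after that is the ASCII column and is ignored.
        for field in fields[1:5]:
            dwords.append(int(field, 16))
    return dwords[:count]


sample = """00f4ff58 00000700 00000708 000006d4 81b35acc .............Z..
00f4ff68 03a50008 b6788cb0 00000000 00000003 ......x.........
"""

# 3 is the first argument of WaitForMultipleObjects in the call stack above.
print([f"{handle:x}" for handle in handles_from_dc(sample, 3)])
# prints: ['700', '708', '6d4']
```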

STA COM issues

STA COM issues are usually caused by the thread in which the STA object was created being blocked. To quickly find which thread the finalizer thread is trying to switch to in order to call the destructor of the STA COM object, we will use the sieextpub.dll extension that we mentioned above and call its !comcalls command, which will simply show us the two important figures: the index of the thread that is trying to switch and the index of the thread that we are trying to switch to.

Best resolution way – Avoidance

The best way of resolving blocked finalizer thread is avoiding it.
The first rule of thumb is AVOID FINALIZATION. If you don’t have to use it DON’T.

The only case in which implementing the finalize method is important is when your class holds private members that are native resources or that wrap native resources (SqlConnection for example, various Streams like FileStream and so on), in which case it is best to use the IDisposable pattern. You can find a lot of information on the Internet about that, but I’ll just point out a few interesting links such as this (from Eric Gunnerson’s blog; he is the Visual C# PM) and this excellent post (from Peter Provost’s blog).

In regards to STA COM objects, the minute you don’t need them call Marshal.ReleaseComObject. This function will guarantee that the COM object gets released the minute you call it and NOT when the next GC occurs.

This was a long post, but I think it was worthwhile. It is very important not to block the finalizer thread. I know that having only one finalizer thread without some watchdog mechanism is not a good design, and Microsoft is aware of that.

Some additional resources for extra reading:

http://blogs.msdn.com/maoni/archive/2004/11/4.aspx – An excellent post by Maoni.

http://msdn.microsoft.com/msdnmag/issues/1100/GCI/ – An excellent article on Garbage Collection in .NET. Specifically read the “Forcing an Object to Clean Up” section which talks about the finalization of objects and a few other things.

http://www.devsource.com/article2/0,1759,1785503,00.asp – An excellent article on DevSource about .NET memory management and Finalization