Archive for the 'COM Interop' Category

A friend of mine, Eyal Post, whom I worked with at a previous company, opened up a blog about two weeks ago (and DIDN’T let me know about it). I found out about it by looking at my blog’s stats and seeing that he linked to one of my posts about STA COM objects in .NET.

Eyal added some more important information and a solution to handling STA COM objects in ASP.NET Web Services and you can read all about it (and get the code) here.

Enjoy!

Oh, and Eyal, when you do something like that, drop me a line by email :-)

One of the strongest areas of .NET is interop, which allows .NET applications to take advantage of legacy code, whether it is written in the form of C DLLs or COM objects.

In this post I will talk about the Runtime Callable Wrapper (RCW), which is the .NET wrapper around COM objects when they are accessed from a .NET application.

Before we discuss the internals of the RCW, we should first see how we can create it.

How can a .NET application access COM?
There are three ways a .NET application can access a COM object:

  1. Use the CreateObject function (only available in VB.NET)
  2. Add a reference to the COM object from Visual Studio .NET which will create an interop assembly automatically for you.
  3. Manually create an interop assembly and add a reference to it.

While options 2 and 3 are similar in result, they have a few side effects on strong naming and namespaces, which affect how the RCWs show up in the debugger. I will discuss this later on.

How does an RCW object work?

The RCW is a .NET object which is a wrapper on top of a COM object. It is the bridge between the Garbage Collection approach of .NET and the reference counted approach of COM.

It is treated as a pure .NET garbage collected object which means it will be collected only when no one is referencing it.

Since there is no effective way of marrying reference counting with garbage collection, what the RCW actually does is always hold a reference count of 1 on the COM object (excluding the cases where an RCW is marshalled across AppDomains, in which case its reference count is upped by one). This means that what determines whether the COM object lives or dies is whether its RCW is alive (which in garbage collection terms means it’s reachable from another .NET object or it is rooted).

An RCW also has a finalizer in which the COM object is actually being dereferenced and destroyed. This means that it takes at least 2 Garbage Collections until the COM object actually dies.

Important things we should all remember about RCWs:

Having an RCW control the death of the underlying COM object means that instead of the deterministic and immediate destruction model of reference counting we have nondeterministic garbage collection, and that is something we should keep an eye on.

The RCW itself is very light, so having a lot of RCWs alive will not affect the size of the GC heaps too much, but it will affect the private bytes, because the COM objects they keep alive live in native memory.

How to detect RCW “leaks”?

RCW leaks are actually RCWs that are referenced and never released. While finding them is similar to finding leaking .NET objects that are still being referenced, they have a different impact on the system.

As I said earlier, RCWs are light. The underlying COM object, however, takes up memory in the default NT heap or in some other custom heap. If it’s not getting freed, it adds pressure on the virtual memory, since native memory now takes up more of the address space, including regions the GC could otherwise have reserved.

Remember I said there are a few methods of adding COM objects to your .NET project? The main reason I listed them is that each method makes the objects appear a bit differently in WinDbg.

Option 1 – CreateObject – will make the RCWs appear of type System.__ComObject.

Option 2 – Adding a direct reference to the COM object – will make the RCWs appear as Interop.XXX.ComClassName where XXX is the name of the COM dll and ComClassName is the name of the COM class as it appears in the type library.

Option 3 – Manually creating an interop assembly – will make them appear like Option 2 if you did not change the namespace, or under the namespace that was chosen during the type library import with the tlbimp.exe utility.

So, how can we actually detect them? There are two useful methods for that.

Using WinDbg to detect RCW leaks

This method can work on a live debug or with memory dumps. The main technique is:

  1. Perform some operations that allocate RCWs.
  2. Take a memory snapshot or list all objects of that type that are currently alive.
  3. Run a GC
  4. Do step 2 again.

If you took memory dumps, all that is left now is to compare the lists of objects.

How to list the objects? Simply use !dumpheap -type XXX where XXX is the type of RCWs you use (see the 3 options above for the RCW type names you can expect).
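
Comparing the two listings by hand gets tedious once there are more than a handful of objects. Here is a minimal Python sketch of the comparison, assuming you saved the output of !dumpheap -type before and after the GC as text. The row format it expects (address, method table, size) and the sample addresses, method table and sizes are assumptions made up for illustration, and the function names are hypothetical. It compares counts rather than addresses, since a compacting GC may move the objects that survive.

```python
import re
from collections import Counter

# Assumed shape of one object row: <address> <method table> <size>.
OBJECT_ROW = re.compile(r"^\s*([0-9a-fA-F]{8,16})\s+([0-9a-fA-F]{8,16})\s+(\d+)\b")


def count_by_method_table(dumpheap_output):
    """Count object rows per method table, ignoring the statistics summary."""
    counts = Counter()
    for line in dumpheap_output.splitlines():
        if line.strip().lower().startswith("statistics"):
            break  # rows after this point are per-type totals, not objects
        match = OBJECT_ROW.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts


def still_alive_after_gc(before, after):
    """Return {method_table: (count_before, count_after)} for surviving types."""
    counts_before = count_by_method_table(before)
    counts_after = count_by_method_table(after)
    return {
        mt: (counts_before[mt], counts_after[mt])
        for mt in counts_before
        if counts_after[mt] > 0
    }


# Made-up sample listings (not taken from a real debugging session).
BEFORE = """\
 Address       MT     Size
01a81b5c 00e4b0c4       16
01a81c90 00e4b0c4       16
01a81dc4 00e4b0c4       16
total 3 objects
Statistics:
      MT    Count    TotalSize Class Name
00e4b0c4        3           48 System.__ComObject
"""

AFTER = """\
 Address       MT     Size
01a81b5c 00e4b0c4       16
01a81c90 00e4b0c4       16
total 2 objects
Statistics:
      MT    Count    TotalSize Class Name
00e4b0c4        2           32 System.__ComObject
"""

survivors = still_alive_after_gc(BEFORE, AFTER)
for method_table, (n_before, n_after) in survivors.items():
    print(f"MT {method_table}: {n_before} before GC, {n_after} after GC")
```

A count that refuses to drop across repeated runs of the four steps is the hint that something is still holding on to those RCWs.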

Using LeakDiag to detect RCW leaks

LeakDiag is a tool developed at Microsoft to detect unmanaged memory leaks. It can track and give you the exact size of the unmanaged memory leak as well as the complete stack trace of the code that allocated that memory.

To use LeakDiag you need to attach it to a running process and follow the same technique as described above for WinDbg. When you click on “Log” in LeakDiag you get an XML log file that tells you the exact call stack of the leaking memory, and from it you can identify the call stack that leads to the exact COM object.

NOTE: Make sure you have the correct symbols for your COM objects and that you set up a correct symbols path to those symbols prior to attaching LeakDiag to the process.

You can download LeakDiag from here. Although this is not the latest version, it is still handy.

There is another handy utility (written in .NET) called LDGrapher that parses LeakDiag’s XML log files and makes them a bit easier to view.

You can download LDGrapher from here (this is also not the latest version).

I plan to write in greater detail about LeakDiag and LDGrapher in a future post.

I’ll try and see if I can convince some of the guys at MS to release LeakDiag and LDGrapher like they released DebugDiag. I’ll keep you posted on those efforts here in this blog.

A few posts ago I talked about a situation in which the finalizer thread is being blocked by a switch to an STA apartment when we are disposing an STA COM object without calling Marshal.ReleaseComObject.

In that post, I suggested using the sieextpub.dll debugger extension and calling its !comcalls command, which shows us which COM call is headed to which thread.

I just got an email from Herve Chapalain (Herve, please let me know if you have a link that you want me to put here, or your email address) about another way of finding the STA thread we are trying to switch to without using sieextpub.dll.

The method involves understanding the memory layout of a structure called OXIDEntry which, as far as I know (please correct me if I’m wrong), is an internal struct and does not appear in public symbols.

Fortunately, Herve is familiar with its memory layout and gave us the necessary offsets to use.

We first need to see the native stack of the finalizer thread (or, if we are not dealing with a blocked finalizer thread and just want to find a deadlock regardless of .NET, the thread that is blocked) using one of the k commands.

The output should look something like this.
Notice that the call stack contains the GetToSTA call, which means we are trying to switch to the STA thread that our object was created in.

We then take the 3rd parameter of the function “ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall” (you will see it in the call stack).
In Herve’s sample the address is 0016f6f4.

Next, we use the dd command to dump the memory at that address and we should be getting this:

0:005> dd 0016f6f4

0016f6f4 774a2328 774a22f8 00000003 0000002a
0016f704 001f5790 00000000 0016eec0 0016f978
0016f714 0017a120 01cca3e8 774a1930 00070005
0016f724 00000003 00000e54 00000000 004c0002
0016f734 000c01e0 0016f4d8 00150178 000201f5
0016f744 000801ee 00000000 0000000c 00000005
0016f754 00000000 00000000 0016f3b8 00171040
0016f764 0016eec0 0000b400 0e54086c c6a4049a

We need to take the 3rd value on the second line (0016eec0) and run dd on it as well (this is, probably, a pointer inside the OXIDEntry structure):

0:005> dd 0016eec0
0016eec0 0016ef38 775c6fa0 0000086c 00000e54
0016eed0 d3b9b1cc ca2488ab b5ad2644 88cf2210
0016eee0 b5ad2644 88cf2210 0000b400 0e54086c
0016eef0 c6a4049a 49cd531e 00000103 004a02e6
0016ef00 0015a580 00000000 00000000 00000000
0016ef10 00000001 ffffffff 0016ee08 0150803c
0016ef20 00000006 00000000 00000000 00000000
0016ef30 00000000 00070005 0016efb0 0016eec0

Now we have the process id, which is 86c, and the thread id, which is e54 (the 3rd and 4th values on the first line).

This means we are trying to move to thread e54 because that is where our STA COM object was created. In this sample, e54 is the main thread.
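
If you do this often, the pointer chasing above can be scripted. Below is a rough Python sketch that parses the two dd outputs from Herve’s sample and pulls out the process id and thread id. The offsets (0x18, 0x8 and 0xC) are simply read off this sample; OXIDEntry is undocumented, so treat them as assumptions that may differ on other Windows versions. The function and constant names are hypothetical.

```python
# Offsets read off the sample above; they are assumptions, not documented values.
ENTRY_POINTER_OFFSET = 0x18  # 3rd value on the second line of the first dump
PROCESS_ID_OFFSET = 0x08     # 3rd value on the first line of the second dump
THREAD_ID_OFFSET = 0x0C      # 4th value on the first line of the second dump


def parse_dd(output):
    """Map each dumped address to its 32-bit value from WinDbg dd output."""
    memory = {}
    for line in output.splitlines():
        fields = line.split()
        if not fields or fields[0].endswith(">"):
            continue  # skip blank lines and the echoed "0:005> dd ..." prompt
        base = int(fields[0], 16)
        for index, word in enumerate(fields[1:]):
            memory[base + 4 * index] = int(word, 16)
    return memory


def find_target_thread(param_address, first_dump, second_dump):
    """Follow the pointer from the first dump and read pid/tid from the second."""
    entry = parse_dd(first_dump)[param_address + ENTRY_POINTER_OFFSET]
    target = parse_dd(second_dump)
    return target[entry + PROCESS_ID_OFFSET], target[entry + THREAD_ID_OFFSET]


# The relevant lines of the two dumps shown above.
FIRST_DUMP = """\
0:005> dd 0016f6f4

0016f6f4 774a2328 774a22f8 00000003 0000002a
0016f704 001f5790 00000000 0016eec0 0016f978
"""

SECOND_DUMP = """\
0:005> dd 0016eec0
0016eec0 0016ef38 775c6fa0 0000086c 00000e54
"""

pid, tid = find_target_thread(0x0016F6F4, FIRST_DUMP, SECOND_DUMP)
print(f"process {pid:x}, thread {tid:x}")  # prints "process 86c, thread e54"
```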

It’s a bit trickier, but if you know your offsets you can find it in no time and it’s very useful if you forgot to download sieextpub.dll or you are not able to install it at a customer site due to various limitations.

Thanks again to Herve for sending this information.

I’ve stumbled upon an article at MSDN titled “Registration-Free Activation of .NET-Based Components: A Walkthrough”. It explains how to write a .NET component that is exposed through COM and can be deployed WITHOUT registering it in the registry to make it visible to the COM code accessing it.

This is specifically useful for a couple of things:

  1. Deployments in which the registry cannot be modified due to the organization’s security policy (I’ve met a few of those in the past and it’s not nice. Believe me.).
  2. Better Side-By-Side support. This is a better way of supporting side by side without using the registry. Everything is embedded in the .NET assembly and just by dropping it into the GAC you get everything you need.
  3. No dependency on the registry, which is always a good thing :-).

The only downside of this technology is that it is only supported on Windows XP SP2 and Windows Server 2003.

There are a lot of problems that can occur due to misusing the .NET registration like:

  • Using regasm with the /codebase parameter that fixes the exact assembly (and its location) that will be used when creating this object. This is like writing in .NET and deploying in a non side-by-side supported environment like COM.
  • Garbage left in the registry by incorrect uninstallation, which can lead to unnecessary debugging until you come to the conclusion that the problem was a dirty registry (and that CAN be annoying, believe me).

You can spend a good couple of hours debugging in a production environment just to realize that the problem is due to stupid mistakes like I’ve mentioned above.

Anything that can alleviate this situation is blessed.

I’m such a statistics junkie that I found another interesting Google query that returned my site, and someone clicked on the link and got in (thanks again ShinyStat guys for that cool feature).

Anyhow, the query that got my attention today was “marshal.releasecomobject spinning“.

Now this is interesting and I thought I would elaborate a bit on Marshal.ReleaseComObject and the problems that can happen due to calling it (or not calling it).

First, we need to understand what Marshal.ReleaseComObject actually does and then we will cover the two cases mentioned above.

What does Marshal.ReleaseComObject actually do?
.NET can access COM objects through something called a Runtime Callable Wrapper (RCW).

The RCW is the wrapper for a COM object. There is one per COM object instance and its job is to provide the interface between .NET and COM.
COM uses reference counting to know when to release and destroy an object. .NET, on the other hand, is a garbage collected environment and these two approaches are opposite to each other.

COM will release an object the minute the object’s ref count reaches zero.
.NET will only release the COM object when no one is referencing the RCW and it is ready to be finalized.

RCW encapsulates the COM object and ALWAYS keeps a ref count of 1 to that object. An exception to this would be when the RCW is marshaled to another AppDomain, in which case it will increase the ref count of the COM object.

The RCW is a managed object and therefore has all the properties of a managed object. It will not be considered garbage as long as it is reachable from another object (and that object is either rooted or is referenced by a rooted object… You get the idea).

RCW has a finalizer in which the RCW will de-reference the COM object, get it destroyed and its memory freed (assuming it is not marshaled to another AppDomain). This means that it takes at least two garbage collections until a COM object that is being referenced from .NET will actually get released after no one references it.

So I did call Marshal.ReleaseComObject, so what?
If you did call Marshal.ReleaseComObject when you no longer need a certain COM object, it’s great. It means that you release the memory of that object right away and you are not holding it until the next GC.

It can, however, cause a few problems:

  • If, for some reason, the COM object is an STA object, calling Marshal.ReleaseComObject means you will have to context switch to the STA thread that created that object (unless you are already on that thread). This means that if you are not careful enough there is a possibility for a deadlock (for more information about deadlocks, refer to my previous posts about deadlocks).
  • You are passing the reference to that RCW around and you call Marshal.ReleaseComObject on it, but you still have a reference to the RCW itself in another place (the RCW is still alive because it’s a managed object and will only die when nothing references it and it gets garbage collected).
    In this case (which is an easy one), using the RCW will throw an exception stating that you actually disconnected the underlying COM object, so you have a valid RCW object but it points to a dead COM object and there is nothing you can do with it now.
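
To make the second problem concrete, here is a toy Python model of it. This is not how the CLR implements RCWs: FakeComObject, FakeRcw and InvalidComObjectError are made-up names, and the real Marshal.ReleaseComObject decrements a counter kept by the RCW, which the model simplifies to a single release. It only shows why a second managed reference to an RCW becomes useless once the COM object behind it is released.

```python
class InvalidComObjectError(Exception):
    """Stands in for .NET's InvalidComObjectException in this toy model."""


class FakeComObject:
    """Reference-counted object that is destroyed when its count reaches zero."""

    def __init__(self):
        self.ref_count = 0
        self.destroyed = False

    def add_ref(self):
        self.ref_count += 1
        return self.ref_count

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.destroyed = True  # deterministic, immediate destruction
        return self.ref_count

    def do_work(self):
        return "work done"


class FakeRcw:
    """Wrapper that holds exactly one reference to the COM object."""

    def __init__(self, com_object):
        self._com_object = com_object
        com_object.add_ref()

    def release_com_object(self):
        """Simplified model of Marshal.ReleaseComObject: drop the one reference."""
        if self._com_object is not None:
            self._com_object.release()
            self._com_object = None

    def do_work(self):
        if self._com_object is None:
            raise InvalidComObjectError("COM object was separated from its RCW")
        return self._com_object.do_work()


com_object = FakeComObject()
rcw = FakeRcw(com_object)
other_reference = rcw  # a second managed reference to the same wrapper

rcw.release_com_object()  # the COM object is destroyed right here
try:
    other_reference.do_work()
except InvalidComObjectError as error:
    print("second reference failed:", error)
```

The wrapper object itself is still perfectly valid after the release. It is only the COM object behind it that is gone, which is why the failure shows up later, at whatever place still holds the other reference.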

I didn’t call Marshal.ReleaseComObject, so sue me

If you didn’t call Marshal.ReleaseComObject you might get into a different set of problems:

  • If you have STA objects you might get to a blocked finalizer thread situation.
  • It will just take more time for this memory to get released because, as I’ve said, the RCW has a finalizer and it takes at least 2 garbage collections to actually clean it and the memory its underlying COM object takes.

As you can see by this short post, Marshal.ReleaseComObject is an important thing that should not be taken lightly when used. You need to know when to use it and be aware of the consequences.

Happy COM interoping and memory releasing ;-)