Archive for September, 2005

This post is a bit futuristic, but since the release of .NET 2.0 and Visual Studio 2005 is very near, I thought I should start talking about it a bit more.

.NET 2.0 will introduce an improved GC.
One of the parameters that the GC takes into account is the amount of memory a certain class instance takes. By doing so, the GC can better understand how much memory will be gained by collecting this object.

One of the inputs the GC takes into account when deciding whether to initiate a collection is the amount of managed memory allocated.
If we have a managed class instance that doesn’t allocate a lot of managed memory but holds a pointer to a large block of unmanaged memory (either a reference to a COM object that allocates a lot of data, or memory allocated directly with functions such as Marshal.AllocHGlobal), the GC will not know about the unmanaged memory and will not consider scheduling a collection sooner.

This means that even when nothing references that object and its finalizer releases the unmanaged memory, the memory is not released until a GC occurs and the finalizer thread reaches the object. In the meantime it may add pressure on the application’s memory usage.

For this purpose, in .NET 2.0 the “AddMemoryPressure” function was added to the GC class.

“AddMemoryPressure” allows the developer to notify the GC about the amount of additional unmanaged memory that was allocated in different places in the application. The GC will take this into account when considering the scheduling of a collection.

For example, if at some point in the application I create a COM object that I know allocates a big chunk of memory, after creating it I will call “AddMemoryPressure” and give it a rough estimate of the amount of unmanaged memory this COM object takes.

For example:

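using System;
using System.IO;

class MySpecialBitmapClass
{
    // Estimated unmanaged footprint, reported to the GC.
    private long size;

    public MySpecialBitmapClass(string fileName)
    {
        size = new FileInfo(fileName).Length;
        // Note: AddMemoryPressure throws if size is zero or negative.
        GC.AddMemoryPressure(size);
    }

    ~MySpecialBitmapClass()
    {
        GC.RemoveMemoryPressure(size);
    }
}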

When I create an instance of the class, I add memory pressure that is at least as large as the file I’m working on. When the instance is finalized, it removes the pressure.
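The same idea applies when the unmanaged memory is allocated directly with Marshal.AllocHGlobal, as mentioned earlier. Here is a minimal sketch of that case; the class name UnmanagedBuffer and its members are made up for illustration, and it simply pairs every AddMemoryPressure call with exactly one RemoveMemoryPressure call, whether the memory is released through Dispose or through the finalizer.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
public sealed class UnmanagedBuffer : IDisposable
{
    private IntPtr buffer;
    private readonly long size;

    public UnmanagedBuffer(int sizeInBytes)
    {
        if (sizeInBytes <= 0)
            throw new ArgumentOutOfRangeException("sizeInBytes");

        buffer = Marshal.AllocHGlobal(sizeInBytes);
        size = sizeInBytes;

        // Tell the GC about memory it cannot see on its own.
        GC.AddMemoryPressure(size);
    }

    // True once the unmanaged block has been freed.
    public bool IsReleased
    {
        get { return buffer == IntPtr.Zero; }
    }

    public void Dispose()
    {
        Release();
        // The finalizer has nothing left to do.
        GC.SuppressFinalize(this);
    }

    ~UnmanagedBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (buffer == IntPtr.Zero)
            return; // already released, keep the pressure calls balanced

        Marshal.FreeHGlobal(buffer);
        buffer = IntPtr.Zero;
        GC.RemoveMemoryPressure(size);
    }
}
```

Calling Dispose releases the memory right away instead of waiting for the finalizer thread, and calling it twice is harmless.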

Be sure to tune in to hear some more changes and updates that are coming in .NET 2.0’s GC.

Sorry for being off topic (again) but I just had to comment a bit on some of the comments that were posted for the previous post.

Eyal said that this feature was also added to Delphi 2005 and that it’s very good for portability issues. It seems that the guys at Borland used it quite a bit to map missing functions that were in .NET and not in Delphi’s VCL, which made porting a lot easier.

It’s still a dirty hack and it can, and probably WILL, be used in a wrongful way.

Yaniv Golan added that Extension Methods lack namespace support. So if I have two extension methods with the same name (due to a reference I’ve added or some other issue), will we get unexpected behavior? Should the compiler yell about that? Will it automatically select only one?

Yaniv suggested using a syntax that includes the namespace. So, taking the example from my previous post, instead of having:

s.Foo();

I will have:

s.MyExtension.Foo();

After all, what will happen if someone adds an extension method and a future version of the BCL (Base Class Library) adds an extension method with a similar name?
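To make the name clash concrete, here is a small sketch. The namespaces LibraryA and LibraryB and their classes are invented for illustration. In the C# 3.0 compiler as it eventually shipped, when both namespaces are imported a call written as s.Foo() is rejected as ambiguous (error CS0121), and the way out is to fall back to the plain static call with the fully qualified class name.

```csharp
using System;

namespace LibraryA
{
    public static class StringExtensions
    {
        public static string Foo(this string s) { return "A:" + s; }
    }
}

namespace LibraryB
{
    public static class StringExtensions
    {
        public static string Foo(this string s) { return "B:" + s; }
    }
}

namespace App
{
    using LibraryA;
    using LibraryB;

    public static class AmbiguityDemo
    {
        public static string CallA(string s)
        {
            // s.Foo() would not compile here: both Foo methods are in
            // scope, so the compiler reports an ambiguous call (CS0121).
            return LibraryA.StringExtensions.Foo(s);
        }

        public static string CallB(string s)
        {
            return LibraryB.StringExtensions.Foo(s);
        }
    }
}
```

So the compiler does yell rather than silently pick one, but only when both candidates are equally good matches in the same scope.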

Barry Kelly said that Extension methods were added because “It is absolutely required in order to extend IEnumerable to support the new query methods such as Max, Select etc.”

Perhaps I don’t have the exact vision that Anders Hejlsberg has for C#, and you can call me old school if you wish, but I think that this whole LINQ thing could have been implemented as a certain type of framework.

Yes, ADO (ActiveX Data Objects) wasn’t part of the language, but most Visual Basic 6 users were able to use it with little to no effort. So it wasn’t part of the language and the compiler didn’t understand a thing about it. So what?!

As far as I know (and people, please correct me if I’m wrong) the compiler can only check the syntax correctness of the statement but can’t perform any other checks at compile time.
Yes, this syntax might be a bit more intuitive, but I’m quite sure this could have been done without brutally hacking the compiler and adding these features.

What’s wrong with a syntax like:

int[] numbers = { 2,2,2,4,5 };
int count = LINQ.Count(LINQ.Distinct(numbers));
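For what it’s worth, a call shaped like this does work against the query operators as they later shipped in System.Linq (.NET 3.5), because they are ordinary static methods on the Enumerable class. The sketch below assumes that shipped library; the wrapper class StaticQueryDemo is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StaticQueryDemo
{
    public static int CountDistinct(IEnumerable<int> numbers)
    {
        // Plain static calls, no extension-method syntax involved.
        return Enumerable.Count(Enumerable.Distinct(numbers));
    }
}
```

Calling StaticQueryDemo.CountDistinct(new int[] { 2, 2, 2, 4, 5 }) returns 3, since the distinct values are 2, 4 and 5.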

I also wonder what happened to the whole XQuery idea. I remember Microsoft releasing an XQuery implementation. XQuery is a language that can be used not only for XML querying but also to query and intersect information from various places, much like LINQ’s features.

I might be missing the vision of marrying data access with the language itself, but I really don’t think this is the MOST necessary feature that will boost productivity and will ascend to a level of a killer feature.

.NET allows creating Windows Services which are commonly used for unattended services such as Remoting containers.

For some reason, when using a Windows Service as a container for your Remoting application, or just as a Windows Service that performs various tasks, the GC used is the workstation GC.

We have previously talked a bit about the difference between the workstation GC and the server GC but I’ll explain a bit about them again.

Workstation GC
The workstation GC is, as its name implies, used in a workstation scenario. It is optimized for single CPU machines and for desktop applications by using the main thread to perform the GC.
It uses 16 MB segments that it reserves and sub-allocates.

It has an option called “Concurrent GC” which allows the GC to run on a dedicated thread.

Server GC
The Server GC is optimized for server operations and works only on multiprocessor machines (a CPU that has Hyper-Threading enabled is considered as two CPUs).

It has a GC heap per CPU and a thread per CPU that performs the garbage collection.
It uses 32 MB segments that it reserves and sub-allocates.

All of these features make the Server GC more appropriate for demanding server applications due to its higher throughput.

Who uses the Server GC?
The only containers that use the Server GC by default are ASP.NET and COM+ (through Enterprise Services).

All other applications including Windows Services use the Workstation GC by default.

This means that even if you wrote a cool Windows Service in .NET that does cool stuff, it may suffer from using a non-optimized GC even though it’s a high throughput service that serves millions of users.

So, what can we do about it?
Before .NET Framework 1.1 SP1 you had to implement your own container for the CLR.
Since .NET Framework 1.1 SP1 you can just add to your app.config file the following tag and it will tell the GC to use the Server GC (of course, only if you have more than one CPU):

<configuration>
  <runtime>
    <gcServer enabled="true" />
  </runtime>
</configuration>

You can read more about it (though not too much) in this Microsoft KB article.
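If you want to confirm which GC your service actually ended up with, one option is to ask the runtime at startup and write the answer to your log. This is a small sketch that assumes .NET 2.0 or later, where the System.Runtime.GCSettings class is available (it does not exist in 1.1); the helper name GcModeReport is made up.

```csharp
using System;
using System.Runtime;

public static class GcModeReport
{
    // Returns a short description of the GC flavor the process is running with.
    public static string Describe()
    {
        return GCSettings.IsServerGC ? "Server GC" : "Workstation GC";
    }
}
```

On a single CPU machine this should report the Workstation GC even with the configuration flag turned on.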

For .NET Framework 1.0 you’ll still have to implement your own container for the CLR.
There are a bunch of these hanging around. A nice one is this one, which is posted on The Code Project.


Just think about the fact that with one simple flag you can boost your Windows Service performance.

I know this is a bit off topic, but I just had to comment about it here.

Since I was not able to attend PDC, I’ve been hearing a lot about the announcements and new stuff coming out of it.

One of these things is a feature in the C# 3.0 spec called “Extension Methods”.
In short, Extension Methods allow you to extend an existing class, even a sealed one, by creating a class that has a static function with a special syntax, which allows you to call that function on an instance of the extended class as if it were a member function.

For example, if I want to extend the String class (which, as you all know, is a sealed class and has its reasons to be like that, which I won’t get into at the moment) I would create a class like this:

public static class MyExtension {
    public static void Foo(this string s) {
        Console.WriteLine("MyExtension::Foo = {0}", s);
    }
}

Notice the “this” keyword near the function’s parameter. This is the keyword that does the magic.

To invoke this method I would use the following syntax:

string s = "Testing Extensions";
s.Foo();

You see, it looks like the String class always had a method named Foo.

Now as much as some people think this feature is cool I don’t see the point of actually having it and showing it like a big thing.

This “functionality” was already available in previous versions of C#. In previous versions all you had to do is this:

public static class MyExtension {
    public static void Foo(string s) {
        Console.WriteLine("MyExtension::Foo = {0}", s);
    }
}

and invoke the code like this:

string s = "Testing Extensions";
MyExtension.Foo(s);

Which, in my opinion, is a lot clearer when reading this code, since I know that there is a class somewhere in the code (either from a reference, inside the same .cs file, or in another .cs file in the project) that is called MyExtension and has a method called Foo.

So why do we need this functionality?

I guess we don’t, but the only reason I can think of came to me from seeing all the examples out there, which involve the String class. Perhaps because the String class (and a few other classes in the Base Class Library (BCL)) is sealed, some people whined too much about it and someone decided to add this so they get the same feel as if they had inherited from a sealed class and extended it.

Now I know that the C# Spec is saying: “Extension methods are less discoverable and more limited in functionality than instance methods. For those reasons, it is recommended that extension methods be used sparingly and only in situations where instance methods are not feasible or possible.”

But what I do know is that every feature which is not very specific and limited and can be used in a wrongful way, WILL be used in a wrongful way.

So the main question here is, why add such a feature if there are a LOT of other features that are more needed and can enhance productivity?

(I wonder how can one get into the language design meetings or at least for the review process of the specs before they are submitted 😉 ).

Rico Mariani, The man (with a capital “T”) for CLR performance and other related information, has posted this on his blog.

This post features a wealth of links to information and programs such as VADump (a fine program for listing the memory usage of a given process), along with some information about how to use it.

It also links to the CLR Profiler (which I’m hoping to cover in one of the next posts) for both 1.1 and Beta 2, and to a small LogDump analyzer that he wrote.

In general, I would recommend checking his blog. He has some nice information there that can help anyone.

Enjoy!

One of the strongest areas of .NET is interop, which allows .NET applications to take advantage of legacy code, whether it is written in the form of C DLLs or COM objects.

In this post I will talk about the Runtime Callable Wrapper (RCW), which is the .NET wrapper around COM objects when they are accessed from a .NET application.

Before we discuss the internals of the RCW, we should first see how we can create one.

How can a .NET application access COM?
There are three ways a .NET application can access a COM object:

  1. Use the CreateObject function (only available in VB.NET)
  2. Add a reference to the COM object from Visual Studio .NET which will create an interop assembly automatically for you.
  3. Manually create an interop assembly and add a reference to it.

While options 2 and 3 are similar in result, there are a few side effects on strong naming and namespaces which affect certain things. I will discuss them later on.

How does an RCW object work?

The RCW is a .NET object which is a wrapper on top of a COM object. It is the bridge between the Garbage Collection approach of .NET and the reference counted approach of COM.

It is treated as a pure .NET garbage collected object which means it will be collected only when no one is referencing it.

Since there is no effective way of marrying the reference counted way and the garbage collected way, what the RCW actually does is always hold a reference count of 1 on the COM object (excluding the cases where an RCW is marshalled across AppDomains, in which case its reference count is upped by one). This means that what determines whether the COM object lives or dies is whether its RCW is alive (which in garbage collection terms means it is reachable from another .NET object or it is rooted).

An RCW also has a finalizer in which the COM object is actually being dereferenced and destroyed. This means that it takes at least 2 Garbage Collections until the COM object actually dies.

Important things we should all remember about RCWs:

Having an RCW control the death of the underlying COM object means that instead of the deterministic and immediate destruction model of reference counting, we have nondeterministic garbage collection, and that is something we should keep an eye on.

The RCW itself is very light so having a lot of RCWs alive will not affect the size of the GC heaps too much, but it will affect the private bytes.
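When waiting for the GC is not acceptable, the framework lets you drop the RCW’s reference explicitly with Marshal.ReleaseComObject. Below is a minimal sketch of a guard around that call; the helper name ComRelease is made up, and it uses Marshal.IsComObject so that passing an ordinary .NET object is simply ignored.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComRelease
{
    // Returns true if o was an RCW and its COM reference was released.
    public static bool TryRelease(object o)
    {
        if (o == null || !Marshal.IsComObject(o))
            return false;

        // Decrements the RCW's reference count on the COM object
        // instead of waiting for the RCW's finalizer to do it.
        Marshal.ReleaseComObject(o);
        return true;
    }
}
```

A typical place to call ComRelease.TryRelease is a finally block right after the last use of the COM object. The RCW must not be used again once it has been released.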

How to detect RCW “leaks”?

RCW leaks are actually RCWs that are referenced and never released. While finding them is similar to finding leaking .NET objects that are still being referenced, they have other impacts on the system.

As I said earlier, RCWs are light. This means the underlying COM object takes up memory in the default NT heap or in some other custom heap. If it’s not getting freed, it adds pressure on the virtual memory, since the native memory now takes up more space and may occupy parts of the virtual address space that the GC could otherwise reserve.

Remember I said there are a few methods of adding COM objects to your .NET project? The main reason I listed them is that each method of adding them to the project makes the objects appear a bit differently in WinDbg.

Option 1 – CreateObject – will make the RCWs appear of type System.__ComObject.

Option 2 – Adding a direct reference to the COM object – will make the RCWs appear as Interop.XXX.ComClassName where XXX is the name of the COM dll and ComClassName is the name of the COM class as it appears in the type library.

Option 3 – Manually creating an Interop – will make them appear like Option 2 if you did not change the namespace, or under the namespace that was chosen during the type library import using the tlbimp.exe utility.

So, how can we actually detect them? There are two useful methods for that.

Using WinDbg to detect RCW leaks

This method can work on a live debug or with memory dumps. The main technique is:

  1. Perform some operations that allocate RCWs.
  2. Take a memory snapshot or list all objects of that type that are currently alive.
  3. Run a GC
  4. Do step 2 again.

If you took memory dumps, all that is left now is to compare the lists of objects.

How to list the objects? Simply use !dumpheap -type XXX where XXX is the type of RCWs you use (see the 3 options above to see the names of RCWs object you can expect).
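For the “Run a GC” step, if you control the code (in a test build, for example), one way is to force the collection yourself. The sketch below is only one possible approach and the helper name is made up. It collects twice with a wait for the finalizer thread in between, because RCWs release their COM objects in their finalizers.

```csharp
using System;

public static class GcSnapshotHelper
{
    public static void ForceFullCollection()
    {
        // First pass: unreachable RCWs are queued for finalization.
        GC.Collect();
        // Let the finalizer thread run so the COM objects are released.
        GC.WaitForPendingFinalizers();
        // Second pass: reclaim the objects that were just finalized.
        GC.Collect();
    }
}
```

Any RCW of your type that still shows up in !dumpheap after this call is being kept alive by a reference somewhere.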

Using LeakDiag to detect RCW leaks

LeakDiag is a tool developed at Microsoft to detect unmanaged memory leaks. It can track and give you the exact size of the unmanaged memory leak as well as the complete stack trace of the code that allocated that memory.

To use LeakDiag you need to attach it to a running process and perform the same technique as mentioned above with WinDbg. When you click on “Log” in LeakDiag you will get an XML log file that will tell you the exact call stack of the leaking memory, and you will be able to identify the call stack that leads to the exact COM object.

NOTE: Make sure you have the correct symbols for your COM objects and that you set up a correct symbols path to those symbols prior to attaching LeakDiag to the process.

You can download LeakDiag from here. Although this is not the latest version, it is still handy.

There is another handy utility (written in .NET) called LDGrapher that parses the XML log files of LeakDiag and makes them a little easier to view.

You can download LDGrapher from here (this is also not the latest version).

I plan to write in greater detail about LeakDiag and LDGrapher in a future post.

I’ll try and see if I can convince some of the guys at MS to release LeakDiag and LDGrapher like they released DebugDiag. I’ll keep you posted on those efforts here in this blog.