Typed DataSets are a type-safe wrapper around the DataSet class that mirrors your database structure. They were created to make sure that the code accessing the database is type safe and that any change to the database structure (tables, columns or column types) is caught at compile time rather than at runtime.

If you have a big typed dataset that contains a lot of tables, columns and relations, it can be quite expensive to create in terms of both memory and time.

The main reason creating a big typed dataset is expensive is that all of the metadata contained within it (tables, columns, relations) is created when you instantiate the typed dataset, even if all you eventually use it for is retrieving data from a single table.

I can speculate that the reason all of the typed dataset metadata is created during instantiation is that a typed dataset inherits from the generic DataSet, and its metadata (tables, columns) can also be accessed in a non-type-safe manner (i.e. through the Tables collection and/or a table’s Columns collection).

If you are using a typed dataset (or dataset in general) you might be interested in the following tips:

  • If you have a big typed dataset, avoid creating it too many times in the application. This is especially painful in web applications where each request might create the dataset. You can use a generic DataSet instead, but that might lead to bugs caused by database changes, bugs you will only find at runtime rather than at compile time (which basically defeats the whole point of using a typed dataset in the first place).
  • DataSets (typed datasets included) inherit from the MarshalByValueComponent class. That class implements IDisposable, which means DataSets will only be garbage collected after finalization (you can read more about finalization and the finalizer thread here). To make sure datasets are collected sooner and are not hanging around waiting for finalization, call the “Dispose” method of the dataset (or typed dataset) or use the “using” statement, which will call “Dispose” for you at the end of the code block (see the sketch after this list).
  • Don’t use DataSets at all 🙂 Consider using a different data access layer with a different approach such as the one used in the SubSonic project.
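
As a minimal sketch of the second tip, here is how the “using” statement might look; the dataset name and the fill logic are placeholders, not code from a real project:

using System.Data;

public static class DataSetTips
{
    public static DataTable LoadOrders()
    {
        // Dispose is called automatically at the end of the using block,
        // so the dataset does not hang around waiting for the finalizer thread.
        using (DataSet ds = new DataSet("Orders"))
        {
            // ... fill the dataset here (e.g. with a data adapter) ...
            return ds.Tables.Count > 0 ? ds.Tables[0].Copy() : null;
        }
    }
}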

I guess it would be rather trivial to create a typed dataset that is lazy in nature and creates its metadata objects only when they are accessed for the first time. That would reduce the memory footprint of a large typed dataset but would make the computation used to create these objects a little less predictable. If you are up for it, or have already written a lazy typed dataset, ping me through the contact form 🙂
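
A rough sketch of the idea, assuming a hand-written class (a real typed dataset is generated code, and the table and column names here are made up):

using System.Data;

public class LazyCustomersDataSet : DataSet
{
    private DataTable _customers;

    // The table schema is built the first time the property is read,
    // instead of in the constructor as generated typed datasets do today.
    public DataTable Customers
    {
        get
        {
            if (_customers == null)
            {
                _customers = new DataTable("Customers");
                _customers.Columns.Add("Id", typeof(int));
                _customers.Columns.Add("Name", typeof(string));
                Tables.Add(_customers);
            }
            return _customers;
        }
    }
}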

Sometimes when you have a big DataSet with elaborate relationships you might get the following error when trying to add or load data into the dataset:
Failed to enable constraints. One or more rows contain values violating non-null, unique, or foreign-key constraints

Some of the causes for this error are usually “regular” violations of the foreign-key constraints, which means you are referencing a key that does not exist in the parent table. If that is the case, you can check this article on MSDN that explains a bit about how to resolve these issues.

If you are still having problems with your dataset and ADO.NET code, you might just want to try this little trick.
It appears that inside a DataRow there is a property called RowError.

RowError is a string value that can be set or read on numerous other occasions, but the situation in which I encountered it was caused by a bad relationship that was added on a table, which made the code throw an exception at runtime. In that case the RowError property held the exact name of the troublesome relationship.

So, how do you access it?
When your code throws an exception at runtime and you have a debugger attached, check the following using the Immediate window or QuickWatch:

myDataSet.Tables["YourTableName"].Rows[0].RowError

Don’t forget to replace “myDataSet” with the variable name of your dataset and “YourTableName” with the name of the table that is (probably) causing the problems.

In the case I described above, this property told me exactly which foreign key was problematic, and from there I figured out which relationship was causing the trouble.
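
If you prefer to inspect the errors in code rather than in the debugger, something along these lines should work (“myDataSet” is again a placeholder for your own dataset variable):

// Temporarily relax constraint checking so the data loads,
// then inspect which rows carry errors.
myDataSet.EnforceConstraints = false;
// ... load or add the data here ...

foreach (DataTable table in myDataSet.Tables)
{
    foreach (DataRow row in table.GetErrors())
    {
        Console.WriteLine("{0}: {1}", table.TableName, row.RowError);
    }
}

// Re-enabling the constraints will throw again if the problems were not fixed.
myDataSet.EnforceConstraints = true;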

Update 13/6/2006: Fixed a few typos.

At my day job (which is part of the reason I’m posting less frequently) I’ve had to P/Invoke a bit and gathered some P/Invoke tips that I’ve decided to share.
Today’s tip is about P/Invoking and C style unions.

Let’s take the following union:

union MYUNION {
    int number;
    double d;
};

This is quite a standard union, which means all of its fields share the same memory. Its translation to C# would look like this:

[StructLayout(LayoutKind.Explicit)]
public struct MYUNION {
    [FieldOffset(0)]
    public int number;
    [FieldOffset(0)]
    public double d;
}

Notice two important things. First, the StructLayout attribute is set to LayoutKind.Explicit. This means we are building the memory layout of the struct ourselves using the FieldOffset attributes, and it will look exactly as specified. When handling regular C structs in .NET P/Invoke code we usually use LayoutKind.Sequential, so that the fields are laid out in memory in the order in which we declared them.
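
For contrast, a plain C struct (not a union) would typically be declared with a sequential layout; the struct name here is made up for illustration:

[StructLayout(LayoutKind.Sequential)]
public struct MYPOINT {
    public int x;
    public int y;
}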

The second important thing is the FieldOffset attribute. Notice that both fields have the same offset. This tells the P/Invoke layer that these two fields actually occupy the same memory (remember, these two fields were declared inside a union!).
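
As a small usage sketch, assuming a hypothetical native export that takes the union by value (the DLL name and function are made up for illustration):

using System.Runtime.InteropServices;

static class NativeMethods
{
    // Hypothetical native function: void ProcessUnion(MYUNION u);
    [DllImport("mynative.dll")]
    public static extern void ProcessUnion(MYUNION u);
}

class Program
{
    static void Main()
    {
        MYUNION u = new MYUNION();
        u.d = 3.14;  // writing d also overwrites the bytes that back number
        NativeMethods.ProcessUnion(u);
    }
}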

This is how unions will work for you in your P/Invoke code. Enjoy!

Disclaimer: I usually like to keep this blog clean of link posts (posts that only contain links to posts on other blogs), but this time the information was too valuable, so I had to make an exception.
Although this is not exactly pure .NET debugging material, it can actually save you some debugging time 🙂

Tess has some interesting tips that she expanded from a post written by her colleague Doug:

Read, learn and implement where needed.

Francisco Martinez has some interesting tools that he developed to make it a bit easier to develop .NET applications for Mono using Visual Studio .NET.

These tools allow you to create a makefile from a Visual Studio project file, enabling you to build the project in a complete Mono environment, as well as import and export MonoDevelop project files from and into Visual Studio project files.
They also enable you to run your compiled code under Mono instead of the Microsoft .NET runtime, directly from the Visual Studio IDE.

I’ve been following the Mono project since its inception and it has come a long way since.

I know there are (and are going to be) a lot of legal issues that could cause plenty of problems for the project, but I still think it’s a necessary and welcome addition to the development platforms available on the different Linux flavors and UN*X operating systems.

I wonder if someone is planning to write a WinDbg extension to handle Mono 😉
Though I’m not sure there would be much of a market for it. People will probably use the Microsoft .NET runtime for Windows development and/or deployment and Mono for non-Windows development and deployment.

I guess only time will tell 🙂

A friend of mine who worked with me at my previous workplace talked to me the other day and asked why we put all of our assemblies in the GAC. After all, it does make a bit of a mess during installation, and no one actually sees the benefit of it, since a lot of the initial .NET hype was about “X-copy installation” and the “end of DLL Hell”.

As it usually goes in these situations, the “end of DLL Hell” brought its own set of challenges to overcome. I have therefore decided to dedicate this post to the GAC and explain what it is for, and how and when to use it.

What is the GAC?
So, what is the GAC? It stands for Global Assembly Cache and has two major functions:

  1. Provides a place where multiple versions of the same assembly can co-exist side by side.
  2. Provides a place for the native image of an assembly, saving us the need to perform Just-In-Time (JIT) compilation on the fly the first time we call a certain method on one of the objects in that assembly.

In the old days, before .NET and COM, people relied on the search path and strategically placed DLLs so that the application would find and load them. There was a certain order and logic to the search path and people usually tried to go with it. If you wanted a DLL to be accessible to everyone without changing the PATH environment variable, you usually put it in the SYSTEM directory (and later the SYSTEM32 directory).

This process is what usually caused DLL Hell, since the application relied on a correct PATH environment variable; if, for some reason, it was changed, or a different version of the same DLL appeared earlier in the path than your DLL, your application would load the wrong version and bad things would start to happen.
At best you would miss some bug fixes and new functionality; in the worst case your application would simply crash in strange places (depending on how the application is written).

After the old days came what I like to call the “Middle Ages”: the era of COM, which still relied a bit on the search path, but now the registry recorded where to actually look for the exact DLL. If the DLL couldn’t be found there, the loader would fall back to the old ways and use the PATH environment variable and the search logic to try to find it.

In the .NET era, the GAC’s first major function comes to our aid. No more multiple places to put DLLs: we now have a single place that can hold all of the versions of our assemblies together. We can redirect applications that are supposed to use a certain version of an assembly to a different one using policy files, and if everything is indeed in the GAC we are no longer at the mercy of the dreaded search path (although, if the assembly can’t be found in the GAC, the loader will revert to the old search-path ways, as explained in this post).

The second major function of the GAC is to host the native images of assemblies that were compiled using the NGEN tool. If you wish to read more about NGEN and when to use it I suggest this fine post from Jason Zander‘s blog.

How to use the GAC?
The first basic condition for putting an assembly into the GAC is that it must be signed with a strong name. How to sign an assembly is beyond the scope of this post (but I will write about it in a future post).

After the assembly is signed with a strong name, you simply drop it into the “assembly” folder located under your Windows directory (that’s where the GAC is actually stored).

That’s it.

When to use the GAC?
OK, so now that we know how to use the GAC and what it was meant for, when should you use it?

Well, the rule of thumb is this… if you have a shared component that is used in more than one application, OR your application exposes some kind of API that can be used by others, you should definitely put your assemblies in the GAC. After all, this is one of its major uses!

If you have a simple executable or a simple application that only uses the assemblies in its own directory, there is no need to put anything in the GAC.

Quite simple, isn’t it?

In the next post I’ll dig a bit more into the inner workings of the GAC so stay tuned!

In my previous post about GC.AddMemoryPressure, Tomer Gabel commented that people should not use this feature since it’s not the correct thing to do.

In addition to that Yuval Ararat wrote a post about this issue as well.

First of all, if I wasn’t clear enough in my first post, they are technically right, BUT (and there is a big but here) there are situations, mainly around interoperability, that might require us to let the GC know that behind a small and insignificant-looking .NET object lies a whole lot of allocated unmanaged memory that will only be released when this .NET object dies.

Calling GC.AddMemoryPressure simply makes the GC consider this object for what it really is: a memory-hungry object that hides a lot of allocated unmanaged memory behind it.
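
A minimal sketch of the pattern (the wrapper class and the size of the native allocation are hypothetical):

using System;

// Hypothetical wrapper around a native resource that hides a large
// unmanaged buffer behind a tiny managed object.
public class NativeBufferWrapper : IDisposable
{
    private readonly long _unmanagedBytes;

    public NativeBufferWrapper(long unmanagedBytes)
    {
        _unmanagedBytes = unmanagedBytes;
        // ... allocate the unmanaged memory here ...
        GC.AddMemoryPressure(_unmanagedBytes);    // tell the GC the real cost
    }

    public void Dispose()
    {
        // ... free the unmanaged memory here ...
        GC.RemoveMemoryPressure(_unmanagedBytes); // and that the cost is gone
        GC.SuppressFinalize(this);                // nothing left for the finalizer to do
    }

    ~NativeBufferWrapper()
    {
        // Safety net in case Dispose was never called.
        GC.RemoveMemoryPressure(_unmanagedBytes);
    }
}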

Originally I just wanted to introduce this feature to people and make sure they know about it, even though it’s quite esoteric and will mostly be used by applications with specific needs (and believe me, I know at least one application that could have benefited from this feature had it been available in .NET 1.1).

And Tomer, regarding the typos and such, I usually try to spell-check and proofread my posts, but sometimes things slip by 😉

I was working today on figuring out some Managed C++ code with a friend from work (Oran Singer), and I just wanted to outline some of the oddities of Managed C++ destructors, because I think it is important to understand how they work and what some of their quirks are.

We are still in the middle of figuring out something more serious (I will post on that later on).

I would like to “apologize” to the C#/VB.NET crowd in advance, since this post is all about Managed C++ today and a bit about its future, but I think it will be interesting nonetheless.

First of all, a Managed C++ class that has a destructor will include two additional methods because of that destructor:

  1. Finalize method
  2. __dtor method

Currently you cannot implement the Finalize method in Managed C++ by simply overriding it (as you cannot in C#), since it is defined as protected in System::Object, so the only way to get one is to add a destructor to the class.

This means that a Managed C++ class that has a destructor has a finalizer with all the implications of a .NET finalizer (I’ve written in more detail about finalizers here).

The __dtor method is called when you use the “delete” operator on a Managed C++ class instance (IMPORTANT NOTE: you can only call “delete” on a Managed C++ class instance that implements a destructor).

The implementation of the __dtor method is similar to what one would put in a Dispose method when implementing the IDisposable interface. It contains two calls: one to GC.SuppressFinalize (so the finalizer thread will skip calling the Finalize method of this instance) and one to Finalize (to actually run the destructor code).
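
In C# terms, the generated __dtor is roughly equivalent to the following sketch (the class name is made up; and since C# does not allow calling Finalize directly, the shared cleanup is factored into a helper here, whereas the Managed C++ __dtor calls Finalize itself):

using System;

public class MyManagedClass : IDisposable
{
    public void Dispose()            // plays the role of __dtor
    {
        GC.SuppressFinalize(this);   // finalizer thread will skip this instance
        Cleanup();                   // the real __dtor calls Finalize here
    }

    ~MyManagedClass()                // plays the role of Finalize
    {
        Cleanup();
    }

    private void Cleanup()
    {
        // ... the destructor body / cleanup code goes here ...
    }
}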

Another interesting anecdote: if you have a pure virtual class and you add a destructor to it, the destructor must have a body, even though the class is pure virtual, because a derived class’s destructor always makes a virtual call to the base class’s destructor.

Looking a bit to the future, in Visual Studio 2005 (previously code-named Whidbey), Visual C++ that generates managed code gets a few new features whose implementation differs a bit from what we saw above.

First of all, when you implement only a destructor (“~MyClass()”), the compiler will actually implement the IDisposable interface in full and will automatically call the Dispose method (in C# you would either use a try..finally block and call Dispose in the finally part, or use the “using” keyword) in the following cases:

  • A stack based object goes out of scope
  • A class member’s enclosing object is destroyed
  • A delete is performed on the pointer or handle.

In addition, you can explicitly implement a finalizer using the “!” syntax (“!MyClass()”). That finalizer is subject to the usual finalizer rules we already know. The main advantage is that it will NOT run if you have already called the destructor.

These new features allow improved, more deterministic cleanup and finer control over managed C++ objects than was previously available.

I would like to thank Oran again for his help, mainly for forcing me to search some more on the subject and getting the idea to write this post.