Jérôme Laban

.NET Powered

The Disposable Pattern, Determinism in the .NET World

August 16, 2004 11:43 by Jerome
One of the many features of the CLR is garbage collection. Depending on the programmer’s background, this can be a little disturbing, especially for those used to managing memory by hand (plain C programmers in particular). For C++ developers, for instance, even though memory management can be abstracted by the use of the STL, C++ has a pretty strong model that provides a deterministic way of handling memory allocations. Once an object has reached the end of its life, either because the control flow leaves its scope or because its container object is destroyed, the object is immediately destroyed. This is done by the C++ compiler, which emits the calls to the object’s destructors and releases the memory.

The CLR model is, however, weaker than that. The Garbage Collector (GC) is generally more efficient than the programmer and handles object destruction asynchronously. A type can provide the GC with a special method named the Finalizer, which the GC calls when it needs to reclaim the memory used by an object. This can happen at any time, on an internal CLR thread, and only when necessary. This means that memory management is efficient and fast, and that it can adapt to the environment the program is running in by making good use of the available memory.

The effect of this asynchronous behavior is that there is no way to have deterministic destruction of objects. This is one of the most frequent criticisms from programmers beginning with .NET. The biggest trap in this area for C++ developers is the syntax of the finalizer in Managed C++ and C#.
For instance, in C#:

    using System;

    class Program
    {
      static void Main(string[] args)
      {
        {
          Dummy dummy = new Dummy();
        }

        Console.WriteLine("End of main.");
      }
    }

    public class Dummy
    {
      ~Dummy()
      {
        Console.WriteLine("Dummy.~Dummy()");
      }
    }

Since the instantiation of the Dummy class is between braces, a C++ programmer would think that the so-called destructor is called right at the end of the scope, before the last WriteLine. In reality, the GC calls the Finalizer when the memory is reclaimed: here, at the end of the program execution.

A concrete view of this problem is often found when using file IO:

    using System;
    using System.IO;

    class Program
    {
      static void Main(string[] args)
      {
        StreamWriter writer = new StreamWriter(File.OpenWrite("Test.txt"));
        writer.Write("Some value");

        Console.ReadLine();
      }
    }

The problem with this program is that the writer is never closed. The file handle will only be released when the GC eventually finalizes the underlying file stream, and text still buffered in the writer may never be flushed to the file. This is a common problem in C# programs: file handles are left open, preventing the files from being opened by other programs. To fix this problem, there are two methods:
  • Call the StreamWriter.Close method when the stream is not used anymore,
  • Use the using keyword to limit the scope of the object.
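
The first option in the list above can be sketched as follows. This is only a minimal illustration: the CloseExample class and its WriteValue method are hypothetical names introduced here, not part of the original program. The Close call sits in a finally block so that it still runs if Write throws.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class CloseExample
{
    // Writes a value and closes the writer explicitly, even if Write throws.
    public static void WriteValue(string path, string value)
    {
        StreamWriter writer = new StreamWriter(File.OpenWrite(path));
        try
        {
            writer.Write(value);
        }
        finally
        {
            // Close flushes the buffered text and releases the file handle.
            writer.Close();
        }
    }

    public static void Main()
    {
        WriteValue("Test.txt", "Some value");

        // The file can be reopened right away, since the handle was released.
        Console.WriteLine(File.ReadAllText("Test.txt"));
    }
}
```
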
The using keyword is a shortcut in the C# language for calling the Dispose method of the System.IDisposable interface at the end of its scope. In practice, this is what it means:

    static void Main(string[] args)
    {
      using (StreamWriter writer = new StreamWriter(File.OpenWrite("Test.txt")))
      {
        writer.Write("Some value");
      }
    }

Which is in fact expanded by the C# compiler to roughly:

    static void Main(string[] args)
    {
      StreamWriter writer = new StreamWriter(File.OpenWrite("Test.txt"));

      try {
        writer.Write("Some value");
      }
      finally {
        writer.Dispose();
      }
    }

This is straightforward: the Dispose method is called at the end of the “using” scope. One thing though: this does not mean that the memory allocated for the StreamWriter instance is reclaimed right after the Dispose call. It only means that the instance releases the “unmanaged” resources it holds, a file handle in this case.

But one might say: “If the programmer forgets to call Dispose or Close, the file is never closed.” Actually, no. This is where the Disposable pattern enters the scene. The good thing about it is that you can combine the Dispose method and the GC, the GC being the safekeeper of the unmanaged resources of the object, even if the programmer forgets to call the Close or Dispose method.

Here is an example of a type that implements the Disposable pattern:

    public class MyDisposable : IDisposable
    {
      public void Dispose()
      {
        Dispose(true);
      }

      ~MyDisposable()
      {
        Dispose(false);
      }

      private void Dispose(bool disposing)
      {
        // If we come from the Dispose method, suppress the
        // finalize method, so this instance is only disposed
        // once.
        if (disposing)
          GC.SuppressFinalize(this);

        // Release any unmanaged resource
        // ...
      }
    }

This class implicitly implements the IDisposable interface, by defining the Dispose method.
Here, both the Dispose method and the Finalizer call an overload of the Dispose method. This overload is the code that actually releases unmanaged resources. Note the use of the GC.SuppressFinalize method, which prevents the GC from calling Dispose again through the Finalizer. It also has a second objective: removing some pressure from the GC, as finalizing objects is rather expensive.

This pattern can also be completed by an “already disposed” check, to guard against multiple Dispose calls. There are two possible behaviors there: either silently ignore any subsequent calls, or throw an ObjectDisposedException. Using one or the other is a matter of context.
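
As a rough sketch of that “already disposed” check, the pattern could be extended as shown below. The GuardedDisposable type, its _disposed flag, the IsDisposed property and the DoWork method are all hypothetical names chosen for illustration. This version silently ignores repeated Dispose calls, and throws only when the instance is used after being disposed.

```csharp
using System;

// Hypothetical type, for illustration only.
public class GuardedDisposable : IDisposable
{
    private bool _disposed;

    public bool IsDisposed
    {
        get { return _disposed; }
    }

    public void Dispose()
    {
        Dispose(true);
    }

    ~GuardedDisposable()
    {
        Dispose(false);
    }

    private void Dispose(bool disposing)
    {
        // Silently ignore any subsequent call.
        if (_disposed)
            return;

        // Coming from the Dispose method: the finalizer is not needed anymore.
        if (disposing)
            GC.SuppressFinalize(this);

        // Release any unmanaged resource here.
        // ...

        _disposed = true;
    }

    // A member that needs the released resources refuses to run after disposal.
    public void DoWork()
    {
        if (_disposed)
            throw new ObjectDisposedException(GetType().Name);

        // Use the resources here.
    }
}
```

With this variant, calling Dispose twice is harmless, while calling DoWork after Dispose throws an ObjectDisposedException. Throwing from Dispose itself is the stricter alternative mentioned above.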

While not every object needs to be finalizable (and disposable), each time you add a finalizer, you should also implement the System.IDisposable interface and the Disposable pattern.



Don't get C# volatile the wrong way

August 5, 2004 11:45 by Jerome

Don't get the C# volatile the wrong way. There is a lot of blurriness around synchronization issues in .NET, especially around memory barriers, System.Threading.Monitor, the lock keyword and the like.

It is common to have objects that create unique identifiers by means of an index incremented each time a new value is retrieved. In a simplistic view, it would look like this:

    public class UniqueIdentifier
    {
      private static int _currentIndex = 0;

      public int NewIndex
      {
        get
        {
          return _currentIndex++;
        }
      }
    }

This is pretty straightforward: Each time the NewIndex property is called, a new index is returned.

But there is a problem in a multithreaded environment, where multiple threads can call the NewIndex at the same time. If we look at the code generated for the getter, here is what we have:

  IL_0000:  ldsfld     int32 UniqueIdentifier::_currentIndex
  IL_0005:  dup
  IL_0006:  ldc.i4.1
  IL_0007:  add
  IL_0008:  stsfld     int32 UniqueIdentifier::_currentIndex

One thing about multithreading in general: the system can stop the execution of a thread anywhere it wants, in particular between the operation at IL_0000 and the end of the operation at IL_0008. The effect is pretty obvious: if, during this pause, another thread executes that same piece of code, both threads end up with the same "new" index, each one incrementing from the same value. This scenario assumes a uniprocessor system, which interleaves the execution of running threads. On multiprocessor systems, threads do not even need to be stopped to hit this exact same problem. While this kind of race condition is harder to trigger on a uniprocessor, it is far easier to fall into with multiple processors.
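
The lost updates described above can usually be observed with a small test program. The following is only a sketch: RaceDemo, Run and the thread and iteration counts are assumptions made for the demonstration, and the result varies from run to run and from machine to machine, so no particular output is guaranteed.

```csharp
using System;
using System.Threading;

// Hypothetical demo class, for illustration only.
public static class RaceDemo
{
    private static int _currentIndex = 0;

    // Same unsynchronized read/increment/write as the NewIndex getter.
    private static int NewIndex()
    {
        return _currentIndex++;
    }

    // Calls NewIndex from several threads and returns the final counter value.
    public static int Run(int threadCount, int iterations)
    {
        _currentIndex = 0;
        Thread[] threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < iterations; j++)
                    NewIndex();
            });
            threads[i].Start();
        }

        foreach (Thread thread in threads)
            thread.Join();

        return _currentIndex;
    }

    public static void Main()
    {
        int expected = 4 * 1000000;
        int actual = Run(4, 1000000);

        // Any shortfall corresponds to increments lost to the race.
        Console.WriteLine("Expected {0}, got {1}", expected, actual);
    }
}
```

Every increment that is lost also means that two callers received the same index.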

This is a very common problem when programming in multithreaded environments, and it is generally fixed by means of synchronization mechanisms like mutexes or critical sections. The whole operation needs to be atomic, which here means executed by at most one thread at a time.

In the native world, in C/C++ for instance, the language does not provide any "built-in" synchronization mechanisms and programmers have to do all the work by hand. The .NET Framework with C#, on the other hand, provides this kind of mechanism integrated in the language: the volatile and lock keywords.

A common and incorrect interpretation of the volatile keyword is to think that all operations (as opposed to accesses) on a volatile variable are synchronized. This generally leads to this kind of code:

    public class UniqueIdentifier
    {
      private static volatile int _currentIndex = 0;

      public int NewIndex
      {
        get
        {
          return _currentIndex++;
        }
      }
    }

While this code is valid, it does not fix the synchronization problem. The correct interpretation of the volatile keyword is that read and write operations on a volatile field must not be reordered, and that the value of the variable must not be cached.

On a single x86 processor system, the only effect of the volatile keyword is that the value is never cached in something like a register and is always fetched from memory. Since there is only one set of caches and one processor, there is no risk of inconsistencies where memory would have been modified elsewhere. (This is called processor self-consistency.)
But on a multiprocessor system, each processor has its own set of data caches and, depending on the cache policy, an updated value for the variable might not be written back immediately to main memory to make it available to the other threads requesting it. In fact, it may never be updated, depending on the cache policy. In practice, this kind of situation is really hard to reproduce because of the high utilization of the cache and the frequent flushes.

Back to volatile: it means that read and write operations always target main memory. In practice, a volatile read or write acts as a memory barrier: a volatile read has acquire semantics and a volatile write has release semantics. So, when using a volatile variable, the thread is sure to have the latest value.
Back to our example: while we are sure to have the latest value, the read/increment/write sequence is still not atomic and can be interrupted right in the middle.

To have a correct implementation of this UniqueIdentifier generator, we have to make it atomic:

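    public class UniqueIdentifier
    {
      private static int _currentIndex = 0;

      // Static, like the counter it protects, so that all
      // instances share the same lock.
      private static readonly object _syncRoot = new object();

      public int NewIndex
      {
        get
        {
          lock(_syncRoot)
          {
            return _currentIndex++;
          }
        }
      }
    }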

In this example, we are using the lock keyword. This is pretty nice because it uses the System.Threading.Monitor class to create a scope that can be entered by only one thread at a time. While this solves the atomicity problem, you might notice that the volatile keyword is not present anymore. This is where the Monitor does something under the hood: it performs a memory barrier at the acquisition of the lock and another one at the release of the lock.

A lot of implicit stuff is done by the CLR, and it can be pretty hard to keep up with all this. Besides, the x86 memory model is pretty strong and does not introduce a lot of race conditions, whereas a weaker memory model like Itanium's would.
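
For a plain counter like this one, the framework also offers System.Threading.Interlocked, which performs the whole read/increment/write as a single atomic operation. This is a different technique from the lock used in this post, shown here only as a possible alternative. The InterlockedIdentifier class and its static NewIndex method are hypothetical names.

```csharp
using System;
using System.Threading;

// Hypothetical alternative to the lock-based generator, for illustration only.
public static class InterlockedIdentifier
{
    private static int _currentIndex = 0;

    // Interlocked.Increment returns the incremented value; subtracting one
    // gives the same "value before increment" result as _currentIndex++.
    public static int NewIndex()
    {
        return Interlocked.Increment(ref _currentIndex) - 1;
    }

    public static void Main()
    {
        Thread[] threads = new Thread[4];

        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < 100000; j++)
                    NewIndex();
            });
            threads[i].Start();
        }

        foreach (Thread thread in threads)
            thread.Join();

        // No increment is lost: the next index equals the number of calls made so far.
        Console.WriteLine(NewIndex()); // prints 400000
    }
}
```

The lock remains the more general tool, since it can protect several statements at once, whereas Interlocked only covers single operations on one variable.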

As a conclusion, use lock and forget volatile. :)

By the way, this article is pretty interesting on the subject.



About me

My name is Jerome Laban, I am a software developer and .NET enthusiast from Montréal, QC. You will find my blog on this site, where I'm adding my thoughts on current events, or the things I'm working on, such as the Bluetooth Remote Control Software for Windows Mobile.

© Copyright 2008
