Don't get the C# volatile keyword wrong. There is a lot of confusion around synchronization issues in .NET, especially around memory barriers, System.Threading.Monitor, the lock keyword and the like.

It is common to have objects that generate unique identifiers by means of an index incremented each time a new value is retrieved. A simplistic implementation would look like this:

    public class UniqueIdentifier
    {
      private static int _currentIndex = 0;

      public int NewIndex
      {
        get
        {
          return _currentIndex++;
        }
      }
    }

This is pretty straightforward: Each time the NewIndex property is called, a new index is returned.

But there is a problem in a multithreaded environment, where multiple threads can call NewIndex at the same time. If we look at the IL generated for the getter, here is what we have:

  IL_0000:  ldsfld     int32 UniqueIdentifier::_currentIndex
  IL_0005:  dup
  IL_0006:  ldc.i4.1
  IL_0007:  add
  IL_0008:  stsfld     int32 UniqueIdentifier::_currentIndex

One thing about multithreading in general: the system can suspend a thread's execution anywhere it wants, in particular between the load at IL_0000 and the store at IL_0008. The effect is pretty obvious: if, during that window, another thread executes the same piece of code, both threads read the same value and each ends up with the same "new" index, incrementing from the same starting point. This scenario describes a uniprocessor system, which interleaves the execution of running threads. On multiprocessor systems, threads do not even need to be suspended to hit the exact same problem, since two threads can execute the getter truly simultaneously. While this kind of race condition is harder to trigger on a uniprocessor, it is far easier to fall into with multiple processors.
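The lost-update effect described above is easy to provoke. Here is a small sketch (the class and names are illustrative, not from the article): several threads hammer an unsynchronized counter, and the final value typically falls short of the ideal total because concurrent read/increment/write cycles overwrite each other.

```csharp
using System;
using System.Threading;

public class RaceDemo
{
    // Public so the final value can be inspected after the run.
    public static int Counter = 0;

    public static void Main()
    {
        const int ThreadCount = 4;
        const int IncrementsPerThread = 100000;

        var threads = new Thread[ThreadCount];
        for (int i = 0; i < ThreadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < IncrementsPerThread; j++)
                    Counter++; // load, add, store: three steps, not one
            });
            threads[i].Start();
        }
        foreach (var t in threads)
            t.Join();

        // With no synchronization the result is usually below 400000,
        // because concurrent increments overwrite each other.
        Console.WriteLine("Expected 400000, got " + Counter);
    }
}
```

The exact shortfall varies from run to run (and may even be zero on a lightly loaded uniprocessor), which is precisely what makes this bug so hard to catch.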

This is a very common problem when programming in multithreaded environments, and it is generally fixed by means of synchronization mechanisms like mutexes or critical sections. The whole operation needs to be atomic, which means it is executed by at most one thread at a time.

In the native world, in C/C++ for instance, the language does not provide any "built-in" synchronization mechanisms and programmers have to do all the work by hand. The .NET framework with C#, on the other hand, provides such mechanisms integrated into the language: the volatile and lock keywords.

A common and incorrect interpretation of the volatile keyword is to think that all operations (as opposed to individual accesses) on a volatile variable are synchronized. This generally leads to this kind of code:

    public class UniqueIdentifier
    {
      private static volatile int _currentIndex = 0;

      public int NewIndex
      {
        get
        {
          return _currentIndex++;
        }
      }
    }

While this code is valid, it does not fix the synchronization problem. The correct interpretation of the volatile keyword is that read and write operations on a volatile field must not be reordered, and that the value of the field must not be cached.

On a single x86 processor system, the only effect of the volatile keyword is that the value is never cached in something like a register and is always fetched from memory. Since there is only one set of caches and one processor, there is no risk of inconsistencies where memory would have been modified elsewhere (this is called processor self-consistency).
But on a multiprocessor system, each processor has its own data cache, and depending on the cache policy, an updated value for the variable might not be written back immediately to main memory, where other threads could see it. In fact it may never be written back at all, depending on the cache policy. In practice this kind of situation is really hard to reproduce, because caches are heavily used and flushed frequently.

Back to volatile: it means that reads and writes will always target main memory. In practice, a volatile read or write acts as a memory barrier, so a thread using a volatile variable is sure to see the latest value.
Back to our example: while we are sure to read the latest value, the read/increment/write sequence is still not atomic and can be interrupted right in the middle.
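For reference, the framework also exposes these per-access semantics as explicit calls: Thread.VolatileRead and Thread.VolatileWrite give a non-volatile field the same always-hit-memory behavior. A minimal sketch (the class and method names are illustrative, not from the article):

```csharp
using System.Threading;

public class ExplicitVolatile
{
    // Deliberately not declared volatile: the barriers are requested
    // explicitly at each access instead.
    private static int _flag = 0;

    public static void Publish()
    {
        // Like a volatile write: the store reaches main memory
        // and is not reordered with earlier operations.
        Thread.VolatileWrite(ref _flag, 1);
    }

    public static int Consume()
    {
        // Like a volatile read: always fetched from main memory.
        return Thread.VolatileRead(ref _flag);
    }
}
```

Note that this only controls visibility and ordering of individual accesses; just like the volatile keyword, it does nothing for the atomicity of a read/increment/write sequence.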

To have a correct implementation of this UniqueIdentifier generator, we have to make the increment atomic:

    public class UniqueIdentifier
    {
      private static int _currentIndex = 0;

      // static: one lock shared by every instance, like _currentIndex itself
      private static readonly object _syncRoot = new object();

      public int NewIndex
      {
        get
        {
          lock (_syncRoot)
          {
            return _currentIndex++;
          }
        }
      }
    }

In this example, we are using the lock keyword. This is pretty nice because it uses the System.Threading.Monitor class to create a scope that only one thread at a time can enter. While this solves the atomicity problem, you might notice that the volatile keyword is not present anymore. This is where the Monitor does something under the hood: it performs a memory barrier when the lock is acquired and another one when it is released.
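The fixed generator can be exercised with a quick harness (the harness below is illustrative and not part of the article; the class is repeated so the sketch is self-contained): several threads pull indices concurrently, and every value handed out must be unique.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class UniqueIdentifier
{
    private static int _currentIndex = 0;
    private static readonly object _syncRoot = new object();

    public int NewIndex
    {
        get
        {
            lock (_syncRoot)
            {
                return _currentIndex++;
            }
        }
    }
}

public class Harness
{
    // Returns true when no index was handed out twice.
    public static bool AllUnique()
    {
        var generator = new UniqueIdentifier();
        var seen = new HashSet<int>();
        var seenLock = new object();
        var threads = new Thread[4];
        bool duplicate = false;

        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < 10000; j++)
                {
                    int index = generator.NewIndex;
                    lock (seenLock) // protects the bookkeeping, not the generator
                    {
                        if (!seen.Add(index))
                            duplicate = true;
                    }
                }
            });
            threads[i].Start();
        }
        foreach (var t in threads)
            t.Join();

        return !duplicate;
    }
}
```

Replace the lock in NewIndex with a plain `return _currentIndex++;` and the same harness will, sooner or later, report duplicates.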

The CLR does a lot of this implicitly, and it can be pretty hard to keep track of it all. Besides, the x86 memory model is pretty strong and does not expose many race conditions that a weaker memory model, like the Itanium's, would.

As a conclusion, use lock and forget volatile. :)
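As a side note going beyond the article: for a plain counter like this one, the framework also provides Interlocked.Increment, which performs the whole read/increment/write as a single atomic operation, with no lock at all. A sketch (the class name is illustrative):

```csharp
using System.Threading;

public class InterlockedIdentifier
{
    private static int _currentIndex = 0;

    public int NewIndex
    {
        get
        {
            // Atomically increments the field and returns the new value;
            // subtracting 1 mimics the post-increment of the original code.
            return Interlocked.Increment(ref _currentIndex) - 1;
        }
    }
}
```

Interlocked only covers simple operations such as increments and exchanges; for anything more involved, lock remains the right tool.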

By the way, this article is pretty interesting on the subject.