I read the article a long time ago, but I decided to re-read it and to post a short summary here, just to make sure I remember the important points of it. The summary has links to the corresponding sections in Vance’s article.
- The fundamental requirement for writing correct programs is the same whether the program is sequential or multithreaded: all code in the program must protect any invariants that other parts of the program need, but preserving program invariants is much more difficult in the multithreaded case.
- The tricky part of multithreaded programming is how threads communicate with one another.
- The most commonly deployed multithreaded communication model is called the shared memory model:
- In this model all threads have access to the same pool of shared memory.
- Its advantage and also its biggest problem is that multithreaded programs are programmed in much the same way as sequential programs.
- The model does not distinguish between memory that is being used strictly for thread local use (like most locally declared variables), and memory that is being used to communicate with other threads (like some global variables and heap memory).
- There are four conditions needed for a race to be possible:
- There are memory locations (typically global/static variables or heap memory reachable from global/static variables) that are accessible from more than one thread.
- There is a property associated with these shared memory locations that is needed for the program to function correctly. Typically, the property needs to hold true before an update occurs for the update to be correct.
- The property does not hold during some part of the actual update.
- Another thread accesses the memory when the invariant is broken, thereby causing incorrect behavior.
- The most common way of preventing races is to use locks to prevent other threads from accessing memory associated with an invariant while it is broken. This removes the fourth condition required for a race, thus making it impossible to happen.
- In the Microsoft .NET Framework locks are implemented by the System.Threading.Monitor class.
- The lock statement in C# is a shortcut for the following:
- Exceptions in a lock block must be properly handled. The lock is there to ensure an invariant, so it’s important to make sure that the invariant wasn’t broken as a result of the exception.
- A simple lock statement is appropriate if there is no useful cleanup to do or if everything in its body is read-only.
- Locks provide mutual exclusion for regions of code, but generally programmers want to protect regions of memory.
- For a lock to provide mutual exclusion for a region of memory, no writes to that memory can occur without entering the same lock. In a properly designed program, associated with every lock is a region of memory for which it provides mutual exclusion. There is no obvious artifact of the code that makes this association clear, and yet this information is absolutely critical for anyone reasoning about the multithreaded behavior of the program.
- Every lock should have a specification associated with it that documents the precise region of memory (set of data structures) for which it provides mutual exclusion.
- While having memory regions protected by different locks that overlap is not strictly incorrect, it should be avoided because it is not useful.
- The main disadvantage of having more than one lock is the complexity involved. The program has more parts associated with it so there is more opportunity for programmer error.
- It is best to have few locks that protect large regions of memory and only split them when lock contention is shown to be a bottleneck on performance.
- A lock should always be entered before writing a memory location. When memory is read the case is a little more complicated because it depends on the programmer’s expectations.
- In general, when code needs a program invariant, all locks associated with any memory involved with the invariant must be taken.
- For multithreaded applications the programmer’s mindset should change from believing that nothing changes unless it is explicitly changed to believing that everything changes unless locks are used to prevent it.
- An easy and correct answer would be that all memory needs to be protected by a lock, but this would be overkill for most apps.
- Memory can be made safe for multithreaded use in one of several ways:
- Memory that is only accessed by a single thread is safe because other threads are unaffected by it. This includes most local variables and all heap-allocated memory before it is published (made reachable to other threads). Once memory is published, however, it falls out of this category and some other technique must be used.
- Memory that is read-only after publication does not need a lock because any invariants associated with it must hold for the rest of the program (since the value does not change).
- Memory that is actively updated from multiple threads generally uses locks to ensure that only one thread has access while a program invariant is broken.
- In certain specialized cases where the program invariant is relatively weak, it is possible to perform updates that can be done without locks. Typically, specialized compare-and-exchange instructions are used. These techniques are best thought of as special, lightweight implementations of locks.
- A general rule should be that all program memory should fall into one of three buckets: thread exclusive, read-only, or lock protected.
- In a well-designed class, clients only access its internal state by calling its instance methods. If a lock is taken on entry to any instance method, and released on exit, there is a systematic way of ensuring that all accesses to internal data (instance fields) only occur when the lock is held. Classes that follow this protocol are called monitors (NOTE: So in essence monitors are classes that thread-safe because they synchronize access to their internal data. See this Wikipedia article about monitors for a better explanation of the concept).
- The concept of a monitor is one of the simplest and most useful techniques for methodical locking.
- The name of a lock in the .NET Framework is System.Threading.Monitor, a type specifically tailored to support the monitor concept.
- .NET locks operate on System.Object to make the creation of monitors easier.
- Every object has a built-in lock that can be used to protect its instance data. By embedding the body of every instance method in a lock(this) statement, a monitor can be formed (NOTE: Jeffrey Richter, in his “CLR via C#, Second edition” book, explains why using lock(this) or lock(typeof(ClassName)) isn’t a good idea. See  in the footnotes).
- There is a special attribute, [MethodImpl(MethodImplAttributes.Synchronized)], that can be placed on instance methods to automatically insert the lock(this) statement (NOTE: Since this attribute will surround the method’s code with a lock(this) is it’s an instance methof or a lock(typeof(typename)) if it’s a static method, you should never use this attribute either. Again, see  in the footnotes).
- .NET locks are reentrant, meaning that the thread that has entered the lock can enter the lock again without blocking. This allows methods to call other methods on the same class without causing the deadlock that would normally occur.
- Indiscriminate use of monitors can result in either too little or too much locking.
- An important weakness with the monitor concept is that it provides no protection if the class gives out updatable pointers to its data.
- Monitors don’t address the more complicated situation when multiple locks need to be taken.
- Generally speaking, most reusable code (like container classes) should not have locks built into it because it can only protect itself, and it is likely that whatever code uses it will need a stronger lock anyway. The exception to this rule is when it is important that the code works even with high-level program errors. The global memory heap and security-sensitive code are examples of exceptional cases.
- It is less error prone and more efficient to have just a few locks that protect large amounts of memory. A single lock that protects many data structures is a good design if it allows the desired amount of concurrency. If the work each of the threads will do doesn’t require many updates to shared memory, consider taking this principle to the extreme and having one lock that protects all shared memory. It works well for applications whose worker threads don’t interact much.
- In cases where threads read shared structures frequently, but write to them infrequently, reader-writer locks such as System.Threading.ReaderWriterLock can be used to keep the number of locks in the system low (NOTE: Jeffrey Richter also says in his “CLR via C#, Second edition” book that he doesn’t recommend that anyone ever use this class, see  in the footnotes). This type of lock has one method for entering for reading and one for entering for writing. The lock will allow multiple readers to enter concurrently, but writers get exclusive access. Since readers now don’t block one another, the system can be made simpler (containing fewer locks) and still achieve the necessary concurrency.
- Lock usage can often be at odds with normal object-oriented program abstraction because locking is really another independent dimension of the program that has its own design criteria (other such dimensions include lifecycle management, transactional behavior, real-time constraints, and exception-handling behavior).
- If some method does not take locks itself, but expects its caller to provide that mutual exclusion, then that requirement is a precondition of calling that method and should be in its interface contract.
- If a data abstraction might take a lock or call client (virtual) methods while holding a lock, that also needs to be part of the interface contract.
- There is little that class library designers can do to make the library friendlier to multithreading, because locking just one data structure is rarely useful. It is only when the threading structure of program is designed that locks can be usefully added.
- Once a program has more than one lock deadlock becomes a possibility, which is another reason to avoid having many locks in the system.
- Deadlocks can only occur if there is a circular chain of threads such that each thread in the chain is waiting on a lock already acquired by the next in line. For example, if one thread tries to enter Lock A and then Lock B, while simultaneously another thread tries to enter Lock B and then Lock A, it is possible for them to deadlock if each enters the lock that the other owns before attempting to enter the second lock.
- Deadlocks are generally prevented in one of two ways:
- The best way to prevent deadlock is to have few enough locks in the system that it is never necessary to take more than one lock at a time.
- If the former is impossible, deadlock can also be prevented by having a convention on the order in which locks are taken. Each lock in the system is assigned a “level”, and the program is designed so that threads always take locks only in strictly descending order by level. This protocol makes cycles involving locks, and therefore deadlock, impossible. If this strategy does not work (can’t find a set of levels), it is likely that the lock-taking behavior of the program is so input-dependent that it is impossible to guarantee that deadlock cannot happen in every case. Typically, this type of code falls back on timeouts or some kind of deadlock detection scheme.
- An analysis must be made to determine why more than one lock has to be taken simultaneously.
- Taking multiple locks is only necessary when code needs to get exclusive access to memory protected by different locks. This analysis typically either yields a trivial lock ordering that will avoid deadlock or shows that complete deadlock avoidance is impossible.
- Another reason to avoid having many locks in a system is the cost of entering and leaving a lock.
- The lightest locks use a special compare/exchange instruction to check whether the lock is taken, and if it isn’t, they enter the lock in a single atomic action. This special instruction is relatively expensive (typically ten to hundreds of times longer than an ordinary instruction).
- There are two main reasons for this expense, and they both have to do with issues arising on a true multiprocessor system:
- The compare/exchange instruction must ensure that no other processor is also trying to do the same thing. Fundamentally, this requires one processor to coordinate with all other processors in the system. This is a slow operation and accounts for the lower bound of the cost of a lock (dozens of cycles).
- The other reason for the expense is the effect inter-process communication has on the memory system. After a lock has been taken, the program is very likely to access memory that may have been recently modified by another thread. If this thread ran on another processor, it is necessary to ensure that all pending writes on all other processors have been flushed so that the current thread sees the updates. The cost of doing this largely depends on how the memory system works and how many writes need to be flushed. This cost can be pretty high in the worst case (possibly hundreds of cycles or more).
- The cost of a lock is can be significant. If a frequently called method needs to take a lock and only executes a hundred or so instructions, lock overhead is likely to be an issue. The program generally needs to be redesigned so that the lock can be held for a larger unit of work.
- As the number of processors in the system grows locks become the main impediment in using all the processors efficiently:
- If there are too few locks in the program, it is impossible to keep all the processors busy since they are waiting on memory that is locked by another processor.
- If there are many locks in the program, it is easy to have a “hot” lock that is being entered and exited frequently by many processors. This causes the memory-flushing overhead to be very high, and throughput again does not scale linearly with the number of processors. The only design that scales well is one in which worker threads can do significant work without interacting with shared data.
- Locks don’t really provide a mechanism for threads to synchronize.
- The principles for synchronizing threads without introducing races are not much different from the principles for proper locking.
- In the .NET Framework synchronization functionality is implemented in the ManualResetEvent and AutoResetEvent classes. These classes provide Set, Reset, and WaitOne methods: the WaitOne method causes a thread to block as long as the event is in the reset state. When another thread calls the Set method, AutoResetEvents will allow one thread that called WaitOne to unblock, while ManualResetEvents will unblock all waiting threads.
- Generally, events are used to signal that a more complicated program property holds.
- In general, it’s a bad idea to hold locks while waiting on events.
- The difference between sequential and multithreaded programs is that protecting program invariants is more complicated in the multithreaded case.
- It requires a significantly higher degree of discipline to build a correct multithreaded program:
- Ensuring through tools like monitors that all thread-shared, read-write data is protected by locks.
- Carefully design which memory is protected by which lock.
- Controlling the extra complexity that locks inevitably add to the program.
- Good designs are typically the simplest, and for multithreaded programs this means having the fewest number of locks that achieve the needed concurrency.
 Simply stated, when you lock on the this pointer or the object’s type, you open the possibility for malicious code to deadlock threads that use that type. For a complete explanation see pages 636-639 of “CLR via C#, Second Edition“.
 The ReaderWriterLock class has three flaws: its performance is awful, even when there is not contention on the lock, it has a policy of favoring readers over writers, “writers get starved for time and do not complete their work in a timely fashion”, and finally it supports recursion, meaning that the thread that enters the lock must be the thread that exits it (Richter consider this a bug because it’s becoming commonplace to have a thread enter a lock and start an operation to access some data and have another thread to complete the operation and exit the lock, specially likely in a asynchronous programming model). See pages 642-643 of “CLR via C#, Second Edition“.
So that’s it. If you are new to multithreading programming I strongly suggest you to read the original article. Happy coding!