I read this article by Bill Pugh on why the double checked locking idiom does not guarantee thread safety in Java Singletons. That article taught me a lot of new things, and to be honest, I had to re-read that article at least a couple of times to partially understand it :-)
I recently created a presentation on this topic for DevCamp. What follows are my slides and an explanation of each slide. I hope you enjoy this presentation and find it useful.
In this presentation I will discuss the double checked locking idiom, and explain why it does not work to provide thread safety to Java Singletons. I will also talk about how using volatile fields will fix the problem in JDK 1.5 onwards.
slide 2 (Singleton):
Many of you might already have used the Singleton design pattern. In case you have not, here is a brief description of what Singletons are. The Singleton pattern is used when we want to ensure that there is only one instance of a class per some scope. In most cases, that scope is a JVM-classloader combination, so we want to ensure that there is only one instance of the Singleton class per JVM and classloader. But we are not restricted to this. We may need only one instance of the Singleton per Session, or per Request, or per anything else that makes sense in the software you are making.
It is clear that we want to control the instantiation process of the Singleton class. To do this, the first thing we do, as shown in the example, is make the constructor private. With the constructor private, no other class can create an instance of this class. OK, so how will we create any instance of this class? We provide a public static method called getInstance() which will return an instance of this class. We also use a private static field called instance to hold the one and only instance of this class. Since we control the getInstance() method, every time it is invoked we return instance. The first time, however, instance will not be initialized, so we put in a null check and instantiate instance if it has not already been instantiated. Henceforth we will always return a reference to instance. This way we ensure that only one instance of our Singleton is ever created (the one we save in the field called instance).
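The lazy Singleton described above can be sketched as follows. This is my reconstruction of the slide's example; the class name LoneRanger is taken from the later slides.

```java
// A minimal lazy Singleton sketch (not yet thread safe).
class LoneRanger {
    private static LoneRanger instance;

    // Private constructor: no other class can instantiate us.
    private LoneRanger() { }

    public static LoneRanger getInstance() {
        if (instance == null) {           // first call: not yet created
            instance = new LoneRanger();  // create the one and only instance
        }
        return instance;                  // later calls return the same object
    }
}
```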
slide 3 (Is This A Safe Singleton?):
Look at the class carefully and ask yourself "Is this Singleton safe?"
slide 4 (The Singleton is not Threadsafe):
If you have looked at the class carefully, you will realize that the Singleton is not thread safe. This slide shows why. Imagine that a thread T1 invokes getInstance(). If an instance of the Singleton has not been created yet, then instance will be null, and T1 will enter the if block. Now imagine that T1 gets pre-empted, and thread T2 also calls getInstance(). Since T1 got pre-empted before it got a chance to create the instance, the reference in instance is still null. So T2 will also enter the if block. If T2 does not get pre-empted, it will go all the way: it will create an instance of Singleton, and probably even use it. When T1 is resumed, it unfortunately does not know that the Singleton has already been created, since it has already gone past the null check. T1 will also happily create an instance of our Singleton, thus breaking the Singleton pattern. We now have two instances of the Singleton, which is simply not allowed.
slide 5 (Let's Make It Threadsafe):
Alright, so we need to make the getInstance() method thread safe. The simplest way to do this is by making it synchronized. Once it is synchronized, only one thread is allowed to enter the method at a time, and we will never run into the issue where we end up with multiple instances of the Singleton.
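The synchronized version might look like this; again a sketch of the slide's code, not a verbatim copy.

```java
// Thread safe, but every call pays the synchronization cost.
class LoneRanger {
    private static LoneRanger instance;

    private LoneRanger() { }

    // Only one thread at a time can execute this method, so the
    // instance is created exactly once.
    public static synchronized LoneRanger getInstance() {
        if (instance == null) {
            instance = new LoneRanger();
        }
        return instance;
    }
}
```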
slide 6 (That Was Expensive):
Making a method synchronized is expensive. A thread executing a synchronized method must acquire a monitor before it can invoke the method, and release the monitor after executing it. This takes time and CPU cycles. In early JVMs (I have read somewhere), the time to invoke (not execute) a synchronized method was as much as 100x the time taken to invoke a non-synchronized method. I think this was improved in later JVMs to about 15x. I recently did a quick test on JDK 1.6, and the number I got was 2x. Regardless of the exact factor, it is definitely slower, and programmers always want to make their code fast.
slide 7 (Let's Use Double Checked Locking):
Some very smart programmers came up with the "double checked locking" idiom to relieve the code of this expensive operation. What we really should have done is synchronize only the code which instantiates the Singleton, not the entire method. That way we incur the expense of the synchronized block only once, when the Singleton needs to be created; on all subsequent calls the check (instance == null) will be false, and the instance will be returned to the calling code without any locking. We put a second check inside the synchronized block because a thread could get pre-empted just after the first null check, but before it enters the synchronized block. In such a case, if another thread enters this method and goes all the way to creating the instance, the first thread will not know of it and will create a second instance. To prevent this, we put another check inside the synchronized block.
So this is a pretty smart solution. We eliminate the synchronization on the method, thus eliminating synchronization on the main path.
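The idiom described above looks like this (a reconstruction of the slide's code; as the next slides explain, this version is broken):

```java
// The classic (broken) double checked locking idiom.
class LoneRanger {
    private static LoneRanger instance;

    private LoneRanger() { }

    public static LoneRanger getInstance() {
        if (instance == null) {                 // first check, no locking
            synchronized (LoneRanger.class) {
                if (instance == null) {         // second check, lock held
                    instance = new LoneRanger();
                }
            }
        }
        return instance;
    }
}
```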
slide 8 (Will It Work?):
The solution described in the previous slide (double checked locking) is a pretty smart solution. But it does not work.
slide 9 (Why?):
There are two reasons why it does not work. First, the compiler, the JVM, and the processor may reorder instructions if they feel it might be more optimal. Of course they cannot randomly reorder instructions, but they can if they can prove that the reordering will maintain as-if-serial semantics. However, this is not the only reason. On a multi-processor system, each processor has its own memory cache, and the cache is not always synchronized with main memory immediately. This means a write to a memory location can happen only in the local cache, where it will not be visible to threads running on other processors. Even if the writing thread does flush its local cache, it is still possible that another thread which reads that value has not pulled the most recent values from main memory into its local cache. This causes a situation very similar to reordering, where the effect of a write is not visible to another thread that is reading that variable's value.
slide 10 (instance = new LoneRanger() != Atomic operation):
If you look at the example in slide 7, you might think that the statement below is an atomic statement.
instance = new LoneRanger();
But that is not the case. For the sake of simplicity this slide shows the statement above broken into 2 operations. In the first part, an instance of the class is constructed, and in the second part a reference to that instance is assigned to the variable instance.
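A pseudocode sketch of the decomposition (the slide groups allocation and construction into one "construction" step; I have split them here to make the later reordering easier to see):

```
// instance = new LoneRanger(); is roughly:
mem = allocate(sizeof(LoneRanger));   // allocate raw, uninitialized memory
ctorLoneRanger(mem);                  // run the constructor (initialization)
instance = mem;                       // publish the reference (assignment)
```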
slide 11 (Compiler Reordering):
Now what if the compiler reorders the instructions in such a way that the field instance is first assigned an uninitialized block of memory (allocated for the class LoneRanger), and only then is the constructor of LoneRanger invoked?
slide 12 (Compiler Reordering Pseudocode):
This slide shows pseudocode to understand the effect of compiler reordering on the double checked locking code. The code in the synchronized block which instantiates our Singleton is broken into two steps, assignment and initialization, where the assignment happens before the initialization.
Now imagine thread T1 enters the getInstance() method, enters the synchronized block, and executes the statement which assigns the block of memory allocated for LoneRanger to the variable instance. If T1 gets pre-empted at this point and thread T2 enters getInstance(), then T2 may see a non-null value for instance. T2 will thus actually be returned an uninitialized object of LoneRanger. This could cause all sorts of bugs in the software.
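The interleaving described above, in pseudocode:

```
// Possible reordered execution inside T1's synchronized block:
mem = allocate(sizeof(LoneRanger));
instance = mem;              // assignment happens FIRST: instance is non-null
// <-- T1 pre-empted here. T2 calls getInstance(), sees instance != null,
//     and returns a reference to an object whose constructor has not run.
ctorLoneRanger(mem);         // initialization happens LAST
```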
So in the first problem described a few slides back, we ran into the issue of having multiple instances of our Singleton, which we attempted to fix with the double checked locking idiom. However, this introduced the issue where a thread might be given a reference to an uninitialized instance of the Singleton.
slide 13 (Can We Prevent Reordering?):
Programmers never give up :-) When this problem was explained to an even smarter programmer, he remarked "so let us prevent reordering and our problem will be solved!". OK, let's try that, but how do we prevent reordering? There is something called a memory barrier which may help us. Instructions cannot be reordered across memory barriers (although they may be reordered within one). Let's see if we can introduce memory barriers to prevent the reordering that has been giving us such a hard time till now.
slide 14 (Memory Barrier):
A memory barrier is a low level construct (at the level of the processor) which is used to create a fence around instructions. Instructions which are fenced inside a memory barrier cannot be moved out of the fence, and memory caches are also synced with main memory when a memory barrier is encountered.
slide 15 (Memory Barrier):
This slide explains memory barriers with an illustration. Notice that instructions instr2 and instr3 are fenced with the memory barrier. Because of the semantics of a memory barrier, these instructions can never be moved out of the barrier, but they may be reordered within it. We can also see that when the memory barrier is entered, the local processor cache is invalidated and the latest values are read from main memory, and when the memory barrier is exited, the local cache is flushed and main memory is updated with the latest values.
slide 16 (Memory Barrier):
Because a memory barrier is a low level construct, there is no way to explicitly create one in a high level language like Java. However, the synchronized keyword in Java implicitly creates a memory barrier. Before I read this, I had no clue that synchronization in Java is anything more than a mutex. But I read somewhere that when a monitor is obtained, a memory barrier is also created, and when the monitor is released, the memory barrier ends (this is my understanding, please correct me if this is wrong).
slide 17 (Double Checked Locking With Memory Barrier):
The fact that assignment and construction of our Singleton could be reordered created the issue of potentially having an uninitialized Singleton. We want to ensure that the reordering does not happen. For that, let us separate the construction and assignment with a memory barrier, and put the assignment after the construction. We do this with 2 synchronized blocks and a temporary variable. In the inner synchronized block we initialize the Singleton and assign it to a temporary variable. We don't care if this is reordered, because the class variable instance will still be null, which prevents another thread from getting an uninitialized instance of the Singleton. Then we assign the temporary variable's reference to the class variable instance. This happens outside the inner memory barrier. So by employing a memory barrier we are hoping to maintain program order in the execution of instructions. Again a very smart solution, but unfortunately this does not work either.
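A sketch of this attempt, reconstructed from the slide's description (and, as the next slides show, still broken):

```java
// Attempted fix: use an inner synchronized block as a memory barrier
// around construction, and publish the reference only afterwards.
class LoneRanger {
    private static LoneRanger instance;

    private LoneRanger() { }

    public static LoneRanger getInstance() {
        if (instance == null) {
            synchronized (LoneRanger.class) {
                if (instance == null) {
                    LoneRanger tempInstance;
                    synchronized (LoneRanger.class) {
                        // Construction may be reordered in here, but
                        // tempInstance is local, so no other thread sees it.
                        tempInstance = new LoneRanger();
                    }
                    // Intended to run only after the inner block completes;
                    // monitor-exit semantics do not actually guarantee this.
                    instance = tempInstance;
                }
            }
        }
        return instance;
    }
}
```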
slide 18 (Monitor Exit Semantics):
The semantics of monitor exit specify that everything that happened before the monitor exit must appear to happen before it. This means that nothing from the inner synchronized block (where we instantiate the Singleton) will be moved out of that block. However, it does not mean that something from outside the block cannot be moved into it.
slide 19 (Monitor Exit Semantics):
This slide explains monitor exit semantics with an illustration. As we can see, inst2 and inst3 are within the memory barrier. Neither of them will be moved out of the block, but inst4, which is outside the memory barrier, may be moved into it.
slide 20 (Double Checked Locking With Memory Barrier):
This slide shows the code where we tried to use the memory barrier, but this time with a potential reordering. Once the statement instance = tempInstance is moved inside the memory barrier, it can be further reordered to before the actual construction of the Singleton, bringing us back to the same problem.
slide 21 (OK So What The Hell Will Work ?):
Alright, enough of playing around... now I am getting a bit edgy... just tell me what the hell is going to work. Is this what you are saying to yourself? Java has a special modifier called volatile, and we can use volatile fields to help us with the Singleton.
slide 22 (Semantics of volatile):
The volatile modifier is used in Java to communicate state changes between threads. This slide explains the semantics of volatile with an illustration.
The code on the left shows a class which has two methods: writer() and reader(). The writer() method writes values of the variables x and v to memory, and the reader() method reads the values of x and v from memory. Imagine that both these methods will be called by different threads T1 and T2, running on different processors P1 and P2. Each processor has a local memory cache, and of course there is also the main memory. The variable v is volatile and x is not.
When thread T1 executes writer(), we are guaranteed that the instructions will not be reordered, because v is volatile. We are also guaranteed that when thread T1 exits the method, processor P1's local cache will be flushed to main memory, which means that the values of both x and v will be visible to other threads. When thread T2 executes the method reader(), it first reads the value of v; this invalidates the local cache of P2 and fetches the latest values from main memory. This ensures that T2 sees the latest values of v and x (which in this case are the ones written by T1 when it called writer()).
So this slide explains the semantics of volatile, and how volatile may be used to ensure that instructions are executed in program order, and also to ensure the visibility of writes by one thread to another thread.
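A sketch of the writer()/reader() example from the slide (the class name VolatileExample and the concrete values are my own, taken for illustration):

```java
// v is volatile, x is a plain field. Under the JDK 1.5+ memory model,
// a thread that reads v == 42 is guaranteed to also see x == 1.
class VolatileExample {
    int x = 0;
    volatile int v = 0;

    void writer() {
        x = 1;      // plain write
        v = 42;     // volatile write: publishes the write to x as well
    }

    int reader() {
        if (v == 42) {   // volatile read: pulls in the latest values
            return x;    // guaranteed to see 1 here
        }
        return -1;       // v has not been written yet
    }
}
```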
slide 23 (Double Checked Locking With Volatile):
This slide shows our old and well known example which uses double checked locking, but this time with a little difference: we have made the variable instance a volatile field. Making it volatile ensures that the initialization of the LoneRanger object and the assignment of that instance to the field instance will not be reordered. This eliminates the problem of seeing an uninitialized Singleton object. (Remember that this works only from JDK 1.5 onwards, where volatile was given these stronger semantics.)
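The fixed idiom, as I understand the slide (works on JDK 1.5 and later):

```java
// Double checked locking made correct by declaring instance volatile.
class LoneRanger {
    // volatile (JDK 1.5+) prevents the reordering of construction
    // and assignment, so no thread can see a half-built object.
    private static volatile LoneRanger instance;

    private LoneRanger() { }

    public static LoneRanger getInstance() {
        if (instance == null) {                 // no locking on the fast path
            synchronized (LoneRanger.class) {
                if (instance == null) {
                    instance = new LoneRanger();
                }
            }
        }
        return instance;
    }
}
```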
slide 24 (Singleton With Static Initializer):
After all these coding gymnastics, this slide shows what is probably the simplest solution for creating a thread safe Singleton. Instead of instantiating the Singleton in getInstance(), we use a static initializer, which instantiates the Singleton when the class is loaded. Java semantics assure us that all static fields of a class will be completely initialized before the class is available for use. This means that the field instance will be properly initialized before anyone uses the class to get an instance of the Singleton.
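The static initializer version is short; this sketch follows the slide's description:

```java
// The instance is created during class initialization, which the JVM
// guarantees to be thread safe; no locking is needed in getInstance().
class LoneRanger {
    private static final LoneRanger instance = new LoneRanger();

    private LoneRanger() { }

    public static LoneRanger getInstance() {
        return instance;
    }
}
```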
slide 25 (Singleton With Static Initializer):
In this slide we look at the pros and cons of using a static initializer. Even though it is the simplest solution, everything the Singleton needs to initialize itself may not be available when the class is loaded (which is when the static initializer runs). The Singleton is also created eagerly, which may increase the startup time of our application; something that may be undesirable if instantiating the Singleton is an expensive operation.
slides 26, 27, 28 (Summary):
In these slides we summarize the main points covered in the presentation till now.
slide 29 (Resources):
Links to some very good and relevant articles, including my favorite article by Bill Pugh.
slide 30 (Thank You):