Double-Checked Locking with Singleton

For the singleton pattern in Java, I most often use either the static nested class (holder) implementation or the enum implementation.

// Static nested class (holder) implementation
public class Singleton1 {
  private static class SingletonHolder {
    private static final Singleton1 INSTANCE = new Singleton1();
  }
  private Singleton1() { ... }
  public static Singleton1 getInstance() {
    return SingletonHolder.INSTANCE;
  }
  // Singleton1.getInstance().doSomething();
  public void doSomething() { ... }
}

// Enum implementation
public enum Singleton2 {
  INSTANCE;
  // Singleton2.INSTANCE.doSomething();
  public void doSomething() { ... }
}
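To see when the holder variant actually constructs its instance, here is a minimal sketch. The class name LazyHolderDemo and the EVENTS log are hypothetical and exist only for this illustration; the structure mirrors Singleton1. The constructor should not run until getInstance() is called for the first time, and it should run only once.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the name and the EVENTS log are illustrative only.
final class LazyHolderDemo {
  // Records the order of events so that the timing of construction can be observed.
  static final List<String> EVENTS = new ArrayList<>();

  private static class Holder {
    // Assigned only when Holder is initialized, i.e. on first access to INSTANCE.
    static final LazyHolderDemo INSTANCE = new LazyHolderDemo();
  }

  private LazyHolderDemo() {
    EVENTS.add("constructor");
  }

  static LazyHolderDemo getInstance() {
    return Holder.INSTANCE;
  }

  public static void main(String[] args) {
    // The outer class is already initialized here, but Holder is not.
    EVENTS.add("main started");
    LazyHolderDemo a = getInstance(); // triggers initialization of Holder
    LazyHolderDemo b = getInstance(); // returns the same instance again
    System.out.println(EVENTS);       // "main started" comes before "constructor"
    System.out.println(a == b);       // true
  }
}
```

The log shows the constructor running after main has started, which is what makes this variant lazy.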

A static initializer in Java runs exactly once, when its class is initialized, and the JVM guarantees that this initialization is thread-safe, so singletons built on static initialization are thread-safe. In the static nested class implementation the singleton is also lazy: SingletonHolder is not initialized together with Singleton1, but only when it is first used, which happens on the first call to getInstance(). The enum implementation is essentially a static initializer as well. Take a look at the Singleton2 bytecode.

// javac Singleton2.java
// javap -c Singleton2.class
Compiled from "Singleton2.java"
public final class Singleton2 extends java.lang.Enum<Singleton2> {
  public static final Singleton2 INSTANCE;
  ...

  static {};
    Code:
       0: new           #4 // class Singleton2
       3: dup
       4: ldc           #7 // String INSTANCE
       6: iconst_0
       7: invokespecial #8 // Method "<init>":(Ljava/lang/String;I)V
      10: putstatic     #9 // Field INSTANCE:LSingleton2;
      13: iconst_1
      14: anewarray     #4 // class Singleton2
      17: dup
      18: iconst_0
      19: getstatic     #9 // Field INSTANCE:LSingleton2;
      22: aastore
      23: putstatic     #1 // Field $VALUES:[LSingleton2;
      26: return
}

As you can see, javac compiles the enum into a final class that extends java.lang.Enum, and the enum constants become static final fields that are assigned in the static initializer when the class is initialized. So the enum-based singleton is effectively an eager singleton. Despite the slight waste that eager initialization can cause, Joshua Bloch, the author of Effective Java, still considers the enum implementation the best way to implement a singleton:

This approach is functionally equivalent to the public field approach, except that it is more concise, provides the serialization machinery for free, and provides an ironclad guarantee against multiple instantiation, even in the face of sophisticated serialization or reflection attacks. While this approach has yet to be widely adopted, a single-element enum type is the best way to implement a singleton.
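The serialization guarantee mentioned in the quote can be checked with a small sketch. The names EnumSerializationDemo, EnumSingleton, and roundTrip are hypothetical and used only for this illustration. The helper serializes the enum constant to a byte array and reads it back; the deserialized reference should be the very same object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical demo class; names are illustrative only.
final class EnumSerializationDemo {
  // Serializes the given object to a byte array and deserializes it again.
  static Object roundTrip(Object value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
               new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("serialization round trip failed", e);
    }
  }

  public static void main(String[] args) {
    Object copy = roundTrip(EnumSingleton.INSTANCE);
    // Enum constants are deserialized by name, so no second instance is created.
    System.out.println(copy == EnumSingleton.INSTANCE); // true
  }
}

// Hypothetical enum singleton, shaped like Singleton2.
enum EnumSingleton {
  INSTANCE
}
```

A class-based singleton that implements Serializable would need a readResolve() method to get the same behavior; the enum gets it without extra code.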

Another well-known implementation is double-checked locking, which often comes up as a Java fundamentals interview question because it involves both the synchronized and volatile keywords. A typical implementation looks like this.

public class Singleton3 {
  private static volatile Singleton3 sInstance;
  private Singleton3() { ... }

  public static Singleton3 getInstance() {
    if (sInstance == null) { // #1
      synchronized (Singleton3.class) {
        if (sInstance == null) { // #2
          sInstance = new Singleton3();
        }
      }
    }
    return sInstance;
  }

  // Singleton3.getInstance().doSomething();
  public void doSomething() { ... }
}

The double-checked locking code is fairly easy to follow. When getInstance() is called concurrently, several threads may pass the if at #1 at the same time. After the first thread to hold the class lock releases it, the other (previously blocked) threads enter the synchronized block one by one; the second if at #2 prevents them from initializing the instance again.
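The effect of the second check can be observed with a small sketch. The class DclDemo, the CONSTRUCTED counter, and the race() helper are hypothetical and exist only for this illustration; getInstance() follows the same shape as Singleton3. Several threads are released at once through a latch, and the counter records how often the constructor ran.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the counter and race() helper are illustrative only.
final class DclDemo {
  // Counts how many times the constructor has run.
  static final AtomicInteger CONSTRUCTED = new AtomicInteger();
  private static volatile DclDemo sInstance;

  private DclDemo() {
    CONSTRUCTED.incrementAndGet();
  }

  static DclDemo getInstance() {
    if (sInstance == null) {               // #1: cheap check without the lock
      synchronized (DclDemo.class) {
        if (sInstance == null) {           // #2: re-check while holding the lock
          sInstance = new DclDemo();
        }
      }
    }
    return sInstance;
  }

  // Releases the given number of threads at once, waits for them to finish,
  // and returns the total number of constructor calls so far.
  static int race(int threads) {
    CountDownLatch start = new CountDownLatch(1);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        try {
          start.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        getInstance();
      });
      workers[i].start();
    }
    start.countDown();
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return CONSTRUCTED.get();
  }

  public static void main(String[] args) {
    System.out.println(race(16)); // 1: the constructor ran exactly once
  }
}
```

Removing the check at #2 would let every thread that passed #1 construct its own instance once it obtained the lock.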

The open question is the role of volatile here. I always thought that volatile was there for memory visibility: to make the assignment to sInstance visible in time, so that other (previously blocked) threads would not get past #2 and initialize multiple instances. But that is not actually the case, because synchronized already guarantees visibility when the synchronized block exits.

Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This guarantees that changes to the state of the object are visible to all threads.

What volatile actually does here is prohibit instruction reordering. As most blog posts on the subject explain, sInstance = new Singleton3(); is not an atomic operation; it breaks down into three steps.

1. Allocate memory for the Singleton3 instance.
2. Call Singleton3's constructor.
3. Point sInstance at the allocated memory.

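Without the volatile modifier, the three initialization steps above may be reordered by the JIT compiler (or the processor) into 1 → 3 → 2. Because the check at #1 runs outside the synchronized block, the following can then happen: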

Thread A has executed initialization steps 1 → 3 when another thread B happens to reach #1.
Since sInstance already points to the allocated memory, the check sInstance == null at #1 is false, and B returns sInstance directly.
B then calls doSomething(), but because the constructor has not yet run on that object, the call may fail.

A  B
↓  ↓  public static Singleton3 getInstance() {
↓  ↓    if (sInstance == null) { // #1
↓         synchronized (Singleton3.class) {
↓           if (sInstance == null) { // #2
↓             sInstance = new Singleton3();
            }
          }
        }
        return sInstance;
      }

That sounds plausible, but so far it is only theory passed along by word of mouth; we need reproducible proof. Normally, even with volatile removed, errors caused by instruction reordering are very hard to reproduce on our own machines. Luckily, a Stack Overflow question from 4 years ago addresses exactly this.

OpenJDK provides a concurrency stress testing tool called jcstress, and one of its samples, UnsafePublication, is Java code for testing instruction reordering. We can use it to compare the behavior with and without volatile.

We can download the test code locally. Note that the repository uses hg (Mercurial), not git.

$ hg clone http://hg.openjdk.java.net/code-tools/jcstress/ jcstress
$ cd jcstress/

Then compile and run UnsafePublication. Note that the JDK version must be >= 9.

$ mvn clean install -pl tests-custom -am
$ java -XX:-UseCompressedOops -jar tests-custom/target/jcstress.jar -t ".*UnsafePublication.*" -v

For space reasons, only the code snippet of UnsafePublication related to the running result is attached here.

public class UnsafePublication {
  int x = 1;

  // Not volatile by default
  MyObject o;

  // Multiple threads call publish() concurrently
  @Actor
  public void publish() { o = new MyObject(x); }

  // Multiple threads call consume() concurrently;
  // jcstress uses the value of res.r1 to tell how many fields were initialized
  @Actor
  public void consume(I_Result res) {
    res.r1 = o != null ? o.x00 + o.x01 + o.x02 + o.x03 : -1;
  }

  static class MyObject {
    int x00, x01, x02, x03;
    public MyObject(int x) {
      x00 = x; x01 = x; x02 = x; x03 = x;
    }
  }
}

As we can see, the field o is not volatile by default. The test results are as follows.

Test results without volatile modifier

We can change the declaration to volatile MyObject o; then recompile and rerun the test. The results are as follows.

Test results with volatile modifier

Comparing the two screenshots, we can see that over the 100 million runs, the test without volatile observed cases where only some of the member variables were initialized, while the test with volatile observed none.

Finally, even though the test case above does nothing but a plain new each time, it took more than 100 million runs to produce more than a thousand instruction reordering errors. If we turned it into a double-checked locking singleton without volatile, the probability of an error would presumably be even lower, but such errors still could not be ruled out. Therefore, the volatile keyword is essential in a double-checked locking singleton, where it serves to prevent instruction reordering.