Talk about some tips for using Unsafe

I remember that when I first learned Java, right after the syntax basics, I came across reflection. It looks like very basic knowledge now, but at the time I was genuinely excited, and I instantly felt that I had left the "Java beginner" group. As I gained experience, I gradually picked up many similar features that excited me, and Unsafe was definitely one of them.

Unsafe is a utility class shipped with the JDK that offers many operations that feel exotic in Java, such as memory allocation and release, CAS operations, class instantiation, memory barriers, and so on. As its name suggests, these operations are dangerous, because they manipulate memory directly and reach down to low-level system facilities. Unsafe has been instrumental in extending the expressiveness of the Java language, letting core library functionality that would otherwise need lower-level (C) code be implemented in higher-level (Java) code.

Starting with JDK 9, the module system strongly encapsulates most JDK-internal APIs. sun.misc.Unsafe itself remains available through the jdk.unsupported module, but it is being phased out in favor of supported replacements. On JDK 8 we can still use Unsafe directly, and if we don't learn it now, we may not get the chance later.

Using Unsafe

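Unsafe was not designed to be used by ordinary developers, so we cannot obtain an instance with new or with its factory method: the constructor is private, and Unsafe.getUnsafe() rejects callers that were not loaded by the bootstrap class loader. The usual workaround is to read the private static field theUnsafe via reflection.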

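import java.lang.reflect.Field;
import sun.misc.Unsafe;

public final class Util {

    public static final Unsafe unsafe = getUnsafe();

    static Unsafe getUnsafe() {
        try {
            // theUnsafe is the private static field holding the singleton
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            // static field, so no receiver object is needed
            return (Unsafe) field.get(null);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}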

Once you get it, you can use the global singleton object to do whatever you want.

I borrowed the image above directly from the web; it gives a fairly comprehensive overview of Unsafe's features. If I introduced all of them, the article would be too long and would inevitably read like a laundry list, so I am going to draw on my project and competition experience and talk about a few Unsafe tips from a practical point of view.

Memory allocation & access

With Unsafe, Java can manipulate memory directly, much like C++. Let's start with a ByteBuffer example in which we allocate 16 bytes of memory, then write and read four int values.

public static void testByteBuffer() {
    ByteBuffer directBuffer = ByteBuffer.allocateDirect(16);
    directBuffer.putInt(1);
    directBuffer.putInt(2);
    directBuffer.putInt(3);
    directBuffer.putInt(4);
    directBuffer.flip();
    System.out.println(directBuffer.getInt());
    System.out.println(directBuffer.getInt());
    System.out.println(directBuffer.getInt());
    System.out.println(directBuffer.getInt());
}

Readers who know NIO should be familiar with the example above, which is a very basic and standard way of using memory. How can Unsafe achieve the same effect?

public static void testUnsafe0() {
    Unsafe unsafe = Util.unsafe;
    // 16 bytes of off-heap memory; the GC does not manage it
    long address = unsafe.allocateMemory(16);
    unsafe.putInt(address, 1);
    unsafe.putInt(address + 4, 2);
    unsafe.putInt(address + 8, 3);
    unsafe.putInt(address + 12, 4);

    System.out.println(unsafe.getInt(address));
    System.out.println(unsafe.getInt(address + 4));
    System.out.println(unsafe.getInt(address + 8));
    System.out.println(unsafe.getInt(address + 12));

    // release the memory explicitly, otherwise it leaks
    unsafe.freeMemory(address);
}

Both snippets print the same output.

The following is a description of the Unsafe APIs used, one by one.

`public native long allocateMemory(long var1);`

This native method allocates off-heap memory and returns a long, which is the start address of the memory block and can be passed to the other Unsafe APIs. If you have read the source code of DirectByteBuffer, you will know that it is a wrapper around Unsafe internally. Speaking of DirectByteBuffer, here is an extra note: off-heap memory allocated by ByteBuffer.allocateDirect is limited by -XX:MaxDirectMemorySize, while off-heap memory allocated by Unsafe is not subject to that limit, nor, of course, to -Xmx. If you are taking part in a contest and this gives you an idea, feel free to type "I got it" in the chat.
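Since memory obtained from allocateMemory is invisible to the garbage collector, it has to be released with freeMemory. The following is a minimal sketch of that pattern, not a definitive implementation. The class name OffHeapInts and the method sumOffHeap are hypothetical names chosen for illustration, and the sketch assumes theUnsafe can be read through reflection as shown earlier.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical helper for illustration; not part of the JDK.
final class OffHeapInts {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes the values off-heap, reads them back, and returns their sum.
    static long sumOffHeap(int... values) {
        long bytes = 4L * values.length;
        long address = UNSAFE.allocateMemory(bytes);
        try {
            // allocateMemory returns uninitialized memory; zero it first.
            UNSAFE.setMemory(address, bytes, (byte) 0);
            for (int i = 0; i < values.length; i++) {
                UNSAFE.putInt(address + 4L * i, values[i]);
            }
            long sum = 0;
            for (int i = 0; i < values.length; i++) {
                sum += UNSAFE.getInt(address + 4L * i);
            }
            return sum;
        } finally {
            // The GC never reclaims this block; free it explicitly.
            UNSAFE.freeMemory(address);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOffHeap(1, 2, 3, 4)); // prints 10
    }
}
```

Putting freeMemory in a finally block keeps the address from leaking if anything between allocation and release throws.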

Having seen the other two APIs, putInt and getInt, you can guess that there are matching APIs for other types, such as putByte / putShort / putLong, and that put and get always come in pairs. There is one thing to watch out for with this family of APIs: use the put and get from the same class together, otherwise the value may be parsed incorrectly because of byte order. See the following example.

public static void testUnsafe1() {
    ByteBuffer directBuffer = ByteBuffer.allocateDirect(4);
    long directBufferAddress = ((DirectBuffer)directBuffer).address();
    System.out.println("Unsafe.putInt(1)");
    Util.unsafe.putInt(directBufferAddress, 1);
    System.out.println("Unsafe.getInt() == " + Util.unsafe.getInt(directBufferAddress));
    directBuffer.position(0);
    directBuffer.limit(4);
    System.out.println("ByteBuffer.getInt() == " + directBuffer.getInt());
    directBuffer.position(0);
    directBuffer.limit(4);
    System.out.println("ByteBuffer.getInt() reverseBytes == " + Integer.reverseBytes(directBuffer.getInt()));
}

The output is as follows.

Unsafe.putInt(1)
Unsafe.getInt() == 1
ByteBuffer.getInt() == 16777216
ByteBuffer.getInt() reverseBytes == 1

We find that writing with Unsafe.putInt and then reading with ByteBuffer.getInt does not give the expected result; the bytes have to be reversed to recover the original value. This is because a newly created ByteBuffer always uses big-endian byte order, regardless of the platform, while Unsafe reads and writes multi-byte types such as int in the machine's native byte order, which is little-endian on my test machine. If you are not sure, use the write and read APIs of the same class together to avoid byte-order problems.
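The number 16777216 can be reproduced without Unsafe at all. The sketch below uses only the public java.nio API: it writes the int 1 as little-endian bytes and reads the same bytes back as big-endian. The class name ByteOrderDemo and the method writeLittleReadBig are hypothetical names for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class for illustration.
final class ByteOrderDemo {

    // Writes value as little-endian bytes, then reads the same bytes as big-endian.
    static int writeLittleReadBig(int value) {
        ByteBuffer buffer = ByteBuffer.allocate(4);
        buffer.order(ByteOrder.LITTLE_ENDIAN).putInt(0, value);
        return buffer.order(ByteOrder.BIG_ENDIAN).getInt(0);
    }

    public static void main(String[] args) {
        // A freshly created ByteBuffer is big-endian on every platform.
        System.out.println(ByteBuffer.allocate(4).order());   // BIG_ENDIAN
        // The byte order of the underlying hardware; this varies by machine.
        System.out.println(ByteOrder.nativeOrder());
        // Bytes 01 00 00 00 read as big-endian give 0x01000000.
        System.out.println(writeLittleReadBig(1));             // 16777216
    }
}
```

If a direct buffer has to be shared with Unsafe code, calling order(ByteOrder.nativeOrder()) on it is one way to make both sides agree on the byte layout.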

Memory Copy

Memory copying is a very common requirement in real-world applications. For example, as I described in the previous article, when writing to disk, in-heap memory first has to be copied to off-heap memory; and when we aggregate data in memory, for example to buffer it, memory copying is involved as well. Unsafe provides a native method for memory copying that works from heap to heap, from heap to off-heap, from off-heap to heap, and from off-heap to off-heap; in short, from anywhere to anywhere.

`public native void copyMemory(Object src, long offset, Object dst, long dstOffset, long size);`

For in-heap memory, we pass the array object itself as src and, as offset, the base offset of the corresponding array type (plus the byte position of the first element to copy). The base offset, that is, the position of the first element relative to the start of the array object, is returned by the arrayBaseOffset method:

`public native int arrayBaseOffset(Class<?> var1);`

For example, to get the base offset of byte[], call unsafe.arrayBaseOffset(byte[].class).

For off-heap memory it is a bit more intuitive: set dst to null and dstOffset to the memory address obtained from Unsafe.

Example code for copying in-heap memory to out-of-heap memory.

public static void unsafeCopyMemory()  {
    ByteBuffer heapBuffer = ByteBuffer.allocate(4);
    ByteBuffer directBuffer = ByteBuffer.allocateDirect(4);
    heapBuffer.putInt(1234);
    long address = ((DirectBuffer)directBuffer).address();

    // query the base offset instead of hard-coding 16, which depends on the JVM's object layout
    long baseOffset = Util.unsafe.arrayBaseOffset(byte[].class);
    Util.unsafe.copyMemory(heapBuffer.array(), baseOffset, null, address, 4);

    directBuffer.position(0);
    directBuffer.limit(4);

    System.out.println(directBuffer.getInt());
}

In practice, most ByteBuffer-related source code uses the copyMemory method when it comes to memory copying.
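The same call also covers the heap-to-heap case. Below is a minimal sketch, assuming a hypothetical class HeapCopy with a copy method that mirrors the argument order of System.arraycopy. Unsafe performs no bounds checks of its own, so the sketch adds one before touching memory.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import sun.misc.Unsafe;

// Hypothetical helper for illustration; not part of the JDK.
final class HeapCopy {
    private static final Unsafe UNSAFE = loadUnsafe();
    // Offset of element 0 inside a byte[] object, queried rather than hard-coded.
    private static final long BYTE_BASE = UNSAFE.arrayBaseOffset(byte[].class);

    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Copies length bytes from src[srcPos..] to dst[dstPos..].
    static void copy(byte[] src, int srcPos, byte[] dst, int dstPos, int length) {
        // Unsafe would happily write past the array, so check the range first.
        if (srcPos < 0 || dstPos < 0 || length < 0
                || srcPos > src.length - length || dstPos > dst.length - length) {
            throw new IndexOutOfBoundsException("copy range out of bounds");
        }
        UNSAFE.copyMemory(src, BYTE_BASE + srcPos, dst, BYTE_BASE + dstPos, length);
    }

    public static void main(String[] args) {
        byte[] src = {1, 2, 3, 4};
        byte[] dst = new byte[6];
        copy(src, 1, dst, 2, 3);
        System.out.println(Arrays.toString(dst)); // [0, 0, 2, 3, 4, 0]
    }
}
```

For byte[] the element size is one byte, so the element index can be added to the base offset directly; for wider element types the index would have to be multiplied by the value returned by arrayIndexScale.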

Unconventional instantiated objects

Before the JDK 9 module system, there were two common ways to keep a class closed to other users or to prevent it from being instantiated arbitrarily (as in the singleton pattern).

Case 1: Privatizing constructors

public class PrivateConstructorFoo {

    private PrivateConstructorFoo() {
        System.out.println("constructor method is invoked");
    }

    public void hello() {
        System.out.println("hello world");
    }

}

If you want to instantiate the object anyway, the first thing that comes to mind is probably reflection:

public static void reflectConstruction() throws Exception {
  PrivateConstructorFoo privateConstructorFoo = PrivateConstructorFoo.class.newInstance();
  privateConstructorFoo.hello();
}

Unsurprisingly, we get an exception:

`java.lang.IllegalAccessException: Class io.openmessaging.Main can not access a member of class moe.cnkirito.PrivateConstructorFoo with modifiers "private"`

With a slight adjustment, we make the constructor accessible and call it to create the instance:

public static void reflectConstruction2() throws Exception {
   Constructor<PrivateConstructorFoo> constructor = PrivateConstructorFoo.class.getDeclaredConstructor();
   // bypass the private access check
   constructor.setAccessible(true);
   PrivateConstructorFoo privateConstructorFoo = constructor.newInstance();
   privateConstructorFoo.hello();
}

It works! The output is as follows.

constructor method is invoked
hello world

Of course, Unsafe also provides the allocateInstance method:

`public native Object allocateInstance(Class<?> var1) throws InstantiationException;`

It can instantiate the class as well, and more directly:

public static void allocateInstance() throws InstantiationException {
    PrivateConstructorFoo privateConstructorFoo = (PrivateConstructorFoo) Util.unsafe.allocateInstance(PrivateConstructorFoo.class);
    privateConstructorFoo.hello();
}

It works again! The output is as follows.

`hello world`

Note one detail here: allocateInstance does not invoke the constructor, which is why "constructor method is invoked" is not printed.
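Skipping the constructor has a consequence that is easy to overlook: field initializers are compiled into the constructor, so they are skipped too. The sketch below illustrates this with a hypothetical nested class Counter; all names in it are made up for the example.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class for illustration.
final class SkipConstructorDemo {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static final class Counter {
        int start = 42;      // the initializer runs as part of the constructor
        final String name;

        Counter() {
            name = "counter";
        }
    }

    // Creates a Counter without running its constructor or field initializers.
    static Counter allocate() {
        try {
            return (Counter) UNSAFE.allocateInstance(Counter.class);
        } catch (InstantiationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter viaNew = new Counter();
        Counter raw = allocate();
        System.out.println(viaNew.start); // 42
        System.out.println(raw.start);    // 0
        System.out.println(raw.name);     // null
    }
}
```

An object created this way can therefore violate invariants that the class author took for granted, including a final field that was never assigned.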

Case 2: package level instances

package moe.cnkirito;

class PackageFoo {

    public void hello() {
        System.out.println("hello world");
    }

}

Note that PackageFoo is declared with package-level (default) access, so it is only accessible to classes in the moe.cnkirito package.

Let's try reflection first:

package com.bellamm;

public static void reflectConstruction() throws Exception {
  Class<?> aClass = Class.forName("moe.cnkirito.PackageFoo");
  aClass.newInstance();
}

We get the expected error:

`java.lang.IllegalAccessException: Class io.openmessaging.Main can not access a member of class moe.cnkirito.PackageFoo with modifiers ""`

What about trying Unsafe again?

package com.bellamm;

public static void allocateInstance() throws Exception{
    Class<?> fooClass = Class.forName("moe.cnkirito.PackageFoo");
    Object foo = Util.unsafe.allocateInstance(fooClass);
    Method helloMethod = fooClass.getDeclaredMethod("hello");
    helloMethod.setAccessible(true);
    helloMethod.invoke(foo);
}

Since code in the com.bellamm package cannot even refer to the PackageFoo type at compile time, we have to use reflection to look up the methods of moe.cnkirito.PackageFoo at runtime, create the instance with Unsafe, and finally invoke the method, which successfully prints hello world.

After spending so much time on these two restricted cases and their Unsafe solutions, we need real-world scenarios to back up the value of Unsafe#allocateInstance. I will briefly list two.

When a serialization framework cannot create an object through reflection, it can fall back to creating it with Unsafe.
Obtain a package-private class and then, with the help of reflection, hack some source code implementations or call some native methods. Use this approach with caution; it is not recommended for production.
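The first scenario can be sketched as follows. This is not how any particular serialization framework does it; the class Instantiator, its method newInstance, and the nested class Point are hypothetical names for this illustration. The idea is to try the no-argument constructor first and to fall back to Unsafe only when the class does not declare one.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical helper for illustration; not taken from a real framework.
final class Instantiator {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static <T> T newInstance(Class<T> type) {
        try {
            // Preferred path: run the no-arg constructor, even if it is private.
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            return constructor.newInstance();
        } catch (NoSuchMethodException e) {
            // Fallback: no no-arg constructor exists, so allocate without running any.
            try {
                return type.cast(UNSAFE.allocateInstance(type));
            } catch (InstantiationException ie) {
                throw new IllegalStateException(ie);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Has only a two-argument constructor, so reflection alone cannot build it blindly.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    public static void main(String[] args) {
        Point p = newInstance(Point.class);
        System.out.println(p.x + "," + p.y); // 0,0
    }
}
```

A deserializer using this fallback would then fill the fields in afterwards, since the object starts out with default values only.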

Sample code: dynamically modify the out-of-heap memory limit to override the JVM startup parameter: -XX:MaxDirectMemorySize.

private void hackMaxDirectMemorySize() {
    try {
        // relies on JDK 8 internals: sun.misc.VM keeps the limit in a private static field
        Field directMemoryField = VM.class.getDeclaredField("directMemory");
        directMemoryField.setAccessible(true);
        directMemoryField.set(new VM(), 8L * 1024 * 1024 * 1024);

        // java.nio.Bits is package-private, so create the instance through Unsafe
        Object bits = Util.unsafe.allocateInstance(Class.forName("java.nio.Bits"));
        Field maxMemory = bits.getClass().getDeclaredField("maxMemory");
        maxMemory.setAccessible(true);
        maxMemory.set(bits, 8L * 1024 * 1024 * 1024);

    } catch (Exception e) {
        throw new RuntimeException(e);
    }

    // print the limit the JVM now reports
    System.out.println(VM.maxDirectMemory());

}

Summary

For now, I have introduced these three uses of Unsafe, which I personally consider to be among the more common ones.

Anyone who knows how to use Unsafe also knows that it must not be used blindly. And even if you never use it, knowing that Java has this mechanism is better than not knowing, right? This article has covered some practical scenarios where Unsafe may be necessary, but you will still see it far more often in low-level library source code.
