Starting with the constructor of DirectByteBuffer

The off-heap memory behind a DirectByteBuffer is actually allocated through Unsafe. Take a look at the DirectByteBuffer constructor:

DirectByteBuffer(int cap) {                   // package-private
    super(-1, 0, cap, cap);
    boolean pa = VM.isDirectMemoryPageAligned();
    int ps = Bits.pageSize();
    long size = Math.max(1L, (long)cap + (pa ? ps : 0));
    Bits.reserveMemory(size, cap);
    long base = 0;
    try {
        base = unsafe.allocateMemory(size);
    } catch (OutOfMemoryError x) {
        Bits.unreserveMemory(size, cap);
        throw x;
    }
    unsafe.setMemory(base, size, (byte) 0);
    if (pa && (base % ps != 0)) {
        // Round up to page boundary
        address = base + ps - (base & (ps - 1));
    } else {
        address = base;
    }
    cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
    att = null;
}

These twenty-odd lines of code carry a lot of information. First of all, the key points:

  • long base = unsafe.allocateMemory(size); Calls Unsafe to allocate the memory and returns its starting address.
  • unsafe.setMemory(base, size, (byte) 0); Initializes the memory to 0. We’ll come back to this line in a later section.
  • Cleaner.create(this, new Deallocator(base, size, cap)); Registers the cleaner that reclaims the off-heap memory. I won’t go into details here; you can refer to my previous article “An article on monitoring and recycling out-of-heap memory”.
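The page-alignment arithmetic in the constructor is easy to misread, so here is a minimal sketch of it in isolation. The class name PageAlign, the method alignUp, and the 4096-byte page size are assumptions made for illustration; the JDK takes the real page size from Bits.pageSize().

```java
// Hypothetical helper, not part of the JDK: mirrors the rounding done in
// the DirectByteBuffer constructor.
class PageAlign {

    // Rounds base up to the next multiple of pageSize.
    // pageSize must be a power of two for the bit mask to work.
    static long alignUp(long base, int pageSize) {
        if (base % pageSize != 0) {
            return base + pageSize - (base & (pageSize - 1));
        }
        return base;
    }

    public static void main(String[] args) {
        int ps = 4096; // assumed page size
        System.out.println(alignUp(8192, ps));  // already aligned: 8192
        System.out.println(alignUp(8193, ps));  // rounded up: 12288
        System.out.println(alignUp(12287, ps)); // rounded up: 12288
    }
}
```

Rounding up can skip as many as ps - 1 bytes, which is why the constructor asks for cap + ps bytes whenever page alignment is enabled.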

This one constructor is enough to tie Unsafe and ByteBuffer closely together. With a little imagination, you can think of ByteBuffer as a safe version of the Unsafe family of memory-manipulation APIs. ByteBuffer encapsulates the concepts of limit/position/capacity, and I find it easier to use than Netty’s ByteBuf wrapper. Good as it is, though, it has one aspect that people dislike: a lot of bounds checking.

For performance-challenge competitors, one of the main attractions of manipulating memory through Unsafe rather than ByteBuffer is avoiding bounds checks. Take example code 1:

public ByteBuffer put(byte[] src, int offset, int length) {
    if (((long)length << 0) > Bits.JNI_COPY_FROM_ARRAY_THRESHOLD) {
        checkBounds(offset, length, src.length);
        int pos = position();
        int lim = limit();
        assert (pos <= lim);
        int rem = (pos <= lim ? lim - pos : 0);
        if (length > rem)
            throw new BufferOverflowException();
        Bits.copyFromArray(src, arrayBaseOffset,
                           (long)offset << 0,
                           ix(pos),
                           (long)length << 0);
        position(pos + length);
    } else {
        super.put(src, offset, length);
    }
    return this;
}

You don’t need to care what role this code plays in DirectByteBuffer. All I want to show is its checkBounds call and the pile of if/else around it. In extreme performance scenarios, geeks who see an if/else first worry about the cost of branch mispredictions, and then wonder whether the whole pile of code can be removed.
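To make those checks concrete, the sketch below triggers the two failures that the put method guards against. The class and method names are made up for this illustration; only the ByteBuffer calls and exception types come from the JDK.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Hypothetical demo class: shows what the bounds checks in put() reject.
class BoundsCheckDemo {

    // True if src does not fit into a direct buffer of the given capacity.
    static boolean overflows(int capacity, byte[] src) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(capacity);
        try {
            buffer.put(src, 0, src.length);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    // True if [offset, offset + length) is not a valid range of src.
    static boolean rejectsRange(byte[] src, int offset, int length) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(64);
        try {
            buffer.put(src, offset, length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows(4, new byte[8]));       // true: only 4 bytes remain
        System.out.println(overflows(8, new byte[8]));       // false: exactly fits
        System.out.println(rejectsRange(new byte[8], 4, 8)); // true: range runs past src
    }
}
```

With raw Unsafe calls neither exception exists: an out-of-range write simply corrupts memory or crashes the JVM, which is the price paid for skipping the checks.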

If you don’t want all those bounds checks, you can implement a custom ByteBuffer on top of Unsafe, like the following.

public class UnsafeByteBuffer {

    private final long address;
    private final int capacity;
    private int position;
    private int limit;

    public UnsafeByteBuffer(int capacity) {
        this.capacity = capacity;
        this.address = Util.unsafe.allocateMemory(capacity);
        this.position = 0;
        this.limit = capacity;
    }

    public int remaining() {
        return limit - position;
    }

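    public void put(ByteBuffer heapBuffer) {
        // heapBuffer must be array-backed (hasArray() == true).
        int remaining = heapBuffer.remaining();
        // Start at the heap buffer's current position rather than a hard-coded
        // offset of 16, which is only the byte[] header size on typical 64-bit JVMs.
        long srcOffset = Util.unsafe.arrayBaseOffset(byte[].class)
                + heapBuffer.arrayOffset() + heapBuffer.position();
        Util.unsafe.copyMemory(heapBuffer.array(), srcOffset, null, address + position, remaining);
        position += remaining;
    }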

    public void put(byte b) {
        Util.unsafe.putByte(address + position, b);
        position++;
    }

    public void putInt(int i) {
        Util.unsafe.putInt(address + position, i);
        position += 4;
    }

    public byte get() {
        byte b = Util.unsafe.getByte(address + position);
        position++;
        return b;
    }

    public int getInt() {
        int i = Util.unsafe.getInt(address + position);
        position += 4;
        return i;
    }

    public int position() {
        return position;
    }

    public void position(int position) {
        this.position = position;
    }

    public void limit(int limit) {
        this.limit = limit;
    }

    public void flip() {
        limit = position;
        position = 0;
    }

    public void clear() {
        position = 0;
        limit = capacity;
    }

}
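The class above relies on a Util.unsafe helper that the article does not show. One common way to obtain the Unsafe instance is to read its private static theUnsafe field reflectively. The sketch below assumes that Util is nothing more than a holder for that instance; the roundTrip method is added only to show the helper in use. Recent JDK releases deprecate Unsafe’s memory-access methods, so treat this as a sketch for the JDK versions the article targets.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Assumed shape of the article's Util class: a holder for the Unsafe singleton.
final class Util {

    static final Unsafe unsafe = loadUnsafe();

    private Util() {
    }

    // Unsafe.getUnsafe() rejects application callers, so read the field directly.
    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Writes an int to freshly allocated off-heap memory and reads it back.
    static int roundTrip(int value) {
        long address = unsafe.allocateMemory(4);
        try {
            unsafe.putInt(address, value);
            return unsafe.getInt(address);
        } finally {
            // Off-heap memory is not garbage collected; free it explicitly.
            unsafe.freeMemory(address);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42)); // prints 42
    }
}
```

Note that UnsafeByteBuffer as written never frees its allocation, so a real implementation would also need a release method that calls freeMemory(address).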

Some competitions ban Unsafe outright to keep contestants from an endless optimization arms race, but others allow part of Unsafe’s capabilities so that contestants can let loose and explore the possibilities. For example, Unsafe#allocateMemory, which is not restricted by -XX:MaxDirectMemorySize or -Xmx, was disabled in the second Cloud Native Programming Challenge, but the Unsafe#put and Unsafe#get methods and Unsafe#copyMemory were allowed. If you definitely want to use Unsafe to manipulate off-heap memory, you can write code like this, which does the same thing as example code 1.

byte[] src = ...;

ByteBuffer byteBuffer = ByteBuffer.allocateDirect(src.length);
long address = ((DirectBuffer)byteBuffer).address();
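// Ask Unsafe for the byte[] data offset instead of hard-coding 16.
Util.unsafe.copyMemory(src, Util.unsafe.arrayBaseOffset(byte[].class), null, address, src.length);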

This is the first key point I want to introduce: once you have a DirectByteBuffer’s address, you can use Unsafe to operate on its memory directly and bypass the bounds checks.

Memory initialization of DirectByteBuffer

Notice that the DirectByteBuffer constructor performs another Unsafe operation: unsafe.setMemory(base, size, (byte) 0);. In some scenarios or on some hardware, touching memory is expensive, especially when a large block is being allocated, so this zero-fill can become a bottleneck when creating a DirectByteBuffer.
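The zero-fill is what backs the documented guarantee of ByteBuffer.allocateDirect that every element starts at zero. The small check below illustrates that guarantee; ZeroFillDemo and isAllZero are names invented for this sketch.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class: verifies that allocateDirect hands out zeroed memory.
class ZeroFillDemo {

    // Scans the whole buffer through a duplicate so the caller's
    // position and limit are left untouched.
    static boolean isAllZero(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate();
        view.clear();
        while (view.hasRemaining()) {
            if (view.get() != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);
        System.out.println(isAllZero(buffer)); // true: the constructor zeroed it
    }
}
```

Memory obtained straight from Unsafe#allocateMemory carries no such guarantee, so the same check on a raw allocation may or may not pass.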

If you want to allocate memory without this zero-fill step, you can allocate it with Unsafe and then graft it onto a DirectByteBuffer via reflection.

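import java.lang.reflect.Field;
import java.nio.Buffer;
import java.nio.ByteBuffer;

public class AllocateDemo {

    private final Field addressField;
    private final Field capacityField;

    public AllocateDemo() throws NoSuchFieldException {
        // Assign the fields themselves; declaring locals here would leave them null.
        // Newer JDKs may need --add-opens java.base/java.nio=ALL-UNNAMED for this.
        capacityField = Buffer.class.getDeclaredField("capacity");
        capacityField.setAccessible(true);
        addressField = Buffer.class.getDeclaredField("address");
        addressField.setAccessible(true);
    }

    public ByteBuffer allocateDirect(int cap) throws IllegalAccessException {
        // Uninitialized memory: its contents are whatever was there before.
        long address = Util.unsafe.allocateMemory(cap);

        // Donor buffer. Its own 1-byte block is left to its Cleaner; freeing it
        // here as well would release the same block twice.
        ByteBuffer byteBuffer = ByteBuffer.allocateDirect(1);

        addressField.setLong(byteBuffer, address);
        capacityField.setInt(byteBuffer, cap);

        // The Cleaner does not know about the new block, so the caller must
        // eventually release it with Util.unsafe.freeMemory(address).
        byteBuffer.clear();
        return byteBuffer;
    }

}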

After all this, we get an uninitialized DirectByteBuffer. Don’t worry, it behaves like any other DirectByteBuffer, and we have saved the cost of setMemory!

Talking about zero-copy with ByteBuffer

When a ByteBuffer is used as a read buffer shared by several threads, some people guard access to it with a lock. That is the wrong approach; use the duplicate and slice methods that ByteBuffer provides instead.

A scheme for reading a buffer concurrently:

ByteBuffer byteBuffer = ByteBuffer.allocateDirect(1024);
ByteBuffer duplicate = byteBuffer.duplicate();
duplicate.limit(512);
duplicate.position(256);
ByteBuffer slice = duplicate.slice();
// use slice

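The sliced ByteBuffer shares the original’s memory, so nothing is copied, but it has its own position and limit. Each thread can therefore read through its own slice concurrently without changing the original ByteBuffer’s pointers.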
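As a rough sketch of how this plays out with several threads, the example below has each task build its own duplicate and slice over one shared direct buffer. The class name, the method names, and the assumption that the capacity divides evenly by the number of parts are all choices made for this illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class: lock-free concurrent reads via duplicate() and slice().
class ConcurrentSliceDemo {

    // Sums the bytes in [offset, offset + length) through a private view.
    static long sumRange(ByteBuffer shared, int offset, int length) {
        ByteBuffer view = shared.duplicate(); // own position/limit, same memory
        view.limit(offset + length);
        view.position(offset);
        ByteBuffer slice = view.slice();
        long sum = 0;
        while (slice.hasRemaining()) {
            sum += slice.get();
        }
        return sum;
    }

    // Splits the buffer into equal parts and sums them in parallel.
    // Assumes shared.capacity() is divisible by parts.
    static long parallelSum(ByteBuffer shared, int parts) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            int chunk = shared.capacity() / parts;
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < parts; i++) {
                final int offset = i * chunk;
                Callable<Long> task = () -> sumRange(shared, offset, chunk);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get();
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ByteBuffer shared = ByteBuffer.allocateDirect(1024);
        for (int i = 0; i < shared.capacity(); i++) {
            shared.put(i, (byte) 1); // absolute put: position stays at 0
        }
        System.out.println(parallelSum(shared, 4)); // prints 1024
        System.out.println(shared.position());      // prints 0
    }
}
```

This only works without locks because the buffer is read-only while the tasks run; concurrent writers would still need their own coordination.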