1. What is buffer mapping

Without giving a formal definition: once a piece of video memory has been mapped, the CPU can access it.

In the three native graphics APIs (D3D12, Vulkan, Metal), once a buffer (a piece of video memory) has been mapped the CPU can access it, but the GPU can still access the same memory. This creates a problem the program has to handle itself: conflicting reads and writes.

WebGPU forbids this and instead models the mapped state as a transfer of “ownership”, much like Rust's philosophy. At any given moment only one side, CPU or GPU, can access the memory, which avoids races and conflicts.

When JavaScript requests a mapping, ownership is not handed to the CPU immediately, because the GPU may still have pending operations that use the memory. That is why the GPUBuffer mapping method is asynchronous:

const someBuffer = device.createBuffer({ /* ... */ })
await someBuffer.mapAsync(GPUMapMode.READ, 0, 4) // start at offset 0 and map only 4 bytes

// After that, call getMappedRange to get the corresponding ArrayBuffer and work with the data

Unmapping, however, is synchronous: the CPU can unmap the buffer as soon as it has finished with it.

someBuffer.unmap()
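
The ownership hand-off described above can be pictured as a small state machine. The sketch below is a toy model only, not the WebGPU API: ToyBuffer and its state names are hypothetical, and the "wait for the GPU" step is replaced by a single resolved promise. It only illustrates why mapping is asynchronous while unmapping is synchronous.

```javascript
// Toy model only: NOT the WebGPU API. It mimics the ownership hand-off
// with three hypothetical states: 'unmapped', 'pending', 'mapped'.
class ToyBuffer {
  constructor(size) {
    this.state = 'unmapped' // the (pretend) GPU owns the memory
    this.data = new ArrayBuffer(size)
  }

  // Asynchronous: ownership moves to the CPU only after the GPU is done.
  async mapAsync() {
    if (this.state !== 'unmapped') {
      throw new Error('buffer is already mapped or a mapping is pending')
    }
    this.state = 'pending'
    await Promise.resolve() // stands in for waiting on pending GPU work
    this.state = 'mapped' // the CPU owns the memory now
  }

  // Only legal while the CPU owns the memory.
  getMappedRange() {
    if (this.state !== 'mapped') throw new Error('buffer is not mapped')
    return this.data
  }

  // Synchronous: ownership goes straight back to the GPU.
  unmap() {
    this.state = 'unmapped'
  }
}
```

In this model, calling getMappedRange before the mapAsync promise has resolved throws, which mirrors the rule that only one side may touch the memory at a time.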

Note that mapAsync pushes an operation directly onto the device's default queue inside WebGPU, so it acts on the queue timeline, one of WebGPU's three timelines (content, device, and queue). Also, memory usage only grows after mapAsync has succeeded (verified by testing).

My guess is that the data in memory is handed over to the GPU only when a command buffer is submitted to the queue, that is, a command buffer in which one of the render passes uses this GPUBuffer.

I did not see any significant drop in memory after calling the destroy method, but my ability to test this was limited, so I hope some readers can test it themselves.

1.1 Mapping at creation

You can pass mappedAtCreation: true when creating the buffer, so you don’t even need to declare its usage with GPUBufferUsage.MAP_WRITE.

const buffer = device.createBuffer({
  usage: GPUBufferUsage.UNIFORM,
  size: 256,
  mappedAtCreation: true,
})
// The mapped ArrayBuffer is available right away
const mappedArrayBuffer = buffer.getMappedRange()

/* perform some writes here */

// Unmap, handing control back to the GPU
buffer.unmap()
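
The writes in the middle of the snippet above usually go through a typed-array view over the mapped ArrayBuffer. Here is a minimal sketch of that step, with a plain ArrayBuffer standing in for the result of getMappedRange() so it runs without a GPU. The writeUniforms helper and the layout (a 4x4 matrix followed by an RGBA color) are assumptions for illustration, not something from the original example.

```javascript
// Hypothetical layout: a 4x4 float matrix (64 bytes) followed by an
// RGBA color (16 bytes). The rest of the 256-byte buffer stays zero.
function writeUniforms(arrayBuffer, matrix, color) {
  const floats = new Float32Array(arrayBuffer) // a view, not a copy
  floats.set(matrix, 0) // floats 0..15
  floats.set(color, 16) // floats 16..19
}

// Stand-in for buffer.getMappedRange() on a 256-byte buffer.
const mapped = new ArrayBuffer(256)
const identity = [
  1, 0, 0, 0,
  0, 1, 0, 0,
  0, 0, 1, 0,
  0, 0, 0, 1,
]
writeUniforms(mapped, identity, [1, 0.5, 0, 1])
```

With a real GPUBuffer, the same writeUniforms call would take buffer.getMappedRange() as its first argument, and it has to happen before unmap(), because the mapped ArrayBuffer is detached once the buffer is unmapped.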

2. Flow of buffer data

2.1 CPU to GPU

Inside requestAnimationFrame (rAF) callbacks, the JavaScript side frequently writes large amounts of data into the ArrayBuffer mapped from a GPUBuffer. The data then moves on as the buffer is unmapped and the command buffer is submitted to the queue, and it finally reaches the GPU.

The most common examples are passing each frame's VertexBuffer and UniformBuffer, the StorageBuffer needed by a compute pass, and so on.

Writing to a buffer with the queue object's writeBuffer method is very efficient, but compared with writing through a mapped GPUBuffer it involves one extra copy. Presumably this affects performance, although the officially recommended examples call writeBuffer in many places, mostly to update UniformBuffers.
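
To make the comparison concrete, here is a rough sketch of the two upload paths. The function names are hypothetical. The sketch assumes data is a typed array whose byteLength is a multiple of 4, and that the destination buffer was created with GPUBufferUsage.COPY_DST. Note that in the second path the mapped buffer is only a staging buffer, so a GPU-side copy into the destination is still needed.

```javascript
// Path A: queue.writeBuffer copies `data` into the destination for us.
function uploadWithWriteBuffer(device, dstBuffer, data) {
  device.queue.writeBuffer(dstBuffer, 0, data)
}

// Path B: write into a staging buffer that is mapped at creation,
// then copy from the staging buffer to the destination on the GPU.
function uploadWithStagingBuffer(device, dstBuffer, data) {
  const staging = device.createBuffer({
    size: data.byteLength, // assumed to be a multiple of 4
    usage: GPUBufferUsage.COPY_SRC,
    mappedAtCreation: true,
  })
  const bytes = new Uint8Array(data.buffer, data.byteOffset, data.byteLength)
  new Uint8Array(staging.getMappedRange()).set(bytes)
  staging.unmap() // hand the staging memory back to the GPU

  const encoder = device.createCommandEncoder()
  encoder.copyBufferToBuffer(staging, 0, dstBuffer, 0, data.byteLength)
  device.queue.submit([encoder.finish()])
}
```

Which path is faster depends on the implementation and the data size, so it is worth measuring both rather than assuming.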

2.2 GPU to CPU

Transfers in this direction are rarer, but they do occur. Screenshots (saving a color attachment to an ArrayBuffer), statistics produced by a compute pass, and similar tasks all need to read results back from the GPU.

Take the official example of reading pixel data back from a rendered texture:

const texture = getTheRenderedTexture()
 
const readbackBuffer = device.createBuffer({
  usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ,
  size: 4 * textureWidth * textureHeight,
})
 
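// Use a command encoder to copy the texture into the GPUBuffer
const encoder = device.createCommandEncoder()
encoder.copyTextureToBuffer(
  { texture },
  // bytesPerRow (formerly rowPitch) must be a multiple of 256
  { buffer: readbackBuffer, bytesPerRow: textureWidth * 4 },
  [textureWidth, textureHeight],
)
device.queue.submit([encoder.finish()])

// Map, so that the CPU side can access the data
await readbackBuffer.mapAsync(GPUMapMode.READ)
// Save the screenshot
saveScreenshot(readbackBuffer.getMappedRange())
// Unmap
readbackBuffer.unmap()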
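
One detail the readback example above glosses over: in a texture-to-buffer copy, WebGPU requires the row pitch (bytesPerRow in the current API) to be a multiple of 256 bytes, so textureWidth * 4 only works for some widths. Below is a sketch of two helpers that deal with this. Both names are hypothetical, and a 4-byte-per-pixel format such as rgba8unorm is assumed by default.

```javascript
const BYTES_PER_ROW_ALIGNMENT = 256

// Rounds the tightly packed row size up to the next multiple of 256.
function paddedBytesPerRow(width, bytesPerPixel = 4) {
  const unpadded = width * bytesPerPixel
  return Math.ceil(unpadded / BYTES_PER_ROW_ALIGNMENT) * BYTES_PER_ROW_ALIGNMENT
}

// Copies the pixels out of a padded readback into a tightly packed
// Uint8Array, dropping the padding at the end of each row.
function stripRowPadding(mapped, width, height, bytesPerPixel = 4) {
  const rowBytes = width * bytesPerPixel
  const stride = paddedBytesPerRow(width, bytesPerPixel)
  const src = new Uint8Array(mapped)
  const out = new Uint8Array(rowBytes * height)
  for (let y = 0; y < height; y++) {
    out.set(src.subarray(y * stride, y * stride + rowBytes), y * rowBytes)
  }
  return out
}
```

With these helpers, the readback buffer would be sized as paddedBytesPerRow(textureWidth) * textureHeight and the same padded value passed as the row pitch. Because stripRowPadding returns a copy, its result remains usable after unmap(), whereas the mapped ArrayBuffer itself is detached at that point.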