1. Problems encountered

Project description:

  • Total size: 5.6 GB
  • Number of files: 529,352

Dockerfile

FROM golang:1.13
COPY ./ /go/src/code

The build command and its output are as follows.

time DOCKER_BUILDKIT=1 docker build --no-cache -t test:v3 -f Dockerfile .  --progress=plain

#1 [internal] load build definition from Dockerfile
#1 sha256:2a154d4ad813d1ef3355d055345ad0e7c5e14923755cea703d980ecc1c576ce7
#1 transferring dockerfile: 37B done
#1 DONE 0.1s

#2 [internal] load .dockerignore
#2 sha256:9598c0ddacf682f2cac2be6caedf6786888ec68f009c197523f8b1c2b5257b34
#2 transferring context: 2B done
#2 DONE 0.2s

#3 [internal] load metadata for golang:1.13
#3 sha256:0c7952f0b4e5d57d371191fa036da65d51f4c4195e1f4e1b080eb561c3930497
#3 DONE 0.0s

#4 [1/2] FROM golang:1.13
#4 sha256:692ef5b58e708635d7cbe3bf133ba934336d80cde9e2fdf24f6d1af56d5469ed
#4 CACHED

#5 [internal] load build context
#5 sha256:f87f36fa1dc9c0557ebc53645f7ffe404ed3cfa3332535260e5a4a1d7285be3c
#5 transferring context: 18.73MB 4.8s
#5 transferring context: 38.21MB 9.8s done
#5 DONE 10.5s

#6 [2/2] COPY ./ /go/src/code
#6 sha256:2c63806741b84767def3d7cebea3872b91d7ef00bd3d524f48976077cce3849a
#6 DONE 26.8s

#7 exporting to image
#7 sha256:e8c613e07b0b7ff33893b694f7759a10d42e180f2b4dc349fb57dc6b71dcab00
#7 exporting layers
#7 exporting layers 67.5s done
#7 writing image sha256:03b278543ab0f920f5af0540d93c5e5340f5e1f0de2d389ec21a2dc82af96754 done
#7 naming to docker.io/library/test:v3 done
#7 DONE 67.6s

real    1m45.411s
user    0m18.374s
sys     0m7.344s

The most time-consuming steps are:

  • 10s to load the build context
  • 26s to perform the COPY operation
  • 67s to export the image (image size 5.79 GB)

The rest of this article follows these leads to troubleshoot, test, and verify, and to locate the IO bottlenecks in the build process.

2. Using a custom Go client to submit builds directly to Dockerd does not work well

The project https://github.com/shaowenchen/demo/tree/master/buidl-cli submits a local Dockerfile and build context to Dockerd for building, which makes it possible to test whether the Docker CLI itself is a bottleneck when submitting files.
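
As a reference, here is a minimal sketch of what such a client can look like when built on the Docker Engine Go SDK (github.com/docker/docker). This is not the actual code of the buidl-cli project, only an illustration of the idea: tar the local context and submit it to Dockerd through the ImageBuild API.

package main

import (
	"context"
	"io"
	"os"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
	"github.com/docker/docker/pkg/archive"
)

func main() {
	// Usage: ./build <context-dir> <image-tag>, mirroring the test below.
	contextDir, tag := os.Args[1], os.Args[2]

	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}

	// Tar the whole build context into a single stream, as the Docker CLI does.
	buildCtx, err := archive.TarWithOptions(contextDir, &archive.TarOptions{})
	if err != nil {
		panic(err)
	}
	defer buildCtx.Close()

	// Submit the build to Dockerd and stream the build output back.
	resp, err := cli.ImageBuild(context.Background(), buildCtx, types.ImageBuildOptions{
		Tags:       []string{tag},
		Dockerfile: "Dockerfile",
		NoCache:    true,
	})
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	if _, err := io.Copy(os.Stdout, resp.Body); err != nil {
		panic(err)
	}
}

Whichever client is used, the entire context still has to be tarred and streamed over the Docker socket, so this path cannot avoid the IO cost of the context itself.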

2.1 Compile to binary file

GOOS=linux GOARCH=amd64 go build  -o build main.go

2.2 Submitting build tasks

time ./build ./ test:v3

real    5m12.758s
user    0m2.182s
sys     0m14.169s

Using this CLI tool written in Go, submitting the build context to Dockerd for building took dramatically longer (5m12s versus 1m45s), and the load on the build machine spiked at the same time.

There may be other optimization points here that would require more debugging. The Docker CLI itself, however, offers parameters intended to reduce the IO footprint.

3. The compress and stream build parameters offer little optimization

--compress gzips the build context for transfer, while --stream transfers the context as a stream.

3.1 Optimization with compress

time DOCKER_BUILDKIT=1 docker build --no-cache -t test:v3 -f Dockerfile . --compress

real    1m46.117s
user    0m18.551s
sys     0m7.803s

3.2 Optimization with stream

time DOCKER_BUILDKIT=1 docker build --no-cache -t test:v3 -f Dockerfile . --stream

real    1m51.825s
user    0m19.399s
sys     0m7.657s

Neither parameter has much effect on reducing the build time. Note, however, that the test project has files that are both large and numerous; the results may differ for other test cases. Next, let's look at how the number of files and the total file size affect Dockerd image builds.

4. The number of files has much less impact on COPY than file size

4.1 Preparing test files

du -h --max-depth=1

119M    ./data
119M    .

A 119 MB file is placed in the data directory and duplicated repeatedly to grow the build context.
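
One way to script this duplication is sketched below; the source file name data/payload is hypothetical, the test only needs identical copies to grow the context.

package main

import (
	"fmt"
	"io"
	"os"
)

func main() {
	const src = "data/payload" // hypothetical name of the 119 MB file
	copies := 10               // each copy adds ~119 MB to the build context

	for i := 1; i <= copies; i++ {
		in, err := os.Open(src)
		if err != nil {
			panic(err)
		}
		out, err := os.Create(fmt.Sprintf("%s-copy-%d", src, i))
		if err != nil {
			panic(err)
		}
		if _, err := io.Copy(out, in); err != nil {
			panic(err)
		}
		out.Close()
		in.Close()
	}
}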

4.2 Testing Dockerfile

FROM golang:1.13

COPY ./ /go/src/code

4.3 Build commands

DOCKER_BUILDKIT=1 docker build --no-cache -t test:v3 -f Dockerfile .

4.4 Test: file size has a significant effect on COPY

file size    build time    number of files
119M         0.3s          1
237M         0.4s          2
355M         0.5s          3
473M         0.6s          4
1.3G         3.7s          11
2.6G         9.0s          22

File size has a significant impact on COPY, with build time growing nearly linearly with context size.

4.5 Test: the number of files has little effect on COPY

file size    build time    number of files
2.9G         13.8s         264724
5.6G         37.1s         529341

The number of files does not have a significant impact on COPY. This is because when the Docker CLI sends the build context to Dockerd, it tarballs the context and does not transfer it file by file.
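
To see why, the sketch below (not the Docker CLI's actual code) tars a context directory into one stream and counts the bytes. Per-file overhead is only a 512-byte tar header plus block padding, so the stream size, and with it the transfer cost, is dominated by total file content rather than by the number of files.

package main

import (
	"archive/tar"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// countingWriter counts bytes written instead of storing them.
type countingWriter struct{ n int64 }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += int64(len(p))
	return len(p), nil
}

func main() {
	contextDir := os.Args[1]

	counter := &countingWriter{}
	tw := tar.NewWriter(counter)

	// Walk the context and append every regular file to the tar stream.
	// Symlinks and permissions are ignored for brevity.
	err := filepath.WalkDir(contextDir, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(contextDir, path)
		if err != nil {
			return err
		}
		hdr.Name = filepath.ToSlash(rel)
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f) // file content dominates the stream size
		return err
	})
	if err != nil {
		panic(err)
	}
	tw.Close()

	fmt.Printf("tar stream size: %d MB\n", counter.n/1024/1024)
}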

4.6 Disk IO is the bottleneck for concurrent builds

Test case: 5.6 GB, 529,341 files.

concurrency    build time
1              37.1s
2              46s
3              81s

With iotop you can watch the disk write speed in real time; it peaks at around 200 MB/s, which is close to the file system's 4K random write speed:

Rand_Write_Testing: (groupid=0, jobs=1): err= 0: pid=30436
  write: IOPS=37.9k, BW=148MiB/s (155MB/s)(3072MiB/20752msec); 0 zone resets

When multiple builds share a single Dockerd, both Dockerd throughput and the system's disk IO become bottlenecks under concurrency.
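
A rough way to reproduce this kind of concurrency test is to launch several docker build commands against the same context at once and measure the wall-clock time. A sketch (the tag names and concurrency level are placeholders):

package main

import (
	"fmt"
	"os"
	"os/exec"
	"sync"
	"time"
)

func main() {
	n := 3 // concurrency level to test

	start := time.Now()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine runs its own BuildKit-enabled build of the same context.
			cmd := exec.Command("docker", "build", "--no-cache",
				"-t", fmt.Sprintf("test:concurrent-%d", i), "-f", "Dockerfile", ".")
			cmd.Env = append(os.Environ(), "DOCKER_BUILDKIT=1")
			cmd.Stdout = os.Stdout
			cmd.Stderr = os.Stderr
			if err := cmd.Run(); err != nil {
				fmt.Println("build", i, "failed:", err)
			}
		}(i)
	}
	wg.Wait()
	fmt.Println("total wall time:", time.Since(start))
}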

5. Not clearing the BuildKit cache has little effect on new builds

If the commands below are not found, you need to enable Docker's EXPERIMENTAL features; if buildx is missing, download docker-buildx into the /usr/libexec/docker/cli-plugins/ directory.

  • View build cache

    docker system df  -v
    
  • Clear all build cache

    DOCKER_BUILDKIT=1 docker builder prune -f 
    

Build cache is generated only when BuildKit is enabled. The cache size in our production environment is 1.408 TB, but comparing builds before and after clearing it shows no significant speed change for new projects; for old projects with no changes, builds are fast once they hit the cache. A likely reason is that the cache is large but does not contain many entries, so the overhead of checking whether a cached entry can be used is small.

However, regularly scheduled cache cleanup helps avoid the risk of running out of disk space.

  • Regularly clean up the remote build cache

Clear cache entries older than 72 hours.

    DOCKER_CLI_EXPERIMENTAL=enabled docker buildx prune --filter "until=72h" -f
    

6. Builds do not limit CPU, but IO is slow

6.1 Testing CPU limits

The Dockerfile:

FROM ubuntu
RUN apt-get update -y
RUN apt-get install -y stress
RUN stress -c 40

The build command:
DOCKER_BUILDKIT=1 docker build --no-cache -t test:v3 -f Dockerfile .

The build machine has 40 cores, and its CPU load can reach 95% during the build, which means that Dockerd does not limit CPU consumption during builds by default. In production there are scenarios where npm run build consumes more than 10 GB of memory, so Dockerd presumably does not limit memory consumption by default either.

6.2 Testing IO in Dockerfile

The Dockerfile:

FROM ubuntu
RUN apt-get update -y
RUN apt-get install -y fio
RUN fio -direct=1 -iodepth=128 -rw=randwrite -ioengine=libaio -bs=4k -size=3G -numjobs=1 -runtime=1000 -group_reporting -filename=/tmp/test.file --allow_mounted_write=1 -name=Rand_Write_Testing

The build command and the resulting fio output:
DOCKER_BUILDKIT=1 docker build --no-cache -t test:v3 -f Dockerfile . 

Rand_Write_Testing: (groupid=0, jobs=1): err= 0
   write: IOPS=17.4k, BW=67.9MiB/s (71.2MB/s)(3072MiB/45241msec); 0 zone resets

6.3 Testing IO in containers

docker run -it shaowenchen/demo-fio bash

Running the same fio test inside the container gives:
Rand_Write_Testing: (groupid=0, jobs=1): err= 0
  write: IOPS=17.4k, BW=68.1MiB/s (71.4MB/s)(3072MiB/45091msec); 0 zone resets

6.4 Testing IO in a container’s storage volume

docker run -v /tmp:/tmp -it shaowenchen/demo-fio bash

Running the fio test against the bind-mounted /tmp volume gives:
Rand_Write_Testing: (groupid=0, jobs=1): err= 0
  write: IOPS=39.0k, BW=152MiB/s (160MB/s)(3072MiB/20162msec); 0 zone resets

6.5 Testing IO on the host

Running the same fio test directly on the host gives:
Rand_Write_Testing: (groupid=0, jobs=1): err= 0
  write: IOPS=38.6k, BW=151MiB/s (158MB/s)(3072MiB/20366msec); 0 zone resets

When Dockerd builds a Dockerfile, each RUN command starts a container, runs the command, and then commits the result as an image layer. The test results show that IO speed inside a Dockerfile RUN is far below that of the host and matches the IO speed inside a container, while IO on a bind-mounted volume matches the host's IO speed.

7. Using buildkitd directly does not work well

While BuildKit builds can be enabled in Docker with DOCKER_BUILDKIT=1, using buildkitd directly would be a good option to replace Dockerd for builds if it performed well.

7.1 Installing buildkit

wget https://github.com/moby/buildkit/releases/download/v0.11.2/buildkit-v0.11.2.linux-amd64.tar.gz
tar xvf buildkit-v0.11.2.linux-amd64.tar.gz
mv bin/* /usr/local/bin/

7.2 Deploying buildkitd

cat > /usr/lib/systemd/system/buildkitd.service <<EOF
[Unit]
Description=/usr/local/bin/buildkitd
ConditionPathExists=/usr/local/bin/buildkitd
After=containerd.service

[Service]
Type=simple
ExecStart=/usr/local/bin/buildkitd
User=root
Restart=on-failure
RestartSec=1500ms

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl restart buildkitd
systemctl enable buildkitd
systemctl status buildkitd

Just check that buildkitd is running normally.

7.3 Testing builds submitted with buildctl

buildctl build --frontend=dockerfile.v0 --local context=. --local dockerfile=. --no-cache --output type=docker,name=test:v4 | docker load

[+] Building 240.8s (7/7) FINISHED

Submitting the build to buildkitd with buildctl takes even more time, about 4 minutes, roughly double the previous time.

8. There is a bottleneck in reading and writing images with the current storage driver

8.1 Reviewing Dockerd's processing logic

The logic for handling Dockerfile can be found in the code https://github.com/moby/moby/blob/8d193d81af9cbbe800475d4bb8c529d67a6d8f14/builder/dockerfile/dispatchers.go.

  1. ADD and COPY both call the performCopy function
  2. performCopy calls NewRWLayer() to create a new layer and exportImage to write the data

Therefore, the suspicion is that Dockerd is slow at writing image layers.

8.2 Testing image layer write speed

Prepare an image, 16 GB in size, with 18 layers in total.

  • Import the image

    time docker load < /tmp/16GB.tar
    
    real    2m43.288s
    
  • Save image

    time docker save 0d08de176b9f > /tmp/16GB.tar
    
    real    2m48.497s
    

docker load and docker save run at about the same speed, processing image layers at roughly 100 MB/s (16 GB in about 165 s). This is nearly 30% below the disk's 4K random write speed. In my opinion this is barely acceptable for personal use; for a platform that provides build services to others, such a disk is clearly inadequate.

8.3 How to choose a storage driver

The following is a comparison table compiled from https://docs.docker.com/storage/storagedriver/select-storage-driver/.

Storage Driver    Filesystem Requirement    High-Frequency Write Performance    Stability    Other
overlay2          xfs, ext4                 Poor                                Good         Currently preferred
fuse-overlayfs    No restriction            -                                   -            For rootless scenarios
btrfs             btrfs                     Good                                -            -
zfs               zfs                       Good                                -            -
vfs               No restriction            -                                   -            Not recommended for production
aufs              xfs, ext4                 -                                   Good         Preferred on Docker 18.06 and earlier; no longer maintained
devicemapper      direct-lvm                Good                                Good         No longer maintained
overlay           xfs, ext4                 Poor, but better than overlay2      -            No longer maintained

Excluding the unmaintained and non-production drivers, there are not many options left. It so happens that a machine initialized some time ago has a disk formatted with Btrfs, which can be used for testing. (The zfs storage driver is recommended for high-density PaaS systems.)

8.4 Testing the Btrfs Storage Driver

  • On the host

    Rand_Write_Testing: (groupid=0, jobs=1): err= 0
    write: IOPS=40.0k, BW=160MiB/s (168MB/s)(3072MiB/19191msec); 0 zone resets
    
  • Test commands inside the container

    Run the container:

    docker run -it shaowenchen/demo-fio bash
    

    Execute the test.

    fio -direct=1 -iodepth=128 -rw=randwrite -ioengine=libaio -bs=4k -size=3G -numjobs=1 -runtime=1000 -group_reporting -filename=/data/test.file --allow_mounted_write=1 -name=Rand_Write_Testing
    
  • Testing the overlay2 storage driver

      docker info
      
      Server Version: 20.10.12
      Storage Driver: overlay2
          Backing Filesystem: btrfs
      
      Rand_Write_Testing: (groupid=0, jobs=1): err= 0: pid=78: Thu Feb  2 02:41:48 2023
      write: IOPS=21.5k, BW=84.1MiB/s (88.2MB/s)(3072MiB/36512msec); 0 zone resets
      
  • Testing the btrfs storage driver

    docker info
    
    Server Version: 20.10.12
    Storage Driver: btrfs
    Build Version: Btrfs v5.4.1 
    
    Rand_Write_Testing: (groupid=0, jobs=1): err= 0
    write: IOPS=39.8k, BW=156MiB/s (163MB/s)(3072MiB/19750msec); 0 zone resets
    

You can clearly see that the btrfs storage driver outperforms overlay2 in terms of speed.
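
As an aside, the active storage driver and the details shown by docker info above can also be read programmatically through the Docker Go SDK; a small sketch:

package main

import (
	"context"
	"fmt"

	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}

	info, err := cli.Info(context.Background())
	if err != nil {
		panic(err)
	}

	// Driver is the active storage driver, e.g. "overlay2" or "btrfs";
	// DriverStatus carries driver-specific details such as the backing filesystem.
	fmt.Println("storage driver:", info.Driver)
	for _, kv := range info.DriverStatus {
		fmt.Printf("  %s: %s\n", kv[0], kv[1])
	}
}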

9. Summary

This article records the process of troubleshooting slow IO during Dockerfile builds in a production environment.

Troubleshooting like this takes a lot of patience: designing test cases and verifying each factor one by one, while it is easy to head in the wrong direction and reach the wrong conclusion.

The main points of this article are as follows.

  • The compress and stream parameters do not necessarily improve build speed
  • Reducing the build context size helps relieve IO pressure during builds
  • BuildKit's cache does not need to be cleaned frequently
  • CPU and memory are not limited while Dockerfile commands run during a build, but IO is slow
  • Building with buildkitd directly is not as fast as Dockerd with DOCKER_BUILDKIT enabled
  • The Btrfs storage driver offers better IO speed

But the simplest fix is to use disks with fast 4K random read/write speeds. Before putting a new environment into production, be sure to test it first and proceed with subsequent plans only once it meets your needs.

10. Ref

  • https://www.chenshaowen.com/blog/troubleshoot-slow-io-when-building-dockerfile.html