A container is a running instance of an image: a runtime for the stateless application packaged in that image. It is called stateless because a container’s lifecycle is very flexible. When that lifecycle ends, the data generated during it does not persist but is removed together with the container. However, most applications today revolve around data, so this article explores how container data can be persisted.

Storage for containers

Before we discuss container persistence, let’s explore what data storage looks like without persistence, i.e. a normal container.

Let’s run a container.

# docker run -d ubuntu sleep 100000
# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS              PORTS               NAMES
3a8fed62029c        ubuntu              "sleep 100000"      About a minute ago   Up About a minute                       hopeful_kilby

Because a container is by nature just a process on the host, every file generated inside it must also be stored somewhere on the host. To see this, we can exec into the container and create a file.

# docker exec -it 3a8fed62029c bash
root@3a8fed62029c:/# cd home/
root@3a8fed62029c:/home# echo "dao" > hello
root@3a8fed62029c:/home# ls
hello
root@3a8fed62029c:/home# cat hello
dao

Once created, this file can be found on the host under /var/lib/docker/aufs/mnt (this host uses the aufs storage driver).

root@ubuntu:/var/lib/docker/aufs/mnt# find -name hello
./24c7a79f3a3bcb9320e028dab081eec7f55a5a8fb8eb2201d9cfe7b1d9d0e7bf/home/hello

Looking inside /var/lib/docker/aufs/mnt/24c7a79f3a3bcb9320e028dab081eec7f55a5a8fb8eb2201d9cfe7b1d9d0e7bf reveals that this directory is the container’s entire filesystem.

If you create files in this directory on the host, they show up inside the container as well.

The principle behind this is simple, and the evidence can be found under /var/lib/docker/aufs/layers.

root@ubuntu:/var/lib/docker/aufs/layers# cat 24c7a79f3a3bcb9320e028dab081eec7f55a5a8fb8eb2201d9cfe7b1d9d0e7bf
24c7a79f3a3bcb9320e028dab081eec7f55a5a8fb8eb2201d9cfe7b1d9d0e7bf-init
d7b377dff0a5d7b84ae6f30cab5d94b668406e30e8f1584e0eab567b45e27a60
e00588b1d53b6376da8ac07a5176bf44c246a5c1077fd56cf41979565ab8e290
5c05060aac0e7a11db24402fc2639d3fd47dc3ab8a996d279a28ac6d73d1217b
3cba5f639a5bca0a7d7d4b28b68151d8a10101583f1b59f48328ade9f7c668dd
ac5fe0af9b19fe9bb88448932b3c7866e942dbdc9666457195183a2af7caf9f9

As you know, images are stored in layers, and a container is simply the layers of its image with an additional read/write layer on top for data storage.

[Figure: Docker image layers]

Container storage is therefore nothing more than a mapping of the files inside the container to a directory on the host, namely this read/write layer. However, when the container’s lifecycle ends, this host directory ceases to exist as well.
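
The way a file is found through stacked layers can be modeled with ordinary directories. This is only a toy sketch, not how aufs is implemented: the lookup function and the rw and image directory names are hypothetical, and a real union filesystem also handles whiteouts and copy-on-write, which are left out here.

```shell
#!/bin/sh
# Toy model of a layered filesystem built from plain directories.
# lookup and the layer names are illustrative only; real aufs also
# handles whiteouts and copy-on-write, which this sketch omits.

# lookup REL_PATH LAYER... : print the first copy of REL_PATH found,
# searching the layers in the order given (topmost layer first).
lookup() {
    rel=$1
    shift
    for layer in "$@"; do
        if [ -e "$layer/$rel" ]; then
            printf '%s\n' "$layer/$rel"
            return 0
        fi
    done
    return 1
}

work=$(mktemp -d)
mkdir -p "$work/rw/home" "$work/image/etc"
echo "from image" > "$work/image/etc/motd"   # file shipped in the image
echo "dao" > "$work/rw/home/hello"           # file written at runtime

lookup home/hello "$work/rw" "$work/image"   # served by the read/write layer
lookup etc/motd "$work/rw" "$work/image"     # falls through to the image layer

# Removing the container discards its read/write layer, and the file with it.
rm -rf "$work/rw"
lookup home/hello "$work/rw" "$work/image" || echo "home/hello is gone"
```

The image layer is untouched by the removal, which is why the same image can be used to start a fresh container afterwards.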

For this reason, we need to consider how to persist the data generated by the container.

Persistence and Volume

There are two ways to persist container-generated data, based on two completely different ideas. The first is to use docker commit to persist the temporary read/write layer, which otherwise shares the container’s lifecycle.

# docker commit 3a8fed62029c
sha256:c82ef37ba95e8cc6b03a487294385e76d774f46bb225496f9278b50c0137175c
# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
<none>              <none>              c82ef37ba95e        3 seconds ago       126.6 MB

In this way, the layer that already contains the data is permanently stored in the newly generated image, and the data can be used again by starting a new container from that image.
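
One way to check that the committed data really is in the image is to start a throwaway container from it and read the file back. The following is a minimal sketch: read_from_image is a hypothetical helper name, and it assumes that a Docker daemon is available and that the image contains cat.

```shell
# read_from_image IMAGE PATH: print a file stored in an image by starting
# a temporary container from it (--rm deletes that container afterwards).
# read_from_image is a hypothetical helper name, not a Docker command.
read_from_image() {
    docker run --rm "$1" cat "$2"
}

# Usage, with the image ID returned by the commit (needs a Docker daemon):
#   read_from_image c82ef37ba95e /home/hello
```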

However, this approach has obvious limitations: a commit is a one-time snapshot, so it does not meet the needs of data that keeps changing at runtime. That brings us to the second persistence method, the volume.

Volumes are also very simple to use: just add the -v parameter to docker run. We will explore three ways of using volumes.

The first way, which is the simplest, is to mount a volume directly to a container.

We run a container with a volume mounted at /volume.

# docker run -d -v /volume ubuntu sleep 100000

With the inspect command, we can see the volume information of this container.

"Mounts": [
  {
    "Name": "8aa697c21e65e104dcb9f8b8507c905715d457ad88ee3c2e79a0c72ef07fff0e",
    "Source": "/var/lib/docker/volumes/8aa697c21e65e104dcb9f8b8507c905715d457ad88ee3c2e79a0c72ef07fff0e/_data",
    "Destination": "/volume",
    "Driver": "local",
    "Mode": "",
    "RW": true,
    "Propagation": ""
  }
],

We can see that Docker mounts a directory under /var/lib/docker/volumes on the host at /volume inside the container.
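
Rather than scanning the full inspect output, the mount sources can be pulled out with a Go template through the --format option of docker inspect. A minimal sketch, in which mount_sources is a hypothetical helper name and a running Docker daemon is assumed:

```shell
# mount_sources CONTAINER: print the host-side Source path of every
# entry in the container's "Mounts" list, one path per line.
# mount_sources is a hypothetical helper name, not a Docker command.
mount_sources() {
    docker inspect --format '{{ range .Mounts }}{{ println .Source }}{{ end }}' "$1"
}

# Usage (needs a Docker daemon and an existing container):
#   mount_sources f5390aec14bc
```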

If we go back to /var/lib/docker/aufs/mnt, find the directory corresponding to this container, and make changes in its /volume directory, the files inside the container do not change. If we instead change the files in /var/lib/docker/volumes/8aa697c21e65e104dcb9f8b8507c905715d457ad88ee3c2e79a0c72ef07fff0e/_data, the corresponding changes appear inside the container.

So now we are using a volume, but what is the difference compared with not using one?

The difference is that the volume directory does not share the container’s lifecycle: it remains on the host after the container’s lifecycle is over.

# docker volume ls
DRIVER              VOLUME NAME
local               8aa697c21e65e104dcb9f8b8507c905715d457ad88ee3c2e79a0c72ef07fff0e
# docker rm -f f5390aec14bc
f5390aec14bc
# docker volume ls
DRIVER              VOLUME NAME
local               8aa697c21e65e104dcb9f8b8507c905715d457ad88ee3c2e79a0c72ef07fff0e

This solves the problem of persisting the data generated during the lifetime of a stateless container. However, some inconveniences remain: the volume is awkward to manage, and so is running another container that inherits the data generated by the previous one.

The second way addresses this: mount a directory of the host into the container.

We can run a container and mount the local /root/data directory at /volume in the container.

# docker run -d -v /root/data:/volume ubuntu sleep 1000000
9548d507859b6a331a167a8f60d6d96ccdf2b0244c6bdd24033ccf277b89d162
# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
9548d507859b        ubuntu              "sleep 1000000"     54 seconds ago      Up 53 seconds                           drunk_swanson

We can still get volume-related information by using the inspect command.

"Mounts": [
  {
    "Source": "/root/data",
    "Destination": "/volume",
    "Mode": "",
    "RW": true,
    "Propagation": "rprivate"
  }
],

This time Docker does not create a directory under /var/lib/docker/volumes. Instead, it creates /root/data on the host (if it does not already exist) and mounts that directory directly at /volume inside the container.
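
The two forms of the -v argument seen so far differ only in whether a host path precedes the colon. The sketch below classifies an argument by its shape. classify_v is a hypothetical helper, not part of Docker. It also recognizes a third form that this article does not cover, a volume name in place of the host path, and it assumes host directories are given as absolute paths (newer Docker CLI versions may also accept relative host paths, which this sketch does not handle).

```shell
# classify_v ARG: report which kind of mount a -v argument requests,
# judging only by the shape of the string.
# classify_v is a hypothetical helper; Windows paths are not handled.
classify_v() {
    case $1 in
        /*:*) echo "host directory" ;;     # e.g. /root/data:/volume
        *:*)  echo "named volume" ;;       # e.g. mydata:/volume (not covered here)
        /*)   echo "anonymous volume" ;;   # e.g. /volume
        *)    echo "invalid" ;;
    esac
}

classify_v /volume              # prints: anonymous volume
classify_v /root/data:/volume   # prints: host directory
```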

The third way is to create a data volume container and then mount its volumes into a newly created container.

The data volume container itself can be created in either of the two ways described above, so let’s look at how each kind of data volume affects the newly created container.

We start by running a new container that uses the container with the plain volume (the first way above) as its data volume container.

# docker run -d --volumes-from f5390aec14bc ubuntu sleep 100000
8cfe0eb466e1935cafe353195deebd0debd677433625d05770b2458755c1a32b
# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
8cfe0eb466e1        ubuntu              "sleep 100000"      8 seconds ago       Up 8 seconds                            evil_nobel

Then we inspect this container.

"Mounts": [
  {
    "Name": "8aa697c21e65e104dcb9f8b8507c905715d457ad88ee3c2e79a0c72ef07fff0e",
    "Source": "/var/lib/docker/volumes/8aa697c21e65e104dcb9f8b8507c905715d457ad88ee3c2e79a0c72ef07fff0e/_data",
    "Destination": "/volume",
    "Driver": "local",
    "Mode": "",
    "RW": true,
    "Propagation": ""
  }
],

We see the same information as before: this container does not create any volume of its own, but simply reuses the volume of the data volume container.

Similarly, using a data volume container created in the second way gives a similar result.

"Mounts": [
  {
    "Source": "/root/data",
    "Destination": "/volume",
    "Mode": "",
    "RW": true,
    "Propagation": "rprivate"
  }
],

The newly created container does not create a volume, but directly uses the original container’s volume. Therefore, if multiple containers need to share a directory, one container can create the volume and the other containers can attach to it with --volumes-from.