cgroups (control groups) is a feature provided by the Linux kernel that limits, accounts for, and isolates the system resources (such as CPU, memory, disk I/O, network, etc.) used by a group of processes.
In the previous article we looked at the role that namespaces play in container technology. If namespaces control what the processes in a container can see, then cgroups control how many resources those processes can use. Namespaces provide process isolation and cgroups provide resource limiting; together they are the basis for building containers.
In this article, we follow on from the Namespace article: we create a container, observe how the cgroups on the host change to see how cgroups work, and then learn how to configure cgroups ourselves.
When to create a cgroup
The Linux kernel provides an interface for managing cgroups through a pseudo-file system called cgroupfs. We can list existing cgroups on the system with the lscgroup command, which actually traverses the files in the /sys/fs/cgroup/ directory.
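To make the traversal concrete, here is a minimal sketch that walks a cgroupfs mount and prints one line per directory, since every directory there represents a cgroup. The helper name `list_cgroups` is hypothetical, and the output only approximates what lscgroup prints; it is not how lscgroup is actually implemented.

```shell
# list_cgroups [ROOT]: print every cgroup directory below ROOT, relative to ROOT.
# Hypothetical helper; ROOT defaults to the usual cgroupfs mount point.
list_cgroups() {
    root=${1:-/sys/fs/cgroup}
    # find prints "." for the root and "./a/b" for children;
    # sed turns those into "/" and "/a/b".
    ( cd "$root" && find . -type d ) | sed 's|^\.||; s|^$|/|' | sort
}

# Example on the host:
#   list_cgroups /sys/fs/cgroup
```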
If you are using a Linux distribution that does not have the lscgroup command, you can download and install it using the command provided by command-not-found.com.
We save the output to a cgroup.a file. Next, start a container in another window following the steps in the Namespace article.
Go back to the original window and execute the lscgroup command again.
Now compare the two outputs of the lscgroup command.
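One way to do the comparison is with diff. The sketch below assumes the second snapshot was saved to a file named cgroup.b (a hypothetical name; only cgroup.a is mentioned above) and prints the lines that appear only in the later snapshot.

```shell
# new_cgroups BEFORE AFTER: print lines that appear only in the AFTER snapshot.
# Hypothetical helper built on diff.
new_cgroups() {
    # diff marks lines unique to the second file with "> "; keep only those.
    diff "$1" "$2" | sed -n 's/^> //p'
}

# Typical use on the host:
#   lscgroup > cgroup.b
#   new_cgroups cgroup.a cgroup.b
```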
As you can see from the results, after the mybox container is created, a new cgroup of each type is created specifically for it on the system.
How cgroups control the resources of a container
A cgroup controls how much memory, CPU, network, and other resources a process or group of processes can use. A cgroup’s tasks list contains the PIDs of the processes it controls; tasks is actually a file in cgroupfs.
init process
We first print information about the container’s processes from the host and find the container’s init process.
Then we print the tasks lists of a few types of cgroups.
The process is straightforward: after the container is created, the container’s init process is added to the cgroups created for that container, and we can get a more definite result with /proc/$PID/cgroup.
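Each line of /proc/$PID/cgroup has the form hierarchy-ID:controller-list:cgroup-path. As a small sketch, the hypothetical helper below extracts the cgroup path for one controller from such a file; the helper name and its argument order are assumptions made for illustration.

```shell
# cgroup_path_for FILE CONTROLLER: print the cgroup path recorded for
# CONTROLLER in a /proc/<pid>/cgroup style file (hypothetical helper).
cgroup_path_for() {
    awk -F: -v ctrl="$2" '{
        # The second field may list several controllers, e.g. "cpu,cpuacct".
        n = split($2, names, ",")
        for (i = 1; i <= n; i++) {
            if (names[i] == ctrl) {
                path = $0
                sub(/^[^:]*:[^:]*:/, "", path)   # keep everything after the 2nd colon
                print path
                exit
            }
        }
    }' "$1"
}

# Example on the host, using the init PID of the container:
#   cgroup_path_for /proc/2250/cgroup memory
```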
Other processes in the container
Next we run a new process in the mybox container.
Let's see whether a new cgroup is created.
Since a cgroup can control a group of processes, we assume that any new processes created in the running container will be added to the cgroups to which the init process belongs.
To verify this, first find the PID of the newly created process.
The PID of the new process is 2576. Next, we print the cgroup information for that process.
The output is identical to that of the PID 2250 process, and we can also print the tasks list of one of the cgroups.
Exactly as expected. In fact, writing a process's PID directly to the tasks file is what adds the process to that cgroup. When a container is created, a new cgroup is created for each type of resource, and all processes running in the container are added to these cgroups.
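The tasks-file mechanism can be sketched in a few lines of shell. The helper names are hypothetical, and the example path /sys/fs/cgroup/memory/mybox is an assumption about where the container's cgroup lives on the host. The sketch uses the cgroup v1 file name tasks; on a cgroup v2 system the corresponding file is cgroup.procs.

```shell
# add_to_cgroup DIR PID: move PID into the cgroup by writing it to DIR/tasks.
# On a real cgroupfs this needs root, and each write adds one process.
add_to_cgroup() {
    echo "$2" > "$1/tasks"
}

# in_cgroup DIR PID: succeed if PID is listed in DIR/tasks.
in_cgroup() {
    grep -qx "$2" "$1/tasks"
}

# Example on the host (path is an assumption):
#   add_to_cgroup /sys/fs/cgroup/memory/mybox 2576
#   in_cgroup /sys/fs/cgroup/memory/mybox 2576 && echo "2576 is a member"
```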
By controlling all processes running in the container, cgroups implements resource limits for the container.
How to configure cgroups
Here we take the memory cgroup as an example to see how to configure a cgroup so that it limits the memory of the mybox container.
There are two ways to configure a cgroup: by directly modifying the relevant files in cgroupfs, or by using a higher-level tool such as runc or docker.
File system method
By means of cgroupfs, you can view/set the limits of a cgroup by viewing/modifying specific files in that cgroup’s directory.
The maximum available memory is set by modifying the memory.limit_in_bytes file. We have not set any limit for this container yet, so the current value of the memory limit is a meaninglessly large number. We now write the new value directly to this file.
This sets a new memory limit. Once the new limit is written, the processes in the container cannot use more than 100M of memory in total. If they exceed it, processes in the container are killed or put to sleep, according to the OOM policy set in the memory.oom_control file.
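As a minimal sketch of that write, the hypothetical helper below stores a limit in memory.limit_in_bytes and reads it back. The cgroup directory in the example is an assumption, and the file name applies to cgroup v1.

```shell
# set_memory_limit DIR BYTES: write a new limit to DIR/memory.limit_in_bytes
# and print the value that is read back from the file.
set_memory_limit() {
    echo "$2" > "$1/memory.limit_in_bytes"
    cat "$1/memory.limit_in_bytes"
}

# 100M expressed in bytes: 100 * 1024 * 1024 = 104857600
# Example on the host, as root (path is an assumption):
#   set_memory_limit /sys/fs/cgroup/memory/mybox 104857600
```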
High-level tools approach
Configuring cgroups through the interfaces that higher-level tools provide is friendlier, although under the hood these tools also modify cgroupfs as described above.
For runc, the config.json file in the filesystem bundle needs to be modified to configure the cgroup. Setting the memory limit requires modifying the linux.resources field in the JSON object; the limit itself goes in linux.resources.memory.limit, in bytes.
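To show the shape of that field, the sketch below prints the relevant fragment of config.json for a given limit. The field names follow the OCI runtime specification; the helper name is hypothetical, and the fragment is meant to be merged by hand into the existing linux object rather than used as a complete file.

```shell
# memory_resources_fragment BYTES: print the part of config.json that carries
# the memory limit. Merge it into the existing "linux" object of the bundle.
memory_resources_fragment() {
    cat <<EOF
"linux": {
  "resources": {
    "memory": {
      "limit": $1
    }
  }
}
EOF
}

# Example: memory_resources_fragment 104857600
```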
For docker it is even simpler. It is a wrapped, user-oriented tool, and the memory limit can be specified with the --memory option when executing the docker run command. This parameter is actually written to config.json and used by the runtime implementation runc, which in turn changes cgroupfs.
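A small wrapper illustrates the docker side. The function name `run_limited` is hypothetical, and busybox is only an example image; the sketch simply passes the limit through to docker run via --memory.

```shell
# run_limited LIMIT IMAGE [COMMAND...]: start a throwaway container whose
# memory is capped at LIMIT (for example 100m). Hypothetical wrapper.
run_limited() {
    limit=$1
    image=$2
    shift 2
    docker run --rm --memory "$limit" "$image" "$@"
}

# Example (busybox is only an example image):
#   run_limited 100m busybox free -m
```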