Overview

cgroups (control groups) are the Linux kernel mechanism for controlling the resources that processes can use, such as CPU, memory, and huge pages. cgroups are divided into subsystems, and each resource is managed by its own subsystem.

cgroups are exposed to user space through a file system and can be organized hierarchically. The hierarchy is presented as a directory tree. For example, creating a subdirectory under the cgroup cpu directory is equivalent to creating a child cgroup under the root cpu cgroup, and the child cgroup is bound by the restrictions of the parent cgroup.
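Because every cgroup is a directory, the cgroup that a process belongs to can be written as a path. The sketch below reads that path from the contents of /proc/<pid>/cgroup, assuming the cgroup v1 line format hierarchy-ID:controller-list:cgroup-path. The function name cpu_cgroup_path is a hypothetical helper, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print the cgroup path of the "cpu" controller.
# Input (stdin): the contents of a /proc/<pid>/cgroup file, where each
# cgroup v1 line looks like  hierarchy-ID:controller-list:cgroup-path
cpu_cgroup_path() {
    awk -F: '
        {
            n = split($2, controllers, ",")
            for (i = 1; i <= n; i++) {
                if (controllers[i] == "cpu") {
                    # Everything after the second colon is the path,
                    # even if the path itself contains a colon.
                    print substr($0, length($1) + length($2) + 3)
                    exit
                }
            }
        }
    '
}

# Usage: cpu_cgroup_path < /proc/self/cgroup
```

The printed path is relative to the mount point of the cpu hierarchy, so a result of /stress would correspond to the directory /sys/fs/cgroup/cpu/stress on a typical v1 layout.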

There are currently two versions of cgroups, v1 and v2, and their designs differ considerably. The underlying concepts are similar, however, so what you learn about one version largely carries over to the other. The rest of this article explains the cgroup v1 cpu subsystem.

Using the cpu subsystem

cgroups are often described in rather abstract terms. Here is a simple example to show how they work.

First, start a stress process with one CPU worker on the machine, then look at the CPU usage of that process.

$ stress -c 1

$ pidstat -p 480164 1
Linux 4.14.81.bm.26-amd64 (n251-254-159)    06/01/2021  _x86_64_    (8 CPU)
02:36:56 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
02:36:57 PM  1001    480164  100.00    0.00    0.00  100.00     6  stress

As you can see, the stress process takes up one full CPU. Now create a cgroup called stress to limit its CPU usage.

$ cd /sys/fs/cgroup/cpu

$ mkdir stress && cd stress

# Writing the PID to cgroup.procs moves the process into this cgroup
$ echo 480164 > cgroup.procs

$ echo 100000 > cpu.cfs_period_us

$ echo 50000 > cpu.cfs_quota_us

# Check the current CPU usage again
$ pidstat -p 480164 1
Linux 4.14.81.bm.26-amd64 (n251-254-159)    06/04/2021  _x86_64_    (8 CPU)

05:17:49 AM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
05:17:50 AM  1001   480164   50.00    0.00    0.00   50.00     6  stress

The operations above limit the CPU usage of a process by configuring the cpu.cfs_period_us and cpu.cfs_quota_us parameters: a quota of 50000 us in each 100000 us period caps the process at 50% of one CPU.
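The relationship between the two files can be checked with a little arithmetic. The following is a minimal sketch, assuming a cgroup v1 directory that contains cpu.cfs_quota_us and cpu.cfs_period_us. The function name cfs_limit_percent is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the CFS limit of a cgroup v1 directory
# as a percentage of one CPU.
# $1: path to the cgroup directory, for example /sys/fs/cgroup/cpu/stress
cfs_limit_percent() {
    quota=$(cat "$1/cpu.cfs_quota_us") || return 1
    period=$(cat "$1/cpu.cfs_period_us") || return 1
    if [ "$quota" -lt 0 ]; then
        # A negative quota (-1) means that no limit is set.
        echo "unlimited"
    else
        # Integer arithmetic: the result is rounded down.
        echo "$(( quota * 100 / period ))%"
    fi
}

# Usage: cfs_limit_percent /sys/fs/cgroup/cpu/stress
```

With the values written in the example (quota 50000, period 100000) the function prints 50%, which matches the pidstat output. A result above 100% would mean the cgroup may use more than one CPU.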

cgroups also provide a cpu.shares parameter, which sets the relative weight of each cgroup's CPU usage when CPU resources are contended. Here we demonstrate this on a virtual machine with 1 CPU, creating two child cgroups to show the effect of the parameter.

$ cd /sys/fs/cgroup/cpu,cpuacct
$ mkdir stress1 && cd stress1
$ stress -c 1
$ echo 3475127 > cgroup.procs
$ echo 1024 > cpu.shares

At this point, the CPU usage of the stress process with PID 3475127 is close to 100%. In a new terminal, go to the same /sys/fs/cgroup/cpu,cpuacct directory and execute the following commands.

$ mkdir stress2 && cd stress2
$ stress -c 1
$ echo 3479833 > cgroup.procs

At this point, the CPU usage of the two stress processes is approximately equal, close to 50% each. Because cpu.shares has not been set in the stress2 cgroup, it keeps the default value of 1024, the same as stress1. Now lower the cpu.shares value of the stress2 cgroup.

$ echo 512 > cpu.shares

# Use top to view
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM
  3475127 root      20   0    7948     96      0 R  65.1   0.0
  3479833 root      20   0    7948     92      0 R  32.2   0.0

The CPU usage of the process in stress1 is roughly twice that of the process in stress2, because the cpu.shares value of stress1 is twice that of stress2. Note that cpu.shares only takes effect when there is not enough CPU to go around. On a virtual machine with 2 CPUs, the stress1 and stress2 processes would each use 100% of a CPU.
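The expected split under contention is each cgroup's cpu.shares divided by the sum of the shares of all competing sibling cgroups. The sketch below does that arithmetic, assuming all the cgroups are busy and compete for the same CPUs. The function name expected_cpu_percent is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the percentage of contended CPU time that
# a cgroup can expect, rounded down to a whole number.
# $1: cpu.shares of the cgroup of interest
# remaining arguments: cpu.shares of the competing sibling cgroups
expected_cpu_percent() {
    own=$1
    total=0
    # "$@" includes the cgroup's own shares, so total is the sum of all.
    for s in "$@"; do
        total=$(( total + s ))
    done
    echo $(( own * 100 / total ))
}

# Usage:
#   expected_cpu_percent 1024 512   # stress1 against stress2, prints 66
#   expected_cpu_percent 512 1024   # stress2 against stress1, prints 33
```

These figures are close to the 65.1% and 32.2% reported by top, with the small difference going to other activity on the machine.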

Parameter description

Several cpu parameters appeared above. They are explained together here.

  • cpu.cfs_period_us: The length of the period, in microseconds (us), over which CPU time is reallocated. CFS is the Completely Fair Scheduler, a Linux process scheduler, so this parameter applies only to processes scheduled by CFS.
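  • cpu.cfs_quota_us: The maximum amount of CPU time, in microseconds, that all processes in the cgroup together can use within one period. Combined with cpu.cfs_period_us, it caps the cgroup at cpu.cfs_quota_us / cpu.cfs_period_us CPUs: in the example above, 50000 / 100000 = 0.5, that is, 50% of one CPU, even though the machine has 8 CPUs. The quota can be larger than the period to allow more than one CPU on a multi-core machine, and a value of -1 (the default) means no limit. This parameter applies only to processes scheduled by CFS.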
  • cpu.shares: Sets the relative CPU weight of the cgroup. It only takes effect when CPU resources are contended. In the example above, the virtual machine has only 1 CPU and both processes compete for it, so with cpu.shares set to 1024 for process 1 and 512 for process 2, process 1 is allocated 2/3 of the CPU and process 2 is allocated 1/3.

In addition to the several parameters in the above example, the cgroup cpu subsystem also provides the following parameters.

  • cpu.rt_period_us: The length of the period, in microseconds, over which CPU time is reallocated for processes that use a real-time scheduling policy.
  • cpu.rt_runtime_us: The maximum amount of CPU time, in microseconds, that real-time processes in the cgroup can use within each cpu.rt_period_us period. The two parameters work together in the same way as the CFS pair described above.
  • nr_periods (reported in the cpu.stat file): A statistic. The number of enforcement periods, each of length cpu.cfs_period_us, that have elapsed.
  • nr_throttled (reported in cpu.stat): The number of times processes in the cgroup have been throttled because they used up their allocated CPU time.
  • throttled_time (reported in cpu.stat): The total amount of time, in ns, for which processes in the cgroup have been throttled.
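
In cgroup v1 the three throttling counters are fields of the cpu.stat file in the cgroup directory. The sketch below summarizes them, assuming the file holds one "name value" pair per line. The function name throttle_summary is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: summarize CFS throttling from a cgroup v1 cpu.stat file.
# $1: path to the file, for example /sys/fs/cgroup/cpu/stress/cpu.stat
throttle_summary() {
    awk '
        $1 == "nr_periods"     { periods = $2 }
        $1 == "nr_throttled"   { throttled = $2 }
        $1 == "throttled_time" { ns = $2 }
        END {
            # Avoid dividing by zero when no period has elapsed yet.
            pct = (periods > 0) ? throttled * 100 / periods : 0
            printf "throttled in %d of %d periods (%.1f%%), %.3f s in total\n",
                   throttled, periods, pct, ns / 1e9
        }
    ' "$1"
}

# Usage: throttle_summary /sys/fs/cgroup/cpu/stress/cpu.stat
```

A high percentage suggests that the quota is too small for the workload, since the processes are repeatedly running out of CPU time before the period ends.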