What is a cgroup

Cgroups are a Linux kernel feature that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, network, etc.) of a group of processes.

cgroups (Control Groups) are a mechanism provided by the Linux kernel to consolidate (or segregate) a set of system tasks and their subtasks into hierarchical groups by resource, on demand, thus providing a unified framework for system resource management. Simply put, cgroups can limit and account for the physical resources used by groups of tasks. Essentially, cgroups are a set of hooks that the kernel attaches to processes, triggered when those processes request resources at runtime, for the purpose of resource tracking and restriction.
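To make this concrete, here is a minimal sketch of limiting memory through the cgroup filesystem directly. It assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup; the group name demo is made up for illustration, and the commands require root:

```shell
# Create a new cgroup named "demo" (hypothetical name) under the v2 hierarchy
mkdir /sys/fs/cgroup/demo

# Cap the group's memory usage at 100 MiB
echo 100M > /sys/fs/cgroup/demo/memory.max

# Move the current shell into the group; everything it spawns from now on
# is tracked and limited by the kernel
echo $$ > /sys/fs/cgroup/demo/cgroup.procs
```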

What is cgroupfs

The default Cgroup Driver for docker is cgroupfs.

$ docker info | grep cgroup
 Cgroup Driver: cgroupfs

Cgroups expose their native kernel interface through cgroupfs (in other words, cgroupfs is a thin wrapper around the cgroup interface). Like procfs and sysfs, it is a virtual file system, and it is mounted at /sys/fs/cgroup by default.
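You can inspect the mount yourself; the exact output depends on whether the host uses cgroup v1 or v2, so the lines below are only illustrative:

```shell
# On a cgroup v2 host there is a single unified mount:
$ mount -t cgroup2
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)

# Each controller appears as ordinary files inside the hierarchy:
$ ls /sys/fs/cgroup
cgroup.controllers  cgroup.procs  cpu.max  memory.max  ...
```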

What is Systemd?

Systemd is likewise a wrapper around the cgroup interface. systemd runs as PID 1 at system boot and provides a set of system management daemons, libraries, and utilities to control and manage the resources of a Linux operating system.
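Because systemd assigns every unit its own cgroup, its view of the hierarchy can be inspected and driven with standard systemd tools (output omitted; both commands ship with systemd itself):

```shell
# Show the cgroup tree as systemd sees it, one subtree per unit
$ systemd-cgls

# Run a command inside a transient scope unit, i.e. its own cgroup,
# with a systemd-enforced memory limit
$ systemd-run --scope -p MemoryMax=100M sleep 60
```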

Why use systemd instead of cgroupfs

Here is the official description.

When a Linux distribution uses systemd as its init system, the init process generates and consumes a root control group (cgroup) and acts as a cgroup manager. systemd is tightly integrated with cgroups and assigns a cgroup to each systemd unit. You can also configure the container runtime and the kubelet to use cgroupfs. Using cgroupfs alongside systemd means that there will be two different cgroup managers.

A single cgroup manager simplifies the view of allocated resources and by default gives a more consistent picture of which resources are available and in use. When two managers coexist in a system, you end up with two views of those resources. Cases have been reported where nodes configured so that kubelet and docker use cgroupfs, while the rest of the processes on the node use systemd, become unstable under resource pressure.

Ubuntu, Debian, and CentOS 7 all use systemd as the init system, so systemd already acts as a cgroup manager. If the container runtime and the kubelet use cgroupfs, there will be two cgroup managers, cgroupfs and systemd. This means there are two views of resource allocation in the OS, and when the OS runs low on CPU, memory, and other resources, the processes on it can become unstable.

We can understand it simply as: one mountain cannot hold two tigers, and a country can have only one king.

Caution: Do not try to modify the cgroup driver of a node that has already joined the cluster; if you must change it, it is better to remove the node and rejoin it.

How to change the default cgroup driver for docker

Add the "exec-opts": ["native.cgroupdriver=systemd"] option to /etc/docker/daemon.json, then restart docker for it to take effect.

$ cat /etc/docker/daemon.json 
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": [
    "https://docker.mirrors.ustc.edu.cn",
    "http://hub-mirror.c.163.com"
  ],
  "max-concurrent-downloads": 10,
  "log-driver": "json-file",
  "log-level": "warn",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "data-root": "/var/lib/docker"
}
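After editing daemon.json, restart docker and confirm the driver changed; the info command should now report systemd instead of cgroupfs:

```shell
$ systemctl restart docker
$ docker info | grep cgroup
 Cgroup Driver: systemd
```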

How to configure the cgroup driver for the kubelet

Note: Since version 1.22, if the user does not set the cgroupDriver field in KubeletConfiguration, kubeadm init defaults it to systemd.

# kubeadm-config.yaml
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: v1.21.0
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd

Then use kubeadm to initialize it.

$ kubeadm init --config kubeadm-config.yaml
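After initialization you can verify which driver the kubelet ended up with; kubeadm writes the kubelet configuration to /var/lib/kubelet/config.yaml by default:

```shell
$ grep cgroupDriver /var/lib/kubelet/config.yaml
cgroupDriver: systemd
```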