1. Environmental preparation

  • Ubuntu 20.04 x5
  • Etcd 3.4.16
  • Kubernetes 1.21.1
  • Containerd 1.3.3

1.1. Handling IPVS

Since the Service implementation in this cluster switches to IPVS, you need to make sure the kernel has the IPVS modules loaded; the following commands configure the system to load the IPVS-related modules automatically at boot, and a reboot is required after running them.

# Kernel modules
cat > /etc/modules-load.d/50-kubernetes.conf <<EOF
# Load some kernel modules needed by kubernetes at boot
nf_conntrack
br_netfilter
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
EOF

# sysctl
cat > /etc/sysctl.d/50-kubernetes.conf <<EOF
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
fs.inotify.max_user_watches=525000
EOF
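
If you want to confirm everything loads cleanly before relying on the reboot, the modules and sysctl settings can also be applied by hand (the reboot remains the simplest way to prove the boot-time configuration works):

# Load every module listed in the file above right away
awk '!/^#/ && NF' /etc/modules-load.d/50-kubernetes.conf | xargs -n1 modprobe
# Apply the sysctl settings without rebooting
sysctl --system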

Be sure to check the module loading and kernel parameter settings after the reboot is complete:

# check ipvs modules
➜ ~ lsmod | grep ip_vs
ip_vs_sed              16384  0
ip_vs_nq               16384  0
ip_vs_fo               16384  0
ip_vs_sh               16384  0
ip_vs_dh               16384  0
ip_vs_lblcr            16384  0
ip_vs_lblc             16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  0
ip_vs_wlc              16384  0
ip_vs_lc               16384  0
ip_vs                 155648  22 ip_vs_wlc,ip_vs_rr,ip_vs_dh,ip_vs_lblcr,ip_vs_sh,ip_vs_fo,ip_vs_nq,ip_vs_lblc,ip_vs_wrr,ip_vs_lc,ip_vs_sed
nf_conntrack          139264  1 ip_vs
nf_defrag_ipv6         24576  2 nf_conntrack,ip_vs
libcrc32c              16384  5 nf_conntrack,btrfs,xfs,raid456,ip_vs

# check sysctl
➜ ~ sysctl -a | grep ip_forward
net.ipv4.ip_forward = 1
net.ipv4.ip_forward_update_priority = 1
net.ipv4.ip_forward_use_pmtu = 0

➜ ~ sysctl -a | grep bridge-nf-call
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

1.2. Installing Containerd

Containerd is already included in the default official repositories of Ubuntu 20.04, so all you need to do is install it with apt:

# Other packages may be used later, so they are installed here together
apt install containerd bridge-utils nfs-common tree -y

A successful installation can be verified by running the ctr images ls command. This section does not cover the Containerd configuration; it will be set up during the Kubernetes installation.
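
For example, a quick sanity check right after installation might look like the following; an empty image list with no error means the ctr client can talk to the daemon:

# Make sure the service is enabled and running
systemctl enable --now containerd
# Client/server version check and an (expectedly empty) image list
ctr version
ctr images ls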

2. Installing Kubernetes

2.1. Installing Etcd Cluster

Etcd is the core of Kubernetes, so I personally prefer to install it directly on the host; to make the host installation convenient I packaged a few *-pack kits for quick setup:

Installing CFSSL and ETCD

# Download the installation package
wget https://github.com/mritd/etcd-pack/releases/download/v3.4.16/etcd_v3.4.16.run
wget https://github.com/mritd/cfssl-pack/releases/download/v1.5.0/cfssl_v1.5.0.run

# Install cfssl and etcd
chmod +x *.run
./etcd_v3.4.16.run install
./cfssl_v1.5.0.run install

After installation, adjust the IPs in /etc/cfssl/etcd/etcd-csr.json to match your environment, and then execute create.sh in the same directory to generate the certificates.

➜ ~ cat /etc/cfssl/etcd/etcd-csr.json
{
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "O": "etcd",
            "OU": "etcd Security",
            "L": "Beijing",
            "ST": "Beijing",
            "C": "CN"
        }
    ],
    "CN": "etcd",
    "hosts": [
        "127.0.0.1",
        "localhost",
        "*.etcd.node",
        "*.kubernetes.node",
        "10.0.0.11",
        "10.0.0.12",
        "10.0.0.13"
    ]
}

# Copy to 3 masters
➜ ~ for ip in `seq 1 3`; do scp /etc/cfssl/etcd/*.pem root@10.0.0.1$ip:/etc/etcd/ssl; done
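
After create.sh finishes, it is worth confirming that the generated certificate actually contains the hosts listed above; openssl can print the SANs directly (the etcd.pem file name is an assumption based on the paths referenced later in kubeadm.yaml):

openssl x509 -in /etc/cfssl/etcd/etcd.pem -noout -text | grep -A1 "Subject Alternative Name"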

After the certificates are generated, adjust the Etcd configuration file on each machine, fix the file permissions, and then start the service.

# Copy Configuration
for ip in `seq 1 3`; do scp /etc/etcd/etcd.cluster.yaml root@10.0.0.1$ip:/etc/etcd/etcd.yaml; done

# Modify Permissions
for ip in `seq 1 3`; do ssh root@10.0.0.1$ip chown -R etcd:etcd /etc/etcd; done

# Start-up per machine
systemctl start etcd

Verify the cluster status via etcdctl after boot:

# To be on the safe side, you should run etcdctl endpoint health
➜ ~ etcdctl member list
55fcbe0adaa45350, started, etcd3, https://10.0.0.13:2380, https://10.0.0.13:2379, false
cebdf10928a06f3c, started, etcd1, https://10.0.0.11:2380, https://10.0.0.11:2379, false
f7a9c20602b8532e, started, etcd2, https://10.0.0.12:2380, https://10.0.0.12:2379, false
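
If etcdctl on your machine is not already preconfigured by the pack, the health check needs the endpoints and TLS material spelled out explicitly, along these lines:

ETCDCTL_API=3 etcdctl \
  --endpoints=https://10.0.0.11:2379,https://10.0.0.12:2379,https://10.0.0.13:2379 \
  --cacert=/etc/etcd/ssl/etcd-ca.pem \
  --cert=/etc/etcd/ssl/etcd.pem \
  --key=/etc/etcd/ssl/etcd-key.pem \
  endpoint health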

2.2. Install kubeadm

For kubeadm, users in China are recommended to use Aliyun's package mirror:

# kubeadm
apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt update

# ebtables and ethtool may be needed by kubelet; they are listed in the official installation documentation
apt install kubelet kubeadm kubectl ebtables ethtool -y
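
Optionally, hold the packages so a routine apt upgrade does not bump the cluster components unexpectedly:

apt-mark hold kubelet kubeadm kubectl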

2.3. Install kube-apiserver-proxy

kube-apiserver-proxy is an Nginx build of my own that contains only the layer-4 (stream) proxy module; it listens on port 6443 on each node and load-balances to all API Server addresses (which listen on 0.0.0.0:5443):

wget https://github.com/mritd/kube-apiserver-proxy-pack/releases/download/v1.20.0/kube-apiserver-proxy_v1.20.0.run
chmod +x *.run
./kube-apiserver-proxy_v1.20.0.run install

After the installation is complete, adjust your Nginx configuration file according to your IP address, and then start:

➜ ~ cat /etc/kubernetes/apiserver-proxy.conf
error_log syslog:server=unix:/dev/log notice;

worker_processes auto;
events {
        multi_accept on;
        use epoll;
        worker_connections 1024;
}

stream {
    upstream kube_apiserver {
        least_conn;
        server 10.0.0.11:5443;
        server 10.0.0.12:5443;
        server 10.0.0.13:5443;
    }

    server {
        listen        0.0.0.0:6443;
        proxy_pass    kube_apiserver;
        proxy_timeout 10m;
        proxy_connect_timeout 1s;
    }
}

systemctl start kube-apiserver-proxy
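
Before continuing, it is worth checking on every node that the proxy is actually listening on 6443; end-to-end connections will only succeed once the apiservers exist, which is expected at this point:

ss -lntp | grep 6443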

2.4. Install kubeadm-config

kubeadm-config bundles a set of configuration files together with the images required for a kubeadm installation; after installation it automatically configures Containerd, crictl, and so on:

wget https://github.com/mritd/kubeadm-config-pack/releases/download/v1.21.1/kubeadm-config_v1.21.1.run
chmod +x *.run

# The --load option is used to load the required image of kubeadm into containerd
./kubeadm-config_v1.21.1.run install --load
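
You can confirm that the --load step actually imported the images into containerd's k8s.io namespace before moving on:

ctr -n k8s.io images ls | grep k8s.gcr.io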

2.4.1. containerd configuration

The Containerd configuration is located in /etc/containerd/config.toml and is configured as follows:

version = 2
# Specify the storage root directory
root = "/data/containerd"
state = "/run/containerd"
# OOM score
oom_score = -999

[grpc]
  address = "/run/containerd/containerd.sock"

[metrics]
  address = "127.0.0.1:1234"

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    # sandbox image
    sandbox_image = "k8s.gcr.io/pause:3.4.1"
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
      default_runtime_name = "runc"
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          # Turn on systemd cgroup
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
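
After editing this file, restart containerd; containerd config dump prints the effective merged configuration, which is a convenient way to confirm that the systemd cgroup switch really took effect:

systemctl restart containerd
containerd config dump | grep SystemdCgroup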

2.4.2. crictl configuration

Switching to Containerd means that the familiar docker commands are no longer available. Containerd ships with a ctr command by default, and the CRI side provides a crictl command; the crictl configuration file is stored in /etc/crictl.yaml:

runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
pull-image-on-create: true
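
With this configuration in place, crictl behaves much like the old docker CLI; a few everyday commands:

# Images visible to the CRI (containerd's k8s.io namespace)
crictl images
# Pod sandboxes and containers on this node
crictl pods
crictl ps -a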

2.4.3. kubeadm configuration

The kubeadm configuration is currently split into two files: the init configuration used to bootstrap the first master, and the join configuration used by the other nodes to join it; the more important init configuration is as follows:

# /etc/kubernetes/kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
# kubeadm token create
bootstrapTokens:
- token: "c2t0rj.cofbfnwwrb387890"
nodeRegistration:
  # CRI socket address (Containerd)
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    runtime-cgroups: "/system.slice/containerd.service"
    rotate-server-certificates: "true"
localAPIEndpoint:
  advertiseAddress: "10.0.0.11"
  bindPort: 5443
# kubeadm certs certificate-key
certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: "kubernetes"
kubernetesVersion: "v1.21.1"
certificatesDir: "/etc/kubernetes/pki"
# Other components of the current control plane only connect to the apiserver on the current host.
# This is the expected behavior, see: https://github.com/kubernetes/kubeadm/issues/2271
controlPlaneEndpoint: "127.0.0.1:6443"
etcd:
  external:
    endpoints:
    - "https://10.0.0.11:2379"
    - "https://10.0.0.12:2379"
    - "https://10.0.0.13:2379"
    caFile: "/etc/etcd/ssl/etcd-ca.pem"
    certFile: "/etc/etcd/ssl/etcd.pem"
    keyFile: "/etc/etcd/ssl/etcd-key.pem"
networking:
  serviceSubnet: "10.66.0.0/16"
  podSubnet: "10.88.0.1/16"
  dnsDomain: "cluster.local"
apiServer:
  extraArgs:
    v: "4"
    alsologtostderr: "true"
#    audit-log-maxage: "21"
#    audit-log-maxbackup: "10"
#    audit-log-maxsize: "100"
#    audit-log-path: "/var/log/kube-audit/audit.log"
#    audit-policy-file: "/etc/kubernetes/audit-policy.yaml"
    authorization-mode: "Node,RBAC"
    event-ttl: "720h"
    runtime-config: "api/all=true"
    service-node-port-range: "30000-50000"
    service-cluster-ip-range: "10.66.0.0/16"
#    insecure-bind-address: "0.0.0.0"
#    insecure-port: "8080"
    # The fraction of requests that will be closed gracefully(GOAWAY) to prevent
    # HTTP/2 clients from getting stuck on a single apiserver.
    goaway-chance: "0.001"
#  extraVolumes:
#  - name: "audit-config"
#    hostPath: "/etc/kubernetes/audit-policy.yaml"
#    mountPath: "/etc/kubernetes/audit-policy.yaml"
#    readOnly: true
#    pathType: "File"
#  - name: "audit-log"
#    hostPath: "/var/log/kube-audit"
#    mountPath: "/var/log/kube-audit"
#    pathType: "DirectoryOrCreate"
  certSANs:
  - "*.kubernetes.node"
  - "10.0.0.11"
  - "10.0.0.12"
  - "10.0.0.13"
  timeoutForControlPlane: 1m
controllerManager:
  extraArgs:
    v: "4"
    node-cidr-mask-size: "19"
    deployment-controller-sync-period: "10s"
    experimental-cluster-signing-duration: "8670h"
    node-monitor-grace-period: "20s"
    pod-eviction-timeout: "2m"
    terminated-pod-gc-threshold: "30"
scheduler:
  extraArgs:
    v: "4"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
oomScoreAdj: -900
cgroupDriver: "systemd"
kubeletCgroups: "/system.slice/kubelet.service"
nodeStatusUpdateFrequency: 5s
rotateCertificates: true
evictionSoft:
  "imagefs.available": "15%"
  "memory.available": "512Mi"
  "nodefs.available": "15%"
  "nodefs.inodesFree": "10%"
evictionSoftGracePeriod:
  "imagefs.available": "3m"
  "memory.available": "1m"
  "nodefs.available": "3m"
  "nodefs.inodesFree": "1m"
evictionHard:
  "imagefs.available": "10%"
  "memory.available": "256Mi"
  "nodefs.available": "10%"
  "nodefs.inodesFree": "5%"
evictionMaxPodGracePeriod: 30
imageGCLowThresholdPercent: 70
imageGCHighThresholdPercent: 80
kubeReserved:
  "cpu": "500m"
  "memory": "512Mi"
  "ephemeral-storage": "1Gi"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# kube-proxy specific options here
clusterCIDR: "10.88.0.1/16"
mode: "ipvs"
oomScoreAdj: -900
ipvs:
  minSyncPeriod: 5s
  syncPeriod: 5s
  scheduler: "wrr"

Please refer to the official documentation for the exact meaning of each field in the init configuration. Compared to the init configuration, the join configuration is relatively simple; note that if the node should join as a master you need the controlPlane section, otherwise comment controlPlane out.

# /etc/kubernetes/kubeadm-join.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
controlPlane:
  localAPIEndpoint:
    advertiseAddress: "10.0.0.12"
    bindPort: 5443
  certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
discovery:
  bootstrapToken:
    apiServerEndpoint: "127.0.0.1:6443"
    token: "c2t0rj.cofbfnwwrb387890"
    # Please replace with the "--discovery-token-ca-cert-hash" value printed
    # after the kubeadm init command is executed successfully
    caCertHashes:
    - "sha256:97590810ae34a82501717e33acfca76f16044f1a365c5ad9a1c66433c386c75c"
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    runtime-cgroups: "/system.slice/containerd.service"
    rotate-server-certificates: "true"
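
For reference, the three secrets embedded in these files all come from the first master: the comments in kubeadm.yaml already point at kubeadm token create and kubeadm certs certificate-key, and the discovery CA certificate hash can be recomputed at any time with the openssl pipeline from the kubeadm documentation:

# Generate a fresh bootstrap token on the first master
kubeadm token create
# Generate a certificate key (must match the one passed to init together with --upload-certs)
kubeadm certs certificate-key
# Recompute the --discovery-token-ca-cert-hash value printed by kubeadm init
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //'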

2.5. Pull up master

After adjusting the configuration, pulling up the master node requires only one command:

kubeadm init --config /etc/kubernetes/kubeadm.yaml --upload-certs --ignore-preflight-errors=Swap

After pulling up, remember to save the relevant Token for subsequent use.
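
The init output also shows how to point kubectl at the new cluster; the standard steps from the kubeadm output are:

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config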

2.6. Pull up other master

After the first master is up, use the join command to add the other masters; note that the caCertHashes value in the kubeadm-join.yaml configuration must be replaced with the --discovery-token-ca-cert-hash value printed after the first master is pulled up.

kubeadm join 127.0.0.1:6443 --config /etc/kubernetes/kubeadm-join.yaml --ignore-preflight-errors=Swap

2.7. Pull up other node

Pulling up a worker node works the same way as joining another master, except that the controlPlane section of the configuration needs to be commented out.

# /etc/kubernetes/kubeadm-join.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
#controlPlane:
#  localAPIEndpoint:
#    advertiseAddress: "10.0.0.12"
#    bindPort: 5443
#  certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
discovery:
  bootstrapToken:
    apiServerEndpoint: "127.0.0.1:6443"
    token: "c2t0rj.cofbfnwwrb387890"
    # Please replace with the "--discovery-token-ca-cert-hash" value printed
    # after the kubeadm init command is executed successfully
    caCertHashes:
    - "sha256:97590810ae34a82501717e33acfca76f16044f1a365c5ad9a1c66433c386c75c"
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    runtime-cgroups: "/system.slice/containerd.service"
    rotate-server-certificates: "true"

kubeadm join 127.0.0.1:6443 --config /etc/kubernetes/kubeadm-join.yaml --ignore-preflight-errors=Swap
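
Back on any master, the freshly joined node should show up in the node list (it will remain NotReady until a CNI plugin is deployed):

kubectl get nodes -o wide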

2.8. Other processing

Since the kubelet has certificate rotation turned on, a new cluster will generate a lot of CSR requests; just approve them in bulk:

kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve

Also, for the master nodes to be able to run Pods as well, the taint needs to be removed:

kubectl taint nodes --all node-role.kubernetes.io/master-
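
A quick way to confirm the taints are gone is to list them per node:

kubectl describe nodes | grep -i taints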

Subsequent steps such as installing a CNI are outside the scope of this article.

3. Containerd Common Operations

# List images
ctr images ls

# List k8s images
ctr -n k8s.io images ls

# Import an image
ctr -n k8s.io images import xxxx.tar

# Export an image
ctr -n k8s.io images export kube-scheduler.tar k8s.gcr.io/kube-scheduler:v1.21.1
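
A couple of other ctr invocations that come in handy when inspecting a node managed by the kubelet (everything Kubernetes creates lives in the k8s.io namespace):

# Namespaces known to containerd
ctr ns ls
# Containers and their running tasks in the k8s.io namespace
ctr -n k8s.io containers ls
ctr -n k8s.io tasks ls
# Pull an image directly into the k8s.io namespace
ctr -n k8s.io images pull docker.io/library/busybox:latest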

4. Resource Repositories

All *-pack repositories in this article are located at the following addresses:


Reference https://mritd.com/2021/05/29/use-containerd-with-kubernetes/