Since v1.11, Kubernetes has enabled the resize feature and the PersistentVolumeClaimResize admission controller by default, so that if a storage volume created by a user turns out to be too small, it can be expanded without losing the existing data. The storage volumes that currently support resize are AWS-EBS, GCE-PD, Azure Disk, Azure File, Glusterfs, Cinder, Portworx, and Ceph RBD.
- Block storage volumes such as GCE-PD, AWS-EBS, Azure Disk, Cinder, and Ceph RBD require a file system expansion; when the Pod restarts, Kubernetes performs the expansion automatically
- NAS file systems such as Glusterfs and Azure File do not require a file system expansion, so the Pod does not need to be restarted
The following uses a Ceph RBD-based StorageClass as an example to demonstrate the PVC expansion feature.
1. Preparation for expansion
Administrators need to configure the StorageClass and turn on the allowVolumeExpansion option.
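A minimal sketch of such a StorageClass; the name `sata` and the `ceph.com/rbd` provisioner are taken from the PVC example later in this post, while the monitor, pool, and secret parameters are placeholders you would replace with your cluster's values:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: sata
provisioner: ceph.com/rbd
parameters:
  monitors: 192.168.0.1:6789     # placeholder: your Ceph monitors
  pool: rbd                      # placeholder: your RBD pool
  adminId: admin
  adminSecretName: ceph-secret   # placeholder: your admin secret
allowVolumeExpansion: true       # the key option for PVC resize
```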
Build a kube-controller-manager image that includes the rbd command
Since the expansion is performed by kube-controller-manager, and on our platform it runs as a container, the rbd command must be included in the container image; otherwise the following error appears when expanding the capacity:
```
Warning VolumeResizeFailed 3m (x75 over 4h) volume_expand Error expanding volume "default/resize" of plugin kubernetes.io/rbd : rbd info failed, error: executable file not found in $PATH
```
I built a controller-manager image for Kubernetes v1.11.1 with Ceph 10: `silenceshell/kube-controller-manager-amd64:v1.11.1-ceph-10`. You can also build your own.
Building it is simple:
- Start from the centos7 base image
- COPY the kube-controller-manager binary into the image
- `yum install ceph-common ceph-fs-common`, and you're done. Note that the packages must match the version of your Ceph cluster.
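The steps above can be sketched as a Dockerfile; the install path of the binary is an assumption, and you should pin the Ceph package repo to your cluster's version:

```dockerfile
# Sketch only: binary path and repo setup are assumptions, adjust as needed
FROM centos:7
# Install the Ceph client tools (must match your Ceph cluster's version)
RUN yum install -y ceph-common ceph-fs-common && yum clean all
# Copy in the kube-controller-manager binary built for your Kubernetes version
COPY kube-controller-manager /usr/local/bin/kube-controller-manager
ENTRYPOINT ["/usr/local/bin/kube-controller-manager"]
```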
Then update the image in the kube-controller-manager manifest to roll the controller-manager Pod over to the new image.
2. Expansion operation
Modify the size of the PVC
On the command line, run kubectl edit pvc xxx and modify the PVC's capacity directly.
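As an alternative to interactive editing, the size can be patched in one command; `resize` is the PVC name used in this post's example, and the target size is illustrative:

```shell
# Patch the PVC's requested storage directly (PVC name and size are examples)
kubectl patch pvc resize -p '{"spec":{"resources":{"requests":{"storage":"6Gi"}}}}'
```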
After that, the PVC enters the FileSystemResizePending state and waits for the Pod to restart:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-provisioner: ceph.com/rbd
  name: resize
spec:
  resources:
    requests:
      storage: 6Gi
  storageClassName: sata
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2019-04-14T03:51:36Z
    message: Waiting for user to (re-)start a pod to finish file system resize of volume on node.
    status: "True"
    type: FileSystemResizePending
  phase: Bound
```
Restart the Pod that mounts the PVC
After that, the controller manager calls the rbd resize command to grow the rbd image, and the volume is remounted into the Pod.
Log in to the new Pod's shell and check the size of the mounted rbd volume; also check the PVC's status.capacity.storage field, which has been updated to the new capacity.
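A quick way to verify both, assuming the PVC is named `resize` as above; the Pod name and mount path are placeholders:

```shell
# File system size as seen inside the Pod (Pod name and mount path are placeholders)
kubectl exec -it mypod -- df -h /data
# Capacity recorded on the PVC
kubectl get pvc resize -o jsonpath='{.status.capacity.storage}'
```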
3. Online expansion
Version 1.11.1 also provides an alpha feature, ExpandInUsePersistentVolumes, which allows a PVC to be expanded without restarting the container. The PVC must still be in use, that is, mounted by a running Pod.
PVCs that support online expansion are: GCE-PD, AWS-EBS, Cinder, and Ceph RBD.
This feature is not enabled by default, and needs to be turned on manually via the kubelet's feature-gates startup parameter.
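The gate is switched on with the standard --feature-gates flag; exactly where you set it depends on how your kubelet is deployed (systemd unit, config file, etc.):

```shell
--feature-gates=ExpandInUsePersistentVolumes=true
```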
Once it is enabled, change the size of the RBD PVC, and you will see the rbd capacity inside the container automatically updated to the new size.
Online expansion essentially relies on the fact that ext4 and xfs file systems can be grown online.
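Roughly what happens under the hood, sketched as the equivalent manual commands; the image name, device, and mount point are placeholders:

```shell
# Grow the RBD image to 6 GiB (pool/image name is a placeholder)
rbd resize --size 6144 kube/myimage
# Grow the file system online: ext4 resizes by device ...
resize2fs /dev/rbd0
# ... while xfs resizes by mount point
xfs_growfs /mnt/data
```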
Supporting this for cephfs requires modifying the Kubernetes source itself. I have implemented this part, so cephfs storage volumes can be expanded online without restarting the container:
- The cephfs volume's properties need to be modified to enable its expand capability, which involves the apiserver (admission) and the controller
- The cephfs expand function needs to be added to the controller; the glusterfs implementation can serve as a reference
When I have time, I will write about it in detail.
I also found a bug in Kubernetes itself: when kubelet restarts, the ceph-fuse mounted volumes on that node break. Running df in the container reports 'Transport endpoint is not connected', and the originally mounted volume can no longer be seen.
I submitted an issue.
If, like me, you use a systemd-managed operating system such as centos, the solution is to use systemd-run to execute the mount command. Kubernetes handled this for mount in PR 49640, which hands the user-space processes forked by mount over to systemd management; but when ceph-fuse support was added later, this was not taken into account, causing this bug.
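The workaround can be sketched as running the fuse daemon in a transient systemd scope so it is no longer a child of kubelet; the monitor address and mount point below are placeholders:

```shell
# Run ceph-fuse under a transient systemd scope so the fuse daemon
# survives kubelet restarts (monitor and mount point are placeholders)
systemd-run --scope -- ceph-fuse -m 192.168.0.1:6789 /mnt/cephfs
```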