Kubernetes has supported local volumes since version 1.10. Workloads (not only StatefulSets) can take advantage of fast local SSDs to get better performance than remote volumes (e.g. CephFS, RBD).

Before local volumes were introduced, StatefulSets could also use local SSDs by configuring hostPath volumes and pinning Pods to specific nodes via nodeSelector or nodeAffinity. The problem with hostPath, however, is that administrators have to manually manage the directories on every node of the cluster, which is inconvenient.

The following two types of applications are well suited to local volumes:

  • Data caching, where applications can access data nearby for fast processing.
  • Distributed storage systems, such as the distributed database Cassandra, or distributed file systems like Ceph/Gluster.

This post first shows how to use local volumes manually by creating a PV, PVC, and Pod, then introduces the semi-automatic way provided by external-storage, and finally covers some developments in the community.

Create a storage class

First you need a StorageClass (sc) named local-volume.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-volume
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

The sc’s provisioner is kubernetes.io/no-provisioner, i.e. it does not provision volumes dynamically.

WaitForFirstConsumer means the PV should not be bound to a PVC immediately, but only once a Pod actually needs the PVC. The scheduler then takes suitable local PVs into account when scheduling, so the binding does not conflict with the Pod’s resource requests, selectors, affinity and anti-affinity policies, and so on. The reasoning is obvious: if the PVC were bound to a local PV first, then, since a local PV is tied to a node, the selectors, affinity rules, etc. would be largely useless; it is better to choose the node according to the scheduling policy first and bind the local PV afterwards.
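To create the class, apply the manifest (local-volume-sc.yaml is just a placeholder name for the file above):

kubectl apply -f local-volume-sc.yaml

# newer kubectl versions show the binding mode in a VOLUMEBINDINGMODE column
kubectl get sc local-volume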

Create a PV statically

Statically create a 5 GiB PV with kubectl; the PV uses the /data/local/vol1 directory on node ubuntu-1, and its StorageClass is local-volume.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-volume
  local:
    path: /data/local/vol1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - ubuntu-1

Retain means that after the PVC is deleted and the PV is released, an administrator has to clean it up manually and reset the volume before it can be reused.

The PV must reference the corresponding StorageClass, and the /data/local/vol1 directory needs to be created on the node beforehand.
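A minimal sketch of these two steps, assuming you can SSH to ubuntu-1 and the manifest above is saved as local-pv.yaml (both names are illustrative):

# on node ubuntu-1: create the directory backing the PV
ssh ubuntu-1 "sudo mkdir -p /data/local/vol1"

# from a machine with cluster access: create the PV
kubectl apply -f local-pv.yaml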

kubectl get pv example-local-pv
NAME               CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
example-local-pv   5Gi        RWO            Retain           Available           local-volume            8d

Using a local volume PV

Next, create a PVC associated with the local-volume StorageClass, and then mount that PVC into an nginx container.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: local-volume
---
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
      - mountPath: "/usr/share/nginx/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim

If you exec into the container you will see the mounted directory; its size is actually the size of the disk on which the PV’s host directory resides.
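For example (mypod and the mount path come from the manifests above; the exact sizes in the df output depend on your machine):

kubectl exec -it mypod -- df -h /usr/share/nginx/html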

/dev/sdb         503G  235M  478G   1% /usr/share/nginx/html

Create an index.html file in the /data/local/vol1 directory of the host.

echo "hello world" > /data/local/vol1/index.html

Then curl the container’s IP address and you will get back the string you just wrote.
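One way to do this (the Pod IP will of course differ in your cluster):

POD_IP=$(kubectl get pod mypod -o jsonpath='{.status.podIP}')
curl http://$POD_IP/index.html
# hello world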

Delete the Pod and the PVC; the PV’s status then changes to Released, and a Released PV will not be bound to another PVC (with the Retain policy an administrator has to clean it up and recreate it first).
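For completeness, the clean-up commands and the check:

kubectl delete pod mypod
kubectl delete pvc myclaim

kubectl get pv example-local-pv   # STATUS is now Released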

Dynamic PV creation

Managing local PVs by hand is obviously a lot of work, so the community provides external-storage to create PVs dynamically (although it is still not automated enough).

The official manifest for the local volume provisioner is local-volume/provisioner/deployment/kubernetes/example/default_example_provisioner_generated.yaml. However, the official documentation mixes fast-disk and local-storage in a somewhat confusing way; I use local-volume here.

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: local-provisioner-config
  namespace: default
data:
  storageClassMap: |
    local-volume:
       hostDir: /data/local
       mountDir:  /data/local
       blockCleanerCommand:
         - "/scripts/shred.sh"
         - "2"
       volumeMode: Filesystem
       fsType: ext4    
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: local-volume-provisioner
  namespace: default
  labels:
    app: local-volume-provisioner
spec:
  selector:
    matchLabels:
      app: local-volume-provisioner
  template:
    metadata:
      labels:
        app: local-volume-provisioner
    spec:
      serviceAccountName: local-volume-admin
      containers:
        - image: "silenceshell/local-volume-provisioner:v2.1.0"
          imagePullPolicy: "Always"
          name: provisioner
          securityContext:
            privileged: true
          env:
          - name: MY_NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          volumeMounts:
            - mountPath: /etc/provisioner/config
              name: provisioner-config
              readOnly: true
            - mountPath:  /data/local
              name: local
              mountPropagation: "HostToContainer"
      volumes:
        - name: provisioner-config
          configMap:
            name: local-provisioner-config
        - name: local
          hostPath:
            path: /data/local
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: local-volume-admin
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: local-volume-provisioner-pv-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: local-volume-admin
  namespace: default
roleRef:
  kind: ClusterRole
  name: system:persistent-volume-provisioner
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: local-volume-provisioner-node-clusterrole
  namespace: default
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: local-volume-provisioner-node-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: local-volume-admin
  namespace: default
roleRef:
  kind: ClusterRole
  name: local-volume-provisioner-node-clusterrole
  apiGroup: rbac.authorization.k8s.io

After applying this with kubectl, a provisioner Pod is started on each node because it is a DaemonSet. Each provisioner monitors the discovery directory, which is the /data/local path configured above.

$ kubectl get pods -o wide|grep local-volume
local-volume-provisioner-rrsjp            1/1     Running   0          5m    10.244.1.141   ubuntu-2   <none>
local-volume-provisioner-v87b7            1/1     Running   0          5m    10.244.2.69    ubuntu-3   <none>
local-volume-provisioner-x65k9            1/1     Running   0          5m    10.244.0.174   ubuntu-1   <none>

The previous mypod/myclaim were deleted earlier, so recreate them now. At this point the PVC myclaim stays in the Pending state and the provisioner does not automatically supply the storage. Why?
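You can confirm this from the PVC itself (the describe output lists the events explaining why no PV has been bound):

kubectl get pvc myclaim          # STATUS: Pending
kubectl describe pvc myclaim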

It turns out that external-storage works like this: the provisioner does not provision local volumes by itself; instead, the provisioner on each node dynamically “discovers” mount points under the discovery directory. When the provisioner on a node finds a mount point under /data/local/, it creates a PV whose local.path is that mount point, and sets the PV’s nodeAffinity to that node.

So how do you get the mount point?

Simply creating a directory will not work, because the provisioner expects PVs to be isolated from each other in terms of capacity, I/O, and so on. If you create an xxx directory under /data/local/ on ubuntu-2, you will see this warning:

discovery.go:201] Path "/data/local/xxx" is not an actual mountpoint

The directory is not a mount point and cannot be used.

The directory must be a real mount point to work. One way is to add a disk, format it, and mount it under the discovery directory, which is a fair amount of work. Alternatively, you can “trick” the provisioner into seeing a mounted disk by formatting a local file and mounting it through a loop device; the PV is then created automatically and bound to the PVC.
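For reference, the real-disk variant looks roughly like this (assuming a spare device /dev/sdc; the device name and mount directory are placeholders):

# on the node that owns the disk
sudo mkfs.ext4 /dev/sdc
sudo mkdir -p /data/local/disk1
sudo mount /dev/sdc /data/local/disk1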

The loop-device approach works as follows.

Save the following script as a file named loopmount, make it executable, and copy it to /bin; you can then use it to create mount points.

#!/bin/bash

# Usage: sudo loopmount file size mount-point
# Creates a sparse file of the given size, formats it as ext4,
# and mounts it on mount-point via a loop device.

touch "$1"
truncate -s "$2" "$1"
mke2fs -t ext4 -F "$1" 1> /dev/null 2> /dev/null
if [[ ! -d "$3" ]]; then
        echo "$3 does not exist, creating..."
        mkdir -p "$3"
fi
mount -o loop "$1" "$3"
df -h | grep "$3"

Use the script to create a 6G file and mount it under /data/local. The file is 6G because the PVC above requests 5Gi and the usable space after formatting is a little smaller than the file size, so make the file slightly larger so that the PVC can still be bound.

# loopmount xxx 6G /data/local/xxx
/data/local/xxx  not exist, creating...
/dev/loop0     5.9G   24M  5.6G   1% /data/local/xxx

Looking at the PVs, you can see that the provisioner has automatically created a PV, Kubernetes binds it to the earlier PVC myclaim, and mypod starts running.

# kubectl get pv
NAME              CAPACITY  ACCESS MODES   RECLAIM POLICY   STATUS  CLAIM            STORAGECLASS          REASON   AGE
local-pv-600377f7 5983Mi    RWO            Delete           Bound   default/myclaim  local-volume                   1s
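To double-check that everything is wired up:

kubectl get pvc myclaim   # STATUS should be Bound
kubectl get pod mypod     # STATUS should be Running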

As you can see, local volumes in their current form are not yet as fully automated as CephFS/RBD and still require administrator intervention, which is obviously not ideal.

Someone in the community has submitted a proposal for LVM-based dynamic provisioning of local volumes, but progress is slow. The author is a Huawei employee, so I suspect Huawei has already implemented it.

Besides LVM, dynamic provisioning of local volumes could also be implemented on top of ext4 project quotas.

Besides disks, you can also consider an in-memory file system (tmpfs) to get higher I/O performance, though the capacity is less ideal; it can make sense for some special applications, for example:

mount -t tmpfs -o size=1G,nr_inodes=10k,mode=700 tmpfs /data/local/tmpfs

In general, local volumes do not currently support true dynamic provisioning and are not yet ready for broad adoption, but they can already be used to solve some specific problems.
