In Kubernetes V1.16, online expansion of disk volumes is in public preview. This topic describes how to dynamically expand disk volumes by using the FlexVolume plug-in.

Prerequisites

The RAM role of the target cluster has been granted the ResizeDisk permission. The steps for granting the ResizeDisk permission to a RAM role vary depending on the cluster type and the plug-in type.
  • A dedicated cluster that has the CSI plug-in:
    1. In the left-side navigation pane, choose Clusters > Clusters.
    2. On the Clusters page, find the target cluster and click Manage in the Actions column.
    3. On the management page of the target cluster, select the Cluster Resources tab and click the link next to Master RAM Role.
    4. In the RAM console, grant the ResizeDisk permission to the RAM role. For more information, see Modify a custom policy.
  • A managed cluster or a dedicated cluster that has the FlexVolume plug-in:

    Follow steps 1 to 3 of the preceding procedure, but in step 3 click the link next to Worker RAM Role instead of Master RAM Role. Then, in the RAM console, grant the ResizeDisk permission to the RAM role.
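
As a rough illustration of what the granted permission might look like, the following sketch writes a custom policy document to a local file. The action name `ecs:ResizeDisk`, the wildcard resource, and the file name `resize-disk-policy.json` are assumptions made for illustration and are not taken from this topic. Compare the statement with the policy that is attached to the RAM role of your cluster before you change anything.

```shell
# Sketch: write a custom policy statement that allows disk resizing.
# Assumption: the ResizeDisk permission maps to the action "ecs:ResizeDisk".
# The file name is hypothetical.
policy_file="resize-disk-policy.json"

cat > "$policy_file" <<'EOF'
{
  "Version": "1",
  "Statement": [
    {
      "Action": ["ecs:ResizeDisk"],
      "Resource": ["*"],
      "Effect": "Allow"
    }
  ]
}
EOF

echo "Wrote $policy_file"
```

In practice, you would typically add only the statement to the existing custom policy of the role in the RAM console instead of creating a separate policy.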

Background information

Dynamic PV expansion consists of the manual expansion of PVs and the automatic expansion of file systems. ACK allows you to expand both PVs and file systems without unmounting PVs from the host directory. However, to ensure the stability of file system expansion, we recommend that you stop application services and unmount the target PV from the host directory first.

To meet different levels of stability requirements, ACK provides the following solutions for PV expansion:
  • Expand a PV without restarting the pod to which the PV is mounted. If you use this method, file system errors may occur if the cluster I/O is high.
  • Expand a PV during the restart of the pod to which the PV is mounted. To ensure data security during expansion, we recommend that you stop application services before expansion.

By default, ACK V1.16 and later support PV expansion without restarting the pod.

Usage notes

  • Data backup

    To avoid data loss caused by PV expansion errors, create a snapshot of the target disk before you expand the PV.

  • Scenarios
    • You can enable online expansion for a PV only if the StorageClassName parameter is specified for the PV.
    • ACK does not support expansion of inline volumes.
    • ACK does not support online expansion of basic disks.
    • The allowVolumeExpansion field of the default StorageClass is set to true. If you create a new StorageClass, you must set this field to true.
  • Plug-in version

    Make sure that the FlexVolume or CSI plug-in is upgraded to the latest version.
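
The StorageClass requirement above can be sketched as follows. The script only generates a manifest file. The class name, the provisioner `alicloud/disk`, and the disk type `cloud_ssd` are assumptions for a FlexVolume-based cluster and should be replaced with the values used in your cluster.

```shell
# Sketch: generate a StorageClass manifest that permits volume expansion.
# Assumptions: the class name, provisioner, and disk type are placeholders.
sc_file="expandable-disk-sc.yaml"

cat > "$sc_file" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-disk-expandable   # hypothetical name
provisioner: alicloud/disk         # assumed FlexVolume disk provisioner
parameters:
  type: cloud_ssd                  # assumed disk type; basic disks cannot be expanded online
reclaimPolicy: Delete
allowVolumeExpansion: true         # required before a PVC of this class can be expanded
EOF

# To create the class and confirm the field, you could then run:
# kubectl apply -f "$sc_file"
# kubectl get storageclass alicloud-disk-expandable -o jsonpath='{.allowVolumeExpansion}'
```

The two kubectl commands are left as comments so that the sketch has no side effects on a cluster.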

Expand a disk volume online without restarting a pod

  1. Use kubectl to connect to the target cluster. For more information, see Connect to Kubernetes clusters through kubectl.
    Assume that the pod is in the following state.
    # kubectl get pod
    web-0         1/1     Running   0          42s
    
    # kubectl exec web-0 -- df /data
    Filesystem     1K-blocks  Used Available Use% Mounted on
    /dev/vdb        20511312 45080  20449848   1% /data
    
    # kubectl get pvc
    NAME             STATUS   VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS              AGE
    disk-ssd-web-0   Bound    d-wz9hpoifm43yn9zie6gl   20Gi       RWO            alicloud-disk-available   57s
    
    # kubectl get pv
    NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                    STORAGECLASS              REASON   AGE
    d-wz9hpoifm43yn9zie6gl   20Gi       RWO            Delete           Bound      default/disk-ssd-web-0   alicloud-disk-available            65s
  2. Make sure that the requirements described in Usage notes are met. Then, run the following command to expand the disk volume from 20 GiB to 30 GiB:
    # kubectl patch pvc disk-ssd-web-0 -p '{"spec":{"resources":{"requests":{"storage":"30Gi"}}}}'
    Wait about one minute, and then run the following commands to check whether the persistent volume (PV) has been expanded:
    # kubectl get pv d-wz9hpoifm43yn9zie6gl
    NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                    STORAGECLASS              REASON   AGE
    d-wz9hpoifm43yn9zie6gl   30Gi       RWO            Delete           Bound    default/disk-ssd-web-0   alicloud-disk-available            5m23s
    
    # kubectl get pvc
    NAME             STATUS   VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS              AGE
    disk-ssd-web-0   Bound    d-wz9hpoifm43yn9zie6gl   30Gi       RWO            alicloud-disk-available   5m10s
    
    # kubectl exec web-0 -- df /data
    Filesystem     1K-blocks  Used Available Use% Mounted on
    /dev/vdb        30832548 45036  30771128   1% /data

    The kubectl patch command in this step is the only operation required to expand a disk volume online without restarting the pod. The PV and the file system are then expanded automatically.
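
Instead of waiting a fixed time, the patch and the check can be combined into one helper. The following is a minimal sketch, assuming that kubectl is already configured for the target cluster. The function name `expand_pvc` and the default timeout of 120 seconds are hypothetical choices, not part of ACK.

```shell
# Sketch: request a new size for a PVC, then poll until the PVC reports it.
# Assumes kubectl is already configured for the target cluster.
expand_pvc() {
  pvc="$1"; size="$2"; timeout="${3:-120}"   # timeout in seconds

  kubectl patch pvc "$pvc" \
    -p "{\"spec\":{\"resources\":{\"requests\":{\"storage\":\"$size\"}}}}" || return 1

  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # status.capacity.storage is the size that the PVC currently reports.
    current=$(kubectl get pvc "$pvc" -o jsonpath='{.status.capacity.storage}')
    if [ "$current" = "$size" ]; then
      echo "PVC $pvc now reports $current"
      return 0
    fi
    sleep 5
    waited=$((waited + 5))
  done

  echo "Timed out waiting for PVC $pvc to reach $size" >&2
  return 1
}
```

For the example in this topic, the call would be `expand_pvc disk-ssd-web-0 30Gi`. The comparison is a plain string match, so the requested size must be written in the same form that the cluster reports, such as 30Gi.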

Expand a disk volume online after restarting a pod

  1. Use kubectl to connect to the target cluster. For more information, see Connect to Kubernetes clusters through kubectl.
    Assume that the pod to which the target PV is mounted is in the following state:
    # kubectl get pod
    web-0         1/1     Running   0          42s
    
    # kubectl exec web-0 -- df /data
    Filesystem     1K-blocks  Used Available Use% Mounted on
    /dev/vdb        20511312 45080  20449848   1% /data
    
    # kubectl get pvc
    NAME             STATUS   VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS              AGE
    disk-ssd-web-0   Bound    d-wz9g2j5qbo37r2lamkg4   20Gi       RWO            alicloud-disk-available   57s
    
    # kubectl get pv
    NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                    STORAGECLASS              REASON   AGE
    d-wz9g2j5qbo37r2lamkg4   20Gi       RWO            Delete           Bound      default/disk-ssd-web-0   alicloud-disk-available            65s
  2. Run the following command to view the scheduling information about the PV:
    # kubectl get pv d-wz9g2j5qbo37r2lamkg4  -oyaml | grep failure-domain.beta.kubernetes.io/zone
        failure-domain.beta.kubernetes.io/zone: cn-shenzhen-e
  3. Change the value of the zone label on the PV to a zone that does not exist. After the change, the pod that uses the PV cannot be scheduled.

    For example, change the value of the zone label from cn-shenzhen-e to cn-shenzhen-e-nozone.

    # kubectl label pv d-wz9g2j5qbo37r2lamkg4 failure-domain.beta.kubernetes.io/zone=cn-shenzhen-e-nozone --overwrite
    persistentvolume/d-wz9g2j5qbo37r2lamkg4 labeled
  4. Delete the pod to restart it.

    Because the zone label was changed, the recreated pod cannot be scheduled and temporarily remains in the Pending state.

    # kubectl delete pod web-0
    
    # kubectl get pod
    web-0   0/1     Pending   0          27s
  5. Run the following command to expand the PV from 20 GiB to 30 GiB:
    # kubectl patch pvc disk-ssd-web-0 -p '{"spec":{"resources":{"requests":{"storage":"30Gi"}}}}'
  6. Restore the original value of the zone label so that the pod can be scheduled and started. In this example, change the value from cn-shenzhen-e-nozone back to cn-shenzhen-e.
    # kubectl label pv d-wz9g2j5qbo37r2lamkg4 failure-domain.beta.kubernetes.io/zone=cn-shenzhen-e --overwrite
    persistentvolume/d-wz9g2j5qbo37r2lamkg4 labeled
    Wait about one minute, and then run the following commands to check whether the PV has been expanded:
    # kubectl get pod
    web-0   1/1     Running   0          3m23s
    
    # kubectl get pvc
    disk-ssd-web-0   Bound    d-wz9g2j5qbo37r2lamkg4   30Gi       RWO            alicloud-disk-available   17m
    
    # kubectl get pv d-wz9g2j5qbo37r2lamkg4
    d-wz9g2j5qbo37r2lamkg4   30Gi       RWO            Delete           Bound    default/disk-ssd-web-0   alicloud-disk-available            17m
    
    # kubectl exec web-0 -- df /data
    /dev/vdb        30832548 45036  30771128   1% /data

    The result shows that the PV is expanded from 20 GiB to 30 GiB.
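
The six steps above can be sketched as a single helper. This is an illustration under stated assumptions, not an official tool: kubectl is assumed to be configured, the PV is assumed to carry the zone label used in this topic, and the pod is assumed to be recreated by its controller after deletion. The function name `expand_pvc_with_restart` is hypothetical.

```shell
# Sketch of the restart-based procedure as one function.
ZONE_LABEL="failure-domain.beta.kubernetes.io/zone"

expand_pvc_with_restart() {
  pv="$1"; pvc="$2"; pod="$3"; size="$4"

  # Read the current zone value. Dots inside the label key are escaped for jsonpath.
  zone=$(kubectl get pv "$pv" \
    -o jsonpath='{.metadata.labels.failure-domain\.beta\.kubernetes\.io/zone}')
  [ -n "$zone" ] || { echo "PV $pv has no zone label" >&2; return 1; }

  # Point the label at a nonexistent zone so the pod stays Pending after deletion.
  kubectl label pv "$pv" "$ZONE_LABEL=${zone}-nozone" --overwrite || return 1
  kubectl delete pod "$pod" || return 1

  # Request the new size while the pod is not running.
  kubectl patch pvc "$pvc" \
    -p "{\"spec\":{\"resources\":{\"requests\":{\"storage\":\"$size\"}}}}"
  status=$?

  # Restore the original zone even if the patch failed, so the pod can be scheduled again.
  kubectl label pv "$pv" "$ZONE_LABEL=$zone" --overwrite || return 1
  return "$status"
}
```

For the example in this topic, the call would be `expand_pvc_with_restart d-wz9g2j5qbo37r2lamkg4 disk-ssd-web-0 web-0 30Gi`. Restoring the label after a failed patch is a design choice of this sketch so that a failed expansion does not leave the pod unschedulable.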