Dynamic volume provisioning automatically creates and mounts an independent disk for each application replica. This method is ideal for databases, middleware, and other applications that require high I/O and low latency. It also simplifies storage lifecycle management.
How it works
The following process describes how to use a dynamically provisioned disk volume in a StatefulSet:
Define a template
You can create a new StorageClass or use a default one as a template for dynamic disk creation. This template defines key parameters, such as disk type, performance level, and reclaim policy.
Declare storage requirements in the application
Define volumeClaimTemplates in the StatefulSet and reference the StorageClass. This declares the specifications for the persistent volume claim (PVC) that the pod will use, such as storage capacity and access mode.
Automate volume creation and mounting
When the StatefulSet creates a pod, the system automatically generates a unique PVC for the pod based on the template. The Container Storage Interface (CSI) component creates a persistent volume (PV) based on the StorageClass rules and binds the PV to the PVC. Finally, the disk is mounted to the pod.
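As a minimal sketch of how the pieces reference each other (the names are illustrative and match the examples used later in this topic): the PVC template references the StorageClass by name, and each generated PVC is named <template-name>-<pod-name>. For a StatefulSet named web with a claim template named pvc-disk, pod web-0 gets the PVC pvc-disk-web-0.

# StorageClass: the template for dynamic disk creation
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-disk-wait-for-first-consumer
provisioner: diskplugin.csi.alibabacloud.com
---
# Fragment of a StatefulSet: each replica gets its own PVC from this template.
# Pod web-0 gets PVC pvc-disk-web-0, pod web-1 gets pvc-disk-web-1, and so on.
volumeClaimTemplates:
- metadata:
    name: pvc-disk
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: alicloud-disk-wait-for-first-consumer
    resources:
      requests:
        storage: 20Gi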
Scope
Zone limitations: With the exception of regional Enterprise SSDs (ESSDs), other disk types cannot be mounted across zones. They can be mounted only to pods in the same zone.
Instance family restrictions: Some disk types can be attached only to specific instance families.
CSI component limitations: The csi-plugin and csi-provisioner components must be installed.
CSI components are installed by default. Make sure that you have not manually uninstalled them. You can view the installation status on the Add-ons page. We recommend that you upgrade the CSI components to the latest version.
Virtual node limitations: To use disks on virtual nodes, you must meet specific cluster and kube-scheduler version requirements.
Step 1: Select a StorageClass
ACK provides multiple default StorageClasses. A StorageClass cannot be modified after it is created. If the default configurations do not meet your requirements, you can create a new one. For more information, see Create a StorageClass manually.
Use a default StorageClass
Select one of the following default StorageClasses and reference its name in the storageClassName field of your application.
StorageClass name | Dynamically created disk type
alicloud-disk-topology-alltype | Automatically selects an appropriate disk type based on priority. By default, pods are scheduled before disks are created to prevent mount failures caused by zone mismatches (WaitForFirstConsumer mode).
alicloud-disk-essd | Enterprise SSD (ESSD). The default performance level is PL1, and the minimum disk capacity is 20 GiB. Important: ESSDs in CloudBox support only the PL0 performance level. You must manually create a StorageClass and specify performanceLevel: PL0.
alicloud-disk-ssd | Standard SSD. The minimum disk capacity is 20 GiB.
alicloud-disk-efficiency | Ultra disk. The minimum disk capacity is 20 GiB.
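For example, a PVC that uses a default StorageClass only needs to reference it by name. The following minimal sketch assumes the alicloud-disk-topology-alltype default StorageClass; the PVC name is an example.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Example name
  name: disk-pvc-default
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  # Reference the default StorageClass by name
  storageClassName: alicloud-disk-topology-alltype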
Run the kubectl describe sc <storageclass-name> command to view the detailed configuration of a StorageClass.
Create a StorageClass manually
kubectl
Create a file named disk-sc.yaml. The following example shows a StorageClass that uses volumeBindingMode: WaitForFirstConsumer to delay PV binding.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # Name of the StorageClass
  name: alicloud-disk-wait-for-first-consumer
# Driver type. This value is fixed when you use the Alibaba Cloud disk CSI plugin.
provisioner: diskplugin.csi.alibabacloud.com
parameters:
  # Disk type. The system selects a type based on priority.
  type: cloud_auto,cloud_essd,cloud_ssd
  # File system type
  fstype: ext4
  diskTags: "a:b,b:c"
  encrypted: "false"
  # Performance level of the ESSD
  performanceLevel: PL1
  provisionedIops: "40000"
  burstingEnabled: "false"
# Binding mode. We recommend that you use WaitForFirstConsumer in multi-zone scenarios.
volumeBindingMode: WaitForFirstConsumer
# Reclaim policy
reclaimPolicy: Retain
# Specifies whether to allow volume expansion
allowVolumeExpansion: true
# Topology constraint: restricts disk creation to the specified zones
allowedTopologies:
- matchLabelExpressions:
  - key: topology.diskplugin.csi.alibabacloud.com/zone
    values:
    # Replace with your actual zones
    - cn-hangzhou-i
    - cn-hangzhou-k

The following table describes the key parameters.
Parameter
Description
provisioner
The driver type. This is a required parameter. When you use the Alibaba Cloud disk CSI plugin, the value is fixed at diskplugin.csi.alibabacloud.com.
parameters.type
The disk type. This is a required parameter. Valid values:
cloud_essd (default): enterprise SSD (ESSD)
cloud_auto: ESSD AutoPL disk
cloud_essd_entry: ESSD Entry disk
cloud_ssd: standard SSD
cloud_efficiency: ultra disk
elastic_ephemeral_disk_standard: Standard Edition elastic ephemeral disk
elastic_ephemeral_disk_premium: premium elastic ephemeral disk
cloud_regional_disk_auto: regional enterprise SSD (ESSD)
You can specify any combination of these values, such as type: cloud_ssd,cloud_essd,cloud_auto. The system attempts to create a disk in the specified order. The final disk type depends on factors such as the node instance and the disk types supported in the zone.
resourceGroupId
The resource group to which the disk belongs. The default value is "".
regionId
The region where the disk is located. This must be the same as the cluster's region.
fstype
The file system used by the disk. Valid values: ext4 (default) and xfs.
mkfsOptions
The parameters for formatting the disk, such as mkfsOptions: "-O project,quota".
diskTags
The tags of the disk, for example, diskTags: "a:b,b:c". You can also specify tags in the diskTags/a: b format. The CSI component must be v1.30.3 or later.
encrypted
Specifies whether to encrypt the disk. The default value is false, which means the disk is not encrypted.
performanceLevel
The ESSD performance level. Valid values: PL0, PL1 (default), PL2, and PL3. When used with CloudBox, this must be set to PL0.
volumeExpandAutoSnapshot
[Deprecated] This parameter has been deprecated since CSI v1.31.4.
provisionedIops
Used to configure the provisioned performance (IOPS) of the disk when you use an ESSD AutoPL disk.
burstingEnabled
Specifies whether to enable performance burst for an ESSD AutoPL disk. The default value is false, which disables performance burst.
multiAttach
Specifies whether to enable the disk multi-attach feature. The default value is false (disabled).
volumeBindingMode
The binding mode of the disk. Valid values:
Immediate (default): creates the disk before the pod is created.
WaitForFirstConsumer: delays binding. The pod is scheduled first, and then the disk is created in the same zone as the pod.
In multi-zone scenarios, we recommend that you use WaitForFirstConsumer to prevent mount failures caused by the disk and the ECS node being in different zones. If you schedule pods to virtual nodes by using specific scheduling methods or annotations, you cannot use StorageClasses of the WaitForFirstConsumer type. For more information, see What do I do if a PVC remains in the Pending state when a pod with a mounted disk is scheduled to a virtual node?
reclaimPolicy
The reclaim policy of the disk.
Delete (default): When the PVC is deleted, the PV and the disk are also deleted.
Retain: When the PVC is deleted, the PV and the disk data are not deleted. You must delete them manually.
If data security is a high priority, we recommend that you use Retain to prevent accidental data deletion.
allowVolumeExpansion
When set to true, you can expand a disk volume without service interruptions.
allowedTopologies
Restricts disk creation to specific topology domains.
key: The topology domain label. The following values are supported:
topology.diskplugin.csi.alibabacloud.com/zone: a dedicated topology key provided by the Alibaba Cloud CSI plugin.
alibabacloud.com/ecs-instance-id: lets you specify a node when you use elastic ephemeral disks.
values: A list that contains zone or node IDs.
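For example, the following sketch combines several of these parameters into an encrypted, xfs-formatted ESSD StorageClass. The name is an example; adjust the values to your environment.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # Example name
  name: alicloud-disk-essd-encrypted
provisioner: diskplugin.csi.alibabacloud.com
parameters:
  type: cloud_essd
  performanceLevel: PL1
  fstype: xfs
  # Encrypt the dynamically created disks
  encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
allowVolumeExpansion: true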
Run the following command to create the StorageClass.
kubectl create -f disk-sc.yaml

Run the following command to view the StorageClass.

kubectl get sc

The output shows that the StorageClass is created and is in the WaitForFirstConsumer binding mode.

NAME                                    PROVISIONER                       RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
alicloud-disk-wait-for-first-consumer   diskplugin.csi.alibabacloud.com   Retain          WaitForFirstConsumer   true                   10s
Console
On the Clusters page, click the name of the target cluster. In the left navigation pane, choose .
Click Create, select Cloud Disk as the PV type, set the parameters, and click OK.
Parameter
Description
Parameters
Default parameter: type. The disk type. This is a required parameter. Valid values:
cloud_essd (default): enterprise SSD (ESSD)
cloud_auto: ESSD AutoPL disk
cloud_essd_entry: ESSD Entry disk
cloud_ssd: standard SSD
cloud_efficiency: ultra disk
elastic_ephemeral_disk_standard: Standard Edition elastic ephemeral disk
elastic_ephemeral_disk_premium: premium elastic ephemeral disk
cloud_regional_disk_auto: regional enterprise SSD (ESSD)
You can specify any combination of these values, such as type: cloud_ssd,cloud_essd,cloud_auto. The system attempts to create a disk in the specified order. The final disk type depends on factors such as the node instance and the disk types supported in the zone.
Reclaim Policy
The reclaim policy of the disk.
Delete (default): When the PVC is deleted, the PV and the disk are also deleted.
Retain: When the PVC is deleted, the PV and the disk data are not deleted. You must delete them manually.
If data security is a high priority, we recommend that you use Retain to prevent accidental data deletion.
Binding Mode
The binding mode of the disk. Valid values:
Immediate (default): creates the disk before the pod is created.
WaitForFirstConsumer: delays binding. The pod is scheduled first, and then the disk is created in the same zone as the pod.
In multi-zone scenarios, we recommend that you use WaitForFirstConsumer to prevent mount failures caused by the disk and the ECS node being in different zones. If you schedule pods to virtual nodes by using specific scheduling methods or annotations, you cannot use StorageClasses of the WaitForFirstConsumer type. For more information, see What do I do if a PVC remains in the Pending state when a pod with a mounted disk is scheduled to a virtual node?
After the StorageClass is created, you can view it on the StorageClasses page.
Step 2: Create an application and mount a disk
This section uses a StatefulSet as an example to demonstrate how to mount a disk volume.
Disks are non-shared storage. If multi-attach is not enabled, a disk can be mounted to only one pod at a time. If you share a PVC in a multi-replica deployment, new pods fail because they cannot mount the disk that is in use by an existing pod. We recommend that you use a StatefulSet or mount a separate disk for each pod.
To use a cloud disk in a Deployment, we recommend that you use it as a temporary storage volume. To enable multi-attach, see Use NVMe cloud disks with multi-attach and Reservation.
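If you only need per-pod scratch space in a Deployment, one common approach is a generic ephemeral volume: a disk is created together with each pod and deleted when that pod is deleted. The following is a minimal sketch that assumes the alicloud-disk-topology-alltype default StorageClass; the workload name and label are examples.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ephemeral
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-ephemeral
  template:
    metadata:
      labels:
        app: nginx-ephemeral
    spec:
      containers:
      - name: nginx
        image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
        volumeMounts:
        - name: scratch
          mountPath: /data
      volumes:
      # Generic ephemeral volume: each pod gets its own disk,
      # and the disk is deleted together with the pod.
      - name: scratch
        ephemeral:
          volumeClaimTemplate:
            spec:
              accessModes: [ "ReadWriteOnce" ]
              storageClassName: alicloud-disk-topology-alltype
              resources:
                requests:
                  storage: 20Gi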
Create a file named statefulset.yaml. The following example creates a StatefulSet with two pods. It uses volumeClaimTemplates to automatically create and bind independent persistent storage for each pod.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      # We recommend that you configure the following securityContext to optimize mount performance
      securityContext:
        fsGroup: 1000
        fsGroupChangePolicy: "OnRootMismatch"
      containers:
      - name: nginx
        image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
        ports:
        - containerPort: 80
        volumeMounts:
        # Mount the data volume to the /data directory of the container.
        # The name must be the same as metadata.name defined in volumeClaimTemplates.
        - name: pvc-disk
          mountPath: /data
  # Define the PVC template
  volumeClaimTemplates:
  - metadata:
      name: pvc-disk
    spec:
      # Access mode
      accessModes: [ "ReadWriteOnce" ]
      # Associate the previously created StorageClass
      storageClassName: "alicloud-disk-wait-for-first-consumer"
      resources:
        requests:
          # Requested storage capacity, which is the disk size
          storage: 20Gi

Important
Configuring securityContext.fsGroup in a pod causes the kubelet to recursively change file permissions (chmod/chown) when the volume is mounted. If the volume contains many files, this process significantly increases the mount time.
For clusters that run Kubernetes 1.20 or later, we recommend that you set fsGroupChangePolicy to OnRootMismatch. This setting performs a recursive permission change only on the first mount and only if the permissions of the volume's root directory do not match the required permissions, which optimizes mount performance. If the performance still does not meet your requirements or if you need more granular permission control, we recommend that you use an initContainer to run permission adjustment commands before the main application container starts, as shown in the sketch below.
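A minimal sketch of the initContainer approach follows. The UID and GID 1000 and the /data path are assumptions and must match your application; the init container runs as root so that it can change ownership.

# Fragment of the pod template in statefulset.yaml: adjust ownership before the app container starts
initContainers:
- name: volume-permissions
  # Any image that provides chown works; this example reuses the nginx image above
  image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
  command: ["sh", "-c", "chown -R 1000:1000 /data"]
  securityContext:
    runAsUser: 0
  volumeMounts:
  - name: pvc-disk
    mountPath: /data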
Run the following command to create the StatefulSet.

kubectl create -f statefulset.yaml

Run the following command to confirm that the pods are in the Running state.

kubectl get pod -l app=nginx

Run the following command to check the mount path and confirm that the disk is mounted. In this example, the pod name is web-1. Replace it with your actual pod name.

kubectl exec web-1 -- df -h /data

The following output is expected:

Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb         20G   24K   20G   1% /data
Step 3: Simulate a pod failure and verify persistent storage
To verify that data stored on the disk persists after the pod is recreated, you can write data to the pod, delete the pod, and then check whether the data remains.
Write test data to the pod.
For the pod web-1, create a test file in the mounted disk path /data.

kubectl exec web-1 -- touch /data/test
kubectl exec web-1 -- ls /data

Expected output:

lost+found  test

Simulate a pod failure by deleting the pod.

kubectl delete pod web-1

Run kubectl get pod -l app=nginx again. A new pod named web-1 is automatically created.
Verify the data in the new pod.
Check the /data folder in the new pod web-1 again.

kubectl exec web-1 -- ls /data

The expected output shows that the test file still exists. This confirms that the data is persistently stored even after the pod is deleted and recreated.

lost+found  test
Going live
High availability
Disk selection
Evaluate the disk's performance, billing method, the node's zone, and the instance family. This ensures that pods can be scheduled to compatible nodes.
When you select a disk type, note that standard SSDs and ultra disks are being phased out. We recommend that you use PL0 ESSDs or ESSD Entry disks to replace ultra disks, and replace standard SSDs with ESSD AutoPL disks.
Build a cross-zone disaster recovery solution
Application-level disaster recovery: For critical services, such as databases, deploy application instances in multiple zones. Use the application's data synchronization mechanism to achieve high availability.
Storage-level disaster recovery: Select a disk type that supports multi-zone disaster recovery. Write data in real time to different zones within the same region to enable cross-zone fault recovery. For more information, see Use regional Enterprise SSDs (ESSDs).
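As a sketch of the storage-level option, a StorageClass for regional ESSDs sets the type parameter described earlier to cloud_regional_disk_auto. The name and other settings here are illustrative; the linked topic describes the full requirements.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # Example name
  name: alicloud-disk-regional
provisioner: diskplugin.csi.alibabacloud.com
parameters:
  # Regional ESSDs replicate data across zones in the same region
  type: cloud_regional_disk_auto
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
allowVolumeExpansion: true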
Data security and backup
Prevent accidental data deletion:
To prevent data loss, set the StorageClass reclaimPolicy to Retain. When a PVC is deleted, the backend disk is not deleted. This simplifies data restoration.
Regular backups
Dynamic volumes simplify resource provisioning, but they are not a substitute for data backup. For core services, use Backup Center to back up and restore data.
Enable at-rest encryption: For applications with sensitive data, configure encrypted: "true" in the StorageClass to encrypt disks.
Performance and cost optimization
Enable parallel attachment
By default, disk operations on a single node are serial. Use parallel disk attachment to accelerate pod startup.
Enable online volume expansion
Set allowVolumeExpansion: true in the StorageClass. This lets you expand disk volumes online if your storage needs grow.
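For example, after expansion is allowed, you grow a volume by editing the requested size of its PVC. The following sketch assumes a PVC named disk-pvc and a target size of 40Gi.

# Request a larger size; the disk and its file system are expanded online
kubectl patch pvc disk-pvc -p '{"spec":{"resources":{"requests":{"storage":"40Gi"}}}}'
# Verify the new capacity after the expansion completes
kubectl get pvc disk-pvc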
Configure storage monitoring and alerting
Configure alerts based on container storage monitoring to promptly detect volume abnormalities or performance bottlenecks.
Billing
Disks that are dynamically created using a StorageClass are billed on a pay-as-you-go basis. For more information, see Elastic Block Storage billing and Elastic Block Storage pricing.
FAQ
What do I do if a PVC is stuck in the Pending state when a pod with a mounted disk is scheduled to a virtual node?
This issue can occur if you use a StorageClass that does not support scheduling to virtual nodes. When you schedule a pod to a virtual node using specific labels or annotations, StorageClasses with the volumeBindingMode: WaitForFirstConsumer mode are not supported.
Reason:
The WaitForFirstConsumer mode relies on the kube-scheduler to select a physical node for the pod. This determines the pod's zone, which is then used to create the disk. However, some scheduling mechanisms for virtual nodes do not follow this process. This prevents the Container Storage Interface (CSI) from obtaining the zone information. As a result, the PV cannot be created, and the PVC remains in the Pending state.
If you encounter this issue, check whether the pod or its namespace contains any of the following configurations:
Label:
alibabacloud.com/eci: "true": Schedules the pod to run as an ECI pod.
alibabacloud.com/acs: "true": Schedules the pod to run as an ACS pod.
Node specification:
The pod directly specifies a node by using spec.nodeName, and the node name has the virtual-kubelet prefix.
Annotation:
k8s.aliyun.com/eci-vswitch: Specifies the vSwitch for the ECI pod.
k8s.aliyun.com/eci-fail-strategy: "fail-fast": Sets the failure handling policy for the ECI pod to fail-fast.
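A quick way to check is to print the labels and annotations of the pod and its namespace, and the node the pod specifies. The pod and namespace names below are placeholders.

# Print the labels and annotations of the pod
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.metadata.labels}{"\n"}{.metadata.annotations}{"\n"}'
# Print the node that the pod specifies, if any
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.nodeName}{"\n"}'
# Print the labels of the namespace
kubectl get namespace <namespace> --show-labels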
How do I mount a disk volume for a single pod or a single-replica deployment?
For simple applications that do not require multi-replica scaling or a stable network identity, manually create a PVC and mount it to a pod or Deployment to enable persistent storage.
The process is as follows: Select a StorageClass, create a PVC, and then mount the PVC in the application.
Create a PVC to request storage resources.
kubectl
Create a file named disk-pvc.yaml.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: disk-pvc
spec:
  # Access mode
  accessModes:
  - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      # Requested storage capacity, which is the disk size
      storage: 20Gi
  # Associate with the previously created StorageClass
  storageClassName: alicloud-disk-wait-for-first-consumer

The following table describes the parameters.
Parameter
Description
accessModes
The access mode of the volume. Valid values: ReadWriteOnce, ReadOnlyMany, and ReadWriteMany. The supported values depend on the multiAttach setting in the StorageClass and the volumeMode setting in the PVC. multiAttach specifies whether to enable multi-attach for disks, and its default value is false.
If multiAttach is false and volumeMode is set to any value, only ReadWriteOnce is supported.
If multiAttach is true and volumeMode is Filesystem, only ReadWriteOnce and ReadOnlyMany are supported.
If multiAttach is true and volumeMode is Block, all three access modes are supported.
Important: In this scenario, the access mode is typically ReadWriteOnce (RWO), which means the volume can be mounted by only one pod at a time. Therefore, the number of Deployment replicas cannot be greater than 1. If you try to scale out the Deployment, the new pods will be stuck in the Pending state because they cannot mount the disk that is already in use.
volumeMode
The mode of the persistent volume. Valid values:
Filesystem (default): The volume is formatted and mounted as a directory.
Block: The volume is provided to the pod as an unformatted block device.
storage
The requested storage capacity. The capacity range varies for different disk types. Make sure that the value of storage complies with the capacity limits of the disk type corresponding to the referenced StorageClass to prevent disk creation failures.
storageClassName
The StorageClass to bind.
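For example, a raw block PVC that can be attached by multiple pods sets volumeMode: Block and a shared access mode. This sketch only works with a StorageClass that sets multiAttach: "true"; the PVC and StorageClass names are examples.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Example name
  name: disk-pvc-block
spec:
  accessModes:
  - ReadWriteMany
  # Provide the disk to pods as an unformatted block device
  volumeMode: Block
  resources:
    requests:
      storage: 20Gi
  # Example StorageClass that sets multiAttach: "true"
  storageClassName: alicloud-disk-shared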
Create a PVC.
kubectl create -f disk-pvc.yaml

View the PVC.

kubectl get pvc

In the output, because the StorageClass uses the WaitForFirstConsumer mode, the PVC remains in the Pending state until the first pod that uses it is successfully scheduled.

NAME       STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS                            VOLUMEATTRIBUTESCLASS   AGE
disk-pvc   Pending                                      alicloud-disk-wait-for-first-consumer   <unset>                 14s
Console
In the navigation pane on the left of the cluster management page, choose .
On the Persistent Volume Claims page, click Create. Set PVC Type to Cloud Disk and configure the parameters as prompted.
Parameter
Description
Allocation Mode
Select Use StorageClass.
Existing StorageClass
A default or manually created StorageClass.
Capacity
The requested storage capacity. The capacity range varies for different disk types. Make sure that the value of storage complies with the capacity limits of the disk type corresponding to the referenced StorageClass to prevent disk creation failures.
Access Mode
Only ReadWriteOnce is supported. This means the volume can be mounted as read-write by a single pod.
After the PVC is created, you can view it on the Persistent Volume Claims page.
Mount the PVC in an application.
Create a file named disk-deployment.yaml.
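The following is a minimal sketch that is consistent with the verification steps below: the app=nginx-single label, a single replica, and the disk-pvc claim mounted at /data.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-single
spec:
  # A disk in ReadWriteOnce mode can be mounted by only one pod, so keep a single replica
  replicas: 1
  selector:
    matchLabels:
      app: nginx-single
  template:
    metadata:
      labels:
        app: nginx-single
    spec:
      containers:
      - name: nginx
        image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
        ports:
        - containerPort: 80
        volumeMounts:
        # Mount the disk volume to /data
        - name: disk-volume
          mountPath: /data
      volumes:
      - name: disk-volume
        persistentVolumeClaim:
          claimName: disk-pvc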
Run the following command to create the Deployment.

kubectl create -f disk-deployment.yaml
Verify the mount result.
Verify that the pod is running correctly.
kubectl get pods -l app=nginx-single

Log on to the pod and check whether the disk is mounted to the /data directory.

# Get the pod name
POD_NAME=$(kubectl get pods -l app=nginx-single -o jsonpath='{.items[0].metadata.name}')
# Run the df -h command
kubectl exec $POD_NAME -- df -h /data

The following output indicates that the 20 GiB disk is successfully mounted.

Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb         20G   24K   20G   1% /data
References
If you encounter issues when you use disk volumes, see Disk volume FAQ.
For optimization suggestions for multi-zone disk deployments, see Recommended configurations for high availability of disk volumes.
If your cluster still uses the deprecated FlexVolume component, migrate FlexVolume to CSI.
For more information about how to create a workload, see Create a StatefulSet and Create a Deployment.
If you no longer use a disk and want to stop billing for it, release the disk. Releasing a disk deletes the disk and its data, and stops billing.