Regional Enterprise SSD (ESSD) disks automatically and synchronously replicate data across multiple zones within the same region, so your stateful applications keep running when a zone fails — without any code changes. If a node or an entire zone becomes unavailable, Kubernetes reschedules the affected pods to another zone, where they remount the same volume and resume immediately.
Before you begin
Review these topics before proceeding:
- Disk overview — introduction to regional ESSDs
- Limits — supported regions and other restrictions
- Elastic Block Storage billing — regional ESSDs are billed by disk capacity on a pay-as-you-go basis when used as Kubernetes volumes
When to use regional ESSDs
| | Standard disk | Regional ESSD |
|---|---|---|
| Replication | Single zone | Synchronous across multiple zones |
| Zone failure handling | — | Pod is rescheduled to another zone automatically |
| Code changes required | — | None |
| Billing model | — | Pay-as-you-go only (as Kubernetes volume) |
Use a regional ESSD when your stateful application must stay available across zone failures without manual intervention.
Prerequisites
Before you begin, ensure that you have:
- An ACK managed cluster running Kubernetes 1.26 or later
- csi-plugin and csi-provisioner at version 1.33.4 or later
Use regional ESSDs in ACK
Step 1: Confirm node support
List all nodes that support regional ESSDs:
```shell
kubectl get node -l node.csi.alibabacloud.com/disktype.cloud_regional_disk_auto=available
```
To verify cross-zone failover, you need at least two supported nodes in different zones. The steps below use cn-beijing-i and cn-beijing-l as examples.
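The two-zone requirement can be checked mechanically. The following sketch counts distinct zones in a list of node zone labels; the zone names below are sample data, and in a real cluster you would collect the list from the last column of `kubectl get node -l node.csi.alibabacloud.com/disktype.cloud_regional_disk_auto=available -L topology.kubernetes.io/zone --no-headers`:

```shell
# Sketch: verify that supported nodes span at least two zones.
# The zone list below is sample data; in practice, take the ZONE column
# from the kubectl command shown above (e.g. with `awk '{print $NF}'`).
zones="cn-beijing-i
cn-beijing-l
cn-beijing-i"

# Count distinct zones (grep -c . gives an unpadded count, unlike wc -l).
distinct=$(printf '%s\n' "$zones" | sort -u | grep -c .)

if [ "$distinct" -ge 2 ]; then
  echo "OK: supported nodes span $distinct zones; cross-zone failover can be tested"
else
  echo "Need supported nodes in at least 2 zones"
fi
```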
Step 2: Create a StorageClass
- Create `sc-regional.yaml`:

  ```yaml
  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: alibabacloud-disk-regional
  parameters:
    type: cloud_regional_disk_auto
  provisioner: diskplugin.csi.alibabacloud.com
  reclaimPolicy: Delete
  volumeBindingMode: WaitForFirstConsumer
  allowVolumeExpansion: true
  ```

  Key parameters:

  | Parameter | Value | Purpose |
  |---|---|---|
  | `type` | `cloud_regional_disk_auto` | Provisions a regional ESSD that replicates data synchronously across multiple zones |
  | `volumeBindingMode` | `WaitForFirstConsumer` | Delays volume creation until a pod is scheduled, so the disk is provisioned in the correct zone. Without this, the disk may be locked to the wrong zone and block cross-zone failover |
  | `reclaimPolicy` | `Delete` | Deletes the underlying disk when the PersistentVolumeClaim (PVC) is deleted |
  | `allowVolumeExpansion` | `true` | Enables disk capacity expansion |

- Apply the StorageClass:

  ```shell
  kubectl apply -f sc-regional.yaml
  ```
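The walkthrough below consumes this StorageClass through a StatefulSet's `volumeClaimTemplates`, but Deployments or standalone pods can use it through an ordinary PVC. A minimal sketch — the claim name `pvc-regional-demo` is illustrative, not part of the walkthrough:

```yaml
# Illustrative standalone claim; only storageClassName must match the
# StorageClass created above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-regional-demo
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: alibabacloud-disk-regional
  resources:
    requests:
      storage: 20Gi
```

Because the StorageClass uses `WaitForFirstConsumer`, such a PVC stays `Pending` until a pod that references it is scheduled.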
Step 3: Deploy a stateful application
- Create `disk-test.yaml` to define a StatefulSet that uses the StorageClass:

  ```yaml
  apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    name: disk-test
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: nginx
    template:
      metadata:
        labels:
          app: nginx
      spec:
        containers:
          - name: nginx
            image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
            ports:
              - containerPort: 80
            volumeMounts:
              - name: pvc-disk
                mountPath: /data
    volumeClaimTemplates:
      - metadata:
          name: pvc-disk
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: alibabacloud-disk-regional
          resources:
            requests:
              storage: 20Gi
  ```

- Deploy the application:

  ```shell
  kubectl apply -f disk-test.yaml
  ```

  When the pod is scheduled, the CSI driver provisions a 20 GiB regional ESSD, creates a PersistentVolume (PV), and mounts it to the pod. The disk then begins synchronously replicating data across multiple zones.
Step 4: Verify the application is running
- Check the PVC and pod status:

  ```shell
  kubectl get pvc pvc-disk-disk-test-0
  kubectl get pod disk-test-0
  ```

  Expected output:

  ```
  NAME                   STATUS   VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS                 VOLUMEATTRIBUTESCLASS   AGE
  pvc-disk-disk-test-0   Bound    d-2ze5xxxxxxxxxxxxxxxx   20Gi       RWO            alibabacloud-disk-regional   <unset>                 14m

  NAME          READY   STATUS    RESTARTS   AGE
  disk-test-0   1/1     Running   0          14m
  ```

  The PVC is bound and the pod is running, confirming that the regional ESSD was provisioned and mounted successfully.
- Identify the node and zone where the pod is running:

  ```shell
  kubectl get node $(kubectl get pod disk-test-0 -o jsonpath='{.spec.nodeName}') -L topology.kubernetes.io/zone
  ```

  Expected output:

  ```
  NAME                       STATUS   ROLES    AGE     VERSION            ZONE
  cn-beijing.172.25.xxx.xx   Ready    <none>   6m32s   v1.32.1-aliyun.1   cn-beijing-i
  ```

  The pod is scheduled to `cn-beijing-i`.
Step 5: Simulate a zone failure and verify cross-zone failover
This operation affects all pods running in the target zone. Do not perform this in a production environment.
- Taint all nodes in the pod's current zone to simulate a zone failure:

  ```shell
  kubectl taint node -l topology.kubernetes.io/zone=cn-beijing-i testing=regional:NoExecute
  ```

  The Kubernetes Controller Manager (KCM) detects the `NoExecute` taint and evicts the pod from the affected nodes, and the scheduler then places it on a node in another zone.
- Check the pod's new status and location:

  ```shell
  kubectl get pod disk-test-0
  kubectl get node $(kubectl get pod disk-test-0 -o jsonpath='{.spec.nodeName}') -L topology.kubernetes.io/zone
  ```

  Expected output:

  ```
  NAME          READY   STATUS    RESTARTS   AGE
  disk-test-0   1/1     Running   0          20s

  NAME                       STATUS   ROLES    AGE   VERSION            ZONE
  cn-beijing.172.26.xxx.xx   Ready    <none>   32m   v1.32.1-aliyun.1   cn-beijing-l
  ```

  The pod is now running in `cn-beijing-l`. The regional ESSD is reattached automatically and the data remains intact, with no manual synchronization required.
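How quickly the pod leaves the tainted nodes depends on its tolerations: the test pod has no toleration for the custom `testing=regional:NoExecute` taint, so eviction starts immediately. As a sketch, a pod spec could instead delay eviction with a toleration like the following (all values illustrative, not part of the walkthrough):

```yaml
# Illustrative toleration for the taint used above; tolerationSeconds
# lets the pod remain on the tainted node for a bounded time before
# the NoExecute eviction takes effect.
tolerations:
  - key: "testing"
    operator: "Equal"
    value: "regional"
    effect: "NoExecute"
    tolerationSeconds: 60   # evicted at most 60s after the taint appears
```

Omitting `tolerationSeconds` would let the pod tolerate the taint indefinitely, which would prevent the failover you are trying to observe.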
Step 6: Clean up
- Remove the taint to restore normal scheduling in `cn-beijing-i`:

  ```shell
  kubectl taint node -l topology.kubernetes.io/zone=cn-beijing-i testing-
  ```

- Delete the test resources, including the StorageClass created in Step 2:

  ```shell
  kubectl delete sts disk-test
  kubectl delete pvc pvc-disk-disk-test-0
  kubectl delete sc alibabacloud-disk-regional
  ```

  Because the StorageClass sets `reclaimPolicy: Delete`, deleting the PVC also deletes the underlying regional ESSD.