
Container Service for Kubernetes: Use regional ESSDs for cross-zone disaster recovery

Last Updated: Mar 27, 2026

Regional Enterprise SSD (ESSD) disks automatically and synchronously replicate data across multiple zones within the same region, so your stateful applications keep running through a zone failure without any code changes. If a node or an entire zone becomes unavailable, Kubernetes reschedules the affected pods to another zone, where they remount the same volume and resume.

Before you begin

Review these topics before proceeding:

When to use regional ESSDs

| Capability | Standard disk | Regional ESSD |
| --- | --- | --- |
| Replication | Single zone | Synchronous across multiple zones |
| Zone failure handling | Disk stays confined to the failed zone | Pod is rescheduled to another zone automatically |
| Code changes required | | None |
| Billing model | | Pay-as-you-go only (as a Kubernetes volume) |

Use a regional ESSD when your stateful application must stay available across zone failures without manual intervention.

Prerequisites

Before you begin, ensure that you have:

Use regional ESSDs in ACK

Step 1: Confirm node support

List all nodes that support regional ESSDs:

kubectl get node -lnode.csi.alibabacloud.com/disktype.cloud_regional_disk_auto=available

To verify cross-zone failover, you need at least two supported nodes in different zones. The steps below use cn-beijing-i and cn-beijing-l as examples.
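To confirm that the supported nodes actually span multiple zones, you can print each node's zone label alongside the node list; a quick check using the same label selector:

```shell
# List regional-ESSD-capable nodes together with their zone to verify
# that at least two zones are represented.
kubectl get node \
  -l node.csi.alibabacloud.com/disktype.cloud_regional_disk_auto=available \
  -L topology.kubernetes.io/zone
```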

Step 2: Create a StorageClass

  1. Create sc-regional.yaml:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: alibabacloud-disk-regional
    parameters:
      type: cloud_regional_disk_auto
    provisioner: diskplugin.csi.alibabacloud.com
    reclaimPolicy: Delete
    volumeBindingMode: WaitForFirstConsumer
    allowVolumeExpansion: true

    Key parameters:

    | Parameter | Value | Purpose |
    | --- | --- | --- |
    | type | cloud_regional_disk_auto | Provisions a regional ESSD that replicates data synchronously across multiple zones |
    | volumeBindingMode | WaitForFirstConsumer | Delays volume creation until a pod is scheduled, so the disk is provisioned in the correct zone; without this, the disk may be locked to the wrong zone and block cross-zone failover |
    | reclaimPolicy | Delete | Deletes the underlying disk when the PersistentVolumeClaim (PVC) is deleted |
    | allowVolumeExpansion | true | Enables disk capacity expansion |
  2. Apply the StorageClass:

    kubectl apply -f sc-regional.yaml
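Before deploying a workload, you can confirm that the StorageClass is registered, for example:

```shell
# The output should show the CSI disk provisioner, the Delete reclaim
# policy, and the WaitForFirstConsumer binding mode.
kubectl get sc alibabacloud-disk-regional
```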

Step 3: Deploy a stateful application

  1. Create disk-test.yaml to define a StatefulSet that uses the StorageClass:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: disk-test
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
            ports:
            - containerPort: 80
            volumeMounts:
            - name: pvc-disk
              mountPath: /data
      volumeClaimTemplates:
      - metadata:
          name: pvc-disk
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: alibabacloud-disk-regional
          resources:
            requests:
              storage: 20Gi
  2. Deploy the application:

    kubectl apply -f disk-test.yaml

    When the pod is scheduled, the CSI driver provisions a 20 GiB regional ESSD, creates a PersistentVolume (PV), and mounts it to the pod. The disk then begins synchronously replicating data across multiple zones.
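To inspect the provisioned PV, you can resolve its name from the PVC and print its full spec; the node affinity it carries determines where the pod can be scheduled (the exact affinity terms depend on the CSI driver version):

```shell
# Look up the PV bound to the StatefulSet's PVC and print its spec,
# including capacity, disk ID, and node affinity.
PV_NAME=$(kubectl get pvc pvc-disk-disk-test-0 -o jsonpath='{.spec.volumeName}')
kubectl get pv "$PV_NAME" -o yaml
```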

Step 4: Verify the application is running

  1. Check the PVC and pod status:

    kubectl get pvc pvc-disk-disk-test-0
    kubectl get pod disk-test-0

    Expected output:

    NAME                   STATUS   VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS                 VOLUMEATTRIBUTESCLASS   AGE
    pvc-disk-disk-test-0   Bound    d-2ze5xxxxxxxxxxxxxxxx   20Gi       RWO            alibabacloud-disk-regional   <unset>                 14m
    NAME          READY   STATUS    RESTARTS   AGE
    disk-test-0   1/1     Running   0          14m

    The PVC is bound and the pod is running, confirming that the regional ESSD was provisioned and mounted successfully.

  2. Identify the node and zone where the pod is running:

    kubectl get node $(kubectl get pod disk-test-0 -ojsonpath='{.spec.nodeName}') -Ltopology.kubernetes.io/zone

    Expected output:

    NAME                       STATUS   ROLES    AGE     VERSION            ZONE
    cn-beijing.172.25.xxx.xx   Ready    <none>   6m32s   v1.32.1-aliyun.1   cn-beijing-i

    The pod is scheduled to cn-beijing-i.
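Optionally, write a marker file to the mounted volume before simulating the failure so you can later check that the data survived the cross-zone failover; a minimal sketch (the file name is arbitrary):

```shell
# Write a marker file to the regional ESSD mounted at /data in the pod.
kubectl exec disk-test-0 -- sh -c 'echo "written-before-failover" > /data/marker.txt'

# Confirm the file is on the volume.
kubectl exec disk-test-0 -- cat /data/marker.txt
```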

Step 5: Simulate a zone failure and verify cross-zone failover

Warning

This operation affects all pods running in the target zone. Do not perform this in a production environment.

  1. Taint all nodes in the pod's current zone to simulate a zone failure:

    kubectl taint node -ltopology.kubernetes.io/zone=cn-beijing-i testing=regional:NoExecute

    The Kubernetes Controller Manager (KCM) detects the taint, evicts the pod from the affected nodes, and reschedules it to a node in another zone.

  2. Check the pod's new status and location:

    kubectl get pod disk-test-0
    kubectl get node $(kubectl get pod disk-test-0 -ojsonpath='{.spec.nodeName}') -Ltopology.kubernetes.io/zone

    Expected output:

    NAME          READY   STATUS    RESTARTS   AGE
    disk-test-0   1/1     Running   0          20s
    NAME                       STATUS   ROLES    AGE   VERSION            ZONE
    cn-beijing.172.26.xxx.xx   Ready    <none>   32m   v1.32.1-aliyun.1   cn-beijing-l

    The pod is now running in cn-beijing-l. The regional ESSD is reattached automatically and the data remains intact; no manual synchronization is required.
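If you wrote test data to the volume before tainting the zone, reading it back now confirms that the replicated data is available in the new zone (this assumes a marker file such as /data/marker.txt was created earlier):

```shell
# Read the marker file after failover; its content should be unchanged.
kubectl exec disk-test-0 -- cat /data/marker.txt
```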

Step 6: Clean up

  1. Remove the taint to restore normal scheduling in cn-beijing-i:

    kubectl taint node -ltopology.kubernetes.io/zone=cn-beijing-i testing-
  2. Delete the test resources:

    kubectl delete sts disk-test
    kubectl delete pvc pvc-disk-disk-test-0
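Because the StorageClass sets reclaimPolicy: Delete, deleting the PVC also releases the underlying regional ESSD. You can confirm that no test resources remain, for example:

```shell
# Once cleanup completes, neither command should list the test volume.
kubectl get pvc pvc-disk-disk-test-0
kubectl get pv
```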