When a zone becomes unavailable, workloads concentrated in that zone go offline entirely. Topology spread constraints distribute Pods evenly across availability zones in an ACS cluster so that a single-zone failure affects only the Pods in that zone — the rest of your workload keeps running.
Prerequisites
Before you begin, ensure that you have:
- kube-scheduler installed, with a version that meets the following requirements:

  | ACS cluster version | Scheduler component version |
  |---|---|
  | 1.31 | v1.31.0-aliyun-1.2.0 and later |
  | 1.30 | v1.30.3-aliyun-1.1.1 and later |
  | 1.28 | v1.28.9-aliyun-1.1.0 and later |

- acs-virtual-node installed, version v2.12.0-acs.4 or later
Limitations
The following constraints apply only to Pods that meet all of these conditions:
- The Pod uses the High-Performance Network GPU (GPU-HPN) compute type.
- The Pod's `schedulerName` is `default-scheduler`.
- Enable Custom Tags And Scheduler For GPU-HPN Nodes is not selected in the scheduler component configuration.

New versions of the kube-scheduler component enable Enable Custom Tags And Scheduler For GPU-HPN Nodes by default. For details, see kube-scheduler.
For Pods that meet all three conditions, the following topology spread constraint fields behave differently from standard Kubernetes:
| Field | Description | Constraint |
|---|---|---|
| `labelSelector` | Selects Pods to include when counting the number of Pods in each topology domain. | Pods of other compute types (general-purpose, compute-optimized, GPU) are excluded from the count. |
| `matchLabelKeys` | A list of label keys used together with `labelSelector` to identify the set of Pods for distribution calculation. | No constraint. |
| `nodeAffinityPolicy` | Controls how `nodeAffinity` and `nodeSelector` are applied when calculating topology distribution skew. | Not supported. |
| `nodeTaintsPolicy` | Controls how node taints are applied when calculating topology distribution skew. | Not supported. |
For general-purpose, compute-optimized, and GPU compute types, the standard Kubernetes topology spread constraints behavior applies without these restrictions.
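To illustrate how `labelSelector` and `matchLabelKeys` combine, the following sketch (illustrative, not part of the original guide; the `app: spread-demo` label matches the demo later in this topic) spreads each Deployment revision independently by keying on `pod-template-hash`, a label the Deployment controller adds to Pods automatically:

```yaml
# Sketch: spread Pods per Deployment revision, not across all revisions at once.
# matchLabelKeys requires a cluster version that supports the field.
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: spread-demo
  matchLabelKeys:
  - pod-template-hash   # added automatically by the Deployment controller
```

Without `matchLabelKeys`, Pods from an old and a new ReplicaSet are counted together during a rolling update, which can distort the skew calculation.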
Distribute Pods across zones
- View the virtual nodes in the cluster.

  ```shell
  kubectl get node
  ```

  The output lists nodes by zone. For example:

  ```
  NAME                            STATUS   ROLES   AGE     VERSION
  virtual-kubelet-cn-hangzhou-i   Ready    agent   5h42m   v1.28.3-xx
  virtual-kubelet-cn-hangzhou-j   Ready    agent   5h42m   v1.28.3-xx
  ```

- Create the `dep-spread-demo.yaml` file with the following content.

  ```yaml
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: dep-spread-demo
    labels:
      app: spread-demo
  spec:
    replicas: 4
    selector:
      matchLabels:
        app: spread-demo
    template:
      metadata:
        labels:
          app: spread-demo
      spec:
        containers:
        - name: spread-demo
          image: registry-cn-beijing.ack.aliyuncs.com/acs/stress:v1.0.4
          command:
          - "sleep"
          - "infinity"
        # Spread Pods evenly across zones.
        # maxSkew: 1 means no zone can have more than one extra Pod compared to any other zone.
        topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: spread-demo
  ```

- Deploy the workload.

  ```shell
  kubectl apply -f dep-spread-demo.yaml
  ```

- Verify the Pod distribution across zones.

  ```shell
  kubectl get pod -o wide
  ```

  The output shows 4 Pods distributed across 2 zones, with 2 Pods per zone:

  ```
  NAME                               READY   STATUS    RESTARTS   AGE     IP               NODE                            NOMINATED NODE   READINESS GATES
  dep-spread-demo-7c656dbf5f-6twkc   1/1     Running   0          2m29s   192.168.xx.xxx   virtual-kubelet-cn-hangzhou-i   <none>           <none>
  dep-spread-demo-7c656dbf5f-cgxr8   1/1     Running   0          2m29s   192.168.xx.xxx   virtual-kubelet-cn-hangzhou-j   <none>           <none>
  dep-spread-demo-7c656dbf5f-f4fz9   1/1     Running   0          2m29s   192.168.xx.xxx   virtual-kubelet-cn-hangzhou-j   <none>           <none>
  dep-spread-demo-7c656dbf5f-kc6xf   1/1     Running   0          2m29s   192.168.xx.xxx   virtual-kubelet-cn-hangzhou-i   <none>           <none>
  ```
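With `whenUnsatisfiable: DoNotSchedule`, Pods that would violate the skew limit stay Pending when a zone temporarily lacks capacity. A hedged variation on the demo manifest (a sketch, not a requirement of this guide) relaxes the constraint to a soft preference so the scheduler still favors even spreading but never blocks placement:

```yaml
# Sketch: soft spreading. The scheduler prefers an even spread across zones
# but schedules the Pod anyway when the skew limit cannot be met.
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: ScheduleAnyway   # soft preference instead of a hard rule
  labelSelector:
    matchLabels:
      app: spread-demo
```

The trade-off: `ScheduleAnyway` keeps workloads running during capacity shortfalls, but a zone failure may then take down more than `maxSkew` would otherwise allow.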