Container Service for Kubernetes: Implement zone-disaster recovery with an ALB multi-cluster gateway in ACK One

Last Updated: Jun 25, 2026

Route cross-zone traffic by weight and fail over automatically when a cluster goes down.

How it works

A business architecture has three layers — access, application, and data. Zone-disaster recovery addresses each:

  • Access layer — Distributed Cloud Container Platform for Kubernetes (ACK One) deploys the ALB instance across multiple zones within a region by default, so the entry point is highly available.

  • Application layer — An ALB multi-cluster gateway distributes traffic across clusters in different AZs with weight-based Ingress rules. When a cluster fails, ALB detects unhealthy pods and reroutes traffic to the remaining cluster automatically.

  • Data layer — Data-layer recovery depends on middleware (for example, ApsaraDB RDS) and is not covered here.

Zone-disaster recovery compared to other approaches

Approach | Network latency | Scope of protection | Complexity
Zone-disaster recovery (this topic) | Low (same region) | Zone-level failures (power outage, network interruption, fire) | Low
Active geo-redundancy | Higher (cross-region) | Region-level disasters (flood, earthquake) | High
Three data centers across two regions | Low within a region, higher across regions | Zone-level and region-level combined | Highest

ALB multi-cluster gateway vs. DNS-based traffic distribution

DNS-based traffic distribution requires a separate load balancer IP address for each cluster. During failover, clients keep resolving cached DNS records until the cache expires, which causes temporary service interruptions. In contrast, the ALB multi-cluster gateway:

  • Uses a single IP address for the region, with multi-zone deployment by default

  • Supports Layer 7 request forwarding and weight-based routing

  • Fails over to backend pods in another cluster without client-side DNS cache delays

  • Lets you manage all traffic rules from the Fleet instance — no Ingress controller per cluster needed

Architecture

This example uses a web application (a Deployment and a Service) to demonstrate zone-disaster recovery in the China (Hong Kong) region:

  • Cluster 1 runs in AZ 1; Cluster 2 runs in AZ 2.

  • ACK One GitOps distributes web-demo to both clusters.

  • An AlbConfig on the Fleet instance creates an ALB multi-cluster gateway routing traffic across both clusters.

  • Ingress weight annotations control the traffic split. When one cluster is unhealthy, ALB automatically shifts traffic to the other.

  • Data synchronization (ApsaraDB RDS) depends on middleware and is not covered here.

Prerequisites

Ensure you have:

Step 1: Distribute the application to multiple clusters

Use GitOps to deploy web-demo to both associated clusters. You can also create a multi-cluster application or get started with application distribution.

  1. Log on to the ACK One console. In the left-side navigation pane, choose Fleet > Multi-cluster Applications.

  2. On the Multi-cluster Applications page, click the icon next to the Fleet instance name and select your Fleet instance.

  3. Choose Create Multi-cluster Application > GitOps to go to the Create Multi-cluster Application - GitOps page.

    If GitOps is not enabled, enable GitOps for the Fleet instance. To allow internet access, enable public access to Argo CD.
  4. On the Create from YAML tab, paste the following ApplicationSet and click OK.

    This ApplicationSet deploys web-demo to every associated cluster. The Quick Create tab provides a form-based alternative — changes sync automatically to the Create from YAML tab.
    apiVersion: argoproj.io/v1alpha1
    kind: ApplicationSet
    metadata:
      name: appset-web-demo
      namespace: argocd
    spec:
      template:
        metadata:
          name: '{{.metadata.annotations.cluster_id}}-web-demo'
          namespace: argocd
        spec:
          destination:
            name: '{{.name}}'
            namespace: gateway-demo
          project: default
          source:
            repoURL: https://github.com/AliyunContainerService/gitops-demo.git
            path: manifests/helm/web-demo
            targetRevision: main
            helm:
              valueFiles:
                - values.yaml
              parameters:
                - name: envCluster
                  value: '{{.metadata.annotations.cluster_name}}'
          syncPolicy:
            automated: {}
            syncOptions:
              - CreateNamespace=true
      generators:
        - clusters:
            selector:
              matchExpressions:
                - values:
                    - cluster
                  key: argocd.argoproj.io/secret-type
                  operator: In
                - values:
                    - in-cluster
                  key: name
                  operator: NotIn
      goTemplateOptions:
        - missingkey=error
      syncPolicy:
        preserveResourcesOnDeletion: false
      goTemplate: true

Step 2: Create the ALB multi-cluster gateway

Create an AlbConfig on the Fleet instance to provision an ALB instance and associate your clusters with it.

  1. Get two vSwitch IDs in the Fleet instance VPC.

  2. Create gateway.yaml with the following content. Replace ${vsw-id1} and ${vsw-id2} with the vSwitch IDs from the previous step, and ${cluster1} and ${cluster2} with the associated cluster IDs.

    apiVersion: alibabacloud.com/v1
    kind: AlbConfig
    metadata:
      name: ackone-gateway-demo
      annotations:
        # Cluster IDs to associate with the ALB instance
        alb.ingress.kubernetes.io/remote-clusters: ${cluster1},${cluster2}
    spec:
      config:
        name: one-alb-demo
        addressType: Internet
        addressAllocatedMode: Fixed
        zoneMappings:
        - vSwitchId: ${vsw-id1}
        - vSwitchId: ${vsw-id2}
      listeners:
      - port: 8001
        protocol: HTTP
    ---
    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: alb
    spec:
      controller: ingress.k8s.alibabacloud/alb
      parameters:
        apiGroup: alibabacloud.com
        kind: AlbConfig
        name: ackone-gateway-demo

    Key parameters:

    Parameter | Required | Description
    metadata.name | Yes | Name of the AlbConfig.
    metadata.annotations: alb.ingress.kubernetes.io/remote-clusters | Yes | Comma-separated cluster IDs. Clusters must already be associated with the Fleet instance.
    spec.config.name | No | Name of the ALB instance.
    spec.config.addressType | No | Network type: Internet (default, public-facing) or Intranet (VPC-internal). Internet-facing ALB instances require an elastic IP address (EIP) and incur instance and bandwidth fees. See Pay-as-you-go.
    spec.config.zoneMappings | Yes | vSwitch IDs. Specify at least two zones supported by ALB for high availability. See Regions and zones in which ALB is available and Create and manage a vSwitch.
    spec.listeners | No | Listener port and protocol. The example sets HTTP on port 8001. Keep this configuration, because ALB Ingresses require a listener before they can route traffic.
  3. Apply the configuration:

    kubectl apply -f gateway.yaml
  4. Wait 1–3 minutes, then verify the ALB multi-cluster gateway was created:

    kubectl get albconfig ackone-gateway-demo

    Expected output:

    NAME                  ALBID      DNSNAME                               PORT&PROTOCOL   CERTID   AGE
    ackone-gateway-demo   alb-xxxx   alb-xxxx.<regionid>.alb.aliyuncs.com                           4d9h

    Note the DNSNAME value — you need it in Step 4.

  5. Verify the clusters are connected to the gateway:

    kubectl get albconfig ackone-gateway-demo -ojsonpath='{.status.loadBalancer.subClusters}'

    The output lists the associated cluster IDs.
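The connectivity check in the last step can be scripted. The following is a minimal sketch, assuming the jsonpath output contains each cluster ID as a plain substring (the exact output format is not shown in this topic). The function name clusters_connected and the ID placeholders are hypothetical.

```shell
# Return 0 only if every expected cluster ID appears in the given text.
# Assumption: the subClusters output contains each ID as a plain substring.
clusters_connected() {
  output="$1"
  shift
  for id in "$@"; do
    case "$output" in
      *"$id"*) ;;                               # ID found, keep checking
      *) echo "missing: $id" >&2; return 1 ;;   # report the first missing ID
    esac
  done
  return 0
}

# Example usage on the Fleet instance (replace the placeholders with real IDs):
# out=$(kubectl get albconfig ackone-gateway-demo -ojsonpath='{.status.loadBalancer.subClusters}')
# clusters_connected "$out" <cluster1-id> <cluster2-id> && echo "all clusters connected"
```

If a cluster ID is reported as missing, confirm that the cluster is associated with the Fleet instance and listed in the remote-clusters annotation.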

Step 3: Configure Ingress routing rules

Create a namespace and an Ingress on the Fleet instance. Ingress weight annotations control how traffic splits across clusters.

  1. Create the gateway-demo namespace on the Fleet instance. This must match the namespace of the deployed application Services.

  2. Create ingress-demo.yaml with the following content. Replace ${cluster1-id} and ${cluster2-id} with the actual cluster IDs.

    The weights in alb.ingress.kubernetes.io/cluster-weight.* annotations must sum to 100.
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:
        alb.ingress.kubernetes.io/listen-ports: |
         [{"HTTP": 8001}]
        alb.ingress.kubernetes.io/cluster-weight.${cluster1-id}: "20"
        alb.ingress.kubernetes.io/cluster-weight.${cluster2-id}: "80"
      name: web-demo
      namespace: gateway-demo
    spec:
      ingressClassName: alb
      rules:
      - host: alb.ingress.alibaba.com
        http:
          paths:
          - path: /svc1
            pathType: Prefix
            backend:
              service:
                name: service1
                port:
                  number: 80
  3. Apply the Ingress:

    kubectl apply -f ingress-demo.yaml -n gateway-demo
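Because the cluster weights must sum to 100, it can help to check the manifest before applying it. The following is a rough sketch, assuming each weight annotation sits on its own line in the form shown above. The function name sum_cluster_weights is hypothetical and is not part of kubectl or ACK One.

```shell
# Sum the values of all cluster-weight annotations in an Ingress manifest.
# Assumption: each annotation is on its own line, for example
#   alb.ingress.kubernetes.io/cluster-weight.<cluster-id>: "20"
sum_cluster_weights() {
  awk -F': *' '
    /alb\.ingress\.kubernetes\.io\/cluster-weight\./ {
      gsub(/"/, "", $NF)   # strip the quotes around the weight
      total += $NF
    }
    END { print total + 0 }
  ' "$1"
}

# Example usage: apply the Ingress only when the weights add up to 100.
# if [ "$(sum_cluster_weights ingress-demo.yaml)" -eq 100 ]; then
#   kubectl apply -f ingress-demo.yaml -n gateway-demo
# else
#   echo "cluster weights do not sum to 100" >&2
# fi
```

For the example manifest, the two annotations carry 20 and 80, so the check passes.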

Step 4: Verify zone-disaster recovery

Verify weighted traffic distribution

Send 500 requests to confirm the 20/80 traffic split between clusters.

Replace alb-xxxx.<regionid>.alb.aliyuncs.com with the DNSNAME from Step 2.

for i in {1..500}; do curl -H "host: alb.ingress.alibaba.com" alb-xxxx.<regionid>.alb.aliyuncs.com:8001/svc1; done > res.txt

The results show approximately 20% of responses from Cluster 1 (poc-ack-1) and 80% from Cluster 2 (poc-ack-2).
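To read the split from res.txt without counting by hand, the responses can be tallied per cluster. This is a minimal sketch, assuming each response body contains the name of the cluster that served it (poc-ack-1 or poc-ack-2 in this example). The function name tally_clusters is hypothetical.

```shell
# Count how many responses mention each cluster name and print the share.
# Assumption: every response body contains the name of the serving cluster.
tally_clusters() {
  file="$1"
  shift
  total=0
  for name in "$@"; do
    n=$(grep -oF -- "$name" "$file" | wc -l | tr -d ' ')
    total=$((total + n))
  done
  if [ "$total" -eq 0 ]; then
    echo "no responses matched the given cluster names" >&2
    return 1
  fi
  for name in "$@"; do
    n=$(grep -oF -- "$name" "$file" | wc -l | tr -d ' ')
    echo "$name $n $((n * 100 / total))%"   # integer percentage, rounded down
  done
}

# Example usage:
# tally_clusters res.txt poc-ack-1 poc-ack-2
```

With only 500 samples, the measured shares will vary around the configured weights rather than match them exactly.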

Simulate a cluster failure and verify automatic failover

  1. Start a continuous request stream:

    for i in {1..500}; do curl -H "host: alb.ingress.alibaba.com" alb-xxxx.<regionid>.alb.aliyuncs.com:8001/svc1; sleep 1; done
  2. While requests are running, scale the application Deployment in Cluster 2 to 0 replicas.

  3. Observe the output: after the change takes effect, all traffic shifts to Cluster 1 automatically.


ALB detects no healthy pods in Cluster 2 and routes all requests to Cluster 1.
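The scale-down in step 2 can be done with kubectl scale. The following is a sketch under assumptions: the kubeconfig context name for Cluster 2 and the Deployment name web-demo are hypothetical (list the real name with kubectl get deployments -n gateway-demo). The helper scale_deployment prints the command by default and runs it only when DRY_RUN=0.

```shell
# Print, or run when DRY_RUN=0, the kubectl command that scales a Deployment
# in the gateway-demo namespace.
# $1: kubeconfig context of the target cluster (hypothetical name)
# $2: Deployment name (hypothetical, check it in your cluster)
# $3: desired replica count
scale_deployment() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "kubectl --context $1 -n gateway-demo scale deployment $2 --replicas=$3"
  else
    kubectl --context "$1" -n gateway-demo scale deployment "$2" --replicas="$3"
  fi
}

# Example usage: simulate the failure of Cluster 2.
# DRY_RUN=0 scale_deployment <cluster2-context> web-demo 0
```

Printing the command first makes it easy to confirm the target cluster before removing all pods from it.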

Next steps