Container Service for Kubernetes: Disaster recovery scenario for multiple ACK clusters in different VPCs (based on CEN for VPC network connectivity)

Last Updated: Mar 26, 2026

Alibaba Cloud Service Mesh (ASM) provides two geolocation-based traffic management capabilities for multi-cluster deployments. Use inter-region failover to automatically reroute traffic to a healthy region when a service in one region becomes unavailable. Use inter-region traffic distribution to split traffic across regions according to configured weights, enabling multi-region load balancing.

This tutorial uses the Bookinfo sample application to demonstrate both capabilities across two ACK clusters in different regions.

How it works

ASM reads the region of each workload from cloud node labels and uses that information to make locality-aware routing decisions.
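To see the labels this relies on, you can list the locality labels on each cluster's nodes. The sketch below assumes that ASM follows upstream Istio in taking the region from the standard `topology.kubernetes.io/region` node label (and the zone from `topology.kubernetes.io/zone`), and that your kubeconfig contexts are named after the clusters; both are assumptions, not something this page states.

```shell
# Print every node of one kubeconfig context together with its region and
# zone labels. The context name passed in is a placeholder for your own.
# Usage: show_node_localities <context>
show_node_localities() {
  kubectl --context "$1" get nodes \
    -L topology.kubernetes.io/region \
    -L topology.kubernetes.io/zone
}
```

For example, `show_node_localities ack-hangzhou` should list cn-hangzhou in the region column if the nodes are labeled as expected.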

| Capability | Behavior | Configuration |
| --- | --- | --- |
| Inter-region failover | Redirects requests away from an unhealthy region according to priority rules | UI path depends on ASM instance version (see Step 5) |
| Inter-region traffic distribution | Splits requests across regions according to configured weights | Requires ASM version 1.22.6.66 or later |

Both capabilities are configured through Geolocation-based Load Balancing on the ASM instance's Base Information page. Failover and traffic distribution are mutually exclusive — you must disable one before enabling the other.

Prerequisites

Before you begin, ensure that you have:

  • An Alibaba Cloud account with permissions to create ACK clusters, ASM instances, VPCs, and Cloud Enterprise Network (CEN) instances

  • Familiarity with kubectl and kubeconfig management for multiple clusters

Network configuration

Plan the following CIDR blocks before creating any resources. All vSwitches in the same CEN must use non-overlapping CIDR blocks to avoid route conflicts.

For CIDR planning guidance, see Plan CIDR blocks for multiple clusters on the data plane.

VPC configuration

| Object | VPC name | Region | IPv4 CIDR block |
| --- | --- | --- | --- |
| Cluster | vpc-hangzhou | cn-hangzhou | 20.0.0.0/8 |
| Cluster | vpc-shanghai | cn-shanghai | 21.0.0.0/8 |
| Service Mesh | vpc-hangzhou2 | cn-hangzhou | 192.168.0.0/16 |

vSwitch configuration

| Object | vSwitch name | VPC | IPv4 CIDR block |
| --- | --- | --- | --- |
| Cluster | vpc-hangzhou-switch-1 | vpc-hangzhou | 20.0.0.0/16 |
| Cluster | vpc-shanghai-switch-1 | vpc-shanghai | 21.0.0.0/16 |
| Service Mesh | vpc-hangzhou-switch-2 | vpc-hangzhou2 | 192.168.0.0/24 |

Cluster Pod and service CIDRs

| Cluster | Region | VPC | Pod CIDR | Service CIDR |
| --- | --- | --- | --- | --- |
| ack-hangzhou | cn-hangzhou | vpc-hangzhou | 10.0.0.0/16 | 172.16.0.0/16 |
| ack-shanghai | cn-shanghai | vpc-shanghai | 10.1.0.0/16 | 172.17.0.0/16 |
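Before creating resources, it can help to check the plan for overlaps mechanically. The following is a minimal sketch in portable shell, not an official tool: the function names are made up for illustration, and it only compares IPv4 blocks written as address/prefix.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# Print the first and last address of a CIDR block as integers.
cidr_bounds() {
  ip=${1%/*}; len=${1#*/}
  base=$(ip_to_int "$ip")
  size=$(( 1 << (32 - len) ))
  start=$(( base / size * size ))
  echo "$start $(( start + size - 1 ))"
}

# Succeed (exit 0) if two CIDR blocks share at least one address.
cidrs_overlap() {
  set -- $(cidr_bounds "$1") $(cidr_bounds "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Print every overlapping pair among the arguments; fail if any pair overlaps.
check_cidrs() {
  rc=0
  while [ "$#" -gt 1 ]; do
    first=$1; shift
    for other in "$@"; do
      if cidrs_overlap "$first" "$other"; then
        echo "overlap: $first $other"
        rc=1
      fi
    done
  done
  return "$rc"
}

# VPC, Pod, and Service CIDRs from the tables above.
check_cidrs 20.0.0.0/8 21.0.0.0/8 192.168.0.0/16 \
  10.0.0.0/16 10.1.0.0/16 172.16.0.0/16 172.17.0.0/16 && echo "no overlaps"
```

With the planned values the call reports no overlaps. vSwitch blocks are left out of the call on purpose, because each vSwitch block is contained in its own VPC block and would be reported as an overlap with it.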

Step 1: Create clusters in different regions

  1. Create the VPCs and vSwitches in the Hangzhou and Shanghai regions using the values in the network configuration tables above. See Create a vSwitch and Create a VPC and a vSwitch.

  2. Create an ACK managed cluster in each region using the VPCs you created. See Create an ACK managed cluster.

  3. Create an ASM instance in the Hangzhou region using the vpc-hangzhou2 VPC. See Create an ASM instance.

Step 2: Connect VPC networks across regions using CEN

Connect the cluster VPCs to each other and to the Service Mesh VPC using CEN transit routers.

  1. Log on to the Cloud Enterprise Network (CEN) console and create a CEN instance. See Create a CEN instance.

  2. Create two transit routers, one in each region, using the following names:

    | Name | Region |
    | --- | --- |
    | hangzhou-router | China (Hangzhou) |
    | shanghai-router | China (Shanghai) |

    For each transit router:

    1. On the CEN Instances page, click the name of your CEN instance. Under the Basic Information tab, click Create Transit Router.

    2. Set the Region and Name, then click OK.

  3. Add VPCs to each transit router. Repeat the following steps for each VPC:

    1. Click the transit router ID.

    2. On the Intra-region Connections tab, click Create Connection.

    3. Set Instance Type to Virtual Private Cloud (VPC) and select the VPC for the transit router's region.

    4. Keep all other settings as default and click OK.

  4. Create an inter-region connection between the two transit routers and configure its bandwidth. After creation, the connection appears on the Inter-region Connections tab.

    1. Click the name of a transit router, then click Create Connection.

    2. In the Connection With Peer Network Instance dialog, set Region to the local region and Peer Region to the remote region.

    3. Configure the bandwidth and other parameters, then click OK. For details, see Inter-region connections.


  5. Add inbound security group rules to allow cross-cluster traffic:

    1. Log on to the ACK console. In the left navigation pane, click Clusters.

    2. Get the Pod CIDR for each cluster:

      • Select the Shanghai region, click ack-shanghai, then on the Cluster Information page, click the Basic Information tab to find the Pod CIDR.

      • Repeat for ack-hangzhou in the Hangzhou region.

    3. Add an inbound security group rule to each cluster:

      1. On the Cluster Information page, click Basic Information, then click the security group ID next to Control Plane Security Group.

      2. On the Inbound tab, click Add Rule.

      3. Set Protocol Type to All and Source to the CIDR block of the peer cluster, which depends on the network plugin: for Flannel, use the peer cluster's Pod CIDR; for Terway, use the peer cluster's vSwitch CIDR. To find the vSwitch CIDR, log on to the VPC console and check the IPv4 CIDR Block column on the vSwitch page. Keep all other settings as default and click Save.

    4. Log on to a node in each cluster and run ping to confirm connectivity to the other cluster. See Log on to nodes.
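The ping check in the last step can be wrapped in a small helper when there are several peer addresses to test. This is a sketch under assumptions: the function name is hypothetical, the addresses you pass are Pod or node IPs you look up yourself in the peer cluster, and the `-W` timeout flag is the one provided by the Linux iputils `ping`.

```shell
# Ping each address three times and report whether it answered.
# Returns non-zero if any address was unreachable.
# Usage: check_reachable <ip>...
check_reachable() {
  failed=0
  for ip in "$@"; do
    if ping -c 3 -W 2 "$ip" >/dev/null 2>&1; then
      echo "$ip reachable"
    else
      echo "$ip UNREACHABLE"
      failed=1
    fi
  done
  return "$failed"
}
```

Run it on a node in ack-hangzhou with addresses from ack-shanghai, then the other way around. An unreachable result usually points back at the security group rules or the inter-region connection configured above.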

Step 3: Add clusters to ASM and create an ingress gateway

  1. Add both the ack-hangzhou and ack-shanghai clusters to your ASM instance. See Add a cluster to an ASM instance.

  2. Create a hosted ingress gateway by applying the following manifest:

    apiVersion: istio.alibabacloud.com/v1beta1
    kind: IstioGateway
    metadata:
      annotations:
        asm.alibabacloud.com/managed-by-asm: 'true'
      name: ingressgateway
      namespace: istio-system
    spec:
      gatewayType: ingress
      dnsPolicy: ClusterFirst
      externalTrafficPolicy: Local
      hostNetwork: false
      ports:
      - name: http
        port: 80
        protocol: TCP
        targetPort: 80
      - name: https
        port: 443
        protocol: TCP
        targetPort: 443
      replicaCount: 1
      resources:
        limits:
          cpu: '2'
          memory: 2G
        requests:
          cpu: 200m
          memory: 256Mi
      rollingMaxSurge: 100%
      rollingMaxUnavailable: 25%
      runAsRoot: true
      serviceType: LoadBalancer

Step 4: Deploy the Bookinfo application

Important

The following steps require switching between the kubeconfig contexts of ack-hangzhou and ack-shanghai. Use kubectl config use-context to switch contexts, or a tool such as kubecm or kubectx to manage multiple kubeconfigs.
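As an alternative to switching the current context back and forth, kubectl's `--context` flag selects the cluster per command. The helper below is a minimal sketch; the function name is hypothetical, and it assumes the kubeconfig contexts are named ack-hangzhou and ack-shanghai, which may not match your setup.

```shell
# Apply one manifest to every listed kubeconfig context in turn,
# stopping at the first failure.
# Usage: apply_to_contexts <manifest> <context>...
apply_to_contexts() {
  manifest=$1; shift
  for ctx in "$@"; do
    echo "applying $manifest to $ctx"
    kubectl --context "$ctx" apply -f "$manifest" || return 1
  done
}
```

For example, `apply_to_contexts bookinfo.yaml ack-hangzhou ack-shanghai` applies the same file to both clusters without changing the current context.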

  1. Deploy the Bookinfo application in both clusters:

    kubectl apply -f bookinfo.yaml
  2. Connect kubectl to the ASM instance and apply routing rules. Create a file named asm.yaml with the following content:


    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: bookinfo-gateway
    spec:
      selector:
        istio: ingressgateway # use istio default controller
      servers:
      - port:
          number: 80
          name: http
          protocol: HTTP
        hosts:
        - "*"
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: bookinfo
    spec:
      hosts:
      - "*"
      gateways:
      - bookinfo-gateway
      http:
      - match:
        - uri:
            exact: /productpage
        - uri:
            prefix: /static
        - uri:
            exact: /login
        - uri:
            exact: /logout
        - uri:
            prefix: /api/v1/products
        route:
        - destination:
            host: productpage
            port:
              number: 9080
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: productpage
    spec:
      host: productpage
      subsets:
      - name: v1
        labels:
          version: v1
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: reviews
    spec:
      host: reviews
      subsets:
      - name: v1
        labels:
          version: v1
      - name: v2
        labels:
          version: v2
      - name: v3
        labels:
          version: v3
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: ratings
    spec:
      host: ratings
      subsets:
      - name: v1
        labels:
          version: v1
      - name: v2
        labels:
          version: v2
      - name: v2-mysql
        labels:
          version: v2-mysql
      - name: v2-mysql-vm
        labels:
          version: v2-mysql-vm
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: details
    spec:
      host: details
      subsets:
      - name: v1
        labels:
          version: v1
      - name: v2
        labels:
          version: v2

    Apply the routing rules:

    kubectl apply -f asm.yaml
  3. Verify the deployment:

    1. Get the ingress gateway IP address.

    2. Open http://<ingress-gateway-IP>/productpage in a browser and refresh the page 10 times. The Bookinfo application distributes requests across the v1, v2, and v3 versions of the reviews service. After 10 refreshes, the traffic ratio across the three versions should be close to 1:1:1.
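The browser check can also be approximated from the command line. The sketch below rests on an assumption about the page markup that this tutorial does not state: in the classic Bookinfo productpage, reviews v1 renders no stars, v2 wraps its stars in `<font color="black">`, and v3 in `<font color="red">`. If your productpage image renders differently, adjust the patterns. The function names are hypothetical.

```shell
# Classify one productpage response (read from stdin) by reviews version.
# ASSUMPTION: classic Bookinfo markup (black stars = v2, red stars = v3).
classify_reviews() {
  page=$(cat)
  case $page in
    *'color="red"'*)   echo v3 ;;
    *'color="black"'*) echo v2 ;;
    *)                 echo v1 ;;
  esac
}

# Request /productpage N times and count the responses per reviews version.
# Usage: tally_reviews <gateway-ip> [count]
tally_reviews() {
  n=${2:-30}
  i=0
  while [ "$i" -lt "$n" ]; do
    curl -s "http://$1/productpage" | classify_reviews
    i=$((i + 1))
  done | sort | uniq -c
}
```

Calling `tally_reviews <ingress-gateway-IP> 30` prints one count per version, which makes it easier to judge whether the ratio is roughly even than counting browser refreshes by hand.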

Step 5: Configure inter-region failover and traffic distribution

Configure inter-region failover

When the reviews service in ack-hangzhou becomes unavailable, ASM reroutes requests to the reviews service in ack-shanghai. This requires two configurations: outlier detection (so the proxy knows when an endpoint is unhealthy) and a geolocation failover policy (so the proxy knows where to redirect traffic).

Expected routing after failover is active:

| Source region | Normal routing | After failover |
| --- | --- | --- |
| cn-hangzhou | reviews (ack-hangzhou) | reviews (ack-shanghai) |
| cn-shanghai | reviews (ack-shanghai) | reviews (ack-hangzhou) |

Simulate a service outage

Scale the reviews Deployment in ack-hangzhou to zero replicas:

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. Click the ack-hangzhou cluster name. In the left pane, choose Workloads > Deployments.

  3. On the Deployments page, set Namespace to default, then click Scale in the Actions column for the reviews Deployment.

  4. Set Desired Number Of Pods to 0 and click OK.

Configure outlier detection

Outlier detection tells the sidecar proxies when an endpoint is unhealthy and removes it from the load balancing pool. The configuration serves three purposes:

  • Outlier detection: tells each proxy to eject an endpoint after detecting consecutive failures

  • Connection pool: forces each HTTP request to use a new connection, so failover triggers immediately after a failure is detected (for demonstration purposes only)

  • Ejection recovery: keeps the ejected endpoint out of rotation for a minimum duration before it is reconsidered

  1. On the ASM instance details page, choose Traffic Management Center > DestinationRule.

  2. In the Actions column for reviews, click Edit YAML.

  3. Add the following trafficPolicy block to the spec, then click OK:

    spec:
      # ... existing subsets configuration ...
      trafficPolicy:
        connectionPool:
          http:
            maxRequestsPerConnection: 1  # Each request uses a new connection;
                                         # this triggers ejection immediately after
                                         # a failure. For demonstration only.
        outlierDetection:
          interval: 1s        # Check for unhealthy endpoints every second
          consecutive5xxErrors: 1  # Eject after 1 consecutive 5xx error
          baseEjectionTime: 1m    # Keep the endpoint ejected for at least 1 minute

Enable the geolocation failover policy

The UI path depends on your ASM instance version.

ASM version 1.22.6.66 and later

  1. On the ASM instance details page, choose ASM Instance > Base Information.

  2. Click Configure a Geolocation-based Load Balancing next to Geolocation-based Load Balancing.

  3. Click Specify priority rules for regions.

  4. Set Region in which the failure occurs to cn-shanghai and The region to which the traffic is preferentially routed to cn-hangzhou. Click Add.

  5. Set Region in which the failure occurs to cn-hangzhou and The region to which the traffic is preferentially routed to cn-shanghai. Click Save Configuration.

ASM versions earlier than 1.22.6.66

  1. On the ASM instance details page, choose ASM Instance > Base Information.

  2. Click Setting next to Geolocation-based Load Balancing.

  3. In the Geolocation-based Failover dialog, configure both directions: for the policy from cn-shanghai, select cn-hangzhou in the Failover to field; for the policy from cn-hangzhou, select cn-shanghai. Click Confirm.

Verify failover

Send 10 requests to the ack-hangzhou ingress gateway and count how many reach the v2 reviews service (which runs in ack-shanghai):

# Replace the placeholder with the IP address of the ingress gateway in ack-hangzhou (port 80).
for ((i=1;i<=10;i++)); do
  curl -s "http://<ack-hangzhou-ingress-gateway-IP>/productpage" | grep "full stars"
done | wc -l

Expected output:

20

The output of 20 confirms that all 10 requests were routed to the v2 reviews service in ack-shanghai. Each request to the v2 version returns two lines containing full stars, so 10 requests produce 20 lines.

Configure inter-region traffic distribution

Important

Inter-region traffic distribution requires ASM instance version 1.22.6.66 or later.

Traffic distribution and failover are mutually exclusive. If you configured failover in the previous section, disable it before enabling traffic distribution.

Target routing after traffic distribution is configured:

| Source region | Destination | Traffic percentage |
| --- | --- | --- |
| cn-hangzhou | cn-hangzhou (local) | 90% |
| cn-hangzhou | cn-shanghai (remote) | 10% |

  1. Log on to the ASM console. In the left navigation pane, choose Service Mesh > Mesh Management.

  2. Click the name of your ASM instance. In the left navigation pane, choose ASM Instance > Base Information.

  3. Click Configure a Geolocation-based Load Balancing next to Geolocation-based Load Balancing.

  4. If failover mode is currently enabled, click Disable in the upper-right corner to switch to traffic distribution mode.

  5. Click Configure a traffic distribution rule. Set Source to cn-hangzhou, Destination to cn-shanghai, and Traffic Percentage to 10%. Click Save Configuration.

Verify traffic distribution

Send 10 requests and inspect which version of the reviews service responds:

# Replace the placeholder with the IP address of the ingress gateway in ack-hangzhou (port 80).
for ((i=1;i<=10;i++)); do
  curl -s "http://<ack-hangzhou-ingress-gateway-IP>/productpage" | grep "full stars"
done

Expected output:

<!-- full stars: -->
<!-- full stars: -->

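Two lines of full stars across 10 requests confirm that the distribution is working: nine requests went to the v1 reviews service in ack-hangzhou (no stars) and one went to the v2 reviews service in ack-shanghai (two full stars lines), matching the configured 90%/10% split. Because the percentage is a weight applied to individual requests, a sample of only 10 requests can produce a different count; send more requests if the result looks off.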
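A larger sample gives a steadier estimate of the split than 10 requests. The sketch below builds on the counting rule stated above (each response served by the remote v2 reviews service contributes two full stars lines); the function names and the default of 200 requests are illustrative choices, not part of the ASM tooling.

```shell
# Convert a count of "full stars" lines into the share of requests that were
# served by the remote reviews v2 (two such lines per v2 response).
# Usage: remote_share <star_lines> <total_requests>
remote_share() {
  awk -v lines="$1" -v total="$2" \
    'BEGIN { printf "%.1f%%\n", (lines / 2) / total * 100 }'
}

# Send N requests through the gateway and print the observed remote share.
# Usage: measure_remote_share <gateway-ip> [count]
measure_remote_share() {
  n=${2:-200}
  i=0
  lines=$(
    while [ "$i" -lt "$n" ]; do
      curl -s "http://$1/productpage"
      i=$((i + 1))
    done | grep -c 'full stars' || true
  )
  remote_share "$lines" "$n"
}
```

With the 10% rule in place, `measure_remote_share <ack-hangzhou-ingress-gateway-IP>` should print a value in the neighborhood of 10%, with some variation from run to run.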

FAQ

Why does adding a cluster to ASM fail even after connecting the VPCs through CEN?

When clusters are in different regions, the ASM control plane must communicate with the data plane clusters over the inter-region connection. If you have not purchased an inter-region data transfer plan or have not correctly configured inter-region bandwidth in CEN, this connection cannot be established and cluster addition fails.

Reconfigure the inter-region traffic settings in CEN. See Step 2: Connect VPC networks across regions using CEN.

What's next