When you run microservices across multiple regions, a single-region outage can disrupt all traffic unless you have a cross-region recovery mechanism in place. Service Mesh (ASM) addresses this with two geolocation-based traffic management capabilities:
Inter-region failover: Automatically reroutes traffic to a healthy region when services in the primary region become unavailable.
Inter-region traffic distribution: Splits traffic across regions based on configurable weight percentages for proactive load balancing.
This guide walks through both capabilities using the Bookinfo sample application deployed across two Container Service for Kubernetes (ACK) clusters in separate Virtual Private Clouds (VPCs), connected through Cloud Enterprise Network (CEN).
How it works
When you add multiple ACK clusters in different regions to a single ASM instance, ASM acts as the unified control plane for cross-region traffic management:
Client requests arrive at the ingress gateway in the primary region.
ASM evaluates geolocation-based load balancing rules to determine the target cluster.
For failover, outlier detection identifies unhealthy endpoints and redirects traffic to the backup region.
For traffic distribution, ASM splits traffic across regions according to configured weight percentages.
CEN provides the underlying network connectivity between VPCs in different regions, enabling pod-to-pod communication across clusters.
The following diagram illustrates the architecture used in this guide:
    China (Hangzhou)                      China (Shanghai)
    +--------------------------+         +--------------------------+
    | ack-hangzhou             | --CEN-- | ack-shanghai             |
    | (reviews v1, v3)         |         | (reviews v2)             |
    | vpc-hangzhou             |         | vpc-shanghai             |
    +--------------------------+         +--------------------------+
                 ^                                     ^
                 |                                     |
    ASM instance +------------ manages ---------------+
    (vpc-hangzhou2)

Failover path: When the reviews service in ack-hangzhou becomes unavailable, ASM reroutes all traffic to the reviews v2 service in ack-shanghai.
Traffic distribution path: ASM splits traffic between ack-hangzhou and ack-shanghai based on configured weights (for example, 90%/10%).
| Capability | Behavior | Use when |
|---|---|---|
| Inter-region failover | Reroutes all traffic to a backup region only when the primary region fails. | You run services actively in one region with a standby backup. |
| Inter-region traffic distribution | Splits traffic across regions based on fixed weight percentages at all times. | You run services actively across multiple regions to share production load. |
Prerequisites
Before you begin, make sure you have:
An Alibaba Cloud account with permissions to create VPC, ACK, ASM, and CEN resources
kubectl installed locally
Basic familiarity with Kubernetes and Istio traffic management concepts (VirtualService, DestinationRule, Gateway)
This guide creates all required infrastructure from scratch. If you already have multi-region ACK clusters connected through CEN, skip to Add clusters to ASM and create an ingress gateway.
Plan non-overlapping CIDR blocks
All VPCs, vSwitches, and cluster networks must use non-overlapping CIDR blocks to avoid routing conflicts when CEN connects the VPC networks. For detailed planning guidance, see Plan CIDR blocks for multiple clusters on the data plane.
The following tables show the example configurations used throughout this guide.
VPC configuration
| Object | VPC name | Region | IPv4 CIDR block |
|---|---|---|---|
| Cluster | vpc-hangzhou | cn-hangzhou | 20.0.0.0/8 |
| Cluster | vpc-shanghai | cn-shanghai | 21.0.0.0/8 |
| Service Mesh | vpc-hangzhou2 | cn-hangzhou | 192.168.0.0/16 |
vSwitch configuration
No two vSwitches can share the same CIDR block. Overlapping CIDR blocks cause route conflicts when CEN connects the VPC networks.
| Object | vSwitch name | VPC | IPv4 CIDR block |
|---|---|---|---|
| Cluster | vpc-hangzhou-switch-1 | vpc-hangzhou | 20.0.0.0/16 |
| Cluster | vpc-shanghai-switch-1 | vpc-shanghai | 21.0.0.0/16 |
| Service Mesh | vpc-hangzhou-switch-2 | vpc-hangzhou2 | 192.168.0.0/24 |
Pod and Service CIDR blocks
| Cluster name | Region | VPC | Pod CIDR | Service CIDR |
|---|---|---|---|---|
| ack-hangzhou | cn-hangzhou | vpc-hangzhou | 10.0.0.0/16 | 172.16.0.0/16 |
| ack-shanghai | cn-shanghai | vpc-shanghai | 10.1.0.0/16 | 172.17.0.0/16 |
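Before creating any resources, the plan can be checked for overlaps with a short script. The following is a minimal sketch in portable shell, not part of any Alibaba Cloud tooling: the function names (ip_to_int, cidr_range, cidrs_overlap, check_all) are illustrative, and the input is the VPC, Pod, and Service CIDR blocks from the tables above. vSwitch blocks are left out because each one is intentionally a subset of its VPC block.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Print "first last" integer bounds for a CIDR block such as 10.0.0.0/16.
cidr_range() (
  base=$(ip_to_int "${1%/*}")
  size=$(( 1 << (32 - ${1#*/}) ))
  base=$(( $base / $size * $size ))   # align to the block boundary
  echo "$base $(( $base + $size - 1 ))"
)

# Succeed (exit 0) when two CIDR blocks share at least one address.
cidrs_overlap() (
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
)

# Compare every unordered pair of the given blocks and report conflicts.
check_all() {
  conflicts=0
  while [ $# -gt 1 ]; do
    first=$1; shift
    for other in "$@"; do
      if cidrs_overlap "$first" "$other"; then
        echo "overlap: $first $other"
        conflicts=$((conflicts + 1))
      fi
    done
  done
  echo "conflicts: $conflicts"
}

# VPC, Pod, and Service CIDR blocks from the example configuration
check_all 20.0.0.0/8 21.0.0.0/8 192.168.0.0/16 \
  10.0.0.0/16 10.1.0.0/16 172.16.0.0/16 172.17.0.0/16
```

For the example configuration the script reports no conflicts. If you substitute your own blocks, any line starting with overlap: identifies a pair that needs to be replanned.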
Step 1: Create clusters in different regions
Create VPCs and vSwitches in the China (Hangzhou) and China (Shanghai) regions using the CIDR blocks listed above. See Create a VPC and a vSwitch and Create a vSwitch.
Create an ACK managed cluster in each region using the corresponding VPC. See Create an ACK managed cluster.
Create an ASM instance in the China (Hangzhou) region. See Create an ASM instance.
Step 2: Connect VPC networks through CEN
CEN connects the VPC networks between the two ACK clusters and between each cluster and the ASM instance, enabling cross-region pod-to-pod communication.
Create a CEN instance and transit routers
Log on to the CEN console and create a CEN instance. See Create a CEN instance.
On the CEN Instances page, click the CEN instance name. On the Basic Information tab, click Create Transit Router. Create two transit routers:
Region: China (Shanghai), Name: shanghai-router
Region: China (Hangzhou), Name: hangzhou-router
Attach VPCs to transit routers
Repeat the following steps for each transit router:
Click the transit router ID.
On the Intra-region Connections tab, click Create Connection.
Set Instance Type to Virtual Private Cloud (VPC) and select the VPC that corresponds to the transit router's region under Network Instance.
Keep other settings at their defaults and click OK.
Configure inter-region bandwidth
Click the name of a transit router, then click Create Connection.
In the Connection With Peer Network Instance dialog box, set Region to the local transit router's region and Peer Region to the remote region. For example, set Region to China (Hangzhou) and Peer Region to China (Shanghai). For parameter details, see Inter-region connections.
After creation, verify the connection appears on the Inter-region Connections tab.
Add security group rules
Allow cross-cluster pod traffic by adding the peer cluster's Pod CIDR block to each cluster's security group.
The following steps apply to clusters using the Flannel network plugin. For clusters using the Terway network plugin, use the cluster vSwitch CIDR block instead of the Pod CIDR block. Find the vSwitch CIDR in the IPv4 CIDR Block column on the vSwitch page of the VPC console.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
Get the Pod CIDR block for each cluster:
On the Clusters page, select the China (Shanghai) region. Click the ack-shanghai cluster name. On the Cluster Information page, click the Basic Information tab to find the Pod CIDR block.
Repeat for the ack-hangzhou cluster in the China (Hangzhou) region.
Add the peer cluster's Pod CIDR to each cluster's security group:
On the Cluster Information page of each cluster, click the Basic Information tab. Click the security group ID next to Control Plane Security Group.
On the Inbound tab, click Add Rule.
Set Protocol Type to All and Source to the Pod CIDR block of the peer cluster. Keep other defaults and click Save.
Verify connectivity by logging on to a node in each cluster and running ping against a node in the other cluster. See Log on to nodes.
Step 3: Add clusters to ASM and create an ingress gateway
Add the ack-hangzhou and ack-shanghai clusters to the ASM instance. See Add a cluster to an ASM instance.
Create an ingress gateway by applying the following YAML to the ASM instance:
    apiVersion: istio.alibabacloud.com/v1beta1
    kind: IstioGateway
    metadata:
      annotations:
        asm.alibabacloud.com/managed-by-asm: 'true'
      name: ingressgateway
      namespace: istio-system
    spec:
      gatewayType: ingress
      dnsPolicy: ClusterFirst
      externalTrafficPolicy: Local
      hostNetwork: false
      ports:
        - name: http
          port: 80
          protocol: TCP
          targetPort: 80
        - name: https
          port: 443
          protocol: TCP
          targetPort: 443
      replicaCount: 1
      resources:
        limits:
          cpu: '2'
          memory: 2G
        requests:
          cpu: 200m
          memory: 256Mi
      rollingMaxSurge: 100%
      rollingMaxUnavailable: 25%
      runAsRoot: true
      serviceType: LoadBalancer
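Later steps need the external IP address of this gateway. One way to look it up from the command line is sketched below. It rests on an assumption this guide does not confirm: that the gateway's Service is named istio-ingressgateway in the istio-system namespace. If your Service has a different name, adjust it accordingly. The KUBECTL variable and the ingress_gateway_ip function are illustrative names.

```shell
# KUBECTL can be overridden (for example with "echo") to preview the command.
KUBECTL=${KUBECTL:-kubectl}

# Print the external IP of the ingress gateway Service.
# Assumption: the Service is named istio-ingressgateway in istio-system.
ingress_gateway_ip() {
  "$KUBECTL" get service istio-ingressgateway \
    --namespace istio-system \
    --output 'jsonpath={.status.loadBalancer.ingress[0].ip}'
}
```

Run ingress_gateway_ip with kubectl pointed at the data-plane cluster that hosts the gateway (ack-hangzhou in this guide). An empty result usually means the load balancer has not been assigned an address yet.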
Step 4: Deploy the Bookinfo application
Deploy the application
Deploy Bookinfo in both the ack-hangzhou and ack-shanghai clusters:
    kubectl apply -f bookinfo.yaml

Create routing rules
Switch kubectl to the ASM instance context and apply the following routing rules.
Save the following YAML as asm.yaml:
Apply the routing rules:

    kubectl apply -f asm.yaml
Verify the deployment
Open http://<ingress-gateway-ip>/productpage in a browser and refresh the page 10 times. The Bookinfo application distributes requests across v1, v2, and v3 of the reviews service in roughly equal proportions (1:1:1).
Step 5: Configure inter-region failover
Inter-region failover reroutes traffic to a backup region when the local region's service becomes unavailable. The following steps demonstrate failover by scaling down the reviews service in one cluster and verifying that traffic shifts to the other.
Simulate a regional failure
Scale the reviews Deployment in the ack-hangzhou cluster to zero replicas:
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the ack-hangzhou cluster name. In the left-side navigation pane, choose Workloads > Deployments.
On the Deployments page, set Namespace to default. Click Scale in the Actions column for the reviews deployment.
Set Desired Number Of Pods to 0 and click OK.
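The same scale-down can be done with kubectl instead of the console. The sketch below assumes that the reviews Deployments carry the label app=reviews, as they do in the upstream Bookinfo manifest, and that your kubeconfig has a context for the cluster. The context name passed as the first argument is hypothetical, and scale_reviews is an illustrative helper name.

```shell
# KUBECTL can be overridden (for example with "echo") to preview the command.
KUBECTL=${KUBECTL:-kubectl}

# Scale every Deployment labeled app=reviews in the default namespace.
# $1: kubeconfig context name (hypothetical), $2: desired replica count
scale_reviews() {
  "$KUBECTL" scale deployment \
    --selector app=reviews \
    --replicas "$2" \
    --namespace default \
    --context "$1"
}
```

For example, scale_reviews ack-hangzhou 0 simulates the failure, and scale_reviews ack-hangzhou 1 restores the service after the test.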
Configure outlier detection
Add outlier detection to the reviews DestinationRule so that unhealthy endpoints are ejected and failover is triggered.
On the ASM instance details page, choose Traffic Management Center > DestinationRule in the left-side navigation pane.
Click Edit YAML in the Actions column of the reviews DestinationRule.
Add the following trafficPolicy block under spec and click OK:

    spec:
      # ... existing content ...
      trafficPolicy:
        connectionPool:
          http:
            maxRequestsPerConnection: 1
        outlierDetection:
          baseEjectionTime: 1m
          consecutive5xxErrors: 1
          interval: 1s

The following table explains each parameter and its role in failover behavior:

| Parameter | Value | Description |
|---|---|---|
| maxRequestsPerConnection | 1 | Limits each connection to a single request. This disables keep-alive, forcing new connections that can be routed to healthy endpoints. |
| baseEjectionTime | 1m | Keeps an unhealthy endpoint ejected for 1 minute before re-evaluating. |
| consecutive5xxErrors | 1 | Ejects an endpoint after a single 5xx error. |
| interval | 1s | Runs the ejection scan every 1 second. |
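In upstream Istio and Envoy, baseEjectionTime is a minimum: a host stays ejected for the base time multiplied by the number of times it has been ejected, up to a maximum that defaults to 300 seconds in Envoy. The arithmetic below is a small sketch of that rule, assuming ASM follows the upstream behavior and does not change the default maximum, which this guide does not state. The function name ejection_seconds is illustrative.

```shell
# Ejection duration in seconds for the Nth ejection of the same host:
# min(n * base, max).
# $1: ejection count, $2: base in seconds (default 60, matching 1m),
# $3: maximum in seconds (default 300, Envoy's documented default)
ejection_seconds() (
  n=$1 base=${2:-60} max=${3:-300}
  d=$(( n * base ))
  [ "$d" -gt "$max" ] && d=$max
  echo "$d"
)

for n in 1 2 3 6; do
  echo "ejection $n: $(ejection_seconds "$n")s"
done
```

Under these assumptions, an endpoint that keeps failing is held out for 1 minute the first time, 2 minutes the second time, and so on until the cap is reached. This matters when you restore the service after a test, because recovery may take longer than a single baseEjectionTime.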
Enable geolocation-based failover
ASM version 1.22.6.66 and later
On the ASM instance details page, choose ASM Instance > Base Information in the left-side navigation pane.
Click Configure a Geolocation-based Load Balancing next to Geolocation-based Load Balancing.
Click Specify priority rules for regions. Set Region in which the failure occurs to cn-shanghai and The region to which the traffic is preferentially routed to cn-hangzhou.
Click Add. Set Region in which the failure occurs to cn-hangzhou and The region to which the traffic is preferentially routed to cn-shanghai.
Click Save Configuration.
ASM versions earlier than 1.22.6.66
On the Base Information page, click Setting next to Geolocation-based Load Balancing.
In the Geolocation-based Failover dialog box, configure the failover mapping:
When Policy is cn-shanghai, set Failover to to cn-hangzhou.
When Policy is cn-hangzhou, set Failover to to cn-shanghai.
Click Confirm.
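For readers who know open-source Istio, the console settings above correspond conceptually to the failover list in Istio's localityLbSetting. The sketch below prints an illustrative DestinationRule that expresses the same two-way mapping. It is for orientation only: the resource name is hypothetical, and this guide does not state that ASM generates or accepts this exact resource, so prefer the console for the actual configuration.

```shell
# Print an illustrative DestinationRule using Istio's localityLbSetting.
failover_rule_yaml() {
  cat <<'EOF'
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews-locality-failover   # hypothetical name
  namespace: default
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:
          - from: cn-hangzhou
            to: cn-shanghai
          - from: cn-shanghai
            to: cn-hangzhou
    outlierDetection:   # upstream Istio requires this for locality failover
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 1m
EOF
}

failover_rule_yaml
```

The outlierDetection block is included because upstream Istio only applies locality failover when outlier detection is configured, which is why this guide adds it to the reviews DestinationRule before enabling failover.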
Verify failover
Run the following command to send 10 requests to the Bookinfo application and count responses from the v2 reviews service:
    for ((i=1;i<=10;i++)); do
      curl -s http://<ingress-gateway-ip>/productpage | grep full.stars
    done | wc -l

Replace <ingress-gateway-ip> with the IP address of the ingress gateway in the ack-hangzhou cluster. The gateway listens on port 80.
Expected output:

    20

Each request routed to the v2 reviews service returns two lines containing full stars. An output of 20 confirms that all 10 requests reached the v2 reviews service in the ack-shanghai cluster, which means failover is working.
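A total of 20 does not show whether every individual request was served by the backup region. The sketch below counts the matching lines per request instead, so a single response without stars stands out. The helper names (star_lines, probe, all_starred) and the FETCH override are illustrative and are not part of Bookinfo or ASM.

```shell
# FETCH defaults to curl; override it to try the helpers without a cluster.
FETCH=${FETCH:-curl -s}

# Count lines mentioning "full stars" in the HTML read from stdin.
star_lines() {
  grep -c 'full stars' || true
}

# Fetch the page at $1 a total of $2 times; print one count per request.
probe() {
  i=0
  while [ "$i" -lt "$2" ]; do
    $FETCH "$1" | star_lines
    i=$((i + 1))
  done
}

# Succeed only when there is at least one count on stdin and all equal 2.
all_starred() {
  awk '$1 != 2 { bad = 1 } END { exit (bad || NR == 0) }'
}
```

For example, probe http://<ingress-gateway-ip>/productpage 10 | all_starred && echo "failover OK" prints the message only if all 10 responses contained the two star lines.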
Step 6: Configure inter-region traffic distribution
Inter-region traffic distribution requires ASM version 1.22.6.66 or later.
Inter-region traffic distribution splits traffic across regions based on configurable weight percentages. Unlike failover, which activates only during outages, traffic distribution actively balances load across regions at all times.
The following table summarizes the intended traffic split configured below:
| Source region | Destination region | Traffic percentage |
|---|---|---|
| cn-hangzhou | cn-hangzhou (local) | 90% |
| cn-hangzhou | cn-shanghai | 10% |
Configure traffic distribution
Geolocation-based load balancing defaults to failover mode. If failover is currently enabled, click Disable in the upper-right corner of the configuration page before enabling traffic distribution.
Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.
On the Mesh Management page, click the ASM instance name. In the left-side navigation pane, choose ASM Instance > Base Information.
Click Configure a Geolocation-based Load Balancing next to Geolocation-based Load Balancing.
Click Configure a traffic distribution rule. Set Source to cn-hangzhou, Destination to cn-shanghai, and Traffic Percentage to 10%.
Click Save Configuration.
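In open-source Istio, a fixed split like this is expressed with the distribute list in localityLbSetting, where the weights for each source locality must add up to 100. The sketch below prints an illustrative DestinationRule for the 90%/10% split and sums the weights as a sanity check. The resource name is hypothetical, and this guide does not state that ASM generates this exact resource, so treat it as a conceptual equivalent rather than something to apply.

```shell
# Print an illustrative DestinationRule with a 90/10 locality split.
distribute_rule_yaml() {
  cat <<'EOF'
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews-locality-distribute   # hypothetical name
  namespace: default
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        distribute:
          - from: "cn-hangzhou/*"
            to:
              "cn-hangzhou/*": 90
              "cn-shanghai/*": 10
EOF
}

# Sum the destination weights; upstream Istio expects a total of 100.
weights_total() {
  distribute_rule_yaml | awk '/\/\*": [0-9]+$/ { sum += $NF } END { print sum + 0 }'
}

echo "weights total: $(weights_total)"
```

Setting only the cross-region percentage in the console, as in the step above, implies that the remaining 90% stays in the source region.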
Verify traffic distribution
Run the following command to send 10 requests and observe the distribution:
    for ((i=1;i<=10;i++)); do
      curl -s http://<ingress-gateway-ip>/productpage | grep full.stars
    done

Replace <ingress-gateway-ip> with the IP address of the ingress gateway in the ack-hangzhou cluster. The gateway listens on port 80.
Expected output:

    <!-- full stars: -->
    <!-- full stars: -->

Out of 10 requests, approximately 9 reach the v1 reviews service in ack-hangzhou (no full stars output), and 1 reaches the v2 reviews service in ack-shanghai (2 lines of full stars). This confirms a 90%/10% traffic split matching the configured weights.
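Ten requests are a small sample. If each request independently has a 10% chance of crossing regions, exactly one starred response appears only about 39% of the time, and none at all about 35% of the time. A larger sample gives a clearer picture. The sketch below summarizes per-request counts of "full stars" lines and follows this guide's premise that only responses served from ack-shanghai contain such lines. The function name split_summary is illustrative.

```shell
# Read one "full stars" line count per request from stdin and report
# what share of the requests returned star ratings.
split_summary() {
  awk '
    { total++; if ($1 > 0) starred++ }
    END {
      if (total == 0) { print "no samples"; exit 1 }
      printf "starred=%d plain=%d starred_pct=%.1f\n",
        starred, total - starred, 100 * starred / total
    }'
}

# Example with real traffic (200 requests; replace the placeholder first):
#   for i in $(seq 1 200); do
#     curl -s http://<ingress-gateway-ip>/productpage | grep -c 'full stars'
#   done | split_summary
```

With a few hundred requests, starred_pct should settle near the configured 10%. Exact agreement is not expected on any single run.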
FAQ
Why does adding a cluster to ASM fail after connecting VPC networks through CEN?
The most likely cause is a missing or misconfigured inter-region data transfer plan in CEN. Without proper inter-region bandwidth, the ASM control plane in one region cannot reach the data plane cluster in another region, even though intra-region VPC connectivity works.
To fix this, verify and reconfigure the inter-region connection settings in CEN as described in Step 2: Connect VPC networks through CEN. Make sure inter-region bandwidth is allocated between the two transit routers.