By default, Kubernetes distributes traffic across pods in round-robin fashion with no awareness of where pods run. In multi-zone deployments, requests regularly cross zone boundaries, adding latency and increasing cross-zone traffic costs.
Alibaba Cloud Service Mesh (ASM) provides locality-based routing built on Istio. Traffic is directed to the pods closest to the originating pod, keeping calls within the same zone whenever possible. This reduces latency and lowers cross-zone traffic fees.
ASM supports two locality routing modes:
| Mode | Behavior | Use case |
|---|---|---|
| Locality-prioritized | Routes traffic to pods in the same zone. If no healthy pod is available locally, traffic fails over to another zone. | Default choice when each zone has enough capacity for its local traffic |
| Locality-weighted | Distributes traffic across zones by percentage to prevent any single zone from being overloaded. | When traffic is unevenly distributed or you need to limit per-zone load |
How locality routing works
ASM uses the Istio locality model to determine where each pod runs. A locality is defined by three hierarchical levels, matched in order from broadest to most specific:
| Level | Description | Kubernetes label | Example |
|---|---|---|---|
| Region | A large geographic area | topology.kubernetes.io/region | cn-beijing |
| Zone | A set of compute resources within a region | topology.kubernetes.io/zone | cn-beijing-g |
| Sub-zone | A further subdivision of a zone for fine-grained control | topology.istio.io/subzone | Custom value |
When a client pod sends a request, the Envoy sidecar proxy evaluates the locality of available backend pods and routes traffic to the closest match. Localities are expressed in the format <region>/<zone>/<sub-zone>, for example cn-beijing/cn-beijing-g/*.
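The matching logic amounts to a level-by-level prefix comparison over region, zone, and sub-zone. The following is an illustrative Python sketch of that idea, not ASM's or Envoy's actual implementation; the function name is invented for this example:

```python
def locality_match_depth(client: str, backend: str) -> int:
    """Count how many leading locality levels (region/zone/sub-zone) match.

    Localities use the form "<region>/<zone>/<sub-zone>"; a "*" at a level
    is treated here as matching any value.
    """
    depth = 0
    for c, b in zip(client.split("/"), backend.split("/")):
        if c != b and c != "*" and b != "*":
            break
        depth += 1
    return depth

# A client in cn-beijing/cn-beijing-g prefers the backend with the deepest match.
client = "cn-beijing/cn-beijing-g/*"
backends = ["cn-beijing/cn-beijing-g/*", "cn-beijing/cn-beijing-h/*"]
best = max(backends, key=lambda b: locality_match_depth(client, b))
```

Under this model, the same-zone backend matches at all three levels while the cross-zone backend matches only on the region, so the same-zone backend wins.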
Why outlier detection is required
Both routing modes require a DestinationRule with outlier detection enabled. Outlier detection monitors pod health by tracking consecutive errors. When a pod exceeds the error threshold, it is temporarily ejected from the load balancing pool. Without outlier detection, the mesh cannot identify unhealthy local pods or trigger failover to another zone.
Prerequisites
Before you begin, make sure that you have:
- An ASM instance. For more information, see Create an ASM instance.
- A managed multi-zone Container Service for Kubernetes (ACK) cluster with nodes in at least two zones of the same region. This tutorial uses China (Beijing) Zone G (cn-beijing-g) and Zone H (cn-beijing-h). For more information, see Create an ACK managed cluster.
Deploy the sample application
This tutorial uses a two-zone setup with separate backend and client services to demonstrate and verify locality routing behavior.
Deploy the backend service
Deploy two versions of an nginx backend service in the ACK cluster, one per zone. Each version returns a different response string (v1 or v2), making it easy to identify which zone handled the request.
| Component | Name | Zone | Returns |
|---|---|---|---|
| ConfigMap | mynginx-configmap-v1 | - | v1 |
| Deployment | nginx-v1 | cn-beijing-g | v1 |
| ConfigMap | mynginx-configmap-v2 | - | v2 |
| Deployment | nginx-v2 | cn-beijing-h | v2 |
| Service | nginx (port 8000) | - | - |
- Create the ConfigMap and Deployment for version 1. The ConfigMap configures nginx to return v1 on every request, and the node selector pins the pod to Zone G.

  ```yaml
  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: mynginx-configmap-v1
    namespace: backend
  data:
    default.conf: |-
      server {
          listen 80;
          server_name localhost;
          location / {
              return 200 'v1\n';
          }
      }
  ---
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: nginx-v1
    namespace: backend
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: nginx
        version: v1
    template:
      metadata:
        labels:
          app: nginx
          version: v1
      spec:
        containers:
        - image: docker.io/nginx:1.15.9
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
          - containerPort: 80
          volumeMounts:
          - name: nginx-config
            mountPath: /etc/nginx/conf.d
            readOnly: true
        volumes:
        - name: nginx-config
          configMap:
            name: mynginx-configmap-v1
        nodeSelector:
          failure-domain.beta.kubernetes.io/zone: "cn-beijing-g"
  ```

- Create the ConfigMap and Deployment for version 2. This version returns v2 and is pinned to Zone H.

  ```yaml
  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: mynginx-configmap-v2
    namespace: backend
  data:
    default.conf: |-
      server {
          listen 80;
          server_name localhost;
          location / {
              return 200 'v2\n';
          }
      }
  ---
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: nginx-v2
    namespace: backend
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: nginx
        version: v2
    template:
      metadata:
        labels:
          app: nginx
          version: v2
      spec:
        containers:
        - image: docker.io/nginx:1.15.9
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
          - containerPort: 80
          volumeMounts:
          - name: nginx-config
            mountPath: /etc/nginx/conf.d
            readOnly: true
        volumes:
        - name: nginx-config
          configMap:
            name: mynginx-configmap-v2
        nodeSelector:
          failure-domain.beta.kubernetes.io/zone: "cn-beijing-h"
  ```

- Create the Service to expose both versions under a single endpoint.

  ```yaml
  apiVersion: v1
  kind: Service
  metadata:
    name: nginx
    namespace: backend
    labels:
      app: nginx
  spec:
    ports:
    - name: http
      port: 8000
      targetPort: 80
    selector:
      app: nginx
  ```
Deploy the client service
Deploy two versions of a sleep client service, one per zone. These pods send requests to the nginx backend, making it possible to verify routing behavior from different localities.
- Create a Deployment for each zone.

  ```yaml
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: sleep-cn-beijing-g
    namespace: backend
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: sleep
        version: v1
    template:
      metadata:
        labels:
          app: sleep
          version: v1
      spec:
        containers:
        - name: sleep
          image: tutum/curl
          command: ["/bin/sleep","infinity"]
          imagePullPolicy: IfNotPresent
        nodeSelector:
          failure-domain.beta.kubernetes.io/zone: "cn-beijing-g"
  ---
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: sleep-cn-beijing-h
    namespace: backend
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: sleep
        version: v2
    template:
      metadata:
        labels:
          app: sleep
          version: v2
      spec:
        containers:
        - name: sleep
          image: tutum/curl
          command: ["/bin/sleep","infinity"]
          imagePullPolicy: IfNotPresent
        nodeSelector:
          failure-domain.beta.kubernetes.io/zone: "cn-beijing-h"
  ```

- Create the Service for the client pods.

  ```yaml
  apiVersion: v1
  kind: Service
  metadata:
    name: sleep
    namespace: backend
    labels:
      app: sleep
  spec:
    ports:
    - name: http
      port: 80
    selector:
      app: sleep
  ```
Verify default round-robin routing
Before configuring locality routing, confirm that traffic is distributed evenly across both backend pods.
Run the following script to send 20 requests from each client pod:
```shell
echo "--- Zone G client (sleep v1) ---"
export SLEEP_ZONE_1=$(kubectl get pods -lapp=sleep,version=v1 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}; do
  kubectl exec -it $SLEEP_ZONE_1 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done

echo "--- Zone H client (sleep v2) ---"
export SLEEP_ZONE_2=$(kubectl get pods -lapp=sleep,version=v2 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}; do
  kubectl exec -it $SLEEP_ZONE_2 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done
```
Expected result: Both client pods return a roughly equal mix of v1 and v2 responses, confirming round-robin distribution with no locality awareness.
Configure locality-prioritized load balancing
Locality-prioritized load balancing routes each request to a backend pod in the same zone as the client. If no healthy pod exists in the local zone, traffic automatically fails over to another zone.
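The selection behavior described above can be sketched as: prefer healthy endpoints in the client's zone, and widen the scope only when none remain. This is an illustrative Python model with an invented helper name, not the actual Envoy priority algorithm:

```python
# Illustrative sketch of locality-prioritized endpoint selection:
# prefer healthy endpoints in the client's zone, fail over to other
# zones only when the local set is empty.
def pick_endpoints(client_zone, endpoints):
    """endpoints: list of (zone, healthy) tuples; returns the zones traffic goes to."""
    local = [z for z, healthy in endpoints if healthy and z == client_zone]
    if local:
        return local
    return [z for z, healthy in endpoints if healthy]  # cross-zone failover

# With a healthy local pod, a Zone G client stays in cn-beijing-g;
# if the local pod is ejected, traffic fails over to cn-beijing-h.
eps_healthy = [("cn-beijing-g", True), ("cn-beijing-h", True)]
eps_local_down = [("cn-beijing-g", False), ("cn-beijing-h", True)]
```

Outlier detection is what flips the `healthy` flag in this model: without it, the local pod is never marked unhealthy and failover never triggers.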
To enable this mode, create a VirtualService and a DestinationRule with outlier detection in the ASM console.
Create the VirtualService and DestinationRule
- Log on to the ASM console.
- In the left-side navigation pane, choose Service Mesh > Mesh Management.
- On the Mesh Management page, find the target ASM instance and click its name, or click Manage in the Actions column.
- Create the VirtualService:
  - In the left-side navigation pane of the ASM instance details page, choose Traffic Management Center > VirtualService. Click Create from YAML.
  - Select backend from the Namespace drop-down list, paste the following YAML, and click Create.

    ```yaml
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: nginx
      namespace: backend
    spec:
      hosts:
      - nginx
      http:
      - route:
        - destination:
            host: nginx
    ```

- Create the DestinationRule:
  - In the left-side navigation pane, choose Traffic Management Center > DestinationRule. Click Create from YAML.
  - Select backend from the Namespace drop-down list, paste the following YAML, and click Create.

    ```yaml
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: nginx
      namespace: backend
    spec:
      host: nginx
      trafficPolicy:
        outlierDetection:
          consecutiveErrors: 7
          interval: 30s
          baseEjectionTime: 30s
    ```

    The outlierDetection block is required for locality routing. It tracks consecutive errors per pod and temporarily ejects unhealthy pods:

    | Parameter | Value | Description |
    |---|---|---|
    | consecutiveErrors | 7 | Eject a pod after 7 consecutive errors |
    | interval | 30s | Run health checks every 30 seconds |
    | baseEjectionTime | 30s | Keep an ejected pod out of the pool for at least 30 seconds |

    Without outlier detection, the mesh cannot detect unhealthy local pods or trigger failover.
Verify prioritized routing
Run the verification script again:
```shell
echo "--- Zone G client (sleep v1) ---"
export SLEEP_ZONE_1=$(kubectl get pods -lapp=sleep,version=v1 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}; do
  kubectl exec -it $SLEEP_ZONE_1 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done

echo "--- Zone H client (sleep v2) ---"
export SLEEP_ZONE_2=$(kubectl get pods -lapp=sleep,version=v2 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}; do
  kubectl exec -it $SLEEP_ZONE_2 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done
```
Expected result: The Zone G client returns mostly v1 (the Zone G backend), and the Zone H client returns mostly v2 (the Zone H backend). Traffic stays in the same zone.
Configure locality-weighted load balancing
Locality-prioritized routing sends all traffic to the local zone, which can overload that zone when traffic is unevenly distributed. Locality-weighted load balancing addresses this by splitting traffic across zones by percentage.
The following table shows the distribution configured in this example:
| Source zone | To Zone G (cn-beijing-g) | To Zone H (cn-beijing-h) |
|---|---|---|
| Zone G | 80% | 20% |
| Zone H | 20% | 80% |
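As a sanity check on this table, the response mix a Zone G client should observe can be simulated with a simple weighted draw. This is an illustrative Python simulation using the weights from the table, not Envoy's actual weighted round-robin algorithm:

```python
import random

# Simulate the 80/20 split a Zone G client should observe under the
# weighted distribution above (illustrative only).
random.seed(0)
weights = {"cn-beijing-g": 80, "cn-beijing-h": 20}
zones = list(weights)
counts = {z: 0 for z in zones}
for _ in range(10_000):
    z = random.choices(zones, weights=[weights[z] for z in zones])[0]
    counts[z] += 1

share_g = counts["cn-beijing-g"] / 10_000  # close to 0.80
```

Over 10,000 simulated requests, about 80% land in the local zone, matching the roughly 80/20 mix of v1 and v2 responses you should see in the verification step.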
Create the VirtualService and DestinationRule
- Log on to the ASM console.
- In the left-side navigation pane, choose Service Mesh > Mesh Management.
- On the Mesh Management page, find the target ASM instance and click its name, or click Manage in the Actions column.
- Create the VirtualService:
  - In the left-side navigation pane of the ASM instance details page, choose Traffic Management Center > VirtualService. Click Create from YAML.
  - Select backend from the Namespace drop-down list, paste the following YAML, and click Create.

    ```yaml
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: nginx
      namespace: backend
    spec:
      hosts:
      - nginx
      http:
      - route:
        - destination:
            host: nginx
    ```

- Create the DestinationRule with weighted distribution:
  - In the left-side navigation pane, choose Traffic Management Center > DestinationRule. Click Create from YAML.
  - Select backend from the Namespace drop-down list, paste the following YAML, and click Create.

    ```yaml
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: nginx
      namespace: backend
    spec:
      host: nginx
      trafficPolicy:
        outlierDetection:
          consecutiveErrors: 7
          interval: 30s
          baseEjectionTime: 30s
        loadBalancer:
          localityLbSetting:
            enabled: true
            distribute:
            - from: cn-beijing/cn-beijing-g/*
              to:
                "cn-beijing/cn-beijing-g/*": 80
                "cn-beijing/cn-beijing-h/*": 20
            - from: cn-beijing/cn-beijing-h/*
              to:
                "cn-beijing/cn-beijing-g/*": 20
                "cn-beijing/cn-beijing-h/*": 80
    ```

    The localityLbSetting.distribute field uses the format <region>/<zone>/* to define traffic weights. Each from entry specifies a source zone, and the to map assigns the percentage of traffic sent to each destination zone. The percentages for each from entry must total 100.
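Because each from entry's weights must total 100, it can help to validate a distribute block before applying it. The following is a hypothetical Python check over a plain-dict representation of the configuration above; the helper name and data layout are invented for this example:

```python
# Hypothetical sanity check: every `from` entry's destination weights in a
# localityLbSetting.distribute block must total exactly 100.
distribute = [
    {"from": "cn-beijing/cn-beijing-g/*",
     "to": {"cn-beijing/cn-beijing-g/*": 80, "cn-beijing/cn-beijing-h/*": 20}},
    {"from": "cn-beijing/cn-beijing-h/*",
     "to": {"cn-beijing/cn-beijing-g/*": 20, "cn-beijing/cn-beijing-h/*": 80}},
]

def weights_valid(distribute) -> bool:
    """Return True only if each source zone's destination weights sum to 100."""
    return all(sum(entry["to"].values()) == 100 for entry in distribute)
```

A block where a source zone's weights sum to anything other than 100 would fail this check and be rejected by the mesh.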
Verify weighted routing
Run the verification script again:
```shell
echo "--- Zone G client (sleep v1) ---"
export SLEEP_ZONE_1=$(kubectl get pods -lapp=sleep,version=v1 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}; do
  kubectl exec -it $SLEEP_ZONE_1 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done

echo "--- Zone H client (sleep v2) ---"
export SLEEP_ZONE_2=$(kubectl get pods -lapp=sleep,version=v2 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}; do
  kubectl exec -it $SLEEP_ZONE_2 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done
```
Expected result: The Zone G client returns approximately 80% v1 and 20% v2 responses. The Zone H client returns approximately 80% v2 and 20% v1 responses.
References
- To keep traffic strictly within specific regions, create ACK clusters in different regions and enable in-cluster traffic retention. For more information, see Enable the feature of keeping traffic in-cluster in multi-cluster scenarios.