By default, Kubernetes distributes traffic across pods in round-robin mode, regardless of where each pod runs. In a multi-zone cluster, this means requests frequently cross zone boundaries, adding latency and incurring cross-zone network charges.
Alibaba Cloud Service Mesh (ASM) provides locality-based routing to solve this problem. Built on Istio, ASM routes traffic to the backend pod closest to the caller, keeping most service-to-service calls within the same availability zone.
ASM supports two locality routing modes:
| Mode | Behavior | When to use |
|---|---|---|
| Locality-prioritized | Routes traffic to the same zone first. If no healthy pod is available locally, traffic fails over to another zone. | Each zone has enough backend capacity to handle local traffic. |
| Locality-weighted | Splits traffic across zones by percentage. | You need to prevent a single zone from becoming overloaded, or you want fine-grained control over cross-zone distribution. |
How locality routing works
ASM determines each pod's locality from Kubernetes node labels that identify the region and zone:
| Level | Kubernetes node label | Example |
|---|---|---|
| Region | topology.kubernetes.io/region | cn-beijing |
| Zone | topology.kubernetes.io/zone | cn-beijing-g |
A locality is defined as a <region>/<zone>/<sub-zone> triplet. When a sidecar proxy intercepts an outbound request, it compares the caller's locality with the backend endpoints and applies the routing mode defined in the DestinationRule.
Note: Two pods in zones with the same name but in different regions are not considered local to each other. Locality matching requires both region and zone to match.
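The note above can be sketched as a comparison on the full locality string, which is why zone names alone are never compared in isolation (hypothetical values; `cn-hangzhou` stands in for a second region):

```shell
# Locality matching compares the full <region>/<zone> string, not the zone alone.
caller="cn-beijing/cn-beijing-g"
backend_local="cn-beijing/cn-beijing-g"
backend_other_region="cn-hangzhou/cn-beijing-g"   # same zone name, different region

[ "$caller" = "$backend_local" ] && echo "local"              # prints "local"
[ "$caller" = "$backend_other_region" ] || echo "not local"   # prints "not local"
```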
Locality-based routing requires two Istio resources:
- A VirtualService that defines the routing target.
- A DestinationRule with an outlier detection policy. Outlier detection monitors endpoint health and triggers failover when a zone's endpoints become unhealthy. Without it, the sidecar proxy cannot detect unhealthy endpoints, so locality routing does not take effect.
Prerequisites
Before you begin, make sure that you have:
- An ASM instance
- A managed multi-zone ACK cluster with nodes in at least two zones of the same region
This tutorial uses Zone G and Zone H of the China (Beijing) region as an example.
Deploy the sample backend service
Deploy two versions of an nginx backend service in the ACK cluster. Each version runs in a different zone, making the effect of locality routing visible in test results.
- Version 1 (v1): runs in Zone G and returns `v1` in the response body.
- Version 2 (v2): runs in Zone H and returns `v2` in the response body.
Step 1: Create the ConfigMaps
Create ConfigMaps that define the nginx response for each version.
ConfigMap for v1:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mynginx-configmap-v1
  namespace: backend
data:
  default.conf: |-
    server {
      listen 80;
      server_name localhost;
      location / {
        return 200 'v1\n';
      }
    }
```

ConfigMap for v2:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mynginx-configmap-v2
  namespace: backend
data:
  default.conf: |-
    server {
      listen 80;
      server_name localhost;
      location / {
        return 200 'v2\n';
      }
    }
```

Step 2: Create the Deployments
Create a Deployment for each version. Node selectors pin each pod to a specific zone.
Deployment for v1 (Zone G):
Deployment for v2 (Zone H):
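The Deployment manifests are not shown in full on this page. The following is a minimal sketch of the v1 Deployment, assuming the stock `nginx` image with the v1 ConfigMap mounted over the default site configuration; the Deployment name and labels are illustrative. The v2 Deployment would be identical except for the name, the ConfigMap, and the zone in the node selector (`cn-beijing-h`).

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-v1            # illustrative name
  namespace: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
      version: v1
  template:
    metadata:
      labels:
        app: nginx
        version: v1
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - name: nginx-conf
          mountPath: /etc/nginx/conf.d   # replaces the default site config
      volumes:
      - name: nginx-conf
        configMap:
          name: mynginx-configmap-v1
      nodeSelector:
        failure-domain.beta.kubernetes.io/zone: "cn-beijing-g"
```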
Step 3: Create the Service
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: backend
  labels:
    app: nginx
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: nginx
```

Deploy the sample client service
Deploy two versions of a sleep client service, one in each zone, to test traffic routing from both zones.
Step 1: Create the Deployments
Client in Zone G:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleep-cn-beijing-g
  namespace: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sleep
      version: v1
  template:
    metadata:
      labels:
        app: sleep
        version: v1
    spec:
      containers:
      - name: sleep
        image: tutum/curl
        command: ["/bin/sleep","infinity"]
        imagePullPolicy: IfNotPresent
      nodeSelector:
        failure-domain.beta.kubernetes.io/zone: "cn-beijing-g"
```

Client in Zone H:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleep-cn-beijing-h
  namespace: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sleep
      version: v2
  template:
    metadata:
      labels:
        app: sleep
        version: v2
    spec:
      containers:
      - name: sleep
        image: tutum/curl
        command: ["/bin/sleep","infinity"]
        imagePullPolicy: IfNotPresent
      nodeSelector:
        failure-domain.beta.kubernetes.io/zone: "cn-beijing-h"
```

Step 2: Create the Service
```yaml
apiVersion: v1
kind: Service
metadata:
  name: sleep
  namespace: backend
  labels:
    app: sleep
spec:
  ports:
  - name: http
    port: 80
  selector:
    app: sleep
```

Step 3: Verify default round-robin behavior
Before enabling locality routing, confirm that traffic distributes evenly across zones. Run the following script to send 20 requests from each client pod:
```shell
echo "entering into the 1st container"
export SLEEP_ZONE_1=$(kubectl get pods -lapp=sleep,version=v1 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}
do
  kubectl exec -it $SLEEP_ZONE_1 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done
echo "entering into the 2nd container"
export SLEEP_ZONE_2=$(kubectl get pods -lapp=sleep,version=v2 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}
do
  kubectl exec -it $SLEEP_ZONE_2 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done
```

Expected output: both client pods receive a roughly equal mix of v1 and v2 responses, confirming the default round-robin behavior.
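Reading 20 interleaved response lines by eye is error-prone; a `sort | uniq -c` pipeline tallies the mix instead. The `printf` below simulates a five-request run so the snippet is self-contained; in practice, pipe the output of the kubectl exec loop above into the same pipeline:

```shell
# Tally simulated responses; replace the printf with the real request loop.
printf 'v1\nv1\nv2\nv1\nv2\n' | sort | uniq -c
# prints counts per version, e.g. "3 v1" and "2 v2"
```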
Configure locality-prioritized load balancing
Locality-prioritized load balancing routes each request to a backend pod in the same zone as the caller. If no healthy backend pod exists in that zone, traffic fails over to another zone.
The following table shows how the sidecar proxy assigns priority to backend endpoints based on locality matching:
| Caller zone | Priority 0 (preferred) | Priority 1 (fallback) |
|---|---|---|
| cn-beijing-g | Backend in cn-beijing-g | Backend in cn-beijing-h |
| cn-beijing-h | Backend in cn-beijing-h | Backend in cn-beijing-g |
Traffic stays at Priority 0 as long as the local endpoints are healthy. If outlier detection ejects the local endpoints (for example, after 7 consecutive errors), the sidecar proxy fails over to Priority 1 endpoints.
Step 1: Create a VirtualService
Log on to the ASM console.
In the left-side navigation pane, choose Service Mesh > Mesh Management.
On the Mesh Management page, find the ASM instance that you want to configure. Click the name of the ASM instance or click Manage in the Actions column.
In the left-side navigation pane, choose Traffic Management Center > VirtualService. On the page that appears, click Create from YAML.
Select `backend` from the Namespace drop-down list, paste the following YAML, and click Create.
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: nginx
  namespace: backend
spec:
  hosts:
  - nginx
  http:
  - route:
    - destination:
        host: nginx
```

Step 2: Create a DestinationRule with outlier detection
In the left-side navigation pane, choose Traffic Management Center > DestinationRule. On the page that appears, click Create from YAML.
Select `backend` from the Namespace drop-down list, paste the following YAML, and click Create.
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: nginx
  namespace: backend
spec:
  host: nginx
  trafficPolicy:
    outlierDetection:
      consecutiveErrors: 7
      interval: 30s
      baseEjectionTime: 30s
```

The `outlierDetection` section tells the sidecar proxy how to detect unhealthy endpoints and trigger failover to another zone:
| Field | Value | Description |
|---|---|---|
| `consecutiveErrors` | 7 | Eject an endpoint after 7 consecutive errors. |
| `interval` | 30s | Check endpoint health every 30 seconds. |
| `baseEjectionTime` | 30s | Keep an ejected endpoint out of the load-balancing pool for at least 30 seconds. |
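Note that `consecutiveErrors` is deprecated in newer Istio releases (1.4 and later) in favor of more specific counters. If your ASM instance runs a recent Istio version, an equivalent policy would use `consecutive5xxErrors`; this is a sketch and has not been verified against every ASM release:

```yaml
trafficPolicy:
  outlierDetection:
    consecutive5xxErrors: 7   # replaces the deprecated consecutiveErrors
    interval: 30s
    baseEjectionTime: 30s
```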
Verify locality-prioritized routing
Run the same test script from the round-robin verification step:
```shell
echo "entering into the 1st container"
export SLEEP_ZONE_1=$(kubectl get pods -lapp=sleep,version=v1 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}
do
  kubectl exec -it $SLEEP_ZONE_1 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done
echo "entering into the 2nd container"
export SLEEP_ZONE_2=$(kubectl get pods -lapp=sleep,version=v2 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}
do
  kubectl exec -it $SLEEP_ZONE_2 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done
```

Expected output: the client in Zone G receives mostly v1 responses (from the backend in Zone G), and the client in Zone H receives mostly v2 responses (from the backend in Zone H). This confirms that traffic preferentially stays within the same zone.
Configure locality-weighted load balancing
Locality-prioritized mode routes all traffic to the local zone, which can overload backends when one zone handles a disproportionate share of clients. Locality-weighted load balancing addresses this by splitting traffic across zones based on configurable percentages.
This example applies the following distribution:
| Traffic origin | To cn-beijing-g | To cn-beijing-h |
|---|---|---|
| cn-beijing-g | 80% | 20% |
| cn-beijing-h | 20% | 80% |
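For the 20-request test used throughout this tutorial, these weights translate into rough expected counts; routing is probabilistic, so actual runs will vary around these numbers:

```shell
# Expected request counts for a 20-request run at an 80/20 split.
total=20
local_count=$(( total * 80 / 100 ))
remote_count=$(( total * 20 / 100 ))
echo "local: $local_count, remote: $remote_count"   # local: 16, remote: 4
```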
Note: If you already created the VirtualService and DestinationRule from the locality-prioritized section, delete or update them before applying the weighted configuration. You can manage them from Traffic Management Center > VirtualService (or DestinationRule) in the ASM console.
Step 1: Create a VirtualService
Log on to the ASM console.
In the left-side navigation pane, choose Service Mesh > Mesh Management.
On the Mesh Management page, find the ASM instance that you want to configure. Click the name of the ASM instance or click Manage in the Actions column.
In the left-side navigation pane, choose Traffic Management Center > VirtualService. On the page that appears, click Create from YAML.
Select `backend` from the Namespace drop-down list, paste the following YAML, and click Create.
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: nginx
  namespace: backend
spec:
  hosts:
  - nginx
  http:
  - route:
    - destination:
        host: nginx
```

Step 2: Create a DestinationRule with weighted distribution
In the left-side navigation pane, choose Traffic Management Center > DestinationRule. On the page that appears, click Create from YAML.
Select `backend` from the Namespace drop-down list, paste the following YAML, and click Create.
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: nginx
  namespace: backend
spec:
  host: nginx
  trafficPolicy:
    outlierDetection:
      consecutiveErrors: 7
      interval: 30s
      baseEjectionTime: 30s
    loadBalancer:
      localityLbSetting:
        enabled: true
        distribute:
        - from: cn-beijing/cn-beijing-g/*
          to:
            "cn-beijing/cn-beijing-g/*": 80
            "cn-beijing/cn-beijing-h/*": 20
        - from: cn-beijing/cn-beijing-h/*
          to:
            "cn-beijing/cn-beijing-g/*": 20
            "cn-beijing/cn-beijing-h/*": 80
```

The `distribute` field maps each source locality to a set of destination localities with weights; the weights under each `from` entry must sum to 100. Locality values follow the format `<region>/<zone>/*`.
Verify locality-weighted routing
Run the same test script:
```shell
echo "entering into the 1st container"
export SLEEP_ZONE_1=$(kubectl get pods -lapp=sleep,version=v1 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}
do
  kubectl exec -it $SLEEP_ZONE_1 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done
echo "entering into the 2nd container"
export SLEEP_ZONE_2=$(kubectl get pods -lapp=sleep,version=v2 -n backend -o 'jsonpath={.items[0].metadata.name}')
for i in {1..20}
do
  kubectl exec -it $SLEEP_ZONE_2 -c sleep -n backend -- sh -c 'curl http://nginx.backend:8000'
done
```

Expected output: the client in Zone G receives approximately 80% v1 and 20% v2 responses. The client in Zone H receives approximately 80% v2 and 20% v1 responses.
Usage notes
Capacity planning: Before enabling locality-prioritized load balancing, confirm that each zone has enough backend capacity to handle local traffic. If one zone has significantly more clients than backend pods, the local backends can become overloaded while backends in other zones sit idle. Use locality-weighted load balancing to distribute the load more evenly.
Outlier detection is required: Both routing modes depend on outlier detection to monitor endpoint health and trigger failover. Without an `outlierDetection` configuration in the DestinationRule, locality routing does not take effect.
Node labels: ASM reads locality from the node labels `topology.kubernetes.io/region` and `topology.kubernetes.io/zone` (or the legacy label `failure-domain.beta.kubernetes.io/zone`). Confirm that your cluster nodes carry the correct labels before enabling locality routing, for example by running `kubectl get nodes -L topology.kubernetes.io/region,topology.kubernetes.io/zone`.
References
Enable the feature of keeping traffic in-cluster in multi-cluster scenarios: Restrict traffic to stay within a specific cluster when you manage services across multiple ACK clusters in different regions.