Alibaba Cloud Service Mesh: Configure fallback routing in ASM

Last Updated: Mar 11, 2026

When a service version becomes unavailable, requests to that version return HTTP 503 errors. Fallback routing solves this by automatically redirecting requests to an alternative service version at the proxy level. Service Mesh (ASM) extends the Istio VirtualService resource with a fallback field that defines this alternative route.

This topic walks through fallback routing using the Bookinfo sample application. You will:

  • Route all traffic to a specific service version

  • Define a fallback target that receives traffic when the primary version is unreachable

  • Combine fallback with weighted routing for advanced scenarios

  • Verify fallback behavior through custom access logs

How fallback differs from retries and circuit breakers

ASM provides several resilience mechanisms. Choose the right one based on the failure mode you need to handle:

| Mechanism | Failure mode | Behavior | When to use |
|---|---|---|---|
| Fallback | All endpoints in a subset are unreachable | Redirects to an alternative subset | A backup version or deployment can serve requests |
| Retries | Transient errors (timeouts, 5xx) | Resends the request to the same subset | Errors are intermittent and likely to succeed on retry |
| Circuit breaker | Sustained high error rate | Stops sending traffic to unhealthy endpoints | Protect downstream services from cascading failures |

Fallback operates at the routing level: the sidecar proxy checks whether the target subset has healthy endpoints before forwarding the request. If no healthy endpoints exist, it routes to the fallback target instead.
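The routing decision described above can be sketched in a few lines of Python. This is only an illustrative model, not the proxy's implementation: the `Route` class, the `pick_subset` function, and the `healthy_endpoints` mapping are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Route:
    subset: str                     # primary destination subset
    fallback: Optional[str] = None  # subset named in fallback.target, if any


def pick_subset(route: Route, healthy_endpoints: Dict[str, int]) -> str:
    """Return the subset that would receive the request.

    healthy_endpoints maps a subset name to its number of healthy endpoints
    (a hypothetical stand-in for the proxy's view of endpoint health).
    """
    if healthy_endpoints.get(route.subset, 0) > 0:
        return route.subset
    # The primary subset has no healthy endpoints: try the fallback target.
    if route.fallback and healthy_endpoints.get(route.fallback, 0) > 0:
        return route.fallback
    # No usable fallback: the request stays on the original subset,
    # which has no endpoints, so the caller receives a 503.
    return route.subset


route = Route(subset="v3", fallback="v2")
print(pick_subset(route, {"v3": 1, "v2": 1}))  # primary is healthy
print(pick_subset(route, {"v3": 0, "v2": 1}))  # primary down, fallback used
print(pick_subset(route, {"v3": 0, "v2": 0}))  # both down, stays on primary
```

The model checks only one fallback target per route, which matches the single-hop behavior this topic demonstrates in Step 3.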

Prerequisites

| Requirement | Details |
|---|---|
| ASM instance | Enterprise Edition or Ultimate Edition, version 1.17.2.22 or later |
| Kubernetes cluster | A Container Service for Kubernetes (ACK) cluster added to the ASM instance. See Add a cluster to an ASM instance. |
| Sample application | Bookinfo deployed in the cluster. See Deploy an application in an ASM instance. |
| Sidecar proxy version | 1.17 or later on all data plane pods |

Note

If your ASM instance version is earlier than 1.17.2.22, update it to 1.17.2.22 or later. See Update an ASM instance. Alternatively, submit a ticket for technical support.

Note

Check the sidecar proxy version for each pod on the Instances Status page in the ASM console. See Upgrade management.

Bookinfo reviews service versions

The Bookinfo reviews service has three versions, each identifiable by its star rating on the product page:

| Version | Star rating |
|---|---|
| v1 | No stars |
| v2 | Black stars |
| v3 | Red stars |

The productpage service calls reviews. A VirtualService routes all traffic to reviews-v3. When v3 becomes unavailable, the fallback rule redirects requests to reviews-v2 instead of returning a 503 error.

Download the configuration files used in this topic.

Step 1: Create the destination rule for the reviews service

Define a DestinationRule with subsets for each version of the reviews service.

  1. Create a file named reviews.yaml with the following content:

       apiVersion: networking.istio.io/v1alpha3
       kind: DestinationRule
       metadata:
         name: reviews
       spec:
         host: reviews
         subsets:
         - name: v1
           labels:
             version: v1
         - name: v2
           labels:
             version: v2
         - name: v3
           labels:
             version: v3
  2. Connect to the ASM instance using kubectl and the KubeConfig file, then apply the rule:

       kubectl apply -f reviews.yaml
  3. Get the IP address of the ingress gateway.

  4. Open http://<gateway-ip>/productpage in a browser. The Reviews served by field and the star rating identify the version serving each request. Refresh the page several times -- requests are distributed across v1, v2, and v3. For example, reviews-v2 displays black stars:

    reviews-v2 with black stars

Step 2: Configure a basic fallback rule

Route all traffic to reviews-v3 and set reviews-v2 as the fallback target.

  1. Create a file named reviews-route-fallback-sample1.yaml with the following content. The fallback.target field tells the sidecar proxy to forward requests to reviews-v2 when reviews-v3 is unreachable.

       apiVersion: networking.istio.io/v1beta1
       kind: VirtualService
       metadata:
         name: reviews-route
         namespace: default
       spec:
         hosts:
           - reviews
         http:
           - route:
               - destination:
                   host: reviews
                   subset: v3
                 fallback:
                   target:
                     host: reviews
                     subset: v2
  2. Apply the VirtualService:

       kubectl apply -f reviews-route-fallback-sample1.yaml
  3. Open http://<gateway-ip>/productpage and refresh. All requests now route to v3 (red stars).

    reviews-v3 with red stars

  4. Simulate a v3 failure by scaling its deployment to zero replicas:

       kubectl scale deployment reviews-v3 --replicas=0
  5. Refresh the page. Requests now fall back to v2 (black stars).

Verify fallback through access logs

Add fallback-related fields to the custom access log format to confirm the fallback triggered. See Customize access log fields for setup instructions.

| Field | Value | Description |
|---|---|---|
| fallback_path | %DYNAMIC_METADATA(com.aliyun.fallback:fallback-path)% | Fallback chain. A:B means A fell back to B. A:B:C means A fell back to B, then B to C. |
| fallback_final_cluster_name | %DYNAMIC_METADATA(com.aliyun.fallback:final-cluster)% | Cluster that ultimately handled the request after a successful fallback. |
| fallback_result | %DYNAMIC_METADATA(com.aliyun.fallback:fallback-result)% | Fallback outcome. If the fallback failed, the request goes to the original route's cluster. |

    Custom log format configuration

Check the productpage-v1 istio-proxy logs. A successful fallback from v3 to v2 produces a log entry similar to the following:

{
    "authority": "reviews:9080",
    "authority_for": "reviews:9080",
    "bytes_received": "0",
    "bytes_sent": "442",
    "downstream_local_address": "192.168.255.46:9080",
    "downstream_remote_address": "172.16.0.252:57238",
    "duration": "10",
    "fallback_path": "outbound|9080|v3|reviews.default.svc.cluster.local:outbound|9080|v2|reviews.default.svc.cluster.local",
    "fallback_final_cluster_name": "outbound|9080|v2|reviews.default.svc.cluster.local",
    "fallback_result": "fallback successful",
    "istio_policy_status": "-",
    "method": "GET",
    "path": "/reviews/0",
    "protocol": "HTTP/1.1",
    "request_id": "15b2dffc-5f3f-4060-b9fa-898eab08****",
    "requested_server_name": "-",
    "response_code": "200",
    "response_flags": "-",
    "route_name": "-",
    "start_time": "2023-05-30T07:02:26.990Z",
    "trace_id": "18b3aed8af41****",
    "upstream_cluster": "outbound|9080|v2|reviews.default.svc.cluster.local",
    "upstream_host": "172.16.0.11:9080",
    "upstream_local_address": "172.16.0.252:44448",
    "upstream_service_time": "9",
    "upstream_transport_failure_reason": "-",
    "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.X.X Safari/537.36",
    "x_forwarded_for": "-"
}

The three fallback-specific fields confirm the behavior:

  • fallback_path shows the chain: v3 cluster to v2 cluster.

  • fallback_final_cluster_name identifies v2 as the cluster that served the request.

  • fallback_result is "fallback successful".
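To read these fields programmatically, a small parser can help. The following is a minimal sketch, assuming each access log entry is a single JSON object with the field names shown above and that cluster names follow the `outbound|port|subset|host` pattern seen in the sample. The `fallback_chain` and `summarize` functions are hypothetical helpers, not part of ASM.

```python
import json
import re

# Matches cluster names such as
# outbound|9080|v3|reviews.default.svc.cluster.local
CLUSTER_RE = re.compile(r"outbound\|(\d+)\|([^|]*)\|([^:|]+)")


def fallback_chain(value: str) -> list:
    """Return the subset names found in a fallback_path style value."""
    return [match.group(2) for match in CLUSTER_RE.finditer(value)]


def summarize(log_line: str) -> dict:
    """Extract the fallback-related facts from one JSON access log entry."""
    entry = json.loads(log_line)
    final = fallback_chain(entry.get("fallback_final_cluster_name", "-"))
    return {
        "chain": fallback_chain(entry.get("fallback_path", "-")),
        "served_by": final[0] if final else None,
        "result": entry.get("fallback_result", "-"),
        "status": entry.get("response_code"),
    }


# A trimmed copy of the successful fallback entry shown above.
sample = json.dumps({
    "fallback_path": "outbound|9080|v3|reviews.default.svc.cluster.local:"
                     "outbound|9080|v2|reviews.default.svc.cluster.local",
    "fallback_final_cluster_name":
        "outbound|9080|v2|reviews.default.svc.cluster.local",
    "fallback_result": "fallback successful",
    "response_code": "200",
})
print(summarize(sample))
```

For the sample entry, `chain` lists v3 followed by v2 and `served_by` is v2, which mirrors the three bullet points above.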

Step 3: Combine fallback with weighted routing

Fallback rules work alongside weighted routing. In this scenario, traffic splits 50/50 between v3 and v2, and each version has its own fallback target.

  1. Restore v3:

       kubectl scale deployment reviews-v3 --replicas=1
  2. Create a file named reviews-route-fallback-sample2.yaml with the following content. It defines two independent fallback chains:

    • v3 traffic (50%): falls back to v2 if v3 is unavailable

    • v2 traffic (50%): falls back to v1 if v2 is unavailable

     Retries are disabled (attempts: 0) to isolate fallback behavior from retry behavior during testing.

       apiVersion: networking.istio.io/v1beta1
       kind: VirtualService
       metadata:
         name: reviews-route
         namespace: default
       spec:
         hosts:
           - reviews
         http:
           - route:
               - destination:
                   host: reviews
                   subset: v3
                 fallback:
                   target:
                     host: reviews
                     subset: v2
                 weight: 50
               - destination:
                   host: reviews
                   subset: v2
                 fallback:
                   target:
                     host: reviews
                     subset: v1
                 weight: 50
             retries:
               attempts: 0
  3. Apply the updated VirtualService:

       kubectl apply -f reviews-route-fallback-sample2.yaml
  4. Refresh the product page. Requests are distributed evenly between v3 (red stars) and v2 (black stars).

Test fallback when v3 is down

  1. Scale v3 to zero:

       kubectl scale deployment reviews-v3 --replicas=0
  2. Refresh the page. All requests now reach v2: the 50% originally targeted at v3 falls back to v2, and the other 50% routes to v2 directly.

Test fallback when both v3 and v2 are down

  1. Scale v2 to zero:

       kubectl scale deployment reviews-v2 --replicas=0
  2. Refresh the page. Two behaviors alternate:

    • 50% of requests succeed: These are requests originally routed to v2, which fall back to v1 (no stars).

    • 50% of requests fail with a 503 error: These are requests originally routed to v3. The proxy attempts to fall back to v2, but v2 is also down, and the fallback chain does not extend to v1.

    productpage error when both v3 and v2 are unavailable

  3. Confirm this in the logs:

       kubectl logs -f deployment/productpage-v1 -c istio-proxy --tail=10
     Key fields in the failed fallback log:

    • response_code: 503 indicates the request was not served.

    • response_flags: UH (Upstream Host Unhealthy).

    • fallback_result: "fallback cluster is unhealthy" indicates the fallback target (v2) was also unavailable.

    • upstream_cluster: still shows the v3 cluster, confirming the original routing destination.

     A failed fallback produces a log entry like the following:

       {
           "authority": "reviews:9080",
           "authority_for": "reviews:9080",
           "bytes_received": "0",
           "bytes_sent": "19",
           "downstream_local_address": "192.168.255.46:9080",
           "downstream_remote_address": "172.16.0.252:47738",
           "duration": "0",
           "fallback_path": "outbound|9080|v3|reviews.default.svc.cluster.local:outbound|9080|v2|reviews.default.svc.cluster.local",
           "fallback_final_cluster_name": "-",
           "fallback_result": "fallback cluster is unhealthy",
           "istio_policy_status": "-",
           "method": "GET",
           "path": "/reviews/0",
           "protocol": "HTTP/1.1",
           "request_id": "b207a764-b6d7-4ef8-bc71-59f264c3****",
           "requested_server_name": "-",
           "response_code": "503",
           "response_flags": "UH",
           "route_name": "-",
           "start_time": "2023-05-30T07:32:08.999Z",
           "trace_id": "a40c32a7b2cf****",
           "upstream_cluster": "outbound|9080|v3|reviews.default.svc.cluster.local",
           "upstream_host": "-",
           "upstream_local_address": "-",
           "upstream_service_time": "-",
           "upstream_transport_failure_reason": "-",
           "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.X.X Safari/537.36",
           "x_forwarded_for": "-"
       }

Use fallback in production

Fallback routing masks service failures, which is valuable for availability but requires careful planning in production environments.

Monitor fallback frequency

A sustained high fallback rate indicates an underlying issue with the primary service version. Set up alerts on the fallback_result access log field to detect when fallback triggers repeatedly, rather than relying on it as a silent failover.
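As a starting point for such an alert, the fallback rate can be computed from the sidecar access logs. The sketch below assumes JSON-formatted access logs with the custom fallback_result field configured as described earlier, and assumes that requests which never triggered fallback log "-" for that field. The `fallback_stats` function is a hypothetical helper, and any alert threshold is for you to choose.

```python
import json
from collections import Counter
from typing import Iterable


def fallback_stats(lines: Iterable[str]) -> dict:
    """Count how often fallback triggered in a stream of access log lines."""
    results = Counter()
    total = 0
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip proxy output that is not a JSON access log entry
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        total += 1
        result = entry.get("fallback_result", "-")
        # Assumption: "-" means fallback did not trigger for this request.
        if result and result != "-":
            results[result] += 1
    triggered = sum(results.values())
    return {
        "total": total,
        "triggered": triggered,
        "rate": triggered / total if total else 0.0,
        "by_result": dict(results),
    }


sample_lines = [
    '{"response_code": "200", "fallback_result": "-"}',
    '{"response_code": "200", "fallback_result": "fallback successful"}',
    '{"response_code": "503", "fallback_result": "fallback cluster is unhealthy"}',
    "some non-JSON proxy message",
]
print(fallback_stats(sample_lines))
```

In practice the lines could come from the output of kubectl logs for the istio-proxy container, or from whichever log service collects the access logs.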

Capacity planning

When fallback triggers, the fallback target receives additional traffic beyond its normal load. Make sure the fallback service version has enough capacity to handle both its regular traffic and redirected traffic. In the weighted routing scenario from Step 3, v2 must handle up to 100% of total traffic if v3 goes down.
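A quick way to reason about worst-case load is to compute where traffic lands for each failure scenario. The following sketch models the Step 3 configuration under the single-hop fallback behavior described in this topic. The `effective_load` function and the tuple layout are hypothetical, and the "503" key simply stands for requests that are not served.

```python
from typing import Dict, List, Optional, Set, Tuple

# (primary subset, fallback subset or None, weight in percent)
RouteEntry = Tuple[str, Optional[str], int]


def effective_load(routes: List[RouteEntry], down: Set[str]) -> Dict[str, int]:
    """Return the share of traffic each subset receives when `down` subsets fail."""
    load: Dict[str, int] = {}
    for subset, fallback, weight in routes:
        if subset not in down:
            target = subset
        elif fallback is not None and fallback not in down:
            target = fallback  # one hop only: the fallback's own fallback is not used
        else:
            target = "503"     # neither primary nor fallback can serve the request
        load[target] = load.get(target, 0) + weight
    return load


# Weighted routes from reviews-route-fallback-sample2.yaml.
routes = [("v3", "v2", 50), ("v2", "v1", 50)]
print(effective_load(routes, down=set()))         # {'v3': 50, 'v2': 50}
print(effective_load(routes, down={"v3"}))        # {'v2': 100}
print(effective_load(routes, down={"v3", "v2"}))  # {'503': 50, 'v1': 50}
```

Running the same calculation against your own weights shows the peak share each fallback target must be sized for.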

Fallback chain depth

Fallback does not cascade across multiple hops automatically. In the weighted routing example, when v3 falls back to v2 and v2 is also down, the request fails with a 503 -- it does not continue to v1. Each route entry maintains its own independent fallback target. Plan your fallback chains accordingly.
