Alibaba Cloud Service Mesh: Use canary mode to enhance upgrade stability

Last Updated: Dec 04, 2025

A canary release involves deploying a new application version as a canary while the original version remains available. This strategy lets you test the new version's performance, identify and resolve issues early, and ensure overall system stability. This topic describes how to use canary mode to enhance upgrade stability.

Applicability

How it works

Alibaba Cloud Service Mesh (ASM) supports a revision- and label-based upgrade mode that lets you perform canary upgrades of a new control plane version more stably and securely. In this upgrade mode, the mesh proxies on the data plane are associated with a specific control plane version. This lets you deploy the new version in the cluster with low risk. No proxies connect to the new version until you explicitly select it. You can also gradually migrate workloads to the new control plane. Each independent control plane is called a revision and has the istio.io/rev label.

To support this revision-based upgrade, Istio introduces an istio.io/rev label for namespaces. This label indicates which control plane version injects the Sidecar proxy for workloads in the corresponding namespace. For example, the label istio.io/rev=1-23-6 indicates that the version 1.23.6 Sidecar proxy will be injected for workloads in that namespace.
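The mapping between a revision label value and a version number can be sketched in a few lines of Python. This is only an illustration of the naming pattern in the example above (1-23-6 for version 1.23.6); the function names are hypothetical and are not part of ASM or Istio.

```python
import re

REV_LABEL = "istio.io/rev"  # namespace label key named in this topic


def revision_to_version(revision: str) -> str:
    """Turn a dash-separated revision such as '1-23-6' into '1.23.6'."""
    if not re.fullmatch(r"\d+(-\d+)+", revision):
        raise ValueError(f"not a version-style revision: {revision!r}")
    return revision.replace("-", ".")


def version_to_revision(version: str) -> str:
    """Turn '1.23.6' or 'v1.23.6' into the revision form '1-23-6'."""
    bare = version.lstrip("v")
    if not re.fullmatch(r"\d+(\.\d+)+", bare):
        raise ValueError(f"not a dotted version: {version!r}")
    return bare.replace(".", "-")


# Build the namespace label for the version used in this topic.
label = {REV_LABEL: version_to_revision("v1.23.6")}
print(label)
```

Values such as stable and canary, which appear later in this topic, are not version-style revisions, so the sketch rejects them with a ValueError instead of guessing a version.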

During a canary upgrade, you can upgrade some services first to verify that the destination version meets your expectations. If the version does not meet your expectations, you can quickly roll back to ensure service stability. After you verify that the new version meets your expectations, you can promote the canary version to the stable version. You can then use a rolling update to upgrade all workloads to the latest version. This completes the data plane upgrade. Finally, you can uninstall the old version to complete the upgrade.

Preparations

A canary upgrade requires you to specify the injected Sidecar proxy version using a namespace label. Therefore, you must configure the injection policy correctly. Follow these steps to confirm that your injection policy configuration meets the requirements for a canary upgrade.

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose Data Plane Component Management > Sidecar Proxy injection.

  3. On the Injection Policy Configuration page, in the Injection Policy Configuration Management section, confirm that The Label Of The Namespace Where The Pod Is Located Needs To Meet The Condition is set to Contains istio-injection: enabled.

    Note

    The istio-injection: enabled label has the same semantics as the istio.io/rev: stable label. During a canary upgrade, you can use the istio.io/rev: stable label to inject the stable version of the mesh proxy into pods in the corresponding namespace. You can use the istio.io/rev: canary label to inject the canary version of the mesh proxy into pods in the corresponding namespace.

    After the upgrade, injection continues to work normally even if you do not replace istio.io/rev: stable with istio-injection: enabled because the two labels have the same semantics.

    The following figure shows the status before the upgrade:

    image
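The label semantics described in the note above can be expressed as a small lookup. The following Python sketch is an illustration only: the function name is hypothetical, and treating a namespace that carries both istio-injection: enabled and istio.io/rev: canary as an error is an assumption, because this topic does not say which label wins in that case.

```python
from typing import Optional


def injected_channel(labels: dict) -> Optional[str]:
    """Return 'stable', 'canary', a pinned revision, or None (no injection label)."""
    rev = labels.get("istio.io/rev")
    legacy = labels.get("istio-injection") == "enabled"

    if rev == "canary":
        if legacy:
            # Assumption: the topic does not define precedence, so flag the conflict.
            raise ValueError("namespace has both istio-injection=enabled and istio.io/rev=canary")
        return "canary"
    if rev == "stable" or (rev is None and legacy):
        # istio-injection: enabled has the same semantics as istio.io/rev: stable.
        return "stable"
    # A pinned revision such as '1-23-6' is returned unchanged; None means no label.
    return rev


print(injected_channel({"istio-injection": "enabled"}))  # stable
print(injected_channel({"istio.io/rev": "canary"}))      # canary
```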

Step 1: Upgrade the ASM control plane

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Instance > Upgrade Management.

  3. On the Upgrade Management page, click the Canary Upgrade tab. On the Control Plane tab, select a Canary Version and Create A New Server Load Balancer (CLB) Instance. Click Confirm. In the Confirm Upgrade? dialog box, click OK.

    A canary upgrade can advance by at most one minor version. In this example, the ASM instance version is 1.22, so the highest version that you can upgrade to is 1.23. This topic uses an upgrade to v1.23.6 as an example. When you deploy the destination version for the canary upgrade, an associated CLB instance is created. If you do not have special requirements, you can select the default specifications for the CLB instance. For more information about CLB billing, see Billing overview of CLB.

    The deployment of the canary version is an asynchronous process and takes a few minutes. Wait for the related components to be deployed. After the new version is deployed, the page appears as shown in the following figure.

    image

    The following figure shows the status at this point:

    image
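The version constraint in Step 1 can be checked before you start. The Python sketch below encodes only what the example states (an instance on 1.22 can go to at most 1.23); the function names are hypothetical, and rejecting downgrades and major-version changes is an assumption that this topic does not confirm.

```python
def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from strings such as '1.22' or 'v1.23.6'."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def is_allowed_canary_target(current: str, target: str) -> bool:
    """True if the target is in the same or the next minor version."""
    cur_major, cur_minor = parse_major_minor(current)
    tgt_major, tgt_minor = parse_major_minor(target)
    if cur_major != tgt_major:
        return False  # assumption: major-version jumps are out of scope
    # Upper bound comes from the example; the lower bound (no downgrade) is assumed.
    return 0 <= tgt_minor - cur_minor <= 1


print(is_allowed_canary_target("1.22", "v1.23.6"))  # True
print(is_allowed_canary_target("1.22", "1.24.0"))   # False
```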

Step 2: Upgrade the injected proxy for reviews-v2 to the new version

In Step 1, you used a canary upgrade to deploy a v1.23.6 Istio control plane. The following steps describe how to update the mesh proxy injected into the reviews-v2 workload of the Bookinfo application to v1.23 and verify that this version meets your expectations.

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Instance > Global Namespace.

  3. On the Global Namespace page, in the Automatic Injection column, check whether the label for the default namespace is istio-injection: enabled.

    If the label is istio-injection: enabled, the version 1.22 mesh proxy is injected.

  4. On the Global Namespace page, in the Automatic Injection column, find the default namespace and click Switch To Inject Version 1-23-6. In the Confirm dialog box, click OK.

    The label of the default global namespace switches to istio.io/rev: canary. The data plane's default namespace label is also immediately synchronized to istio.io/rev: canary. The Global Namespace page then shows that the version 1.23 mesh proxy is injected for the default namespace. New workloads (pods) created in the default namespace will be injected with the version 1.23 mesh proxy. The labels of other namespaces are not changed. They are still injected with the version 1.22 mesh proxy.

    Note

    The preceding steps describe how to switch the injected Sidecar mesh proxy version for a namespace where automatic injection was already enabled before the upgrade. For namespaces where Sidecar mesh proxy injection was not enabled before the upgrade, you can enable automatic injection as usual. When you enable injection, you can select the required Sidecar mesh proxy version. The service mesh adds the istio.io/rev: stable or istio.io/rev: canary label to the namespace based on your selection.

  5. Perform a rolling update for the reviews-v2 workload.

    1. Log on to the ACK console. In the left navigation pane, click Clusters.

    2. On the Clusters page, find the cluster you want to manage and click its name. In the left navigation pane, choose Workloads > Deployments.

    3. On the Deployments page, find reviews-v2. In the Actions column, choose More > Redeploy. In the Redeploy dialog box, click OK.

  6. On the Deployments page, click reviews-v2. On the Pods tab, check whether the pod corresponding to reviews-v2 started successfully after the rolling update and whether the new pod is injected with the version 1.23 Sidecar proxy.

    The pod corresponding to reviews-v2 starts successfully after the rolling update, and the new pod is injected with the v1.23 Sidecar mesh proxy.

  7. Open a browser and access the Bookinfo page. Check whether the traffic meets your expectations.

    As shown in the following figure, you can access the reviews-v2 version, which is the expected result.

    image: Accessing the Bookinfo page

    The following figure shows the status at this point:

    image
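The check in substep 6 (which proxy version a pod received) can also be done on a pod manifest, for example one exported as JSON. The Python sketch below is illustrative: istio-proxy is the usual Istio sidecar container name, but the image name and tag format shown here are hypothetical, so adjust the parsing to the image references you actually see in your cluster.

```python
import re
from typing import Optional


def sidecar_minor_version(pod: dict) -> Optional[str]:
    """Return the 'major.minor' version from the istio-proxy image tag, or None."""
    spec = pod.get("spec", {})
    # The sidecar may be a regular container or an init container.
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    for container in containers:
        if container.get("name") != "istio-proxy":
            continue
        tag = container.get("image", "").rsplit(":", 1)[-1]
        if "/" in tag:
            return None  # the image reference has no tag
        match = re.search(r"(\d+\.\d+)", tag)
        return match.group(1) if match else None
    return None  # no sidecar was injected


# Hypothetical pod manifest, trimmed to the fields the function reads.
pod = {
    "spec": {
        "containers": [
            {"name": "reviews", "image": "registry.example.com/reviews-v2:1.0"},
            {"name": "istio-proxy", "image": "registry.example.com/proxyv2:v1.23.6"},
        ]
    }
}
print(sidecar_minor_version(pod))  # 1.23
```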

Step 3: Roll back reviews-v2 to version 1.22

If the verification in Step 2 fails, or if you need to roll back for other reasons, follow these steps.

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Instance > Global Namespace.

  3. On the Global Namespace page, in the Automatic Injection column, find the default namespace and click Switch to inject version 1-22-6. In the Confirm dialog box, click OK.

    Before the rollback, the namespace label is istio.io/rev: canary, and the version 1.23 mesh proxy is injected for the default namespace.

    After the rollback, the label switches to istio.io/rev: stable. The data plane's default namespace label is also synchronized to istio.io/rev: stable. The version 1.22 mesh proxy is now injected for the default namespace.

  4. Redeploy the reviews-v2 workload in the ACK console.

    1. Log on to the ACK console. In the left navigation pane, click Clusters.

    2. On the Clusters page, find the cluster you want to manage and click its name. In the left navigation pane, choose Workloads > Deployments.

    3. On the Deployments page, set Namespace to default and find reviews-v2. In the Actions column, choose More > Redeploy. In the Redeploy dialog box, click OK.

    4. On the Deployments page, click the name reviews-v2. On the Pods tab, check whether the pod corresponding to reviews-v2 started successfully after the rolling update and whether the new pod is injected with the version 1.22 Sidecar proxy.

      The pod corresponding to reviews-v2 starts successfully after the rolling update, and the new pod is injected with the version 1.22 Sidecar proxy.

Step 4: Revoke the upgrade

After the rollback succeeds, you can revoke the canary upgrade and restore the ASM instance to the original version 1.22.

Important

Before you revoke the upgrade, make sure that all namespaces are injected with the stable Sidecar proxy version. Otherwise, you cannot revoke the upgrade.
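The precondition above can be checked against a list of namespaces before you open the console. The Python sketch below takes a Kubernetes NamespaceList object (the structure returned by kubectl get namespaces -o json) and reports the namespaces that still carry the canary label. The function name is hypothetical, and the check covers only the istio.io/rev: canary label described in this topic.

```python
def namespaces_blocking_revoke(namespace_list: dict) -> list:
    """Return the sorted names of namespaces still labeled istio.io/rev=canary."""
    blocking = []
    for item in namespace_list.get("items", []):
        metadata = item.get("metadata", {})
        labels = metadata.get("labels") or {}
        if labels.get("istio.io/rev") == "canary":
            blocking.append(metadata.get("name", ""))
    return sorted(blocking)


# Sample data shaped like a NamespaceList; the namespace names are illustrative.
sample = {
    "items": [
        {"metadata": {"name": "default", "labels": {"istio.io/rev": "canary"}}},
        {"metadata": {"name": "payments", "labels": {"istio.io/rev": "stable"}}},
        {"metadata": {"name": "kube-system"}},
    ]
}
print(namespaces_blocking_revoke(sample))  # ['default']
```

An empty result means that no namespace in the list is labeled for the canary version. Pods that were created while a namespace was labeled canary keep their injected proxy until they are redeployed, so an empty list does not replace the redeployment in Step 3.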

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Instance > Upgrade Management.

  3. On the Upgrade Management page, on the Canary Upgrade tab, click Revoke Upgrade. In the Confirm to Revoke the Upgrade? dialog box, click OK.

    After you click Revoke Upgrade, the control plane component of the canary release version (version 1.23 in this example) is deleted. Only the control plane component of the stable version (version 1.22 in this example) is retained.

    The following figure shows the status at this point:

    image

Step 5: Re-upgrade and verify

The entire ASM instance is now in the state before Step 1 and is ready for an upgrade. This example only verifies the reviews-v1, reviews-v2, and reviews-v3 workloads. You can follow the preceding steps to upgrade and verify any workload as needed until the verification is successful.

  1. Perform Step 1 again to upgrade the control plane to the canary version.

  2. Perform Step 2 again to verify that the injected new version of the proxy meets expectations.

    For example, if you redeploy the reviews-v1, reviews-v2, and reviews-v3 stateless workloads from the ACK console, the pods for these workloads are all injected with the version 1.23 Sidecar mesh proxy.

Step 6: Verification passed, switch to the official version

The preceding steps have verified that reviews-v1, reviews-v2, and reviews-v3 can use the version 1.23 (new version) mesh proxy and that the features work as expected. At this point, you can promote version 1.23 to the official version.

Important
  • After you switch versions, the upgrade moves on to the phase in which the old version is unpublished. From this point, all workloads must be switched to the new version 1.23 Sidecar mesh proxy, and you can no longer revoke the upgrade. Complete all verification work in the New Version Deployment phase, before you switch.

  • After the version switch, namespaces with the istio.io/rev: stable and istio-injection: enabled labels are injected with the new version 1.23 Sidecar mesh proxy. The istio.io/rev: canary label no longer has an effect. Therefore, when switching versions, ASM automatically changes all istio.io/rev: canary labels to istio.io/rev: stable. You must confirm this when you click Switch Version on the Canary Upgrade tab of the Upgrade Management page.
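The effect of the version switch on namespace labels can be modeled as follows. This Python sketch is an illustration of the two notes above, not ASM code: the function names are hypothetical, and modeling a canary label that "no longer has an effect" as no injection is an assumption.

```python
from typing import Optional


def relabel_on_switch(labels: dict) -> dict:
    """Return a copy of the labels with istio.io/rev=canary changed to stable."""
    updated = dict(labels)
    if updated.get("istio.io/rev") == "canary":
        updated["istio.io/rev"] = "stable"
    return updated


def proxy_version(labels: dict, promoted: bool,
                  old: str = "1.22", new: str = "1.23") -> Optional[str]:
    """Return the proxy version a namespace receives before or after the switch."""
    rev = labels.get("istio.io/rev")
    legacy = labels.get("istio-injection") == "enabled"
    if rev == "canary":
        # Assumption: after the switch the canary label is treated as no injection.
        return None if promoted else new
    if rev == "stable" or legacy:
        return new if promoted else old
    return None


canary_ns = {"istio.io/rev": "canary"}
print(proxy_version(canary_ns, promoted=False))                    # 1.23
print(proxy_version(relabel_on_switch(canary_ns), promoted=True))  # 1.23
```

The second call shows why the automatic relabeling matters: without it, a namespace left on the canary label would fall outside the labels that take effect after the switch.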

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Instance > Upgrade Management.

  3. On the Upgrade Management page, on the Canary Upgrade tab, click Switch Version. In the Confirm to Switch the Version? dialog box, carefully read the prompt. After you confirm the information, click OK.

    After the switch is successful, the existing mesh proxies of version 1.22 are retained, and the corresponding workloads are not affected. However, all redeployed pods will be injected with the version 1.23 Sidecar proxy. The page after the switch and update is shown in the following figure.

    image

    The following figure shows the status at this point:

    image

Step 7: Upgrade the data plane

At this point, the ASM version is 1.23. Namespaces with the istio.io/rev=stable, istio.io/rev=1-23-6, or istio-injection=enabled label are all injected with the version 1.23 Sidecar mesh proxy. You can perform a rolling update of workloads to upgrade the injected Sidecar mesh proxy to the new version 1.23. This completes the data plane upgrade.

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Instance > Upgrade Management.

  3. On the Upgrade Management page, on the Canary Upgrade tab, click the Data Plane tab. You can upgrade the ASM gateway or workloads as needed.

    • Upgrade an ASM gateway: In the ASM Gateway section, find the target gateway. In the Actions column, click Rolling Upgrade. In the Confirm to Perform Rolling Upgrade? dialog box, click OK to upgrade the ASM gateway to the new version 1.23.

    • Upgrade a workload: In the Workloads To Be Upgraded section, switch the Namespace. Then, find the target workload. In the Actions column, click Rolling Upgrade. In the Confirm to Perform Rolling Upgrade? dialog box, click OK to upgrade the workload to the new version 1.23.

      image: Data plane upgrade

      Note

      The list does not show ASM gateways or workloads that have been upgraded.

      You can also click Mesh Status in the navigation pane on the left to view the global workloads or gateway instances that have not been upgraded.

      The following figure shows the status after the data plane upgrade is complete.

      image

Step 8: Unpublish the old version

After all workloads on the data plane are upgraded, you can unpublish the old version 1.22.

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Instance > Upgrade Management.

  3. On the Upgrade Management page, on the Canary Upgrade tab, click Unpublish Old Version. In the Confirm to Unpublish the Old Version of the Control Plane? dialog box, click OK.

    The following figure shows the status at this point.

    image
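The phases walked through in Steps 1 to 8 can be summarized as a small state machine. The Python sketch below is a reading aid only: the phase and action names are hypothetical labels for the console operations in this topic, and the model tracks the control plane only, not the namespace relabeling or workload redeployments.

```python
from enum import Enum


class Phase(Enum):
    STABLE_ONLY = "stable only"            # one control plane version is deployed
    CANARY_DEPLOYED = "canary deployed"    # old and canary control planes coexist
    VERSION_SWITCHED = "version switched"  # canary promoted, old version still present


# (current phase, action) -> next phase
TRANSITIONS = {
    (Phase.STABLE_ONLY, "deploy_canary"): Phase.CANARY_DEPLOYED,        # Step 1
    (Phase.CANARY_DEPLOYED, "revoke_upgrade"): Phase.STABLE_ONLY,       # Step 4
    (Phase.CANARY_DEPLOYED, "switch_version"): Phase.VERSION_SWITCHED,  # Step 6
    (Phase.VERSION_SWITCHED, "unpublish_old"): Phase.STABLE_ONLY,       # Step 8
}


def advance(phase: Phase, action: str) -> Phase:
    """Apply an action, or raise ValueError if it is not available in this phase."""
    try:
        return TRANSITIONS[(phase, action)]
    except KeyError:
        raise ValueError(f"{action!r} is not available in phase {phase.value!r}") from None


phase = advance(Phase.STABLE_ONLY, "deploy_canary")
phase = advance(phase, "switch_version")
print(phase.value)  # version switched
```

The table has no revoke_upgrade entry for the switched phase, which mirrors the note in Step 6 that the upgrade can no longer be revoked after the version switch.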