Microservices Engine: Implement an end-to-end canary release by using MSE cloud-native gateways

Last Updated: Mar 11, 2026

Releasing new versions of multiple microservices at the same time requires a way to test them together before sending all production traffic to them. An end-to-end canary release routes requests with specific characteristics through canary versions of every service in the call chain, while regular traffic continues to flow through the base versions. If a service has no canary version, traffic automatically falls back to its base version.

Microservices Engine (MSE) implements end-to-end canary releases by combining cloud-native gateways with Microservices Governance. Define isolated traffic lanes for canary versions of your applications, configure routing rules at the gateway, and let MSE propagate those rules across the entire call chain -- no business code changes required.

Implementation process

Key concepts

Concept | Description
Lane | An isolated runtime environment for applications of the same version. Only requests matching specific routing rules reach the tagged applications in a lane. Applications and lanes have a many-to-many relationship.
Lane group | A collection of lanes that distinguishes between different teams or scenarios.
Base environment | The environment where untagged applications run. It provides disaster recovery for other environments.
MSE cloud-native gateway | A gateway compatible with Kubernetes Ingress that supports service discovery from multiple sources, including Container Service for Kubernetes (ACK) clusters and Nacos instances.

Sample scenario

The following e-commerce order placement scenario demonstrates an end-to-end canary release from an MSE cloud-native gateway through a Spring Cloud backend.

The architecture includes three applications:

Application | Role | Port
Application A | Transaction center | 20001
Application B | Commodity center | 20002
Application C | Inventory center | 20003

The call chain is: Client > MSE cloud-native gateway > A > B > C

Service discovery uses an MSE Nacos instance. Client-based and HTML-based access to the backend applications are both supported.

New versions are released for Application A and Application C. Before going live, test the canary versions through an end-to-end canary release. Application B has no canary version, so it continues to serve from its base version.
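
The fallback behavior in this scenario can be modeled in a few lines of Python. This is an illustrative sketch only, not how MSE implements routing: the `DEPLOYED` table and the `pick_version` and `route_chain` functions are hypothetical names introduced here, and the output format simply imitates the demo responses shown later on this page (without IP addresses).

```python
from typing import List, Optional

# Illustrative model only: which versions of each application are deployed.
DEPLOYED = {
    "A": {"base", "gray"},
    "B": {"base"},          # Application B has no canary version
    "C": {"base", "gray"},
}


def pick_version(app: str, lane_tag: Optional[str]) -> str:
    """Return the lane tag if the app has that version, else fall back to base."""
    if lane_tag and lane_tag in DEPLOYED[app]:
        return lane_tag
    return "base"


def route_chain(chain: List[str], lane_tag: Optional[str]) -> str:
    """Render the versions a request visits along the call chain."""
    hops = []
    for app in chain:
        version = pick_version(app, lane_tag)
        hops.append(app if version == "base" else f"{app}{version}")
    return " -> ".join(hops)


print(route_chain(["A", "B", "C"], "gray"))  # Agray -> B -> Cgray
print(route_chain(["A", "B", "C"], None))    # A -> B -> C
```

The point of the model is that the lane tag travels with the request: the request leaves the gray lane at Application B and re-enters it at Application C.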

Scenario diagram

Limitations

Prerequisites

Before you begin, make sure that you have:

Important

The MSE Java agent version must be 3.2.3 or later. Earlier versions may cause issues.

Note The MSE cloud-native gateway must be deployed in the same virtual private cloud (VPC) as your ACK cluster or MSE Nacos instance.

Step 1: Deploy the base versions of backend applications

  1. Log on to the ACK console.

  2. In the left-side navigation pane, click Clusters. Then, click the name of the target cluster.

  3. In the left-side navigation pane, choose Workloads > Deployments.

  4. Select the namespace and click Create from YAML.

  5. Paste the YAML code for Application A, Application B, and Application C. Choose the appropriate YAML based on your service source.

Use an MSE Nacos instance as the service source

Important

Replace {nacos server address} with the internal endpoint of your MSE Nacos instance and remove the curly braces {}.

Show YAML code

# Base version of Application A
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-a
  namespace: default
spec:
  selector:
    matchLabels:
      app: spring-cloud-a
  template:
    metadata:
      labels:
        app: spring-cloud-a
        msePilotCreateAppName: spring-cloud-a
        msePilotAutoEnable: 'on'
    spec:
      containers:
      - name: spring-cloud-a
        image: registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/spring-cloud-a:3.0.1
        imagePullPolicy: Always
        ports:
          - containerPort: 20001
        livenessProbe:
          tcpSocket:
            port: 20001
          initialDelaySeconds: 30
          periodSeconds: 60
        env:
        - name: spring.cloud.nacos.discovery.server-addr
          value: {nacos server address}
        - name: dubbo.registry.address
          value: 'nacos://{nacos server address}:8848'
---
# Base version of Application B
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-b
  namespace: default
spec:
  selector:
    matchLabels:
      app: spring-cloud-b
  template:
    metadata:
      labels:
        app: spring-cloud-b
        msePilotCreateAppName: spring-cloud-b
        msePilotAutoEnable: 'on'
    spec:
      containers:
      - name: spring-cloud-b
        image: registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/spring-cloud-b:3.0.1
        imagePullPolicy: Always
        ports:
          - containerPort: 20002
        livenessProbe:
          tcpSocket:
            port: 20002
          initialDelaySeconds: 30
          periodSeconds: 60
        env:
        - name: spring.cloud.nacos.discovery.server-addr
          value: {nacos server address}
        - name: dubbo.registry.address
          value: 'nacos://{nacos server address}:8848'
---
# Base version of Application C
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-c
  namespace: default
spec:
  selector:
    matchLabels:
      app: spring-cloud-c
  template:
    metadata:
      labels:
        app: spring-cloud-c
        msePilotCreateAppName: spring-cloud-c
        msePilotAutoEnable: 'on'
    spec:
      containers:
      - name: spring-cloud-c
        image: registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/spring-cloud-c:3.0.1
        imagePullPolicy: Always
        ports:
          - containerPort: 20003
        livenessProbe:
          tcpSocket:
            port: 20003
          initialDelaySeconds: 30
          periodSeconds: 60
        env:
        - name: spring.cloud.nacos.discovery.server-addr
          value: {nacos server address}
        - name: dubbo.registry.address
          value: 'nacos://{nacos server address}:8848'
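
If you prefer to script the placeholder replacement instead of editing the YAML by hand, a minimal sketch might look like the following. The `fill_nacos_address` helper and the endpoint `mse-example.nacos.internal` are made up for illustration; substitute the internal endpoint of your own MSE Nacos instance.

```python
PLACEHOLDER = "{nacos server address}"


def fill_nacos_address(manifest: str, endpoint: str) -> str:
    """Replace every placeholder (curly braces included) with the Nacos endpoint."""
    if not endpoint or "{" in endpoint or "}" in endpoint:
        raise ValueError("endpoint must be a plain host name without braces")
    return manifest.replace(PLACEHOLDER, endpoint)


snippet = """\
- name: spring.cloud.nacos.discovery.server-addr
  value: {nacos server address}
- name: dubbo.registry.address
  value: 'nacos://{nacos server address}:8848'
"""

# "mse-example.nacos.internal" is a hypothetical endpoint used only for illustration.
print(fill_nacos_address(snippet, "mse-example.nacos.internal"))
```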

Use an ACK cluster as the service source

  1. Deploy a self-managed Nacos instance as a service registry.

Important

The YAML code below registers the base version endpoint with the self-managed Nacos instance.

Show YAML code for the self-managed Nacos instance

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nacos-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nacos-server
  template:
    metadata:
      labels:
        msePilotAutoEnable: "off"
        app: nacos-server
    spec:
      containers:
        - name: nacos-server
          image: 'registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/nacos-server:v2.1.2'
          env:
            - name: MODE
              value: standalone
            - name: JVM_XMS
              value: 512M
            - name: JVM_XMX
              value: 512M
            - name: JVM_XMN
              value: 256M
          imagePullPolicy: Always
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 15
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 8848
            timeoutSeconds: 3
          readinessProbe:
            failureThreshold: 5
            initialDelaySeconds: 15
            periodSeconds: 15
            successThreshold: 1
            tcpSocket:
              port: 8848
            timeoutSeconds: 3
          resources:
            requests:
              cpu: '1'
              memory: 2Gi
      dnsPolicy: ClusterFirst
      restartPolicy: Always

---
apiVersion: v1
kind: Service
metadata:
  name: nacos-server
spec:
  type: ClusterIP
  ports:
    - name: nacos-server-8848-8848
      port: 8848
      protocol: TCP
      targetPort: 8848
    - name: nacos-server-9848-9848
      port: 9848
      protocol: TCP
      targetPort: 9848
  selector:
    app: nacos-server
  2. Deploy the base versions of Application A, Application B, and Application C.

Show YAML code for base versions (ACK service source)

# Base version of Application A
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-a
  namespace: default
spec:
  selector:
    matchLabels:
      app: spring-cloud-a
  template:
    metadata:
      labels:
        app: spring-cloud-a
        msePilotCreateAppName: spring-cloud-a
        msePilotAutoEnable: 'on'
    spec:
      containers:
      - name: spring-cloud-a
        image: registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/spring-cloud-a:3.0.1
        imagePullPolicy: Always
        ports:
          - containerPort: 20001
        livenessProbe:
          tcpSocket:
            port: 20001
          initialDelaySeconds: 30
          periodSeconds: 60
        # Access the self-managed Nacos instance
        env:
        - name: spring.cloud.nacos.discovery.server-addr
          value: nacos-server
        - name: dubbo.registry.address
          value: 'nacos://nacos-server:8848'
---
# Base version of Application B
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-b
  namespace: default
spec:
  selector:
    matchLabels:
      app: spring-cloud-b
  template:
    metadata:
      labels:
        app: spring-cloud-b
        msePilotCreateAppName: spring-cloud-b
        msePilotAutoEnable: 'on'
    spec:
      containers:
      - name: spring-cloud-b
        image: registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/spring-cloud-b:3.0.1
        imagePullPolicy: Always
        ports:
          - containerPort: 20002
        livenessProbe:
          tcpSocket:
            port: 20002
          initialDelaySeconds: 30
          periodSeconds: 60
        # Access the self-managed Nacos instance
        env:
        - name: spring.cloud.nacos.discovery.server-addr
          value: nacos-server
        - name: dubbo.registry.address
          value: 'nacos://nacos-server:8848'
---
# Base version of Application C
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-c
  namespace: default
spec:
  selector:
    matchLabels:
      app: spring-cloud-c
  template:
    metadata:
      labels:
        app: spring-cloud-c
        msePilotCreateAppName: spring-cloud-c
        msePilotAutoEnable: 'on'
    spec:
      containers:
      - name: spring-cloud-c
        image: registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/spring-cloud-c:3.0.1
        imagePullPolicy: Always
        ports:
          - containerPort: 20003
        livenessProbe:
          tcpSocket:
            port: 20003
          initialDelaySeconds: 30
          periodSeconds: 60
        # Access the self-managed Nacos instance
        env:
        - name: spring.cloud.nacos.discovery.server-addr
          value: nacos-server
        - name: dubbo.registry.address
          value: 'nacos://nacos-server:8848'
  3. Create a Kubernetes Service for Application A so the gateway can route traffic to it.

Show YAML code for the Application A service

apiVersion: v1
kind: Service
metadata:
  name: sc-a
  namespace: default
spec:
  ports:
    - port: 20001
      protocol: TCP
      targetPort: 20001
  selector:
    app: spring-cloud-a
  type: ClusterIP

Step 2: Expose Application A through the gateway

Configure the MSE cloud-native gateway to route external traffic to Application A. Choose one of the following approaches based on whether the service has already been added.

Add a new service

If Application A has not been added to the gateway, add it first:

  1. Log on to the MSE console. In the left-side navigation pane, choose Cloud-native Gateway > Gateways and click the gateway name. In the left-side navigation pane, click Routes. On the Services tab, click Add Service. For details, see Add a service.

    • ACK cluster as the service source: Set Service Source to Container Service, Namespace to default, and Services to sc-a.

    • MSE Nacos instance as the service source: Set Service Source to MSE Nacos, Namespace to public, and Services to sc-A.

  2. On the Routes tab, click Add Route to create a route for sc-a (or sc-A). For details, see Create a route.

    Parameter | Value
    Path | Select Prefix and enter /a
    Route Point | Select Single Service
    Backend Service | Select sc-a (ACK service source) or sc-A (MSE Nacos service source)

Use an existing service

If Application A is already imported, modify an existing route:

  1. Log on to the MSE console. In the left-side navigation pane, choose Cloud-native Gateway > Gateways and click the gateway name. On the Routes tab, modify the route.

    Parameter | Value
    Path | Select Prefix and enter /a
    Route Point | Select Single Service
    Backend Service | Select sc-a (ACK service source) or sc-A (MSE Nacos service source)

Step 3: Verify base version traffic

  1. Log on to the MSE console. Select a region in the top navigation bar.

  2. In the left-side navigation pane, choose Cloud-native Gateway > Gateways and click the gateway name.

  3. In the left-side navigation pane, click Overview. On the Endpoint tab, find the ingress IP address of the Server Load Balancer (SLB) instance.

  4. Send a request to verify that traffic flows through the base versions:

# Replace x.x.1.1 with the SLB ingress IP address
curl x.x.1.1/a

Expected output:

A[10.0.3.178][config=base] -> B[10.0.3.195] -> C[10.0.3.201]

None of the applications in the output carries a tag suffix (such as gray), confirming that all traffic goes through the base versions.
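
To check the response programmatically instead of by eye, the output line can be parsed into hops. The sketch below assumes the response format shown above (a one-letter application name, an optional lowercase tag, then the IP address in square brackets); `parse_chain` and `all_base` are hypothetical helper names, not part of any MSE tooling.

```python
import re

# One hop looks like "A[10.0.3.178][config=base]" or "Cgray[10.0.3.180]".
HOP_PATTERN = re.compile(r"^(?P<app>[A-Z])(?P<tag>[a-z0-9]*)\[(?P<ip>[^\]]+)\]")


def parse_chain(text: str) -> list:
    """Split a response into (application, tag, ip) tuples.

    A hop without a tag suffix is reported with the tag "base".
    """
    hops = []
    for part in text.strip().split(" -> "):
        match = HOP_PATTERN.match(part)
        if match is None:
            raise ValueError(f"unrecognized hop: {part!r}")
        hops.append((match["app"], match["tag"] or "base", match["ip"]))
    return hops


def all_base(text: str) -> bool:
    """True when every hop in the response was served by a base version."""
    return all(tag == "base" for _, tag, _ in parse_chain(text))


sample = "A[10.0.3.178][config=base] -> B[10.0.3.195] -> C[10.0.3.201]"
print(parse_chain(sample))
print(all_base(sample))  # True
```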

Step 4: Deploy canary versions of Application A and Application C

Application A and Application C receive new feature updates. Application B remains unchanged and continues to run the base version only.

  1. Log on to the ACK console.

  2. In the left-side navigation pane, click Clusters. Then, click the name of the target cluster.

  3. In the left-side navigation pane, choose Workloads > Deployments.

  4. Select the namespace and click Create from YAML.

  5. Paste the YAML code for the canary versions of Application A and Application C. Choose the appropriate YAML based on your service source.

Use an MSE Nacos instance as the service source

Important

Replace {nacos server address} with the internal endpoint of your MSE Nacos instance and remove the curly braces {}.

Show YAML code

# Canary version of Application A
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-a-gray
  namespace: default
spec:
  selector:
    matchLabels:
      app: spring-cloud-a-gray
  template:
    metadata:
      labels:
        alicloud.service.tag: gray
        app: spring-cloud-a-gray
        msePilotCreateAppName: spring-cloud-a
        msePilotAutoEnable: 'on'
    spec:
      containers:
      - name: spring-cloud-a
        image: registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/spring-cloud-a:3.0.1
        imagePullPolicy: Always
        ports:
          - containerPort: 20001
        livenessProbe:
          tcpSocket:
            port: 20001
          initialDelaySeconds: 30
          periodSeconds: 60
        env:
        - name: spring.cloud.nacos.discovery.server-addr
          value: {nacos server address}
        - name: dubbo.registry.address
          value: 'nacos://{nacos server address}:8848'
---
# Canary version of Application C
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-c-gray
  namespace: default
spec:
  selector:
    matchLabels:
      app: spring-cloud-c-gray
  template:
    metadata:
      labels:
        alicloud.service.tag: gray
        app: spring-cloud-c-gray
        msePilotCreateAppName: spring-cloud-c
        msePilotAutoEnable: 'on'
    spec:
      containers:
      - name: spring-cloud-c
        image: registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/spring-cloud-c:3.0.1
        imagePullPolicy: Always
        ports:
          - containerPort: 20003
        livenessProbe:
          tcpSocket:
            port: 20003
          initialDelaySeconds: 30
          periodSeconds: 60
        env:
        - name: spring.cloud.nacos.discovery.server-addr
          value: {nacos server address}
        - name: dubbo.registry.address
          value: 'nacos://{nacos server address}:8848'

Use an ACK cluster as the service source

The canary YAML adds alicloud.service.tag: gray to spec.template.metadata.labels to distinguish canary nodes from base nodes.

Important

The canary version endpoint is registered with the self-managed Nacos instance.

Show YAML code

# Canary version of Application A
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-a-gray
  namespace: default
spec:
  selector:
    matchLabels:
      app: spring-cloud-a
  template:
    metadata:
      labels:
        alicloud.service.tag: gray
        app: spring-cloud-a
        msePilotCreateAppName: spring-cloud-a
        msePilotAutoEnable: 'on'
    spec:
      containers:
      - name: spring-cloud-a
        image: registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/spring-cloud-a:3.0.1
        imagePullPolicy: Always
        ports:
          - containerPort: 20001
        livenessProbe:
          tcpSocket:
            port: 20001
          initialDelaySeconds: 30
          periodSeconds: 60
        # Access the self-managed Nacos instance
        env:
        - name: spring.cloud.nacos.discovery.server-addr
          value: nacos-server
        - name: dubbo.registry.address
          value: 'nacos://nacos-server:8848'

---
# Canary version of Application C
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-cloud-c-gray
  namespace: default
spec:
  selector:
    matchLabels:
      app: spring-cloud-c
  template:
    metadata:
      labels:
        alicloud.service.tag: gray
        app: spring-cloud-c
        msePilotCreateAppName: spring-cloud-c
        msePilotAutoEnable: 'on'
    spec:
      containers:
      - name: spring-cloud-c
        image: registry.cn-hangzhou.aliyuncs.com/mse-governance-demo/spring-cloud-c:3.0.1
        imagePullPolicy: Always
        ports:
          - containerPort: 20003
        livenessProbe:
          tcpSocket:
            port: 20003
          initialDelaySeconds: 30
          periodSeconds: 60
        # Access the self-managed Nacos instance
        env:
        - name: spring.cloud.nacos.discovery.server-addr
          value: nacos-server
        - name: dubbo.registry.address
          value: 'nacos://nacos-server:8848'
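
The relationship between a base manifest and its canary counterpart for the ACK service source can be summarized as a small transformation: the Deployment gets a new name, and the pod template gains the `alicloud.service.tag` label, while the `app` label stays the same. The sketch below works on a trimmed Python dict rather than on real YAML (the container spec is omitted), and `make_canary` is a hypothetical helper name.

```python
import copy


def make_canary(base: dict, tag: str = "gray") -> dict:
    """Copy a base Deployment dict, rename it, and tag its pods for the lane."""
    canary = copy.deepcopy(base)
    canary["metadata"]["name"] = f'{base["metadata"]["name"]}-{tag}'
    labels = canary["spec"]["template"]["metadata"]["labels"]
    labels["alicloud.service.tag"] = tag
    return canary


# Trimmed version of the base Deployment for Application A (containers omitted).
base_a = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "spring-cloud-a", "namespace": "default"},
    "spec": {
        "selector": {"matchLabels": {"app": "spring-cloud-a"}},
        "template": {
            "metadata": {
                "labels": {
                    "app": "spring-cloud-a",
                    "msePilotCreateAppName": "spring-cloud-a",
                    "msePilotAutoEnable": "on",
                }
            }
        },
    },
}

canary_a = make_canary(base_a)
print(canary_a["metadata"]["name"])  # spring-cloud-a-gray
print(canary_a["spec"]["template"]["metadata"]["labels"])
```

Because `msePilotCreateAppName` is unchanged, both Deployments belong to the same application (spring-cloud-a), and only the tag distinguishes canary nodes from base nodes.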

Step 5: Create a lane group

A lane group defines the set of applications that participate in the canary release.

  1. Log on to the MSE console and select a region in the top navigation bar.

  2. In the left-side navigation pane, choose Microservices Governance > Full link grayscale.

  3. Click Create Lane Group and Lane. If a lane group already exists in the selected microservice namespace, click Create Lane Group.

  4. In the Create Lane Group panel, configure the following parameters and click OK:

    Parameter | Value
    Lane Group Name | Enter a name for the lane group
    Ingress Type | Select MSE Cloud-native Gateway
    Ingress Gateway | Select the target cloud-native gateway
    Lane Group Application | Select spring-cloud-a, spring-cloud-b, and spring-cloud-c

    Create lane group

After creation, the lane group appears in the Lane Group section of the Full link grayscale page. To modify it, click the edit icon.

Step 6: Create a lane

A lane defines the routing rules that direct specific traffic to canary versions of your applications.

Note
  • Tag canary application nodes to distinguish them from base nodes. In a container environment, add alicloud.service.tag: ${tag} to spec.template.metadata.labels. In an Elastic Compute Service (ECS) environment, add the Java startup parameter -Dalicloud.service.tag=${tag}.
  • The lane routing mode must be the same for all lanes in a lane group. You can set the mode only when creating the first lane.

MSE supports two routing modes when the ingress type is MSE Cloud-native Gateway:

Routing mode | When to use | Behavior
Routing by request content | The request content (headers, parameters) can identify canary traffic | Canary requests stay within the same environment throughout the call chain
Routing by percentage | The request content cannot identify canary traffic and the system cannot be modified to add identifiers | Requests from the same source may be routed to different lanes

Create a lane with routing by request content

  1. In the lower part of the Full link grayscale page, click Create First Split Lane (or Create Lane if a lane already exists).

  2. In the Create Lane panel, configure the following parameters and click OK:

    Parameter | Configuration
    Add Node Tag | Add tags to canary application nodes to distinguish them from base nodes
    Enter lane information | Set Lane Tag to the tag value for requests routed to this lane. Use Confirm Matching Relationship to verify the expected number of tagged nodes
    Add Canary Release Rule | Set Canary Release Mode to Canary Release by Content. Set Canary Release Condition to Meet All Conditions. Configure the condition: Parameter Type = Header, Parameter = canary, Condition = ==, Value = gray

Canary release conditions

Condition | Effect
Meet All Conditions | Routes traffic that meets every specified condition
Meet Any Condition | Routes traffic that meets at least one specified condition
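
The difference between the two settings corresponds to a logical AND versus a logical OR over the configured conditions. The following sketch illustrates this with plain Python predicates; the `matches` function is a hypothetical name, and the `x-user-group` header is invented for the example and does not appear in the lane configuration above.

```python
from typing import Callable, Dict, List

Headers = Dict[str, str]
Condition = Callable[[Headers], bool]


def matches(headers: Headers, conditions: List[Condition], mode: str) -> bool:
    """Combine conditions: "all" mirrors Meet All Conditions, "any" mirrors Meet Any Condition."""
    results = (condition(headers) for condition in conditions)
    if mode == "all":
        return all(results)
    if mode == "any":
        return any(results)
    raise ValueError(f"unknown mode: {mode}")


conditions = [
    lambda h: h.get("canary") == "gray",
    lambda h: h.get("x-user-group") == "internal",  # hypothetical second header
]

request_headers = {"canary": "gray"}
print(matches(request_headers, conditions, "all"))  # False
print(matches(request_headers, conditions, "any"))  # True
```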

Condition operators

Operator | Description
== | Exact match. The traffic value must be exactly the same as the condition value.
!= | Not equal. The traffic value must differ from the condition value.
in | Inclusive match. The traffic value must be in the specified list.
Percentage | Hash-based match. Routes traffic when hash(get(key)) % 100 < value.
Regular expression | Regex match. The traffic value must match the specified pattern.
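
The operator semantics can be sketched as a single evaluation function. Two points are assumptions rather than documented behavior: the hash function MSE uses is not specified here, so `zlib.crc32` stands in for it, and the regular expression is applied to the whole value with `re.fullmatch`. The names `stable_hash` and `evaluate` are hypothetical.

```python
import re
import zlib


def stable_hash(value: str) -> int:
    """Stand-in hash (assumption); the function MSE actually uses is not documented here."""
    return zlib.crc32(value.encode("utf-8"))


def evaluate(operator: str, actual: str, expected) -> bool:
    """Evaluate one condition against the value taken from the request."""
    if operator == "==":
        return actual == expected
    if operator == "!=":
        return actual != expected
    if operator == "in":
        return actual in expected                      # expected is a list of values
    if operator == "percentage":
        return stable_hash(actual) % 100 < expected    # expected is a number from 0 to 100
    if operator == "regex":
        return re.fullmatch(expected, actual) is not None
    raise ValueError(f"unsupported operator: {operator}")


print(evaluate("==", "gray", "gray"))                  # True
print(evaluate("in", "user-7", ["user-1", "user-7"]))  # True
print(evaluate("regex", "user-42", r"user-\d+"))       # True
print(evaluate("percentage", "user-42", 100))          # True: hash % 100 is always below 100
```

Because the percentage operator hashes the value of the key, the same key value always produces the same decision, which is what distinguishes it from random sampling.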

Create a lane with routing by percentage

Note Routing by percentage requires ack-onepilot version 3.0.18 or later and agent version 3.2.3 or later.
  1. In the lower part of the Full link grayscale page, click Create First Split Lane (or Create Lane if a lane already exists).

  2. In the Create Lane panel, configure the following parameters and click OK:

    Parameter | Configuration
    Add Node Tag | Add tags to canary application nodes to distinguish them from base nodes
    Enter lane information | Set Lane Tag to the tag value for requests routed to this lane. Use Confirm Matching Relationship to verify the expected number of tagged nodes
    Configure Routing and Canary Release Rules | Set Canary Release Mode to Canary Release by Ratio. Set Flow ratio to 30 (percentage)
Note You can also configure different traffic percentages for each gateway base route. If you enable this, the sum of traffic percentages across all lane groups for a given base route must not exceed 100%.
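
The 100% constraint is simple arithmetic, and checking it before you configure several lanes can prevent a rejected configuration. The sketch below is illustrative only: `check_route_percentages` is a hypothetical helper, the `blue` lane is invented for the example, and treating the remainder as the share served by the base version is an assumption based on the fallback behavior described on this page.

```python
def check_route_percentages(percent_by_lane: dict) -> int:
    """Return the share left for the base version, or raise if the lanes exceed 100%."""
    for lane, percent in percent_by_lane.items():
        if not 0 <= percent <= 100:
            raise ValueError(f"{lane}: percentage must be between 0 and 100")
    total = sum(percent_by_lane.values())
    if total > 100:
        raise ValueError(f"lane percentages add up to {total}%, which exceeds 100%")
    return 100 - total


print(check_route_percentages({"gray": 30}))              # 70
print(check_route_percentages({"gray": 30, "blue": 20}))  # 50
```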

Manage lanes

After you create a lane, it appears in the Traffic Distribution section of the Full link grayscale page. Available actions:

Action | Effect
Enable | Activate the lane so traffic is routed based on the lane configuration. Matching traffic goes to the tagged application version. If no tagged version exists, traffic falls back to untagged versions.
Disable | Deactivate the lane. All traffic routes to untagged application versions.

You can also view the traffic percentage of the lane and configure application status in the lane from this page.

Step 7: Verify canary traffic

Verify routing by request content

Send a request with the canary: gray header to test that traffic flows through the canary versions:

# Replace x.x.x.x with the SLB ingress IP address
curl -H "canary: gray" x.x.x.x/a

Expected output:

Agray[10.0.3.177][config=base] -> B[10.0.3.195] -> Cgray[10.0.3.180]

The gray suffix on Application A and Application C confirms that canary traffic reaches the canary versions. Application B has no canary version, so traffic routes to its base version.

Verify routing by percentage

To test traffic distribution, save the Python script below as traffic.py, replace x.x.x.x with the SLB IP address of the cloud-native gateway, and run the following commands.

pip3 install requests
python3 traffic.py

Expand to view the Python script

from collections import Counter

import requests

TOTAL_REQUEST = 100
# Replace x.x.x.x with the SLB IP address of the cloud-native gateway.
ENTRY_URL = 'http://x.x.x.x/a'


def parse_tag(text: str) -> str:
    '''
    Return the lane tag of the last application in the call chain.

    Sample responses:
    A[10.0.23.64][config=base] -> B[10.0.23.65] -> C[10.0.23.61]
    Agray[10.0.23.64][config=base] -> B[10.0.23.65] -> Cgray[10.0.23.61]
    '''
    print(text)
    # The last hop looks like 'C[ip]' (base) or 'Cgray[ip]' (canary).
    last_app = text.split(' -> ')[-1]
    # Drop the one-letter application name; what remains is the tag.
    tag = last_app.split('[')[0][1:]
    return tag if tag else 'base'


def get_tag(url: str) -> str:
    # A timeout keeps the test from hanging if the gateway is unreachable.
    resp = requests.get(url, timeout=10)
    resp.encoding = resp.apparent_encoding
    return parse_tag(resp.text)


def cal_tag_count(url: str, total_request: int) -> None:
    # Count how many responses came back from each lane.
    count_map = Counter(get_tag(url) for _ in range(total_request))

    print()
    print('Total Request:', total_request)
    print('Traffic Distribution:', dict(count_map))


if __name__ == '__main__':
    cal_tag_count(ENTRY_URL, TOTAL_REQUEST)

Expected result: approximately 30% of requests are routed to the canary environment.
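
With only 100 requests, the canary count will rarely be exactly 30. As a rough guide to what "approximately" means, the sketch below computes a tolerance band under the assumption that each request is routed independently with probability 0.30 (a binomial model). That independence is an assumption; the gateway's actual distribution algorithm may produce a narrower spread. `expected_range` is a hypothetical helper name.

```python
import math


def expected_range(total: int, ratio: float, z: float = 3.0) -> tuple:
    """Return (low, high) canary counts within z standard deviations of the mean."""
    mean = total * ratio
    std = math.sqrt(total * ratio * (1 - ratio))
    low = max(0, math.floor(mean - z * std))
    high = min(total, math.ceil(mean + z * std))
    return low, high


low, high = expected_range(100, 0.30)
print(low, high)  # 16 44
```

Under this model, a canary count between 16 and 44 out of 100 is consistent with a 30% ratio. Increase TOTAL_REQUEST in the script to narrow the band relative to the total.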

Percentage routing test result

(Optional) Monitor canary traffic

Monitor canary traffic from any of the following locations to identify issues early.

Monitor from the cloud-native gateway

On the Routes page of the MSE cloud-native gateway, click the Services tab. Click the target service name, then view the metric data on the Monitor tab.

Gateway monitoring

Monitor from Microservices Governance

On the Full link grayscale page, click the target application. In the QPS Data section, view the traffic data for base and canary versions:

Metric | Description
Total QPS | Total queries per second for the application
Exception QPS | Queries per second that result in errors for the application
GrayQPS | Queries per second for the canary version
Microservices Governance monitoring

Monitor from Application Real-Time Monitoring Service (ARMS)

If the application is connected to ARMS, view canary traffic data on the Full link grayscale tab of the Scenario-based Analysis page in the ARMS console. Use this data to decide whether to roll back or proceed with the full release.

ARMS monitoring

What's next

  • After verifying that the canary versions are stable, promote them by updating the base deployments to use the new image versions and removing the canary deployments.

  • To roll back, disable the lane and delete the canary deployments. All traffic returns to the base versions automatically.