MSE Implements Full Link Grayscale Based on Apache APISIX

What is full link grayscale?

In a microservice architecture, the dependencies between services are complex, and releasing a feature sometimes requires several services to be upgraded and launched at the same time. We want to verify the new versions of all these services together using a small share of traffic. This is the full link grayscale scenario, which is specific to microservice architectures. By building environment isolation that extends from the gateway through all back-end services, multiple services in different versions can be verified in grayscale together.

During a release, we only need to deploy the grayscale versions of the services. As traffic flows along the call link, the gateways, middleware, and microservices it passes through identify grayscale traffic and dynamically forward it to the grayscale version of the corresponding service. As shown below:

The figure above illustrates the effect of this scheme. Different colors represent grayscale traffic of different versions. Both the microservice gateway and the microservices themselves need to identify traffic and make dynamic decisions based on governance rules. When a service version changes, the forwarding along the call link changes in real time. Compared with grayscale environments built from dedicated machines, this scheme saves substantial machine cost and operations effort, and helps developers control online traffic across the full link quickly and precisely.

So how do we achieve full link grayscale? Based on the discussion above, we need to address the following issues:

1. Each component and service on the link can dynamically route based on request traffic characteristics

2. It is necessary to group all nodes under the service to distinguish versions

3. It is necessary to conduct grayscale identification and version identification for traffic

4. It is necessary to identify different versions of grayscale traffic

The following introduces the v1alpha1 traffic routing standard defined by OpenSergo and uses it to explain the technical details required to achieve full link grayscale.

Q: What is OpenSergo?

A: OpenSergo is a set of open, general-purpose service governance standards for distributed service architectures, covering the full link heterogeneous ecosystem. It forms a common standard for service governance based on industry scenarios and practices. Its most distinctive feature is that it defines service governance rules through a unified set of configurations, DSLs, and protocols, and targets multilingual heterogeneous architectures to cover the full link. Whether a microservice is written in Java, Go, Node.js, or another language, and whether it is a standard microservice or accessed through a mesh, developers can use the same set of OpenSergo CRD standard configurations to govern every layer uniformly, from gateway to microservice, from database to cache, and from service registration and discovery to configuration. They do not need to care about the differences between frameworks and languages, which reduces the complexity of governing heterogeneous services across the full link.

Q: Why introduce OpenSergo to me before understanding full link grayscale?

A: OpenSergo defines a unified set of YAML configuration methods to implement full link service governance specifications for distributed architectures. While introducing the specifications and standards, we can understand the implementation of the technical details. At the same time, we can also implement new components with OpenSergo standards.

OpenSergo traffic routing v1alpha1 standard

Traffic routing, as the name implies, is to route traffic with certain attribute characteristics to a specified destination. Traffic routing is an important part of traffic governance. Developers can implement various scenarios based on traffic routing standards, such as grayscale publishing, canary publishing, disaster tolerance routing, label routing, and so on.

Full link grayscale example:

The traffic routing rules (v1alpha1) are mainly divided into three parts:

• Workload label rule (WorkloadLabelRule): labels a group of workloads, which can be understood as labeling each upstream in APISIX

• Traffic label rule (TrafficLabelRule): labels traffic that has certain attribute characteristics

• Label-based routing: matches workload labels against traffic labels and routes traffic carrying a given label to the matching workloads

We can assign different semantics to tags to achieve routing capabilities in various scenarios.

Labeling traffic:

Traffic with certain attribute characteristics needs to be labeled accordingly.

Suppose we need to route an internal test user to the new version of the home page in grayscale. The test user has uid=12345, and the UID is carried in the X-User-Id header.

A traffic label rule for this case marks HTTP traffic whose path is /index and whose UID header value is 12345 with the gray label, indicating that it is grayscale traffic.
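The labeling step described here can be sketched in Python. This is only an illustration of the matching logic, not the OpenSergo implementation: the `Request` and `TrafficLabelRule` classes, their fields, and the `label_for` helper are hypothetical names, and the header name `X-User-Id` is assumed from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Request:
    path: str
    headers: Dict[str, str] = field(default_factory=dict)


@dataclass
class TrafficLabelRule:
    """Hypothetical in-memory model of a traffic labeling rule."""
    label: str
    path: str
    header_key: str
    header_value: str

    def matches(self, request: Request) -> bool:
        # Every condition must hold for the rule to apply.
        # Header keys are compared exactly here; a real gateway would
        # treat HTTP header names as case-insensitive.
        return (
            request.path == self.path
            and request.headers.get(self.header_key) == self.header_value
        )


def label_for(request: Request, rule: TrafficLabelRule) -> Optional[str]:
    """Return the rule's label if the request matches, otherwise None."""
    return rule.label if rule.matches(request) else None


rule = TrafficLabelRule(label="gray", path="/index",
                        header_key="X-User-Id", header_value="12345")

tester = Request(path="/index", headers={"X-User-Id": "12345"})
other = Request(path="/index", headers={"X-User-Id": "67890"})
```

Here `label_for(tester, rule)` yields the gray label, while `other` stays unlabeled and is treated as ordinary traffic.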

Labeling workloads:

So how do we add different tags to service nodes? With cloud native technology in wide use, most businesses are moving their applications to containers. Taking a containerized application as an example, the following describes how to label a service workload node in two scenarios: using a Kubernetes Service for service discovery, and using the popular Nacos registry.

In a business system that uses a Kubernetes Service for service discovery, the service provider exposes itself by submitting a Service resource to the API server. The service consumer watches the Endpoints resource associated with the Service, obtains the associated business Pods from it, and reads their labels as the metadata of each node. Therefore, we only need to add a label to the Pod template in the Deployment resource that describes the business application.
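As a rough sketch of the Kubernetes case, the snippet below expresses a Deployment as a Python dict and shows how a consumer could group nodes by the label read from each Pod. The label key `version`, the application name `app-a`, the image name, and both helper functions are assumptions made for illustration; the source does not specify which label key is used.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical Deployment manifest expressed as a Python dict.
# The label key "version" and all names are assumptions for illustration.
gray_deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "app-a-gray"},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "app-a", "version": "gray"}},
        "template": {
            # Labels placed here are copied onto every Pod of the Deployment.
            "metadata": {"labels": {"app": "app-a", "version": "gray"}},
            "spec": {"containers": [{"name": "app-a", "image": "example/app-a:gray"}]},
        },
    },
}


def pod_labels(deployment: dict) -> Dict[str, str]:
    """Labels every Pod created from this Deployment will carry."""
    return deployment["spec"]["template"]["metadata"]["labels"]


def group_by_version(pods: List[dict], default: str = "base") -> Dict[str, List[str]]:
    """Group Pod IPs by version label, as a consumer might do after
    resolving the Pods behind a Service. Unlabeled Pods count as baseline."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for pod in pods:
        version = pod["labels"].get("version", default)
        groups[version].append(pod["ip"])
    return dict(groups)


pods = [
    {"ip": "10.0.0.1", "labels": {"app": "app-a"}},
    {"ip": "10.0.0.2", "labels": pod_labels(gray_deployment)},
]
```

Calling `group_by_version(pods)` separates the unlabeled Pod into the baseline group and the labeled Pod into the gray group, which is the node grouping that later routing decisions rely on.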

In business systems that use Nacos for service discovery, the labeling method generally depends on the microservice framework in use. If a Java application uses the Spring Cloud microservice framework, we can add the corresponding environment variable to the business container to add the label. For example, to give a node the version label gray, add gray to the business container, so that the framework attaches the gray label to the node when registering it with Nacos.

For some complex workload marking scenarios (such as database instances and cache instance tags), we can use WorkloadLabelRule CRD for marking.

Traffic coloring:

How do components on a request link identify different grayscale traffic? The answer is traffic coloring, which adds a grayscale identifier to request traffic so that it can be told apart. Traffic can be colored at the source of the request: the front end marks the traffic based on user or platform information when it initiates the request. If the front end cannot do this, the microservice gateway can dynamically add a traffic identifier to requests that match specific routing rules. In addition, when traffic passes through a grayscale node on the link and the request carries no grayscale identifier, the request needs to be colored automatically, so that it prefers the grayscale version of each service for the rest of the call.
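The two coloring points described above, at the gateway and at a grayscale node, can be sketched as follows. This is a simplified model under stated assumptions: the header name `x-traffic-tag` is hypothetical (the source does not name the identifier), and the gateway rule `env=gray` is borrowed from the scenario later in this article.

```python
from typing import Dict, Optional

GRAY_HEADER = "x-traffic-tag"  # hypothetical header name, not defined by the source


def color_at_gateway(headers: Dict[str, str], params: Dict[str, str]) -> Dict[str, str]:
    """Gateway-side coloring: tag requests that match a routing rule
    (here: query parameter env=gray) unless the client already tagged them."""
    colored = dict(headers)
    if GRAY_HEADER not in colored and params.get("env") == "gray":
        colored[GRAY_HEADER] = "gray"
    return colored


def color_at_node(headers: Dict[str, str], node_tag: Optional[str]) -> Dict[str, str]:
    """Node-side coloring: a tagged (grayscale) node adds its own tag to
    untagged requests so that downstream hops prefer the same version."""
    colored = dict(headers)
    if GRAY_HEADER not in colored and node_tag is not None:
        colored[GRAY_HEADER] = node_tag
    return colored
```

Both functions leave an existing identifier untouched, so a tag set at the source of the request takes precedence over later coloring.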

Currently, OpenSergo v1alpha1 does not define a detailed standard for traffic coloring, and the design of such a standard is open for discussion with the community. Apache APISIX will also adapt to the OpenSergo standards, so developers can use the same set of OpenSergo CRD standards to govern the traffic gateway layer in a unified way and get more value from a microservice architecture built on Apache APISIX.

Full link grayscale is one of the core capabilities of microservice governance, and one that cloud users need as their adoption of microservices deepens. Because full link grayscale involves many technologies and scenarios, an enterprise that implements each of them on its own has to spend considerable labor on extending and maintaining them.

Product Practice of Full Link Gray Scale Scheme Based on Apache APISIX

After introducing the technology, let's introduce the full link grayscale product practice based on Apache APISIX on Alibaba Cloud.


Step 1: Install the Ingress-APISIX component

The APISIX architecture is shown below. We need to install APISIX.

1. Install components such as apisix, apisix-ingress-controller, and etcd

You can see the stateless apisix and apisix-ingress-controller applications and the stateful etcd application under the ingress-apisix namespace.

2. Install APISIX Admin

After installation, you can bind an SLB

Access the APISIX console through {slb ip}:9000 (default username/password: admin/admin)

Step 2: Enable microservice governance

In this step, you need to activate MSE microservice governance, install the MSE service governance component (ack-onepilot), and enable microservice governance for the application. For specific operations, please refer to Alibaba Cloud's official tutorial:

Step 3: Deploy the Demo application

Deploy three applications, A, B, and C, in Alibaba Cloud Container Service, each with a base version and a gray version, and deploy a Nacos Server application for service discovery. For details, please refer to this tutorial to complete application deployment: Deploying a Demo application. After the deployment is completed, you can configure the upstreams for the applications' services through the APISIX Dashboard.

Application scenario: Routing according to specified request parameters to achieve full link grayscale

Some clients cannot rewrite the domain name and hope to access by passing in different parameters to route to a grayscale environment. For example, in the following figure, the request parameter env=gray is used to access the grayscale environment.

The call link is Ingress-APISIX -> A -> B -> C, where A can be a Spring Boot application.

Configure APISIX routing rules

Select Route on the APISIX Dashboard and click Create. Create a new advanced matching rule in the matching criteria, set the request path to /*, and select the corresponding upstream. Configure the following routes respectively:

• When the host is and the request parameter env=gray, the route preferentially matches the upstream with ID 401163331936715388, that is, spring-cloud-a-gray-svc;

• When the host is, the route matches the upstream with ID 401152455435354748, that is, spring-cloud-a-svc.

Configure the routing corresponding to Gray.
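The two routes can be modeled roughly as follows. The dicts are shaped like APISIX route objects (`uri`, `host`, `vars`, `priority`, `upstream_id`), but the host value `example.com` is a placeholder because the source does not state it, the priority values are assumptions, and `select_upstream` is a simplified stand-in for the gateway's matcher that only understands `==` conditions on query arguments.

```python
from typing import Dict, List, Optional

# Route definitions shaped like APISIX route objects. The host value is a
# placeholder (the source does not state it); upstream IDs come from the text.
routes: List[dict] = [
    {
        "uri": "/*",
        "host": "example.com",
        "priority": 1,  # assumed value: evaluated before the baseline route
        "vars": [["arg_env", "==", "gray"]],
        "upstream_id": "401163331936715388",  # gray upstream of A
    },
    {
        "uri": "/*",
        "host": "example.com",
        "priority": 0,
        "upstream_id": "401152455435354748",  # baseline upstream of A
    },
]


def select_upstream(host: str, args: Dict[str, str]) -> Optional[str]:
    """Simplified matcher: checks the host and "==" conditions on query
    arguments, trying higher-priority routes first."""
    for route in sorted(routes, key=lambda r: r["priority"], reverse=True):
        if route["host"] != host:
            continue
        conditions = route.get("vars", [])
        if all(
            op == "==" and args.get(name[len("arg_"):]) == value
            for name, op, value in conditions
        ):
            return route["upstream_id"]
    return None
```

A request carrying env=gray is sent to the gray upstream because that route is tried first; every other request on the same host falls through to the baseline upstream.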

Configure MSE full link grayscale

You need to configure and complete the full link publishing of MSE. For specific operation details, please refer to this tutorial: Configure Full Link Gray Scale.

Result Validation

At this point, visit to route to the baseline environment

At this time, when visiting and env=gray, route to the grayscale environment

Note: is the public IP address of APISIX


Currently, the full link grayscale capability of MSE service governance supports cloud native gateways, ALB, APISIX, Apache Dubbo, Spring Cloud, RocketMQ, and databases.

Combining the flexible routing capabilities of Apache APISIX with MSE's full link grayscale capabilities makes it possible to implement enterprise-level full link grayscale quickly. APISIX supports routing by headers, cookies, request parameters, and domain names. It is only necessary to route traffic to different "swimlane" environments at the gateway as needed; the traffic then stays within the "swimlane" of the corresponding label automatically. When a service that the call chain depends on has no version in the swimlane, the traffic falls back to the baseline environment, and is routed back to the swimlane of the corresponding label at a later hop when necessary.
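The swimlane behavior, including the fallback to the baseline environment, can be illustrated with a small simulation. The `deployed` table, the version names, and both functions are hypothetical; the sketch only assumes that the traffic label travels with the request along the call chain A -> B -> C.

```python
from typing import Dict, List

# Hypothetical deployment: which versions of each service exist.
# Here B has no gray version.
deployed: Dict[str, List[str]] = {
    "A": ["base", "gray"],
    "B": ["base"],
    "C": ["base", "gray"],
}


def pick_version(service: str, tag: str) -> str:
    """Prefer the version matching the traffic tag; fall back to baseline."""
    return tag if tag in deployed[service] else "base"


def trace(chain: List[str], tag: str) -> List[str]:
    """Versions visited along the call chain. The tag travels with the
    request, so a later hop can return to the swimlane after a fallback."""
    return [f"{service}:{pick_version(service, tag)}" for service in chain]
```

With this table, `trace(["A", "B", "C"], "gray")` visits the gray version of A, falls back to the baseline version of B, and returns to the gray version of C.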

Service governance becomes unavoidable once microservice transformation reaches a certain stage, and along the way we keep encountering new problems.

• What other capabilities does service governance have besides full link grayscale?

• Is there a standard definition of service governance capability, and what does it include?

• Are there any best practices or standards for full links in multilingual scenarios?

• How can heterogeneous microservices be uniformly governed?

While exploring service governance and integrating with other microservice systems, we found that differing governance systems cause great difficulty, and that connecting two or more governance systems is very costly. For this reason, we proposed the OpenSergo project. OpenSergo aims to solve the fragmentation of concepts and the lack of interoperability in microservice governance across different frameworks and languages.
