Alibaba Implements Environment Isolation Based on Nacos

With the release of version 0.9, Nacos is one step closer to its official production-ready (GA) release. In fact, many companies, such as Huya Live, already run it in production.

This Wednesday (today), from 19:00 to 21:00, a preview of the Nacos 1.0.0 release features, together with upgrade and usage guidance, will be broadcast live in the Nacos DingTalk group.

Nacos Environment Isolation
An enterprise's R&D process usually runs as follows: features are developed and tested in a test environment, then rolled out in grayscale, and finally released to the production environment. To keep production stable, the test environment must also be isolated from the production environment. This inevitably raises the multi-environment problem, namely:

How is data in multiple environments isolated?
How can isolation be done gracefully (with no changes required from users)?
This article introduces Alibaba's practical experience with environment isolation in Nacos.

What is an environment?

When it comes to environment isolation, we should first define what an environment is.

There is currently no unified definition of the term. Some companies call it an environment, Alibaba Cloud calls it a region, and the Kubernetes architecture calls it a namespace. This article defines an environment as a logically or physically independent system that contains all the components needed to process user requests (gateways, service frameworks, microservice registries, configuration centers, messaging systems, caches, databases, and so on) and that handles a specified category of requests.

For example, many websites have the concept of a user ID, and requests can be partitioned by it: requests whose user ID ends in an even number are all handled by one system, while requests whose user ID ends in an odd number are handled by another. The environment isolation discussed here is physical isolation, meaning that different environments are different machine clusters.
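The parity split described above can be sketched in a few lines. This is only an illustration: the function `environment_for_user` and the environment names `env-even` and `env-odd` are hypothetical and are not part of Nacos.

```python
def environment_for_user(user_id: int) -> str:
    """Pick the environment that handles every request for this user ID."""
    # Hypothetical environment names: even IDs go to one cluster, odd IDs to the other.
    return "env-even" if user_id % 2 == 0 else "env-odd"


if __name__ == "__main__":
    for uid in (1000, 1001):
        print(uid, "->", environment_for_user(uid))
```

Because the choice depends only on the user ID, every request from the same user lands in the same machine cluster.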

What are the benefits of environment isolation?

The previous section defined an environment as a system that contains all the components needed to handle user requests of a specified category. This section discusses the benefits of environment isolation. From the definition, it follows that environment isolation has at least three benefits: fault isolation, fault recovery, and grayscale testing.

Fault isolation

First, because an environment is a self-contained unit that can process user requests, a request never leaves its designated machine cluster, no matter how long its processing chain is. Even if those machines fail, only some users are affected, so the fault is contained within a known scope. If all machines are divided into ten environments by user ID, a problem in one environment affects only one tenth of the users, which greatly improves system availability.
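The "one tenth" figure can be checked with a small calculation. The sketch below assumes users are assigned to environments by `user_id % 10` and that user IDs are evenly distributed; both the assignment rule and the names `ENV_COUNT`, `environment_index`, and `affected_fraction` are assumptions made for illustration.

```python
ENV_COUNT = 10  # assumed number of environments, as in the example above


def environment_index(user_id: int) -> int:
    """Assign a user to one of ENV_COUNT environments by user ID."""
    return user_id % ENV_COUNT


def affected_fraction(user_ids, failed_env: int) -> float:
    """Fraction of users whose requests land in the failed environment."""
    user_ids = list(user_ids)
    hit = sum(1 for uid in user_ids if environment_index(uid) == failed_env)
    return hit / len(user_ids)


if __name__ == "__main__":
    # With evenly distributed IDs, one failed environment touches a tenth of users.
    print(affected_fraction(range(1000), failed_env=3))
```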


Fault recovery

Another important benefit of environment isolation is fast recovery from failures. When the services of one environment have a problem, the routing of user requests can be changed quickly by pushing a configuration update, so that requests are routed to another environment and the fault is recovered within seconds. Of course, this requires a powerful distributed system, and in particular a powerful configuration center (such as Nacos) that can quickly push the routing rule configuration to application processes across the entire network.
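The rerouting idea can be sketched as a routing table with overrides. This is a simplified model under stated assumptions, not Nacos code: the `Router` class and its methods are hypothetical, and in a real system the override table would arrive through a configuration push rather than a direct method call.

```python
class Router:
    """Routes users to environments, with overrides a config center could push."""

    def __init__(self, env_count: int):
        self.env_count = env_count
        self.overrides = {}  # failed environment index -> replacement environment index

    def apply_config(self, overrides: dict) -> None:
        """Replace the override table, as a pushed configuration update would."""
        self.overrides = dict(overrides)

    def route(self, user_id: int) -> int:
        """Return the environment index that should serve this user."""
        home = user_id % self.env_count
        # Fall back to the home environment when no override is configured.
        return self.overrides.get(home, home)


if __name__ == "__main__":
    router = Router(env_count=10)
    print(router.route(13))        # home environment of user 13
    router.apply_config({3: 4})    # environment 3 failed, send its users to 4
    print(router.route(13))        # now served by the replacement environment
```

Recovery time in this model is bounded by how fast the new override table reaches every process, which is why the push speed of the configuration center matters.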

Grayscale testing

Grayscale testing is an integral part of the R&D process. In a traditional R&D process, the testing and grayscale stages require testers to set up various configurations, such as binding hosts, setting JVM parameters, and setting environment variables, which is rather troublesome. After years of practice, Alibaba's internal testing and grayscale processes have become very friendly to developers and testers: environment isolation ensures that requests are processed in a designated machine cluster, and developers and testers do not need to configure anything, which greatly improves R&D efficiency.

How does Nacos implement environment isolation?

The first two sections talked about the concept of environment and the role of environment isolation. This section introduces how to implement environment isolation based on Nacos.

Nacos grew out of the soft load team in Alibaba's middleware department. In our environment isolation practice, we isolate multiple physical clusters based on Nacos and route each client to its environment automatically.

Before we start, let's state some constraints:

All applications deployed on one machine belong to one environment;
By default, an application process connects to the Nacos of only one environment;
The IP of the machine where the client runs can be obtained by some means;
The user has planned the network segments of the machines.
The basic principle is:

The 32-bit IPv4 address space can be divided into many network segments, and medium and large enterprises generally plan their network segments by purpose. We can use this to isolate environments: IPs in different network segments belong to different environments, so that one segment belongs to environment A, another to environment B, and so on.
Nacos has two ways to initialize a client instance. One is to tell the client the IPs of the Nacos servers directly; the other is to tell the client an Endpoint, which the client queries over HTTP to obtain the IP list of the Nacos servers. Here, we use the second way.
Enhance the Endpoint. Configure the mapping between network segments and environments on the Endpoint side. When the Endpoint receives a request, it determines the client's environment from the network segment of the client's source IP, then looks up the server IP list of that environment and returns it to the client.
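The lookup performed by the enhanced Endpoint can be sketched with Python's standard `ipaddress` module. This is a minimal illustration, not the Endpoint's actual implementation: the network segments, environment names, and server IPs below are made-up example values, and `environment_for_ip` and `server_list_for` are hypothetical helper names.

```python
import ipaddress

# Hypothetical network plan: segment -> environment name (example values only).
SEGMENT_TO_ENV = {
    ipaddress.ip_network("10.0.1.0/24"): "env-a",
    ipaddress.ip_network("10.0.2.0/24"): "env-b",
}

# Hypothetical Nacos server lists per environment (example values only).
ENV_TO_SERVERS = {
    "env-a": ["10.0.1.10", "10.0.1.11"],
    "env-b": ["10.0.2.10", "10.0.2.11"],
}

DEFAULT_SERVERS = []  # returned when the client IP matches no planned segment


def environment_for_ip(client_ip: str):
    """Return the environment whose segment contains client_ip, or None."""
    addr = ipaddress.ip_address(client_ip)
    for segment, env in SEGMENT_TO_ENV.items():
        if addr in segment:
            return env
    return None


def server_list_for(client_ip: str):
    """Return the Nacos server IPs for the environment of client_ip."""
    env = environment_for_ip(client_ip)
    return ENV_TO_SERVERS.get(env, DEFAULT_SERVERS)


if __name__ == "__main__":
    print(server_list_for("10.0.1.25"))
    print(server_list_for("10.0.2.7"))
```

The client never names its environment; the Endpoint infers it from the source IP, which is why no client-side configuration differs between environments.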

An example of an environment isolation server

The sections above described the constraints and basic principle of environment isolation based on IP segments. So how do we implement an address server? The easiest way is to build it on nginx: use the nginx geo module to map IP segments to environments, and then have nginx return the content of static files.

Install nginx
Configure the geo mapping in nginx-proxy.conf:
geo $env {
    default "";
    # add one "<network segment> <environment suffix>;" entry per environment
}
Configure the nginx root path and forwarding rule; here it is enough to return the content of a static file:
# Configure the root path in the http block
root /tmp/htdocs;
# Configure in the server block
location / {
    rewrite ^(.*)$ /$1$env break;
}
Create the IP list files for the Nacos servers: in the /tmp/htdocs/nacos directory, create one file per environment whose name ends with the environment name. Each file contains server IPs, one per line:
$ ll /tmp/htdocs/nacos/
total 0
-rw-r--r-- 1 user1 users 0 Mar 5 08:53 serverlist
-rw-r--r-- 1 user1 users 0 Mar 5 08:53 serverlist-env-a
-rw-r--r-- 1 user1 users 0 Mar 5 08:53 serverlist-env-b
$ cat /tmp/htdocs/nacos/serverlist
Verify the setup by requesting the server list:
$ curl 'localhost:8080/nacos/serverlist'
At this point, a simple example of environment isolation based on IP network segments is working. Nacos clients in different network segments automatically obtain different Nacos server IP lists, which achieves environment isolation. The advantage of this approach is that users do not need to configure any parameters, and the code and configuration are identical in every environment. The cost is that the team providing the underlying services must do the network planning and the related configuration.
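From the client's point of view, the address server simply returns a text body with one server IP per line. The sketch below shows how such a response could be fetched and parsed. It is not the Nacos client's actual code; `parse_server_list` and `fetch_server_list` are hypothetical helpers, and the default URL is the one used in the curl check above.

```python
from urllib.request import urlopen


def parse_server_list(body: str):
    """Return the non-empty lines of an endpoint response, one server IP per line."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def fetch_server_list(url: str = "http://localhost:8080/nacos/serverlist",
                      timeout: float = 3.0):
    """Fetch the server list over HTTP and parse it into a list of IP strings."""
    with urlopen(url, timeout=timeout) as response:
        return parse_server_list(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the nginx address server from the example above to be running.
    print(fetch_server_list())
```

Which list comes back depends only on the source IP that nginx sees, so the same call works unchanged in every environment.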

This article briefly introduced the concept of an environment, the three benefits of environment isolation, and how Nacos implements environment isolation based on network segments. It closed with an example configuration that uses nginx as the Endpoint server. Note that this article describes only one feasible method; more elegant implementations may exist. If you have a better method, you are welcome to contribute it to the Nacos community or the official website.

About the author: Zhengji (GitHub ID @jianweiwang) is a senior development engineer at Alibaba, responsible for the development and community maintenance of Nacos.
