The release of Nacos 0.9 brings Nacos GA closer to reality. In fact, many enterprises, such as Huya, already run Nacos in production.
Generally, enterprises follow this development process: features are developed and tested in a test environment, then released in phases, and finally released to the production environment. To keep the production environment stable, the test environment must be isolated from the production environment. This inevitably raises the problem of managing multiple environments.
This article shows how Alibaba solves this issue by implementing Nacos-based environment isolation.
Before explaining environment isolation, let's first clearly understand what an environment is.
Currently the term "environment" does not have a universal definition. Some companies directly use the word "environment", and an environment in the Kubernetes architecture is called a "namespace", while at Alibaba Cloud, an environment is called a "region". This article defines an environment as a whole set of logically or physically independent systems that include all the components for processing user requests and specified types of requests, such as gateways, service frameworks, microservices registry centers, configuration centers, message systems, cache, and databases.
For example, many websites involve user IDs. We can use one set of systems to process user IDs ending with an even number and another set of systems to process user IDs ending with an odd number. See the following flow chart. The environment isolation that we mention here is physical isolation, that is, different environments are different machine clusters.
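The odd/even split described above can be sketched in a few lines of Python. This is only an illustration: the environment names `env-even` and `env-odd` and the function name are hypothetical, and a real deployment would make this decision in a gateway or routing layer rather than in application code.

```python
def environment_for_user(user_id: int) -> str:
    """Pick an environment from the parity of the user ID.

    The environment names are hypothetical placeholders.
    """
    # An ID ends with an even digit exactly when the ID itself is even.
    return "env-even" if user_id % 2 == 0 else "env-odd"


if __name__ == "__main__":
    for uid in (1000, 1001, 2468, 97531):
        print(uid, "->", environment_for_user(uid))
```

Because the rule depends only on the user ID, every request from the same user lands in the same environment.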
The previous section defines the environment as a set of systems consisting of all necessary components for processing user requests and specified types of requests. This section describes the advantages of environment isolation. From the definition, we see at least three advantages: fault isolation, fault recovery, and phased release.
First, an environment is a unit of independent components that can process user requests. That is, however long the processing path of a user request is, it always stays within a specific machine cluster. If those machines fail, only a portion of users is affected, and the fault is contained within that environment. For example, if we divide all machines into ten equally sized environments by user ID, a fault in one environment affects roughly one tenth of the users that the same fault would affect if all machines formed a single environment. This can significantly improve system availability.
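To make the "one tenth" figure concrete, here is a minimal sketch that partitions users into ten environments by user ID and measures how many users a single failed environment touches. The modulo-based partitioning and the function names are assumptions made for illustration; the article does not specify how user IDs are divided.

```python
ENV_COUNT = 10  # assumption: ten environments, as in the example above


def environment_index(user_id: int) -> int:
    """Assign a user to one of ENV_COUNT environments (hypothetical rule)."""
    return user_id % ENV_COUNT


def affected_fraction(user_ids, failed_env: int) -> float:
    """Return the share of users whose environment is the failed one."""
    ids = list(user_ids)
    hit = sum(1 for uid in ids if environment_index(uid) == failed_env)
    return hit / len(ids)


if __name__ == "__main__":
    # With evenly distributed IDs, one failed environment touches about 10% of users.
    print(affected_fraction(range(1000), failed_env=3))
```

The result depends on how evenly the IDs are spread: a skewed ID distribution would make some environments carry more users than others.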
Another important advantage of environment isolation is that it enables fast fault recovery. When a service in a certain environment encounters faults, environment isolation allows us to distribute configuration, change the routing direction of user requests and route requests to another environment to implement fault recovery in seconds. To do this, we need a powerful distributed system, especially a powerful configuration center like Nacos, to quickly push routing rule configuration data to application processes across the entire network.
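The recovery flow above can be sketched as a routing table that is updated whenever new rules are pushed. This is not the Nacos API: the `RoutingTable` class, its method names, and the rule format are hypothetical, and in practice `on_config_push` would be wired to a listener provided by the configuration center client.

```python
import threading


class RoutingTable:
    """Holds the current environment routing rules (hypothetical structure)."""

    def __init__(self, environments):
        self._lock = threading.Lock()
        # By default every environment serves its own traffic.
        self._target = {env: env for env in environments}

    def on_config_push(self, overrides):
        """Apply pushed rules, e.g. {"env-a": "env-b"} to move env-a traffic to env-b."""
        with self._lock:
            # Validate all rules first so a bad push changes nothing.
            for source, target in overrides.items():
                if source not in self._target or target not in self._target:
                    raise ValueError(f"unknown environment in rule: {source} -> {target}")
            self._target.update(overrides)

    def route(self, home_env):
        """Return the environment that should currently serve home_env's requests."""
        with self._lock:
            return self._target[home_env]


if __name__ == "__main__":
    table = RoutingTable(["env-a", "env-b"])
    print(table.route("env-a"))                # env-a serves its own users
    table.on_config_push({"env-a": "env-b"})   # env-a is faulty: reroute its users
    print(table.route("env-a"))
```

Recovery time in such a design is bounded by how quickly the push reaches every process, which is why the speed of the configuration center matters.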
Phased release is an indispensable part of the R&D process. In traditional R&D, testing and phased release are complicated and require testers to set up a variety of configurations, such as binding hosts and configuring JVM parameters or environment variables. Years of practice at Alibaba have shown that testing and phased release at Alibaba are friendly to both developers and testers. Environment isolation ensures that requests are processed on specified machine clusters and that developers and testers need no extra configuration, which significantly improves R&D efficiency.
The last two sections respectively describe the definition and role of environment isolation. This section shows how to implement environment isolation based on Nacos.
Nacos originated from the software load balancing group of the Alibaba middleware department. In our practical implementation of environment isolation, we isolate multiple physical clusters based on Nacos, and the Nacos client implements automatic environment routing without requiring any code changes.
Before we explain the implementation of environment isolation, let's first state some constraints:
The following shows the basic principles:
The previous section describes constraints and basic principles of the environment isolation based on CIDR blocks. However, how can we exactly implement an IP address server? The simplest method is the nginx-based implementation: Configure mapping between IPs and environments by using the geo module of nginx and then return static file content by using nginx.
geo $env {
    default        "";
    192.168.1.0/24 -env-a;
    192.168.2.0/24 -env-b;
}
# Configure the root path in the HTTP module
root /tmp/htdocs;
# Configure the following in the server module
location / {
    rewrite ^(.*)$ /$1$env break;
}
$ ll /tmp/htdocs/nacos/
total 0
-rw-r--r-- 1 user1 users 0 Mar 5 08:53 serverlist
-rw-r--r-- 1 user1 users 0 Mar 5 08:53 serverlist-env-a
-rw-r--r-- 1 user1 users 0 Mar 5 08:53 serverlist-env-b
$ cat /tmp/htdocs/nacos/serverlist
192.168.1.2
192.168.1.3
$ curl 'localhost:8080/nacos/serverlist'
192.168.1.2
192.168.1.3
At this point, this simple example of environment isolation based on IP CIDR blocks is ready to work. Nacos clients in different CIDR blocks automatically obtain different Nacos server IP lists, which implements environment isolation. The advantage of this method is that users do not need to configure any parameters: code and configuration remain the same in every environment. However, this method does require the underlying service providers to plan the network properly and maintain the related configuration.
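For readers who want to reason about the mapping outside nginx, the lookup performed by the geo block and the rewrite rule can be approximated in Python with the standard `ipaddress` module. This is a sketch only: the function name `serverlist_file` is hypothetical, and the code mirrors the two CIDR blocks from the example configuration rather than any real network plan.

```python
import ipaddress

# Same mapping as the geo block in the example; no match means the empty default suffix.
CIDR_TO_SUFFIX = [
    (ipaddress.ip_network("192.168.1.0/24"), "-env-a"),
    (ipaddress.ip_network("192.168.2.0/24"), "-env-b"),
]


def serverlist_file(client_ip: str, base: str = "serverlist") -> str:
    """Return the server list file name that a client IP would be served."""
    addr = ipaddress.ip_address(client_ip)
    matches = [(net, suffix) for net, suffix in CIDR_TO_SUFFIX if addr in net]
    if not matches:
        return base  # default environment
    # If blocks overlap, prefer the most specific (longest prefix) one.
    _, suffix = max(matches, key=lambda m: m[0].prefixlen)
    return base + suffix


if __name__ == "__main__":
    for ip in ("192.168.1.10", "192.168.2.200", "10.0.0.1"):
        print(ip, "->", serverlist_file(ip))
```

A client in 192.168.1.0/24 resolves to `serverlist-env-a`, a client in 192.168.2.0/24 to `serverlist-env-b`, and any other client falls back to the default `serverlist`.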
This article briefly explains the definition of environment isolation, the three advantages of environment isolation, and how-tos on implementing environment isolation based on CIDR blocks. An nginx-based environment isolation example for Endpoints is given at the end of this article. Note that this article only provides one feasible method. Maybe a more efficient implementation method can be used for the same purpose. If you have some better solutions, feel free to contribute them to the Nacos community or the official website.
Zheng Ji (GitHub ID: @jianweiwang), Senior Development Engineer at Alibaba, is responsible for the development of Nacos and its community maintenance.