Technical Knowledge Sharing | An Interpretation of CEN 2.0 Technology

This article discusses the features, architecture, and usage scenarios of CEN 2.0.

By Alibaba Cloud Network

1. Features of CEN 2.0

[Figure 1]

Cloud Enterprise Network (CEN) is a networking service for building private enterprise networks. It provides an intent-based global cloud network that interconnects data centers and multiple regions worldwide, such as Beijing and Hangzhou. It can also connect various cloud services, such as OSS and RDS. Private network connectivity of this kind is the basic capability of CEN.

Compared with CEN 1.0, CEN 2.0 extends these connection capabilities: a VPC can be attached to multiple CEN instances, and in-cloud and cross-domain multicast with nearby forwarding is planned for launch. In terms of scale, CEN 2.0 supports ultra-large networks, with up to 1,000 VPC attachments in a single region and up to 5,000 routes worldwide, which is 100 times the original networking scale.

CEN supports dynamic route propagation to simplify network management and maintenance. It also supports route aggregation and static routing, which reduce the number of routes that need to be managed and allow fine-grained network maintenance. Support for multiple route tables, associated forwarding correlations, and service chaining lets you easily integrate security services (such as firewalls) into your private networks to enhance network security. In addition, flow logs, traffic marking, ledger, and other capabilities improve the manageability of the network.
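To make the idea of route aggregation concrete, here is a minimal sketch that uses Python's standard `ipaddress` module to collapse contiguous prefixes into fewer routes. The CIDR blocks and the helper name `aggregate_routes` are illustrative assumptions, not CEN configuration or a CEN API.

```python
import ipaddress


def aggregate_routes(cidrs):
    """Collapse adjacent or overlapping prefixes into the fewest exact covering prefixes."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    # collapse_addresses only merges prefixes that combine exactly;
    # it never widens a route to cover addresses that were not advertised.
    return [str(network) for network in ipaddress.collapse_addresses(networks)]


# Hypothetical routes learned from several VPC attachments.
vpc_routes = [
    "10.0.0.0/24",
    "10.0.1.0/24",
    "10.0.2.0/24",
    "10.0.3.0/24",
    "192.168.5.0/24",
]

print(aggregate_routes(vpc_routes))  # ['10.0.0.0/22', '192.168.5.0/24']
```

The four contiguous /24 prefixes become a single /22, so five routes shrink to two without changing which destinations are reachable.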

2. Architecture of CEN 2.0

[Figure 2]

The preceding figure shows the technical architecture behind CEN 2.0. CEN 2.0 is a networking service developed on top of Alibaba Cloud Luoshen Cloud Network Technology.

The underlying layer consists of infrastructure resources, such as data center networks, WANs, Internet, and Express Connect circuits.

The second layer is an architecture that integrates software and hardware, including servers, Mesh of Clusters (MoC), Field-Programmable Gate Array (FPGA), and programmable switch chips. This layer consists of high-performance network gateways and virtual machines that virtualize computing resources and networks.

The third layer is an elastic and open virtual network element (NE) platform named CyberStar. This platform provides environments, disaster recovery capabilities, and elastic scheduling for applications.

The top layer consists of various NEs. The NEs in this layer can focus on business logic without worrying about the underlying layers. In addition to transit routers (TRs), this layer hosts many other NEs, such as NAT Gateway and Application Load Balancer (ALB).

The CEN SDN control plane on the right side functions as the brain of the intent-based network of CEN 2.0. CEN 1.0 used Software-Defined Networking (SDN) as the brain to provide various features. Apsara Network Intelligence is used to analyze networks and collect metrics and insights into the network status.

3. Intelligent Controller for CEN 2.0

[Figure 3]

The CEN SDN controller functions as a brain that translates user intents and configurations into resources and connectivity configurations that can be used to connect private networks. The CEN SDN controller can also receive events and trigger scheduling activities to optimize the underlying services. The CEN SDN controller provides the following benefits:

  1. The CEN SDN controller adopts the SDN approach and uses remote procedure calls (RPCs) to inject all routes into the controller. These routes are related to VPCs, Express Connect, VPN, and Cloud Connect Network (CCN). Unlike the conventional method of injecting all routes into the routing protocol stack, this approach gives users high flexibility. Users gain full routing control through features such as route maps. For example, users can modify routes or filter routes by attribute. In addition, these routing capabilities can be used with cloud services such as Server Load Balancer (SLB) and Data Transmission Service (DTS).
  2. The CEN SDN controller provides intelligent awareness. The controller is aware of the location and demand of your business and empowers your business with intelligence, including nearby access, intelligence services, Quality of Service (QoS), disaster recovery, and auto-scaling.
  3. The CEN SDN controller provides a large number of planes and supports ultra-large networks. The CEN SDN controller adopts in-memory (RAM) computing. Earlier versions of the controller generated a large amount of data that had to be persisted, which significantly increased I/O overhead. In CEN 2.0, the controller processes two types of data: topology data and status data. Most of the data is status data, which is updated dynamically, whereas topology data is relatively static.

The controller used in CEN 2.0 stores most of the status data in RAM in a layered or distributed manner to ensure the reliability of data and increase the efficiency of data retrieval. This significantly improves the performance of the controller used in CEN 2.0.
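The route control mentioned in the first benefit above (modifying routes or filtering them by attribute) can be sketched as an ordered list of match-and-action rules. This is only an illustration under stated assumptions: the `Route` and `Rule` classes, the attribute names, the priority ordering, and the permit-by-default behavior are all hypothetical and are not taken from the CEN route map implementation.

```python
import ipaddress
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Route:
    prefix: str           # destination CIDR block
    source: str           # hypothetical attribute, e.g. "VPC", "VBR", "VPN", "CCN"
    preference: int = 50  # hypothetical attribute that a rule may rewrite


@dataclass(frozen=True)
class Rule:
    priority: int                         # lower numbers are evaluated first
    action: str                           # "permit" or "deny"
    match_source: Optional[str] = None    # None matches any source
    match_within: Optional[str] = None    # None matches any prefix
    set_preference: Optional[int] = None  # optional rewrite applied on permit

    def matches(self, route):
        if self.match_source is not None and route.source != self.match_source:
            return False
        if self.match_within is not None:
            prefix = ipaddress.ip_network(route.prefix)
            if not prefix.subnet_of(ipaddress.ip_network(self.match_within)):
                return False
        return True


def apply_route_map(rules, routes):
    """Filter and modify routes; the first matching rule decides each route."""
    ordered = sorted(rules, key=lambda rule: rule.priority)
    accepted = []
    for route in routes:
        for rule in ordered:
            if rule.matches(route):
                if rule.action == "permit":
                    if rule.set_preference is not None:
                        route = replace(route, preference=rule.set_preference)
                    accepted.append(route)
                break
        else:
            # Assumption for this sketch: routes that match no rule are permitted.
            accepted.append(route)
    return accepted


rules = [
    Rule(priority=10, action="deny", match_source="VPN", match_within="10.0.0.0/8"),
    Rule(priority=20, action="permit", match_source="VBR", set_preference=10),
]
routes = [
    Route("10.1.0.0/16", "VPN"),
    Route("172.16.0.0/12", "VBR"),
    Route("192.168.0.0/24", "VPC"),
]
for route in apply_route_map(rules, routes):
    print(route)
```

In this run the VPN route inside 10.0.0.0/8 is filtered out, the VBR route is kept with its preference rewritten to 10, and the VPC route passes through unchanged.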

4. Forwarding Network Element of CEN 2.0

[Figure 4]

Transit routers are the key component used to forward data in CEN 2.0. Transit routers run on the CyberStar platform and are visible to tenants. CyberStar is a network functions virtualization (NFV) platform provided in Luoshen 3.0. CyberStar manages Elastic Compute Service (ECS) clusters on demand to run workloads.

ECS instances that run workloads are deployed in VPCs. All VPCs use the ENI-bonding technology to redirect user traffic to the ECS cluster connected to the transit router. Adopting ENI-bonding preserves the features of VPCs and elastic network interfaces (ENIs) for the VPC attachments between transit routers and tenants. For example, ENI-bonding can be used with subnet routing 2.0 to implement segmentation or service chaining.

The middle layer in the preceding figure is an ECS resource pool. Each ENI-bonding is associated with multiple ECS instances. User traffic from each VPC is redirected to the associated ECS instances. This horizontally scales the processing capability. An ECS cluster can be automatically scaled out to handle unexpected traffic spikes.

The ECS clusters are deployed in different zones. User traffic is preferably routed to the ECS cluster in the local zone. This way, user traffic can be processed on ECS instances in the local zone to reduce network latency. In addition, multiple zones are used to implement disaster recovery and horizontally scale the computing capacity.
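The zone-local preference described above can be sketched as a small selection function: prefer a healthy forwarding instance in the traffic's own zone, and fall back to other zones only when none is available. The data layout, the function name `pick_instance`, and the use of a CRC32 flow hash to spread flows are assumptions made for illustration; the article does not describe the actual scheduling algorithm.

```python
import zlib


def pick_instance(flow_key, source_zone, instances_by_zone):
    """Pick a forwarding instance for a flow, preferring the source zone."""
    local = [
        instance
        for instance in instances_by_zone.get(source_zone, [])
        if instance["healthy"]
    ]
    # Fall back to healthy instances in any zone only if the local zone has none.
    candidates = local or [
        instance
        for zone in sorted(instances_by_zone)
        for instance in instances_by_zone[zone]
        if instance["healthy"]
    ]
    if not candidates:
        raise RuntimeError("no healthy forwarding instance available")
    # A stable hash keeps one flow on one instance while spreading flows across the pool.
    index = zlib.crc32(flow_key.encode("utf-8")) % len(candidates)
    return candidates[index]["name"]


# Hypothetical resource pool spread over two zones.
pool = {
    "zone-a": [{"name": "ecs-a1", "healthy": True}, {"name": "ecs-a2", "healthy": True}],
    "zone-b": [{"name": "ecs-b1", "healthy": True}],
}
print(pick_instance("10.0.0.5:443->10.1.0.9:52000", "zone-a", pool))
```

Adding instances to a zone's list enlarges the candidate set, which is one simple way to picture the horizontal scaling of the resource pool.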

Disaster recovery is handled differently depending on how many ECS instances are down. When a small number of ECS instances in a cluster are down, the system uses the scaling capability of ENI-bonding to isolate the unhealthy instances and create the same number of replacement instances, which ensures business continuity. When a large number of ECS instances are down, implementing disaster recovery within a single cluster is difficult. In that case, the system redirects user traffic in VPC attachments to the ECS clusters that tenants have specified in other zones, which minimizes the impact of service interruptions.
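The two recovery paths above can be summarized as a simple decision function. This is a sketch only: the article distinguishes "a small number" from "a large number" of failed instances without giving a threshold, so the `failover_ratio` parameter, the function name `plan_recovery`, and the returned dictionary shape are assumptions.

```python
def plan_recovery(instance_health, backup_zones, failover_ratio=0.5):
    """Decide how to recover a cluster, given {instance_name: is_healthy}."""
    unhealthy = sorted(name for name, healthy in instance_health.items() if not healthy)
    if not unhealthy:
        return {"action": "none"}
    if len(unhealthy) / len(instance_health) < failover_ratio:
        # Few failures: isolate the bad instances and create the same number of new ones.
        return {"action": "replace", "isolate": unhealthy, "create": len(unhealthy)}
    if not backup_zones:
        raise RuntimeError("large-scale failure and no other zone was specified")
    # Many failures: redirect the attachment's traffic to a cluster in another zone.
    return {"action": "failover", "target_zone": backup_zones[0]}


small_failure = {"ecs-1": True, "ecs-2": True, "ecs-3": False, "ecs-4": True}
large_failure = {"ecs-1": False, "ecs-2": False, "ecs-3": False, "ecs-4": True}

print(plan_recovery(small_failure, ["zone-b"]))
print(plan_recovery(large_failure, ["zone-b"]))
```

The second call only succeeds because a backup zone is supplied, which illustrates why specifying more than one zone for an attachment matters.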

CEN 2.0 allows you to specify a single zone when you create a VPC attachment. However, we recommend specifying multiple zones for disaster recovery.

A sandbox method is also available: when traffic spikes adversely affect the services of other tenants, the affected workloads can be isolated in a sandbox cluster.

5. Cloud-Native Connection for CEN 2.0

[Figure 5]

VPC attachments use the ENI-bonding technology to redirect traffic through cloud-native connections. Before adopting the ENI-bonding technology, we used only ENIs. This approach has the following disadvantages:

  1. ECS clusters cannot be horizontally scaled. In addition, they lack disaster recovery capabilities or take a long time to complete failovers.
  2. Due to the limited device virtualization capability, only 16 to 32 ENIs can be created for each ECS instance.
  3. The operating system does not treat adding a device as a time-critical task, so adding or deleting an ENI device involves many steps: periodic scanning of the PCI bus, the operating system response, identifying the device type from the device ID, finding and loading the corresponding driver, and initializing the device and allocating memory, all before service processing can be handed over to the NE. This usually takes minutes, which cannot meet the fast, elastic scaling requirements of NFV NEs.

The underlying layer of the CyberStar platform relies on Alibaba Cloud's ENI-bonding technology to resolve the preceding issues. The technology allows you to bind an ENI to multiple ECS instances and add the ENI as a subinterface to the virtual network interface controller (NIC) on each ECS instance. ENI-bonding enables a single ECS instance to support more than 1,000 ENIs and shortens the duration of associating or disassociating an ENI from seconds to subseconds. In the event of a failure, health checks run in real time and the forwarding plane converges in real time, so traffic can be switched within or between clusters in seconds. ENI-bonding also provides shuffle sharding, which significantly reduces the blast radius of an outage.
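Shuffle sharding, mentioned above, is a general technique: each tenant (here, each attachment) is mapped to its own small, pseudo-random subset of the instance pool, so that two tenants rarely share exactly the same instances. The sketch below shows the general idea only; the hashing scheme, the shard size, and all names are assumptions, and the article does not describe how ENI-bonding actually assigns shards.

```python
import hashlib
import random


def shuffle_shard(attachment_id, instances, shard_size):
    """Deterministically choose a pseudo-random subset of instances for one attachment."""
    digest = hashlib.sha256(attachment_id.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    # Seeding from the attachment ID makes the choice repeatable for that attachment.
    rng = random.Random(seed)
    return sorted(rng.sample(instances, shard_size))


# Hypothetical pool of eight forwarding instances, two per attachment.
instance_pool = [f"ecs-{n}" for n in range(8)]

for attachment in ("attach-vpc-a", "attach-vpc-b", "attach-vpc-c"):
    print(attachment, shuffle_shard(attachment, instance_pool, shard_size=2))
```

With eight instances and a shard size of two there are 28 possible pairs, so a fault triggered by one attachment's traffic tends to hit instances that most other attachments only partly share, or do not share at all.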

Numerous traffic scheduling solutions are provided for traditional networking. Network engineers must configure routes, policy-based routes, and MAC or ARP proxies to deploy network services (such as firewalls and WAN acceleration). The engineers must centrally provision these mandatory resources at the data egress of the network.

Traffic in cloud networks is controlled by SDN. Therefore, only a few solutions are available to address the traffic scheduling issue. Most of these solutions are incompatible with the traditional networking architectures unless you modify or redesign the architectures.

6. CEN 2.0 Service Chain

[Figure 6]

The preceding figure shows how workloads are deployed using CEN 2.0. The user's network consists of the following components:

The first component is Internet access, which hosts Internet-facing services such as NAT, SLB, and EIP. The figure shows two zones (AZs), which indicates multi-zone disaster recovery. The component in the lower-right corner of the figure consists of applications deployed on the cloud. Applications or tenants that belong to different organizations are isolated using VPCs. The component in the lower-left corner of the figure consists of services deployed at the data ingress of the network, such as VPN Gateway, Express Connect, and Smart Access Gateway (SAG). User traffic must pass through security services before it can reach applications. These security services filter east-west traffic between private networks and north-south traffic between private networks and the Internet.

In CEN 2.0, transit routers support associated forwarding correlations and multiple route tables. CEN 2.0 combines these features with subnet routing provided by VPCs to allow you to schedule traffic to your desired network services.

The two route tables used by the transit router in the preceding figure are the key components. The green route table for trusted traffic is only used to route scrubbed traffic to different NEs. The route table for untrusted traffic is used to route all user traffic to the firewall before routing the traffic to the NEs. After the firewall scrubs the user traffic, it routes the traffic back to the transit router. Then, the transit router routes the traffic to the NEs.
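The two-route-table design above can be modeled in a few lines. In this sketch, each attachment is associated with one route table (the associated forwarding correlation), the untrusted table sends everything to the firewall, and the trusted table holds the real destinations. The attachment names, CIDR blocks, and the assumption that the firewall permits the traffic are all hypothetical.

```python
import ipaddress

FIREWALL = "firewall-vpc"

# Untrusted traffic always goes to the firewall; trusted traffic goes to its destination.
ROUTE_TABLES = {
    "untrusted": {"0.0.0.0/0": FIREWALL},
    "trusted": {
        "10.1.0.0/16": "app-vpc-a",
        "10.2.0.0/16": "app-vpc-b",
        "0.0.0.0/0": "internet-vpc",
    },
}

# Which route table the transit router uses for traffic arriving from each attachment.
ASSOCIATIONS = {
    "app-vpc-a": "untrusted",
    "app-vpc-b": "untrusted",
    "internet-vpc": "untrusted",
    FIREWALL: "trusted",
}


def lookup(route_table, dst_ip):
    """Longest-prefix match: return the next-hop attachment, or None if no route matches."""
    address = ipaddress.ip_address(dst_ip)
    best = None
    for prefix, next_hop in route_table.items():
        network = ipaddress.ip_network(prefix)
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None


def trace(src_attachment, dst_ip):
    """Return the attachments a packet visits, assuming the firewall permits it."""
    path = [src_attachment]
    current = src_attachment
    while True:
        next_hop = lookup(ROUTE_TABLES[ASSOCIATIONS[current]], dst_ip)
        if next_hop is None or next_hop in path:
            return path  # no route, or a forwarding loop
        path.append(next_hop)
        if next_hop != FIREWALL:
            return path  # reached a destination attachment
        current = next_hop  # scrubbed traffic re-enters via the trusted table


print(trace("app-vpc-a", "10.2.0.5"))  # ['app-vpc-a', 'firewall-vpc', 'app-vpc-b']
print(trace("app-vpc-a", "8.8.8.8"))   # ['app-vpc-a', 'firewall-vpc', 'internet-vpc']
```

Both east-west traffic (between application VPCs) and north-south traffic (toward the Internet) take a detour through the firewall, and only the firewall's attachment is allowed to use the trusted table.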

Our solution is open to third-party NEs and other service providers and offers two forwarding modes: transparent and proxy. We are the first domestic cloud vendor to provide such a solution.

7. Summary and Outlook

[Figure 7]

CEN 2.0 is developed based on the architecture of Luoshen 3.0.

Luoshen 3.0 will continue to help enterprises manage and analyze large-scale, high-performance, and complex networks and make informed decisions. Luoshen 3.0 is an application-oriented, ecosystem-oriented networking platform. It will continue to use cloud-native technologies to help enterprises and institutions expand their networks from the cloud to the edge and connect the digital world.

About the Author

Wen Shuguang (Fengzhe) is a Senior Technical Expert at Alibaba Cloud Networking, currently responsible for the design and development of CEN transit router products. He has long worked on virtual, software-defined, and high-performance networks. His interests and research span operating systems, distributed systems, and applications in the cloud era.
