By Jing Cai
In most cases, the business architecture of an enterprise can be divided into the following layers from the top down: access layer, application layer, and data layer.
• Access layer: serves as an entry point for ingress traffic. This layer routes ingress traffic to the backend application layer based on forwarding rules.
• Application layer: hosts applications. This layer processes ingress traffic and sends the results back to the upper layer.
• Data layer: stores data. This layer provides data and storage services for the application layer.
When you build a disaster recovery system for your business, you must enforce recovery measures on each layer.
• Access layer: The access layer must support cross-AZ high availability (HA). Active zone-disaster recovery and cross-region disaster recovery can also be implemented by controlling how the access layer routes traffic to the application layer.
• Application layer: The application layer must be deployed in multiple clusters across availability zones or in multiple regions.
• Data layer: Disaster recovery and data synchronization must be implemented on the data layer.
This article describes how to use the multi-cluster gateway of Distributed Cloud Container Platform for Kubernetes (ACK One) to implement zone-disaster recovery of public cloud applications, active zone-disaster recovery of hybrid cloud applications, and cross-region disaster recovery.
ACK One is an enterprise-grade distributed cloud container platform launched by Alibaba Cloud. It is designed for scenarios such as hybrid cloud, multi-cluster, distributed computing, and disaster recovery, and provides centralized management capabilities for multiple clusters. ACK One registered clusters connect Kubernetes clusters that run on other public clouds or on premises to the Container Service for Kubernetes (ACK) console. ACK One Fleet instances also provide unified application distribution, traffic management, observability, operational management, and security management for the registered clusters and for the on-cloud ACK and ACK Edge clusters.
The multi-cluster gateway of ACK One is a service provided by Alibaba Cloud for application disaster recovery and north-south traffic management in hybrid cloud or multi-cluster environments. The service helps you quickly implement zone-disaster recovery or cross-region disaster recovery for hybrid cloud and multi-cluster applications, and facilitates multi-cluster traffic management.
The multi-cluster gateway of ACK One works by hosting multi-cluster Ingress controllers on the Fleet instance and processing multi-cluster Ingresses in a unified manner. The main process is as follows:
• Create a Fleet instance
• Associate a cluster: Associate ACK clusters or registered clusters with the Fleet instance to implement centralized management.
• Create a multi-cluster gateway: You can use AlbConfig or MseIngressConfig to create an ALB multi-cluster or MSE multi-cluster gateway on the Fleet instance.
• Create an Ingress: Create an Ingress on the Fleet instance, bind it to a Service in the sub-clusters, and configure forwarding rules or routes for that Service.
• Use a multi-cluster gateway to access the service: You can use the domain name or IP address of the gateway to access the Service in the sub-cluster.
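The process above can be pictured with a small in-memory model. The following Python sketch is only an illustration of the idea that a rule is defined once on the Fleet instance and resolves to Service backends in every associated cluster. The class names (`Fleet`, `Cluster`, `IngressRule`), the host name, and the endpoint addresses are hypothetical and are not part of any ACK One API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    """A sub-cluster associated with the Fleet instance (hypothetical model)."""
    name: str
    # Maps a Service name to the endpoints that back it in this cluster.
    services: dict[str, list[str]] = field(default_factory=dict)


@dataclass
class IngressRule:
    """One forwarding rule, defined once on the Fleet instance."""
    host: str
    path_prefix: str
    service: str


@dataclass
class Fleet:
    clusters: list[Cluster] = field(default_factory=list)
    rules: list[IngressRule] = field(default_factory=list)

    def associate(self, cluster: Cluster) -> None:
        """Bring a cluster under centralized management."""
        self.clusters.append(cluster)

    def resolve(self, host: str, path: str) -> list[str]:
        """Return the endpoints, across all associated clusters, for a request."""
        for rule in self.rules:
            if rule.host == host and path.startswith(rule.path_prefix):
                return [
                    f"{cluster.name}/{endpoint}"
                    for cluster in self.clusters
                    for endpoint in cluster.services.get(rule.service, [])
                ]
        return []


fleet = Fleet()
fleet.associate(Cluster("cluster-1", {"web": ["10.0.1.5:8080", "10.0.1.6:8080"]}))
fleet.associate(Cluster("cluster-2", {"web": ["10.1.1.5:8080"]}))
fleet.rules.append(IngressRule(host="example.com", path_prefix="/", service="web"))

backends = fleet.resolve("example.com", "/index.html")
print(backends)
```

In this model, adding a third cluster only requires one more `associate` call. The forwarding rule itself is not touched, which mirrors the benefit of configuring rules once on the Fleet instance.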
ACK One multi-cluster gateways provide the following benefits:
• Fully managed and O&M-free gateways.
• Reduce the number of gateways and costs. ACK One multi-cluster gateways serve as region-level multi-cluster gateways for layer-7 north-south traffic management.
• Simplify traffic management in multi-cluster environments. You can configure forwarding rules for multi-cluster Ingresses on the Fleet instance instead of configuring the rules in each cluster.
• Designed for cross-zone HA.
• Provide millisecond-level fallback. If a backend error occurs in a cluster, the multi-cluster gateway smoothly redirects traffic to other backends.
Active zone-disaster recovery is a solution selected by most customers. Compared with the active-standby zone-disaster recovery solution, the active zone-disaster recovery solution has the following advantages:
• Higher resource utilization and lower costs.
• Higher service quality and stronger fault tolerance: Running more Service replicas across zones improves service quality and response speed and handles traffic peaks better. Because every zone already serves traffic, a fault does not trigger a switchover that could interrupt the service. In addition, system updates or maintenance can be performed without service interruption.
• Enhanced scalability: If a zone has insufficient resources, you can quickly scale the application in other zones that have available resources.
ACK One allows you to use ALB multi-cluster gateways and MSE multi-cluster gateways to implement cross-AZ disaster recovery. The following figure shows the architecture.
1. Create Cluster 1 and Cluster 2 in AZ 1 and AZ 2 in the same region.
2. Use ACK One GitOps to distribute the Service to Cluster 1 and Cluster 2.
3. Create multi-cluster gateways by using ACK One Fleet instances.
4. After a multi-cluster gateway is created, create an Ingress on the Fleet instance to implement zone-disaster recovery. When a cluster is abnormal, traffic is automatically rerouted to a healthy cluster. Multi-cluster gateways also provide the following capabilities:
a) Load balancing and forwarding traffic based on the total number of replicas across multiple clusters.
b) Load balancing and forwarding traffic based on specified weights.
c) Forwarding traffic based on HTTP headers, which facilitates canary releases.
d) Automatic switching of traffic in milliseconds or seconds in case of application or cluster failures.
5. Data synchronization, for example for ApsaraDB RDS, depends on the capabilities of the middleware itself.
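The gateway capabilities listed in step 4 can be sketched in a few lines of Python. This is a simplified illustration and not the gateway's actual algorithm. The `Backend` type, the rule that explicit weights take precedence over replica counts, and the `x-canary` header name are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Backend:
    cluster: str
    replicas: int                  # ready replicas of the Service in this cluster
    healthy: bool = True
    weight: Optional[int] = None   # explicit weight, if one is configured


def traffic_shares(backends: list[Backend]) -> dict[str, float]:
    """Split traffic across healthy clusters.

    Explicit weights are used when every healthy backend has one (capability b).
    Otherwise the split follows replica counts (capability a). Unhealthy
    clusters receive no traffic, which models automatic failover (capability d).
    """
    healthy = [b for b in backends if b.healthy and b.replicas > 0]
    if not healthy:
        return {}
    use_weights = all(b.weight is not None for b in healthy)
    raw = {b.cluster: (b.weight if use_weights else b.replicas) for b in healthy}
    total = sum(raw.values())
    if total == 0:
        return {}
    return {name: value / total for name, value in raw.items()}


def route_by_header(headers: dict[str, str], canary: str, stable: str) -> str:
    """Send requests with the (hypothetical) canary header to the canary cluster (capability c)."""
    return canary if headers.get("x-canary") == "true" else stable


# Replica-based split: 3 replicas versus 1 replica.
print(traffic_shares([Backend("cluster-1", 3), Backend("cluster-2", 1)]))

# Cluster 1 fails: all traffic moves to Cluster 2.
print(traffic_shares([Backend("cluster-1", 3, healthy=False), Backend("cluster-2", 1)]))

# Explicit weights override replica counts.
print(traffic_shares([Backend("cluster-1", 3, weight=20), Backend("cluster-2", 1, weight=80)]))
```

The sketch recomputes the shares from current health state on every call. That is why failover needs no configuration change: the unhealthy cluster simply drops out of the split.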
Compared with the active zone-disaster recovery solutions based on DNS traffic distribution, the active zone-disaster recovery system based on ACK One multi-cluster gateways has the following advantages:
• Failover completes within milliseconds or seconds. Because the gateway address does not change, the delays caused by DNS client caching are avoided.
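A rough calculation shows why DNS caching matters for failover time. The Python sketch below assumes a simplified model. With DNS-based switching, a client may keep using a cached record until its TTL expires after the failure is detected. With a gateway, traffic moves as soon as the failure is detected. The health check interval, failure threshold, and TTL are illustrative values and not figures from ACK One or any DNS product.

```python
def detection_seconds(check_interval: float, failure_threshold: int) -> float:
    """Time for health checks to declare a backend unhealthy."""
    return check_interval * failure_threshold


def dns_failover_seconds(check_interval: float, failure_threshold: int, record_ttl: float) -> float:
    """Worst case for DNS switching: detection time plus a full cached TTL."""
    return detection_seconds(check_interval, failure_threshold) + record_ttl


def gateway_failover_seconds(check_interval: float, failure_threshold: int) -> float:
    """Gateway switching: the entry IP stays the same, so only detection time counts."""
    return detection_seconds(check_interval, failure_threshold)


# Illustrative values: 1-second checks, 2 failed checks, 60-second TTL.
print(dns_failover_seconds(1, 2, 60))   # 62
print(gateway_failover_seconds(1, 2))   # 2
```

Real clients and resolvers may cache records for longer than the TTL, so the DNS figure in this model is an optimistic worst case.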
The following figure shows the architecture of common DNS-based zone-disaster recovery solutions.
ACK One also supports ALB multi-cluster gateways and MSE multi-cluster gateways for building a hybrid cloud or multi-cloud zone-disaster recovery system. This allows you to quickly build disaster recovery capabilities on Alibaba Cloud for on-premises services and to extend their capacity by using the elasticity of cloud resources.
The following network requirements must be met:
If the on-premises cluster uses an overlay network plug-in:
The following figure shows the architecture of the active zone-disaster recovery system, which is based on ACK One ALB multi-cluster gateways (MSE multi-cluster gateways share the same architecture):
Geo-disaster recovery protects against region-wide disasters. However, it comes with higher latency, higher fees, and higher maintenance costs. The geo-disaster recovery system based on ACK One multi-cluster gateways and the one based on DNS suit different scenarios. The following sections describe their architectures and their respective scenarios.
ACK One allows you to use ALB multi-cluster gateways to implement a geo-disaster recovery solution. This solution is suitable for the following scenarios:
• Cross-region HA is required and resources in the local region are insufficient. For example, in the current AI boom, GPU resources are extremely scarce.
• Client applications do not require low latency but require improved multi-cluster traffic management.
The following figure shows this architecture.
The following section describes the benefits of cross-region DR solutions based on ACK One multi-cluster gateways:
• Enhanced multi-cluster traffic routing: This solution provides content-based advanced traffic routing and a health check mechanism that is more flexible than that of Global Traffic Manager (GTM) to meet the requirements of complex scenarios.
• Centralized multi-cluster traffic management: This solution uses an ACK One Fleet instance as a unified control plane for Ingress configurations. This simplifies service extensions and application maintenance and reduces management costs.
• Mitigation of DNS client cache issues: In disaster recovery scenarios, service exceptions and cluster exceptions are the failures that occur most frequently. For these failures, the gateway-based solution does not need to switch IP addresses, so failover completes within milliseconds or seconds.
The architecture of this solution implements disaster recovery based on ALB multi-cluster gateways and GTM. ALB multi-cluster gateways can manage and forward traffic to multiple clusters in a centralized manner.
• If a cluster error or service exception occurs in Region 1, or an error occurs in Region 2, the ALB multi-cluster gateway automatically redirects traffic to healthy clusters without switching the IP address that DNS resolves to.
• GTM switches the IP address based on health checks only when Region 1 is down or the ALB service in Region 1 is down.
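The two bullets above describe a two-tier failover. The following Python sketch illustrates that split of responsibilities. It assumes that GTM fails over to a second gateway entry in Region 2, which the text implies but does not state. The entry names `alb-region-1` and `alb-region-2` and the cluster names are hypothetical.

```python
from __future__ import annotations


def resolve_entry(region1_up: bool, region1_alb_up: bool) -> str:
    """GTM tier: choose which gateway address DNS returns.

    The address changes only when Region 1 or its ALB service is down.
    """
    if region1_up and region1_alb_up:
        return "alb-region-1"
    return "alb-region-2"   # assumed standby entry in Region 2


def forward_targets(cluster_health: dict[str, bool]) -> list[str]:
    """Gateway tier: forward only to healthy clusters, with no DNS change."""
    return [name for name, healthy in cluster_health.items() if healthy]


# A cluster in Region 1 fails: the entry stays the same, only the targets change.
entry = resolve_entry(region1_up=True, region1_alb_up=True)
targets = forward_targets({"region1-cluster": False, "region2-cluster": True})
print(entry, targets)

# The ALB service in Region 1 fails: only now does GTM switch the address.
print(resolve_entry(region1_up=True, region1_alb_up=False))
```

Keeping the common failures (clusters and services) in the gateway tier means that the slower DNS switch is reserved for the rare case in which the whole entry point is lost.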
The advantage of the DNS-based geo-disaster recovery solution is that GTM operates at the global level, which makes it suitable for scenarios such as nearby access.
The following figure shows the architecture of the DNS-based geo-disaster recovery solution.
In summary, the multi-cluster gateway of ACK One can help you quickly build an active zone-disaster recovery system, a hybrid cloud zone-disaster recovery system, and a geo-disaster recovery system. It also lets you fail over smoothly within milliseconds or seconds, manage and scale multi-cluster services, and reduce management costs. For more information, see Overview of multi-cluster gateways and Multi-cluster disaster recovery.