This topic describes the high-availability architecture of Server Load Balancer (SLB) from two angles, system design and product configuration, so that you can choose a setup that meets your business needs. You can also use SLB together with Alibaba Cloud DNS to achieve cross-region disaster recovery. SLB provides a service availability of 99.99% for multi-zone instances and 99.90% for single-zone instances.
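To make the two availability figures concrete, the following minimal sketch converts an availability percentage into a downtime budget. The measurement period (one 365-day year) and the function name `max_downtime_minutes` are assumptions made for illustration; this topic does not state the period over which availability is measured.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def max_downtime_minutes(availability_percent, period_minutes=MINUTES_PER_YEAR):
    """Return the downtime budget implied by an availability percentage."""
    return period_minutes * (1 - availability_percent / 100)


multi_zone = max_downtime_minutes(99.99)   # multi-zone figure from this topic
single_zone = max_downtime_minutes(99.90)  # single-zone figure from this topic

print(f"multi-zone:  {multi_zone:.1f} minutes per year")
print(f"single-zone: {single_zone:.1f} minutes per year")
```

Under these assumptions, 99.99% corresponds to roughly 52.6 minutes of downtime per year and 99.90% to roughly 525.6 minutes, a tenfold difference.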
High availability of the SLB system
SLB is deployed in clusters and synchronizes sessions among node servers, which protects the SLB system from single points of failure (SPOFs). This improves redundancy and guarantees service stability. Layer-4 SLB uses the open source software Linux Virtual Server (LVS) and Keepalived to achieve load balancing. Layer-7 SLB uses Tengine to achieve load balancing. Tengine is a web server project based on Nginx that adds advanced features designed for high-traffic websites.
Requests from the Internet reach the LVS cluster through Equal-Cost Multi-Path (ECMP) routing. Each LVS machine uses multicast packets to synchronize its sessions to the other machines in the LVS cluster. At the same time, the LVS cluster performs health checks on the Tengine cluster and removes abnormal machines to guarantee the availability of layer-7 SLB.
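The following sketch illustrates why session synchronization matters under ECMP. It assumes that a flow is mapped to a node by hashing its 5-tuple, which is a common ECMP approach. The hash function, the node names, and the helper `pick_node` are hypothetical and do not describe the actual SLB implementation.

```python
import hashlib
from typing import Sequence, Tuple


def pick_node(flow: Tuple, nodes: Sequence[str]) -> str:
    """Map a flow 5-tuple to one node so that all packets of a flow stay together."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


# Hypothetical cluster and flow (source IP, source port, dest IP, dest port, protocol).
nodes = ["lvs-1", "lvs-2", "lvs-3", "lvs-4"]
flow = ("203.0.113.7", 51234, "198.51.100.10", 443, "tcp")

before = pick_node(flow, nodes)

# Simulate a failure of the node that currently serves the flow.
survivors = [n for n in nodes if n != before]
after = pick_node(flow, survivors)

print(f"flow served by {before}, then by {after} after the failure")
```

After the failure, the same flow lands on a different machine. Without synchronized sessions, that machine would have no record of the established connection, which is the gap that session synchronization closes.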
Session synchronization protects persistent connections from being affected by server failures in the cluster. However, server failures may still affect user requests that use short connections, or connections that have not yet triggered the session synchronization rule (for example, because the three-way handshake is not complete). To reduce the impact of such failures on user access, you can add retry logic to your service.
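A retry mechanism of the kind described above can be sketched as follows. This is a minimal example under stated assumptions, not a prescribed implementation: the function name `call_with_retry`, the attempt count, the delays, and the set of retried exceptions are all illustrative choices.

```python
import random
import time


def call_with_retry(operation, attempts=3, base_delay=0.2,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(), retrying connection-level failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            delay = base_delay * 2 ** (attempt - 1)
            # Add jitter so that many clients do not retry at the same instant.
            sleep(delay + random.uniform(0, delay / 2))


# Usage with a hypothetical operation that fails twice before succeeding.
state = {"calls": 0}


def flaky_request():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("connection reset")
    return "ok"


result = call_with_retry(flaky_request, sleep=lambda _: None)
```

Retry only requests that are safe to repeat, such as idempotent reads. Retrying a request that was not idempotent may apply the same change twice.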
High availability of a single SLB instance
To provide more reliable services, SLB is deployed across multiple zones in most regions. If the primary zone becomes unavailable, SLB switches to the secondary zone and restores its service capabilities within 30 seconds. When the primary zone becomes available again, SLB automatically switches back to it.
- We recommend that you create an SLB instance in a region with multiple zones for disaster tolerance.
- We recommend that you deploy ECS instances in both the primary and secondary zones for disaster recovery. You can set the zone to which most ECS instances belong as the primary zone to minimize access latency.
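The second recommendation can be expressed as a small selection rule: pick the zone that hosts the most ECS instances as the primary zone. The sketch below assumes that you already have a list of the zone of each ECS instance. The function name `choose_zones` and the zone names are hypothetical.

```python
from collections import Counter


def choose_zones(instance_zones):
    """Return (primary, secondary): the two zones with the most ECS instances."""
    counts = Counter(instance_zones)
    if len(counts) < 2:
        raise ValueError("deploy ECS instances in at least two zones")
    (primary, _), (secondary, _) = counts.most_common(2)
    return primary, secondary


# Hypothetical inventory: one entry per ECS instance, giving its zone.
zones = ["zone-a", "zone-b", "zone-a", "zone-a", "zone-b"]
primary, secondary = choose_zones(zones)
print(f"primary: {primary}, secondary: {secondary}")
```

The rule raises an error when all instances sit in one zone, because that layout leaves no backend servers available after a zone failover.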
High availability of multiple SLB instances
If a single SLB instance cannot meet your availability requirements, you can configure multiple SLB instances. You can then use Alibaba Cloud DNS to schedule requests among the instances, or achieve cross-region disaster recovery through global SLB.
For example, you can deploy SLB instances and ECS instances in multiple zones of one region, or in multiple regions, and schedule access requests by using Alibaba Cloud DNS.
High availability of backend ECS instances
SLB checks the service availability of backend ECS instances by performing health checks. Health checks improve the overall availability of frontend services and reduce the service impact when backend servers fail.
When SLB detects that an ECS instance is unhealthy, it distributes requests to the other healthy ECS instances. SLB resumes distributing requests to the instance only after the instance has recovered to a healthy state. For more information, see Health check overview.
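The behavior described above can be modeled as a small state machine. The sketch below is an illustration under assumptions, not the SLB implementation: it assumes that a backend changes state only after several consecutive check results contradict its current state, and the threshold values, the `Backend` class, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    healthy: bool = True
    streak: int = 0  # consecutive results that contradict the current state


def record_check(backend, passed, unhealthy_threshold=3, healthy_threshold=3):
    """Update a backend's state from one health check result."""
    if passed == backend.healthy:
        backend.streak = 0  # result agrees with the current state
        return
    backend.streak += 1
    limit = healthy_threshold if passed else unhealthy_threshold
    if backend.streak >= limit:
        backend.healthy = passed
        backend.streak = 0


def eligible(backends):
    """Return the names of the backends that may receive requests."""
    return [b.name for b in backends if b.healthy]


# Hypothetical pool: ecs-2 fails three checks in a row and is taken out of rotation.
pool = [Backend("ecs-1"), Backend("ecs-2")]
for _ in range(3):
    record_check(pool[1], passed=False)
after_failure = eligible(pool)

# ecs-2 returns to rotation only after three consecutive successful checks.
for _ in range(3):
    record_check(pool[1], passed=True)
after_recovery = eligible(pool)
```

Requiring several consecutive results before changing state keeps a single lost probe from removing a working server, and keeps a server that is still unstable from returning too early.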
For this mechanism to work, you must enable the health check feature and configure it correctly. For more information, see Configure health checks.