Global Accelerator: Distribute traffic across endpoint groups in different scenarios

Last Updated: Apr 01, 2026

Global Accelerator (GA) lets you attach multiple endpoint groups to a single listener and control how much traffic each group receives. Set a traffic distribution ratio per endpoint group to split client requests across regions, and enable health checks so GA automatically reroutes traffic when an endpoint group becomes unavailable.

How it works

Traffic distribution ratio

Each endpoint group has a traffic distribution ratio between 0% and 100% (default: 100%).

  • 0% — no traffic is forwarded to this endpoint group.

  • 100% — all available traffic is forwarded to this endpoint group.

GA determines which endpoint group to use first based on scheduling priority, which reflects network latency between the nearest access point and the endpoint group. The closer the access point is to the endpoint group's region, the higher the priority. Traffic is always sent to the highest-priority healthy endpoint group first, and the distribution ratio is applied on top of that routing decision.

On subscription GA instances, only TCP and UDP listeners support traffic distribution ratios. On pay-as-you-go GA instances, all listener types support them.

When health checks are enabled:

  • If a higher-priority endpoint group fails its health check, GA forwards all traffic to the healthy endpoint group with the next-highest priority. The configured traffic distribution ratio is ignored during failover.

  • When the failed endpoint group passes the health check again, GA automatically routes traffic back to it. No manual intervention is required.
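
The failover and failback rules above can be sketched in a few lines of Python. This is an illustrative model only, not GA's implementation: the `EndpointGroup` class, the numeric `priority` field (a lower number stands for a group closer to the access point), and the `pick_target` function are names assumed for this example. The sketch covers only which group is tried first; it does not model traffic distribution ratios.

```python
from dataclasses import dataclass


@dataclass
class EndpointGroup:
    name: str
    priority: int          # assumed rank: lower number = higher scheduling priority
    healthy: bool = True   # result of the most recent health check


def pick_target(groups):
    """Return the healthy endpoint group with the highest priority, or None."""
    healthy = [g for g in groups if g.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda g: g.priority)


beijing = EndpointGroup("China (Beijing)", priority=1)
shanghai = EndpointGroup("China (Shanghai)", priority=2)
groups = [beijing, shanghai]

print(pick_target(groups).name)  # normal operation: China (Beijing)

beijing.healthy = False          # health check fails
print(pick_target(groups).name)  # failover: China (Shanghai)

beijing.healthy = True           # health check passes again
print(pick_target(groups).name)  # failback: China (Beijing)
```

Because the target is recomputed from the current health status each time, failback needs no separate logic in this model.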

If a custom forwarding policy exists, traffic is distributed among the endpoint groups associated with the matched forwarding policy.

Traffic distribution formula

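The ratio you set applies to the traffic that reaches each endpoint group in priority order, not to the total listener traffic all at once. The ratio of the highest-priority endpoint group applies to all incoming requests, the ratio of the next endpoint group applies to whatever remains, and so on. The examples below show how this works.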

One acceleration region, multiple endpoint groups

Scenario A: Both endpoint groups set to 100%

image
  1. Client requests are scheduled to the nearest access point in the China (Beijing) region and forwarded to the Alibaba Cloud global transmission network.

  2. The listener checks connection requests based on the configured protocol and port, then forwards requests to endpoint groups based on their priorities and traffic distribution ratios.

  3. The China (Beijing) endpoint group has higher priority and passes its health check. With a ratio of 100%, all requests go to the China (Beijing) endpoint group.

  4. Servers in China (Beijing) process the requests.

  5. If the China (Beijing) endpoint group fails its health check and the China (Shanghai) endpoint group passes, all traffic shifts to China (Shanghai).

  6. Servers in China (Shanghai) process the requests.

Scenario B: China (Beijing) at 50%, China (Shanghai) at 100%

image

The China (Beijing) endpoint group still has higher priority, so it receives traffic first. With its ratio set to 50%, half of the requests go there. The remaining 50% move on to the China (Shanghai) endpoint group, which has a ratio of 100% — so it receives all of that remaining traffic.

Net result: China (Beijing) receives 50% of requests; China (Shanghai) receives 50%.

If you lower China (Beijing)'s ratio to 30%, China (Beijing) receives 30% and China (Shanghai) receives 70%.

Scenario C: Both endpoint groups set to 50%

image
  1. Client requests are scheduled to the nearest access point in China (Beijing) and forwarded to the Alibaba Cloud global transmission network.

  2. The listener forwards requests to endpoint groups based on priorities and traffic distribution ratios.

  3. China (Beijing) has higher priority and a ratio of 50%, so 50% of requests go there in the first round.

  4. Servers in China (Beijing) handle 50% of requests.

  5. The remaining 50% move to China (Shanghai), which also has a ratio of 50%. China (Shanghai) receives 50% × 50% = 25% of total requests. That leaves 25% unallocated.

  6. The unallocated 25% is distributed evenly across both endpoint groups: 12.5% each.

  7. China (Beijing) receives 50% (first round) + 12.5% (second round) = 62.5% total.

  8. China (Shanghai) receives 25% (first round) + 12.5% (second round) = 37.5% total.
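
The arithmetic in Scenarios A, B, and C can be reproduced with a short function. This is a sketch under stated assumptions, not GA's actual scheduler: the function name `distribute` is hypothetical, ratios are expressed as fractions between 0.0 and 1.0, and the even split of unallocated traffic follows Scenario C. The source does not say how leftover traffic is handled when a group is set to 0%, so the sketch assumes such groups are excluded from the second round.

```python
def distribute(ratios):
    """Split one unit of traffic across endpoint groups.

    `ratios` holds each group's traffic distribution ratio (0.0 to 1.0),
    ordered from highest to lowest scheduling priority.
    Returns the share of total traffic that each group receives.
    """
    shares = []
    remaining = 1.0
    # First round: each group takes its ratio of the traffic that reaches it.
    for ratio in ratios:
        taken = remaining * ratio
        shares.append(taken)
        remaining -= taken
    # Second round: split unallocated traffic evenly (as in Scenario C).
    # Assumption: groups set to 0% never receive traffic, so they are skipped.
    eligible = [i for i, ratio in enumerate(ratios) if ratio > 0]
    if remaining > 0 and eligible:
        extra = remaining / len(eligible)
        for i in eligible:
            shares[i] += extra
    return shares


# Order: [China (Beijing), China (Shanghai)]
print(distribute([1.0, 1.0]))  # Scenario A -> [1.0, 0.0]
print(distribute([0.5, 1.0]))  # Scenario B -> [0.5, 0.5]
print(distribute([0.5, 0.5]))  # Scenario C -> [0.625, 0.375]
```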

Multiple acceleration regions, multiple endpoint groups

Scenario D: Both endpoint groups set to 100%

image
  1. Requests from China (Beijing) clients go to the nearest China (Beijing) access point. Requests from China (Shanghai) clients go to the nearest China (Shanghai) access point. Both forward to the Alibaba Cloud global transmission network.

  2. The listener forwards requests to endpoint groups based on priorities and traffic distribution ratios.

  3. GA routes each group of clients to their nearest endpoint group. China (Beijing) clients go to the China (Beijing) endpoint group (higher priority, ratio 100%). China (Shanghai) clients go to the China (Shanghai) endpoint group (higher priority, ratio 100%).

  4. Servers in each region handle their respective client requests.

Scenario E: China (Beijing) at 50%, China (Shanghai) at 100%

image

For China (Beijing) clients: 50% go to the China (Beijing) endpoint group; the remaining 50% go to China (Shanghai) (which accepts all of them at a ratio of 100%).

For China (Shanghai) clients: all requests go to the China (Shanghai) endpoint group.

Net result: China (Beijing) endpoint group receives 50% of China (Beijing) client requests. China (Shanghai) endpoint group receives 100% of China (Shanghai) client requests plus 50% of China (Beijing) client requests.

Scenario F: Both endpoint groups set to 50%

image
  1. Requests from China (Beijing) clients go to the China (Beijing) access point. Requests from China (Shanghai) clients go to the China (Shanghai) access point. Both forward to the Alibaba Cloud global transmission network.

  2. The listener forwards requests to endpoint groups based on priorities and traffic distribution ratios.

  3. For China (Beijing) clients: 50% go to the China (Beijing) endpoint group; 50% × 50% = 25% go to the China (Shanghai) endpoint group; the remaining 25% is unallocated. For China (Shanghai) clients: 50% go to the China (Shanghai) endpoint group; 50% × 50% = 25% go to the China (Beijing) endpoint group; the remaining 25% is unallocated.

  4. GA evenly distributes the remaining requests. Each endpoint group receives 12.5% of China (Beijing) client requests and 12.5% of China (Shanghai) client requests from the second-round distribution.

  5. Servers in each region handle the requests they receive.
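
With multiple acceleration regions, the same per-priority calculation runs once for each client region, and the results add up per endpoint group. The sketch below illustrates this for Scenarios E and F. The function names, the request counts of 1,000 per client region, and the dictionary layout are assumptions made for the example; the second-round even split again follows the behavior described in the scenarios rather than any published algorithm.

```python
def distribute(ratios):
    """Share of traffic per group; `ratios` is ordered by priority (0.0 to 1.0)."""
    shares, remaining = [], 1.0
    for ratio in ratios:
        taken = remaining * ratio
        shares.append(taken)
        remaining -= taken
    # Assumed second round: split leftover evenly among groups with ratio > 0.
    eligible = [i for i, ratio in enumerate(ratios) if ratio > 0]
    if remaining > 0 and eligible:
        for i in eligible:
            shares[i] += remaining / len(eligible)
    return shares


def per_group_load(client_load, priority_order, ratios):
    """Sum the requests each endpoint group receives from all client regions.

    client_load:    {client_region: request_count}
    priority_order: {client_region: [endpoint groups, nearest first]}
    ratios:         {endpoint_group: traffic distribution ratio}
    """
    totals = {group: 0.0 for group in ratios}
    for region, load in client_load.items():
        order = priority_order[region]
        shares = distribute([ratios[group] for group in order])
        for group, share in zip(order, shares):
            totals[group] += load * share
    return totals


BJ, SH = "China (Beijing)", "China (Shanghai)"
clients = {BJ: 1000, SH: 1000}            # hypothetical request counts
order = {BJ: [BJ, SH], SH: [SH, BJ]}      # each region prefers its local group

print(per_group_load(clients, order, {BJ: 0.5, SH: 1.0}))  # Scenario E
# -> {'China (Beijing)': 500.0, 'China (Shanghai)': 1500.0}
print(per_group_load(clients, order, {BJ: 0.5, SH: 0.5}))  # Scenario F
# -> {'China (Beijing)': 1000.0, 'China (Shanghai)': 1000.0}
```

In Scenario F, each endpoint group ends up with 62.5% of its local clients' requests and 37.5% of the other region's requests, which is why the totals come out equal when both regions send the same volume.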

Use cases

The following list shows each use case and when to use it:

  • Deploy an application in multiple regions: servers in one region cannot keep up with demand, or users in a specific region have a poor network experience.

  • Balance load across regions: a single-region deployment is overwhelmed, and you want to spread requests across multiple regions.

  • Cross-region disaster recovery: high availability is required, and health checks trigger automatic failover to a healthy region.

  • Drain or update a service by region: you need to unpublish or update a service in one region with minimal impact on clients.

Deploy an application in multiple regions

When your current servers cannot handle growing demand, or users in a specific region experience high latency or network jitter, deploy the application in an additional region and add a corresponding endpoint group.

Add an endpoint group to increase capacity

image

In this setup, an application is deployed in China (Beijing). As the client base grows, the China (Beijing) servers become overloaded. To offload traffic:

  1. Deploy servers in China (Shanghai).

  2. Add an endpoint group in China (Shanghai) for the listener. For more information, see Create a default endpoint group. Start with a low traffic distribution ratio (for example, 1%) to validate routing behavior before committing to full traffic.

  3. Verify that the new endpoint group is serving China (Shanghai) clients. At 1%, the China (Shanghai) endpoint group handles 1% of China (Shanghai) client requests, and the China (Beijing) endpoint group handles the remaining 99%.

  4. After validation, set the China (Shanghai) endpoint group ratio to 100%. For more information, see Set the traffic distribution ratio for an endpoint group. China (Shanghai) clients now route entirely to China (Shanghai) servers. China (Beijing) servers no longer handle China (Shanghai) traffic.
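
To see how the split for China (Shanghai) clients changes as the new endpoint group ramps up, the following sketch prints the expected shares at several ratios. It assumes the China (Beijing) endpoint group stays at 100% and therefore absorbs everything the China (Shanghai) group does not take. The intermediate steps of 10% and 50% are hypothetical; the procedure above names only 1% and 100%.

```python
def split_for_local_clients(new_group_ratio):
    """Return (Shanghai share, Beijing share) of China (Shanghai) client requests.

    new_group_ratio is the ratio of the China (Shanghai) endpoint group as a
    fraction (0.0 to 1.0). Assumes the China (Beijing) group is set to 100%.
    """
    local = new_group_ratio
    fallback = 1.0 - new_group_ratio
    return local, fallback


for percent in (1, 10, 50, 100):  # hypothetical ramp schedule
    local, fallback = split_for_local_clients(percent / 100)
    print(f"ratio {percent:>3}%: Shanghai {local:.0%}, Beijing {fallback:.0%}")
```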

Add an acceleration region to reduce latency

image

If China (Shanghai) clients connect through the China (Beijing) access point because China (Beijing) is the only configured acceleration region, they can experience high latency and packet loss. To fix this, add China (Shanghai) as an acceleration region and create an endpoint group there. China (Shanghai) clients then connect to the nearest access point and are routed directly to the China (Shanghai) endpoint group. For more information, see Add and manage acceleration areas and Create a default endpoint group.

Balance load across regions

image

When a single endpoint group handles all traffic for an acceleration region, server overload can cause latency and packet loss. Distribute the load by adjusting traffic distribution ratios across multiple endpoint groups.

For example, if China (Beijing) clients are all routed to the China (Beijing) endpoint group (ratio: 100%) and that group is overloaded, change the ratio to 50%. GA then sends 50% of China (Beijing) requests to China (Beijing) servers and the remaining 50% to China (Shanghai) servers. For more information, see Set the traffic distribution ratio for an endpoint group.

Cross-region disaster recovery

image

Enable health checks on each endpoint group to achieve automatic cross-region failover without manual intervention.

With health checks enabled on both the China (Beijing) and China (Shanghai) endpoint groups:

  • If the China (Shanghai) endpoint group fails its health check, the listener automatically redirects all China (Shanghai) client requests to the healthy China (Beijing) endpoint group.

  • Once the China (Shanghai) endpoint group passes the health check again, the listener automatically routes China (Shanghai) client traffic back to the China (Shanghai) endpoint group.

Failback behavior: When a failed endpoint group recovers and passes its health check, GA automatically routes traffic back to it. No manual reconfiguration is needed.

For health check configuration, see Enable and manage health checks.

Drain or update a service by region

image

Use the traffic distribution ratio to gradually drain traffic from a region before you unpublish or update the service there. Lowering the ratio to a small percentage first gives you time to observe behavior and reduces the risk of disrupting active clients.

To unpublish a service in a region:

  1. Reduce the traffic distribution ratio for the target endpoint group to a low value (for example, 1%) to redirect most traffic to another region.

  2. Monitor request volume to the target region. When it drops to an acceptable level, set the ratio to 0%. No further traffic is forwarded to that endpoint group.

To update a service in a region:

  1. Follow the same steps as for unpublishing: reduce the ratio to a low value (for example, 1%), then set it to 0%.

  2. Update the service.

  3. Set the traffic distribution ratio back to 100% to restore full traffic to the updated region.
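
The drain and update procedures above could be automated along the following lines. This is a minimal sketch, not a GA client: `set_ratio` and `get_request_rate` are hypothetical callbacks that you would implement with your own tooling (for example, a console action or monitoring query), and `threshold`, `poll_seconds`, and `max_polls` are illustrative parameters.

```python
import time


def drain(set_ratio, get_request_rate, threshold,
          low_ratio=1, poll_seconds=0.0, max_polls=100):
    """Lower an endpoint group's ratio, wait for traffic to subside, then set 0%.

    set_ratio(percent) and get_request_rate() are hypothetical callbacks,
    not GA API calls. Returns True once the group is fully drained, or
    False if traffic never dropped to `threshold` within `max_polls` checks.
    """
    set_ratio(low_ratio)                      # step 1: redirect most traffic
    for _ in range(max_polls):
        if get_request_rate() <= threshold:   # step 2: volume is acceptable
            set_ratio(0)
            return True
        time.sleep(poll_seconds)
    return False


def update_service(set_ratio, get_request_rate, threshold, apply_update):
    """Drain the region, run the update, then restore the ratio to 100%."""
    if not drain(set_ratio, get_request_rate, threshold):
        return False
    apply_update()
    set_ratio(100)
    return True


# Usage with stand-in callbacks that record what would have happened.
history = []
rates = iter([120, 40, 3])  # made-up request rates seen on successive polls
ok = update_service(history.append, lambda: next(rates), threshold=5,
                    apply_update=lambda: history.append("update"))
print(ok, history)  # -> True [1, 0, 'update', 100]
```

Returning False instead of forcing the ratio to 0% leaves the decision to the operator when traffic does not subside, which matches the cautious intent of the manual procedure.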

What's next