The auto scaling feature automatically adjusts instance resources based on traffic fluctuations. This ensures service performance during peak hours and saves resource costs during off-peak hours. You can configure a scheduled scaling policy or a metric-based scaling policy for an instance. This topic describes how to configure an auto scaling policy for a Cloud-native API Gateway instance.
Scaling policy selection
Scheduled policy
If your business has predictable traffic peaks, such as planned promotions, you can create a scheduled scaling policy.
Metric-based policy
If your business has unpredictable traffic fluctuations, you can configure a scaling policy based on metrics such as CPU or memory usage. For more information, see Capacities.
Usage notes
A scheduled scaling policy cannot be changed during an ongoing scaling activity.
Scaled nodes are charged on a pay-as-you-go basis by quantity and duration.
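Because scaled nodes are charged by quantity and duration, the extra usage of one scheduled scale-out window can be estimated as the number of added nodes multiplied by the window length. The following Python snippet is a minimal sketch of that arithmetic. The function name and parameters are hypothetical, and the sketch assumes usage is measured in node-hours; the actual billing granularity and unit price are not stated in this topic.

```python
from datetime import datetime


def extra_node_hours(base_nodes: int, target_nodes: int,
                     start: datetime, end: datetime) -> float:
    """Estimate the node-hours added by one scale-out window.

    Assumption: usage is the number of extra nodes multiplied by the
    window duration in hours. Real billing rules may differ.
    """
    if end <= start:
        raise ValueError("end must be later than start")
    # Only nodes above the original count are treated as scaled nodes.
    extra_nodes = max(target_nodes - base_nodes, 0)
    hours = (end - start).total_seconds() / 3600
    return extra_nodes * hours


# Example: scale from 2 to 5 nodes between 20:00 and 22:30 on the same day.
usage = extra_node_hours(2, 5,
                         datetime(2024, 1, 1, 20, 0),
                         datetime(2024, 1, 1, 22, 30))
print(usage)
```

Multiplying the result by the per-node pay-as-you-go price for your instance specification gives a rough cost estimate for the window.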
Configure auto scaling
Log on to the API Gateway console.
In the left-side navigation pane, click . In the top navigation bar, select a region.
Go to the Auto Scaling panel using one of the following two methods:
Method 1:
On the Instance page, find the instance that you want to manage and click More > Auto Scaling in the Actions column.
Method 2:
On the Instance page, click the instance that you want to manage. On the instance details page, click Overview in the left-side navigation tree, click the Basic Information tab, and click Enable next to Auto Scaling in the Running Information section.
In the Auto Scaling panel, configure the following parameters on the Scaling Configurations tab:
Note: When you update a scaling policy, if the start time of the new time period is later than the current time, the policy takes effect from the next cycle.
The following table describes the parameters:
Parameter
Description
Enabling Auto Scaling
Specify whether to turn on auto scaling.
Note: Manual scaling and specification changes are not allowed when auto scaling is enabled.
Scaling Method
Specify the scaling policy type. Valid values: Scaling by Time and Expansion and contraction by water level.
Scaling by Time: If a major event, such as a promotion, is planned, you can select this option to configure a scheduled scaling policy and use it together with throttling.
Expansion and contraction by water level: Select this option to configure a metric-based scaling policy. Scaling takes minutes and cannot be completed instantly.
Time Period Configuration
Configure the time periods and the target total number of nodes. To update the policy, change the Time Period Configuration on the Scaling Configurations tab.
Note: The number of nodes is increased to the target number at the beginning of the time period and reduced to the original number at the end of the time period.
Time Period (UTC+8): Specify a time period in a day based on the 24-hour system. You can specify up to three time periods for a day. Time periods cannot overlap with each other. A time period can span days.
Target Total Nodes: The number of nodes after the gateway scales out. At the end of each time period, the gateway scales in to its original number of nodes.
Expansion and contraction by water level
Maximum nodes
Configure the maximum number of nodes that you want the instance to contain.
Safe Threshold
The threshold below which the gateway can still ensure high throughput and low latency when bursty traffic increases to twice the current traffic.
Warning Threshold
When the warning threshold is reached, the latency of the gateway may increase. For more information about the capacity thresholds and queries per second (QPS) of different specifications, see Capacities.
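The Time Period Configuration rules in the preceding table (up to three time periods per day, no overlaps, and periods that can span days) can be checked before you enter values in the console. The following Python snippet is a minimal sketch of such a check, together with a helper that returns the expected node count at a given time. All names are hypothetical. The sketch assumes that times are "HH:MM" strings in UTC+8, that a period whose end time is earlier than its start time spans midnight, and that the end time is exclusive. It does not call any API Gateway interface.

```python
from typing import List, Tuple

MINUTES_PER_DAY = 24 * 60
MAX_PERIODS = 3  # the console allows up to three time periods per day

# A period is (start "HH:MM", end "HH:MM", target total nodes).
Period = Tuple[str, str, int]


def to_minutes(hhmm: str) -> int:
    """Convert an "HH:MM" string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    h, m = int(hours), int(minutes)
    if not (0 <= h < 24 and 0 <= m < 60):
        raise ValueError(f"invalid time: {hhmm}")
    return h * 60 + m


def segments(start: str, end: str) -> List[Tuple[int, int]]:
    """Split a period into half-open segments within a single day."""
    s, e = to_minutes(start), to_minutes(end)
    if s == e:
        raise ValueError("start and end must differ")
    if s < e:
        return [(s, e)]
    # The period spans midnight: split it at the day boundary.
    parts = [(s, MINUTES_PER_DAY), (0, e)]
    return [(a, b) for a, b in parts if a < b]


def validate_periods(periods: List[Period]) -> None:
    """Raise ValueError if the periods break the documented rules."""
    if len(periods) > MAX_PERIODS:
        raise ValueError("at most three time periods are allowed")
    seen: List[Tuple[int, int]] = []
    for start, end, _target in periods:
        current = segments(start, end)
        for a_start, a_end in current:
            for b_start, b_end in seen:
                if a_start < b_end and b_start < a_end:
                    raise ValueError("time periods cannot overlap")
        seen.extend(current)


def nodes_at(periods: List[Period], original_nodes: int, hhmm: str) -> int:
    """Return the expected node count at the given time of day."""
    now = to_minutes(hhmm)
    for start, end, target in periods:
        if any(a <= now < b for a, b in segments(start, end)):
            return target
    # Outside every period the gateway returns to its original size.
    return original_nodes


periods = [("09:00", "11:00", 6), ("22:00", "02:00", 8)]
validate_periods(periods)
print(nodes_at(periods, 3, "01:00"))
```

Splitting a day-spanning period into two same-day segments keeps the overlap check to a simple interval comparison.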
Scaling Events
You can click the Scaling Events tab of the Auto Scaling panel to go to the Event page. On this page, you can manage and analyze scaling events and policy execution in a unified manner. For more information, see Manage events.
