Auto Scaling: How Auto Scaling works

Last Updated: Jan 14, 2025

This topic describes the Auto Scaling workflow and how to configure scaling modes. It also includes workflow diagrams for Auto Scaling.

Auto Scaling manages scaling groups containing Elastic Compute Service (ECS) instances and elastic container instances in the same way. This topic uses ECS instances to illustrate the Auto Scaling workflow. For more information about ECS instances and elastic container instances, see What is ECS? and What is Elastic Container Instance?

How Auto Scaling works

The following figure shows how Auto Scaling adds ECS instances.

[Figure: How Auto Scaling adds ECS instances]

This example uses a web application with a three-tier architecture. The Server Load Balancer (SLB) instance at the top layer forwards client requests to the ECS instances in the scaling group at the middle layer, shown in the dotted-line box on the right side of the preceding figure. The ECS instances process the requests, and the ApsaraDB RDS instances at the bottom layer store the service data.

You can use Auto Scaling to adjust the number of ECS instances at the middle layer based on your business requirements. The following procedure describes how Auto Scaling adjusts the number of ECS instances:

  1. Auto Scaling executes scaling activities when the conditions specified in the scaling modes are met. The following table describes the supported scaling modes. For information about how to configure scaling modes, see Configure scaling modes.

    You can combine the scaling modes described in the following table based on your business requirements. For example, if your workload increases significantly at 12:00 p.m. daily, you can configure a scheduled task to automatically create 20 ECS instances at that time. Because 20 instances may turn out to be more or fewer than the workload actually needs, you can combine scheduled mode with other scaling modes, such as dynamic and custom modes, to correct the difference.

    | Scaling mode | Description | User guide | API references |
    | --- | --- | --- | --- |
    | Fixed-quantity mode | • After you configure the Minimum Number of Instances parameter during scaling group creation, Auto Scaling will automatically add ECS instances to meet the specified minimum if the total number of ECS instances falls below this minimum. • After you configure the Maximum Number of Instances parameter during scaling group creation, Auto Scaling will automatically remove ECS instances if the total number exceeds the specified maximum, reducing the instance count to the set value. • If you configure the Expected Number of Instances parameter during scaling group creation, Auto Scaling will automatically adjust the number of ECS instances in the scaling group to match the specified value. | Configure scaling groups | CreateScalingGroup |
    | Health mode | If you enable the health check feature during scaling group creation, Auto Scaling will monitor ECS instances at specified intervals. If an ECS instance is found to be unhealthy, Auto Scaling will remove it from the scaling group. Note: The health check feature is an integral component of scaling groups. If an SLB instance with the health check feature enabled is attached to your scaling group, the health check features of both the scaling group and the SLB instance will be active simultaneously. The SLB instance can be either a Classic Load Balancer (CLB) instance or an Application Load Balancer (ALB) instance. | Configure scaling groups | CreateScalingGroup |
    | Scheduled mode | You can create a scheduled task to automatically execute a scaling rule at a designated time point. | Configure a scheduled task | CreateScheduledTask |
    | Custom mode | You can manually perform scaling actions by executing scaling rules or adding, removing, or deleting ECS instances. | | |
    | Dynamic mode | You can create an event-triggered task based on a performance metric monitored by CloudMonitor, such as CPU utilization. When the metric value of a scaling group meets the alert condition, an alert is triggered, and the corresponding scaling rule is executed. For example, if the average CPU utilization of all ECS instances in a scaling group exceeds 60%, the alert is triggered, and the scaling action is performed. | Manage event-triggered tasks | CreateAlarm |

  2. Auto Scaling calls the ExecuteScalingRule API operation to execute scaling activities. The request must include the unique identifier of the scaling rule to execute. Example: ari:acs:ess:cn-hangzhou:140692647406****:scalingrule/asr-bp1dvirgwkoowxk7****.

    • If you create a scaling rule in the Auto Scaling console, you can find the scaling rule in the scaling rule list and click the ID of the scaling rule in the Scaling Rule ID/Name column. On the page that appears, you can view the unique identifier of the scaling rule. Example: asr-bp14u7kzh8442w9z****. For more information about how to create scaling rules, see Configure scaling rules.

    • If you create a scaling rule by calling an API operation, you can call the DescribeScalingRules API operation to query the unique identifier of the scaling rule.

  3. Auto Scaling uses the unique identifier to retrieve information about the scaling rule, including the associated scaling group and scaling configuration. It then initiates a scaling activity based on the information.

    1. Auto Scaling uses the unique identifier to retrieve information about the scaling rule and its associated scaling group, and then determines the number of ECS instances required. It can also query the SLB instance and the ApsaraDB RDS instances to which the ECS instances will be attached.

    2. Auto Scaling retrieves the scaling configuration details of the scaling group, including the vCPUs, memory size, and bandwidth required to create ECS instances.

    3. Auto Scaling initiates a scaling activity based on the required number of ECS instances, the instance configuration source, and the SLB instance and ApsaraDB RDS instances to which the ECS instances will be attached.

  4. During scaling, Auto Scaling automatically creates ECS instances and attaches them to the SLB instance and ApsaraDB RDS instances.

    1. Auto Scaling provisions the required number of ECS instances based on the instance configuration source.

    2. The private IP addresses of the ECS instances are added to the ApsaraDB RDS instance whitelists, and the ECS instances are registered as backend servers for the SLB instance.

  5. After the scaling activity is complete, Auto Scaling activates the cooldown period for the scaling group.

    The scaling group can process new scaling requests only after the cooldown period ends.
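The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the Auto Scaling implementation: the function and class names are hypothetical, the identifier layout is inferred only from the example string in step 2, and the 60% threshold is the example value from the dynamic mode description. No SDK calls are made.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


def alarm_triggered(cpu_percent: Sequence[float], threshold: float = 60.0) -> bool:
    """Step 1 (dynamic mode): alert when average CPU utilization exceeds the threshold."""
    return len(cpu_percent) > 0 and mean(cpu_percent) > threshold


@dataclass(frozen=True)
class ScalingRuleIdentifier:
    service: str
    region: str
    account_id: str
    rule_id: str


def parse_rule_identifier(identifier: str) -> ScalingRuleIdentifier:
    """Step 2: split an identifier shaped like the example in the text.

    Assumed layout (inferred from the example, not from a specification):
    ari:acs:<service>:<region>:<account>:scalingrule/<rule-id>
    """
    parts = identifier.split(":")
    if len(parts) != 6 or parts[:2] != ["ari", "acs"]:
        raise ValueError(f"unexpected identifier: {identifier!r}")
    service, region, account_id, resource = parts[2:]
    resource_type, _, rule_id = resource.partition("/")
    if resource_type != "scalingrule" or not rule_id:
        raise ValueError(f"not a scaling rule identifier: {identifier!r}")
    return ScalingRuleIdentifier(service, region, account_id, rule_id)


@dataclass
class CooldownGate:
    """Step 5: reject new scaling requests until the cooldown period ends."""
    cooldown_seconds: float
    cooldown_ends_at: float = 0.0

    def accepts_request(self, now: float) -> bool:
        return now >= self.cooldown_ends_at

    def activity_completed(self, now: float) -> None:
        # The cooldown period starts when the scaling activity completes.
        self.cooldown_ends_at = now + self.cooldown_seconds


if __name__ == "__main__":
    rule = parse_rule_identifier(
        "ari:acs:ess:cn-hangzhou:140692647406****:scalingrule/asr-bp1dvirgwkoowxk7****"
    )
    print(rule.region, rule.rule_id)

    gate = CooldownGate(cooldown_seconds=300)  # 300 is an arbitrary example value
    gate.activity_completed(now=1000.0)
    print(gate.accepts_request(now=1200.0), gate.accepts_request(now=1300.0))
```

Timestamps are passed in explicitly rather than read from the clock so that the cooldown behavior is easy to check by hand.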

Configure scaling modes

Auto Scaling automatically adds or removes ECS instances in your scaling group based on your configurations. You can configure the scaling modes described in the following table to trigger scaling activities.

| Scaling mode | Configuration method | Description |
| --- | --- | --- |
| Fixed-quantity mode | Scaling group + Instance configuration source | The scaling effect in fixed-quantity mode depends on the settings of the following parameters: Minimum Number of Instances, Maximum Number of Instances, and (optional) Expected Number of Instances. |
| Health mode | Scaling group + Instance configuration source | You must turn on Instance Health Check for the scaling group. |
| Scheduled mode | Scaling group + Instance configuration source + Scaling rule + Scheduled task | The scaling effect in scheduled mode depends on the configurations of scheduled tasks. |
| Dynamic mode | Scaling group + Instance configuration source + Scaling rule + Event-triggered task | The scaling effect in dynamic mode depends on the configurations of event-triggered tasks. |
| Custom mode | Custom configuration method | In this mode, you can manually add, remove, or delete ECS instances. You can also manually execute scaling rules. |
| Multi-mode | Combination of the preceding configuration methods | The scaling effect in multi-mode depends on the scaling modes that are included. The scaling modes operate independently, without any priority. Auto Scaling applies the configurations of the first triggered scaling mode. For example, when using scheduled and dynamic modes together, you must create a scheduled task and an event-triggered task. If the scheduled task is triggered before the event-triggered task, Auto Scaling will execute the scheduled task first. |
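The fixed-quantity and multi-mode behaviors described above can be sketched as two small functions. This is a hedged illustration only: the function names are hypothetical, and the check that the expected number lies between the minimum and maximum is an assumption rather than something stated in this topic.

```python
from typing import Optional, Sequence, Tuple


def fixed_quantity_target(current: int, minimum: int, maximum: int,
                          expected: Optional[int] = None) -> int:
    """Return the instance count that fixed-quantity mode would converge to."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if expected is not None:
        # Assumption: the expected number must lie within [minimum, maximum].
        if not minimum <= expected <= maximum:
            raise ValueError("expected must lie between minimum and maximum")
        return expected
    # Below the minimum: add instances. Above the maximum: remove instances.
    return max(minimum, min(current, maximum))


def first_triggered(triggers: Sequence[Tuple[str, float]]) -> Optional[str]:
    """Multi-mode: modes have no priority; the earliest trigger is applied.

    Each trigger is a (mode name, trigger time) pair.
    """
    if not triggers:
        return None
    mode, _ = min(triggers, key=lambda trigger: trigger[1])
    return mode


if __name__ == "__main__":
    print(fixed_quantity_target(current=1, minimum=2, maximum=10))
    print(fixed_quantity_target(current=15, minimum=2, maximum=10))
    print(first_triggered([("dynamic", 12.5), ("scheduled", 12.0)]))
```

With a minimum of 2 and a maximum of 10, a group of 1 instance is raised to 2, a group of 15 is lowered to 10, and a group already within the range is left unchanged.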

The following table describes each configuration method.

| No. | Configuration method | Description |
| --- | --- | --- |
| 1 | Scaling group + Instance configuration source | Create a scaling group, configure an instance configuration source for it, and then enable both the instance configuration source and the scaling group. Auto Scaling can only scale instances after the preceding operations are complete. The scaling group and instance configuration source are essential components of the basic configuration unit. |
| 2 | Scaling group + Instance configuration source + Scaling rule + Scheduled task | Along with the basic configuration unit in Method 1, create a scaling rule and a scheduled task. Auto Scaling initiates the scheduled task to execute the scaling rule. |
| 3 | Scaling group + Instance configuration source + Scaling rule + Event-triggered task | Along with the basic configuration unit in Method 1, create a scaling rule and an event-triggered task. Auto Scaling initiates the event-triggered task to execute the scaling rule. |
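One way to read the configuration methods is as sets of required components, where Methods 2 and 3 extend the basic configuration unit of Method 1. The sketch below encodes that reading. The component names and helper functions are hypothetical labels chosen for illustration, and health mode and custom mode are left out because they depend on a setting or manual action rather than on an additional component.

```python
from typing import Dict, FrozenSet, List, Set

# Method 1: the basic configuration unit.
BASIC_UNIT: FrozenSet[str] = frozenset(
    {"scaling_group", "instance_configuration_source"}
)

# Components each mode needs, following the configuration method tables.
REQUIRED_COMPONENTS: Dict[str, FrozenSet[str]] = {
    "fixed-quantity": BASIC_UNIT,
    "scheduled": BASIC_UNIT | {"scaling_rule", "scheduled_task"},
    "dynamic": BASIC_UNIT | {"scaling_rule", "event_triggered_task"},
}


def supported_modes(configured: Set[str]) -> List[str]:
    """List the modes whose required components are all configured."""
    return [
        mode
        for mode, required in REQUIRED_COMPONENTS.items()
        if required <= configured
    ]


def missing_components(mode: str, configured: Set[str]) -> Set[str]:
    """Return the components still needed before the mode can take effect."""
    return set(REQUIRED_COMPONENTS[mode] - configured)


if __name__ == "__main__":
    configured = {
        "scaling_group",
        "instance_configuration_source",
        "scaling_rule",
        "scheduled_task",
    }
    print(supported_modes(configured))
    print(missing_components("dynamic", configured))
```

In this example the configuration supports fixed-quantity and scheduled modes, and dynamic mode is missing only the event-triggered task.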

Workflows

Auto Scaling enables you to associate your scaling group with one or more SLB and ApsaraDB RDS instances. When a client sends a request from a mobile device or PC, the associated SLB instance forwards the request to an ECS instance in the scaling group. The ECS instance processes the request, and the ApsaraDB RDS instance stores the application data.

Auto Scaling adjusts the number of ECS instances in the scaling group based on your business requirements and the configured scaling modes. The following figures show the scaling and elastic recovery (health check) workflows.

Figure 1: Scale-out workflow

Figure 2: Scale-in workflow

Figure 3: Elastic recovery workflow
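Since the figures are not reproduced here, the elastic recovery workflow can be approximated in code. The sketch assumes that recovery combines two behaviors described earlier in this topic: health mode removes unhealthy ECS instances, and fixed-quantity mode adds instances when the count falls below the minimum. The `Instance` type, `recover`, and `make_launcher` are hypothetical stand-ins, and no real instances are created.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, List


@dataclass
class Instance:
    instance_id: str
    healthy: bool = True


def recover(instances: List[Instance], minimum: int,
            launch: Callable[[], Instance]) -> List[Instance]:
    """Remove unhealthy instances, then refill the group up to the minimum."""
    # Health mode: unhealthy instances are removed from the scaling group.
    kept = [instance for instance in instances if instance.healthy]
    # Fixed-quantity mode: add instances until the minimum is met again.
    while len(kept) < minimum:
        kept.append(launch())
    return kept


def make_launcher(prefix: str = "i-new") -> Callable[[], Instance]:
    """Return a stand-in for instance creation that yields sequential IDs."""
    ids = count(1)
    return lambda: Instance(f"{prefix}-{next(ids)}")


if __name__ == "__main__":
    group = [Instance("i-a"), Instance("i-b", healthy=False), Instance("i-c")]
    recovered = recover(group, minimum=3, launch=make_launcher())
    print([instance.instance_id for instance in recovered])
```

The launcher is injected as a parameter so the recovery logic stays independent of how instances are actually provisioned.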