This topic describes how Auto Scaling works, explains how to configure scaling modes, and provides workflow diagrams of Auto Scaling.

This topic uses Elastic Compute Service (ECS) instances to describe how Auto Scaling works. You can manage elastic container instances in a scaling group in the same way as ECS instances, with two exceptions: you cannot associate the scaling group with an ApsaraDB RDS instance, and you cannot manually add elastic container instances to or delete elastic container instances from the scaling group.

Workflow

The following figure shows how Auto Scaling adds ECS instances.

Figure. How Auto Scaling works

In the example, a web application is used. The web application has a three-layer system architecture and uses ECS instances to process requests, as shown in the black dotted line box on the right side of the preceding figure. In the system architecture, the Server Load Balancer (SLB) instance at the top layer forwards the requests from the client to the ECS instances in the scaling group, which are at the middle layer. The ECS instances process the requests from the client. ApsaraDB RDS instances at the bottom layer store business data from the ECS instances.

You can use Auto Scaling to adjust the number of ECS instances at the middle layer based on your business requirements. The following procedure describes how Auto Scaling adjusts the number of ECS instances:

  1. Auto Scaling triggers scaling activities if the conditions specified in scaling modes are met. For information about how to configure scaling modes, see Configure a scaling mode. Auto Scaling supports the following scaling modes:
    • Fixed-number mode:
      • If you configure the Minimum Number of Instances parameter when you create a scaling group, Auto Scaling automatically adds ECS instances to the scaling group to maintain the specified minimum number of ECS instances in the scaling group.
      • If you configure the Maximum Number of Instances parameter when you create a scaling group, Auto Scaling automatically removes the excess ECS instances from the scaling group to maintain the specified maximum number of ECS instances in the scaling group.
      • If you configure the Expected Number of Instances parameter when you create a scaling group, Auto Scaling automatically adds ECS instances to or removes ECS instances from the scaling group to maintain the expected number of ECS instances in the scaling group.
    • Health mode: If you enable the health check feature when you create a scaling group, Auto Scaling checks the status of the ECS instances in the scaling group at specified intervals. If an ECS instance is not in the Running state, Auto Scaling considers the instance unhealthy and removes the instance from the scaling group.
    • Scheduled mode: You can create a scheduled task to automatically execute a scaling rule at a specified point in time.
    • Custom mode: You can manually perform scaling operations. For example, you can manually execute scaling rules, or add, remove, or delete ECS instances.
    • Dynamic mode: You can create an event-triggered task based on a performance metric that is monitored by CloudMonitor, such as the CPU utilization. If the metric value of the scaling group meets the specified alert condition, an alert is triggered and the specified scaling rule is executed. For example, if the average CPU utilization of all ECS instances in a scaling group exceeds 60%, an alert is triggered, and the specified scaling rule is executed.
    Note: You can use the preceding scaling modes together based on your business requirements. For example, if your business load increases significantly at 12:00:00 every day, you can create a scheduled task to automatically create 20 ECS instances at 12:00:00 every day. To make sure that the number of ECS instances continues to meet your business requirements, you can use the scheduled mode together with other scaling modes such as the dynamic mode and the custom mode.
  2. Auto Scaling calls the ExecuteScalingRule API operation to trigger scaling activities. In this API operation, Auto Scaling specifies the unique identifier of the scaling rule that you want the system to execute. Example: ari:acs:ess:cn-hangzhou:140692647406****:scalingrule/asr-bp1dvirgwkoowxk7****.
    Note: If the metrics that are used in the dynamic mode are the metrics reported by your monitoring system to CloudMonitor, you must call the ExecuteScalingRule API operation in your program.
    • If a scaling rule is created in the Auto Scaling console, you can find the scaling rule in the scaling rule list and click the ID of the scaling rule in the Scaling Rule ID/Name column to view the unique identifier of the scaling rule on the page that appears. Sample scaling rule ID: asr-bp14u7kzh8442w9z****. For more information about how to create scaling rules, see Create a scaling rule.
    • If a scaling rule is created by calling an API operation, you can call the DescribeScalingRules API operation to query the unique identifier of the scaling rule.
  3. Auto Scaling uses the unique identifier to query the information about the scaling rule, scaling group, and scaling configuration and then triggers scaling activities.
    1. Auto Scaling uses the unique identifier to query the information about the scaling rule and the scaling group to which the scaling rule applies, and then calculates the number of ECS instances that are required. Auto Scaling also queries the information about the SLB instance and the ApsaraDB RDS instance to which the required ECS instances are to be attached.
    2. Auto Scaling queries the information about the scaling configuration in the scaling group. The information includes the vCPUs, memory, and bandwidth of the ECS instances that are required.
    3. Auto Scaling triggers scaling activities based on the required number of ECS instances, instance configuration, SLB instance, and ApsaraDB RDS instance.
  4. During the scaling activities, Auto Scaling creates the required number of ECS instances and attaches the ECS instances to the SLB instance and ApsaraDB RDS instance.
    1. Auto Scaling creates the required number of ECS instances based on the instance configuration information.
    2. Auto Scaling adds the private IP addresses of the ECS instances that are created to the whitelist that manages access to the ApsaraDB RDS instance, and then adds the ECS instances as the backend servers of the specified SLB instance.
  5. After a scaling activity is complete, Auto Scaling enables the cooldown feature for the scaling group.

    The scaling group can receive new requests to execute scaling rules only after the cooldown time expires.
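
The interaction between scaling rules, the instance limits, and the cooldown time can be sketched as a small simulation. The following Python sketch is illustrative only: the `ScalingGroup` class, its fields, and the `execute_rule` method are hypothetical names and are not part of the Auto Scaling API. The sketch assumes that a scaling rule adds or removes a fixed number of instances, that the result is kept between the minimum and maximum number of instances, and that requests received during the cooldown time are rejected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingGroup:
    """Hypothetical in-memory model of a scaling group (not the real API)."""
    min_size: int
    max_size: int
    cooldown_seconds: int
    instance_count: int = 0
    last_activity_at: Optional[float] = None  # when the last activity completed

    def in_cooldown(self, now: float) -> bool:
        """Return True while the cooldown time of the last activity is running."""
        if self.last_activity_at is None:
            return False
        return now - self.last_activity_at < self.cooldown_seconds

    def execute_rule(self, adjustment: int, now: float) -> bool:
        """Apply a rule that adds (positive) or removes (negative) instances.

        Returns True if a scaling activity ran and False if it was rejected.
        """
        if self.in_cooldown(now):
            return False
        # Assumption: the target is kept within the configured limits.
        target = self.instance_count + adjustment
        target = max(self.min_size, min(self.max_size, target))
        if target == self.instance_count:
            return False  # nothing changes, so no activity and no cooldown
        self.instance_count = target
        self.last_activity_at = now
        return True


group = ScalingGroup(min_size=2, max_size=10, cooldown_seconds=300, instance_count=2)
first = group.execute_rule(adjustment=3, now=0)      # runs: 2 -> 5 instances
second = group.execute_rule(adjustment=3, now=120)   # rejected: still in cooldown
third = group.execute_rule(adjustment=20, now=400)   # runs, capped at 10 instances
print(first, second, third, group.instance_count)
```

In this sketch the second request is dropped rather than queued. That is a simplification chosen for the example, not a statement about how the service handles rejected requests.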

Configure a scaling mode

Auto Scaling can automatically trigger scaling activities to add ECS instances to or remove ECS instances from a scaling group based on your configurations. You can configure scaling modes for Auto Scaling to trigger scaling activities. The following table describes the scaling modes.

| Scaling mode | Configuration method | Description |
| --- | --- | --- |
| Fixed-number mode | Scaling group + Instance configuration source [1] | The scaling in the fixed-number mode varies based on the values of the following parameters: Minimum Number of Instances, Maximum Number of Instances, and (optional) Expected Number of Instances. |
| Health mode | Scaling group + Instance configuration source [1] | You must turn on Instance Health Check for the scaling group. |
| Scheduled mode | Scaling group + Instance configuration source + Scaling rule + Scheduled task [2] | The scaling in the scheduled mode varies based on the configurations of scheduled tasks. |
| Dynamic mode | Scaling group + Instance configuration source + Scaling rule + Event-triggered task [3] | The scaling in the dynamic mode varies based on the configurations of event-triggered tasks. |
| Custom mode | Custom configuration method | In this mode, you can manually add, remove, or delete ECS instances. You can also manually execute the scaling rules that you created. |
| Multiple modes | Combination of the preceding configuration methods | |

The configurations that take effect depend on the scaling modes that you use. The scaling modes are independent of each other and have no priorities. Auto Scaling executes the configurations of whichever scaling mode is triggered first.

For example, if you use the scheduled mode together with the dynamic mode to cope with business demands, you must create both scheduled tasks and event-triggered tasks. If the scheduled mode is triggered earlier than the dynamic mode, Auto Scaling executes the scheduled task before it executes the event-triggered task.
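
The rule that trigger time, not the scaling mode, decides the execution order can be illustrated with a short sketch. The `Trigger` class, the task names, and the timestamps below are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    """Hypothetical record of a task that has been triggered."""
    mode: str            # for example "scheduled" or "dynamic"
    task_name: str
    triggered_at: float  # seconds since an arbitrary starting point


def execution_order(triggers):
    """Order triggers by trigger time only; the mode gives no priority."""
    return sorted(triggers, key=lambda t: t.triggered_at)


pending = [
    Trigger("dynamic", "cpu-above-60-percent", triggered_at=45.0),
    Trigger("scheduled", "daily-noon-scale-out", triggered_at=0.0),
]
order = [t.task_name for t in execution_order(pending)]
print(order)
```

Here the scheduled task runs first only because it was triggered earlier. If the timestamps were swapped, the event-triggered task would run first.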

The following items describe each configuration method:

  • 1. Scaling group + Instance configuration source. You must create a scaling group, configure an instance configuration source for the scaling group, and then enable the instance configuration source and the scaling group. Auto Scaling can automatically scale instances only after the preceding operations are complete. The scaling group and the instance configuration source form the basic configuration unit, which is required by every configuration method.
  • 2. Scaling group + Instance configuration source + Scaling rule + Scheduled task. In addition to the basic configuration unit in Method 1, you must create a scaling rule and a scheduled task. Auto Scaling executes the scaling rule based on the scheduled task.
  • 3. Scaling group + Instance configuration source + Scaling rule + Event-triggered task. In addition to the basic configuration unit in Method 1, you must create a scaling rule and an event-triggered task. Auto Scaling executes the scaling rule based on the event-triggered task.
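
The configuration methods can be summarized as a lookup from scaling mode to the components that the mode requires. The following sketch is one way to express that mapping. The component keys such as `scaling_group` and the `available_modes` function are hypothetical names chosen for this example, and the custom mode is omitted because it relies on manual operations.

```python
# Basic configuration unit that every configuration method builds on.
BASIC_UNIT = frozenset({"scaling_group", "instance_configuration_source"})

# Components required by each scaling mode (hypothetical keys).
REQUIRED_COMPONENTS = {
    "fixed-number": BASIC_UNIT,
    # The health mode also needs Instance Health Check turned on.
    "health": BASIC_UNIT | {"instance_health_check"},
    "scheduled": BASIC_UNIT | {"scaling_rule", "scheduled_task"},
    "dynamic": BASIC_UNIT | {"scaling_rule", "event_triggered_task"},
}


def available_modes(configured):
    """Return the scaling modes whose required components are all configured."""
    configured = set(configured)
    return sorted(
        mode
        for mode, needed in REQUIRED_COMPONENTS.items()
        if needed <= configured  # subset test
    )


configured = [
    "scaling_group",
    "instance_configuration_source",
    "scaling_rule",
    "scheduled_task",
]
modes = available_modes(configured)
print(modes)
```

With this input, only the fixed-number mode and the scheduled mode are available. Adding an event-triggered task to the list would also make the dynamic mode available, which corresponds to using multiple modes together.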

Workflow diagrams

Auto Scaling allows you to associate your scaling groups with SLB instances and ApsaraDB RDS instances. When you send a request from a mobile device or a PC, the associated SLB instance forwards the request to one ECS instance in the scaling group. The ECS instance receives and processes the request. The associated ApsaraDB RDS instance stores the application data.

Auto Scaling adjusts the number of ECS instances in your scaling group based on your business requirements and the scaling modes that you configured. The following figures show the processes of scale-out, scale-in, and elastic recovery (health check) events in Auto Scaling.

Figure 1. Scale-out in Auto Scaling
Figure 2. Scale-in in Auto Scaling
Figure 3. Elastic recovery in Auto Scaling
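
Elastic recovery combines the health mode with the fixed-number mode: instances that are not in the Running state are removed, and instances are added if the group falls below the minimum number of instances. The following sketch shows that calculation under stated assumptions. The `elastic_recovery` function, the instance IDs, and the state strings other than Running are hypothetical, and the real service may decide how many replacements to create in a different way.

```python
def elastic_recovery(instances, min_size):
    """Plan one health-check pass over a scaling group.

    instances: dict that maps an instance ID to its current state.
    Returns (healthy, removed, to_create), where to_create is the number of
    instances needed to get back to the minimum number of instances.
    """
    # Assumption from the health mode: only Running instances are healthy.
    healthy = {iid: state for iid, state in instances.items() if state == "Running"}
    removed = sorted(set(instances) - set(healthy))
    to_create = max(0, min_size - len(healthy))
    return healthy, removed, to_create


group_instances = {"i-a": "Running", "i-b": "Stopped", "i-c": "Running"}
healthy, removed, to_create = elastic_recovery(group_instances, min_size=3)
print(sorted(healthy), removed, to_create)
```

In this example one instance is removed and one replacement is planned, so the group returns to three instances after the scaling activity.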