This topic describes how Auto Scaling works, explains how to configure scaling modes, and provides the workflow diagrams of Auto Scaling.

In this topic, Elastic Compute Service (ECS) instances are used to show how Auto Scaling works. If you use a scaling group to manage elastic container instances, you cannot associate the scaling group with an ApsaraDB RDS instance, and you cannot manually add elastic container instances to or delete them from the scaling group. Apart from these two limits, you manage elastic container instances in a scaling group in the same way that you manage ECS instances.

Workflow

The following figure shows how to add ECS instances in Auto Scaling.

The example uses a web application that processes requests on ECS instances and has a three-layer architecture, as shown in the black dotted line box on the right of the preceding figure. The Server Load Balancer (SLB) instance at the top layer forwards client requests to the ECS instances in the scaling group. The ECS instances at the middle layer process the requests. The ApsaraDB RDS instances at the bottom layer store the business data that the ECS instances generate.

You can use Auto Scaling to adjust the number of ECS instances at the middle layer to meet your business requirements. The following information describes how Auto Scaling adjusts the number of ECS instances:

  1. Auto Scaling triggers scaling activities if the conditions specified in scaling modes are met. For information about how to configure scaling modes, see Configure a scaling mode. Auto Scaling has the following scaling modes:
    • Fixed-number mode:
      • If you set the Minimum Number of Instances parameter when you create a scaling group and the number of ECS instances in the scaling group drops below the minimum number, Auto Scaling automatically adds ECS instances until the minimum number is reached.
      • If you set the Maximum Number of Instances parameter when you create a scaling group and the number of ECS instances in the scaling group exceeds the maximum number, Auto Scaling automatically removes ECS instances until the number drops to the maximum number.
      • If you set the Expected Number of Instances parameter when you create a scaling group, Auto Scaling automatically adds ECS instances to or removes ECS instances from the scaling group to maintain the expected number.
    • Health mode: If you enable the health check feature when you create a scaling group, Auto Scaling checks the status of the ECS instances in the scaling group at regular intervals. If an ECS instance is not in the Running state, Auto Scaling considers the instance unhealthy and removes the instance from the scaling group.
    • Scheduled mode: You can create a scheduled task to automatically execute a scaling rule at a specified point in time.
    • Custom mode: You can manually perform scaling operations. For example, you can manually execute scaling rules, or add, remove, or delete ECS instances.
    • Dynamic mode: You can create an event-triggered task based on a CloudMonitor performance metric, such as the CPU utilization. If the metric value of the scaling group meets the specified alert condition, an alert is triggered to execute the specified scaling rule. For example, if the average CPU utilization of all ECS instances in a scaling group exceeds 60%, an alert is triggered to execute the scaling rule.
    Note: You can combine the preceding scaling modes to meet your business requirements. For example, if your business loads significantly increase starting from 12:00 every day, you can create a scheduled task to automatically create 20 ECS instances at 12:00 every day. Because a fixed number of instances might not match the actual load, you can use the scheduled mode together with other scaling modes such as the dynamic mode and custom mode.
  2. Auto Scaling calls the ExecuteScalingRule API operation to trigger scaling activities. In the request, Auto Scaling specifies the unique identifier of the scaling rule that is pending execution. Example: ari:acs:ess:cn-hangzhou:140692647406****:scalingrule/asr-bp1dvirgwkoowxk7****.
    Note: If the metrics that are used in the dynamic mode are reported to CloudMonitor by your own monitoring system, you must call the ExecuteScalingRule API operation in your program.
    • If you create a scaling rule in the Auto Scaling console, you can find the scaling rule in the scaling rule list and click the ID of the scaling rule in the Scaling Rule ID/Name column to view the unique identifier of the scaling rule on the page that appears. Sample scaling rule ID: asr-bp14u7kzh8442w9z****. For more information about how to create scaling rules, see Create a scaling rule.
    • If you create a scaling rule by calling an API operation, you can call the DescribeScalingRules API operation to query the unique identifier of the scaling rule.
  3. Auto Scaling uses the unique identifier to query the information about the scaling rule, scaling group, and scaling configuration, and then creates scaling activities.
    1. Auto Scaling uses the unique identifier to query the information about the scaling rule and the scaling group that contains the scaling rule, and then calculates the number of ECS instances that are required. Auto Scaling also obtains the information about the SLB instance and the ApsaraDB RDS instance to which the required ECS instances will be attached.
    2. Auto Scaling queries the information about the scaling configuration in the scaling group. The information about the scaling configuration contains the information about vCPUs, memory, and bandwidth of the ECS instances that are required.
    3. Auto Scaling creates scaling activities based on the required number of ECS instances, instance configuration, SLB instance, and ApsaraDB RDS instance.
  4. During the scaling activities, Auto Scaling creates the required number of ECS instances and attaches the ECS instances to the SLB instance and ApsaraDB RDS instance.
    1. Auto Scaling creates the specified number of ECS instances based on the instance configuration information.
    2. Auto Scaling adds the internal IP addresses of the ECS instances that are created to the whitelist that manages access to the ApsaraDB RDS instance, and adds the ECS instances as the backend servers of the specified SLB instance.
  5. After a scaling activity is complete, Auto Scaling enables the cooldown time feature for the scaling group.

    The scaling group can receive new requests to execute scaling rules only after the cooldown time expires.
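
Two parts of the workflow above lend themselves to a short illustration: the identifier passed in step 2 and the cooldown time in step 5. The following Python sketch is not part of any Auto Scaling SDK. It assumes that every scaling rule identifier follows the same colon-separated layout as the example in step 2, and that a request is accepted again at the exact moment the cooldown time expires. All names (ScalingRuleAri, parse_scaling_rule_ari, CooldownGate, try_execute) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScalingRuleAri:
    """Parts of a scaling rule identifier (hypothetical helper type)."""
    region: str
    account_id: str
    rule_id: str


def parse_scaling_rule_ari(ari: str) -> ScalingRuleAri:
    """Split an identifier shaped like the example in step 2.

    Assumed layout: ari:acs:ess:<region>:<account>:scalingrule/<rule-id>
    """
    parts = ari.split(":")
    if len(parts) != 6 or parts[:3] != ["ari", "acs", "ess"]:
        raise ValueError(f"unexpected identifier layout: {ari!r}")
    kind, _, rule_id = parts[5].partition("/")
    if kind != "scalingrule" or not rule_id:
        raise ValueError(f"not a scaling rule identifier: {ari!r}")
    return ScalingRuleAri(region=parts[3], account_id=parts[4], rule_id=rule_id)


class CooldownGate:
    """Tracks whether a scaling group is still in its cooldown time."""

    def __init__(self, cooldown_seconds: float) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._last_completed: Optional[float] = None

    def mark_completed(self, now: float) -> None:
        """Record that a scaling activity finished at time `now` (seconds)."""
        self._last_completed = now

    def accepts_requests(self, now: float) -> bool:
        """Return True if no activity has completed yet or the cooldown expired."""
        if self._last_completed is None:
            return True
        return now - self._last_completed >= self.cooldown_seconds


def try_execute(ari: str, gate: CooldownGate, now: float) -> Optional[str]:
    """Return the rule ID to execute, or None while the group is cooling down."""
    rule = parse_scaling_rule_ari(ari)
    if not gate.accepts_requests(now):
        return None
    return rule.rule_id
```

In this sketch, a caller marks the gate with mark_completed when a scaling activity finishes. Until cooldown_seconds have passed, try_execute returns None instead of a rule ID, which mirrors the rule that the scaling group receives new execution requests only after the cooldown time expires.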

Configure a scaling mode

Auto Scaling automatically triggers scaling activities that add ECS instances to or remove ECS instances from a scaling group based on the scaling modes that you configure. The following table describes the scaling modes.

Scaling mode | Configuration method | Description
Fixed-number mode | Scaling group + Instance configuration source (1) | The scaling in the fixed-number mode varies based on the values of the following parameters: Minimum Number of Instances, Maximum Number of Instances, and (optional) Expected Number of Instances.
Health mode | Scaling group + Instance configuration source (1) | You must turn on Instance Health Check for the scaling group.
Scheduled mode | Scaling group + Instance configuration source + Scaling rule + Scheduled task (2) | The scaling in the scheduled mode is determined by scheduled tasks.
Dynamic mode | Scaling group + Instance configuration source + Scaling rule + Event-triggered task (3) | The scaling in the dynamic mode is determined by event-triggered tasks.
Custom mode | Custom configuration method | In this mode, you can manually add, remove, or delete ECS instances. You can also manually execute the scaling rules that you created.
Multi-mode | Combination of the preceding configuration methods | The configurations that take effect vary based on the scaling modes that are used. For example, if you configure scheduled mode and dynamic mode at the same time, you must create both scheduled tasks and event-triggered tasks.

The following information describes each configuration method:

  • 1. Scaling group + Instance configuration source. You must create a scaling group, configure an instance configuration source for the scaling group, and then enable the instance configuration source and the scaling group. Auto Scaling can automatically scale instances out or in only after these operations are complete. The scaling group and the instance configuration source form the basic configuration unit, which you must always configure.
  • 2. Scaling group + Instance configuration source + Scaling rule + Scheduled task. In addition to the basic configuration unit in 1, you must create a scaling rule and a scheduled task. Auto Scaling executes the scaling rule based on the scheduled task.
  • 3. Scaling group + Instance configuration source + Scaling rule + Event-triggered task. In addition to the basic configuration unit in 1, you must create a scaling rule and an event-triggered task. Auto Scaling executes the scaling rule based on the event-triggered task.
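
The configuration methods above can be read as a lookup from scaling mode to the resources that you must create. The following Python sketch encodes only what the table and the three notes state. The names BASIC_UNIT, REQUIRED_COMPONENTS, and required_components are hypothetical and do not come from any Auto Scaling API. Custom mode is left out because the table describes it only as a custom configuration method.

```python
from typing import Dict, Iterable, List, Tuple

# The basic configuration unit from note 1.
BASIC_UNIT: Tuple[str, ...] = ("scaling group", "instance configuration source")

# Resources per scaling mode, taken from the table. Health mode also needs
# Instance Health Check turned on, which is a setting rather than a resource.
REQUIRED_COMPONENTS: Dict[str, Tuple[str, ...]] = {
    "fixed-number": BASIC_UNIT,
    "health": BASIC_UNIT,
    "scheduled": BASIC_UNIT + ("scaling rule", "scheduled task"),
    "dynamic": BASIC_UNIT + ("scaling rule", "event-triggered task"),
}


def required_components(modes: Iterable[str]) -> List[str]:
    """Return the resources needed for a combination of modes, without duplicates."""
    ordered: List[str] = []
    for mode in modes:
        if mode not in REQUIRED_COMPONENTS:
            raise KeyError(f"no fixed configuration method listed for mode {mode!r}")
        for component in REQUIRED_COMPONENTS[mode]:
            if component not in ordered:
                ordered.append(component)
    return ordered
```

For the multi-mode example in the table, required_components(["scheduled", "dynamic"]) lists the basic configuration unit once, followed by the scaling rule, the scheduled task, and the event-triggered task.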

Workflow diagrams

Auto Scaling allows you to associate your scaling groups with SLB instances and ApsaraDB RDS instances. When a client such as a mobile device or a PC sends a request, the associated SLB instance forwards the request to an ECS instance in the scaling group. The ECS instance receives and processes the request. The associated ApsaraDB RDS instance stores the application data.

Auto Scaling adjusts the number of ECS instances in your scaling group based on your business requirements and the scaling modes that you configure. The following figures show the processes of scale-out, scale-in, and elastic recovery (health check) events in Auto Scaling.

Figure 1. Scale-out in Auto Scaling
Figure 2. Scale-in in Auto Scaling
Figure 3. Elastic recovery in Auto Scaling
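
The scale-out, scale-in, and elastic recovery events named in the figure captions correspond to the fixed-number mode and health mode described earlier in this topic. The following Python sketch shows one way to express those rules as arithmetic. It is an illustration, not the service's implementation. It assumes that the expected number, when set, lies between the minimum and maximum numbers, and that unhealthy instances are removed before the instance count is compared with the limits. The function names and the instance IDs in the usage note are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def fixed_number_delta(current: int, minimum: int, maximum: int,
                       expected: Optional[int] = None) -> int:
    """Return how many instances to add (positive) or remove (negative)."""
    if expected is not None:
        # Assumption: minimum <= expected <= maximum.
        return expected - current
    if current < minimum:
        return minimum - current   # scale-out up to the minimum number
    if current > maximum:
        return maximum - current   # scale-in down to the maximum number
    return 0


def unhealthy_instances(states: Dict[str, str]) -> List[str]:
    """Return the IDs of instances that are not in the Running state."""
    return [instance_id for instance_id, state in states.items()
            if state != "Running"]


def elastic_recovery(states: Dict[str, str], minimum: int, maximum: int,
                     expected: Optional[int] = None) -> Tuple[List[str], int]:
    """Remove unhealthy instances, then compute the adjustment that follows."""
    removed = unhealthy_instances(states)
    remaining = len(states) - len(removed)
    return removed, fixed_number_delta(remaining, minimum, maximum, expected)
```

For example, with three instances of which one is in the Stopped state and a minimum number of 3, elastic_recovery returns that one instance for removal and an adjustment of +1, which means that one replacement instance is added.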