Automatically increase or decrease the number of ECS instances based on your business workloads - Auto Scaling

Auto Scaling helps you handle fluctuating workloads efficiently, improving resource utilization and reducing costs. For unpredictable workloads, you can create event-triggered tasks that monitor metrics like CPU utilization. When a metric crosses a defined threshold, Auto Scaling automatically adds or removes the specified number of Elastic Compute Service (ECS) instances.

Scenario

Description

When you manage dynamic workload fluctuations, determining the optimal number of servers and the timing for scaling can be challenging. If you want to adjust server capacity based on real-time workload changes, you can create event-triggered tasks in the Auto Scaling console.

For example, traffic on a news website is highly variable. Page views spike with breaking news and decline as the news becomes less timely.

Solution

By linking event-triggered tasks to CloudMonitor, you can configure Auto Scaling to automatically execute a scaling rule when the monitored metric reaches the defined threshold. This enables server scaling based on business workloads.

Benefits

Zero over-provisioning costs
Auto Scaling creates and releases ECS instances based on your business requirements, which eliminates upfront costs related to resource setup. You only need to reserve computing resources for regular business traffic.
Automatic scaling
Auto Scaling integrates with CloudMonitor to track workload fluctuations and scale ECS instances automatically. This helps you meet business demand while reducing resource and O&M costs.

Prerequisites

The first time you use Auto Scaling, you must complete Resource Access Management (RAM) user authorization.

The AliyunServiceRoleForAutoScaling service-linked role is created. For more information, see Service-linked role.
If you use Auto Scaling as a RAM user, the RAM user must be granted the AliyunESSFullAccess policy. For more information, see Grant permissions to a RAM user.

Step 1: Create a scaling group

A scaling group is a collection of instances that meet your business requirements and serves as the core unit of Auto Scaling. Auto Scaling adjusts the number of instances in scaling groups by adding or removing them as needed.

Go to the Scaling Groups page.
1. Log on to the Auto Scaling console.
2. In the left-side navigation pane, click Scaling Groups.
3. In the top navigation bar, select the region where Auto Scaling is activated.
In the upper-left corner of the Scaling Groups page, click Create.

On the Create by Form tab, configure the scaling group and click Create.

The following table describes the parameter settings used in this topic. Parameters that are not listed in the table use their default settings. For information about how to create a scaling group, see Create scaling groups.

Parameter	Example	Description
Scaling Group Name	test	Enter a name for the scaling group. The name must meet the format requirements displayed on the UI.
Type	ECS	Select ECS, which specifies that the scaling group contains ECS instances.
Instance Configuration Source	Create from Scratch	Do not specify the template for automatically creating ECS instances at this stage. After the scaling group is created, you can proceed to create a scaling configuration.
Minimum Number of Instances	1	Specify the minimum number of instances in the scaling group. If the number of instances in the scaling group falls below this value, Auto Scaling will add ECS instances until the desired minimum is reached.
Maximum Number of Instances	5	Specify the maximum number of instances in the scaling group. If the number of instances exceeds this value, Auto Scaling removes ECS instances until the number of instances equals the specified maximum.
VPC	vpc-bp1jmxxau0lur929p****	Select a VPC for the ECS instances in the scaling group.
vSwitch	vsw-2zeknnyw2ewufbs4z** vsw-2zesy03h8eaf9fe0l**	Select one or more vSwitches for the ECS instances in the scaling group. We recommend that you select multiple vSwitches to improve the success rate of scale-out events.
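
The effect of the Minimum Number of Instances and Maximum Number of Instances settings can be pictured as a clamp on the instance count. The following is a minimal local sketch, not the Auto Scaling implementation. The function name enforce_bounds is hypothetical, and the defaults 1 and 5 are the example values from the table above.

```python
def enforce_bounds(current: int, minimum: int = 1, maximum: int = 5) -> int:
    """Return the instance count after the scaling group bounds are applied.

    Hypothetical helper for illustration only. The defaults mirror the
    example values used in this topic (minimum 1, maximum 5).
    """
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    # Below the minimum: instances are added until the minimum is reached.
    # Above the maximum: instances are removed until the maximum is reached.
    return max(minimum, min(current, maximum))
```

For example, a group that holds 0 instances is brought up to 1, and a group that holds 8 instances is brought down to 5. A count that is already within the bounds is left unchanged.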

Step 2: Create a scaling configuration and enable the scaling configuration and scaling group

A scaling configuration is a template used to create ECS instances during scale-out events. It includes information such as the billing method, instance type, storage, and network settings. After you create a scaling configuration, you can enable the scaling configuration and then enable the scaling group.

Find the desired scaling group and use one of the following methods to open the scaling group details page.
- Click the ID of the scaling group in the Scaling Group Name/ID column.
- Click Details in the Actions column.
In the upper part of the details page, click the Instance Configuration Sources tab.
On the Scaling Configurations tab, click Create Scaling Configuration.

On the Create Scaling Configuration page, configure parameters to create a scaling configuration and then click Create.

The following table describes the parameter settings used in this topic. Parameters that are not listed in the table use their default settings. For more information about how to create a scaling configuration, see Create a scaling configuration of the ECS type.

Section	Parameter	Example	Description
Basic Information	Scaling Configuration Name	test	Enter a name for the scaling configuration. The name must meet the format requirements displayed on the UI.
Basic Information	Billing Method	Pay-as-you-go	Auto Scaling is free. However, charges apply for ECS instances created during scale-out events. In this example, the pay-as-you-go billing method is used. For more information, see Billing overview of ECS.
Image and Instance	Instance Configuration Mode	Specify Instance Pattern	Select Specify Instance Pattern to choose specifications for ECS instances.
Image and Instance	Instance Attribute Combination	2 vCPUs, 4 GiB Memory, Enterprise Level	Select the appropriate number of vCPUs and memory size for ECS instances based on your business requirements.
Image and Instance	Select Image	Public Image: Alibaba Cloud Linux 3.2104 LTS 64-bit	Select an image to deploy ECS instances. This example uses a public image. In real-world scenarios, you can choose a custom image specific to your application.
Storage	System Disk	Enterprise SSD (ESSD), 40 GiB, PL0	Select a system disk for ECS instances.
Network and Security Group	Public IP Address	Assign Public IPv4 Address, Pay-by-bandwidth, and 1 Mbit/s bandwidth	Specify whether to assign public IP addresses to ECS instances. Outbound public bandwidth is charged separately, and the fees are included in your ECS instance costs.
Network and Security Group	Security Group	sg-bp18kz60mefsicfg****	Select an existing security group. For information about how to create a security group, see Create a security group.
Management Settings	Logon Credentials	Set Later	Select Set Later, which requires you to manually configure passwords for ECS instances after you create the instances.

In the Preview Scaling Configuration dialog box, confirm the information and click Create.
In the "The scaling configuration is created." message, click Enable.
In the Enable Scaling Configuration dialog box, click OK.
Note
In a scaling group, you must enable one scaling configuration. After you enable a scaling configuration, the scaling configuration enters the Active state.
In the Enable Scaling Group message, click OK.
The scaling group must be enabled to allow Auto Scaling to scale instances automatically based on your business requirements.
In this example, the Minimum Number of Instances parameter is set to 1. When you enable the scaling group, Auto Scaling automatically creates one ECS instance from the enabled scaling configuration. You can go to the Instances tab of the scaling group details page and check the instance information on the Auto Created tab.

Step 3: Create scaling rules

A scaling rule defines the specific action to be taken during a scaling event, such as how many instances to add or remove.

On the details page of the scaling group, click the Scaling Rules and Event-triggered Tasks tab. Then, click the Scaling Rules tab.

Click Create Scaling Rule, configure parameters to create the scaling rule, and then click OK.

In this example, simple scaling rules are created. For more information about how to create a scaling rule, see Configure scaling rules.

Parameter	Description
Rule Name	Enter a name for the scaling rule. The name must meet the format requirements displayed on the UI.
Rule Type	Specify the type of the scaling rule. In this example, select Simple Scaling Rule. For more information about scaling rules, see Overview.
Operation	Specify the number of instances to add or remove when the scaling rule is executed. The number of instances added or removed during each scaling event must not exceed 1,000.
Cooldown Time	Optional. Specify the cooldown period for the scaling rule. Unit: seconds. If you do not configure this parameter, the cooldown period of the scaling group takes effect. For more information, see Cooldown period.

Repeat this step to create scale-out and scale-in rules. The following table describes the configurations used in this example.

Scaling rule	Example
Scale-out rule	Rule Name: add; Rule Type: Simple Scaling Rule; Operation: Add 1 Instance
Scale-in rule	Rule Name: remove; Rule Type: Simple Scaling Rule; Operation: Remove 1 Instance
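
The two rules above can be modeled locally to see how a simple scaling rule, the cooldown period, and the scaling group bounds interact. This is a sketch under stated assumptions, not the service's implementation: the SimpleScalingRule class and execute function are hypothetical names, the 300-second cooldown is an illustrative value, and the sketch assumes that a rule executed during its cooldown period leaves the instance count unchanged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SimpleScalingRule:
    """Hypothetical local model of a simple scaling rule."""
    name: str
    adjustment: int               # +1 adds one instance, -1 removes one
    cooldown_seconds: int = 300   # illustrative value, not taken from the source


def execute(rule: SimpleScalingRule, current: int, now: float,
            last_run: Optional[float] = None,
            minimum: int = 1, maximum: int = 5) -> int:
    """Return the instance count after an attempt to execute the rule."""
    # Assumption: an execution attempt inside the cooldown period is skipped.
    if last_run is not None and now - last_run < rule.cooldown_seconds:
        return current
    # The result stays within the scaling group bounds (1 and 5 in this topic).
    return max(minimum, min(current + rule.adjustment, maximum))


# The two rules created in this step.
ADD = SimpleScalingRule(name="add", adjustment=1)
REMOVE = SimpleScalingRule(name="remove", adjustment=-1)
```

With these definitions, executing ADD on a group of 5 instances or REMOVE on a group of 1 instance leaves the count unchanged, because the group bounds take precedence over the rule.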

Step 4: Create event-triggered tasks

Event-triggered tasks monitor specific metrics and collect real-time data. When the data meets predefined alert conditions, Auto Scaling triggers alerts and executes scaling rules.

On the details page of the scaling group, click the Scaling Rules and Event-triggered Tasks tab. Then, click the Event-triggered Tasks tab.

On the Event-triggered Tasks (System) tab, click Create Event-triggered Task, configure parameters to complete task creation, and then click OK.

Note

In this example, system metrics are specified in event-triggered tasks. You can also report custom metrics to CloudMonitor and specify the custom metrics in your event-triggered tasks. For more information, see Overview.

The following table describes the parameter settings used in this topic. Parameters that are not listed in the table use their default settings. For more information about event-triggered tasks, see Manage event-triggered tasks.

Parameter	Description
Name	Enter a name for the event-triggered task. The name must follow the format requirements displayed on the UI.
Alert Condition	Specify the condition that triggers an alert when the metric data reaches a threshold. You must specify at least one metric. To specify multiple metrics, click Add Metric. Each condition consists of the following items. Metric: the system metric that you want to monitor. For example, if you use the (ECS) CPU Utilization metric, the CPU utilization of all ECS instances in the scaling group is monitored. Statistical method: the method used to aggregate the metric data before it is compared with the threshold. Valid values: Average, Maximum, and Minimum. For example, if you use the Average method with the condition Average >= 70%, an alert is triggered when the average CPU utilization of all ECS instances in the scaling group reaches or exceeds 70%.
Scaling Rule Triggered Upon Alerting	Specify the scaling rule that you want to execute when alerts are reported.
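
The way a statistical method and a threshold combine into an alert condition can be sketched as follows. This is an illustrative local model, not the CloudMonitor evaluation logic. The function name alert_triggered is hypothetical, and the set of comparison operators is an assumption.

```python
import operator
from statistics import mean
from typing import Sequence

# Statistical methods named in the Alert Condition parameter.
STATISTICS = {"Average": mean, "Maximum": max, "Minimum": min}

# Comparison operators. The exact set offered by the console is an assumption.
COMPARATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
}


def alert_triggered(samples: Sequence[float], statistic: str,
                    comparator: str, threshold: float) -> bool:
    """Aggregate per-instance CPU utilization and compare it with a threshold."""
    if not samples:
        raise ValueError("at least one sample is required")
    value = STATISTICS[statistic](samples)
    return COMPARATORS[comparator](value, threshold)
```

For two instances at 60% and 80% CPU utilization, the condition Average >= 70% is met because the average is exactly 70%, whereas Minimum >= 70% is not met because one instance is at 60%.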

Repeat this step to create event-triggered tasks that initiate the execution of scale-out and scale-in rules. The following table describes the configurations used in this example.

Event-triggered task	Sample configuration
Scale-out upon alerting	Name: Alarm-add; Alert Condition: (ECS) CPU Utilization, Average (Average) > 70%; Scaling Rule Triggered Upon Alerting: add
Scale-in upon alerting	Name: alarm-remove; Alert Condition: (ECS) CPU Utilization, Average (Average) < 20%; Scaling Rule Triggered Upon Alerting: remove

Test the auto-scaling configuration

After you create the event-triggered tasks, Auto Scaling continuously monitors the metric data of the scaling group and executes scaling rules when the defined conditions are met.

This setup results in the following behavior:

When the average CPU utilization of ECS instances in the scaling group exceeds 70%, one ECS instance is automatically added to the scaling group.
When the average CPU utilization of ECS instances in the scaling group drops below 20%, one ECS instance is automatically removed from the scaling group.
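
The expected behavior can be traced with a small simulation before you run a real test. The sketch below is a simplified model under stated assumptions: the function name simulate is hypothetical, each input value stands for one evaluation of the average CPU utilization, and cooldown periods and instance startup time are ignored.

```python
from typing import Iterable, List


def simulate(cpu_averages: Iterable[float], start: int = 1,
             minimum: int = 1, maximum: int = 5) -> List[int]:
    """Return the instance count after each evaluation of average CPU usage."""
    count = start
    history = []
    for cpu in cpu_averages:
        if cpu > 70:
            # Scale-out task fires: the "add" rule adds one instance.
            count = min(count + 1, maximum)
        elif cpu < 20:
            # Scale-in task fires: the "remove" rule removes one instance.
            count = max(count - 1, minimum)
        history.append(count)
    return history
```

For the readings 85, 90, 50, 10, 10, 10, the model yields the counts 2, 3, 3, 2, 1, 1: the group grows while utilization stays above 70%, holds steady in between the thresholds, and shrinks back to the minimum of 1 once utilization stays below 20%.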

You can use a stress testing tool to verify the scaling behavior in the following ways:

After an event-triggered task is triggered, go to the scaling group details page and choose Instances > Auto Created. Then, check the changes in the instance count.
After an event-triggered task is triggered, go to the scaling group details page and click the Scaling Activities tab. Then, check whether any scaling activity is generated. If a corresponding scaling activity is generated, click its ID to view the scaling activity details.
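
If no dedicated stress testing tool is at hand, a short script run on an ECS instance in the scaling group can stand in for one. The following is a minimal sketch, not a recommended tool: burn_cpu and generate_load are hypothetical helper names, and the approach assumes that keeping every vCPU busy is enough to push the average CPU utilization above the 70% threshold.

```python
import multiprocessing
import time
from typing import List, Optional


def burn_cpu(seconds: float) -> int:
    """Keep one CPU core busy for the given duration and return the loop count."""
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


def generate_load(seconds: float, workers: Optional[int] = None) -> List[int]:
    """Run one busy loop per worker process (defaults to one per CPU core)."""
    workers = workers or multiprocessing.cpu_count()
    with multiprocessing.Pool(workers) as pool:
        return pool.map(burn_cpu, [seconds] * workers)
```

Calling generate_load(600) from a script, inside an `if __name__ == "__main__":` guard, would keep all cores busy for about ten minutes. After the script finishes, utilization falls again, which gives the scale-in task a chance to fire.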