What is Auto Scaling?

Auto Scaling is a cloud service that automatically adds or removes instances based on workload changes and scaling policies. You can use Auto Scaling to ensure sufficient computing resources, prevent idle resources, and reduce costs.

The following video uses Elastic Compute Service (ECS) instances as an example to describe how to use Auto Scaling.

Why Auto Scaling

When your business demand grows, Auto Scaling adds ECS instances or elastic container instances to your scaling group to provide sufficient computing power. When your business demand drops, Auto Scaling removes ECS instances or elastic container instances from your scaling group to minimize resource costs. In this way, Auto Scaling automatically adjusts the number of instances of a specific type in your scaling group to match your fluctuating business demand.

The following table describes the Auto Scaling benefits.

Benefit	Description
Automated scaling	During scale-outs, Auto Scaling automatically creates instances of a specific type in your scaling group and attaches the created instances to the Server Load Balancer (SLB) instances that are associated with your scaling group. Auto Scaling also automatically adds the private IP addresses of the created instances to the IP address whitelists of the associated ApsaraDB RDS instances. During scale-ins, Auto Scaling automatically removes instances of a specific type from your scaling group and detaches the removed instances from the associated SLB instances. Auto Scaling also automatically removes the private IP addresses of the removed instances from the IP address whitelists of the associated ApsaraDB RDS instances.
Cost-effectiveness	With Auto Scaling, you do not need to spend significant manual effort adjusting the number of instances on which your application runs or preparing instances before traffic spikes occur. Auto Scaling scales instances automatically, which helps you avoid idle resources and reduce resource costs. Auto Scaling monitors changes in metric values and in the expected number of instances in your scaling group. The default statistical period is one minute. If Auto Scaling detects that the metric values are outside the allowed range, it immediately triggers scaling activities. Two factors may affect the scaling speed: (1) the startup time of the instances that are waiting to be added to your scaling group, measured from the point in time at which an instance is created to the point in time at which the instance can provide services as expected; and (2) the number of instances that are waiting to be added to your scaling group. If this number is less than or equal to 1,000, Auto Scaling can complete scaling within one minute.
High availability	Auto Scaling automatically checks the health status of the ECS instances or elastic container instances in your scaling group to help ensure high availability of your application. If Auto Scaling detects that an instance is unhealthy, it automatically adds a new instance of the same type to replace the unhealthy instance.
Flexibility and intelligence	You can use Auto Scaling to scale ECS instances or elastic container instances based on your business requirements. Auto Scaling provides multiple scaling modes to meet complex business requirements. The scaling modes include the fixed-number mode, health mode, scheduled mode, dynamic mode, and custom mode. The dynamic mode allows you to interconnect Auto Scaling with external monitoring systems by using API operations. Auto Scaling allows you to select a template to create instances based on your business requirements. This helps improve the success rate of scale-outs. Auto Scaling also supports multiple scaling policies that you can use in different business scenarios.
Easy auditing	Auto Scaling logs the details of each scaling activity and monitors scaling groups. This can help identify and resolve issues effectively.
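
The bookkeeping described in the "Automated scaling" row can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the service's implementation: the `ScalingGroupState` class, its fields, and the `scale_out` and `scale_in` helpers are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ScalingGroupState:
    """Hypothetical stand-in for a scaling group and its associated resources."""
    instances: dict = field(default_factory=dict)    # instance ID -> private IP address
    slb_backends: set = field(default_factory=set)   # instance IDs attached to the SLB instance
    rds_whitelist: set = field(default_factory=set)  # private IPs in the RDS IP address whitelist


def scale_out(state: ScalingGroupState, instance_id: str, private_ip: str) -> None:
    """Add an instance, attach it to the load balancer, and whitelist its private IP."""
    state.instances[instance_id] = private_ip
    state.slb_backends.add(instance_id)
    state.rds_whitelist.add(private_ip)


def scale_in(state: ScalingGroupState, instance_id: str) -> None:
    """Remove an instance, detach it from the load balancer, and drop its private IP."""
    private_ip = state.instances.pop(instance_id)
    state.slb_backends.discard(instance_id)
    state.rds_whitelist.discard(private_ip)
```

The point of the sketch is that a scale-in undoes every association that the matching scale-out created, so the load balancer and the database whitelist do not accumulate stale entries.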

For more information, see Benefits.

Features

Auto Scaling supports scale-ins and scale-outs of only ECS instances and elastic container instances. Auto Scaling does not change the configurations of a single ECS instance or elastic container instance, such as the number of CPU cores, memory size, and bandwidth. To change the configurations of an ECS instance or an elastic container instance, you can activate Alibaba Cloud CloudOps Orchestration Service (OOS). For more information, see What is OOS?

Auto Scaling can automatically scale the required number of ECS instances or elastic container instances. The following table describes the main components of Auto Scaling.

Component	Description
Scaling group	A scaling group consists of instances of the same type that you can use for similar business scenarios. You can configure a scaling group to specify the type of instance to which computing resources are allocated. You can also specify the instance configuration source, the maximum number and minimum number of instances, and the Classic Load Balancer (CLB) instances or the Application Load Balancer (ALB) server groups with which you want to associate the scaling group. If you want to use Auto Scaling in multiple scenarios, you can create more than one scaling group. Auto Scaling automatically allocates computing resources to each scaling group based on your configurations.
Instance configuration source	An instance configuration source specifies information about the template that is used to manage your ECS instances or elastic container instances. Auto Scaling uses the template of the ECS type to create ECS instances and the template of the Elastic Container Instance type to create elastic container instances during scale-outs.
Scaling rule	A scaling rule is used to trigger a scaling activity. For example, you can create a scaling rule that triggers a scale-out in which an ECS instance or elastic container instance is added. You can manually execute a scaling rule. You can also create an event-triggered task or a scheduled task to automatically execute a scaling rule. Scaling rules also help change the maximum or minimum number of instances that are allowed or required in your scaling group.
Event-triggered task	Auto Scaling is integrated with CloudMonitor to monitor the metrics of your scaling group in real time. If a monitored metric reaches its specified threshold, the specified scaling rule is triggered.
Scheduled task	You can create scheduled tasks to automatically execute scaling rules at the specified points in time.
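
The relationship between a scaling group and a scaling rule can be illustrated with a short Python sketch. The class names, field names, and the `execute_rule` helper are hypothetical. The sketch also assumes that a rule's result is clamped to the group's minimum and maximum number of instances, which is one plausible reading of the bounds described above rather than confirmed service behavior.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    """Hypothetical model of a scaling group with its instance-count bounds."""
    min_size: int
    max_size: int
    current: int


@dataclass
class ScalingRule:
    """Hypothetical rule: positive adjustment scales out, negative scales in."""
    adjustment: int


def execute_rule(group: ScalingGroup, rule: ScalingRule) -> int:
    """Apply the rule, keep the result within the group's bounds (assumed behavior),
    and return how many instances were actually added (+) or removed (-)."""
    target = max(group.min_size, min(group.max_size, group.current + rule.adjustment))
    delta = target - group.current
    group.current = target
    return delta
```

In this model, an event-triggered task or a scheduled task would simply decide when to call `execute_rule`; the rule itself only describes the change to make.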

Auto Scaling can work only after you configure and enable a scaling group and specify the instance configuration source for the scaling group. You can specify other configurations based on your business requirements. The following figure shows how to use Auto Scaling.

Auto Scaling also provides the following features to meet diverse requirements:

If a scaling activity succeeds, fails, or is rejected, Auto Scaling sends notifications based on the rules that are described in the following table.

Rule	Description
Regular notification rule	Auto Scaling sends notifications by using text messages, internal messages, and emails.
Advanced notification rule	Auto Scaling automatically sends notifications to CloudMonitor or Message Service (MNS). If you use MNS, notifications are sent to the specified MNS topic or MNS queue. You are charged for using MNS. For MNS billing information, see Pricing.

To help you manage instances in a scaling group, Auto Scaling also supports the features that are described in the following table.

Feature	Description
Lifecycle hook	A lifecycle hook is used to manage the lifecycles of ECS instances or elastic container instances in a scaling group. During a scaling activity, a lifecycle hook can be triggered to switch the status of ECS instances or elastic container instances to Pending Add or Pending Remove. You can perform any operations on the instances until the lifecycle hook times out.
Manually manage instances in a scaling group	Auto Scaling supports manually adding or removing ECS instances, elastic container instances, or managed third-party instances to or from scaling groups.
Rolling update	If your scaling group is of the ECS type, you can use the rolling update feature to manage the ECS instances. You can create rolling update tasks to update the configurations of multiple ECS instances in a batch. For example, you can update images, run scripts, or install OOS packages on ECS instances that are in the In Service state.
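
As an illustration of the rolling update feature, the following Python sketch splits the instances that are in the In Service state into batches. The `rolling_update_batches` function, the dictionary layout, and the `batch_size` parameter are assumptions made for this example; they are not part of the Auto Scaling API.

```python
def rolling_update_batches(instances: dict, batch_size: int) -> list:
    """Group the IDs of In Service instances into batches of at most batch_size.

    `instances` maps an instance ID to its state, for example "In Service"
    or "Pending Add". Instances in any other state are skipped.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    in_service = [iid for iid, state in instances.items() if state == "In Service"]
    return [in_service[k:k + batch_size] for k in range(0, len(in_service), batch_size)]
```

Updating one batch at a time keeps the remaining In Service instances available to handle requests while the update runs.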

Scenarios

Auto Scaling provides a variety of scaling features that can be used in the following business scenarios:

Your workload fluctuations can be predicted.
For example, a video production company with predictable traffic uses Auto Scaling to create a scheduled task. Auto Scaling automatically adds an ECS instance or elastic container instance to help the company handle the traffic spike that occurs at 20:00:00 each Friday.
Your workload fluctuations cannot be predicted.
For example, a video production company with unpredictable traffic uses Auto Scaling to create an event-triggered task that monitors the CPU utilization of instances. If the CPU utilization exceeds 60%, Auto Scaling automatically adds an ECS instance or elastic container instance to help the company handle the traffic spike.
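
The two trigger conditions in these scenarios can be expressed as simple checks. The sketch below is illustrative only: the function names are hypothetical, and the values (Friday at 20:00:00, a CPU utilization threshold of 60%) are taken from the examples above.

```python
from datetime import datetime

FRIDAY = 4  # datetime.weekday() counts from Monday = 0, so Friday = 4


def scheduled_task_due(now: datetime) -> bool:
    """Return True at exactly 20:00:00 on a Friday (the scheduled scenario)."""
    return now.weekday() == FRIDAY and (now.hour, now.minute, now.second) == (20, 0, 0)


def event_task_triggered(cpu_utilization_percent: float, threshold: float = 60.0) -> bool:
    """Return True when CPU utilization exceeds the threshold (the event-triggered scenario)."""
    return cpu_utilization_percent > threshold
```

A scheduled task depends only on the clock, while an event-triggered task depends only on the monitored metric, which is why the first fits predictable workloads and the second fits unpredictable ones.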

For more information, see Scenarios.

Working principle

Auto Scaling triggers scaling activities based on scaling modes to add or remove ECS instances or elastic container instances to or from scaling groups. ECS instances or elastic container instances are used to process client requests. You can use Auto Scaling to add or remove instances based on changing business demands. For more information, see How Auto Scaling works.

Billing

Auto Scaling is free of charge. However, you are charged for using ECS instances, elastic container instances, ApsaraDB RDS instances, SLB instances (including ALB instances, ALB server groups, and NLB server groups), and MNS in the Auto Scaling console. For more information, see Billing.

Use Auto Scaling

Auto Scaling console: a web page that supports interactive operations.
API: a remote procedure call (RPC) API that supports GET and POST requests. For more information about API operations, see List of operations by function. If you want to call the Auto Scaling API, use the following common developer tools:
- Alibaba Cloud CLI: a flexible and scalable management tool based on Alibaba Cloud APIs. You can use the CLI to encapsulate Alibaba Cloud native APIs and develop custom features.
- OpenAPI Explorer: a tool that allows you to retrieve API operations, call API operations online, and dynamically generate SDK sample code.
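
To give a rough idea of what an RPC-style GET request looks like, the following Python sketch assembles a request URL in which the operation name is passed as a query parameter. Everything specific here is an assumption to verify against List of operations by function: the `build_rpc_url` helper is hypothetical, `ess.example.com` is a placeholder endpoint, and the parameter names, operation name, and version string are illustrative. The sketch omits credentials and request signing, so the resulting URL is not a complete, callable request.

```python
from urllib.parse import urlencode


def build_rpc_url(endpoint: str, action: str, version: str, **params: str) -> str:
    """Build an unsigned RPC-style GET URL (hypothetical helper).

    The operation is selected by the Action query parameter rather than by
    the URL path. Parameters are sorted by name to make the output stable.
    """
    query = {"Action": action, "Version": version, "Format": "JSON", **params}
    return f"https://{endpoint}/?{urlencode(sorted(query.items()))}"


# Placeholder endpoint and illustrative values; check the API reference before use.
url = build_rpc_url(
    "ess.example.com",
    "DescribeScalingGroups",
    "2014-08-28",
    RegionId="cn-hangzhou",
)
```

In practice, Alibaba Cloud CLI, OpenAPI Explorer, or an SDK handles signing and common parameters for you, which is why the tools listed above are the recommended way to call the API.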

Related services

Service	Description
ECS	A ready-to-use and scalable IaaS-level service provided by Alibaba Cloud.
Elastic Container Instance	An agile and secure serverless container runtime service provided by Alibaba Cloud. Elastic container instances are scalable and secure for your business system and can help reduce resource and O&M costs.
ApsaraDB RDS	A secure, reliable, cost-effective, and scalable online database service that helps you resolve database O&M issues.
SLB	A load balancing service that distributes network traffic on demand. You can use SLB to eliminate single points of failure in application systems and improve application availability. SLB provides the following types of services: ALB, NLB, and CLB.
CloudMonitor	A service that monitors Alibaba Cloud resources and Internet applications. CloudMonitor helps you fully understand the usage of Alibaba Cloud resources and the status of your business. You can also handle faulty resources in a timely manner to ensure the normal running of your business.
MNS	An efficient, reliable, secure, convenient, and scalable distributed messaging service. MNS helps developers freely transfer data and messages between distributed components of applications to build loosely coupled systems.