What is Auto Scaling?

Add instances when traffic spikes, then remove them when demand drops.
Replace unhealthy instances automatically to maintain availability.
Reduce total cost of ownership by eliminating over-provisioning.

Auto Scaling supports Elastic Compute Service (ECS) instances and Elastic Container Instance (ECI) instances.

Auto Scaling only scales the number of instances. To change instance specifications such as CPU, memory, or bandwidth, use CloudOps Orchestration Service (OOS).

Auto Scaling, also known as Elastic Scaling Service (ESS), automatically adjusts the number of compute instances based on policies you define, improving resource utilization and reducing costs.

Why use Auto Scaling?

When business demand increases, Auto Scaling adds instances of a specified type to maintain capacity. When demand decreases, it removes instances to reduce costs.

Benefit	Description
Automation	Scale-out: creates instances, registers them with a load balancer, and associates them with ApsaraDB RDS instances. Scale-in: removes instances, detaches them from the load balancer, and disassociates them from RDS instances.
Cost savings	Eliminates manual resource adjustments, over-provisioning, and idle instance tracking. Auto Scaling checks metrics once per minute by default. If a metric meets the threshold condition you specify, a scaling activity starts immediately. Response time depends on the startup time of the scaled instances (the time from when an instance is created until its operating system is ready) and the number of instances to scale. For groups of up to 1,000 instances, scaling activities typically complete within one minute.
High availability	Monitors ECS and ECI instance health. Automatically replaces instances that are not in a running state to maintain your desired capacity.
Flexibility	Supports five scaling modes to handle diverse workloads: fixed-number, health, scheduled, dynamic, and custom. The dynamic mode integrates with external monitoring systems via API. Flexible instance templates help increase the instance creation success rate.
Auditing	Records every scaling activity and provides monitoring for scaling groups to help you diagnose issues quickly.

For more information, see Benefits.
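
The high-availability behavior in the table above can be pictured as a small reconciliation step. The following is a minimal sketch, not Auto Scaling's actual implementation: the `Instance` class, the `running` flag, and `plan_health_replacement` are hypothetical names chosen for illustration, and health is reduced to a single boolean.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    running: bool  # simplified health signal: True if the instance is in a running state


def plan_health_replacement(instances, desired_capacity):
    """Return (ids to remove, number of instances to create) to restore desired capacity."""
    unhealthy = [i.instance_id for i in instances if not i.running]
    healthy_count = len(instances) - len(unhealthy)
    # Create only as many instances as are needed to get back to the desired capacity.
    to_create = max(desired_capacity - healthy_count, 0)
    return unhealthy, to_create


group = [Instance("i-a", True), Instance("i-b", False), Instance("i-c", True)]
to_remove, to_create = plan_health_replacement(group, desired_capacity=3)
print(to_remove, to_create)
```

In this example, one of three instances is not running, so the plan removes that instance and creates one replacement.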

Features

Auto Scaling scales the number of ECS or ECI instances based on your configuration. It does not change the configuration of individual instances (CPU, memory, bandwidth). To adjust instance specifications, use CloudOps Orchestration Service (OOS).

Configure the following components to use Auto Scaling. A scaling group and an instance configuration source are required; all other components are optional.

Component	Description
Scaling group	A logical grouping of identical instances for similar business scenarios. Defines the instance type, minimum and maximum instance counts, and the associated Classic Load Balancer (CLB) or Application Load Balancer (ALB) server groups. Create multiple scaling groups for different application scenarios.
Instance configuration source	The template Auto Scaling uses to create instances during a scale-out event. ECS-type templates create ECS instances; ECI-type templates create ECI instances.
Scaling rule	Defines the scaling action, such as adding one ECS or ECI instance. Run a scaling rule manually, or trigger it from an event-triggered task or a scheduled task. Scaling rules can also dynamically adjust the minimum and maximum instance counts for a scaling group.
Event-triggered task	Monitors scaling group metrics in real time using CloudMonitor. When a metric meets the configured threshold, Auto Scaling executes the corresponding scaling rule.
Scheduled task	Executes a scaling rule at a specified time.
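
To make the relationship between a scaling group and a scaling rule concrete, here is a minimal sketch. It assumes that a rule's adjustment is bounded by the group's minimum and maximum instance counts. The class and function names are hypothetical and do not correspond to Auto Scaling API objects.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    min_size: int  # minimum instance count
    max_size: int  # maximum instance count
    current: int   # instances currently in the group


@dataclass
class ScalingRule:
    adjustment: int  # positive adds instances, negative removes them


def apply_rule(group, rule):
    """Apply a rule, keep the result within [min_size, max_size], and return the real change."""
    target = group.current + rule.adjustment
    target = max(group.min_size, min(group.max_size, target))
    delta = target - group.current
    group.current = target
    return delta


group = ScalingGroup(min_size=1, max_size=3, current=2)
add_one = ScalingRule(adjustment=1)
print(apply_rule(group, add_one))  # the group grows from 2 to 3
print(apply_rule(group, add_one))  # already at max_size, so nothing is added
```

An event-triggered task or a scheduled task would call `apply_rule` in this model; the rule itself only describes the action.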

The following figure shows the Auto Scaling workflow.

Auto Scaling also provides the following features.

Notifications

Auto Scaling can send notifications when a scaling activity succeeds, fails, or is rejected.

Rule	Description
Regular notification rule	Sends notifications by SMS, internal message, or email.
Advanced notification rule	Sends messages to CloudMonitor system events or Simple Message Queue (SMQ, formerly MNS). SMQ is a paid service that offers topic and queue models. For pricing, see Billing overview.

Instance management

Feature	Description
Lifecycle hook	Manages the lifecycle of ECS or ECI instances within a scaling group. During a scale-out or scale-in activity, a lifecycle hook places affected instances in a pending state, giving you time to perform custom operations before the hook times out and the activity continues.
Manual instance management	Adds or removes ECS instances, ECI instances, or managed instances manually.
Rolling update	Updates ECS instances in batches within an ECS-type scaling group. Applies to instances in the `In Service` state. Supported updates include image replacement, script execution, and OOS package installation.
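
The lifecycle hook behavior described above can be sketched as a small state transition. This is an illustrative model only: the state names `pending` and `continued`, the `HookedInstance` class, and the `advance` function are assumptions made for the example, not Auto Scaling identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class HookedInstance:
    instance_id: str
    state: str = "pending"  # the hook holds the instance here during the activity
    log: list = field(default_factory=list)


def advance(instance, elapsed_s, timeout_s, action_done):
    """Release a pending instance once the custom action finishes or the hook times out."""
    if instance.state != "pending":
        return instance.state
    if action_done:
        instance.log.append("custom action completed")
        instance.state = "continued"
    elif elapsed_s >= timeout_s:
        instance.log.append("hook timed out")
        instance.state = "continued"
    return instance.state


inst = HookedInstance("i-a")
print(advance(inst, elapsed_s=30, timeout_s=300, action_done=False))   # still waiting
print(advance(inst, elapsed_s=300, timeout_s=300, action_done=False))  # timeout reached
print(inst.log)
```

Either path ends the wait, so the scaling activity always continues. The timeout only bounds how long your custom operation can hold it.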

Use cases

Auto Scaling handles both predictable and unpredictable changes in traffic.

Predictable traffic patterns. A video streaming platform sees a weekly traffic surge every Friday at 8:00 PM when a popular show airs. Create a scheduled task to add one ECS or ECI instance at that time each week.
Unpredictable traffic patterns. A live streaming platform has traffic that is hard to forecast. Create an event-triggered task to add one ECS or ECI instance whenever CPU utilization exceeds 60%.

For more information, see Use cases.
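
The two use cases above reduce to two kinds of trigger conditions. The sketch below shows them as plain predicates. The function names are hypothetical, and in the real service these conditions are configured as a scheduled task and an event-triggered task, not written as code.

```python
from datetime import datetime


def scheduled_task_due(now):
    """Predictable pattern: fire every Friday at 8:00 PM."""
    # datetime.weekday() counts Monday as 0, so Friday is 4.
    return now.weekday() == 4 and now.hour == 20 and now.minute == 0


def event_task_fires(cpu_percent, threshold=60.0):
    """Unpredictable pattern: fire whenever CPU utilization exceeds the threshold."""
    return cpu_percent > threshold


print(scheduled_task_due(datetime(2024, 5, 3, 20, 0)))  # 3 May 2024 is a Friday
print(event_task_fires(72.5))
```

In both cases the trigger would run the same scaling rule, for example one that adds a single ECS or ECI instance.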

How it works

Auto Scaling executes a scaling activity based on the configured scaling mode to add or remove instances in a scaling group. For more information, see Working principle.

Billing

Auto Scaling itself is free. You pay only for the resources that Auto Scaling creates and manages, such as ECS instances, ECI instances, ApsaraDB RDS instances, Server Load Balancer (SLB) services (CLB instances, ALB server groups, or Network Load Balancer (NLB) server groups), and SMQ. For more information, see Billing overview.

Access Auto Scaling

- Console: The Auto Scaling console, a web interface for interactive management.
- API: An RPC-style API that supports GET and POST requests. See List of operations by function.
- Alibaba Cloud CLI: A command-line tool for scripting and automation across Alibaba Cloud services.
- OpenAPI Explorer: An online tool for searching operations, making test calls, and generating SDK sample code.
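
As a rough illustration of what "RPC-style" means, the sketch below encodes the same parameters either as a GET query string or as a POST form body. The endpoint, the `Version` value, the `DescribeScalingGroups` action, and the region ID are assumptions to check against the API reference. The request is deliberately unsigned, so it would be rejected by the service as written; real calls also need authentication and signature parameters, which the SDKs and the CLI add for you.

```python
from urllib.parse import urlencode

# Assumed values for illustration; confirm them in the API reference.
ENDPOINT = "https://ess.aliyuncs.com/"
API_VERSION = "2014-08-28"


def build_params(action, **operation_params):
    """Combine the operation name with its parameters into one flat mapping."""
    params = {"Action": action, "Version": API_VERSION, "Format": "JSON"}
    params.update(operation_params)
    return params


params = build_params("DescribeScalingGroups", RegionId="cn-hangzhou")
encoded = urlencode(sorted(params.items()))

get_url = ENDPOINT + "?" + encoded  # GET: parameters travel in the query string
post_body = encoded                 # POST: the same parameters travel in the form body
print(get_url)
```

The operation is selected by the `Action` parameter instead of by a URL path, which is why GET and POST requests can carry identical parameters.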

Related services

Service	Description
Elastic Compute Service (ECS)	An IaaS cloud computing service that provides on-demand compute resources.
Elastic Container Instance (ECI)	A serverless container service. Using ECI as a container runtime reduces O&M overhead and improves application elasticity.
ApsaraDB RDS	A managed relational database service. Auto Scaling associates and disassociates RDS instances during scale-out and scale-in.
Server Load Balancer (SLB)	Distributes incoming traffic across instances to eliminate single points of failure. Includes ALB, NLB, and CLB. Auto Scaling registers new instances with the load balancer during scale-out.
CloudMonitor	Monitors Alibaba Cloud resources and internet applications. Event-triggered tasks use CloudMonitor metrics to determine when to scale.
Simple Message Queue (SMQ, formerly MNS)	A lightweight message queue service. Used in advanced notification rules to receive scaling activity events.