Elastic Container Instance: Use Auto Scaling to automatically scale elastic container instances in or out

Last Updated: Jun 16, 2026

Auto Scaling adjusts the number of elastic container instances based on workloads and scaling policies, so your application always has enough capacity without paying for idle resources.

How it works

Group elastic container instances running the same service into a scaling group. Set a minimum instance count to guarantee baseline capacity and a maximum to cap costs. Auto Scaling then adds or removes instances in response to workload changes — either on a schedule or based on live metrics.
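
The minimum and maximum act as hard bounds on any requested capacity. The following is a minimal sketch of that behavior, not the service's implementation: `ScalingGroup` and `apply` are hypothetical names for an in-memory model, and the sizes are illustrative.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    """Hypothetical in-memory model of a scaling group (not an Auto Scaling API object)."""
    min_size: int
    max_size: int
    current: int

    def apply(self, desired: int) -> int:
        """Clamp the requested instance count to [min_size, max_size] and record it."""
        self.current = max(self.min_size, min(self.max_size, desired))
        return self.current


group = ScalingGroup(min_size=10, max_size=20, current=10)
print(group.apply(25))  # 20: a request above the maximum is capped
print(group.apply(3))   # 10: a request below the minimum is raised
```

Whatever a schedule or metric asks for, the resulting count stays inside the configured range.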

Use cases

Choose your scaling approach based on whether your traffic patterns are predictable:

  • Predictable traffic: Use scheduled tasks to scale out before the peak starts and scale in after it ends. For example, a gaming platform that sees a sharp traffic increase from 18:00 to 24:00 every day can schedule extra instances shortly before 18:00 and remove them after midnight.

  • Unpredictable traffic — Use an event-triggered task to monitor a metric and scale automatically. For example, a video streaming platform with variable traffic can track CPU utilization and maintain it at 60% by scaling out when utilization climbs and scaling in when it drops.

Set up Auto Scaling for elastic container instances

Setup overview

The following figure shows the setup flow:

[Figure: setup flow for Auto Scaling with elastic container instances]

Note: If you set Instance Configuration Source to Select Existing Instance when you create the scaling group, Auto Scaling automatically creates and enables a scaling configuration. In that case, skip Step 2.

Step 1: Create a scaling group

A scaling group manages elastic container instances designed for the same workload. When creating one, set:

  • The minimum number of instances — keeps baseline capacity running at all times

  • The maximum number of instances — prevents runaway costs during traffic spikes

  • The instance template and removal policy

For details, see Create a scaling group.

Step 2: Create a scaling configuration

Auto Scaling uses the scaling configuration to create new elastic container instances when scaling out. Specify the instance type, image, and other runtime settings.

Tip: Container images can be large, and pulling them at startup slows instance creation. Enable automatic image cache matching in the scaling configuration to speed up scale-out.

For details, see Create a scaling configuration of the Elastic Container Instance type.

Step 3: Enable the scaling group

Scaling events occur only in scaling groups that are in the Enabled state. When the scaling group has an active scaling configuration, the console prompts you to enable the group. Alternatively, enable the group manually on the Scaling Groups page.

For details, see Enable a scaling group.

Step 4: Create a scaling rule

Scaling rules define how the scaling group adjusts capacity. Choose the rule type based on your use case:

| Rule type | How it works | Use when |
|---|---|---|
| Simple scaling rule | Adds or removes a specified number of instances, or sets the instance count to a fixed value | You know the exact instance count for peak and off-peak periods |
| Target tracking scaling rule | Automatically adjusts capacity to keep a metric, such as CPU utilization, close to the target value. Think of it like a thermostat: set your target, and the system does the rest. | Traffic is unpredictable and you want metric-driven automation |

For details, see Create a scaling rule.
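
To make the three adjustment modes of a simple scaling rule concrete, here is a small sketch. `Adjustment` and `apply_simple_rule` are hypothetical names chosen for illustration; they are not Auto Scaling API identifiers, and the function only computes the requested count without applying group limits.

```python
from enum import Enum


class Adjustment(Enum):
    """Hypothetical names for the three adjustment modes of a simple scaling rule."""
    ADD = "add"
    REMOVE = "remove"
    SET = "set"


def apply_simple_rule(current: int, mode: Adjustment, value: int) -> int:
    """Return the instance count a simple scaling rule would request."""
    if mode is Adjustment.ADD:
        return current + value
    if mode is Adjustment.REMOVE:
        return max(0, current - value)  # a count cannot go below zero
    return value  # SET: jump straight to the fixed count


print(apply_simple_rule(10, Adjustment.ADD, 5))     # 15
print(apply_simple_rule(15, Adjustment.REMOVE, 5))  # 10
print(apply_simple_rule(12, Adjustment.SET, 15))    # 15
```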

Step 5: Execute the scaling rule

Trigger scaling rules in one of three ways:

| Method | When to use |
|---|---|
| Manual execution | Addressing one-off or temporary capacity needs |
| Scheduled task | Workload fluctuations that follow a predictable pattern: scale out before the peak, scale in after |
| Event-triggered task | Workload fluctuations that are unpredictable: the system monitors a metric and triggers the rule automatically |

For details, see Execute a scaling rule. For scheduled tasks, see Scheduled tasks overview. For event-triggered tasks, see Event-triggered tasks overview.
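
The three methods differ only in what decides that a rule should run. The sketch below models that idea with plain functions. All names (`add_two`, `manual`, `scheduled`, `event_triggered`) and the threshold comparison are assumptions made for illustration, not the service's actual trigger logic.

```python
from datetime import time
from typing import Callable

Rule = Callable[[int], int]  # takes the current count, returns the new count


def add_two(current: int) -> int:
    """Example rule: add two instances."""
    return current + 2


def manual(rule: Rule, current: int) -> int:
    """Manual execution: an operator runs the rule immediately."""
    return rule(current)


def scheduled(rule: Rule, current: int, now: time, run_at: time) -> int:
    """Scheduled task: run the rule only in the scheduled minute."""
    due = (now.hour, now.minute) == (run_at.hour, run_at.minute)
    return rule(current) if due else current


def event_triggered(rule: Rule, current: int, metric: float, threshold: float) -> int:
    """Event-triggered task: run the rule only when the metric exceeds the threshold."""
    return rule(current) if metric > threshold else current


print(manual(add_two, 10))                                        # 12
print(scheduled(add_two, 10, time(17, 55), time(17, 55)))         # 12
print(event_triggered(add_two, 10, metric=45.0, threshold=60.0))  # 10
```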

Configuration examples

Scenario 1: Scheduled scaling for predictable peak traffic

Situation: Your application needs 10 instances for normal operations, but traffic spikes from 18:00 to 23:00 every day and requires 15 instances.

| Step | Configuration |
|---|---|
| Create a scaling group | Type: elastic container instance; minimum: 10; maximum: 20 |
| Create a scaling configuration | Specify the instance configuration source |
| Enable the scaling group | The system creates 10 instances when the group is enabled |
| Create two simple scaling rules | Rule 1: set instance count to 15; Rule 2: set instance count to 10 |
| Create two scheduled tasks | Task 1: run Rule 1 at 17:55 every day; Task 2: run Rule 2 at 23:05 every day |

Result: 10 instances handle traffic during off-peak hours. The count increases to 15 at 17:55, shortly before the peak starts, and returns to 10 at 23:05, shortly after it ends.
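
The schedule in this scenario can be checked with a short simulation. This is a sketch under the assumption that both daily tasks run on time every day; `TASKS` and `instances_at` are hypothetical names, and the values come from the configuration above.

```python
from datetime import time

MIN_SIZE, MAX_SIZE = 10, 20
# (run time, instance count set by the rule) for the two scheduled tasks
TASKS = [(time(17, 55), 15), (time(23, 5), 10)]


def instances_at(now: time) -> int:
    """Return the instance count in effect at `now` on a typical day."""
    ordered = sorted(TASKS)
    # Before the first task of the day, yesterday's last task is still in effect.
    count = ordered[-1][1]
    for run_at, target in ordered:
        if now >= run_at:
            count = target
    # Keep the result within the scaling group limits.
    return max(MIN_SIZE, min(MAX_SIZE, count))


print(instances_at(time(12, 0)))   # 10
print(instances_at(time(18, 0)))   # 15
print(instances_at(time(23, 30)))  # 10
```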

Scenario 2: Metric-based scaling for unpredictable traffic

Situation: Your application needs 10 instances as a baseline, but you cannot predict when or how much traffic will spike.

| Step | Configuration |
|---|---|
| Create a scaling group | Type: elastic container instance; minimum: 10; maximum: 30 |
| Create a scaling configuration | Specify the instance configuration source |
| Enable the scaling group | The system creates 10 instances when the group is enabled |
| Create a target tracking scaling rule | Monitoring metric: CPU utilization; target value: 60%; other parameters: defaults |
| Verify the event-triggered task | After you create the target tracking scaling rule, the system automatically creates an event-triggered task. Check it to confirm the scaling trigger conditions. |

Result: The system continuously monitors CPU utilization and scales in or out to keep it close to 60%, while keeping the instance count between 10 and 30.
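
To give a feel for how a 60% target interacts with the group limits, the sketch below uses a simple proportional estimate. This is an assumption for illustration only: this topic does not describe the algorithm Auto Scaling actually uses, and `desired_capacity` is a hypothetical helper.

```python
import math

MIN_SIZE, MAX_SIZE = 10, 30
TARGET_CPU = 60.0  # percent


def desired_capacity(current: int, cpu: float) -> int:
    """Estimate the instance count that would bring average CPU back to the target.

    Assumption: load spreads evenly, so average CPU scales inversely with the count.
    """
    estimate = math.ceil(current * cpu / TARGET_CPU)
    # The scaling group limits always win over the estimate.
    return max(MIN_SIZE, min(MAX_SIZE, estimate))


print(desired_capacity(10, 90.0))  # 15: scale out to relieve 90% CPU
print(desired_capacity(15, 20.0))  # 10: scale in, but not below the minimum
print(desired_capacity(20, 95.0))  # 30: scale out, capped at the maximum
```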