Auto Scaling: Target tracking scaling rules

Last Updated: Feb 28, 2026

When workloads fluctuate due to traffic spikes or hardware failures, manual scaling cannot respond fast enough. Target tracking scaling rules automatically adjust instance counts to maintain a CloudMonitor metric at a target value you specify.

How it works

A target tracking scaling rule links a CloudMonitor metric to a target value. Auto Scaling continuously monitors the metric, calculates the required number of instances, and adds or removes instances to keep the metric close to the target. Target tracking scaling rules build on simple scaling rules but remove the need for manual threshold tuning.

Note

Auto Scaling supports four types of scaling rules: simple, step, predictive, and target tracking. For details, see Overview of scaling rules.

Auto-created event-triggered tasks

When you create a target tracking scaling rule, Auto Scaling automatically creates two event-triggered tasks:

| Task | Metric collection interval | Trigger condition |
| --- | --- | --- |
| Progressive scale-out | Every 60 seconds | Threshold exceeded for 3 consecutive minutes |
| Conservative scale-in | Every 60 seconds | Threshold exceeded for 15 consecutive minutes |

These tasks cannot be modified or deleted directly. To remove them, delete the target tracking scaling rule. You can view, disable, and enable these tasks.
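
The trigger conditions can be pictured as a check over the most recent per-minute samples. The following is a minimal sketch, not the actual CloudMonitor evaluation logic: the function name `alarm_fires`, the sample values, and the choice to compare directly against a target of 50 (above it for scale-out, below it for scale-in) are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def alarm_fires(
    samples: Sequence[float],
    breaches: Callable[[float], bool],
    required: int,
) -> bool:
    """Return True when the most recent `required` samples all breach."""
    if len(samples) < required:
        return False
    return all(breaches(value) for value in samples[-required:])


target = 50.0                   # hypothetical target CPU utilization, in percent
cpu = [48.0, 55.0, 61.0, 58.0]  # hypothetical samples, one per 60-second interval

# Assumed comparison directions: above target for scale-out, below for scale-in.
scale_out = alarm_fires(cpu, lambda v: v > target, required=3)
scale_in = alarm_fires(cpu, lambda v: v < target, required=15)
print(scale_out, scale_in)  # True False
```

Requiring 15 consecutive samples for scale-in but only 3 for scale-out is what makes scale-in the more conservative of the two tasks.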

Instance count calculation

Auto Scaling calculates the exact number of instances for each scaling activity based on metric history and target value:

  • Scale-out: Rounds up. If 1.5 instances are needed, 2 instances are added.

  • Scale-in: Rounds down. If 1.5 instances can be removed, only 1 instance is removed.

No scaling activity occurs if the metric data does not reach the alert threshold. If the calculated number of instances to remove is less than one, which rounds down to zero, alerts continue to be reported but no scale-in is triggered.
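
The rounding rules above can be sketched in a few lines. The function names are hypothetical, and the fractional inputs stand in for whatever value Auto Scaling derives internally from metric history; only the rounding direction is taken from this page.

```python
import math


def scale_out_count(needed: float) -> int:
    """Instances to add: a fractional requirement rounds up."""
    return math.ceil(needed)


def scale_in_count(removable: float) -> int:
    """Instances to remove: a fractional surplus rounds down."""
    return math.floor(removable)


print(scale_out_count(1.5))  # 2
print(scale_in_count(1.5))   # 1
print(scale_in_count(0.7))   # 0, so no scale-in is triggered
```

Rounding in opposite directions biases the group toward slightly more capacity than strictly required.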

Supported metrics

Metrics used for target tracking must represent instance workload and scale proportionally with the number of instances in the scaling group.

ECS-type scaling groups

| API metric | Console metric | Description |
| --- | --- | --- |
| CpuUtilization | (ECS) Average CPU Utilization | Average CPU usage across instances |
| IntranetRx | (ECS) Average Inbound Internal Traffic | Average inbound traffic over the internal network |
| IntranetTx | (ECS) Average Outbound Internal Traffic | Average outbound traffic over the internal network |
| ClassicInternetRx / VpcInternetRx | (ECS) Average Inbound Public Traffic | Average inbound traffic from the Internet. Uses classic network or VPC based on the Network Type of the scaling group. |
| ClassicInternetTx / VpcInternetTx | (ECS) Average Outbound Public Traffic | Average outbound traffic to the Internet. Uses classic network or VPC based on the Network Type of the scaling group. |
| MemoryUtilization | (Agent) Memory | Memory usage collected by the CloudMonitor agent |
| LoadBalancerRealServerAverageQps | (ALB) QPS per Backend Server | Queries per second per backend server. Requires an ALB Server Group to be configured. |

Elastic Container Instance-type scaling groups

| API metric | Console metric | Description |
| --- | --- | --- |
| EciPodCpuUtilization | CPU Utilization | Average CPU usage across pods |
| EciPodMemoryUtilization | Memory | Average memory usage across pods |
| LoadBalancerRealServerAverageQps | (ALB) QPS per Backend Server | Queries per second per backend server. Requires an ALB Server Group to be configured. |

Disable scale-in

By default, a target tracking scaling rule creates both scale-out and scale-in event-triggered tasks. Turn on Disable Scale-in to prevent the rule from removing instances.

When Disable Scale-in is on:

  • Only a scale-out event-triggered task is created. No scale-in task is created.

  • Use other methods to handle scale-in, such as a separate event-triggered task with a simple scaling rule.

When Disable Scale-in is off, both scale-out and scale-in tasks are created.

Configure Disable Scale-in through either method:

  • Console: In the Create Scaling Rule dialog box, turn on the toggle next to Disable Scale-in.

  • API: Set DisableScaleIn to true in the CreateScalingRule operation.
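
As an illustration of the API method, the sketch below assembles request parameters as a plain dictionary without calling any SDK. Only the CreateScalingRule operation and the DisableScaleIn parameter come from this page. The other parameter names (ScalingGroupId, ScalingRuleType, MetricName, TargetValue), the helper name, and the ID `asg-example` are assumptions that should be checked against the CreateScalingRule API reference.

```python
from typing import Any, Dict


def build_create_scaling_rule_params(
    scaling_group_id: str,
    metric_name: str,
    target_value: float,
    disable_scale_in: bool = False,
) -> Dict[str, Any]:
    """Assemble parameters for a target tracking rule (names partly assumed)."""
    return {
        "Action": "CreateScalingRule",
        "ScalingGroupId": scaling_group_id,              # assumed parameter name
        "ScalingRuleType": "TargetTrackingScalingRule",  # assumed value
        "MetricName": metric_name,                       # assumed parameter name
        "TargetValue": target_value,                     # assumed parameter name
        "DisableScaleIn": disable_scale_in,              # from this page
    }


params = build_create_scaling_rule_params(
    scaling_group_id="asg-example",  # hypothetical scaling group ID
    metric_name="CpuUtilization",
    target_value=50.0,
    disable_scale_in=True,
)
print(params["DisableScaleIn"])  # True
```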

Configure instance warmup time

Instance Warmup Time is the period a newly added instance needs before it can serve production traffic. During warmup, Auto Scaling does not collect metric data from the instance. This prevents premature scaling triggered by incomplete metric data.

Set the warmup period based on the time required to deploy services, pass Server Load Balancer (SLB) health checks, and begin reporting stable metrics.

Behavior during warmup

  • The instance belongs to the scaling group but does not contribute to CloudMonitor metrics.

  • Instances that are still warming up are excluded from the base instance count used in scale-out calculations.

  • During the warmup period, Auto Scaling rejects requests to execute scaling rules in the scaling group.

  • After the warmup period expires, the instance begins reporting data to CloudMonitor and is counted as part of the scaling group capacity.

Example: A scaling group contains 2 instances. A scale-out adds 5 new instances with a 300-second warmup period. Until the warmup expires, only the original 2 instances are used as the base for any subsequent scale-out calculation.
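
The example above can be reproduced with a small model. This is a sketch under stated assumptions: the `Instance` class, the `scale_out_base` helper, the instance IDs, and the use of plain seconds as a clock are all hypothetical and do not reflect how Auto Scaling tracks warmup internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    instance_id: str
    joined_at: float  # time the instance joined the group, in seconds


def scale_out_base(
    instances: List[Instance], now: float, warmup_seconds: float
) -> List[Instance]:
    """Return only the instances whose warmup period has expired."""
    return [i for i in instances if now - i.joined_at >= warmup_seconds]


# Two original instances that joined long ago, plus five added at t = 0.
original = [Instance("i-a", joined_at=-3600.0), Instance("i-b", joined_at=-3600.0)]
added = [Instance(f"i-new{n}", joined_at=0.0) for n in range(5)]
group = original + added

print(len(scale_out_base(group, now=120.0, warmup_seconds=300)))  # 2
print(len(scale_out_base(group, now=300.0, warmup_seconds=300)))  # 7
```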

Scale-in protection during warmup

Because metric data lags behind changes in instance count, Auto Scaling applies an appropriate default cooldown time to scale-in activities. This prevents the removal of instances that are still warming up.

Create a target tracking scaling rule

Create a target tracking scaling rule in the Auto Scaling console or by calling the CreateScalingRule API operation.

Limits

  • Each metric can only be assigned to one target tracking scaling rule per scaling group.

  • If the scaling group has a small number of instances, the metric value may differ significantly from the target value because each instance has a large impact on the aggregate metric.
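
The effect of group size on how closely the metric can track the target is easiest to see with arithmetic. The sketch below assumes a simplified model in which total load stays constant and spreads evenly across instances, so the average metric is inversely proportional to the instance count. The function name and the numbers are hypothetical.

```python
import math
from typing import Tuple


def after_scale_out(count: int, metric: float, target: float) -> Tuple[int, float]:
    """Return (new instance count, resulting average metric) after a scale-out."""
    total_load = count * metric                  # assumed constant during scaling
    new_count = math.ceil(total_load / target)   # scale-out rounds up
    return new_count, total_load / new_count


# Same 60% utilization and 50% target, different group sizes.
print(after_scale_out(2, 60.0, 50.0))   # (3, 40.0)
print(after_scale_out(20, 60.0, 50.0))  # (24, 50.0)
```

Under this model, the 2-instance group settles 10 percentage points below the target because a single whole instance is a large share of capacity, while the 20-instance group lands on the target.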

Target tracking vs. simple scaling rules

| Aspect | Simple scaling rule | Target tracking scaling rule |
| --- | --- | --- |
| User involvement | High. Requires manual rule creation, threshold monitoring, and scaling activity management. | Low. Specify only the metric and target value. Auto Scaling handles the rest. |
| Adjustment precision | Fixed. Rules are based on static thresholds with no dynamic adaptation. | Dynamic. Auto Scaling calculates the exact instance count based on metric history. |
| Metric reliability | No warmup support. Newly added instances may report incomplete metrics, causing false alerts. | Warmup support. Metrics from new instances are excluded until warmup completes. |
| Scaling stability | Prone to oscillation. Separate scale-out and scale-in rules may conflict. | Stable. Auto Scaling calculates a target range based on metric history, reducing unnecessary scaling activities. |
| Data latency handling | Metric changes lag behind instance count changes. Alerts may continue after scaling, triggering unnecessary activities. | Auto Scaling accounts for data latency by using conservative scale-in and progressive scale-out policies. |