When workloads fluctuate due to traffic spikes or hardware failures, manual scaling cannot respond fast enough. Target tracking scaling rules automatically adjust instance counts to maintain a CloudMonitor metric at a target value you specify.
How it works
A target tracking scaling rule links a CloudMonitor metric to a target value. Auto Scaling continuously monitors the metric, calculates the required number of instances, and adds or removes instances to keep the metric close to the target. Target tracking scaling rules build on simple scaling rules but remove the need for manual threshold tuning.
Auto Scaling supports four types of scaling rules: simple, step, predictive, and target tracking. For details, see Overview of scaling rules.
Auto-created event-triggered tasks
When you create a target tracking scaling rule, Auto Scaling automatically creates two event-triggered tasks:
| Task | Metric collection interval | Trigger condition |
|---|---|---|
| Progressive scale-out | Every 60 seconds | Threshold exceeded for 3 consecutive minutes |
| Conservative scale-in | Every 60 seconds | Threshold condition met for 15 consecutive minutes |
You can view, disable, and enable these tasks, but you cannot modify or delete them directly. To remove them, delete the target tracking scaling rule.
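The trigger conditions in the table can be pictured as a check over the most recent one-minute samples. The following is a minimal sketch, not the service's actual implementation: the function name `breached_for` is hypothetical, and the assumption that scale-in looks for samples below the threshold is ours rather than something the table states.

```python
from typing import Sequence

SCALE_OUT_CONSECUTIVE = 3   # from the table: 3 consecutive minutes
SCALE_IN_CONSECUTIVE = 15   # from the table: 15 consecutive minutes


def breached_for(samples: Sequence[float], threshold: float,
                 consecutive: int, above: bool = True) -> bool:
    """Return True if the last `consecutive` samples all breach the threshold.

    `samples` holds one value per 60-second collection interval.
    `above=False` is an assumed model of the scale-in direction.
    """
    if len(samples) < consecutive:
        return False
    recent = samples[-consecutive:]
    if above:
        return all(value > threshold for value in recent)
    return all(value < threshold for value in recent)


cpu = [55.0, 72.0, 81.0, 79.0]
print(breached_for(cpu, 70.0, SCALE_OUT_CONSECUTIVE))              # True
print(breached_for(cpu, 70.0, SCALE_IN_CONSECUTIVE, above=False))  # False
```

The longer scale-in window means a brief dip in load does not remove capacity, while a short sustained spike is enough to add it.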
Instance count calculation
Auto Scaling calculates the exact number of instances for each scaling activity based on metric history and target value:
- Scale-out: Rounds up. If 1.5 instances are needed, 2 instances are added.
- Scale-in: Rounds down. If 1.5 instances can be removed, only 1 instance is removed.
No scaling activity occurs if the metric data does not reach the threshold. If the calculated number of instances to remove is less than one, the alert continues to be reported but no scale-in is triggered.
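The rounding behavior described above can be sketched as follows. The function names are hypothetical, and the fractional input stands in for whatever value Auto Scaling derives from metric history, which this page does not specify.

```python
import math


def instances_to_add(required: float) -> int:
    """Scale-out rounds up: 1.5 required instances means 2 are added."""
    return math.ceil(required) if required > 0 else 0


def instances_to_remove(removable: float) -> int:
    """Scale-in rounds down; a result below one means no scale-in."""
    return math.floor(removable) if removable > 0 else 0


print(instances_to_add(1.5))     # 2
print(instances_to_remove(1.5))  # 1
print(instances_to_remove(0.6))  # 0
```

Rounding in opposite directions biases the group toward having slightly more capacity than strictly needed.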
Supported metrics
Metrics used for target tracking must represent instance workload and scale proportionally with the number of instances in the scaling group.
ECS-type scaling groups
| API metric | Console metric | Description |
|---|---|---|
| CpuUtilization | (ECS) Average CPU Utilization | Average CPU usage across instances |
| IntranetRx | (ECS) Average Inbound Internal Traffic | Average inbound traffic over the internal network |
| IntranetTx | (ECS) Average Outbound Internal Traffic | Average outbound traffic over the internal network |
| ClassicInternetRx / VpcInternetRx | (ECS) Average Inbound Public Traffic | Average inbound traffic from the Internet. Uses classic network or VPC based on the Network Type of the scaling group. |
| ClassicInternetTx / VpcInternetTx | (ECS) Average Outbound Public Traffic | Average outbound traffic to the Internet. Uses classic network or VPC based on the Network Type of the scaling group. |
| MemoryUtilization | (Agent) Memory | Memory usage collected by the CloudMonitor agent |
| LoadBalancerRealServerAverageQps | (ALB) QPS per Backend Server | Queries per second per backend server. Requires an ALB Server Group to be configured. |
Elastic Container Instance-type scaling groups
| API metric | Console metric | Description |
|---|---|---|
| EciPodCpuUtilization | CPU Utilization | Average CPU usage across pods |
| EciPodMemoryUtilization | Memory | Average memory usage across pods |
| LoadBalancerRealServerAverageQps | (ALB) QPS per Backend Server | Queries per second per backend server. Requires an ALB Server Group to be configured. |
Disable scale-in
By default, a target tracking scaling rule creates both scale-out and scale-in event-triggered tasks. Turn on Disable Scale-in to prevent the rule from removing instances.
When Disable Scale-in is on:
- Only a scale-out event-triggered task is created. No scale-in task is created.
- Use other methods to handle scale-in, such as a separate event-triggered task with a simple scaling rule.
When Disable Scale-in is off, both scale-out and scale-in tasks are created.
Configure Disable Scale-in through either method:
- Console: In the Create Scaling Rule dialog box, turn on the toggle next to Disable Scale-in.
- API: Set DisableScaleIn to true in the CreateScalingRule operation.
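For the API route, the request parameters might be assembled as in the sketch below. Only DisableScaleIn, the CreateScalingRule operation, and the CpuUtilization metric name come from this page. The other parameter names, the rule type string, and the group ID are assumptions for illustration and should be checked against the CreateScalingRule API reference. The helper `build_rule_params` is hypothetical and does not send any request.

```python
from typing import Any, Dict


def build_rule_params(scaling_group_id: str, metric_name: str,
                      target_value: float,
                      disable_scale_in: bool = False) -> Dict[str, Any]:
    """Assemble parameters for a CreateScalingRule call (assumed shape)."""
    return {
        "ScalingGroupId": scaling_group_id,              # assumed parameter name
        "ScalingRuleType": "TargetTrackingScalingRule",  # assumed value
        "MetricName": metric_name,                       # assumed parameter name
        "TargetValue": target_value,                     # assumed parameter name
        "DisableScaleIn": disable_scale_in,              # named on this page
    }


params = build_rule_params("asg-example", "CpuUtilization", 60.0,
                           disable_scale_in=True)
print(params["DisableScaleIn"])  # True
```

With DisableScaleIn set to true, the rule would only ever add instances, so scale-in has to be handled by another rule or task.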
Configure instance warmup time
Instance Warmup Time is the period a newly added instance needs before it can serve production traffic. During warmup, Auto Scaling does not collect metric data from the instance. This prevents premature scaling triggered by incomplete metric data.
Set the warmup period based on the time required to deploy services, pass Server Load Balancer (SLB) health checks, and begin reporting stable metrics.
Behavior during warmup
- The instance belongs to the scaling group but does not contribute to CloudMonitor metrics.
- Instances being warmed up are not counted as the base for scale-out calculations.
- During the warmup period, Auto Scaling rejects requests to execute scaling rules in the scaling group.
- After the warmup period expires, the instance begins reporting data to CloudMonitor and is counted as part of the scaling group capacity.
Example: A scaling group contains 2 instances. A scale-out adds 5 new instances with a 300-second warmup period. Until the warmup expires, only the original 2 instances are used as the base for any subsequent scale-out calculation.
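The example above can be sketched in code. This is an illustrative model only: the `Instance` class, the `scale_out_base` function, and the timestamps are hypothetical, and the real service tracks warmup state internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    instance_id: str
    added_at: float  # seconds since an arbitrary reference point


def scale_out_base(instances: List[Instance], now: float,
                   warmup_seconds: float) -> int:
    """Count only instances whose warmup period has expired."""
    return sum(1 for inst in instances
               if now - inst.added_at >= warmup_seconds)


# 2 original instances, then 5 new ones added at t=1000 with 300 s warmup.
group = [Instance(f"old-{i}", 0.0) for i in range(2)]
group += [Instance(f"new-{i}", 1000.0) for i in range(5)]

print(scale_out_base(group, now=1200.0, warmup_seconds=300.0))  # 2
print(scale_out_base(group, now=1300.0, warmup_seconds=300.0))  # 7
```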
Scale-in protection during warmup
Because metric data lags behind instance changes, a scale-in could otherwise remove instances that are still warming up. To prevent this, Auto Scaling applies a default cooldown time to scale-in activities.
Create a target tracking scaling rule
Create a target tracking scaling rule through the Auto Scaling console or by calling the API:
- Console: Navigate to the scaling group, then create a scaling rule. For step-by-step instructions, see Manage scaling rules.
- API: Call the CreateScalingRule operation.
Limits
- Each metric can only be assigned to one target tracking scaling rule per scaling group.
- If the scaling group has a small number of instances, the metric value may differ significantly from the target value because each instance has a large impact on the aggregate metric.
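The small-group effect in the second limit can be illustrated with some arithmetic. The sketch below assumes a simplified model that this page does not state: total load stays constant and is shared evenly, so the average metric is inversely proportional to the instance count. The function `project_scale_out` is hypothetical.

```python
import math
from typing import Tuple


def project_scale_out(current_instances: int, current_metric: float,
                      target: float) -> Tuple[int, float]:
    """Return (instances needed, resulting average metric).

    Assumes constant total load split evenly across instances.
    """
    total_load = current_instances * current_metric
    needed = math.ceil(total_load / target)  # scale-out rounds up
    return needed, total_load / needed


small = project_scale_out(2, 50.0, 40.0)
large = project_scale_out(20, 50.0, 40.0)
print(small[0], round(small[1], 1))  # 3 33.3
print(large[0], round(large[1], 1))  # 25 40.0
```

Under this model, the 2-instance group lands well below the 40% target after rounding up to 3 instances, while the 20-instance group can hit the target exactly.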
Target tracking vs. simple scaling rules
| Aspect | Simple scaling rule | Target tracking scaling rule |
|---|---|---|
| User involvement | High. Requires manual rule creation, threshold monitoring, and scaling activity management. | Low. Specify only the metric and target value. Auto Scaling handles the rest. |
| Adjustment precision | Fixed. Rules are based on static thresholds with no dynamic adaptation. | Dynamic. Auto Scaling calculates the exact instance count based on metric history. |
| Metric reliability | No warmup support. Newly added instances may report incomplete metrics, causing false alerts. | Warmup support. Metrics from new instances are excluded until warmup completes. |
| Scaling stability | Prone to oscillation. Separate scale-out and scale-in rules may conflict. | Stable. Auto Scaling calculates a target range based on metric history, reducing unnecessary scaling activities. |
| Data latency handling | Metric changes lag behind instance count changes. Alerts may continue after scaling, triggering unnecessary activities. | Auto Scaling accounts for data latency by using conservative scale-in and progressive scale-out policies. |