Auto scaling for applications in ECS clusters - Enterprise Distributed Application Service

When traffic to your application fluctuates, fixed instance counts lead to either wasted resources during low-traffic periods or degraded performance during traffic spikes. Auto scaling in Enterprise Distributed Application Service (EDAS) monitors instance-level metrics and automatically adds or removes instances to match demand, keeping response times stable while minimizing idle resource costs. Built on the traffic flood management technology that Alibaba Group uses during Double 11, EDAS provides auto scaling within seconds.

How auto scaling works

Auto scaling applies to applications deployed in ECS clusters. EDAS evaluates the following metrics once per minute and triggers a scale-out or scale-in event when a threshold stays breached for the duration you configure.

Metric	What it measures	Value format
CPU	CPU utilization of the instance	Percentage
RT	Response time per request	Milliseconds
Load	System load on the instance	Positive integer

Scaling behavior:

Single-instance environments -- Auto scaling ensures that one instance keeps running.
Multi-instance environments -- EDAS adds or removes instances based on the scaling rules and metric values you configure.

You configure separate rules for scale-out and scale-in. Each rule specifies which metrics to watch, how long a threshold must be breached before action is taken, and how many instances to add or remove.

Note

When you configure both scale-in and scale-out rules, make sure that the metric values of the scale-in rules are not greater than the metric values of the scale-out rules. Otherwise, an error occurs when you click Save.
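
The check described in this note can be reproduced before you save, for example in a script that reviews planned settings. The following is a minimal sketch, not the console's own validation. The function name and the dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List


def find_threshold_conflicts(scale_in: Dict[str, float],
                             scale_out: Dict[str, float]) -> List[str]:
    """Return the metrics whose scale-in value is greater than the scale-out value.

    Hypothetical helper: both arguments map a metric name ("cpu", "rt", "load")
    to the threshold planned for that rule. Equal values are accepted, because
    the note only forbids scale-in values that are greater.
    """
    return sorted(
        metric
        for metric, in_value in scale_in.items()
        if metric in scale_out and in_value > scale_out[metric]
    )


conflicts = find_threshold_conflicts(
    {"cpu": 30, "rt": 600, "load": 2},
    {"cpu": 80, "rt": 500, "load": 2},
)
print(conflicts)
```

An empty list means the two rule sets are consistent with the note. Any metric in the list needs a lower scale-in value or a higher scale-out value.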

Note

During a scale-in, instances created from elastic resources are released first.
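
The release order in this note can be pictured as a stable sort that puts elastic instances ahead of the others. This is a sketch under assumptions: the record layout (an "id" and an "elastic" flag) is hypothetical, and the order among instances of the same kind is not specified by EDAS, so the sketch simply keeps the input order.

```python
from typing import Dict, List


def pick_instances_to_release(instances: List[Dict], count: int) -> List[str]:
    """Return the IDs of the instances to release, elastic ones first.

    Hypothetical helper. sorted() is stable, and False sorts before True,
    so the key "not elastic" moves elastic instances to the front while
    preserving the original order inside each group.
    """
    ordered = sorted(instances, key=lambda inst: not inst["elastic"])
    return [inst["id"] for inst in ordered[:max(count, 0)]]


fleet = [
    {"id": "a", "elastic": False},
    {"id": "b", "elastic": True},
    {"id": "c", "elastic": False},
    {"id": "d", "elastic": True},
]
print(pick_instances_to_release(fleet, 3))
```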

Configure auto scaling rules

The following steps walk through scale-out rule configuration. Configure scale-in rules the same way, with lower threshold values.

Log on to the EDAS console.
In the left-side navigation pane, choose Application Management > Applications.
In the top navigation bar, select the region where your application resides. In the upper part of the Applications page, select the microservices namespace from the Microservices Namespace drop-down list, then click the application name.
In the left-side navigation pane of the application details page, click Auto Scaling.
In the upper-right corner of the Scale-out Rule section, turn on the switch to enable the rule.
Configure the rule parameters described in the following tables, then click Save in the lower-left corner of the Auto Scaling page.

Trigger metrics and conditions

Set thresholds for CPU, RT, and Load under Trigger Metrics, then choose when scaling triggers.

Parameter	Description
Trigger Conditions	Select Any One of the Metrics to trigger scaling when any single metric breaches its threshold, or All Metrics to trigger only when all metrics breach their thresholds simultaneously.
Last for More Than	Duration (in minutes) that the average metric value must exceed the threshold before scaling triggers. EDAS evaluates metrics once per minute. Set a shorter duration for latency-sensitive services and a longer duration for workloads that tolerate brief spikes.
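
To see how Trigger Conditions and Last for More Than interact, consider the following sketch. It is an illustration under stated assumptions, not the EDAS implementation: each sample is assumed to be the per-minute average of a metric, a breach is assumed to mean that every sample in the most recent window is above its threshold, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    """One per-minute reading of the three metrics (hypothetical structure)."""
    cpu: float   # CPU utilization, percent
    rt: float    # response time, milliseconds
    load: float  # system load


def should_scale_out(samples: List[Sample],
                     thresholds: Dict[str, float],
                     duration_minutes: int,
                     require_all: bool) -> bool:
    """Decide whether a scale-out would trigger for the most recent window.

    require_all=False models "Any One of the Metrics";
    require_all=True models "All Metrics".
    """
    if duration_minutes <= 0 or len(samples) < duration_minutes or not thresholds:
        return False  # not enough history to call the breach sustained
    window = samples[-duration_minutes:]
    # A metric counts as breached only if it stays above its threshold
    # for every minute in the window (assumption, see lead-in).
    breached = [
        all(getattr(sample, metric) > limit for sample in window)
        for metric, limit in thresholds.items()
    ]
    return all(breached) if require_all else any(breached)


history = [Sample(90, 100, 1), Sample(85, 120, 1), Sample(95, 110, 1)]
limits = {"cpu": 80, "rt": 500, "load": 4}
print(should_scale_out(history, limits, 3, require_all=False))
print(should_scale_out(history, limits, 3, require_all=True))
```

In this example only CPU stays above its threshold for three minutes, so the rule fires under Any One of the Metrics but not under All Metrics.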

Application source

Select where new instances come from during a scale-out.

Option	Behavior
Existing Resources	Uses idle ECS instances in the cluster. If fewer idle instances are available than required, EDAS scales out with as many as are available.
Elastic Resources	Purchases new ECS instances based on existing specifications or a launch template, then adds them to the cluster. Configure the elastic resource parameters in the table below.
Existing Resources First	Uses idle instances first. If idle instances are not enough, purchases additional instances automatically.
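
The three options differ only in how a request for new instances is split between idle instances and purchased ones. The following sketch models that split. The enum and function names are hypothetical and the code does not call any EDAS or ECS API.

```python
from enum import Enum
from typing import Tuple


class Source(Enum):
    EXISTING = "Existing Resources"
    ELASTIC = "Elastic Resources"
    EXISTING_FIRST = "Existing Resources First"


def plan_scale_out(requested: int, idle: int, source: Source) -> Tuple[int, int]:
    """Return (instances taken from the idle pool, instances to purchase)."""
    if requested < 0 or idle < 0:
        raise ValueError("requested and idle must be non-negative")
    if source is Source.EXISTING:
        # Never purchases: scales out with as many idle instances as exist.
        return min(requested, idle), 0
    if source is Source.ELASTIC:
        # Always purchases, regardless of idle instances.
        return 0, requested
    # EXISTING_FIRST: use idle instances, then purchase the shortfall.
    from_idle = min(requested, idle)
    return from_idle, requested - from_idle


for option in Source:
    print(option.value, plan_scale_out(3, 1, option))
```

With three instances requested and one idle instance in the cluster, Existing Resources adds only one instance, while the other two options add all three.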

Elastic resource parameters

When Elastic Resources or Existing Resources First is selected as the application source, configure the following parameters:

Parameter	Description
Creation Method	Purchase Based on Existing Instance Specifications: Purchases instances that match an existing specification in the cluster. Purchase Based on Instance Launch Template: Purchases instances based on a launch template created in the ECS console.
Template Host or Launch Template	Appears based on your selected creation method. Select an existing instance specification set (for Template Host) or a launch template (for Launch Template).
Password	Required when you select Purchase Based on Existing Instance Specifications. Select a logon key pair for the new instances.
Terms of Service	Select Elastic Compute Service Terms of Service | Terms of Service for Images to proceed.

Instance limits

Parameter	Description
Number of Instances to Add for Each Scale-Out	The number of instances added each time a scale-out triggers.
Maximum Number of Instances in Group	The upper limit on total instances. No scale-out triggers after this limit is reached. Set this based on your resource budget.
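
The two limits combine into a simple cap on each scale-out. The sketch below assumes that when the configured step would exceed the maximum, only the remaining headroom is added. The source states only that no scale-out triggers once the limit is reached, so this partial-step behavior and the function name are assumptions.

```python
def instances_to_add(current: int, step: int, maximum: int) -> int:
    """Return how many instances one scale-out would add.

    current: instances in the group now
    step:    Number of Instances to Add for Each Scale-Out
    maximum: Maximum Number of Instances in Group
    """
    if current >= maximum:
        return 0  # limit reached: no scale-out triggers
    # Assumption: a step larger than the headroom is trimmed to fit.
    return min(step, maximum - current)


print(instances_to_add(current=4, step=2, maximum=10))
print(instances_to_add(current=9, step=2, maximum=10))
print(instances_to_add(current=10, step=2, maximum=10))
```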

Verify auto scaling results

After a scale-out or scale-in triggers, confirm that the rule took effect:

Check the instance count -- Open the basic information page of your application. Verify that the number of running instances has increased (scale-out) or decreased (scale-in) as expected.
Check the change record -- In the left-side navigation pane of the application details page, click Change Records. Look for a record with the following values:

Field	Expected value
Change type	Scale Out or Scale in Application
Source	auto_scale

Click View in the Actions column to see the full change details.

If the instance count does not change after the configured duration, verify that your metric thresholds and Last for More Than values match your actual workload. If the application source is Existing Resources, also confirm that the cluster has enough idle instances for the scale-out.