Server Load Balancer: Automatically scale backend servers for an ALB

Last Updated: Mar 26, 2026

You can associate an Auto Scaling scaling group with an Application Load Balancer (ALB) server group and use event-triggered tasks to automatically adjust the number of ECS instances in the server group. This improves the availability of the services behind your ALB instance and reduces resource costs.

Scenario example

Consider a news website that experiences a sudden traffic surge after publishing a major story. The servers become overloaded, and users cannot load the page. After interest in the story fades, traffic returns to normal, and the page becomes accessible again. The unpredictable nature of these traffic spikes makes it difficult to manually adjust the number of instances in time or determine the right number of instances to add.

To address this scenario, you can use event-triggered tasks in Auto Scaling. An event-triggered task watches a CloudMonitor metric, such as CPU utilization, and runs a scaling rule when the metric reaches a threshold that you specify. After you associate a scaling group with an ALB server group, any change to the number of ECS instances in the scaling group, whether automatic or manual, is synchronized to the ALB server group. The ALB instance then distributes traffic to these ECS instances based on its traffic distribution and health check policies, which improves resource availability and elasticity.

This topic uses the following solution as an example. An ALB instance distributes client requests to the backend server ECS01 based on configured forwarding rules. With Auto Scaling and event-triggered tasks configured, a new ECS instance is automatically created and added to the backend server group when the CPU utilization of ECS01 rises to a specified upper threshold. When the CPU utilization of ECS01 drops to a specified lower threshold, the added ECS instance is automatically removed from the backend server group and released.
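
The scale-out and scale-in behavior in this solution can be pictured as a small decision function. The following is an illustrative sketch only, not how Auto Scaling is implemented: the name `decide` is hypothetical, and the 60% and 30% thresholds are the values that this topic configures in Step 4.

```python
# Illustrative model only; `decide` is a hypothetical helper, not an Auto Scaling API.
SCALE_OUT_CPU = 60.0  # percent, upper threshold used in this topic
SCALE_IN_CPU = 30.0   # percent, lower threshold used in this topic


def decide(cpu_percent: float) -> str:
    """Map one CPU utilization reading to the action this example would take."""
    if cpu_percent >= SCALE_OUT_CPU:
        return "scale_out"  # create an ECS instance and add it to the server group
    if cpu_percent <= SCALE_IN_CPU:
        return "scale_in"   # remove the added instance and release it
    return "none"           # between the thresholds: leave the group unchanged


for reading in (85.0, 45.0, 12.0):
    print(f"{reading:.0f}% -> {decide(reading)}")
```

The gap between the two thresholds keeps a reading such as 45% from triggering either action, which helps avoid repeated add and remove cycles.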

Prerequisites

  • You have created one or more ALB instances that are in the Active state. For more information, see Create and manage ALB instances.

  • You have created a server group (named RS1 in this topic) for the ALB instance and added an ECS instance (named ECS01) to it. The server group is in the Available state.

  • You have created a custom image from the ECS01 instance. For more information, see Create a custom image from an instance.

  • Health checks are enabled for the ALB instance. For more information, see Health checks for ALB.

  • At least one listener is configured for the ALB instance. For more information, see Add an HTTP listener.

  • The ALB instance and the scaling group must be in the same VPC.
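
The prerequisites above can be treated as a checklist to run before you start. The sketch below assumes you have already collected the relevant values by hand or with your own tooling. The `Setup` class, its field names, and the VPC IDs are hypothetical placeholders, not part of any Alibaba Cloud SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Setup:
    """Hypothetical snapshot of the resources listed in the prerequisites."""
    alb_state: str
    server_group_state: str
    server_group_members: List[str]
    custom_image_created: bool
    health_checks_enabled: bool
    listener_count: int
    alb_vpc_id: str
    scaling_group_vpc_id: str


def unmet_prerequisites(s: Setup) -> List[str]:
    """Return one message for each prerequisite that is not satisfied."""
    problems = []
    if s.alb_state != "Active":
        problems.append("ALB instance is not Active")
    if s.server_group_state != "Available":
        problems.append("server group is not Available")
    if "ECS01" not in s.server_group_members:
        problems.append("ECS01 is not in the server group")
    if not s.custom_image_created:
        problems.append("no custom image of ECS01")
    if not s.health_checks_enabled:
        problems.append("health checks are disabled")
    if s.listener_count < 1:
        problems.append("no listener configured")
    if s.alb_vpc_id != s.scaling_group_vpc_id:
        problems.append("ALB and scaling group are in different VPCs")
    return problems


ready = Setup("Active", "Available", ["ECS01"], True, True, 1, "vpc-example", "vpc-example")
print(unmet_prerequisites(ready))  # prints []
```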

Step 1: Create a scaling group

  1. Log on to the Auto Scaling console.

  2. In the left-side navigation pane, click Scaling Groups.

  3. In the top navigation bar, select a region.

  4. On the Scaling Groups page, click Create Scaling Group.

    The following table describes some of the key parameters. For more information about other parameters, see Create an ECS scaling group.

    | Parameter | Description |
    | --- | --- |
    | Instance Configuration Source | Select Create from Scratch. |
    | Minimum Number of Instances | If the number of instances in the scaling group falls below this value, Auto Scaling adds instances until the minimum is reached. In this example, set this parameter to 1 so that at least one instance (ECS01) is always running. |
    | Maximum Number of Instances | If the number of instances in the scaling group exceeds this value, Auto Scaling removes instances until the maximum is met. In this example, set this parameter to 2 so that at most one instance can be added alongside ECS01. |
    | Default cooldown (seconds) | The period after a scaling activity finishes during which the scaling group ignores other scaling triggers from CloudMonitor. This example uses 0. Adjust the value based on your actual needs. |
    | VPC and vSwitch | Select the VPC and vSwitch in which the ECS01 instance resides. |
    | Associated ALB, NLB, and GWLB Server Groups | Select the server group (RS1 in this topic) and configure the port and weight. |

  5. Configure the remaining options as needed and click Create. You can view the created scaling group on the Scaling Groups page.
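
The effect of the minimum, maximum, and cooldown parameters can be sketched as follows. This is a simplified model based on the parameter descriptions above, not the actual Auto Scaling logic. The function names are hypothetical, and the defaults are the values used in this example.

```python
def enforce_limits(current: int, minimum: int = 1, maximum: int = 2) -> int:
    """Return the instance count after the group is brought back within its limits."""
    return max(minimum, min(current, maximum))


def in_cooldown(now: float, last_activity_end: float, cooldown_seconds: float = 0) -> bool:
    """True while a trigger that arrives at `now` would be ignored."""
    return (now - last_activity_end) < cooldown_seconds


print(enforce_limits(0))  # below the minimum
print(enforce_limits(3))  # above the maximum
print(in_cooldown(now=100.0, last_activity_end=100.0))  # cooldown of 0 seconds
```

With the example values, a count of 0 is raised to 1, a count of 3 is lowered to 2, and a cooldown of 0 seconds never blocks a trigger. A nonzero cooldown would make the group ignore triggers that arrive shortly after the previous scaling activity.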

Step 2: Create a scaling configuration

  1. On the Scaling Groups page, find the scaling group and click Details in the Actions column. On the Instance Configuration Sources tab, select Scaling Configurations.

  2. On the Scaling Configurations tab, click Create Scaling Configuration. The following table describes some of the key parameters. For more information about other parameters, see Create a scaling configuration.

    | Parameter | Description |
    | --- | --- |
    | Billing method | The billing method for ECS instances. This example uses Pay-as-you-go. |
    | Select image | Select the custom image that you created from the ECS01 instance. |
    | Instance Configuration Mode | This example uses Specify instance type. |
    | Select Instance Type | Select the same instance type as the ECS01 instance. |
    | Security Group | Select the security group to which the ECS01 instance belongs. |

  3. After you complete the configuration, click Create, and then click Confirm.

  4. In the Scaling configuration created dialog box, click Enable Configuration, and then enable the scaling configuration and scaling group in the dialog box that appears.
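
Because the scaling configuration is meant to produce instances that match ECS01, it can help to compare the values you entered against ECS01 before you enable the configuration. The sketch below is one way to express that comparison. Both classes and every ID in it are hypothetical placeholders; substitute the values shown in your own console.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Ecs01Reference:
    """Hypothetical record of what ECS01 uses; all IDs are placeholders."""
    custom_image_id: str
    instance_type: str
    security_group_id: str


@dataclass(frozen=True)
class ScalingConfiguration:
    billing_method: str
    image_id: str
    instance_type: str
    security_group_id: str


def differences(cfg: ScalingConfiguration, ref: Ecs01Reference) -> List[str]:
    """List the settings that do not match the values chosen in this example."""
    found = []
    if cfg.billing_method != "Pay-as-you-go":
        found.append("billing_method")
    if cfg.image_id != ref.custom_image_id:
        found.append("image_id")
    if cfg.instance_type != ref.instance_type:
        found.append("instance_type")
    if cfg.security_group_id != ref.security_group_id:
        found.append("security_group_id")
    return found


ecs01 = Ecs01Reference("m-placeholder", "ecs.placeholder.large", "sg-placeholder")
cfg = ScalingConfiguration("Pay-as-you-go", "m-placeholder", "ecs.placeholder.large", "sg-other")
print(differences(cfg, ecs01))  # prints ['security_group_id']
```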

Step 3: Create scaling rules

  1. On the Scaling Groups page, find the scaling group and click Details in the Actions column. On the Scaling Rules and Event-triggered Tasks tab, click the Scaling Rules tab.

  2. Create a scale-out rule: On the Scaling Rules tab, click Create Scaling Rule. Configure the parameters based on the following table. For more information about other parameters, see Configure scaling rules. When finished, click OK.

    | Parameter | Description |
    | --- | --- |
    | Rule Name | Enter a custom name, such as auto-add-1-instance. |
    | Rule Type | Select Simple Scaling Rule. |
    | Scaling Activity | Select Add 1 instance. |

  3. Create a scale-in rule: On the Scaling Rules tab, click Create Scaling Rule. Configure the parameters based on the following table and your business needs. When finished, click OK.

    | Parameter | Description |
    | --- | --- |
    | Rule Name | Enter a custom name, such as auto-remove-1-instance. |
    | Rule Type | Select Simple Scaling Rule. |
    | Scaling Activity | Select Remove 1 instance. |
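
The two simple scaling rules can be modeled as fixed adjustments to the instance count. The sketch below assumes that the result of a rule is kept within the minimum (1) and maximum (2) set in Step 1. The class and function names are hypothetical and only illustrate the intended effect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleScalingRule:
    name: str
    adjustment: int  # +1 for "Add 1 instance", -1 for "Remove 1 instance"


SCALE_OUT = SimpleScalingRule("auto-add-1-instance", +1)
SCALE_IN = SimpleScalingRule("auto-remove-1-instance", -1)


def apply_rule(rule: SimpleScalingRule, current: int, minimum: int = 1, maximum: int = 2) -> int:
    """Return the instance count after the rule runs, kept within the group limits."""
    target = current + rule.adjustment
    return max(minimum, min(target, maximum))


print(apply_rule(SCALE_OUT, current=1))  # ECS01 plus one added instance
print(apply_rule(SCALE_OUT, current=2))  # already at the maximum
print(apply_rule(SCALE_IN, current=2))   # back to ECS01 only
print(apply_rule(SCALE_IN, current=1))   # already at the minimum
```

In this model, running the scale-out rule at the maximum or the scale-in rule at the minimum leaves the count unchanged, which is why ECS01 stays in service throughout the example.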

Step 4: Create and associate event-triggered tasks

  1. In the left-side navigation pane, click Event-triggered tasks.

  2. On the Alert Task page, click the System Monitoring tab, and then click Create Event-triggered Task.

  3. Create a scale-out task: In the Create Event-triggered Task dialog box, configure the parameters based on the following table. For more information about other parameters, see Manage event-triggered tasks. When finished, click OK.

    | Parameter | Description |
    | --- | --- |
    | Name | Enter a custom name, such as auto-add-task. |
    | Resource Monitored | Select the scaling group that you created in Step 1. |
    | Alert Condition | Select (Agent) CPU utilization, Maximum >= 60%. |
    | Reference Period | Select 1 Minute. |
    | Trigger After | Select 1 Time(s). |
    | Scaling Rule Triggered Upon Alerting | Select the scale-out rule that you created in Step 3. |

  4. Create a scale-in task: Click Create Event-triggered Task again. In the Create Event-triggered Task dialog box, configure the parameters based on the following table and your business needs. When finished, click OK.

    | Parameter | Description |
    | --- | --- |
    | Name | Enter a custom name, such as auto-remove-task. |
    | Resource Monitored | Select the scaling group that you created in Step 1. |
    | Alert Condition | Select (Agent) CPU utilization, Maximum <= 30%. |
    | Reference Period | Select 1 Minute. |
    | Trigger After | Select 1 Time(s). |
    | Scaling Rule Triggered Upon Alerting | Select the scale-in rule that you created in Step 3. |
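
The interaction between the alert condition, the reference period, and the Trigger After count can be sketched as follows. This is a simplified reading of those parameters, not CloudMonitor's actual evaluation code. The function names and the sample values are hypothetical.

```python
import operator
from typing import Callable, Sequence


def period_breaches(samples: Sequence[float],
                    compare: Callable[[float, float], bool],
                    threshold: float) -> bool:
    """Apply the Maximum statistic to one reference period and compare it to the threshold."""
    return bool(samples) and compare(max(samples), threshold)


def should_trigger(periods: Sequence[Sequence[float]],
                   compare: Callable[[float, float], bool],
                   threshold: float,
                   trigger_after: int = 1) -> bool:
    """True when the most recent `trigger_after` periods all breach the threshold."""
    if trigger_after < 1 or len(periods) < trigger_after:
        return False
    return all(period_breaches(p, compare, threshold) for p in periods[-trigger_after:])


# Hypothetical CPU samples (percent) collected within one 1-minute reference period.
one_minute = [[22.0, 64.5, 58.0]]
print(should_trigger(one_minute, operator.ge, 60.0))  # scale-out condition: Maximum >= 60%
print(should_trigger(one_minute, operator.le, 30.0))  # scale-in condition: Maximum <= 30%
```

Because the statistic is Maximum, the scale-in condition in this model holds only when every sample in the period is at or below 30%. A single spike within the minute is enough to satisfy the scale-out condition and to block the scale-in condition.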

Step 5: Add an existing instance

Event-triggered tasks monitor the instances that belong to the scaling group. Therefore, you must add the existing ECS01 instance to the scaling group so that its CPU utilization can trigger scaling activities.

  1. On the Scaling Groups page, find the scaling group and click Details in the Actions column. Click the Instances tab and then click the Manually Added tab.

  2. Click Add Existing Instance, select the ECS01 instance, and then click Add.

  3. On the Manually Added tab, verify that the ECS01 instance has been successfully added.
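
The reason for this step can be illustrated with a small model. It assumes that the metric for an event-triggered task is computed only from instances that are members of the scaling group. The function name and the CPU value are hypothetical.

```python
from typing import Dict, Optional, Set


def group_cpu_maximum(cpu_by_instance: Dict[str, float], members: Set[str]) -> Optional[float]:
    """Highest CPU utilization among scaling group members, or None if nothing is monitored."""
    readings = [cpu for instance_id, cpu in cpu_by_instance.items() if instance_id in members]
    return max(readings) if readings else None


cpu = {"ECS01": 92.0}  # ECS01 is under load
print(group_cpu_maximum(cpu, members=set()))      # before Step 5: ECS01 is not a member
print(group_cpu_maximum(cpu, members={"ECS01"}))  # after Step 5: ECS01 is monitored
```

Under this assumption, a loaded ECS01 that is outside the scaling group contributes no reading, so the scale-out task never has a value to compare against its threshold.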

Step 6: Verify auto scaling

Use a stress testing tool to raise the CPU utilization of the ECS01 instance to 60% or higher. This triggers the scale-out event-triggered task, and a new ECS instance is automatically added to the scaling group, which you can confirm in the console. After the stress test ends, the CPU utilization of ECS01 drops to 30% or lower, which triggers the scale-in task to remove the previously added instance.

  1. Log on to the ECS01 instance and run the following commands to install the stress tool. The commands assume an operating system that uses the yum package manager.

    sudo yum install -y epel-release
    sudo yum install -y stress

  2. Run the following command to perform a stress test on the ECS01 instance for 60 seconds. The trailing & runs the test in the background so that the session remains usable.

    sudo stress --cpu 1 --io 4 --vm 2 --vm-bytes 128M --timeout 60s &

  3. Return to the Alert Task page. After a few minutes, the Status of the scale-out task changes to Alert.

  4. Go to the Scaling Groups page. In the Instances/Capacity column, check that the Total Instances value has increased by one. This confirms that a new ECS instance has been added to the scaling group.

  5. Log on to the ALB console.

  6. In the left-side navigation pane, choose ALB > Server Groups.

  7. Click the ID of the target server group and click the Backend Servers tab. You can see two backend servers. The instance with a name like ESS-XX is the ECS instance that was automatically added by Auto Scaling.

  8. After the 60-second stress test is complete, you can log on to the Auto Scaling and Application Load Balancer consoles again to verify that the newly added instance has been automatically removed.
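
Because the changes above appear only after a few minutes, a scripted check has to poll instead of reading the value once. The following is a minimal polling sketch. `wait_for` and `total_instances` are hypothetical helpers, and the instance counts are stubbed values that stand in for whatever lookup you use to read the Total Instances value of the scaling group.

```python
import time
from typing import Callable


def wait_for(predicate: Callable[[], bool],
             timeout_seconds: float,
             interval_seconds: float = 15.0,
             sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `predicate` until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_seconds)


# Stub that stands in for a real lookup of the Total Instances value.
observed_counts = iter([1, 1, 2])


def total_instances() -> int:
    return next(observed_counts)


# The no-op sleep keeps this demonstration instant; omit it to wait between real checks.
scaled_out = wait_for(lambda: total_instances() == 2, timeout_seconds=300, sleep=lambda _: None)
print(scaled_out)
```

The same helper can be reused for the scale-in check by waiting for the count to return to 1 after the stress test ends.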