Auto Scaling: Enable automatic scaling of elastic container instances based on the (ALB) QPS per Backend Server metric

Last Updated: Aug 07, 2024

Auto Scaling provides the (ALB) QPS per Backend Server system metric as a trigger condition for event-triggered tasks of the system monitoring type. This enables automatic scaling of ECS instances and elastic container instances. This topic describes how to use the (ALB) QPS per Backend Server system metric to enable automatic scaling of elastic container instances.

Concepts

  • Application Load Balancer (ALB) is an Alibaba Cloud service that runs at the application layer and supports various protocols, such as HTTP, HTTPS, and Quick UDP Internet Connections (QUIC). ALB provides high elasticity and can process a large amount of network traffic at the application layer. For more information, see What is ALB?

  • Queries per second (QPS) measures the number of HTTP or HTTPS requests that ALB can process per second. This metric is specific to Layer 7 listeners. (ALB) QPS per Backend Server is a metric that is used to evaluate the service performance of a server. The value of this metric is calculated by using the following formula: QPS per backend server = Total number of client requests that ALB receives per second / Total number of ECS instances or elastic container instances in the ALB backend server groups.
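The formula above can be illustrated with a minimal Python sketch. The function name and the sample numbers are hypothetical and are not part of any Alibaba Cloud SDK; the sketch only restates the division described in the formula.

```python
def qps_per_backend_server(total_qps: float, backend_count: int) -> float:
    """Return the average QPS handled by each backend server.

    total_qps: client requests that ALB receives per second.
    backend_count: ECS instances or elastic container instances
                   in the ALB backend server groups.
    """
    if backend_count <= 0:
        raise ValueError("backend_count must be a positive integer")
    return total_qps / backend_count


# Hypothetical example: 300 requests per second spread across 3 backend servers.
print(qps_per_backend_server(300, 3))  # 100.0
```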

Scenarios

If you use ALB instances to receive client requests in a centralized manner, ALB forwards the requests to backend servers for processing. The backend servers can be ECS instances or elastic container instances. When the number of client requests surges, your business system must scale out backend servers quickly to keep your business running stably.

In this case, you can create event-triggered tasks that use the (ALB) QPS per Backend Server metric as the alert trigger condition to enable automatic scaling of backend servers. This solution improves the availability of your business application and provides the following benefits:

  • When the value of the (ALB) QPS per Backend Server metric is greater than the specified threshold, Auto Scaling triggers a scale-out event to increase the number of backend servers and reduce the workloads on each backend server. This improves your system response time and stability.

  • When the value of the (ALB) QPS per Backend Server metric is less than the specified threshold, Auto Scaling triggers a scale-in event to decrease the number of backend servers and improve cost efficiency.
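The two rules above amount to a simple threshold check. The following Python sketch shows that decision logic under stated assumptions: the function name and return values are hypothetical, the thresholds are passed in as parameters, and strict comparisons are used as described in the bullets. Auto Scaling evaluates the actual alert conditions on the service side, so this is an illustration only.

```python
def scaling_action(qps_per_backend: float,
                   scale_out_threshold: float,
                   scale_in_threshold: float) -> str:
    """Decide which scaling event a metric value would trigger."""
    if scale_in_threshold >= scale_out_threshold:
        raise ValueError("scale_in_threshold must be lower than scale_out_threshold")
    if qps_per_backend > scale_out_threshold:
        return "scale-out"   # add backend servers to reduce per-server load
    if qps_per_backend < scale_in_threshold:
        return "scale-in"    # remove backend servers to save cost
    return "none"            # value is between the thresholds, no action


# Hypothetical thresholds: scale out above 100, scale in below 50.
print(scaling_action(180, 100, 50))  # scale-out
print(scaling_action(30, 100, 50))   # scale-in
print(scaling_action(75, 100, 50))   # none
```

Keeping a gap between the two thresholds helps avoid a situation in which a scale-out is immediately followed by a scale-in.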

Prerequisites

Step 1: Configure ALB

  1. Create an ALB instance.

    For more information, see Create an ALB instance.

    The following table describes the parameters that are required to create an ALB instance.

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Instance Name | Enter a name for the ALB instance. | alb-qps-instance |
    | VPC | Select a VPC for the ALB instance. | vpc-test****-001 |
    | Zone | Select zones and vSwitches for the ALB instance. Note: ALB supports cross-zone deployment. If two or more zones are available in the specified region, select at least two zones to ensure high availability. | Zones: Hangzhou Zone G and Hangzhou Zone H; vSwitches: vsw-test003 and vsw-test002 |

  2. Create a server group for the ALB instance.

    For more information, see Create and manage server groups.

    The following table describes the parameters that are required to create a server group.

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Server Group Type | Specify the type of the server group that you want to create. The value Server specifies that the server group contains elastic container instances. | Server |
    | Server Group Name | Enter a name for the server group. | alb-qps-servergroup |
    | VPC | Select a VPC from the VPC drop-down list. You can add only servers that reside in the VPC to the server group. Important: In this step, select the VPC that is specified in Step 1. | vpc-test****-001 |

  3. Configure a listener for the ALB instance.

    1. In the left-side navigation pane, choose ALB > Instances.

    2. On the Instances page, find the ALB instance whose name is alb-qps-instance and click Create Listener in the Actions column.

    3. In the Configure Listener step of the Configure Server Load Balancer wizard, set the Listener Port parameter to 80, retain the default settings of other parameters, and then click Next.

      Note

      You can specify another port number for the Listener Port parameter based on your business requirements. In this example, port number 80 is used.

    4. In the Select Server Group step of the Configure Server Load Balancer wizard, select Server Type below the Server Group field, select the alb-qps-servergroup server group that was created in Step 2, and then click Next.

    5. In the Configuration Review step of the Configure Server Load Balancer wizard, confirm the parameter settings and click Submit. In the message that appears, click OK.

  4. After you complete the configurations, click the Instance Details tab to obtain the elastic IP addresses (EIPs) of the ALB instance.

Step 2: Create a scaling group for the ALB server group

The steps to create a scaling group of the ECS type differ from the steps to create a scaling group of the Elastic Container Instance type. If the parameters described in this topic differ from the parameters that are displayed when you create a scaling group, the displayed parameters take precedence. In this step, a scaling group of the Elastic Container Instance type is created.

  1. Create a scaling group.

    For more information, see Create scaling groups. The following table describes the parameters that are required to create a scaling group. You can configure the parameters that are not described in the following table based on your business requirements.

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Scaling Group Name | Enter a name for the scaling group. | alb-qps-scalinggroup |
    | Type | Specify the type of instances that are managed by the scaling group to provide computing power. | ECI |
    | Instance Configuration Source | Auto Scaling creates instances based on the value of the Instance Configuration Source parameter. | Create from Scratch |
    | Minimum Number of Instances | If the total number of elastic container instances is less than the lower limit, Auto Scaling automatically creates elastic container instances in the scaling group until the total number reaches the lower limit. | 1 |
    | Maximum Number of Instances | If the total number of elastic container instances exceeds the upper limit, Auto Scaling automatically removes elastic container instances from the scaling group until the total number no longer exceeds the upper limit. | 5 |
    | VPC | Select the VPC in which the ALB instance resides. | vpc-test****-001 |
    | vSwitch | Select the vSwitches that are used by the ALB instance. | In this example, vsw-test003 and vsw-test002 are selected. |
    | Associate ALB and NLB Server Groups | Select the ALB server group that is created in Step 1. Then, enter port 80. | Server group: sgp-****/alb-qps-servergroup; Port: 80 |

  2. Create and enable a scaling configuration.

    For more information, see Create a scaling configuration of the Elastic Container Instance type. The following table describes the parameters that are required to create a scaling configuration. You can configure the parameters that are not described in the following table based on your business requirements.

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Container Group Configurations | Select the specifications for the container group. The specifications include the number of vCPUs and memory size. | vCPU: 2 vCPUs; Memory: 4 GiB |
    | Container Configurations | Select a container image and image tag. | Container image: registry-vpc.cn-hangzhou.aliyuncs.com/eci_open/nginx; Image tag: latest |

  3. Enable the scaling group.

    For more information, see Enable or disable scaling groups.

    Note

    In this example, the Minimum Number of Instances parameter is set to 1. In this case, Auto Scaling automatically triggers a scale-out event to create one elastic container instance after you enable the scaling group.

  4. Check the status of the elastic container instance and the container running on the instance and make sure that the container runs as expected.

  5. Access the EIP of the ALB instance that is created in Step 1 to check whether nginx can be accessed as expected.
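The effect of the Minimum Number of Instances and Maximum Number of Instances parameters in this step can be pictured as clamping the instance count to a range. The following Python sketch is an illustration only: the function name is hypothetical, and the values 1 and 5 are the example values used for the scaling group in this topic.

```python
def clamp_instance_count(desired: int, minimum: int, maximum: int) -> int:
    """Keep the instance count within the scaling group limits."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    # Raise the count to the lower limit or cut it down to the upper limit.
    return max(minimum, min(desired, maximum))


# Example limits from this topic: minimum 1, maximum 5.
print(clamp_instance_count(0, 1, 5))  # 1
print(clamp_instance_count(7, 1, 5))  # 5
print(clamp_instance_count(3, 1, 5))  # 3
```

The first call mirrors what happens when the scaling group is enabled with no instances: the count is raised to the minimum of 1.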

Step 3: Create event-triggered tasks based on the (ALB) QPS per Backend Server metric

  1. Log on to the Auto Scaling console.

  2. Create a scaling rule.

    In this example, two scaling rules are created. One scale-out rule named Add1 is used to add one elastic container instance. One scale-in rule named Reduce1 is used to remove one elastic container instance from the scaling group. For more information, see Manage scaling rules.

  3. Create event-triggered tasks.

    1. On the Scaling Groups page, find the scaling group whose name is alb-qps-scalinggroup and click Details in the Actions column.

    2. Choose Scaling Rules and Event-triggered Tasks > Event-triggered Tasks > Event-triggered Tasks (System) and click Create Event-triggered Task.

      In this example, two event-triggered tasks are created. One event-triggered task named Alarm1 is used to trigger Add1. The other event-triggered task named Alarm2 is used to trigger Reduce1. For more information, see Manage event-triggered tasks.

      • When you create Alarm1, select the (ALB) QPS per Backend Server metric and specify the following alert trigger condition: Average(Average) >= 100 Count/s.

        Note

        QPS per backend server = Total QPS/Total number of elastic container instances

      • When you create Alarm2, select the (ALB) QPS per Backend Server metric and specify the following alert trigger condition: Average(Average) < 50 Count/s.

Check the monitoring effect

You can use a stress testing tool, such as Apache JMeter, ApacheBench, or wrk, to perform a stress test on the EIP of the ALB instance that is created in Step 1. When you set up a stress test, you can simulate a scenario in which the QPS value reaches 500. During the test, you can observe the following trends on the Monitoring tab of the Auto Scaling console:

When the QPS per backend server reaches or exceeds the alert threshold, the event-triggered task for scale-out events is triggered to add elastic container instances. Each time a new elastic container instance is added, the requests are spread across more instances, which decreases the QPS per backend server.
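The expected trend can be approximated with a small simulation. The following Python sketch is a simplification under stated assumptions: the total QPS stays constant at 500, each scale-out adds one instance (as with the Add1 rule), the scale-out condition is QPS per backend server >= 100, the scaling group allows at most 5 instances, and cooldown and alert evaluation periods are ignored. The function name is hypothetical.

```python
def simulate_scale_out(total_qps: float,
                       start_instances: int = 1,
                       max_instances: int = 5,
                       threshold: float = 100.0) -> list:
    """Return (instance count, QPS per backend server) after each step."""
    history = []
    instances = start_instances
    while True:
        per_backend = total_qps / instances
        history.append((instances, per_backend))
        # Stop when the alert condition no longer holds or the upper limit is reached.
        if per_backend < threshold or instances >= max_instances:
            break
        instances += 1  # one scale-out event adds one instance
    return history


for instances, per_backend in simulate_scale_out(500):
    print(f"{instances} instance(s): {per_backend:.1f} QPS per backend server")
```

Under these assumptions, the per-server value falls from 500 with one instance to 100 with five instances. At that point the value still meets the >= 100 condition, but the scaling group has reached its maximum of 5 instances, so no further instances are added.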