
Auto Scaling: Use (ALB) QPS per Backend Server to trigger scaling activities

Last Updated: May 18, 2023

Auto Scaling allows you to use the (ALB) QPS per Backend Server metric when you create an event-triggered task of the system monitoring type. This topic describes how to use the (ALB) QPS per Backend Server metric to trigger scaling activities.

Prerequisites

  • An Alibaba Cloud account is created. If you do not have one, sign up for an Alibaba Cloud account.

    Note

    If you want to log on to the Auto Scaling console as a RAM user, you must grant the RAM user the permissions on Application Load Balancer (ALB) instances. For more information, see Grant permissions to a RAM user.

  • ALB is activated. Make sure that your Alibaba Cloud account has the AliyunServiceRoleForAlb service-linked role. For more information, see ALB service-linked roles.

  • At least one virtual private cloud (VPC) and one vSwitch are created. For more information, see the Create a VPC and a vSwitch section in the "Create a VPC with an IPv4 CIDR block" topic.

Scenarios

If your application is sensitive to the access traffic to your servers, you can use the (ALB) QPS per Backend Server metric to trigger scaling activities. For example, you use ALB to receive client requests and distribute the requests to backend servers such as Elastic Compute Service (ECS) instances and elastic container instances for processing. When the number of client requests increases, more backend servers are required to process the client requests.

In this case, you can create an event-triggered task for your scaling group and specify (ALB) QPS per Backend Server as the metric when you create the event-triggered task. This task enables Auto Scaling to trigger scaling activities in your scaling group, which improves the service availability of your application. This best practice provides the following benefits:

  • When the QPS value of each ALB server group is greater than the specified threshold, Auto Scaling triggers scale-outs. This reduces the traffic load on each ALB server group and improves system stability.

  • When the QPS value of each ALB server group is less than the specified threshold, Auto Scaling triggers scale-ins. This helps reduce costs on servers.
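The two directions of this best practice amount to a simple threshold check. The following Python sketch is illustrative only: the 100 and 50 counts/s thresholds match the event-triggered tasks created later in this topic, but the function itself is not an Auto Scaling API.

```python
# Illustrative threshold logic behind the scale-out and scale-in tasks.
# The thresholds match the example event-triggered tasks in this topic;
# decide_scaling is a sketch, not an Auto Scaling API.

SCALE_OUT_THRESHOLD = 100  # Average > 100 counts/s triggers the scale-out rule
SCALE_IN_THRESHOLD = 50    # Average < 50 counts/s triggers the scale-in rule

def decide_scaling(qps_per_backend_server: float) -> str:
    """Return the scaling action that the event-triggered tasks would request."""
    if qps_per_backend_server > SCALE_OUT_THRESHOLD:
        return "scale-out"  # add instances to reduce the per-server load
    if qps_per_backend_server < SCALE_IN_THRESHOLD:
        return "scale-in"   # remove instances to reduce costs
    return "no-op"          # load is within the acceptable band
```

Between the two thresholds, neither task fires and the scaling group keeps its current capacity.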

Concepts

  • ALB is an Alibaba Cloud service that runs at the application layer and supports protocols such as HTTP, HTTPS, and Quick UDP Internet Connections (QUIC). ALB provides high elasticity and can process a large amount of network traffic at the application layer. For more information, see the What is ALB? topic.

  • QPS measures the number of HTTP or HTTPS queries or requests that ALB can process per second. This metric is specific to Layer-7 listeners. (ALB) QPS per Backend Server is a metric that is used to evaluate the service performance of a server. The value of this metric is calculated by using the following formula: (ALB) QPS per backend server = Total number of client requests that are received by ALB per second / Total number of ECS instances or elastic container instances in the ALB backend server groups.

  • Performance Testing (PTS) is an easy-to-use SaaS-based stress testing platform that has powerful distributed stress testing capabilities.
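The formula from the QPS concept above can be written as a small helper. This is an illustrative sketch; Auto Scaling and CloudMonitor compute the metric on the server side.

```python
# The (ALB) QPS per Backend Server formula as a helper function.
# Illustrative only; the metric is computed by the cloud service.

def qps_per_backend_server(total_qps: float, backend_count: int) -> float:
    """(ALB) QPS per backend server = total client requests received by ALB
    per second / total number of ECS instances or elastic container
    instances in the ALB backend server groups."""
    if backend_count <= 0:
        raise ValueError("the server group must contain at least one backend server")
    return total_qps / backend_count
```

For example, 500 total QPS spread across 5 backend instances yields 100 QPS per backend server.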

Step 1: Configure ALB

  1. Create an ALB instance.

    For more information, see the Create an ALB instance topic.

    The following table describes the parameters that are required to create an ALB instance.

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Instance Name | The name of the ALB instance. | alb-qps-instance |
    | VPC | The VPC in which you want to create the ALB instance. | vpc-test****-001 |
    | Zone | The zone and the vSwitch to which the ALB instance belongs. Note: ALB supports cross-zone deployment. If two or more zones are available in the specified region, select at least two zones to ensure high availability. | Zones: Hangzhou Zone G and Hangzhou Zone H. vSwitches: vsw-test003 and vsw-test002 |

  2. Create an ALB server group.

    For more information, see the Create and manage a server group topic.

    The following table describes the parameters that are required to create an ALB server group.

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Server Group Name | The name of the server group. | alb-qps-servergroup |
    | VPC | The VPC in which you want to create the ALB server group. You can add only servers that reside in this VPC to the ALB server group. Note: In this step, select the VPC that was specified in Step 1. | vpc-test****-001 |

  3. Configure a listener.

    1. In the left-side navigation pane of the Server Load Balancer (SLB) console, choose ALB > Instances.

    2. On the Instances page, find the ALB instance named alb-qps-instance and click Create Listener in the Actions column.

    3. In the Configure Listener step, set the Listener Port parameter to 80, retain the default settings of other parameters, and then click Next.

      Note

      You can specify another port number as the value of the Listener Port parameter based on your business requirements. In this example, port number 80 is used.

    4. In the Select Server Group step, select the server type below the Server Group field, select the server group named alb-qps-servergroup that was created earlier in this step, and then click Next. (Figure: Backend servers)

    5. In the Configuration Review step, confirm the parameter settings and click Submit. In the message that appears, click OK.

  4. After you complete the configuration, click the name of the ALB instance to go to the Instance Details tab. On this tab, you can obtain the elastic IP address (EIP) of the ALB instance. (Figure: IP address of the ALB instance)

Step 2: Create a scaling group for the ALB server group

The steps to create an ECS scaling group are different from the steps to create an Elastic Container Instance scaling group. The parameters that are displayed in the console when you create a scaling group take precedence. In this example, an Elastic Container Instance scaling group is created.

  1. Create a scaling group.

    For information about how to create a scaling group, see the Create scaling groups section in the "Manage scaling groups" topic. The following table describes the parameters that are required to create a scaling group. You can configure the parameters that are not described in the table based on your business requirements.

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Scaling Group Name | The name of the scaling group. | alb-qps-scalinggroup |
    | Type | The type of instances that are managed by the scaling group to provide computing power. | ECI |
    | Instance Configuration Source | The template that is used by Auto Scaling to create elastic container instances. | Create from Scratch |
    | Minimum Number of Instances | The minimum number of elastic container instances that must be contained in the scaling group. When the total number of elastic container instances falls below this limit, Auto Scaling automatically creates elastic container instances in the scaling group until the total number reaches the value of this parameter. | 1 |
    | VPC | The VPC in which you want to create the scaling group. The VPC of the scaling group must be the same as the VPC of the ALB instance. | vpc-test****-001 |
    | vSwitch | The vSwitches that you want to associate with the scaling group. The vSwitches of the scaling group must be the same as the vSwitches of the ALB instance. | vsw-test003 and vsw-test002 |
    | Associate ALB Server Group | The ALB server group that you want to associate with the scaling group. In this example, the ALB server group created in Step 1 is used. | Server group: sgp-****/alb-qps-servergroup. Port number: 80 |

  2. Create and enable a scaling configuration.

    For more information, see the Create scaling configurations for scaling groups that contain elastic container instances topic. The following table describes the parameters that are required to create a scaling configuration. You can configure the parameters that are not described in the table based on your business requirements.

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Container Group Configurations | The number of vCPUs and the memory size that you want to allocate to the container group. | vCPU: 2 vCPUs. Memory: 4 GiB |
    | Container Configurations | The image and the image version of the container. Note: In this example, an nginx image is used. | Container image: registry-vpc.cn-hangzhou.aliyuncs.com/eci_open/nginx. Image version: latest |

  3. Enable the scaling group.

    For information about how to enable a scaling group, see the Enable or disable scaling groups section in the "Manage scaling groups" topic.

    Note

    After you enable the scaling group, Auto Scaling automatically creates an elastic container instance because the Minimum Number of Instances parameter of the scaling group is set to 1.

  4. Access the EIP of the ALB instance obtained in Step 1 to check whether nginx can be accessed as expected. (Figure: Access the EIP)
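The check in the last step above can be scripted. The following Python sketch is illustrative: the EIP value used in the usage example is a placeholder that you must replace with the EIP obtained in Step 1, and port 80 matches the listener configured earlier.

```python
# Illustrative reachability check for the nginx service behind the ALB
# instance. Replace the placeholder EIP with the one obtained in Step 1.
from urllib.request import urlopen

def build_url(eip: str, port: int = 80) -> str:
    """Build the check URL from the ALB EIP and the listener port."""
    return f"http://{eip}:{port}/"

def nginx_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL responds with HTTP 200 (nginx welcome page)."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, timeouts, and connection errors
        return False
```

For example, `nginx_reachable(build_url("203.0.113.10"))` (placeholder EIP) returns True once the elastic container instance created by the scaling group is serving traffic.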

Step 3: Create event-triggered tasks based on the (ALB) QPS per Backend Server metric

  1. Log on to the Auto Scaling console.

  2. Create scaling rules.

    In this example, two scaling rules are created: a scale-out rule named Add1 that adds one elastic container instance, and a scale-in rule named Reduce1 that removes one elastic container instance from the scaling group. For information about how to create a scaling rule, see the Create a scaling rule section in the "Manage scaling rules" topic.

  3. Create event-triggered tasks.

    1. On the Scaling Groups page, find the scaling group named alb-qps-scalinggroup and click Details in the Actions column.

    2. Click the Scaling Rules and Event-triggered Tasks tab. On the Scheduled and Event-triggered Tasks tab, click the Event-triggered Tasks (System) tab, and then click Create Event-triggered Task.

      In this example, two event-triggered tasks are created. One event-triggered task named Alarm1 is used to trigger the scale-out rule named Add1. The other event-triggered task named Alarm2 is used to trigger the scale-in rule named Reduce1. For information about how to create an event-triggered task, see the Create event-triggered tasks section in the "Manage event-triggered tasks" topic.

      • When you create the Alarm1 event-triggered task, specify (ALB) QPS per Backend Server as the metric and Average > 100 counts/s as the alert condition.

        Note

        QPS per backend server = Total QPS/Total number of elastic container instances

      • When you create the Alarm2 event-triggered task, specify (ALB) QPS per Backend Server as the metric and Average < 50 counts/s as the alert condition.
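Conceptually, an event-triggered task fires when the per-period average of the metric breaches its threshold for several consecutive periods. The following Python sketch is an illustration only: the three-period default and the function name are assumptions, not documented CloudMonitor behavior.

```python
# Hedged sketch of alert-condition evaluation: the per-period average of
# (ALB) QPS per Backend Server must exceed the threshold for a number of
# consecutive periods. The default of 3 periods is an assumption.

def alarm_fires(period_averages, threshold=100.0, periods=3):
    """True if the last `periods` period averages all exceed `threshold`."""
    recent = period_averages[-periods:]
    return len(recent) == periods and all(avg > threshold for avg in recent)
```

With the Alarm1 condition (Average > 100 counts/s), a run of high averages fires the task, while a single dip below the threshold resets the evaluation.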

Check the monitoring effect

  1. Log on to the PTS console.

  2. Create a PTS scenario for the EIP of the ALB instance.

    1. In the left-side navigation pane of the PTS console, choose Performance Test > Create Scenario and click Stress Test.

    2. On the Scenario Settings tab, click Add PTS Node.

      Note

      If you click Add PTS Node, an HTTP PTS node is added by default.

    3. Enter the name of the API operation that you want to stress test, and then enter the stress test URL.

      In this example, nginx_api is used as the name of the API operation, and the EIP of the ALB instance obtained in Step 1 is used as the stress test URL.

    4. Click the Stress Settings tab. Configure parameters of the PTS node as prompted.

      The following table describes the parameters that you must configure on the Stress Settings tab. For parameters that are not described in the following table, retain the default settings.

      | Parameter | Description | Example |
      | --- | --- | --- |
      | Stress Testing Mode | Valid values: Concurrent Mode, in which the system determines the service performance based on the number of users who can initiate requests to the service at the same time, and RPS Mode, in which the system determines the service performance based on the number of received requests per second. | RPS Mode |
      | Stress Test Period | The duration of the stress test. Unit: minutes. | 10 Minutes |
      | RPS Upper Limit | In requests per second (RPS) mode, the system tests the service throughput of an API operation. Therefore, you must specify a maximum number of RPS and an initial number of RPS for each API operation. | 500 |
      | Initial RPS | The initial number of RPS for each API operation. | 500 |

  3. In the lower part of the Stress Settings tab, click Debug Scenario.

    If no error is reported, click Close.

  4. In the lower part of the Stress Settings tab, click Save and Test. In the dialog box that appears, set the Trigger Cycle parameter to Trigger Now, and then click OK to start the test.

    If the stress test is successful, you can view the stress test report and sampling logs.

  5. View the monitoring details of the (ALB) QPS per Backend Server metric.

    For information about how to view the monitoring details of a metric, see the View instance metrics topic. (Figure: Monitoring details)

    The preceding figure shows an example. From 17:25:00 to 17:35:00, a stress test runs at 500 QPS. At 17:30:00, 17:32:00, 17:34:00, and 17:36:00, Auto Scaling adds one elastic container instance in each scale-out. Each time a scale-out is completed, the QPS per backend server of the scaling group decreases. This best practice shows that (ALB) QPS per Backend Server is an effective metric for triggering scaling activities in scaling groups in real time.
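The scale-out sequence in the figure follows directly from the formula in the Concepts section: at a steady 500 QPS, a scale-out threshold of 100 QPS per backend server, and a step size of one instance per scale-out (scaling rule Add1), four scale-outs are required. The following sketch reproduces that arithmetic; it is illustrative and ignores alarm evaluation delays.

```python
# Illustrative reproduction of the arithmetic behind the monitoring
# figure: steady 500 QPS, scale-out threshold of 100 QPS per backend
# server, one instance added per scale-out (scaling rule Add1).

def scale_outs_needed(total_qps=500.0, threshold=100.0, instances=1, step=1):
    """Count scale-outs until QPS per backend server no longer exceeds
    the threshold; return (scale_outs, final_instance_count)."""
    scale_outs = 0
    while total_qps / instances > threshold:
        instances += step  # Auto Scaling executes the Add1 scaling rule
        scale_outs += 1
    return scale_outs, instances
```

Starting from the single instance created when the scaling group was enabled, four scale-outs bring the group to five instances, at which point 500 / 5 = 100 counts/s no longer exceeds the threshold. This matches the four scale-out timestamps in the figure.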