Serverless App Engine: Configure auto scaling policy

Last Updated: Feb 26, 2025

To handle sudden traffic spikes, you can configure an auto scaling policy for a microservice application in the Serverless App Engine (SAE) console. SAE then automatically scales the application instances in or out. This topic describes how to configure an auto scaling policy for a microservice application.

Overview

SAE supports the following methods to scale your applications:

  • Manual scaling: If you need to immediately scale your applications, we recommend that you use the manual scaling feature. For example, you can add instances when unexpected traffic surges occur. For more information, see Manual scaling.

  • Auto scaling: If you do not need to immediately scale your applications, we recommend that you use the auto scaling feature. For example, you can configure auto scaling policies to automatically add or remove instances based on periodic traffic changes. SAE supports the following types of auto scaling policies: scheduled auto scaling policy, metric-based auto scaling policy, and hybrid auto scaling policy. For more information, see Best practices for SAE auto scaling.

The following figure shows the process of configuring an auto scaling policy.

Scenarios

SAE supports the following auto scaling policies:

  • Scheduled auto scaling policy: suitable for applications that need resources during specific time periods. Scheduled auto scaling policies are commonly used in industries such as securities, healthcare, public administration, and education.

  • Metric-based auto scaling policy: suitable for applications that experience burst traffic and periodic traffic. Metric-based auto scaling policies are commonly used in industries such as Internet, gaming, and social media.

  • Hybrid auto scaling policy: suitable for applications that need resources during specific time periods and also experience burst traffic and periodic traffic. Hybrid auto scaling policies are commonly used in industries such as Internet, education, and catering.

Usage notes

  • You can create up to five scheduled auto scaling policies, one metric-based auto scaling policy, or one hybrid auto scaling policy. The three types of auto scaling policies cannot be used at the same time.

  • If an auto scaling policy is enabled for an application, you cannot manage the lifecycle of the application. For example, you cannot scale, deploy (including single-batch release, phased release, and canary release), stop, or restart the application, or change the instance type. If you want to perform the preceding operations, you must disable the auto scaling policy.

  • While a lifecycle management operation is in progress for an application, you cannot create or enable an auto scaling policy for the application. Wait until the operation is complete.

  • Up to 50 instances can be deployed for a single application. To increase the quota, join the DingTalk group 32874633 and apply to be added to a whitelist.
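The limits above can be expressed as a small pre-flight check. The following is a minimal sketch, not an SAE API: the policy type labels ("scheduled", "metric", "hybrid") and the function name are hypothetical, and the 50-instance value is the default quota stated above.

```python
from collections import Counter

# Limits taken from the usage notes above. The keys are hypothetical labels.
MAX_POLICIES = {"scheduled": 5, "metric": 1, "hybrid": 1}
MAX_INSTANCES_PER_APP = 50  # default quota for a single application


def validate_policies(policy_types, requested_max_instances=None):
    """Return a list of problems found in a proposed set of policies.

    policy_types is a list such as ["scheduled", "scheduled"].
    An empty list means that no limit is violated.
    """
    problems = []
    counts = Counter(policy_types)

    unknown = sorted(set(counts) - set(MAX_POLICIES))
    if unknown:
        problems.append(f"unknown policy types: {unknown}")

    # The three policy types cannot be used at the same time.
    used = [t for t in MAX_POLICIES if counts[t] > 0]
    if len(used) > 1:
        problems.append(f"policy types cannot be mixed: {used}")

    for t in used:
        if counts[t] > MAX_POLICIES[t]:
            problems.append(f"too many {t} policies: {counts[t]} > {MAX_POLICIES[t]}")

    if requested_max_instances is not None and requested_max_instances > MAX_INSTANCES_PER_APP:
        problems.append(
            f"{requested_max_instances} instances exceeds the default quota of {MAX_INSTANCES_PER_APP}"
        )
    return problems
```

For example, `validate_policies(["scheduled", "metric"])` reports that the two policy types cannot be mixed.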

Prerequisites

An application is deployed. For more information, see Application deployment.

Procedure

  1. Log on to the SAE console. In the left-side navigation pane, choose Applications > Applications. On the Applications page, select a region in the top navigation bar and a namespace from the Application List drop-down list, and then click the desired application name.

  2. On the Basic Information page of the target application, click the Auto Scaling tab, and then click Create Auto Scaling Policy in the Auto Scaling area.

Configure auto scaling policy

Scheduled auto scaling policy

  1. In the Create Auto Scaling Policy panel, configure the following parameters, and then click Next: Preview Scheduled Auto Scaling Policy.

    Policy Type
      Description: Select Scheduled Auto Scaling Policy.
      Example: Scheduled Auto Scaling Policy

    Policy Name
      Description: Enter a custom name for the policy.
      Example: demo

    Time Settings
      Description: Select Long-term or Short-term.
        • Short-term: Specify a start date and an end date.
        • Long-term: The policy stays in effect with no end date.
      Example: Long-term

    Cycle
      Description: Select Daily, Weekly, or Monthly.
      Example: Daily

    Trigger Time On Single Day
      Description: Configure Trigger Time and Instances Retained After Trigger Time.
        • Trigger Time: The time at which the scaling policy is triggered.
        • Instances Retained After Trigger Time: The number of instances to keep after the policy is triggered.
      Example:
        • Trigger Time: 08:00 and 20:00
        • Instances Retained After Trigger Time: 10 and 3

  2. In the preview, review the number of instances for each time period, and then click OK.
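To make the effect of the example values concrete, the following sketch computes which instance count applies at a given time of day for a Daily cycle. It is an illustration only: the function name is hypothetical, and it assumes that the most recent trigger stays in effect until the next one, including overnight.

```python
from datetime import time

# Example values from the parameters above: 10 instances from 08:00, 3 from 20:00.
DAILY_TRIGGERS = [(time(8, 0), 10), (time(20, 0), 3)]


def instances_at(now, triggers=DAILY_TRIGGERS):
    """Return the instance count in effect at `now` for a daily cycle.

    The most recent trigger at or before `now` wins. Before the first
    trigger of the day, the last trigger of the previous day is assumed
    to still apply. `triggers` must contain at least one entry.
    """
    ordered = sorted(triggers)
    current = ordered[-1][1]  # carried over from the previous day
    for trigger_time, count in ordered:
        if trigger_time <= now:
            current = count
    return current
```

With the example values, `instances_at(time(9, 0))` yields 10 and `instances_at(time(21, 0))` yields 3.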

Metric-based auto scaling policy

In the Create Auto Scaling Policy panel, configure the following parameters, and then click OK.

Policy Type
  Description: Select Metric-based Auto Scaling Policy.
  Example: Metric-based Auto Scaling Policy

Policy Name
  Description: Enter a custom name for the policy.
  Example: demo

Trigger Condition
  Description: Select a metric:
    • CPU Utilization
    • Memory Usage
    • TCP Active Connections
    • Total TCP Connections
    • Application QPS
    • Application RT
    • Internet-facing CLB QPS
    • Internet-facing CLB RT
    • Internal-facing CLB QPS
    • Internal-facing CLB RT
  Note:
    • For information about how each metric is aggregated, see the description in the console.
    • You can add multiple metrics.
  Example: CPU Utilization

  Description: Specify a threshold for the metric. When the metric reaches the configured value, the scaling policy is triggered and application instances are scaled automatically.
  Example: 70%

Instances
  Description: Configure Minimum Application Instances, Maximum Application Instances, and Minimum Available Instances.
  Note: Minimum Available Instances is the minimum number of instances that remain available during each update. You can specify the value By Number or By Ratio.
  Example:
    • Minimum Application Instances: 6
    • Maximum Application Instances: 50
    • Minimum Available Instances: 3 (By Number)

Advanced Settings
  Description: (Optional) Configure the following settings as needed:
    • Scale-out Step Size: the maximum number of instances that can be added per unit time.
    • Scale-in Step Size: the maximum number of instances that can be removed per unit time.
    • Scale-out Stabilization Window: the interval used to stabilize scale-out decisions. When a scale-out is performed, the auto scaling algorithm uses the minimum number of expected instances calculated within this interval.
    • Scale-in Stabilization Window: the interval used to stabilize scale-in decisions. When a scale-in is performed, the auto scaling algorithm uses the maximum number of expected instances calculated within this interval.
    • Disable Scale-in: If you turn on this switch, application instances are never scaled in. This prevents business risks during peak hours. By default, the switch is turned off.
  Example: None
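The following sketch shows how the threshold, the instance limits, the step sizes, and the scale-in switch could interact in a single scaling decision. This topic does not describe the formula that SAE uses, so the proportional rule below (similar to the one used by the Kubernetes Horizontal Pod Autoscaler) is an assumption, and the function and parameter names are hypothetical.

```python
import math


def desired_instances(current, metric_value, target_value,
                      min_instances, max_instances,
                      scale_out_step=None, scale_in_step=None,
                      disable_scale_in=False):
    """Sketch of one scaling decision under an assumed proportional rule."""
    if current <= 0 or target_value <= 0:
        raise ValueError("current and target_value must be positive")

    # Assumed rule: size the fleet so that the per-instance metric
    # returns to the configured threshold.
    wanted = math.ceil(current * metric_value / target_value)

    # Step sizes cap how many instances are added or removed at a time.
    if wanted > current and scale_out_step is not None:
        wanted = min(wanted, current + scale_out_step)
    if wanted < current:
        if disable_scale_in:
            wanted = current
        elif scale_in_step is not None:
            wanted = max(wanted, current - scale_in_step)

    # The result always stays within the configured instance limits.
    return max(min_instances, min(max_instances, wanted))
```

With the example values (threshold 70%, 6 to 50 instances), 6 instances at 90% CPU utilization give `desired_instances(6, 90, 70, 6, 50)`, which evaluates to 8 under this assumed rule.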

Hybrid auto scaling policy

Note

Hybrid Auto Scaling Policy combines Scheduled Auto Scaling Policy and Metric-based Auto Scaling Policy.

  1. In the Create Auto Scaling Policy panel, specify the following parameters.

    1. Set Policy Type to Hybrid Scaling Policy, and enter a custom name in the Policy Name field.

    2. Configure Metric Settings. For more information, see Metric-based auto scaling policy.

    3. Optional: Click Advanced Settings and configure the following information as needed.

      Scale-out Step Size
        Description: The maximum number of instances that can be added per unit time.
        Example: 3

      Scale-in Step Size
        Description: The maximum number of instances that can be removed per unit time.
        Example: 2

      Scale-out Stabilization Window
        Description: The interval used to stabilize scale-out decisions. When a scale-out is performed, the auto scaling algorithm uses the minimum number of expected instances calculated within this interval.
        Example: 300 seconds

      Scale-in Stabilization Window
        Description: The interval used to stabilize scale-in decisions. When a scale-in is performed, the auto scaling algorithm uses the maximum number of expected instances calculated within this interval.
        Example: 300 seconds

      Prohibit Scale-in
        Description: If you turn on this switch, application instances are never scaled in. This prevents business risks during peak hours. By default, the switch is turned off.
        Example: Enable

    4. Configure Special Time Period Settings. For more information, see Scheduled auto scaling policy.

  2. Click Next: Preview Scheduled Auto Scaling Policy to view the number of instances for a specific time period, and then click OK.
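The stabilization windows in Advanced Settings can be illustrated with a short sketch. It follows the description above (the minimum of recent expected values for scale-out, the maximum for scale-in) and uses the 300-second example values. The class and method names are hypothetical, and the way SAE samples and stores expected values is not documented here, so treat this as an approximation.

```python
from collections import deque


class StabilizationWindow:
    """Keeps recent (timestamp, expected_instances) values, in seconds."""

    def __init__(self, scale_out_window=300, scale_in_window=300):
        self.scale_out_window = scale_out_window
        self.scale_in_window = scale_in_window
        self._history = deque()

    def decide(self, now, current, expected):
        """Record `expected` at time `now` and return the stabilized count."""
        self._history.append((now, expected))

        # Drop values that are older than the longer of the two windows.
        longest = max(self.scale_out_window, self.scale_in_window)
        while now - self._history[0][0] > longest:
            self._history.popleft()

        # Scale-out uses the minimum expected value within its window,
        # scale-in uses the maximum expected value within its window.
        out_value = min(e for t, e in self._history if now - t <= self.scale_out_window)
        in_value = max(e for t, e in self._history if now - t <= self.scale_in_window)

        if out_value > current:
            return out_value  # every recent value asks for more instances
        if in_value < current:
            return in_value   # every recent value asks for fewer instances
        return current
```

In this sketch, a single short spike does not trigger a scale-out if an earlier value inside the window was lower, and a scale-in happens only after every value inside the window has dropped below the current instance count.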

Verify whether the scaling policy has taken effect

Method 1

On the Basic Information page of the target application, click the Auto Scaling tab. If the number of running instances is the same as the required number of instances in the auto scaling policy, the scaling policy has taken effect.

Method 2

  1. Add an Internet CLB instance for the application.

    1. Log on to the SAE console. On the Basic Information page of the target application, click Add Internet CLB Access in the Application Access Settings area.

    2. In the Bind Internet CLB Instance panel, select Create CLB Instance (Pay-by-specification), configure the HTTP Port and Container Port, and click OK.

    3. After the Public Endpoint is added, copy it.

  2. Perform a stress test on the application.

    1. Log on to the PTS console. On the Overview page, enter the public endpoint in the format https://<public IP address>, and then click Test.

    2. In the Test Settings panel, enter the Requests Per Second (RPS), agree to the stress test terms by selecting The test is permitted and complies with the applicable laws and regulations, and click Start.

  3. Return to the SAE console. On the Basic Information page, click the Instance tab and check whether the number of application instances has increased automatically. If it has, the scaling policy has taken effect.

    Note

    After the stress test is complete, the application is automatically scaled in. The scale-in may take some time.
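If you only need a rough, small-scale check and do not want to set up a PTS test, a short script can generate a fixed request rate against the public endpoint. This is a minimal sketch rather than a replacement for PTS: the function names are hypothetical, the rate is approximate, and you should run it only against an application that you own.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url, timeout=5):
    """Send one GET request and return the HTTP status code, or None on error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except Exception:
        return None


def generate_load(url, rps, seconds, send=fetch_status):
    """Issue roughly `rps` requests per second for `seconds` seconds.

    Returns the list of results produced by `send`, one per request.
    """
    futures = []
    with ThreadPoolExecutor(max_workers=max(1, rps)) as pool:
        for _ in range(seconds):
            started = time.monotonic()
            futures.extend(pool.submit(send, url) for _ in range(rps))
            # Sleep for the rest of the one-second slot.
            time.sleep(max(0.0, 1.0 - (time.monotonic() - started)))
        return [f.result() for f in futures]
```

The `send` parameter makes it possible to try the function without network access, for example `generate_load("http://example.invalid", 3, 1, send=lambda url: 200)`.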

What to do next

  1. On the Basic Information page of the target application, click the Auto Scaling tab.

  2. On this tab, expand the Auto Scaling section. In the Actions column of the configured policy, you can enable, disable, edit, or delete the events that trigger the policy based on your business requirements.
