Serverless App Engine: Configure an auto scaling policy

Last Updated: Jan 30, 2024

O&M on traditional clusters often suffers from high system maintenance costs, slow scaling, difficult capacity planning, and low resource utilization. Auto scaling is an important O&M capability for distributed application management: when it is enabled, the number of instances is adjusted automatically based on the instance status. Serverless App Engine (SAE) provides an application monitoring feature that lets you monitor your resource usage, and multiple types of auto scaling policies for which you select metrics and specify thresholds based on your business requirements. This helps you increase resource utilization and reduce resource costs.

Prerequisites

An application is deployed. For more information, see Deploy a demo application on SAE.

Background information

SAE supports the following methods to scale your applications:

  • Manual scaling: If you need to immediately scale your applications, we recommend that you use the manual scaling feature. For example, you can add instances when unexpected traffic surges occur. For more information, see Manual scaling.

  • Auto scaling: If you do not need to immediately scale your applications, we recommend that you use the auto scaling feature. For example, you can configure auto scaling policies to automatically add or remove instances based on periodic traffic changes. SAE supports the following types of auto scaling policies: scheduled auto scaling policy, metric-based auto scaling policy, and hybrid auto scaling policy. For more information, see Best practices for SAE auto scaling.

The following figure shows the process of configuring an auto scaling policy.

[Figure: process of configuring an auto scaling policy]

Scenarios

SAE supports the following auto scaling policies:

  • Scheduled auto scaling policies are suitable for scenarios in which an application needs to use resources within a specific period of time. Scheduled auto scaling policies are commonly used in industries such as securities, healthcare, public administration, and education.

  • Metric-based auto scaling policies are suitable for applications whose resource usage is driven by burst traffic and periodic traffic. Metric-based auto scaling policies are commonly used in industries such as Internet, gaming, and social media.

  • Hybrid auto scaling policies are suitable for applications that need resources within a specific period of time and also experience burst traffic and periodic traffic during that period. Hybrid auto scaling policies are commonly used in industries such as Internet, education, and catering.

Usage notes

  • You can create up to five scheduled auto scaling policies, one metric-based auto scaling policy, or one hybrid auto scaling policy. The three types of auto scaling policies cannot be used at the same time.

  • If an auto scaling policy is enabled for an application, you cannot manage the lifecycle of the application. For example, you cannot scale, deploy (including single-batch release, phased release, and canary release), stop, or restart the application, or change the instance type. If you want to perform the preceding operations, you must disable the auto scaling policy.

  • If you manage the lifecycle of an application, you can create or enable an auto scaling policy for the application only after you complete the management process.

  • Up to 50 instances can be deployed for a single application. To increase the quota, join the DingTalk group 32874633 and apply to be added to a whitelist.
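
The limits in the notes above can be checked before you open the console. The following is a minimal sketch, not an SAE API: the function name `check_policy_set`, the type labels `scheduled`, `metric`, and `hybrid`, and the reading that an application uses only one policy type at a time are all assumptions made for illustration.

```python
from collections import Counter

# Per-type limits taken from the usage notes; the labels are hypothetical.
LIMITS = {"scheduled": 5, "metric": 1, "hybrid": 1}
DEFAULT_INSTANCE_QUOTA = 50  # default quota for a single application


def check_policy_set(policy_types, max_instances):
    """Return a list of problems found in a planned set of policies.

    policy_types: list of type labels, one entry per planned policy.
    max_instances: the largest instance count any policy may reach.
    """
    problems = []
    counts = Counter(policy_types)

    for kind in counts:
        if kind not in LIMITS:
            problems.append(f"unknown policy type: {kind}")

    # Assumption: the three policy types are mutually exclusive.
    used = [kind for kind in LIMITS if counts[kind] > 0]
    if len(used) > 1:
        problems.append("policy types cannot be mixed: " + ", ".join(used))

    for kind in used:
        if counts[kind] > LIMITS[kind]:
            problems.append(
                f"too many {kind} policies: {counts[kind]} > {LIMITS[kind]}"
            )

    if max_instances > DEFAULT_INSTANCE_QUOTA:
        problems.append(
            f"{max_instances} instances exceeds the default quota of "
            f"{DEFAULT_INSTANCE_QUOTA}"
        )
    return problems


if __name__ == "__main__":
    print(check_policy_set(["scheduled"] * 5, 10))
    print(check_policy_set(["scheduled", "metric"], 10))
    print(check_policy_set(["metric", "metric"], 60))
```

An empty list means the planned set stays within the documented limits. If your account has been granted a higher instance quota, adjust `DEFAULT_INSTANCE_QUOTA` accordingly.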

View the metrics of application instances (invitational preview)

If the auto scaling feature is disabled, you can view the metrics of application instances and determine whether to configure auto scaling policies. If the auto scaling feature is enabled, you can view the metrics of application instances and compare the values of the metrics before and after auto scaling is enabled. This helps you evaluate the effects of auto scaling, and adjust the alert rules and the maximum and minimum numbers of instances accordingly.

Note

The application instance trend chart is in invitational preview. If you want to use the feature, join the DingTalk group 32874633 and apply to be added to a whitelist.

  1. Log on to the SAE console.

  2. In the left-side navigation pane, click Applications. In the top navigation bar, select a region. Then, click the name of an application.

  3. In the Application Instance Trend Chart section on the Basic Information tab, select a dimension based on your business requirements.

    You can modify the selected time range on the timeline below the trend chart to view the data based on your business requirements. You can also click a legend in the lower-left corner of the chart to view data of a specific metric.

    Note

    SAE allows you to query only metric data of the previous seven days to monitor application instances.

Configure an auto scaling policy

Warning

To prevent unpredictable errors such as business interruptions during scaling operations, we recommend that you configure or enable auto scaling policies based on your business scenario.

  1. Log on to the SAE console.

  2. In the left-side navigation pane, click Applications. In the top navigation bar, select a region. Then, click the name of an application.

  3. In the upper-right corner of the Basic Information page, click Auto Scaling. The Create Auto Scaling Policy panel appears.

  4. In the Create Auto Scaling Policy panel, configure the parameters.

    • Scheduled Scaling Policy

      Parameter

      Description

      Policy Type

      Select Scheduled Scaling Policy.

      Policy Name

      Enter a custom name for the policy.

      Select Time

      • Short-term: This setting is suitable for scenarios in which you need to specify a start date and an end date for a scheduled auto scaling policy.
      • Long-term: This setting is suitable for scenarios in which you do not need to specify an end date for a scheduled auto scaling policy.

      Cycle

      The frequency at which the scheduled auto scaling policy is executed.

      • Daily: The scheduled auto scaling policy is executed during a specific period of time every day.
      • Weekly: The scheduled auto scaling policy is executed during a specific period of time on the specified number of days per week.
      • Monthly: The scheduled auto scaling policy is executed during a specific period of time on the specified dates per month.

      Trigger Time on Single Day

      The point in time at which the scheduled auto scaling policy is triggered and the number of application instances that are retained during the corresponding period of time. Example:

      In this example, the Cycle parameter is set to Daily, one start time is set to 08:00 and the corresponding number of retained instances is set to 10, and another start time is set to 20:00 and the corresponding number of retained instances is set to 3. After the scheduled auto scaling policy is enabled, SAE retains 10 instances from 08:00 to 20:00 on a day and retains 3 instances from 20:00 of the day to 08:00 of the next day based on the specified trigger points in the scheduled auto scaling policy.

      Important
      • The interval between two consecutive trigger points that are specified in the Trigger Time on Single Day section of a scheduled auto scaling policy must be greater than 5 minutes. If you configure multiple scheduled auto scaling policies, SAE checks whether the interval between two consecutive trigger points within the same date is greater than 5 minutes.

      • If you configure multiple scheduled auto scaling policies and two trigger points overlap, the system uses the newer trigger point. You must make sure that the interval is greater than 5 minutes when you configure scheduled auto scaling policies.

    • Metric-based Auto Scaling Policy

      Parameter

      Description

      Policy Type

      Select Metric-based Auto Scaling Policy.

      Policy Name

      Enter a custom name for the policy.

      Trigger Conditions

      Select one or more of the following metrics. By default, the CPU Utilization and Memory Usage metrics are displayed. You can click Add to add more metrics.

      • CPU Utilization: the average CPU utilization per instance.
      • Memory Usage: the average memory usage per instance.
      • TCP Active Connections: the average number of active TCP connections within 30 seconds per instance.
      • Total TCP Connections: the average total number of TCP connections within 30 seconds per instance.
      • Application QPS: the average queries per second (QPS) within 1 minute per instance.
      • Application RT: the average response time (RT) of all API operations in the application per minute.
      • Internet-facing CLB QPS: the average QPS of the Internet-facing Classic Load Balancer (CLB) instance within 15 seconds per instance.
      • Internet-facing CLB RT: the average RT of the Internet-facing CLB instance within 15 seconds.
      • Internal-facing CLB QPS: the average QPS of the internal-facing CLB instance within 15 seconds per instance.
      • Internal-facing CLB RT: the average RT of the internal-facing CLB instance within 15 seconds.
      Note
      • You can specify the Application QPS or Application RT metric only if the data of the Application QPS or Application RT metric in the previous 30 minutes is displayed on the Application Monitoring page.
      • If one of the values of the specified metrics is greater than or equal to the specified limit, the application instances are scaled out. The number of instances after the scale-out operation is less than or equal to the value of the Maximum Application Instances parameter. If the values of all specified metrics are less than the limits, the application instances are scaled in. The number of instances after the scale-in operation is greater than or equal to the value of the Minimum Application Instances parameter.
      • Before you specify CLB-related metrics, you must bind a CLB instance to the application and enable the access log management feature. You must specify CLB ports to which you want to listen over HTTP or HTTPS. For more information, see Bind an SLB instance to an application and Enable the access log management feature.

      Instances

      Configure the Minimum Application Instances and Maximum Application Instances parameters. Alternatively, use the slider to select a value range for the number of instances.

      • Minimum Application Instances: the minimum number of instances after a scale-in operation is performed if the specified scale-in conditions are met.
      • Maximum Application Instances: the maximum number of instances after a scale-out operation is performed if the specified scale-out conditions are met.
      • Minimum Available Instances: the minimum number of available instances for each deployment. The value must be greater than or equal to 0 and less than or equal to the total number of instances in the current application. This parameter is available only if the value of the Minimum Application Instances parameter is less than the value of the Minimum Available Instances parameter. Click Settings. In the Specify Minimum Available Instances dialog box, specify the minimum number of available instances.
      Note
      The maximum numbers of instances that can be specified in Manual Scaling and Create Auto Scaling Policy are synchronized. To ensure business continuity, make sure that at least one instance is available during application deployment and rollback. If you set the Minimum Available Instances parameter to 0, business interruptions occur when the application is upgraded.

      Advanced Settings

      • Scale-out Step Size: the maximum number of instances that can be added per unit time.
      • Scale-in Step Size: the maximum number of instances that can be removed per unit time.
      • Scale-out Stabilization Window: the period of time during which the system is stable. The auto scaling algorithm is used to ensure that the minimum number of expected instances calculated within the specified interval is used when a scale-out operation is performed.
      • Scale-in Stabilization Window: the period of time during which the system is stable. The auto scaling algorithm is used to ensure that the maximum number of expected instances calculated within the specified interval is used when a scale-in operation is performed.
      • Disable Scale-in: If you turn on this switch, the application instances are never scaled in. This prevents business risks during peak hours. By default, the switch is turned off.
    • Hybrid Scaling Policy

      Parameter

      Description

      Policy Type

      Select Hybrid Scaling Policy.

      Policy Name

      Enter a custom name for the policy.

      Trigger Conditions

      Select one or more of the following metrics. By default, the CPU Utilization and Memory Usage metrics are displayed. You can click Add to add more metrics.

      • CPU Utilization: the average CPU utilization per instance.
      • Memory Usage: the average memory usage per instance.
      • TCP Active Connections: the average number of active TCP connections within 30 seconds per instance.
      • Total TCP Connections: the average total number of TCP connections within 30 seconds per instance.
      • Application QPS: the average queries per second (QPS) within 1 minute per instance.
      • Application RT: the average response time (RT) of all API operations in the application per minute.
      • Internet-facing CLB QPS: the average QPS of the Internet-facing Classic Load Balancer (CLB) instance within 15 seconds per instance.
      • Internet-facing CLB RT: the average RT of the Internet-facing CLB instance within 15 seconds.
      • Internal-facing CLB QPS: the average QPS of the internal-facing CLB instance within 15 seconds per instance.
      • Internal-facing CLB RT: the average RT of the internal-facing CLB instance within 15 seconds.
      Note
      • You can specify the Application QPS or Application RT metric only if the data of the Application QPS or Application RT metric in the previous 30 minutes is displayed on the Application Monitoring page.
      • If one of the values of the specified metrics is greater than or equal to the specified limit, the application instances are scaled out. The number of instances after the scale-out operation is less than or equal to the value of the Maximum Application Instances parameter. If the values of all specified metrics are less than the limits, the application instances are scaled in. The number of instances after the scale-in operation is greater than or equal to the value of the Minimum Application Instances parameter.
      • Before you specify CLB-related metrics, you must bind a CLB instance to the application and enable the access log management feature. You must specify CLB ports to which you want to listen over HTTP or HTTPS. For more information, see Bind an SLB instance to an application and Enable the access log management feature.

      Maximum and Minimum Application Instances

      Default Settings

      The maximum and minimum numbers of application instances.

      • Minimum Application Instances: the minimum number of instances after a scale-in operation is performed if the specified scale-in conditions are met.
      • Maximum Application Instances: the maximum number of instances after a scale-out operation is performed if the specified scale-out conditions are met.
      • Minimum Available Instances: the minimum number of available instances for each deployment. The value must be greater than or equal to 0 and less than or equal to the total number of instances in the current application. This parameter is available only if the value of the Minimum Application Instances parameter is less than the value of the Minimum Available Instances parameter. Click Settings. In the Specify Minimum Available Instances dialog box, specify the minimum number of available instances.
      Note
      The maximum numbers of instances that can be specified in Manual Scaling and Create Auto Scaling Policy are synchronized. To ensure business continuity, make sure that at least one instance is available during application deployment and rollback. If you set the Minimum Available Instances parameter to 0, business interruptions occur when the application is upgraded.

      Special Time Settings

      The maximum and minimum numbers of application instances within a specific period of time. By default, the Select Time 1 section is displayed. Click + Add Special Time Period and configure the parameters in the Select Time 2 section that appears.

      • Short-term: This setting is suitable for scenarios in which you need to specify a start date and an end date for a scheduled auto scaling policy.
      • Long-term: This setting is suitable for scenarios in which you do not need to specify an end date for a scheduled auto scaling policy.
      Note
      • You can create up to two Select Time configurations in the Special Time Settings section.

      • In a Select Time configuration of a hybrid auto scaling policy, you can specify up to 20 special periods of time within the same day. Make sure that the specified periods of time within the same day do not overlap.

      Cycle

      Select Daily, Weekly, or Monthly from the Cycle drop-down list.

      • Daily: The scheduled auto scaling policy is executed during a specific period of time every day.
      • Weekly: The scheduled auto scaling policy is executed during a specific period of time on the specified number of days per week.
      • Monthly: The scheduled auto scaling policy is executed during a specific period of time on the specified dates per month.

      Trigger Time on Single Day

      The period of time during which the hybrid auto scaling policy takes effect, and the maximum and minimum numbers of application instances during this period. Example:

      In this example, the Cycle parameter is set to Daily, the Special Time Period parameter is set to 08:00 to 20:00, the Minimum Application Instances parameter is set to 3, and the Maximum Application Instances parameter is set to 10. After the hybrid auto scaling policy is enabled, SAE scales the application between 3 and 10 instances from 08:00 to 20:00 every day based on the settings in the Trigger Conditions section.

      Advanced Settings

      • Scale-out Step Size: the maximum number of instances that can be added per unit time.
      • Scale-in Step Size: the maximum number of instances that can be removed per unit time.
      • Scale-out Stabilization Window: the period of time during which the system is stable. The auto scaling algorithm is used to ensure that the minimum number of expected instances calculated within the specified interval is used when a scale-out operation is performed.
      • Scale-in Stabilization Window: the period of time during which the system is stable. The auto scaling algorithm is used to ensure that the maximum number of expected instances calculated within the specified interval is used when a scale-in operation is performed.
      • Disable Scale-in: If you turn on this switch, the application instances are never scaled in. This prevents business risks during peak hours. By default, the switch is turned off.
      Note
      • Multiple special periods of time in a hybrid auto scaling policy cannot overlap.

      • A trigger point in a hybrid auto scaling policy cannot be earlier than the current system time. If you specify a trigger point that is earlier than the current system time, the hybrid auto scaling policy takes effect in the next cycle.

  5. Click Next: Preview Scheduled Scaling Policy to view the policy settings. Then, click Confirm.

  6. After you configure the policy, go to the Auto Scaling section on the Instance Deployment Information tab. Find the policy that you configured and click Enable in the Actions column.

  7. In the Enable Auto Scaling Policy message, click OK.

    If the value that is displayed in the Status column changes to Enabled, the policy is enabled.
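
The scale-out and scale-in rule described in the Trigger Conditions notes of the procedure above can be sketched as follows. This is an illustration of the documented rule only, not the algorithm that SAE runs: the names `scaling_direction` and `next_instance_count` are hypothetical, and the idea that each scaling round moves by at most one step size is an assumption based on the Scale-out Step Size and Scale-in Step Size descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    """Minimum and Maximum Application Instances of a policy."""
    minimum: int
    maximum: int


def scaling_direction(values, limits):
    """Apply the documented rule to one sample of metric values.

    values: mapping of metric name to observed value.
    limits: mapping of metric name to the configured limit.
    Returns "out" if any metric is greater than or equal to its limit,
    and "in" if all metrics are below their limits.
    """
    if any(values[name] >= limit for name, limit in limits.items()):
        return "out"
    return "in"


def next_instance_count(current, direction, bounds, step=1):
    """Move by at most `step` instances and stay inside the bounds."""
    if direction == "out":
        return min(current + step, bounds.maximum)
    return max(current - step, bounds.minimum)


if __name__ == "__main__":
    bounds = Bounds(minimum=3, maximum=10)
    limits = {"cpu": 70, "memory": 80}

    busy = scaling_direction({"cpu": 75, "memory": 40}, limits)
    print(busy, next_instance_count(9, busy, bounds, step=2))

    idle = scaling_direction({"cpu": 30, "memory": 40}, limits)
    print(idle, next_instance_count(4, idle, bounds, step=2))
```

The sketch ignores the stabilization windows and the Disable Scale-in switch, both of which can delay or suppress a change that this rule alone would make.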

Verify an auto scaling policy

After an auto scaling policy is enabled for an application, SAE starts to monitor the application status. If the trigger conditions are met, SAE automatically scales the application based on the policy. You can perform the following steps to check whether the auto scaling policy has taken effect.

  1. Log on to the SAE console.

  2. In the left-side navigation pane, click Applications. In the top navigation bar, select the region in which your application is deployed. On the Applications page, find the application for which the auto scaling policy is configured.

    You can check the status of the auto scaling policy in the Scaling Policy Status and Current Instances/Expected Instances columns. You can also click the application name to go to the Basic Information tab and view the Running Instances parameter in the Application Information section.

    If the value of the Running Instances parameter is the same as the required number of instances in the auto scaling policy, the policy has taken effect.

  3. Optional. In the left-side navigation pane, click Application Events. In the All Source Types drop-down list, select Auto Scaling (HorizontalPodAutoscaler) to view the causes of auto scaling events.
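
For a scheduled auto scaling policy with a Daily cycle, the required number of instances at a given time can be worked out from the trigger points and then compared with the Running Instances value. The following is a minimal sketch that uses the example from the Trigger Time on Single Day description (08:00 with 10 instances, 20:00 with 3 instances). The function name `retained_instances` is hypothetical, and the sketch does not cover Weekly or Monthly cycles or overlapping policies.

```python
from datetime import time


def retained_instances(triggers, now):
    """Return the instance count that applies at `now`.

    triggers: list of (start_time, instance_count) pairs for one day.
    now: a datetime.time value.
    The most recent trigger at or before `now` wins. Before the first
    trigger of the day, the last trigger of the previous day still applies.
    """
    ordered = sorted(triggers)
    count = ordered[-1][1]  # carried over from the previous day
    for start, instances in ordered:
        if start <= now:
            count = instances
    return count


if __name__ == "__main__":
    policy = [(time(8, 0), 10), (time(20, 0), 3)]
    for moment in (time(2, 0), time(8, 0), time(12, 0), time(23, 0)):
        print(moment.strftime("%H:%M"), retained_instances(policy, moment))
```

If the value printed for the current time differs from Running Instances, check the Application Events page for the cause before you change the policy.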

What to do next

After you configure an auto scaling policy for an application, you can view the details of the policy on the Instance Deployment Information tab.

  1. Log on to the SAE console.

  2. In the left-side navigation pane, click Applications. In the top navigation bar, select a region. Then, click the name of an application.

  3. On the Basic Information page, click the Instance Deployment Information tab. On this tab, expand the Auto Scaling section. In the Actions column of the configured policy, enable, disable, modify, or delete the policy based on your business requirements.