AHPA Elastic Prediction Best Practices

In cloud-native scenarios, resource capacity is often hard to estimate, and the native Kubernetes HPA tends to suffer from scaling lag and complex configuration. AHPA (Advanced Horizontal Pod Autoscaler) elastic prediction, launched by Alibaba Cloud Container Service together with the Decision Intelligence time series team of DAMO Academy, automatically identifies elastic periods and predicts capacity from historical business metrics, so that you can plan scaling in advance and avoid scaling lag. How should AHPA be configured to get the most out of it? This article presents best practices for AHPA elastic prediction, covering the following topics:

• Introduction to AHPA elastic prediction

• Metric sources and AHPA configuration parameters

• Application of boundary protection

• What noise reduction and the algorithm quantile can do

• Typical scenario: from HPA to AHPA

• Typical scenario: elastic recommendation

Introduction to AHPA elastic prediction

Why do we need elastic prediction? First, applications suffer from cold starts. Bringing up an application involves resource scheduling, image pulling, container creation, container startup, and application startup. Even after IaaS resource allocation, Kubernetes scheduling, and image pulling have been addressed, the application's own startup time remains, and it can range from milliseconds to minutes. That time is determined entirely by the business code and can hardly be controlled by the underlying platform.

In addition, general-purpose scaling solutions face several problems. Scaling lags behind demand, which is a stability risk. Capacity cannot be estimated: too few instances are insufficient and too many are wasteful. Configuration is rigid, cumbersome, and hard to use. For example, with scheduled scaling through CronHPA you must work out how much capacity to add or remove in each time period, and readjust it whenever the business changes.

The core problem of elasticity is this: keep the business stable while improving resource utilization.

If we could predict how many resources will be needed in the future from historical data and machine learning algorithms, we could avoid the problems above. What is needed to realize this idea?

First, there must be historical metric data, which is the precondition for prediction. Second, a prediction algorithm is needed to estimate how many resources will be required in the future. Finally, the result must take effect on the workload. I call these three elements the elastic prediction triple. The goal is to warm up resources in advance, plan scaling automatically, and support degradation to ensure stability.
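The triple (historical metrics, a prediction, a result applied to the workload) can be illustrated with a minimal Python sketch. It is not AHPA's algorithm: it substitutes a simple seasonal-naive forecast that averages each time slot across past periods, and the function names, the QPS numbers, and the `per_pod_capacity` value are assumptions made for illustration only.

```python
import math
from statistics import mean


def forecast_next_period(history, period):
    """Seasonal-naive forecast: average each slot over the complete past periods."""
    full_periods = len(history) // period
    if full_periods == 0:
        raise ValueError("need at least one full period of history")
    recent = history[-full_periods * period:]
    return [
        mean(recent[k * period + slot] for k in range(full_periods))
        for slot in range(period)
    ]


def replicas_for(load, per_pod_capacity):
    """Convert a predicted load into a replica count (at least one pod)."""
    return max(1, math.ceil(load / per_pod_capacity))


# 1. Historical metric: two "days" of QPS, four slots per day (made-up numbers).
qps_history = [100, 400, 900, 300,
               120, 440, 860, 340]

# 2. Prediction: expected QPS for each slot of the next day.
predicted_qps = forecast_next_period(qps_history, period=4)

# 3. Effect on the workload: the replica plan a controller could apply ahead of time.
planned_replicas = [replicas_for(q, per_pod_capacity=200) for q in predicted_qps]
print(planned_replicas)
```

Because the plan is known before the traffic arrives, pods can be started early enough to absorb the application's startup time.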

The solution is implemented as follows:

• Rich metrics: CPU, memory, QPS, RT (response time), and external metrics.

• Stability guarantee: AHPA's scaling logic combines proactive warm-up with a passive fallback, together with degradation protection, to keep resources stable.

• Proactive prediction: forecasts the trend over an upcoming period from historical data. It is suited to applications with periodic load.

• Passive prediction: predicts in real time. For sudden traffic bursts, resources are prepared in real time through passive prediction.

• Degradation protection: supports configuring multiple time ranges, each with a maximum and minimum number of instances.

• Multiple scaling methods: AHPA supports scaling through Knative, HPA, and Deployment:

• Knative: addresses cold starts when scaling on concurrency, QPS, or RT in serverless application scenarios.

• HPA: simplifies HPA scaling policy configuration, lowers the barrier to using autoscaling, and addresses cold starts when using HPA.

• Deployment: scales the Deployment directly, automatically expanding and shrinking capacity.
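How proactive prediction, passive prediction, and degradation protection could fit together is sketched below in Python. This is an illustration under stated assumptions, not AHPA's implementation: taking the larger of the two estimates and clamping it to fixed bounds is an assumed combination rule, and `desired_replicas` with all of its parameters is a hypothetical name.

```python
import math


def desired_replicas(predicted_load, current_load, per_pod_capacity,
                     min_replicas, max_replicas):
    """Combine a proactive and a passive estimate, then clamp to the bounds."""
    # Proactive: replicas needed for the forecast load (periodic traffic).
    proactive = math.ceil(predicted_load / per_pod_capacity)
    # Passive: replicas needed for the load observed right now (bursts).
    passive = math.ceil(current_load / per_pod_capacity)
    # Assumed rule: never provision less than either estimate asks for.
    wanted = max(proactive, passive)
    # Degradation protection: stay inside the configured bounds.
    return min(max(wanted, min_replicas), max_replicas)


# Periodic peak expected soon: the forecast drives the scale-out.
print(desired_replicas(900, 300, 100, min_replicas=2, max_replicas=20))
# Unforeseen burst: the real-time load drives the scale-out.
print(desired_replicas(300, 1500, 100, min_replicas=2, max_replicas=20))
```

Taking the maximum is a conservative choice: it favors stability over utilization whenever the forecast and the live load disagree.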


With AHPA, prediction takes milliseconds and scaling takes seconds, and the recognition rate for complex periodic patterns can exceed 95%. AHPA is also designed for robustness and supports boundary protection configured at minute granularity.

AHPA Best Practice Configuration

Metric source configuration

First, the metric source. The metric source that AHPA reads from is configured through a ConfigMap.

AdvancedHorizontalPodAutoscaler Configuration

In AHPA, the AdvancedHorizontalPodAutoscaler resource configures the scaling policy, the target object, and the time boundaries for scaling out and in.

Metric preprocessing

Because metric data varies in quality, AHPA preprocesses it, which includes deduplication, gap filling, and cleaning. For example, CPU utilization is often high while an application is starting. In a Java application, class loading is CPU intensive. This load is not caused by business traffic, so such samples should not be fed into the forecast and need to be cleaned out. Metric preprocessing removes this source of prediction interference.
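The three preprocessing steps can be sketched in Python as follows. This is a simplified illustration, not AHPA's code: it assumes samples are `(timestamp_seconds, cpu_percent)` pairs aligned to a fixed scrape interval, that duplicates are resolved by keeping the last value seen, that startup noise is removed with a fixed warm-up window, and that gaps are filled by carrying the previous value forward. The function and parameter names are hypothetical.

```python
def preprocess(samples, start_time, warmup_seconds, step):
    """Deduplicate, drop startup samples, and forward-fill gaps on a fixed grid."""
    # 1. Deduplication: keep the last value reported for each timestamp.
    by_ts = {}
    for ts, value in samples:
        by_ts[ts] = value

    # 2. Cleaning: discard samples taken while the application was starting.
    ready_at = start_time + warmup_seconds
    cleaned = {ts: v for ts, v in by_ts.items() if ts >= ready_at}
    if not cleaned:
        return []

    # 3. Gap filling: walk the grid and carry the previous value forward.
    first, last = min(cleaned), max(cleaned)
    result = []
    value = cleaned[first]
    for ts in range(first, last + 1, step):
        value = cleaned.get(ts, value)
        result.append((ts, value))
    return result


raw = [
    (0, 95), (30, 90),   # class loading during startup, not business load
    (60, 40), (60, 42),  # duplicate timestamp
    (90, 45),
    (150, 50),           # the sample at t=120 is missing
]
print(preprocess(raw, start_time=0, warmup_seconds=60, step=30))
```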

Flexible use of boundary protection configuration

In AHPA, you can set the maximum and minimum number of instances for different time periods, so that a fallback is still in place when metrics are abnormal or the elastic prediction is inaccurate.
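The idea of time-dependent bounds as a fallback can be sketched in Python. The schedule, the default bounds, and the rule of falling back to the minimum when no prediction is available are all assumptions for illustration; they do not reproduce AHPA's configuration format or behavior.

```python
from datetime import time

# Hypothetical schedule: (start, end, min_replicas, max_replicas), end exclusive.
BOUNDS = [
    (time(8, 0), time(20, 0), 5, 50),  # business hours
]
DEFAULT_BOUNDS = (2, 10)               # all other times


def bounds_at(now):
    """Return the (min, max) replica bounds that apply at the given time of day."""
    for start, end, lo, hi in BOUNDS:
        if start <= now < end:
            return lo, hi
    return DEFAULT_BOUNDS


def protect(predicted_replicas, now):
    """Clamp a prediction to the active bounds; use the minimum if none exists."""
    lo, hi = bounds_at(now)
    if predicted_replicas is None:     # metrics missing or prediction failed
        return lo
    return min(max(predicted_replicas, lo), hi)


print(protect(200, time(12, 0)))   # an inflated prediction is capped
print(protect(None, time(3, 0)))   # no prediction: fall back to the minimum
```

With bounds like these, a wrong forecast can cost some extra instances or some headroom, but it cannot scale the workload to zero or without limit.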
