By Yuanyi and Zibai
In the era of cloud-native containers, users face a wide range of business scenarios, including periodic workloads and serverless on-demand use. With automatic scaling, two problems stand out: elastic lag and cold start. These challenges led the Alibaba Cloud-Native team and the Decision Intelligent Timing Team of the Alibaba DAMO Academy to jointly develop the Advanced Horizontal Pod Autoscaler (AHPA), an elasticity prediction component. Its core idea is to do "timing planning" based on the detected cycle and to scale out ahead of time according to that plan, so that you can use resources on demand while keeping the business stable.
Expectations for cloud-native elasticity have been rising for two reasons. The first is the rise of cloud-native concepts: from the VM era to the container era, cloud usage patterns have changed. The second is the rise of new business models that are built on the cloud from the start and naturally demand elasticity.
With the cloud, users no longer need to build infrastructure from physical servers and data centers. The biggest advantage of the cloud is its flexible resource supply, and in the cloud-native era the demand for elasticity keeps getting stronger. In the VM era, scaling happened at the minute level and was often driven by manual operation. In the container era, it is expected at the second level. Users are facing different business scenarios, and their expectations and requirements for the cloud are changing:
So can the existing elastic schemes in Kubernetes solve the problems in the preceding scenarios?
Generally, there are three ways to manage the number of application instances in Kubernetes: a fixed number of instances, HPA, and CronHPA. The most common approach is a fixed number of instances, and its biggest problem is that it wastes resources during off-peak hours. HPA addresses the waste, but it is triggered only after the metrics have already changed, so resource supply lags behind demand and business stability may suffer. CronHPA scales on a schedule, which seems to solve the lag, but it raises two questions: how fine should the schedule granularity be, and does the schedule have to be adjusted manually whenever the traffic pattern changes? If it does, O&M becomes more complex and more error-prone.
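The lag described above can be illustrated with a minimal sketch. It uses the replica formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), with the default 10% tolerance. The simulation loop, the load numbers, and the one-step delay are simplifying assumptions for illustration only; they are not taken from the article or from any real controller.

```python
import math


def hpa_desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Replica count from the documented HPA formula (reactive scaling)."""
    ratio = current_metric / target_metric
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def simulate(loads, target_per_pod, start_replicas):
    """Hypothetical simulation: a scaling decision only takes effect one step later."""
    replicas = start_replicas
    history = []
    for load in loads:
        per_pod = load / replicas           # load each pod sees right now
        history.append((replicas, per_pod))
        # The decision is made from the metric just observed, so it is always late.
        replicas = max(1, hpa_desired_replicas(replicas, per_pod, target_per_pod))
    return history


if __name__ == "__main__":
    for step, (replicas, per_pod) in enumerate(simulate([100, 100, 400, 400], 50, 2)):
        print(f"step {step}: replicas={replicas}, load per pod={per_pod}")
```

In this run the load jumps at step 2, but the pods are still sized for the old load at that step and run at four times their target. The extra replicas only arrive at step 3. Predictive scaling aims to close exactly this gap.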
The core idea of AHPA (Advanced Horizontal Pod Autoscaler) elasticity prediction is to do "timing planning" based on the detected period and to scale out in advance according to that plan. However, a plan can miss things, so the planned number of instances must also be adjustable in real time. AHPA therefore combines two elastic strategies: active prediction and passive prediction. Active prediction uses the RobustPeriod algorithm of the DAMO Academy [1] to identify the cycle length and then uses the RobustSTL algorithm [2] to extract the periodic trend, which lets it proactively predict the number of instances the application needs in the next cycle. Passive prediction sets the number of instances based on the application's real-time data to cope with bursts of traffic. In addition, AHPA adds a fallback protection policy in which users set the upper and lower bounds of the number of instances. The number of instances that finally takes effect is the maximum of the values given by active prediction, passive prediction, and the fallback policy.
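The final selection rule can be sketched as follows. This is a minimal illustration, not AHPA's implementation: the function name and parameters are hypothetical, and capping the result at the user-defined upper bound is an assumption, since the text only states that the maximum of the three values takes effect and that users can set both bounds.

```python
def ahpa_effective_replicas(proactive, reactive, min_replicas, max_replicas):
    """Combine the three strategies described above (hypothetical helper).

    proactive    -- replicas planned from the detected periodic trend
    reactive     -- replicas derived from real-time metrics
    min_replicas -- user-defined lower bound (fallback protection)
    max_replicas -- user-defined upper bound (capping here is an assumption)
    """
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")
    # The largest of the three values wins, so no single strategy can under-provision.
    wanted = max(proactive, reactive, min_replicas)
    return min(wanted, max_replicas)


if __name__ == "__main__":
    # A burst pushes the reactive value above the plan: the reactive value wins.
    print(ahpa_effective_replicas(proactive=6, reactive=9, min_replicas=2, max_replicas=20))
    # Quiet period: the plan and the metrics are both low, so the lower bound applies.
    print(ahpa_effective_replicas(proactive=1, reactive=1, min_replicas=2, max_replicas=20))
```

Taking the maximum favors stability over cost: a missed prediction is covered by the real-time value, and a lagging real-time value is covered by the plan.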
Elasticity is only useful if the business stays stable. The core purpose of Auto Scaling is not only to help users save costs but also to enhance overall business stability, reduce O&M effort, and build core competitiveness. The basic principles of AHPA architecture design include:
The following figure shows the architecture:
Stability Assurance: The elastic logic of AHPA combines proactive warm-up with a passive fallback, together with degradation protection, to keep resource supply stable.
Multiple Scaling Methods: AHPA supports Knative, HPA, and Deployment:
AHPA is ideal for scenarios including:
After AHPA is enabled, a visualization page shows its effects. Here is an example of a prediction based on CPU metrics (compared with HPA):
The results show that AHPA can use predictive scaling to handle fluctuating workloads as expected.
To learn more, visit the documentation of Alibaba Cloud Container Service for AHPA elastic prediction at this link: https://www.alibabacloud.com/help/en/container-service-for-kubernetes/latest/ahpa
[1] Qingsong Wen, Kai He, Liang Sun, Yingying Zhang, Min Ke, and Huan Xu. RobustPeriod: Robust Time-Frequency Mining for Multiple Periodicity Detection, in Proc. of 2021 ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD 2021), Xi'an, China, Jun. 2021.
[2] Qingsong Wen, Jingkun Gao, Xiaomin Song, Liang Sun, Huan Xu, and Shenghuo Zhu. RobustSTL: A Robust Seasonal-Trend Decomposition Algorithm for Long Time Series, in Proc. of the 33rd AAAI Conference on Artificial Intelligence (AAAI 2019), Honolulu, Hawaii, Jan. 2019, pp. 5409-5416.
[3] Qingsong Wen, Zhe Zhang, Yan Li, and Liang Sun. Fast RobustSTL: Efficient and Robust Seasonal-Trend Decomposition for Time Series with Complex Patterns, in Proc. of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD 2020), San Diego, CA, Aug. 2020.