Resource demand is difficult to predict in cloud-native scenarios. The Horizontal Pod Autoscaler (HPA) provided by Kubernetes reacts only after resource usage changes, which introduces a scaling delay, and its configuration is complex. To resolve these issues, ACS supports predictive scaling based on Advanced Horizontal Pod Autoscaler (AHPA). AHPA automatically learns the pattern of workload fluctuations and predicts resource demand from historical metric data to help you implement predictive scaling. This topic describes the business architecture, advantages, and scenarios of AHPA.
Background information
The following traditional methods are used to manage the pods of an application: manually specifying the number of pods, using HPA, and using CronHPA. The following table describes the disadvantages of these methods.
Method | Disadvantage |
Manually specify the number of pods | Resources are wasted and you are charged for idle resources during off-peak hours. |
HPA | Scaling activities are performed after a scaling delay. Scale-out activities are triggered only if the resource usage exceeds the threshold and scale-in activities are triggered only if the resource usage drops below the threshold. |
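CronHPA | You must create every schedule manually. Covering each minute of a 24-hour period requires 1,440 schedules. |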
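The reactive behavior described in the HPA row can be illustrated with the replica rule from the Kubernetes documentation: the desired count is the current count multiplied by the ratio of the observed metric to the target, rounded up. The following is a minimal sketch, not the HPA controller code. The function name is illustrative, and the 0.1 tolerance is the Kubernetes default.

```python
import math


def hpa_desired_replicas(current_replicas: int, current_metric: float,
                         target_metric: float, tolerance: float = 0.1) -> int:
    """Replica count per the documented HPA rule:
    ceil(current_replicas * current_metric / target_metric)."""
    ratio = current_metric / target_metric
    # Inside the tolerance band, HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Usage 90% against a 60% target: scale out, but only after usage has risen.
scaled_out = hpa_desired_replicas(4, 90.0, 60.0)
# Usage 30% against a 60% target: scale in.
scaled_in = hpa_desired_replicas(4, 30.0, 60.0)
```

Because the rule depends on the metric that is currently observed, new pods are requested only after the load has already changed, which is the source of the scaling delay.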
To address the preceding issues, ACS provides predictive scaling based on AHPA to improve resource utilization.
AHPA can analyze historical data and predict the number of pods that are required per minute within the next 24 hours. If you use CronHPA, you must manually create 1,440 (24 hours × 60 minutes) schedules instead. The following figure shows the difference between traditional horizontal pod scaling and predictive horizontal pod scaling.
Traditional horizontal pod scaling: Scale-out activities are triggered only after the workload increases. Because of the scaling delay, the system cannot provision pods in time to handle the fluctuating workload.
Predictive horizontal pod scaling: AHPA identifies workload fluctuations based on the historical values of specific metrics and the time that a pod takes to reach the Ready state. This way, AHPA can provision pods in advance so that they are ready before a traffic spike occurs, and resources are allocated in time.
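The idea of provisioning ahead of a spike can be sketched as follows. This is an illustration under stated assumptions, not the AHPA algorithm: `forecast` is a hypothetical per-minute prediction of required pods, `startup_minutes` is an assumed time for a pod to reach the Ready state, and `provision_ahead` is a hypothetical helper name.

```python
def provision_ahead(forecast: list[int], startup_minutes: int) -> list[int]:
    """For each minute, request enough pods to cover the predicted demand
    up to startup_minutes later, so that pods are Ready when demand arrives."""
    plan = []
    for t in range(len(forecast)):
        # Look at the current minute plus the pod startup window.
        window = forecast[t:t + startup_minutes + 1]
        plan.append(max(window))
    return plan


# Hypothetical forecast: a spike to 8 pods at minute 3.
forecast = [2, 2, 2, 8, 8, 3, 2]
plan = provision_ahead(forecast, startup_minutes=2)
```

In this example the plan requests 8 pods from minute 1 onward, two minutes before the predicted spike, whereas a reactive scaler would begin to scale out only at minute 3.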
Business architecture
Various metrics: AHPA supports metrics such as CPU, GPU, memory, queries per second (QPS), response time (RT), and external metrics.
Stability: AHPA uses proactive prediction, passive prediction, and service degradation to guarantee sufficient resources for applications.
Proactive prediction: AHPA predicts the trend of workload fluctuations based on historical metric data. Proactive prediction is suitable for applications whose workloads periodically fluctuate.
Passive prediction: AHPA predicts workload fluctuations from real-time data and deploys resources in real time.
Service degradation: AHPA allows you to specify the maximum and minimum numbers of pods within one or more time periods.
Multiple scaling methods: AHPA can use HPA and Deployments to perform scaling.
HPA: AHPA simplifies the configuration of HPA scaling policies and helps users who are new to HPA mitigate the scaling delay.
Deployment: AHPA can directly use Deployments to perform auto scaling.
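The stability mechanisms listed above can be sketched as a single decision step. The sketch rests on two assumptions that the text does not confirm: the proactive and passive predictions are combined by taking the larger value, and the bounds from service degradation are applied as a clamp. `Bound` and `decide_replicas` are hypothetical names, and the default bounds are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Bound:
    start_minute: int  # inclusive, minutes since midnight
    end_minute: int    # exclusive, minutes since midnight
    min_pods: int
    max_pods: int


def decide_replicas(minute: int, proactive: int, passive: int,
                    bounds: list[Bound],
                    default: tuple[int, int] = (1, 100)) -> int:
    """Combine two predictions, then clamp to the bounds of the time period."""
    # Assumption: take the larger prediction to favor sufficient resources.
    wanted = max(proactive, passive)
    low, high = default
    for b in bounds:
        if b.start_minute <= minute < b.end_minute:
            low, high = b.min_pods, b.max_pods
            break
    # Clamp the combined prediction into [low, high].
    return max(low, min(wanted, high))


# Hypothetical bounds: 5 to 20 pods during 09:00-18:00.
bounds = [Bound(9 * 60, 18 * 60, min_pods=5, max_pods=20)]
floor_applied = decide_replicas(10 * 60, proactive=3, passive=2, bounds=bounds)
cap_applied = decide_replicas(10 * 60, proactive=30, passive=12, bounds=bounds)
```

The clamp keeps the application within a known range even when a prediction is far off, which is the purpose of specifying maximum and minimum pod counts per time period.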
Advantages
High performance: AHPA can predict workload fluctuations within milliseconds and scale resources within seconds.
High accuracy: AHPA can identify workload fluctuations with an accuracy higher than 95% based on proactive prediction and passive prediction.
High stability: AHPA allows you to specify the maximum and minimum numbers of pods for time periods that are defined at minute-level granularity.
Scenarios
Applications whose workloads periodically fluctuate, such as live streaming, online education, and gaming applications.
Applications that run a fixed number of pods and also use auto scaling to handle workload fluctuations, such as unexpected traffic bursts in regular business scenarios.
Scenarios that require system recommendations on the number of pods to provision. AHPA provides a standard Kubernetes API that you can use to obtain prediction results and integrate into your business systems.
References
For more information about how to deploy and use AHPA in ACS, see Deploy and use AHPA to predict resource demand.