The Advanced Horizontal Pod Autoscaler (AHPA) is an ACK component that adds predictive scaling to Kubernetes workloads. The standard Horizontal Pod Autoscaler (HPA) reacts to metrics only after they spike; AHPA reduces that lag by combining machine learning-based proactive prediction with real-time passive prediction, so pods are ready before traffic arrives.
Install the AHPA controller to enable the predictive scaling feature.
How it works
AHPA runs two complementary scaling paths in parallel:
- Proactive prediction: AHPA analyzes historical metric data using machine learning algorithms from DAMO Academy and forecasts pod demand up to 24 hours ahead. Pods are pre-provisioned before a predicted traffic surge arrives. This path is effective for applications with periodic traffic patterns, such as daily peaks and weekly cycles.
- Passive prediction: When real-time metrics deviate from the predicted baseline (for example, during an unexpected traffic spike), AHPA adjusts the pod count immediately based on current data. This path provides a safety net for unpredictable load changes.
Service degradation lets you set maximum and minimum pod counts within one or more specific time windows.
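The prediction paths and degradation windows described above are configured in a single custom resource. The following is an illustrative sketch only; the field names and values shown here (for example, `prediction`, `scaleUpForward`, and `instanceBounds`) are assumptions to verify against the AHPA overview, not an authoritative schema.

```yaml
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: AdvancedHorizontalPodAutoscaler
metadata:
  name: ahpa-demo
spec:
  scaleTargetRef:            # workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 2
  maxReplicas: 100
  metrics:                   # used by both proactive and passive prediction
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 40
  prediction:
    quantile: 95             # percentile of the forecast used for pre-provisioning
    scaleUpForward: 180      # assumed: seconds to scale up ahead of a predicted surge
  instanceBounds:            # service degradation: per-window min/max pod counts
  - startTime: "2024-01-01 00:00:00"
    endTime: "2034-01-01 00:00:00"
    cron: "* 0-8 ? * MON-FRI"
    minReplicas: 4
    maxReplicas: 15
```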
The following figure shows the AHPA architecture.
The architecture comprises the following modules:
| Module | Role |
|---|---|
| DAMO Academy ML algorithms | Analyze historical metrics to predict future pod demand up to 24 hours ahead |
| Proactive predictor | Schedules pre-scaling based on the ML forecast |
| Passive predictor | Monitors real-time metrics and triggers immediate scaling when deviations exceed the predicted baseline |
| Scaling executor | Applies scaling decisions to Knative, HPA, or Deployments |
| Service degradation layer | Enforces configured min/max pod counts within specific time windows |
Supported metrics
AHPA supports the following metric types: CPU, memory, QPS (queries per second), RT (response time), and external metrics.
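As a sketch of how these metric types might be declared together, the snippet below mixes a resource metric with an external metric, following Kubernetes HPA v2 conventions. The external metric name `qps` is a placeholder; actual names and availability depend on the metrics adapter deployed in the cluster.

```yaml
metrics:
- type: Resource             # built-in resource metric (CPU or memory)
  resource:
    name: memory
    target:
      type: Utilization
      averageUtilization: 60
- type: External             # externally provided metric, e.g. QPS or RT
  external:
    metric:
      name: qps              # hypothetical metric name for illustration
    target:
      type: AverageValue
      averageValue: "20"
```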
Scaling methods
AHPA applies scaling decisions through three target types:
| Target | Best for |
|---|---|
| Knative | Serverless workloads; resolves cold start issues; scales based on concurrency, QPS, or RT |
| HPA | Standard Kubernetes workloads; simpler policy configuration; also handles cold start |
| Deployment | Direct Deployment scaling without an intermediate HPA resource |
For ACK Serverless clusters where all pods run on elastic container instances, AHPA enables zero-node auto scaling — the cluster scales at the pod level without managing node capacity.
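For the Knative target, the usual Knative pattern is to select the autoscaler through per-revision annotations on the Knative Service. The sketch below assumes AHPA registers a custom autoscaler class; the class value shown is an assumption to confirm in the AHPA overview, while the annotation keys themselves are standard Knative.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld
spec:
  template:
    metadata:
      annotations:
        # assumed AHPA autoscaler class name; verify before use
        autoscaling.knative.dev/class: ahpa.autoscaling.knative.dev
        autoscaling.knative.dev/metric: concurrency
        autoscaling.knative.dev/target: "10"
```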
Design principles
AHPA is built around three principles:
- Stable: Scaling triggers only when the application is in a stable state.
- O&M-free: No additional controllers are required on the client side, and AHPA's configuration syntax is simpler than HPA's.
- Serverless-oriented: The focus is on pods rather than node resource utilization, making AHPA a natural fit for serverless and elastic container instance environments.
Usage notes
For configuration details, prerequisites, and YAML examples, see AHPA overview.
Release notes
April 2024
| Version | Release date | Description | Impact |
|---|---|---|---|
| v2.6.0-aliyun.1 | 2024-04-16 | Optimized the metrics collection link that uses metrics-server. | Perform the update during off-peak hours. |
March 2024
| Version | Release date | Description | Impact |
|---|---|---|---|
| v2.5.6-aliyun.1 | 2024-03-20 | Fixed a panic caused by custom metric processing. | Perform the update during off-peak hours. |
December 2023
| Version | Release date | Description | Impact |
|---|---|---|---|
| v2.5.0-aliyun.1 | 2023-12-25 | Added support for custom PromQL configurations and Elastic Workload. Improved kubectl output to display whether periodic features are active. | Perform the update during off-peak hours. |
October 2023
| Version | Release date | Description | Impact |
|---|---|---|---|
| v2.4.0-aliyun.1 | 2023-10-16 | Added support for multiple metrics in passive prediction. Improved kubectl output to display multiple AHPA metrics. Fixed an issue where modifications to the object specified by TargetRef did not take effect. | Perform the update during off-peak hours. |
July 2023
| Version | Release date | Description | Impact |
|---|---|---|---|
| v2.3.0-aliyun.1 | 2023-07-12 | Added support for custom metrics. Improved kubectl output to display the type of resource referenced. | Perform the update during off-peak hours. |
June 2023
| Version | Release date | Description | Impact |
|---|---|---|---|
| v2.2.0-aliyun.1 | 2023-06-19 | Added concurrency metric support for predictive scaling. Optimized Knative passive processing logic. Improved latency for real-time CPU and memory metric queries. | Perform the update during off-peak hours. |
April 2023
| Version | Release date | Description | Impact |
|---|---|---|---|
| v2.1.0-aliyun.1 | 2023-04-26 | Added Prometheus dashboard support. Added support for customizable time ranges for historical metrics. | Perform the update during off-peak hours. |
July 2022
| Version | Release date | Description | Impact |
|---|---|---|---|
| v1.0.0-aliyun.1 | 2022-07-13 | Initial release. Supported CPU, memory, RT, and QPS metrics for predictive scaling. Supported scaling via Deployments, HPA, and Knative. | No impact on workloads. |