
Container Service for Kubernetes:AHPA Controller release notes

Last Updated: Mar 26, 2026

The Advanced Horizontal Pod Autoscaler (AHPA) is an ACK component that adds predictive scaling to Kubernetes workloads. The standard Horizontal Pod Autoscaler (HPA) reacts to metrics only after they spike; AHPA eliminates that lag by combining machine learning-based proactive prediction with real-time passive prediction, so pods are ready before traffic arrives.

Install the AHPA controller to enable the predictive scaling feature.

How it works

AHPA runs two complementary scaling paths in parallel:

  • Proactive prediction: AHPA analyzes historical metric data using machine learning algorithms from DAMO Academy and forecasts pod demand up to 24 hours ahead. Pods are pre-provisioned before a predicted traffic surge arrives. This path is effective for applications with periodic traffic patterns — daily peaks, weekly cycles, and similar rhythms.

  • Passive prediction: When real-time metrics deviate from the predicted baseline — for example, during an unexpected traffic spike — AHPA adjusts the pod count immediately based on current data. This path provides a safety net for unpredictable load changes.

In addition, service degradation lets you bound scaling by setting maximum and minimum pod counts within one or more specific time windows.
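As an illustrative sketch, the time-window bounds might be declared on the AHPA resource itself. The `apiVersion`, `kind`, and field names below (such as `instanceBounds`) are assumptions drawn from typical AHPA examples, not a verified schema; see AHPA overview for the authoritative YAML.

```yaml
# Hypothetical AHPA resource bounding replicas in two recurring windows.
# Field names are assumptions; verify against the AHPA overview for
# your controller version.
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: AdvancedHorizontalPodAutoscaler
metadata:
  name: demo-ahpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app
  instanceBounds:
  - startTime: "2024-01-01 00:00:00"
    endTime: "2034-01-01 00:00:00"
    cron: "* 0-8 ? * MON-FRI"    # weekday off-peak hours
    maxReplicas: 4
    minReplicas: 1
  - startTime: "2024-01-01 00:00:00"
    endTime: "2034-01-01 00:00:00"
    cron: "* 9-23 ? * MON-FRI"   # weekday business hours
    maxReplicas: 20
    minReplicas: 4
```

Each entry caps scaling within its cron-defined window, so a prediction error during off-peak hours cannot scale the workload beyond the configured bounds.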

The following figure shows the AHPA architecture.


The architecture comprises the following modules:

Module | Role
DAMO Academy ML algorithms | Analyze historical metrics to predict future pod demand up to 24 hours ahead
Proactive predictor | Schedules pre-scaling based on the ML forecast
Passive predictor | Monitors real-time metrics and triggers immediate scaling when deviations exceed the predicted baseline
Scaling executor | Applies scaling decisions to Knative, HPA, or Deployments
Service degradation layer | Enforces configured min/max pod counts within specific time windows

Supported metrics

AHPA supports the following metric types: CPU, memory, QPS (queries per second), RT (response time), and external metrics.
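The supported metric types could be combined in a single policy. The fragment below is a hedged sketch: the structure mirrors the standard Kubernetes HPA `metrics` block, but the exact AHPA field names and the external metric name `qps` are assumptions; see AHPA overview for verified examples.

```yaml
# Hypothetical metrics section mixing a resource metric (CPU) with an
# external QPS metric; field names are assumptions, not a verified schema.
spec:
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 40
  - type: External
    external:
      metric:
        name: qps
      target:
        type: AverageValue
        averageValue: "100"
```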

Scaling methods

AHPA applies scaling decisions through three target types:

Target | Best for
Knative | Serverless workloads; resolves cold start issues; scales based on concurrency, QPS, or RT
HPA | Standard Kubernetes workloads; simpler policy configuration; also handles cold start
Deployment | Direct Deployment scaling without an intermediate HPA resource
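Switching between the three target types would amount to changing the referenced object. The snippet below is a sketch; the Knative `apiVersion` and the assumption that AHPA accepts a Knative Service as a scale target are drawn from the table above, not from a verified schema.

```yaml
# Hypothetical scaleTargetRef choices; only the referenced kind changes.
# Target a Deployment directly:
scaleTargetRef:
  apiVersion: apps/v1
  kind: Deployment
  name: demo-app
# Or target a Knative Service for serverless workloads:
# scaleTargetRef:
#   apiVersion: serving.knative.dev/v1
#   kind: Service
#   name: demo-service
```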

For ACK Serverless clusters where all pods run on elastic container instances, AHPA enables zero-node auto scaling — the cluster scales at the pod level without managing node capacity.

Design principles

AHPA is built around three principles:

  • Stable: Scaling triggers only when the application is in a stable state.

  • O&M-free: No additional controllers are required on the client side. AHPA's configuration syntax is simpler than HPA's.

  • Serverless-oriented: The focus is on pods, not on node resource utilization, making AHPA a natural fit for serverless and elastic container instance environments.

Usage notes

For configuration details, prerequisites, and YAML examples, see AHPA overview.

Release notes

April 2024

Version | Release date | Description | Impact
v2.6.0-aliyun.1 | 2024-04-16 | Optimized the metrics collection link that uses metrics-server. | Perform the update during off-peak hours.

March 2024

Version | Release date | Description | Impact
v2.5.6-aliyun.1 | 2024-03-20 | Fixed a panic caused by custom metric processing. | Perform the update during off-peak hours.

December 2023

Version | Release date | Description | Impact
v2.5.0-aliyun.1 | 2023-12-25 | Added support for custom PromQL configurations and Elastic Workload. Improved kubectl output to display whether periodic features are active. | Perform the update during off-peak hours.

October 2023

Version | Release date | Description | Impact
v2.4.0-aliyun.1 | 2023-10-16 | Added support for multiple metrics in passive prediction. Improved kubectl output to display multiple AHPA metrics. Fixed an issue where modifications to the object specified by TargetRef did not take effect. | Perform the update during off-peak hours.

July 2023

Version | Release date | Description | Impact
v2.3.0-aliyun.1 | 2023-07-12 | Added support for custom metrics. Improved kubectl output to display the type of resource referenced. | Perform the update during off-peak hours.

June 2023

Version | Release date | Description | Impact
v2.2.0-aliyun.1 | 2023-06-19 | Added concurrency metric support for predictive scaling. Optimized Knative passive processing logic. Improved latency for real-time CPU and memory metric queries. | Perform the update during off-peak hours.

April 2023

Version | Release date | Description | Impact
v2.1.0-aliyun.1 | 2023-04-26 | Added Prometheus dashboard support. Added support for customizable time ranges for historical metrics. | Perform the update during off-peak hours.

July 2022

Version | Release date | Description | Impact
v1.0.0-aliyun.1 | 2022-07-13 | Initial release. Supported CPU, memory, RT, and QPS metrics for predictive scaling. Supported scaling via Deployments, HPA, and Knative. | No impact on workloads.