Container Service for Kubernetes (ACK) supports the Advanced Horizontal Pod Autoscaler (AHPA). AHPA learns from and analyzes historical data to predict future resource needs and dynamically adjusts the number of pod replicas. This ensures that resources are scaled out and prefetched before demand peaks, which improves system response speed and stability. When a low-traffic period is predicted, AHPA also scales in resources at the appropriate time to reduce costs.
Prerequisites
You have created an ACK managed cluster or an ACK serverless cluster. For more information, see Create an ACK managed cluster or Create a cluster.
You have enabled Managed Service for Prometheus. Ensure that Managed Service for Prometheus has collected at least seven days of historical application data, such as CPU and memory usage. For more information about how to enable Managed Service for Prometheus, see Connect to and configure Managed Service for Prometheus.
Step 1: Install the AHPA Controller
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left navigation pane, click Add-ons.
On the Add-ons page, find the AHPA Controller component, click Install on its card, and follow the on-screen prompts to complete the installation.
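After the installation is complete, you can optionally confirm that the controller pods are running. The exact pod names depend on the component version; the following command simply filters for AHPA-related pods in the kube-system namespace.
kubectl get pods -n kube-system | grep -i ahpa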
Step 2: Configure the Prometheus data source
Log on to the ARMS console.
In the left navigation pane, click Instances.
At the top of the Instances page, select the region of the Prometheus instance, and then click the name of the target instance. The instance name is the same as the ACK cluster name.
On the Settings page, in the HTTP API URL (Grafana Read URL) section, record the values for the following configuration items.
(Optional) If a token is enabled, record the access token.
Record the Internal Network endpoint (Prometheus URL).
Set the Prometheus query endpoint in the ACK cluster.
Create a file named application-intelligence.yaml with the following content.
prometheusUrl: The endpoint of Managed Service for Prometheus.
token: The access token for Prometheus.
apiVersion: v1
kind: ConfigMap
metadata:
  name: application-intelligence
  namespace: kube-system
data:
  prometheusUrl: "http://cn-hangzhou-intranet.arms.aliyuncs.com:9443/api/v1/prometheus/da9d7dece901db4c9fc7f5b9c40****/158120454317****/cc6df477a982145d986e3f79c985a****/cn-hangzhou"
  token: "eyJhxxxxx"
Note: To view the AHPA dashboard in Managed Service for Prometheus, you also need to configure the following fields in this ConfigMap:
prometheus_writer_url: The private endpoint for Remote Write.
prometheus_writer_ak: The AccessKey ID of your Alibaba Cloud account.
prometheus_writer_sk: The AccessKey secret of your Alibaba Cloud account.
For more information, see Enable the Prometheus dashboard for AHPA.
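For reference, these fields sit alongside prometheusUrl and token in the data section of the same ConfigMap. The values below are placeholders, not real endpoints or credentials.
apiVersion: v1
kind: ConfigMap
metadata:
  name: application-intelligence
  namespace: kube-system
data:
  prometheusUrl: "<internal Prometheus query endpoint>"
  token: "<Prometheus access token>"
  # Required only if you want to view the AHPA dashboard in Managed Service for Prometheus.
  prometheus_writer_url: "<Remote Write private endpoint>"
  prometheus_writer_ak: "<AccessKey ID>"
  prometheus_writer_sk: "<AccessKey secret>"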
Run the following command to deploy the `application-intelligence` configuration.
kubectl apply -f application-intelligence.yaml
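You can verify that the ConfigMap exists in the kube-system namespace before continuing:
kubectl get configmap application-intelligence -n kube-system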
Step 3: Deploy a test service
The test service consists of the fib-deployment workload, the fib-svc Service, and the fib-loader load generator, which simulates request peaks and troughs. A Horizontal Pod Autoscaler (HPA) resource is also deployed so that you can compare its scaling results with those of AHPA.
Create a file named demo.yaml with the following content.
apiVersion: apps/v1
kind: Deployment
metadata:
name: fib-deployment
namespace: default
annotations:
k8s.aliyun.com/eci-use-specs: "1-2Gi"
spec:
replicas: 1
selector:
matchLabels:
app: fib-deployment
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: fib-deployment
spec:
containers:
- image: registry.cn-huhehaote.aliyuncs.com/kubeway/knative-sample-fib-server:20200820-171837
imagePullPolicy: IfNotPresent
name: user-container
ports:
- containerPort: 8080
name: user-port
protocol: TCP
resources:
limits:
cpu: "1"
memory: 2000Mi
requests:
cpu: "1"
memory: 2000Mi
---
apiVersion: v1
kind: Service
metadata:
name: fib-svc
namespace: default
spec:
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
selector:
app: fib-deployment
sessionAffinity: None
type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: fib-loader
namespace: default
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: fib-loader
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: fib-loader
spec:
containers:
- args:
- -c
- |
/ko-app/fib-loader --service-url="http://fib-svc.${NAMESPACE}?size=35&interval=0" --save-path=/tmp/fib-loader-chart.html
command:
- sh
env:
- name: NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
image: registry.cn-huhehaote.aliyuncs.com/kubeway/knative-sample-fib-loader:20201126-110434
imagePullPolicy: IfNotPresent
name: loader
ports:
- containerPort: 8090
name: chart
protocol: TCP
resources:
limits:
cpu: "8"
memory: 16000Mi
requests:
cpu: "2"
memory: 4000Mi
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: fib-hpa
namespace: default
spec:
maxReplicas: 50
minReplicas: 1
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: fib-deployment
targetCPUUtilizationPercentage: 50
---
Run the following command to deploy the test service, the load generator, and the HPA policy.
kubectl apply -f demo.yaml
Step 4: Deploy AHPA
Configure a scaling policy by submitting an `AdvancedHorizontalPodAutoscaler` resource.
Create a file named ahpa-demo.yaml with the following content.
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: AdvancedHorizontalPodAutoscaler
metadata:
  name: ahpa-demo
spec:
  scaleStrategy: observer
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 40
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fib-deployment
  maxReplicas: 100
  minReplicas: 2
  stabilizationWindowSeconds: 300
  prediction:
    quantile: 0.95
    scaleUpForward: 180
  instanceBounds:
  - startTime: "2021-12-16 00:00:00"
    endTime: "2031-12-16 00:00:00"
    bounds:
    - cron: "* 0-8 ? * MON-FRI"
      maxReplicas: 15
      minReplicas: 4
    - cron: "* 9-15 ? * MON-FRI"
      maxReplicas: 15
      minReplicas: 10
    - cron: "* 16-23 ? * MON-FRI"
      maxReplicas: 20
      minReplicas: 15
The following table describes some of the parameters.
| Parameter | Required | Description |
| --- | --- | --- |
| scaleTargetRef | Yes | Specifies the target Deployment. |
| metrics | Yes | Configures the scaling metrics. Supported metrics include CPU, GPU, Memory, QPS, and RT. |
| target | Yes | The target threshold. For example, `averageUtilization: 40` indicates that the target CPU utilization is 40%. |
| scaleStrategy | No | Sets the scaling mode. Default value: observer. Valid values: auto (AHPA performs scaling operations), observer (AHPA only observes and does not perform scaling actions; use this mode to check whether AHPA works as expected), proactive (only proactive prediction takes effect), and reactive (only reactive prediction takes effect). |
| maxReplicas | Yes | The maximum number of replicas for scale-out. |
| minReplicas | Yes | The minimum number of replicas for scale-in. |
| stabilizationWindowSeconds | No | The cooldown period for scale-in. Default value: 300 seconds. |
| prediction.quantile | Yes | The prediction quantile. A higher value indicates a more conservative prediction, meaning a higher probability that the actual metric value stays below the target value. Valid values: 0 to 1, with up to two decimal places. Default value: 0.99. A value from 0.90 to 0.99 is recommended. |
| prediction.scaleUpForward | Yes | The time required for a pod to become Ready (cold start time). |
| instanceBounds | No | The bounds on the number of instances during a scaling period. startTime: the start time. endTime: the end time. |
| instanceBounds.bounds.cron | No | Configures a scheduled task. The cron expression represents a set of times and uses five space-separated fields. For example, `- cron: "* 0-8 ? * MON-FRI"` indicates that the task runs from 00:00 to 08:59 every Monday through Friday. |
The following table describes the fields in a cron expression. For more information, see Scheduled tasks.
| Field name | Required | Allowed values | Allowed special characters |
| --- | --- | --- | --- |
| Minutes | Yes | 0 to 59 | `* / , -` |
| Hours | Yes | 0 to 23 | `* / , -` |
| Day of Month | Yes | 1 to 31 | `* / , - ?` |
| Month | Yes | 1 to 12 or JAN to DEC | `* / , -` |
| Day of Week | No | 0 to 6 or SUN to SAT | `* / , - ?` |
Note:
The values of the Month and Day of Week fields are not case-sensitive. For example, SUN, Sun, and sun have the same effect.
If the Day of Week field is not configured, the default value is `*`.
Special characters:
`*`: Indicates all possible values.
`/`: Specifies an increment for a numeric value.
`,`: Lists enumerated values.
`-`: Indicates a range.
`?`: Indicates that no specific value is specified.
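For example, a bounds entry that keeps the workload between 3 and 10 replicas all day on weekends could be written as follows. This is a hypothetical schedule shown only to illustrate the cron syntax; adjust the time window and replica counts to your own traffic pattern.
- cron: "* 0-23 ? * SAT,SUN"
  maxReplicas: 10
  minReplicas: 3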
Run the following command to create the AHPA scaling policy.
kubectl apply -f ahpa-demo.yaml
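After the policy is created, you can inspect its status and events. The resource name below assumes that the CRD follows the standard Kubernetes plural naming convention for `AdvancedHorizontalPodAutoscaler`; verify the actual name in your cluster with "kubectl api-resources | grep alibabacloud" and adjust the command if needed.
kubectl get advancedhorizontalpodautoscalers.autoscaling.alibabacloud.com ahpa-demo -n default -o yaml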
Step 5: View the prediction results
To view the AHPA prediction results, enable the Prometheus dashboard for AHPA.
Because prediction requires seven days of historical data, you can view the prediction results only after the sample deployment has run for at least seven days. If you have an existing online application, you can directly specify it in the AHPA configuration.
This topic uses the observer mode as an example. In this mode, the results are compared with the HPA policy. This comparison provides a reference for the actual resources that the application requires and helps you verify whether the AHPA prediction results meet expectations.

Actual and predicted CPU usage: The green curve represents the actual CPU usage with HPA. The yellow curve represents the CPU usage predicted by AHPA.
The yellow curve is above the green curve, which indicates that the predicted CPU capacity is sufficient.
The yellow curve leads the green curve, which indicates that the required resources are prepared in advance.
Pod trend: The green curve represents the actual number of pods scaled by HPA. The yellow curve represents the number of pods predicted by AHPA.
The yellow curve is below the green curve, which indicates that AHPA predicts that fewer pods are required.
The yellow curve is smoother than the green curve, which indicates that scaling with AHPA causes fewer fluctuations and improves service stability.
In this example, the prediction trend meets expectations. After observing for a period of time, if the results also meet your requirements, you can switch the scaling mode to auto to let AHPA perform scaling automatically.
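To switch modes, change `scaleStrategy` from observer to auto in the AdvancedHorizontalPodAutoscaler resource and reapply it, or patch the resource in place. The following command is a sketch that assumes the same plural resource name as above; verify it with "kubectl api-resources | grep alibabacloud" before running.
kubectl patch advancedhorizontalpodautoscalers.autoscaling.alibabacloud.com ahpa-demo --type merge -p '{"spec":{"scaleStrategy":"auto"}}'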
Related documents
To use Managed Service for Prometheus to monitor GPU metrics and implement predictive scaling with AHPA based on GPU metrics, see Use AHPA to perform predictive scaling based on GPU metrics.
To view the dashboards provided by Managed Service for Prometheus, see Enable the Prometheus dashboard for AHPA.