Container Compute Service: Use Knative and AHPA to configure scheduled auto scaling

Last Updated: Jun 20, 2026

Advanced Horizontal Pod Autoscaler (AHPA) performs predictive scaling based on the historical values of metrics such as requests per second (RPS), concurrency, CPU, and memory, so that resources can be scaled in advance. You can also define cron expressions to specify the maximum and minimum numbers of pods that AHPA maintains during specific time periods.

Prerequisites

Step 1: Use AHPA to configure metrics for auto scaling

Create an AHPA template (an AdvancedHorizontalPodAutoscalerTemplate resource) based on the following YAML content and deploy it in the ACS cluster.

apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: AdvancedHorizontalPodAutoscalerTemplate
metadata:
  name: ahpa-demo
spec:
  metrics:
  - type: Resource
    resource:
      name: rps
      target:
        type: Utilization
        averageUtilization: 10 # The RPS threshold is set to 10. 
  maxReplicas: 50 # The maximum number of replicated pods is set to 50. 
  minReplicas: 0 # The minimum number of replicated pods is set to 0. 
  prediction:
    quantile: 95 # The confidence level of prediction is set to 95%. 
    scaleUpForward: 180 # The time range of forward prediction is set to 180 seconds. 
  # The bounds below limit the number of replicated pods from 00:00:00 on June 1, 2023 to 00:00:00 on June 1, 2123.
  instanceBounds:
  - startTime: "2023-06-01 00:00:00"
    endTime: "2123-06-01 00:00:00"
    bounds:
    # 00:00 to 06:59: the minimum number of replicated pods is 0 and the maximum is 50.
    - cron: '* 0-6 ? * *'
      maxReplicas: 50
      minReplicas: 0
    # 07:00 to 09:59: the minimum number of replicated pods is 5 and the maximum is 50.
    - cron: '* 7-9 ? * *'
      maxReplicas: 50
      minReplicas: 5
    # 10:00 to 16:59: the minimum number of replicated pods is 10 and the maximum is 50.
    - cron: '* 10-16 ? * *'
      maxReplicas: 50
      minReplicas: 10
    # 17:00 to 23:59: the minimum number of replicated pods is 2 and the maximum is 50.
    - cron: '* 17-23 ? * *'
      maxReplicas: 50
      minReplicas: 2

| Parameter | Required | Description |
| --- | --- | --- |
| metrics | Yes | The metrics used for auto scaling. The RPS, concurrency, CPU, and memory metrics are supported. |
| maxReplicas | Yes | The maximum number of replicated pods that are allowed. |
| minReplicas | Yes | The minimum number of replicated pods that must be guaranteed. |
| instanceBounds | No | The time period during which the number of replicated pods is limited by the maximum and minimum numbers defined in bounds. startTime: the start of the time period. endTime: the end of the time period. |
| bounds | No | The maximum and minimum numbers of replicated pods within the specified time period. cron: a cron expression that specifies when the bounds apply. maxReplicas: the maximum number of replicated pods. minReplicas: the minimum number of replicated pods. |

For the fields, special characters, and wildcard characters that a cron expression supports, see the Fields used in cron expressions section.
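To make the effect of the bounds concrete, the following Python sketch looks up which bounds are active at a given time and limits a predicted pod count to them. It is an illustration only and not AHPA's implementation: the WINDOWS table simply copies the hour ranges from the cron expressions in the example template, the function names bounds_at and clamp_replicas are hypothetical, and the sketch ignores startTime, endTime, and time zones.

```python
from datetime import datetime

# (first_hour, last_hour, minReplicas, maxReplicas), copied by hand from the
# hour ranges of the cron expressions in the example template above.
WINDOWS = [
    (0, 6, 0, 50),
    (7, 9, 5, 50),
    (10, 16, 10, 50),
    (17, 23, 2, 50),
]


def bounds_at(moment: datetime) -> tuple[int, int]:
    """Return (minReplicas, maxReplicas) for the window that contains moment."""
    for first_hour, last_hour, min_replicas, max_replicas in WINDOWS:
        if first_hour <= moment.hour <= last_hour:
            return min_replicas, max_replicas
    raise ValueError(f"no window covers hour {moment.hour}")


def clamp_replicas(predicted: int, moment: datetime) -> int:
    """Limit a predicted pod count to the bounds that are active at moment.

    Assumption: the bounds act as a simple clamp on the predicted value.
    """
    min_replicas, max_replicas = bounds_at(moment)
    return max(min_replicas, min(predicted, max_replicas))


if __name__ == "__main__":
    # A prediction of 3 pods at 11:30 is raised to the window minimum of 10.
    print(clamp_replicas(3, datetime(2024, 1, 1, 11, 30)))
```

Under these assumptions, a low prediction during the 10:00 to 16:59 window is raised to 10 pods, while any prediction above 50 is capped at 50 in every window.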

Fields used in cron expressions

The fields of a cron expression are described as follows. For more information, see Cron scheduled tasks.

| Field | Special character | Required | Description |
| --- | --- | --- | --- |
| Minutes | * / , - | Yes | Valid values: 0 to 59. |
| Hours | * / , - | Yes | Valid values: 0 to 23. |
| Day of month | * / , - ? | Yes | Valid values: 1 to 31. |
| Month | * / , - | Yes | Valid values: 1 to 12 or JAN to DEC. Note: The valid values from JAN to DEC are not case-sensitive. |
| Day of week | * / , - ? | No | Valid values: 0 to 6 or SUN to SAT. Note: The valid values from SUN to SAT are not case-sensitive. For example, both SUN and sun indicate Sunday. If you do not specify the Day of week field, any day of the week is applied, which is equivalent to the wildcard character (*). |

Special characters used in cron expressions:

  • An asterisk (*) indicates any value. For example, * in the Minutes field indicates every minute.

  • A forward slash (/) indicates a step size. For example, */5 in the Minutes field indicates every 5 minutes.

  • Commas (,) are used as delimiters. For example, 1,3,5 indicates values 1, 3, and 5.

  • Hyphens (-) are used in value ranges. For example, 1-5 indicates values 1 to 5.

  • Question marks (?) can be used only in the Day of month and Day of week fields to leave the value unspecified.
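The special characters can be illustrated with a small Python sketch that expands one numeric cron field into the set of values it matches. This is a simplified illustration, not the parser that AHPA uses: the function name expand_field is hypothetical, month and weekday names such as JAN or SUN are not handled, and the question mark (?) is assumed to behave like an asterisk (*).

```python
def expand_field(expr: str, lo: int, hi: int) -> set[int]:
    """Expand one numeric cron field into the set of values it matches.

    lo and hi are the valid bounds of the field, for example 0 and 59 for
    Minutes. Supported forms: *, ?, N, A-B, */S, A-B/S, and comma lists.
    """
    values: set[int] = set()
    for part in expr.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if step < 1:
            raise ValueError(f"step must be positive in {part!r}")
        if base in ("*", "?"):
            # Assumption: ? is treated like *, meaning the full range.
            start, end = lo, hi
        elif "-" in base:
            first, last = base.split("-", 1)
            start, end = int(first), int(last)
        else:
            start = end = int(base)
        if not lo <= start <= end <= hi:
            raise ValueError(f"{part!r} is outside the range {lo}-{hi}")
        values.update(range(start, end + 1, step))
    return values


if __name__ == "__main__":
    # Hours field of the first bounds entry in the example template.
    print(sorted(expand_field("0-6", 0, 23)))
    # Every 15 minutes.
    print(sorted(expand_field("*/15", 0, 59)))
```

Applied to the Hours field of the example template, 0-6 expands to hours 0 through 6, which is why the first bounds entry covers 00:00 to 06:59.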

Step 2: Create a Knative Service and enable AHPA for the Service

  1. Log on to the ACS console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of the target cluster. In the left navigation pane, choose Applications > Knative.

  3. On the Knative page, click the Services tab. Select default from the Namespace drop-down list. Click Create from Template, paste the following YAML template into the editor, and click Create. This creates a Service named helloworld-go-demo.

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-go-demo
    spec:
      template:
        metadata:
          annotations:
            autoscaling.knative.dev/class: ahpa.autoscaling.knative.dev # Specify the AHPA plug-in. 
            autoscaling.knative.dev.alibabacloud/ahpa-template: "ahpa-demo" # If you modify the AHPA template parameter, the corresponding revision is also updated. 
        spec:
          containers:
          - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/helloworld-go:73fbdd56
            env:
            - name: TARGET
              value: "Knative"

    After the Service is created, record its gateway address and default domain name. You need both in Step 3: Access the Service.

    On the Services tab, the Access Gateway section shows the gateway address (for example, alb-xxx.ncs.com), and the Default Domain Name column of the service list shows the domain name of the Service (for example, helloworld-go-demo.default.example.com). To access the Service, send a request to the gateway address with the Host header set to the default domain name of the Service.

Step 3: Access the Service

  1. Run the following command to access the Service:

    # helloworld-go-demo.default.example.com is the default domain name of the Service. 
    # alb-i5lagvip6fga******.cn-shenzhen.alb.aliyuncs.com is the gateway address of the Service. 
    curl -H "Host: helloworld-go-demo.default.example.com" http://alb-i5lagvip6fga******.cn-shenzhen.alb.aliyuncs.com

    Expected results:

    Hello Knative!
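If you prefer to call the Service from a script, the same request can be sketched in Python with the standard library. The helper names build_request and fetch are hypothetical, and the gateway address and domain name are passed in as arguments because the values for your Service come from the console. The sketch assumes plain HTTP, as in the curl command.

```python
import urllib.request


def build_request(gateway: str, domain: str) -> urllib.request.Request:
    """Build a GET request to the gateway with the Host header set to the
    default domain name of the Service."""
    return urllib.request.Request(f"http://{gateway}", headers={"Host": domain})


def fetch(gateway: str, domain: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    request = build_request(gateway, domain)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Replace both values with the gateway address and the default domain
    # name recorded in Step 2. The gateway below is a placeholder.
    print(fetch("gateway.example.invalid", "helloworld-go-demo.default.example.com"))
```

Separating build_request from fetch makes it possible to inspect the Host header without sending any traffic.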

Step 4 (Optional): Verify scheduled auto scaling

On the Knative Monitoring Dashboards page, you can view the scaling trends of Knative service instances. For more information about the Knative Monitoring Dashboard, see View the Knative Service Monitoring Dashboard.

Note
  • If a Knative application scales to zero and has no running pods, Managed Service for Prometheus cannot collect metrics such as request concurrency and requests per second for the pods. These metrics are displayed in the console only after requests are sent to the Knative application pods.

  • If a Knative application is not scaled to zero, the console displays metrics such as request concurrency and requests per second for the pods, even if they are not handling requests.

References

You can also configure auto scaling based on request concurrency and RPS. For more information, see Enable auto scaling to withstand traffic fluctuations.