Container Compute Service: Implement scheduled auto scaling based on Knative and AHPA

Last Updated: Jan 03, 2025

Advanced Horizontal Pod Autoscaler (AHPA) performs predictive scaling based on the historical values of metrics such as requests per second (RPS), concurrency, CPU, and memory, so that resources are scaled out before the load arrives. AHPA can also keep the number of pods between a specified minimum and maximum during a specified time period. You define these time periods by using cron expressions.

Prerequisites

  • Knative is deployed in the ACS cluster. For more information, see Deploy Knative.

  • AHPA is deployed in the ACS cluster. For more information, see Deploy AHPA.

Step 1: Use AHPA to configure metrics for auto scaling

Create an AHPA template (kind: AdvancedHorizontalPodAutoscalerTemplate) based on the following YAML content and apply it to the ACS cluster.

apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: AdvancedHorizontalPodAutoscalerTemplate
metadata:
  name: ahpa-demo
spec:
  metrics:
  - type: Resource
    resource:
      name: rps
      target:
        type: Utilization
        averageUtilization: 10 # The RPS threshold is set to 10. 
  maxReplicas: 50 # The maximum number of replicated pods is set to 50. 
  minReplicas: 0 # The minimum number of replicated pods is set to 0. 
  prediction:
    quantile: 95 # The confidence level of prediction is set to 95%. 
    scaleUpForward: 180 # The time range of forward prediction is set to 180 seconds. 
  # From 2023-06-01 00:00:00 to 2123-06-01 00:00:00, the number of replicated pods is limited by the bounds defined below.
  instanceBounds:
  - startTime: "2023-06-01 00:00:00"
    endTime: "2123-06-01 00:00:00"
    bounds:
    # From 00:00 to 06:59, the minimum number of replicated pods is 0 and the maximum is 50.
    - cron: '* 0-6 ? * *'
      maxReplicas: 50
      minReplicas: 0
    # From 07:00 to 09:59, the minimum number of replicated pods is 5 and the maximum is 50.
    - cron: '* 7-9 ? * *'
      maxReplicas: 50
      minReplicas: 5
    # From 10:00 to 16:59, the minimum number of replicated pods is 10 and the maximum is 50.
    - cron: '* 10-16 ? * *'
      maxReplicas: 50
      minReplicas: 10
    # From 17:00 to 23:59, the minimum number of replicated pods is 2 and the maximum is 50.
    - cron: '* 17-23 ? * *'
      maxReplicas: 50
      minReplicas: 2

| Parameter | Required | Description |
| --- | --- | --- |
| metrics | Yes | The metrics used for auto scaling. The RPS, concurrency, CPU, and memory metrics are supported. |
| maxReplicas | Yes | The maximum number of replicated pods that are allowed. |
| minReplicas | Yes | The minimum number of replicated pods that must be guaranteed. |
| instanceBounds | No | The time period during which the number of replicated pods is limited by the maximum and minimum values defined in bounds. startTime: the start time of the period. endTime: the end time of the period. |
| bounds | No | The maximum and minimum numbers of replicated pods within a recurring time window. cron: a cron expression that specifies the time window (for the syntax, see the Fields used in cron expressions section). maxReplicas: the maximum number of replicated pods in the window. minReplicas: the minimum number of replicated pods in the window. |
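
To make the effect of the bounds entries concrete, the following Python sketch looks up which replica limits from the sample ahpa-demo template apply at a given hour of the day. It only mirrors the hour ranges of the sample YAML. The names `Bound`, `SAMPLE_BOUNDS`, and `bounds_for_hour` are hypothetical, and the lookup is an illustration, not the way AHPA itself evaluates cron expressions.

```python
from typing import NamedTuple, Optional


class Bound(NamedTuple):
    first_hour: int    # first hour covered by the cron hour range (inclusive)
    last_hour: int     # last hour covered by the cron hour range (inclusive)
    min_replicas: int
    max_replicas: int


# Hour ranges and replica limits copied from the sample ahpa-demo template.
SAMPLE_BOUNDS = [
    Bound(0, 6, 0, 50),
    Bound(7, 9, 5, 50),
    Bound(10, 16, 10, 50),
    Bound(17, 23, 2, 50),
]


def bounds_for_hour(hour: int) -> Optional[Bound]:
    """Return the sample bound whose hour range contains `hour`, or None."""
    for bound in SAMPLE_BOUNDS:
        if bound.first_hour <= hour <= bound.last_hour:
            return bound
    return None


if __name__ == "__main__":
    for hour in (3, 8, 12, 20):
        b = bounds_for_hour(hour)
        print(f"{hour:02d}:00 -> min={b.min_replicas}, max={b.max_replicas}")
```

With the sample values, a lookup for hour 12 returns a minimum of 10 and a maximum of 50 pods, which matches the 10-16 entry in the template.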

Fields used in cron expressions

The following table describes the fields of a cron expression, in the order in which they appear. For more information, see Cron expressions.

| Field | Special characters | Required | Description |
| --- | --- | --- | --- |
| Minutes | * / , - | Yes | Valid values: 0 to 59. |
| Hours | * / , - | Yes | Valid values: 0 to 23. |
| Day of month | * / , - ? | Yes | Valid values: 1 to 31. |
| Month | * / , - | Yes | Valid values: 1 to 12 or JAN to DEC. Note: The values JAN to DEC are not case-sensitive. |
| Day of week | * / , - ? | No | Valid values: 0 to 6 or SUN to SAT. Note: The values SUN to SAT are not case-sensitive. For example, both SUN and sun indicate Sunday. If you do not specify the Day of week field, any day of the week applies, which is equivalent to the wildcard character (*). |

Special characters used in cron expressions:

  • An asterisk (*) indicates any value. For example, * indicates any minute or hour.

  • A forward slash (/) indicates the step size. For example, */5 in the Minutes field indicates every 5 minutes.

  • Commas (,) are used as delimiters. For example, 1,3,5 indicates values 1, 3, and 5.

  • Hyphens (-) are used in value ranges. For example, 1-5 indicates values 1 to 5.

  • Question marks (?) are used only in the Day of month and Day of week fields to indicate that no specific value is set for the field.
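
The special characters above can be illustrated with a short Python sketch that expands a single numeric cron field into the values it matches. This is a simplified illustration under stated assumptions: the function name `expand_cron_field` is hypothetical, month and weekday names such as JAN or SUN are not handled, the question mark is treated like the asterisk, and the behavior of the AHPA cron parser may differ in edge cases.

```python
from typing import List


def expand_cron_field(field: str, low: int, high: int) -> List[int]:
    """Expand one numeric cron field into the sorted values it matches.

    Supports *, ?, comma-separated lists, hyphen ranges, and /step.
    `low` and `high` are the inclusive valid bounds of the field.
    """
    values = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base in ("*", "?"):
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = int(base)
            # A single start value with a step runs up to the field maximum.
            end = high if step_text else start
        if step < 1 or not (low <= start <= end <= high):
            raise ValueError(f"invalid cron field: {field!r}")
        values.update(range(start, end + 1, step))
    return sorted(values)
```

For example, `expand_cron_field("0-6", 0, 23)` yields the hours 0 through 6 used by the first bounds entry of the sample template, and `expand_cron_field("*/15", 0, 59)` yields minutes 0, 15, 30, and 45.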

Step 2: Create a Knative Service and enable AHPA for the Service

  1. Log on to the ACS console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to manage and click its ID. In the left-side navigation pane of the cluster details page, choose Applications > Knative.

  3. On the Services tab of the Knative page, set Namespace to default, click Create from Template, copy the following YAML content to the editor, and then click Create to create a Service named helloworld-go-demo.

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-go-demo
    spec:
      template:
        metadata:
          annotations:
            autoscaling.knative.dev/class: ahpa.autoscaling.knative.dev # Specify the AHPA plug-in. 
            autoscaling.knative.dev.alibabacloud/ahpa-template: "ahpa-demo" # If you modify the AHPA template parameter, the corresponding revision is also updated. 
        spec:
          containers:
          - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/helloworld-go:73fbdd56
            env:
            - name: TARGET
              value: "Knative"

    After the Service is created, record the gateway address and domain name of the Service, which will be used in Step 3: Access the Service.

Step 3: Access the Service

  1. Run the following command to access the Service:

    # helloworld-go-demo.default.example.com is the default domain name of the Service. 
    # alb-i5lagvip6fga******.cn-shenzhen.alb.aliyuncs.com is the gateway address of the Service. 
    curl -H "Host: helloworld-go-demo.default.example.com" http://alb-i5lagvip6fga******.cn-shenzhen.alb.aliyuncs.com

    Expected results:

    Hello Knative!
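
If you prefer to call the Service from a script, the same request can be sketched in Python with the standard library. The helper names `build_request` and `call_service` are hypothetical. Pass the gateway address and domain name that you recorded in Step 2. As in the curl command, the request is sent to the gateway and the domain name of the Service is carried in the Host header.

```python
import urllib.request


def build_request(gateway: str, domain: str) -> urllib.request.Request:
    """Build a GET request to the gateway with the Service domain as the Host header."""
    return urllib.request.Request(f"http://{gateway}", headers={"Host": domain})


def call_service(gateway: str, domain: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    request = build_request(gateway, domain)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `call_service` with your gateway address and helloworld-go-demo.default.example.com should return the same body as the curl command, provided the Service is reachable from where the script runs.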

Step 4 (Optional): Verify scheduled auto scaling

On the Knative monitoring dashboards, you can view the pod scaling trends of the Knative Service. For more information about the Knative dashboard, see View the Knative monitoring dashboard.

Note
  • When a Knative application is scaled to zero pods, Managed Service for Prometheus cannot collect metrics such as the request concurrency and the number of requests sent to a pod per second. These metrics appear in the console only after you access the pods of the Knative application.

  • When a Knative application has one or more pods, you can view metrics such as the request concurrency and the number of requests sent to a pod per second directly in the console, without first accessing the pods of the Knative application.

References

You can also configure auto scaling based on the number of concurrent requests per pod or on RPS. For more information, see Enable auto scaling to withstand traffic fluctuations.