
Container Service for Kubernetes: Use AHPA to enable predictive scaling in Knative

Last Updated: Aug 28, 2023

Knative allows you to use Advanced Horizontal Pod Autoscaler (AHPA) in ACK Serverless clusters. If your application requests resources in a periodic pattern, you can use AHPA to predict changes in resource requests and prefetch resources for scaling activities. This reduces the impact of cold starts when your application is scaled. This topic describes how to use AHPA to enable predictive scaling in Knative.


Prerequisites

Knative is deployed in an ACK Serverless cluster.

Step 1: Install the AHPA controller

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Add-ons in the left-side navigation pane.

  3. On the Add-ons page, click the Others tab or enter AHPA Controller into the search box in the upper-right corner of the page and click the search icon. Then, click Install in the AHPA Controller card.

  4. In the message that appears, click OK.
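After the installation is complete, you can confirm that the AHPA controller is running before you continue. A minimal check, assuming the controller is deployed to the kube-system namespace under a name that contains "ahpa" (the exact component name may differ across versions):

```shell
# List controller pods whose names contain "ahpa"; the pod should be in the
# Running state before you proceed to the next step.
kubectl get pods -n kube-system | grep -i ahpa
```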

Step 2: Configure Prometheus to collect metrics

Configure Prometheus to collect the response time (RT) and requests per second (RPS) metrics of your Knative Service.

1. Configure metric collection rules

  1. Log on to the ARMS console.
  2. In the left-side navigation pane, choose Prometheus Service > Prometheus Instances.

  3. In the top navigation bar, select the region in which your cluster resides and click the name of the Prometheus instance that is used to monitor your cluster in the Instance Name column. The details page of the Prometheus instance appears.

  4. In the left-side navigation pane, click Service Discovery. Then, click the Configure tab.

  5. On the Configure tab, click Custom Service Discovery, and then click Add.

  6. In the Add Custom Service Discovery dialog box, configure metric collection rules.

    Example:

    job_name: queue-proxy
    scrape_interval: 3s
    scrape_timeout: 3s
    kubernetes_sd_configs:
    - role: pod
    relabel_configs:
    - source_labels:
      - __meta_kubernetes_pod_label_serving_knative_dev_revision
      - __meta_kubernetes_pod_container_port_name
      action: keep
      regex: .+;http-usermetric
    - source_labels:
      - __meta_kubernetes_namespace
      target_label: namespace
    - source_labels:
      - __meta_kubernetes_pod_name
      target_label: pod
    - source_labels:
      - __meta_kubernetes_service_name
      target_label: service
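The keep rule in the example above retains only pods that carry the serving.knative.dev/revision label and expose a container port named http-usermetric, the port on which the Knative queue-proxy sidecar exposes user metrics. A sketch of how to check which pods qualify:

```shell
# Show the container port names of pods that belong to a Knative revision;
# pods whose ports include "http-usermetric" are the ones Prometheus scrapes.
kubectl get pods -l serving.knative.dev/revision \
  -o custom-columns='POD:.metadata.name,PORTS:.spec.containers[*].ports[*].name'
```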

2. Add Prometheus as a data source

  1. Log on to the ARMS console.
  2. In the left-side navigation pane, choose Prometheus Service > Prometheus Instances.

  3. In the upper-left corner of the Prometheus Monitoring page, select the region in which your Prometheus instances are deployed. Then, click the name of a Prometheus instance whose Instance Type is Prometheus for container service. The details page of the Prometheus instance appears.

  4. In the left-side navigation pane, click Settings and copy the public endpoint in the HTTP API URL section.

  5. In the HTTP API URL section, click Generate Token to generate a token. The token is used for authentication when you access the Prometheus instance.

  6. Create an application-intelligence.yaml file with the following content. The file specifies the public endpoint of the Prometheus instance.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: application-intelligence
      namespace: kube-system
    data:
      armsUrl: "https://cn-hangzhou.arms.aliyuncs.com:9443/api/v1/prometheus/da9d7dece901db4c9fc7f5b9c40****/158120454317****/cc6df477a982145d986e3f79c985a****/cn-hangzhou"
      token: "****"

    Descriptions of parameters:

    • armsUrl: Specify the public endpoint of the Prometheus instance that you copied in Step 4.

    • token: Specify the token that was generated in Step 5.

  7. Run the following command to create a ConfigMap:

    kubectl apply -f application-intelligence.yaml
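You can then verify that the ConfigMap was created and contains the expected endpoint and token:

```shell
# Print the ConfigMap; check that armsUrl and token match the values that you
# copied from the ARMS console.
kubectl get configmap application-intelligence -n kube-system -o yaml
```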

Step 3: Create a Knative Service

Create a Knative Service that has AHPA enabled.

  1. Create an autoscale-go.yaml file with the following content:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: autoscale-go
      namespace: default
    spec:
      template:
        metadata:
          labels:
            app: autoscale-go
          annotations:
            autoscaling.knative.dev/class: ahpa.autoscaling.knative.dev
            autoscaling.knative.dev/target: "500"
            autoscaling.knative.dev/metric: "rt"
            autoscaling.knative.dev/minScale: "1"
            autoscaling.knative.dev/maxScale: "30"
            autoscaling.alibabacloud.com/scaleStrategy: "observer"
        spec:
          containers:
            - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/autoscale-go:0.1

    The following list describes the annotations:

    • autoscaling.knative.dev/class: ahpa.autoscaling.knative.dev: Specifies that AHPA is enabled.

    • autoscaling.knative.dev/metric: "rt": The metric based on which AHPA scales your application. Only the RT metric is supported.

    • autoscaling.knative.dev/target: "500": The scaling threshold for the metric. In this example, the threshold is set to 500 milliseconds.

    • autoscaling.knative.dev/minScale: "1": The minimum number of pods to which AHPA can scale your application. In this example, the value is set to 1.

    • autoscaling.knative.dev/maxScale: "30": The maximum number of pods to which AHPA can scale your application. In this example, the value is set to 30.

    • autoscaling.alibabacloud.com/scaleStrategy: "observer": The scaling mode of AHPA. Default value: observer.

      • observer: AHPA observes but does not scale your application. You can use this mode to check whether AHPA works as expected. observer is the default mode because AHPA makes predictions based on historical data collected over the last seven days.

      • auto: AHPA scales your application based on the collected metric and the specified threshold.

  2. Run the following command to enable AHPA:

    kubectl apply -f autoscale-go.yaml
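After the Service is deployed, you can check its status and, once the predictions in observer mode look reasonable, switch AHPA to auto mode. A sketch, assuming the mode is changed by updating the scaleStrategy annotation on the revision template shown in the manifest above:

```shell
# Check that the Knative Service is ready and note its URL.
kubectl get ksvc autoscale-go

# Switch AHPA from observer mode to auto mode so that it actually scales the
# application. This creates a new revision with the updated annotation.
kubectl patch ksvc autoscale-go --type merge \
  -p '{"spec":{"template":{"metadata":{"annotations":{"autoscaling.alibabacloud.com/scaleStrategy":"auto"}}}}}'
```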

Verify the AHPA policy

You can compare the values of the RT metric of your Knative Service before and after AHPA predictive scaling is enabled, as shown in the following figure.

(Figure: comparison of the RT metric before and after AHPA predictive scaling is enabled)
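To produce data for this comparison, you can send periodic load to the Service. A hedged example using the hey load generator, assuming the image behaves like the upstream Knative autoscale-go sample, which accepts a sleep query parameter in milliseconds; the URL is a placeholder, so replace it with the one reported by kubectl get ksvc autoscale-go:

```shell
# Generate 30 seconds of load with 50 concurrent connections; each request
# sleeps 100 ms in the application, which drives the RT metric that AHPA uses.
hey -z 30s -c 50 "http://autoscale-go.default.example.com?sleep=100"
```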