Knative allows you to use Advanced Horizontal Pod Autoscaler (AHPA) in Serverless Kubernetes (ASK) clusters. If your application requests resources in a periodic pattern, you can use AHPA to predict changes in resource requests and prefetch resources for scaling activities. This reduces the impact of cold starts when your application is scaled. This topic describes how to use AHPA to enable predictive scaling in Knative.
Prerequisites
- Knative is deployed in your cluster. For more information, see Enable Knative.
- Application Real-Time Monitoring Service (ARMS) Prometheus is enabled. For more information, see Enable ARMS Prometheus.
Step 1: Install Application Intelligence Controller
Install Application Intelligence Controller from the Add-ons page of the Container Service for Kubernetes (ACK) console.
- Log on to the ACK console.
- In the left-side navigation pane of the ACK console, click Clusters.
- On the Clusters page, find the cluster that you want to manage and click the name of the cluster or click Details in the Actions column. The details page of the cluster appears.
- In the left-side navigation pane of the details page, choose Operations > Add-ons.
- On the Add-ons page, click the Others tab. Find Application Intelligence Controller and click Install.
- In the Install Application Intelligence Controller message, click OK.
Step 2: Configure Prometheus to collect metrics
Configure Prometheus to collect the response time (RT) and requests per second (RPS) metrics of your Knative Service.
a. Configure metric collection rules
- Log on to the ARMS console.
- In the left-side navigation pane, choose Prometheus Monitoring.
- In the top navigation bar, select the region in which your cluster resides and click the name of the Prometheus instance that is used to monitor your cluster in the Instance Name column. The details page of the Prometheus instance appears.
- In the left-side navigation pane, click Service Discovery. Then, click the Configure tab.
- On the Configure tab, click Custom Service Discovery. Then, click Add.
- In the Add custom service discovery dialog box, configure metric collection rules.
Example:
```yaml
job_name: queue-proxy
scrape_interval: 3s
scrape_timeout: 3s
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels:
  - __meta_kubernetes_pod_label_serving_knative_dev_revision
  - __meta_kubernetes_pod_container_port_name
  action: keep
  regex: .+;http-usermetric
- source_labels:
  - __meta_kubernetes_namespace
  target_label: namespace
- source_labels:
  - __meta_kubernetes_pod_name
  target_label: pod
- source_labels:
  - __meta_kubernetes_service_name
  target_label: service
```
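The keep rule in this configuration concatenates the values of the listed source labels with a semicolon and keeps only the pods whose joined value matches the regular expression, that is, pods that carry a Knative revision label and expose the http-usermetric port of the queue-proxy container. The following sketch illustrates that matching logic (a simplification of Prometheus relabeling; the label values are hypothetical):

```python
import re

def keep_pod(labels: dict) -> bool:
    # Prometheus joins the source_labels values with ";" (the default
    # separator) and applies the anchored regex to the joined string.
    # The "keep" action drops every target that does not match.
    joined = ";".join([
        labels.get("__meta_kubernetes_pod_label_serving_knative_dev_revision", ""),
        labels.get("__meta_kubernetes_pod_container_port_name", ""),
    ])
    return re.fullmatch(r".+;http-usermetric", joined) is not None

# The queue-proxy container of a Knative revision is kept ...
print(keep_pod({
    "__meta_kubernetes_pod_label_serving_knative_dev_revision": "autoscale-go-00001",
    "__meta_kubernetes_pod_container_port_name": "http-usermetric",
}))  # True
# ... while a pod without the revision label is dropped.
print(keep_pod({"__meta_kubernetes_pod_container_port_name": "http-usermetric"}))  # False
```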
b. Obtain the public endpoint and token of the Prometheus instance
- Log on to the ARMS console.
- In the left-side navigation pane, choose Prometheus Monitoring.
- In the upper-left corner of the Prometheus Monitoring page, select the region in which your Prometheus instances are deployed. Then, click the name of a Prometheus instance whose Instance Type is Prometheus for Container Service. The details page of the Prometheus instance appears.
- In the left-side navigation pane of the instance details page, click Settings and copy the public endpoint in the HTTP API Address section.
- In the HTTP API Address section, click Generate Token to generate a token. The token is used to pass the authentication when you access the Prometheus instance.
- Create an application-intelligence.yaml file with the following content. The file specifies the public endpoint and token of the Prometheus instance.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: application-intelligence
  namespace: kube-system
data:
  armsUrl: "https://cn-hangzhou.arms.aliyuncs.com:9443/api/v1/prometheus/da9d7dece901db4c9fc7f5b9c40****/158120454317****/cc6df477a982145d986e3f79c985a****/cn-hangzhou"
  token: "****"
```
- Run the following command to create a ConfigMap:
kubectl apply -f application-intelligence.yaml
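Before you wire the endpoint and token into the ConfigMap, you can sanity-check them against the Prometheus-compatible query API. The following is a minimal sketch, with a placeholder URL and token (use the values copied from the Settings page); passing the token in the Authorization header is an assumption, so confirm the expected header for your instance:

```python
import urllib.parse
import urllib.request

def build_query_request(arms_url: str, token: str, promql: str) -> urllib.request.Request:
    # The ARMS URL is a Prometheus-compatible base URL, so instant
    # queries go to <armsUrl>/api/v1/query.
    url = arms_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    # Assumption: the token is sent as an Authorization header.
    return urllib.request.Request(url, headers={"Authorization": token})

req = build_query_request(
    "https://cn-hangzhou.arms.aliyuncs.com:9443/api/v1/prometheus/masked/cn-hangzhou",  # placeholder
    "masked-token",  # placeholder
    'up{job="queue-proxy"}',
)
print(req.full_url)
# To actually send the query: urllib.request.urlopen(req)
```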
Step 3: Create a Knative Service
Create a Knative Service that has AHPA enabled.
- Create an autoscale-go.yaml file with the following content:
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: autoscale-go
  namespace: default
spec:
  template:
    metadata:
      labels:
        app: autoscale-go
      annotations:
        autoscaling.knative.dev/class: ahpa.autoscaling.knative.dev
        autoscaling.knative.dev/target: "500"
        autoscaling.knative.dev/metric: "rt"
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "30"
        autoscaling.alibabacloud.com/scaleStrategy: "observer"
    spec:
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/autoscale-go:0.1
```
The following list describes the parameters:
- autoscaling.knative.dev/class: ahpa.autoscaling.knative.dev: Specifies that AHPA is enabled.
- autoscaling.knative.dev/metric: "rt": The metric based on which AHPA scales your application. Only the RT metric is supported.
- autoscaling.knative.dev/target: "500": The scaling threshold for the metric. In this example, the threshold is set to 500 milliseconds.
- autoscaling.knative.dev/minScale: "1": The minimum number of pods to which AHPA can scale your application. In this example, the value is set to 1.
- autoscaling.knative.dev/maxScale: "30": The maximum number of pods to which AHPA can scale your application. In this example, the value is set to 30.
- autoscaling.alibabacloud.com/scaleStrategy: "observer": The scaling mode of AHPA. Default value: observer. Valid values:
  - observer: AHPA observes but does not scale your application. You can use this mode to check whether AHPA works as expected. This is the default mode because AHPA performs predictive scaling based on the historical data of the last seven days.
  - auto: AHPA scales your application based on the collected metric and the specified threshold.
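To build intuition for how the target and scale bounds interact, the following sketch applies the classic proportional autoscaling rule: size the deployment so that the observed metric per pod comes back under the target, then clamp the result to the minScale and maxScale bounds. This is an illustration only; AHPA's actual algorithm additionally predicts load from the last seven days of history:

```python
import math

def desired_replicas(current_replicas: int, observed_rt_ms: float,
                     target_rt_ms: float = 500.0,
                     min_scale: int = 1, max_scale: int = 30) -> int:
    # Proportional rule: scale replicas by the ratio of the observed
    # metric to the target, then clamp to [minScale, maxScale].
    raw = math.ceil(current_replicas * observed_rt_ms / target_rt_ms)
    return max(min_scale, min(max_scale, raw))

# Response time three times over the 500 ms target triples the replicas.
print(desired_replicas(current_replicas=4, observed_rt_ms=1500))  # 12
# A quiet service shrinks, but never below minScale.
print(desired_replicas(current_replicas=4, observed_rt_ms=100))   # 1
```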
- Run the following command to enable AHPA:
kubectl apply -f autoscale-go.yaml
Verify the AHPA policy
