Knative allows you to use Advanced Horizontal Pod Autoscaler (AHPA) in Serverless Kubernetes (ASK) clusters. If your application requests resources in a periodic pattern, you can use AHPA to predict changes in resource requests and prefetch resources for scaling activities. This reduces the impact of cold starts when your application is scaled. This topic describes how to use AHPA to enable predictive scaling in Knative.
Prerequisites
- Knative is deployed in your cluster. For more information, see Enable Knative.
- Application Real-Time Monitoring Service (ARMS) Prometheus is enabled. For more information, see Enable ARMS Prometheus.
Step 1: Install Application Intelligence Controller
Install Application Intelligence Controller from the Add-ons page of the Container Service for Kubernetes (ACK) console.
- Log on to the ACK console.
- In the left-side navigation pane of the ACK console, click Clusters.
- On the Clusters page, find the cluster that you want to manage and click the name of the cluster or click Details in the Actions column. The details page of the cluster appears.
- In the left-side navigation pane of the details page, choose Operations > Add-ons.
- On the Add-ons page, click the Others tab. Find Application Intelligence Controller and click Install.
- In the Install Application Intelligence Controller message, click OK.
Step 2: Configure Prometheus to collect metrics
Configure Prometheus to collect the response time (RT) and requests per second (RPS) metrics of your Knative Service.
a. Configure metric collection rules
- Log on to the ARMS console.
- In the left-side navigation pane, choose Prometheus Monitoring.
- In the top navigation bar, select the region in which your cluster resides and click the name of the Prometheus instance that is used to monitor your cluster in the Instance Name column. The details page of the Prometheus instance appears.
- In the left-side navigation pane, click Service Discovery. Then, click the Configure tab.
- On the Configure tab, click Custom Service Discovery. Then, click Add.
- In the Add custom service discovery dialog box, configure metric collection rules.
Example:
```yaml
job_name: queue-proxy
scrape_interval: 3s
scrape_timeout: 3s
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels:
  - __meta_kubernetes_pod_label_serving_knative_dev_revision
  - __meta_kubernetes_pod_container_port_name
  action: keep
  regex: .+;http-usermetric
- source_labels:
  - __meta_kubernetes_namespace
  target_label: namespace
- source_labels:
  - __meta_kubernetes_pod_name
  target_label: pod
- source_labels:
  - __meta_kubernetes_service_name
  target_label: service
```
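The keep rule in this configuration concatenates the values of the listed source labels with a semicolon and keeps only the pods whose joined value matches the regular expression, that is, pods that carry a Knative revision label and expose the http-usermetric port of the queue-proxy container. The following sketch illustrates that matching logic (a simplification of Prometheus relabeling; the label values are hypothetical):

```python
import re

def keep_pod(labels: dict) -> bool:
    # Prometheus joins the source_labels values with ";" (the default
    # separator) and applies the anchored regex to the joined string.
    # The "keep" action drops every target that does not match.
    joined = ";".join([
        labels.get("__meta_kubernetes_pod_label_serving_knative_dev_revision", ""),
        labels.get("__meta_kubernetes_pod_container_port_name", ""),
    ])
    return re.fullmatch(r".+;http-usermetric", joined) is not None

# The queue-proxy container of a Knative revision is kept ...
print(keep_pod({
    "__meta_kubernetes_pod_label_serving_knative_dev_revision": "autoscale-go-00001",
    "__meta_kubernetes_pod_container_port_name": "http-usermetric",
}))  # True
# ... while a pod without the revision label is dropped.
print(keep_pod({"__meta_kubernetes_pod_container_port_name": "http-usermetric"}))  # False
```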
b. Obtain the public endpoint and token of the Prometheus instance
- Log on to the ARMS console.
- In the left-side navigation pane, choose Prometheus Monitoring.
- In the upper-left corner of the Prometheus Monitoring page, select the region in which your Prometheus instances are deployed. Then, click the name of a Prometheus instance whose Instance Type is Prometheus for Container Service. The details page of the Prometheus instance appears.
- In the left-side navigation pane of the instance details page, click Settings and copy the public endpoint in the HTTP API Address section.
- In the HTTP API Address section, click Generate Token to generate a token. The token is used to pass the authentication when you access the Prometheus instance.
- Create an application-intelligence.yaml file with the following content. The file specifies the public endpoint and token of the Prometheus instance.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: application-intelligence
  namespace: kube-system
data:
  armsUrl: "https://cn-hangzhou.arms.aliyuncs.com:9443/api/v1/prometheus/da9d7dece901db4c9fc7f5b9c40****/158120454317****/cc6df477a982145d986e3f79c985a****/cn-hangzhou"
  token: "****"
```
- Run the following command to create a ConfigMap:
kubectl apply -f application-intelligence.yaml
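Before you wire the endpoint and token into the ConfigMap, you can sanity-check them against the Prometheus-compatible query API. The following is a minimal sketch, with a placeholder URL and token (use the values copied from the Settings page); passing the token in the Authorization header is an assumption, so confirm the expected header for your instance:

```python
import urllib.parse
import urllib.request

def build_query_request(arms_url: str, token: str, promql: str) -> urllib.request.Request:
    # The ARMS URL is a Prometheus-compatible base URL, so instant
    # queries go to <armsUrl>/api/v1/query.
    url = arms_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    # Assumption: the token is sent as an Authorization header.
    return urllib.request.Request(url, headers={"Authorization": token})

req = build_query_request(
    "https://cn-hangzhou.arms.aliyuncs.com:9443/api/v1/prometheus/masked/cn-hangzhou",  # placeholder
    "masked-token",  # placeholder
    'up{job="queue-proxy"}',
)
print(req.full_url)
# To actually send the query: urllib.request.urlopen(req)
```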
Step 3: Create a Knative Service
Create a Knative Service that has AHPA enabled.
- Create an autoscale-go.yaml file with the following content:
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: autoscale-go
  namespace: default
spec:
  template:
    metadata:
      labels:
        app: autoscale-go
      annotations:
        autoscaling.knative.dev/class: ahpa.autoscaling.knative.dev
        autoscaling.knative.dev/target: "500"
        autoscaling.knative.dev/metric: "rt"
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "30"
        autoscaling.alibabacloud.com/scaleStrategy: "observer"
    spec:
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/autoscale-go:0.1
```
The following list describes the parameters:
- autoscaling.knative.dev/class: ahpa.autoscaling.knative.dev: Specifies that AHPA is enabled.
- autoscaling.knative.dev/metric: "rt": The metric based on which AHPA scales your application. Only the RT metric is supported.
- autoscaling.knative.dev/target: "500": The scaling threshold for the metric. In this example, the threshold is set to 500 milliseconds.
- autoscaling.knative.dev/minScale: "1": The minimum number of pods to which AHPA can scale your application. In this example, the value is set to 1.
- autoscaling.knative.dev/maxScale: "30": The maximum number of pods to which AHPA can scale your application. In this example, the value is set to 30.
- autoscaling.alibabacloud.com/scaleStrategy: "observer": The scaling mode of AHPA. Default value: observer. Valid values:
  - observer: AHPA observes but does not scale your application. You can use this mode to check whether AHPA works as expected. This is the default mode because AHPA performs predictive scaling based on the historical data of the last seven days.
  - auto: AHPA scales your application based on the collected metric and the specified threshold.
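To build intuition for how the target and scale bounds interact, the following sketch applies the classic proportional autoscaling rule: size the deployment so that the observed metric per pod comes back under the target, then clamp the result to the minScale and maxScale bounds. This is an illustration only; AHPA's actual algorithm additionally predicts load from the last seven days of history:

```python
import math

def desired_replicas(current_replicas: int, observed_rt_ms: float,
                     target_rt_ms: float = 500.0,
                     min_scale: int = 1, max_scale: int = 30) -> int:
    # Proportional rule: scale replicas by the ratio of the observed
    # metric to the target, then clamp to [minScale, maxScale].
    raw = math.ceil(current_replicas * observed_rt_ms / target_rt_ms)
    return max(min_scale, min(max_scale, raw))

# Response time three times over the 500 ms target triples the replicas.
print(desired_replicas(current_replicas=4, observed_rt_ms=1500))  # 12
# A quiet service shrinks, but never below minScale.
print(desired_replicas(current_replicas=4, observed_rt_ms=100))   # 1
```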
- Run the following command to enable AHPA:
kubectl apply -f autoscale-go.yaml
Verify the AHPA policy
