In many scenarios, you may need extra metrics to implement autoscaling of applications in a cluster. This topic describes how to automatically scale your workloads based on Alibaba Cloud metrics.
For example, you may need to scale applications based on metrics such as the HTTP request rate or the ingress request rate. By default, Horizontal Pod Autoscaler (HPA) does not support custom or external metrics. However, Kubernetes provides the external metrics API, which you can use to implement more flexible scaling mechanisms.
- Log on to the Container Service console. In the left-side navigation pane, choose . On the App Catalog page that appears, choose .
- Click ack-alibaba-cloud-metrics-adapter. In the Deploy section that appears on the right, click Create.
- In the left-side navigation pane, choose . Verify that ack-alibaba-cloud-metrics-adapter has been deployed to the target cluster.
- In the left-side navigation pane, choose .
- On the Create from Template page, enter the following code to create an application and expose it with a ClusterIP service. Click Create.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-basic
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      namespace: default
    spec:
      ports:
      - port: 80
        protocol: TCP
        targetPort: 80
      selector:
        app: nginx
      type: ClusterIP
- In the left-side navigation pane, choose . On the Ingresses page that appears, click Create in the upper-right corner.
- In the Create dialog box, set the parameters and click Create to create an ingress. After the ingress is created, you are automatically redirected to the Ingresses page.
- On the Ingresses page, find the newly created ingress and click Details to view ingress information.
- Configure HPA.
Note: Before you configure HPA, make sure that your cluster runs an up-to-date Kubernetes version and that ingress log collection is enabled. For more information, see Upgrade a cluster and Analyze logs of Ingress to monitor access to Ingress.
- In the left-side navigation pane, choose .
- On the Templates page, select HPA and click Create Application to deploy the following YAML file.
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: ingress-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx-deployment-basic
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: sls_ingress_qps
              selector:
                matchLabels:
                  sls.project: "***"
                  sls.logstore: "nginx-ingress"
                  sls.ingress.route: "default-nginx-80"
            target:
              type: AverageValue
              averageValue: 10
        - type: External
          external:
            metric:
              name: sls_ingress_latency_p9999
              selector:
                matchLabels:
                  # default ingress log project is k8s-log-clusterId
                  sls.project: "***"
                  # default ingress logstore is nginx-ingress
                  sls.logstore: "nginx-ingress"
                  # namespace-svc-port
                  sls.ingress.route: "default-nginx-80"
                  # sls vpc endpoint, default true
                  # sls.internal.endpoint: true
            target:
              type: Value
              # sls_ingress_latency_p9999 > 10ms
              value: 10

The following table lists the HPA parameters that are used in the preceding example.
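| Parameter | Description |
| --- | --- |
| sls.ingress.route | The ingress route, in namespace-svc-port format (for example, default-nginx-80). |
| sls.logstore | The Logstore that stores the ingress logs. The default is nginx-ingress. |
| sls.project | The Log Service project that stores the ingress logs. The default is k8s-log-clusterId. |
| sls.internal.endpoint | Whether the project has an internal endpoint. Default is true. |

Note: This topic uses the sls_ingress_qps and sls_ingress_latency_p9999 metrics to implement autoscaling. Each metric has a different type of target.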
- The target of sls_ingress_qps is AverageValue: the total request rate is divided by the number of pods, and the per-pod average is compared with the target.
- The target of sls_ingress_latency_p9999 is Value: the latency is compared with the target as a whole and is not divided by the number of pods.
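The difference between the two target types is easier to see with numbers. The following is a minimal sketch of the replica arithmetic that the Kubernetes HPA documentation describes for external metrics, not the controller's actual code. It assumes every current pod is ready and leaves out the tolerance band and stabilization window; the function names and the sample readings (45 QPS, 8 ms) are hypothetical.

```shell
#!/bin/sh
# Sketch of the HPA replica arithmetic for external metrics.
# Simplifications (assumptions): no tolerance band, no stabilization window,
# and every current pod is counted as ready.

# AverageValue target: desired = ceil(total_metric / target_average)
# $1 = total metric value, $2 = averageValue target
desired_for_average_value() {
  awk -v m="$1" -v t="$2" 'BEGIN { r = m / t; c = int(r); if (c < r) c++; print c }'
}

# Value target: desired = ceil(current_replicas * metric / target)
# $1 = current replicas, $2 = metric value, $3 = value target
desired_for_value() {
  awk -v n="$1" -v m="$2" -v t="$3" 'BEGIN { r = n * m / t; c = int(r); if (c < r) c++; print c }'
}

# With several metrics, the largest proposal wins; minReplicas and
# maxReplicas then bound the result.
# $1, $2 = per-metric proposals, $3 = minReplicas, $4 = maxReplicas
pick_replicas() {
  awk -v a="$1" -v b="$2" -v lo="$3" -v hi="$4" 'BEGIN {
    a += 0; b += 0; lo += 0; hi += 0   # force numeric comparison
    v = (a > b) ? a : b
    if (v < lo) v = lo
    if (v > hi) v = hi
    print v
  }'
}

# Hypothetical readings: 2 pods, 45 QPS in total, 8 ms P99.99 latency.
by_qps=$(desired_for_average_value 45 10)   # averageValue: 10
by_latency=$(desired_for_value 2 8 10)      # value: 10
pick_replicas "$by_qps" "$by_latency" 2 10  # minReplicas: 2, maxReplicas: 10
```

With these sample readings the request rate, not the latency, drives the scale-out: 45 QPS spread over a target of 10 QPS per pod calls for 5 pods, while a latency below the 10 ms target proposes no increase.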
- On the Templates - HPA page, click Create.
- After HPA is configured, run the following script to test the configurations.
    #!/bin/bash
    # Use Apache Benchmark to send requests to the application exposed by the ingress.
    # The test runs for up to 300 seconds with 10 concurrent requests.
    ab -t 300 -c 10 <The domain of the ingress>
- Check the scaling status.
- In the left-side navigation pane, choose Clusters. On the Clusters page, find the target cluster and click Manage in the Actions column.
- At the top of the page, click Open Cloud Shell.
- Run the kubectl describe hpa ingress-hpa command to check the scaling status.
Q: What do I do if the TARGETS column shows <unknown> after I run the kubectl get hpa command?
A: Check whether the metric name is correct in the HorizontalPodAutoscaler configuration.
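One way to check a metric name is to query the external metrics API directly, bypassing HPA. The sketch below builds the request path for the standard Kubernetes external metrics API (external.metrics.k8s.io/v1beta1). The helper name external_metric_path is hypothetical, and it is an assumption that the adapter answers raw queries with the same label selector that the HPA manifest uses.

```shell
#!/bin/sh
# Build the external metrics API path for a namespace, metric name, and
# label selector. "=" and "," in the selector are percent-encoded so that
# the selector survives as a single query parameter.
# $1 = namespace, $2 = metric name, $3 = selector (key=value,key=value)
external_metric_path() (
  namespace="$1"
  metric="$2"
  selector=$(printf '%s' "$3" | sed -e 's/=/%3D/g' -e 's/,/%2C/g')
  printf '/apis/external.metrics.k8s.io/v1beta1/namespaces/%s/%s?labelSelector=%s\n' \
    "$namespace" "$metric" "$selector"
)

# Example (requires cluster access; replace <project> with your own value):
# kubectl get --raw "$(external_metric_path default sls_ingress_qps \
#   'sls.project=<project>,sls.logstore=nginx-ingress,sls.ingress.route=default-nginx-80')"
```

If the query returns an error or an empty item list, the metric name or one of the labels probably does not match what the adapter exposes. Running kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" first shows whether the external metrics API is registered at all.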
Q: Where can I find the metrics that are supported by HPA?
A: For more information about supported metrics, see Alibaba Cloud metrics adapter. The following table lists the commonly used metrics.
| Metric | Description | Additional parameter |
| --- | --- | --- |
| sls_ingress_qps | QPS of a specific ingress route | sls.ingress.route |
| sls_ingress_latency_avg | Latency of all requests | sls.ingress.route |
| sls_ingress_latency_p50 | Latency of 50% of requests | sls.ingress.route |
| sls_ingress_latency_p95 | Latency of 95% of requests | sls.ingress.route |
| sls_ingress_latency_p99 | Latency of 99% of requests | sls.ingress.route |
| sls_ingress_latency_p9999 | Latency of 99.99% of requests | sls.ingress.route |
| sls_ingress_inflow | Inflow bandwidth of the ingress | sls.ingress.route |