When dealing with business bursts, more precise scaling can improve the response speed and further improve the efficiency of cluster resource utilization. This topic describes how to configure external metrics supported by Kubernetes, such as the HTTP request rate and the Ingress queries per second (QPS), to implement auto scaling policies.
In this example, a Deployment, a Service, and an Ingress named NGINX are created, and Horizontal Pod Autoscaler (HPA) is then configured to scale the application horizontally based on the Ingress QPS data that Simple Log Service (SLS) collects.
Step 1: Install the ack-alibaba-cloud-metrics-adapter component
The ack-alibaba-cloud-metrics-adapter component allows Kubernetes to obtain the monitoring data of Alibaba Cloud services, such as Elastic Compute Service (ECS), Server Load Balancer (SLB), and ApsaraDB RDS, by using the External Metrics API, which can enhance the monitoring and auto-scaling capabilities of clusters.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the name of the cluster that you want to manage and open the Helm page from the left-side navigation pane.
On the Helm page, click Deploy. In the Basic Information step, configure the parameters and select ack-alibaba-cloud-metrics-adapter. Then, click Next. The following table describes the parameters.
In the Parameters step, configure the Chart Version parameter and click OK.
Step 2: Create an application and a Service
Create a file named nginx-test.yaml.
Run the following command to create a Deployment and a Service:
kubectl apply -f nginx-test.yaml
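The contents of nginx-test.yaml are not reproduced above. As a rough sketch only, the following Python script builds a comparable Deployment and Service and prints them as a JSON List, a format that kubectl apply -f also accepts. The Deployment name nginx-deployment-basic matches the HPA output shown in Step 5, and the Service name nginx and port 80 match the default-nginx-80 route example. The app label, the replica count, and the nginx:latest image are assumptions for illustration, not values taken from the original file.

```python
import json

# Assumed label; any label works as long as the selectors stay consistent.
APP_LABELS = {"app": "nginx"}


def build_manifests(image="nginx:latest", replicas=2):
    """Return a Deployment and a Service wrapped in a v1 List."""
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "nginx-deployment-basic", "labels": APP_LABELS},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": APP_LABELS},
            "template": {
                "metadata": {"labels": APP_LABELS},
                "spec": {
                    "containers": [
                        {
                            "name": "nginx",
                            "image": image,  # assumed image, replace with your own
                            "ports": [{"containerPort": 80}],
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "nginx"},
        "spec": {
            "selector": APP_LABELS,
            "ports": [{"port": 80, "targetPort": 80, "protocol": "TCP"}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    print(json.dumps(build_manifests(), indent=2))
```

Redirecting the output to a file and passing that file to kubectl apply -f would have the same effect as applying an equivalent YAML file.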
Step 3: Create an Ingress
In the left-side navigation pane, open the Ingresses page. In the upper-left corner of the Ingresses page, click Create Ingress.
In the Create Ingress dialog box, configure the parameters and click OK. After you create the Ingress, the Ingresses page appears.
In the Name column, find and click the name of the newly created Ingress to view information about the Ingress. For more information about Ingresses, see Ingress management.
Step 4: Configure HPA
In this example, HPA scales the application based on two SLS metrics: sls_ingress_qps and sls_ingress_latency_p9999.
Set the target type of the sls_ingress_qps metric to AverageValue. This specifies that the metric value compared with the target is the total QPS divided by the number of pods.
Set the target type of the sls_ingress_latency_p9999 metric to Value. This specifies that the latency is compared with the target directly and is not divided by the number of pods.
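The difference between the two target types is easiest to see with numbers. The following Python sketch approximates how Kubernetes HPA derives a replica count from a single external metric. It is a simplification: the function name and parameters are illustrative, the 0.1 tolerance is the Kubernetes default, and behaviors such as stabilization windows and pods that are not ready are ignored.

```python
import math


def desired_replicas(metric_total, target, current_replicas,
                     target_type, min_replicas, max_replicas,
                     tolerance=0.1):
    """Approximate the HPA replica calculation for one external metric."""
    if target_type == "AverageValue":
        # The metric total is divided across the pods before comparison.
        ratio = metric_total / (target * current_replicas)
        proposal = math.ceil(metric_total / target)
    elif target_type == "Value":
        # The raw metric value is compared with the target directly.
        ratio = metric_total / target
        proposal = math.ceil(ratio * current_replicas)
    else:
        raise ValueError(f"unsupported target type: {target_type}")

    # Inside the tolerance band, HPA keeps the current size.
    if abs(ratio - 1.0) <= tolerance:
        proposal = current_replicas

    # The result is always clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, proposal))


# 210 QPS in total over 10 pods is 21 per pod against a target of 10.
print(desired_replicas(210, 10, 10, "AverageValue", 2, 10))  # 10 (capped)
# A latency of 20 against a target of 10 doubles the pod count.
print(desired_replicas(20, 10, 2, "Value", 2, 10))  # 4
```

When an HPA lists several metrics, Kubernetes evaluates each one separately and uses the largest proposed replica count.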
Create a file named ingress-hpa.yaml and add the following content to the file:
The following table describes the parameters that are used to configure HPA.
Parameter | Required | Description |
sls.ingress.route | Yes | Parameter format: <namespace>-<svc>-<port>. <namespace> specifies the namespace to which the Ingress belongs. <svc> specifies the name of the Service that you selected when you created the Ingress. <port> specifies the port of the Service. Example: default-nginx-80. |
sls.logstore | Yes | The name of the Logstore in SLS. The default value of sls.logstore is nginx-ingress. |
sls.project | Yes | The name of the project in SLS. The default value of sls.project is k8s-log-{{ClusterId}}, where {{ClusterId}} is the ID of the cluster. |
sls.internal.endpoint | No | Specifies whether SLS is accessed over an internal network. Default value: true. true: Access SLS over an internal network. false: Access SLS over the Internet. |
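The parameters above are passed to the metrics adapter as label selectors on each external metric. The manifest for ingress-hpa.yaml is not reproduced in this topic, so the following Python script is only a sketch of what an equivalent autoscaling/v2 object could look like, printed as JSON. The HPA name, the target Deployment, and the replica bounds are taken from the expected output in Step 5. The cluster ID placeholder, the latency target of 10, and the use of autoscaling/v2 (which requires a cluster version that serves that API) are assumptions.

```python
import json


def external_metric(name, labels, target_type, target_value):
    """Build one External metric entry for an autoscaling/v2 HPA."""
    key = "averageValue" if target_type == "AverageValue" else "value"
    return {
        "type": "External",
        "external": {
            "metric": {"name": name, "selector": {"matchLabels": labels}},
            "target": {"type": target_type, key: str(target_value)},
        },
    }


def build_hpa(cluster_id, route="default-nginx-80"):
    """Return an HPA that scales on Ingress QPS and tail latency."""
    labels = {
        "sls.project": f"k8s-log-{cluster_id}",  # default project name
        "sls.logstore": "nginx-ingress",         # default Logstore name
        "sls.ingress.route": route,              # <namespace>-<svc>-<port>
    }
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": "ingress-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": "nginx-deployment-basic",
            },
            "minReplicas": 2,
            "maxReplicas": 10,
            "metrics": [
                # Total QPS divided by the number of pods, target 10 per pod.
                external_metric("sls_ingress_qps", labels, "AverageValue", 10),
                # Latency compared as-is; the target 10 is a placeholder.
                external_metric("sls_ingress_latency_p9999", labels, "Value", 10),
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_hpa("<your-cluster-id>"), indent=2))
```

The optional sls.internal.endpoint label is omitted here, so its default value of true applies.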
Run the following command to configure HPA:
kubectl apply -f ingress-hpa.yaml
Step 5: Verify the configuration
After you configure HPA, run the following command to perform a stress test:
ab -t 300 -c 10 <Domain name of the Ingress> # Use Apache Benchmark to send requests to the Service exposed by the Ingress. The test runs for up to 300 seconds and keeps 10 requests in flight concurrently.
Check whether HPA works as expected.
In the left-side navigation pane of the ACK console, click Clusters. On the Clusters page, find the cluster that you want to manage and choose
in the Actions column.
Run the following command to check the status of HPA:
kubectl get hpa ingress-hpa
Expected output:
NAME          REFERENCE                           TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
ingress-hpa   Deployment/nginx-deployment-basic   21/10 (avg)   2         10        10         7m49s
If the value in the REPLICAS column is the same as the value in the MAXPODS column, HPA scaled out the application as expected.
FAQ
How do I use a CLI to obtain the data of QPS metrics such as sls_ingress_qps?
Run the following command to query data. In this example, the sls_ingress_qps metric is used.
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/sls_ingress_qps?labelSelector=sls.project={{SLS_Project}},sls.logstore=nginx-ingress"
{{SLS_Project}} is the name of the SLS project used by the ACK cluster. The default name of the SLS project used by an ACK cluster is k8s-log-{{ClusterId}}. {{ClusterId}} is the ID of the cluster.
Expected output when no data is available for the queried metric:
Error from server: {
  "httpCode": 400,
  "errorCode": "ParameterInvalid",
  "errorMessage": "key (slb_pool_name) is not config as key value config,if symbol : is in your log,please wrap : with quotation mark \"",
  "requestID": "xxxxxxx"
}
This output indicates that no data is returned for the queried metric. In this case, the sls_alb_ingress_qps metric was queried, and no data is returned because no Application Load Balancer (ALB) Ingress is created.
Expected output when data is available:
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "sls_ingress_qps",
      "timestamp": "2025-02-26T16:45:00Z",
      "value": "50", # The value of QPS.
      "metricLabels": {
        "sls.project": "your-sls-project-name",
        "sls.logstore": "nginx-ingress"
      }
    }
  ]
}
The command output indicates that the QPS metric exists. The value field contains the QPS value.
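If you want to read the value in a script rather than by eye, the response can be parsed as ordinary JSON. The following Python sketch extracts metric names and values from an ExternalMetricValueList. The sample payload mirrors the output above without the inline annotation, which is not valid JSON. The helper names are illustrative, and the quantity parser handles only plain numbers and the milli suffix (m), not the full Kubernetes quantity syntax.

```python
import json
from decimal import Decimal

SAMPLE = """
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "sls_ingress_qps",
      "timestamp": "2025-02-26T16:45:00Z",
      "value": "50",
      "metricLabels": {
        "sls.project": "your-sls-project-name",
        "sls.logstore": "nginx-ingress"
      }
    }
  ]
}
"""


def parse_quantity(text):
    """Convert a plain or milli-suffixed quantity string to Decimal."""
    if text.endswith("m"):
        return Decimal(text[:-1]) / 1000
    return Decimal(text)


def metric_values(raw_json):
    """Return (metricName, value) pairs from an ExternalMetricValueList."""
    doc = json.loads(raw_json)
    if doc.get("kind") != "ExternalMetricValueList":
        raise ValueError("unexpected response kind")
    return [
        (item["metricName"], parse_quantity(item["value"]))
        for item in doc.get("items", [])
    ]


for name, value in metric_values(SAMPLE):
    print(f"{name} = {value}")  # sls_ingress_qps = 50
```

An empty items list means that the adapter answered but found no data points for the selector.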
What do I do if the TARGETS column is <unknown> after I run the kubectl get hpa command?
To resolve this issue, perform the following steps.
Run the kubectl describe hpa <hpa_name> command to determine why HPA does not work as expected.
If the value of AbleToScale is False in the Conditions field, check whether the Deployment is successfully created.
If the value of ScalingActive is False in the Conditions field, proceed to the next step.
Run the kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/" command. If the Error from server (NotFound): the server could not find the requested resource error message appears, check the status of the alibaba-cloud-metrics-adapter component.
If the status of the alibaba-cloud-metrics-adapter component is normal, check whether the HPA metrics are related to the Ingress. If the HPA metrics are related to the Ingress, make sure that you install the SLS component before you install ack-alibaba-cloud-metrics-adapter. For more information, see Analyze and monitor the access log of nginx-ingress.
Make sure that the values of the HPA metrics are valid. The value of sls.ingress.route must be in the <namespace>-<svc>-<port> format.
namespace specifies the namespace to which the Ingress belongs.
svc specifies the name of the Service that you selected when you created the Ingress.
port specifies the port of the Service.
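A malformed sls.ingress.route value is easy to miss, so a quick check before you apply the HPA can save a debugging round. The following Python sketch builds the route from its three parts and performs a rough shape check. Both helper names are illustrative. The check assumes a numeric Service port and lowercase names made of letters, digits, and hyphens, and it cannot tell where the namespace ends and the Service name begins because both may contain hyphens.

```python
import re


def build_route(namespace, service, port):
    """Join the three parts into the <namespace>-<svc>-<port> format."""
    return f"{namespace}-{service}-{port}"


def looks_like_route(value):
    """Rough shape check for an sls.ingress.route value."""
    prefix, sep, port = value.rpartition("-")
    if not sep or not (port.isascii() and port.isdigit()):
        return False
    if not 1 <= int(port) <= 65535:
        return False
    # The prefix must contain at least "<namespace>-<svc>".
    return re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)+", prefix) is not None


route = build_route("default", "nginx", 80)
print(route, looks_like_route(route))          # default-nginx-80 True
print(looks_like_route("default/nginx:80"))    # False
```

A value that passes this check can still point at a Service or port that does not exist, so treat it as a typo filter rather than a full validation.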
How do I find the metrics that are supported by HPA?
For more information about the metrics that are supported by HPA, see Alibaba Cloud metrics adapter. The following table describes the commonly used metrics.
Metric | Description | Additional parameter |
sls_ingress_qps | The number of requests that an Ingress can process per second based on a specific routing rule. | sls.ingress.route |
sls_alb_ingress_qps | The number of requests that the ALB Ingress can process per second based on a specific routing rule. | sls.ingress.route |
sls_ingress_latency_avg | The average latency of all requests. | sls.ingress.route |
sls_ingress_latency_p50 | The maximum latency for the fastest 50% of all requests. | sls.ingress.route |
sls_ingress_latency_p95 | The maximum latency for the fastest 95% of all requests. | sls.ingress.route |
sls_ingress_latency_p99 | The maximum latency for the fastest 99% of all requests. | sls.ingress.route |
sls_ingress_latency_p9999 | The maximum latency for the fastest 99.99% of all requests. | sls.ingress.route |
sls_ingress_inflow | The inbound bandwidth of the Ingress. | sls.ingress.route |
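The percentile metrics in the table can be illustrated with a small calculation. The following Python sketch uses the nearest-rank method to find the largest latency among the fastest N% of requests. The sample latencies are made up, and the method that SLS actually uses to compute these metrics is not described in this topic, so treat the sketch as an explanation of the concept only.

```python
import math


def percentile(latencies, pct):
    """Nearest-rank percentile: the largest latency among the fastest pct%."""
    if not latencies:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(latencies)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Ten hypothetical request latencies in milliseconds, one of them slow.
samples = [12, 15, 11, 250, 14, 13, 18, 16, 17, 19]

print(percentile(samples, 50))     # 15
print(percentile(samples, 95))     # 250
print(sum(samples) / len(samples)) # 38.5
```

The single slow request barely moves the median but dominates the high percentiles, which is why a tail metric such as sls_ingress_latency_p9999 reacts to outliers that sls_ingress_latency_avg smooths over.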
How do I configure horizontal autoscaling after I change the format of NGINX Ingress logs?
In this topic, horizontal pod autoscaling is performed based on the Ingress metrics that are collected by SLS. You must configure SLS to collect NGINX Ingress logs.
When you create an ACK cluster, SLS is enabled for the cluster by default. If you use the default log collection settings, you can view the log analysis reports and real-time status of NGINX Ingresses in the SLS console after you create the cluster.
If you disable SLS when you create an ACK cluster, you cannot perform horizontal pod autoscaling based on the Ingress metrics that are collected by SLS. You must enable SLS for the cluster before you can perform horizontal pod autoscaling. For more information, see Analyze and monitor the access log of nginx-ingress-controller.
The AliyunLogConfig that is generated the first time you enable SLS applies only to the default log format that ACK defines for the Ingress controller. If you have changed the log format, you must modify the processor_regex settings in the AliyunLogConfig. For more information, see Use the Simple Log Service console to collect container text logs in DaemonSet mode.