
Container Service for Kubernetes:Implement horizontal pod autoscaling based on Alibaba Cloud metrics

Last Updated: Mar 14, 2025

When your workloads experience traffic bursts, more precise scaling improves response speed and makes better use of cluster resources. This topic describes how to configure auto scaling policies based on external metrics supported by Kubernetes, such as the HTTP request rate and Ingress queries per second (QPS).

In this example, an NGINX Deployment, a Service, and an Ingress are created, and a Horizontal Pod Autoscaler (HPA) is configured to scale the Deployment horizontally based on the Ingress QPS metric collected by Simple Log Service (SLS).

Step 1: Install the ack-alibaba-cloud-metrics-adapter component

The ack-alibaba-cloud-metrics-adapter component allows Kubernetes to obtain the monitoring data of Alibaba Cloud services, such as Elastic Compute Service (ECS), Server Load Balancer (SLB), and ApsaraDB RDS, through the External Metrics API. This enhances the monitoring and auto scaling capabilities of your cluster.

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Applications > Helm in the left-side navigation pane.

  3. On the Helm page, click Deploy. In the Basic Information step, select ack-alibaba-cloud-metrics-adapter, configure the other parameters, and then click Next.

  4. In the Parameters step, configure the Chart Version parameter and click OK.
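
After the installation completes, you can optionally confirm that the External Metrics API is registered in the cluster. This is a generic Kubernetes check rather than output that is specific to this component:

    kubectl get apiservice v1beta1.external.metrics.k8s.io   # The AVAILABLE column shows True when the adapter serves external metrics.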

Step 2: Create an application and a Service

  1. Create a file named nginx-test.yaml.


    apiVersion: apps/v1 
    kind: Deployment
    metadata:
      name: nginx-deployment-basic
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9 
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      namespace: default
    spec:
      ports:
        - port: 80
          protocol: TCP
          targetPort: 80
      selector:
        app: nginx
      type: ClusterIP
  2. Run the following command to create a Deployment and a Service:

    kubectl apply -f nginx-test.yaml
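
    Optionally, you can confirm that both resources were created before you proceed. These are generic kubectl checks:

        kubectl get deployment nginx-deployment-basic   # Both replicas should become ready.
        kubectl get service nginx                       # The Service should have a cluster IP assigned.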

Step 3: Create an Ingress

  1. In the left-side navigation pane, choose Network > Ingresses. In the upper-left corner of the Ingresses page, click Create Ingress.

  2. In the Create Ingress dialog box, configure the parameters and click OK. After the Ingress is created, it is displayed on the Ingresses page.

  3. In the Name column, find and click the name of the newly created Ingress to view its details. For more information about Ingresses, see Ingress management.
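
If you prefer to create the Ingress with kubectl instead of the console, the following manifest is a minimal sketch. The host name test.example.com and the ingress class nginx are assumptions; replace them with your own domain name and the Ingress class used in your cluster. The Service name and port must match the nginx Service created in Step 2 so that the route label default-nginx-80 used in Step 4 is correct.

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: nginx-ingress
      namespace: default
    spec:
      ingressClassName: nginx        # Assumption: the NGINX Ingress class in your cluster.
      rules:
      - host: test.example.com       # Assumption: replace with your own domain name.
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx          # The Service created in Step 2.
                port:
                  number: 80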

Step 4: Configure HPA

This example configures HPA based on two SLS Ingress metrics: sls_ingress_qps and sls_ingress_latency_p9999.

  • sls_ingress_qps uses the AverageValue target type. The total QPS is divided by the number of pods before it is compared with the target value.

  • sls_ingress_latency_p9999 uses the Value target type. The latency is compared with the target value directly and is not divided by the number of pods.

  1. Create a file named ingress-hpa.yaml and add the following content to the file:


    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: ingress-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx-deployment-basic
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: sls_ingress_qps
              selector:
                matchLabels:
                  sls.project: "***" # Specify the SLS project that you want to use. 
                  sls.logstore: "nginx-ingress"
                  sls.ingress.route: "default-nginx-80"
            target:
              type: AverageValue
              averageValue: 10
        - type: External
          external:
            metric:
              name: sls_ingress_latency_p9999
              selector:
                matchLabels:
                  # default ingress log project is k8s-log-clusterId
                  sls.project: "***" 
                  # default ingress logstore is nginx-ingress
                  sls.logstore: "nginx-ingress"
                  # namespace-svc-port
                  sls.ingress.route: "default-nginx-80"
                  # sls vpc endpoint, default true
                  # sls.internal.endpoint:true
            target:
              type: Value
              # sls_ingress_latency_p9999>10ms specifies that HPA automatically increases the number of nginx-deployment-basic pods if the value exceeds 10 milliseconds. 
              value: 10

    The following parameters are used to configure HPA.

    • sls.ingress.route (required): The format is <namespace>-<svc>-<port>. <namespace> is the namespace to which the Ingress belongs, <svc> is the name of the Service that you selected when you created the Ingress, and <port> is the port of the Service. Example: default-nginx-80.

    • sls.logstore (required): The name of the Logstore in SLS. Default value: nginx-ingress.

    • sls.project (required): The name of the project in SLS. Default value: k8s-log-{cluster ID}.

    • sls.internal.endpoint (optional): Specifies whether SLS is accessed over an internal network. Default value: true.

      • true: Access SLS over an internal network.

      • false: Access SLS over the Internet.

  2. Run the following command to configure HPA:

    kubectl apply -f ingress-hpa.yaml
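
    Before you generate load, you can optionally run the following generic check to confirm that HPA can read the external metrics. If the current metric value shows <unknown>, see the FAQ in this topic:

      kubectl describe hpa ingress-hpa   # The Metrics and Conditions fields show whether the external metrics are retrieved.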

Step 5: Verify the configuration

  1. After you configure HPA, run the following command to perform a stress test:

    ab -t 300 -c 10 http://<Domain name of the Ingress>/ # Use Apache Benchmark (ab) to send requests to the application exposed by the Ingress for up to 300 seconds with 10 concurrent connections.

    You can also watch the scale-out in real time while the test runs, as shown at the end of this step.
  2. Check whether HPA works as expected.

    1. In the left-side navigation pane of the ACK console, click Clusters. On the Clusters page, find the cluster that you want to manage and choose More > Open Cloud Shell in the Actions column.

    2. Run the following command to check the status of HPA:

      kubectl get hpa ingress-hpa

      Expected output:

      NAME            REFERENCE                              TARGETS           MINPODS    MAXPODS    REPLICAS    AGE
      ingress-hpa     Deployment/nginx-deployment-basic      21/10 (avg)       2          10         10          7m49s

      If the value of the REPLICAS parameter is the same as the value of the MAXPODS parameter, HPA scaled out the application as expected.
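
While the stress test in the first step is running, you can also watch the scale-out in real time. These are generic kubectl commands:

    kubectl get hpa ingress-hpa --watch     # Watch the metric values and the replica count change.
    kubectl get pods -l app=nginx --watch   # Watch new NGINX pods being created.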

FAQ

How do I use the CLI to query the data of external metrics such as sls_ingress_qps?

Run the following command to query data. In this example, the sls_ingress_qps metric is used.

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/sls_ingress_qps?labelSelector=sls.project={{SLS_Project}},sls.logstore=nginx-ingress"

Replace {{SLS_Project}} with the name of the SLS project used by the ACK cluster. The default project name is k8s-log-{{ClusterId}}, where {{ClusterId}} is the ID of the cluster.

If no data exists for the queried metric, an error is returned. For example, if you query the sls_alb_ingress_qps metric but no Application Load Balancer (ALB) Ingress is created, the output is similar to the following:

Error from server: {
    "httpCode": 400,
    "errorCode": "ParameterInvalid",
    "errorMessage": "key (slb_pool_name) is not config as key value config,if symbol : is  in your log,please wrap : with quotation mark \"",
    "requestID": "xxxxxxx"
}

If data exists for the queried metric, such as sls_ingress_qps in this example, the expected output is similar to the following:

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "sls_ingress_qps",
      "timestamp": "2025-02-26T16:45:00Z", 
      "value": "50", # The value of QPS.
      "metricLabels": {
        "sls.project": "your-sls-project-name",
        "sls.logstore": "nginx-ingress"
      }
    }
  ]
}

The output indicates that data exists for the sls_ingress_qps metric. The value field shows the current QPS.
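
You can also scope the query to a specific namespace instead of using the * wildcard. The path follows the standard Kubernetes External Metrics API format, and the default namespace below is only an example:

    kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/sls_ingress_qps?labelSelector=sls.project={{SLS_Project}},sls.logstore=nginx-ingress"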

What do I do if the target column is <unknown> after I run the kubectl get hpa command?

To resolve this issue, perform the following steps.

  1. Run the kubectl describe hpa <hpa_name> command to determine why HPA does not work as expected.

    • If the value of AbleToScale is False in the Conditions field, check whether the Deployment is successfully created.

    • If the value of ScalingActive is False in the Conditions field, proceed to the next step.

  2. Run the kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/" command. If the Error from server (NotFound): the server could not find the requested resource error message appears, check the status of the alibaba-cloud-metrics-adapter component.

    If the status of the alibaba-cloud-metrics-adapter component is normal, check whether the HPA metrics are related to the Ingress. If the HPA metrics are related to the Ingress, make sure that you install the SLS component before you install ack-alibaba-cloud-metrics-adapter. For more information, see Analyze and monitor the access log of nginx-ingress.

  3. Make sure that the values of the HPA metrics are valid. The value of sls.ingress.route must be in the <namespace>-<svc>-<port> format.

    • namespace specifies the namespace to which the Ingress belongs.

    • svc specifies the name of the Service that you selected when you created the Ingress.

    • port specifies the port of the Service.
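
The checks in the preceding steps can be run as follows. ingress-hpa is the HPA name used in this topic, and the kube-system namespace for the adapter pod is an assumption; adjust both to match your environment:

    kubectl describe hpa ingress-hpa                             # Check AbleToScale and ScalingActive in the Conditions field.
    kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/"   # Verify that the External Metrics API is served.
    kubectl get pods -n kube-system | grep metrics-adapter       # Assumption: the adapter runs in kube-system; check the pod status.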

How do I find the metrics that are supported by HPA?

For more information about the metrics that are supported by HPA, see Alibaba Cloud metrics adapter. The following commonly used metrics all take sls.ingress.route as an additional parameter.

  • sls_ingress_qps: The number of requests that the Ingress processes per second for a specific routing rule.

  • sls_alb_ingress_qps: The number of requests that the ALB Ingress processes per second for a specific routing rule.

  • sls_ingress_latency_avg: The average latency of all requests.

  • sls_ingress_latency_p50: The maximum latency for the fastest 50% of all requests.

  • sls_ingress_latency_p95: The maximum latency for the fastest 95% of all requests.

  • sls_ingress_latency_p99: The maximum latency for the fastest 99% of all requests.

  • sls_ingress_latency_p9999: The maximum latency for the fastest 99.99% of all requests.

  • sls_ingress_inflow: The inbound bandwidth of the Ingress.
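
The following is a minimal sketch of how one of these metrics, sls_ingress_latency_p95 in this example, plugs into the external metric block of an HPA. The label values are the same placeholders as in Step 4 and must be replaced with your own values:

    - type: External
      external:
        metric:
          name: sls_ingress_latency_p95
          selector:
            matchLabels:
              sls.project: "***"                     # Replace with your SLS project.
              sls.logstore: "nginx-ingress"
              sls.ingress.route: "default-nginx-80"
        target:
          type: Value
          value: 10                                  # Scale out if the p95 latency exceeds 10 milliseconds.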

How do I configure horizontal autoscaling after I change the format of NGINX Ingress logs?

In this topic, horizontal pod autoscaling is performed based on the Ingress metrics that are collected by SLS. You must configure SLS to collect NGINX Ingress logs.

When you create an ACK cluster, SLS is enabled for the cluster by default. If you use the default log collection settings, you can view the log analysis reports and real-time status of NGINX Ingresses in the SLS console after you create the cluster.

If you disable SLS when you create an ACK cluster, you cannot perform horizontal pod autoscaling based on the Ingress metrics that are collected by SLS. You must enable SLS for the cluster before you can perform horizontal pod autoscaling. For more information, see Analyze and monitor the access log of nginx-ingress-controller.

The AliyunLogConfig that is generated the first time you enable SLS applies only to the default log format that ACK defines for the Ingress controller. If you have changed the log format, you must modify the processor_regex settings in the AliyunLogConfig. For more information, see Use the Simple Log Service console to collect container text logs in DaemonSet mode.
