
Container Service for Kubernetes: Auto scale applications based on QPS data by using an HPA

Last Updated: Dec 31, 2025

If your application needs to dynamically adjust its computing resources based on request volume, you can use queries per second (QPS) data from an Application Load Balancer (ALB) instance to configure auto scaling for its pods.

Before you start

To learn the basics of ALB Ingresses, see Create and use an ALB Ingress to expose a service.

How it works

Queries per second (QPS) is the number of requests received per second. Application Load Balancer (ALB) instances use Simple Log Service to record client access data. A Horizontal Pod Autoscaler (HPA) reads the QPS data of a Service from these records and scales the corresponding workload, such as a Deployment or a StatefulSet.
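The relationship between access records and QPS can be sketched in Python. This is an illustration only: the timestamps are made-up sample data, the function names are hypothetical, and how the metric is actually aggregated from Simple Log Service is not described in this topic.

```python
from collections import Counter

def requests_per_second(timestamps):
    """Group request timestamps (epoch seconds) by whole second."""
    return Counter(int(ts) for ts in timestamps)

def average_qps(timestamps, window_seconds):
    """Average QPS over a window: total requests divided by window length."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return len(timestamps) / window_seconds

# Hypothetical access-record timestamps, for illustration only.
sample = [100.1, 100.5, 100.9, 101.2, 101.8, 102.4]
per_second = requests_per_second(sample)
print(per_second[100], per_second[101], per_second[102])
print(average_qps(sample, 3))  # 6 requests over a 3-second window
```

A longer window smooths short bursts, at the cost of reacting more slowly to a real change in load.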

Prerequisites

Step 1: Create an AlbConfig and associate a Simple Log Service project

  1. View the Simple Log Service project associated with the cluster.

    1. Log on to the ACK console. In the left navigation pane, click Clusters.

    2. On the Clusters page, click the name of the target cluster. In the navigation pane on the left, click Cluster Information.

    3. On the Basic Information tab, find the Simple Log Service Project resource and record the project name.

  2. Create an AlbConfig.

    1. Create a file named alb-qps.yaml, copy the following content to the file, and then specify the details of the Simple Log Service project in the accessLogConfig field.

      apiVersion: alibabacloud.com/v1
      kind: AlbConfig
      metadata:
        name: alb-qps
      spec:
        config:
          name: alb-qps
          addressType: Internet
          zoneMappings:
          - vSwitchId: vsw-uf6ccg2a9g71hx8go**** # The ID of the vSwitch.
          - vSwitchId: vsw-uf6nun9tql5t8nh15****
          accessLogConfig:
            logProject: <LOG_PROJECT> # The name of the Simple Log Service project associated with the cluster.
            logStore: <LOG_STORE> # The name of the custom Logstore. The name must start with "alb_".
        listeners:
          - port: 80
            protocol: HTTP

      The following table describes the fields.

      | Field | Type | Description |
      | --- | --- | --- |
      | logProject | string | The name of the Simple Log Service project. Default value: "". |
      | logStore | string | The name of the Simple Log Service Logstore, which must start with alb_. The Logstore is automatically created if it does not exist. For more information, see Enable Simple Log Service to collect access logs. Default value: "alb_****". |

    2. Run the following command to create the AlbConfig.

      kubectl apply -f alb-qps.yaml

      Expected output:

      albconfig.alibabacloud.com/alb-qps created
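Before you apply the AlbConfig, you can check the Logstore name against the one naming rule stated in this step. The following Python sketch assumes that the alb_ prefix is the only rule to check. The function name is hypothetical, and Simple Log Service may enforce other naming constraints that are not covered here.

```python
def has_alb_prefix(logstore: str) -> bool:
    """Return True if the Logstore name starts with the required alb_ prefix."""
    return logstore.startswith("alb_")

# Example names are placeholders, not real Logstores.
print(has_alb_prefix("alb_qps_demo"))
print(has_alb_prefix("qps_demo"))
```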

Step 2: Create sample resources

In addition to an AlbConfig, an ALB Ingress requires a Deployment, a Service, an IngressClass, and an Ingress to function. Follow these steps to quickly create these resources.

  1. Create a file named qps-quickstart.yaml that contains the following content.

    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: qps-ingressclass
    spec:
      controller: ingress.k8s.alibabacloud/alb
      parameters:
        apiGroup: alibabacloud.com
        kind: AlbConfig
        name: alb-qps # The name of the AlbConfig.
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: qps-ingress
    spec:
      ingressClassName: qps-ingressclass # The name of the IngressClass.
      rules:
        - host: demo.alb.ingress.top # Replace this with your domain name.
          http:
            paths:
              - path: /qps
                pathType: Prefix
                backend:
                  service:
                    name: qps-svc
                    port:
                      number: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: qps-svc
      namespace: default
    spec:
      ports:
        - port: 80
          protocol: TCP
          targetPort: 80
      selector:
        app: qps-deploy
      type: NodePort
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: qps-deploy
      labels:
        app: qps-deploy
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: qps-deploy
      template:
        metadata:
          labels:
            app: qps-deploy
        spec:
          containers:
          - name: qps-container
            image: nginx:1.7.9
            ports:
            - containerPort: 80

  2. Run the following command to create the sample resources.

    kubectl apply -f qps-quickstart.yaml
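The HPA in Step 3 selects this Service through the sls.ingress.route label, which uses the format <namespace>-<svc>-<port>. The following Python sketch builds that value from the Service fields defined above. The function name is hypothetical and is used only for illustration.

```python
def ingress_route_label(namespace: str, service: str, port: int) -> str:
    """Join namespace, Service name, and Service port as <namespace>-<svc>-<port>."""
    return f"{namespace}-{service}-{port}"

# Values taken from the qps-svc Service in qps-quickstart.yaml.
route = ingress_route_label("default", "qps-svc", 80)
print(route)
```

If you deploy the Service in another namespace or expose a different port, the label value in the HPA must change accordingly.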

Step 3: Create an HPA

  1. Create a file named qps-hpa.yaml with the following content.

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: qps-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: qps-deploy # The name of the workload that the HPA controls.
      minReplicas: 2 # The minimum number of pods.
      maxReplicas: 10 # The maximum number of pods.
      metrics:
        - type: External   # Use external metrics (non-native Kubernetes metrics).
          external:
            metric:
              name: sls_alb_ingress_qps # The name of the metric (QPS of Alibaba Cloud ALB Ingress). Do not modify this value.
              selector:
                matchLabels:
                  sls.project: <LOG_PROJECT> # The name of the Simple Log Service project. Replace this with the actual project name.
                  sls.logstore: <LOG_STORE> # The name of the Logstore. Replace this with the actual Logstore name.
                  sls.ingress.route: default-qps-svc-80 # The route of the Service. The format is <namespace>-<svc>-<port>.
            target:
              type: AverageValue  # The target metric type (average value).
              averageValue: "2"     # The target value of the metric. In this example, the target average QPS per pod is 2.

    The following table describes the fields.

    | Field | Description |
    | --- | --- |
    | scaleTargetRef | The workload that the HPA scales. In this example, this is the Deployment named qps-deploy that was created in Step 2. |
    | minReplicas | The minimum number of pods to which the Deployment can be scaled in. This value must be an integer greater than or equal to 1. |
    | maxReplicas | The maximum number of pods to which the Deployment can be scaled out. This value must be greater than the minimum number of replicas. |
    | external.metric.name | The metric that is based on QPS data and is used by the HPA. Do not modify this value. |
    | sls.project | The Simple Log Service project on which the metric is based. The value must be the same as that specified in the AlbConfig. |
    | sls.logstore | The Logstore on which the metric is based. The value must be the same as that specified in the AlbConfig. |
    | sls.ingress.route | The route of the Service, in the format <namespace>-<svc>-<port>. In this example, the route identifies the qps-svc Service that was created in Step 2. |
    | external.target | The target value of the metric. In this example, the target average QPS per pod is 2. The HPA adjusts the number of pods to keep the average QPS per pod as close to the target value as possible. |

  2. Run the following command to create the HPA.

    kubectl apply -f qps-hpa.yaml

  3. Run the following command to view the deployment status of the HPA.

    kubectl get hpa

    Expected output:

    NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          5m41s

  4. Run the following command to view the configuration details of the HPA.

    kubectl describe hpa qps-hpa

    Expected output:

    Name:                                            qps-hpa
    Namespace:                                       default
    Labels:                                          <none>
    Annotations:                                     <none>
    CreationTimestamp:                               ******** # The timestamp of the HPA. You can ignore this parameter.
    Reference:                                       Deployment/qps-deploy
    Metrics:                                         ( current / target )
      "sls_alb_ingress_qps" (target average value):  0 / 2
    Min replicas:                                    2
    Max replicas:                                    10
    Deployment pods:                                 2 current / 2 desired
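To see how an AverageValue target translates into a pod count, the following Python sketch divides the total QPS by the target average per pod, rounds up, and clamps the result to the minReplicas and maxReplicas values from qps-hpa.yaml. It is a simplified model with a hypothetical function name. The real HPA controller also applies a tolerance and scaling stabilization, which this sketch ignores.

```python
import math

def desired_replicas(total_qps, target_avg, min_replicas, max_replicas):
    """Simplified AverageValue calculation: ceil(total / target), then clamp."""
    if target_avg <= 0:
        raise ValueError("target_avg must be positive")
    raw = math.ceil(total_qps / target_avg)
    return max(min_replicas, min(max_replicas, raw))

# minReplicas=2, maxReplicas=10, averageValue=2 as in qps-hpa.yaml.
print(desired_replicas(0, 2, 2, 10))    # no traffic: stays at the minimum
print(desired_replicas(7, 2, 2, 10))    # 7 QPS in total needs ceil(3.5) pods
print(desired_replicas(500, 2, 2, 10))  # heavy traffic: capped at the maximum
```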

(Optional) Step 4: Verify the results

  1. Verify that the application is scaled out.

    1. Run the following command to view information about the Ingress.

      kubectl get ingress

      Expected output:

      NAME            CLASS                HOSTS                  ADDRESS                         PORTS     AGE
      qps-ingress     qps-ingressclass     demo.alb.ingress.top   alb-********.alb.aliyuncs.com   80        10m31s

      Record the values of HOSTS and ADDRESS for subsequent steps.

    2. Run the following command to perform a stress test on the application.

      Replace demo.alb.ingress.top and alb-********.alb.aliyuncs.com with the values that you recorded in the previous step.

      ab -r -c 5 -n 10000 -H Host:demo.alb.ingress.top http://alb-********.alb.aliyuncs.com/qps

    3. Run the following command to view the auto scaling status of the application.

      kubectl get hpa

      Expected output:

      NAME      REFERENCE               TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
      qps-hpa   Deployment/qps-deploy   14375m/2 (avg)   2         10        10         15m

      The output shows that the value of REPLICAS is 10. This indicates that as the QPS increases, the number of application pods scales out to 10, which is the value of MAXPODS.

  2. Verify that the application is scaled in.

    After the stress test is complete, run the following command to view the auto scaling status of the application.

    kubectl get hpa

    Expected output:

    NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          28m

    The output shows that the value of REPLICAS is 2. This indicates that after the QPS drops to 0, the number of application pods scales in to 2, which is the value of MINPODS.
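In the scale-out output, the TARGETS value 14375m uses the Kubernetes quantity notation, in which the m suffix means thousandths. The value is therefore an average of 14.375 QPS per pod. The following Python sketch parses that notation and shows why the replica count stays at the maximum. It handles only plain numbers and the m suffix, and the function name is hypothetical.

```python
import math
from decimal import Decimal

def parse_quantity(text: str) -> Decimal:
    """Parse a plain number or a milli-suffixed value such as "14375m"."""
    text = text.strip()
    if text.endswith("m"):
        return Decimal(text[:-1]) / Decimal(1000)
    return Decimal(text)

current_avg = parse_quantity("14375m")  # average QPS per pod from TARGETS
target_avg = parse_quantity("2")        # averageValue from qps-hpa.yaml
replicas = 10                           # REPLICAS column in the output

total_qps = current_avg * replicas
uncapped = math.ceil(total_qps / target_avg)
print(current_avg, total_qps, uncapped)
```

Because the uncapped result is far above maxReplicas, the HPA keeps the Deployment at 10 pods until the load drops.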
