Application Load Balancer (ALB) Ingresses support automatic application scaling based on the queries per second (QPS) values collected by ALB instances. Scaling on QPS helps keep the application stable under load while controlling resource costs. This topic describes how to use ALB Ingresses to enable automatic application scaling based on QPS.

Prerequisites

Procedure

  1. Create an application and a Service.
  2. Create an ALB Ingress.
  3. Create a Horizontal Pod Autoscaler (HPA).
  4. Verify that the application can be automatically scaled based on the QPS value.

Step 1: Create an application and a Service

  1. Create a file named tea.yaml and copy the following content to the file:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-basic
      labels:
        app: tea
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: tea
      template:
        metadata:
          labels:
            app: tea
        spec:
          containers:
          - name: tea
            image: nginx:1.7.9
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: tea-svc
      namespace: default
    spec:
      ports:
        - port: 80
          protocol: TCP
          targetPort: 80
      selector:
        app: tea
      type: NodePort
  2. Run the following command to create an application and a Service:
    kubectl apply -f tea.yaml

Step 2: Create an ALB Ingress

  1. Create an AlbConfig object.
    1. Create a file named alb-test.yaml and copy the following content to the file:
      apiVersion: alibabacloud.com/v1
      kind: AlbConfig
      metadata:
        name: alb-demo
      spec:
        config:
          name: alb-test
          addressType: Internet
          zoneMappings:
          - vSwitchId: vsw-uf6ccg2a9g71hx8go****
          - vSwitchId: vsw-uf6nun9tql5t8nh15****
          accessLogConfig:
            logProject: "****"
            logStore: "alb_****"
      • zoneMappings: Specify at least two vSwitch IDs for the ALB Ingress. The vSwitches that you specify must be deployed in different zones of the VPC where your cluster resides.
      • logProject: Specify the name of the Log Service project that you created.
      • logStore: Specify the name of the Logstore that you want to use. The value of logStore must start with alb_. If the Logstore that you specify does not exist, the system automatically creates one with the name that you specified.
    2. Run the following command to create an AlbConfig object:
      kubectl apply -f alb-test.yaml
  2. Create an IngressClass.
    1. Create a file named alb.yaml and copy the following content to the file:
      apiVersion: networking.k8s.io/v1
      kind: IngressClass
      metadata:
        name: alb
      spec:
        controller: ingress.k8s.alibabacloud/alb
        parameters:
          apiGroup: alibabacloud.com
          kind: AlbConfig
          name: alb-demo
    2. Run the following command to create an IngressClass:
      kubectl apply -f alb.yaml
  3. Create an ALB Ingress.
    1. Create a file named tea-ingress.yaml and copy the following content to the file:
      apiVersion: networking.k8s.io/v1
      kind: Ingress
      metadata:
        name: tea-ingress
      spec:
        ingressClassName: alb
        rules:
        - host: demo.ingress.top
          http:
            paths:
            - path: /tea
              pathType: Prefix
              backend:
                service:
                  name: tea-svc
                  port:
                    number: 80
    2. Run the following command to create an ALB Ingress:
      kubectl apply -f tea-ingress.yaml
  4. Run the following command to query the ADDRESS parameter of the ALB Ingress:
    kubectl get ingress

    Expected output:

    NAME                    CLASS   HOSTS                     ADDRESS                                              PORTS   AGE
    tea-ingress             alb     demo.ingress.top          alb-110zvs5nhsvfv*****.cn-chengdu.alb.aliyuncs.com   80      7m5s
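
The AlbConfig constraints described in this step (at least two vSwitch IDs, and a logStore name that starts with alb_) can be checked before the file is applied. The following is a minimal sketch under stated assumptions: the function name check_alb_config and the sample IDs are hypothetical and are not part of any ALB or Kubernetes tooling, and the check cannot confirm that the vSwitches are in different zones.

```python
def check_alb_config(vswitch_ids, log_store):
    """Return a list of problems found; an empty list means the checks pass.

    Hypothetical helper: it only mirrors the two rules stated in this topic.
    """
    problems = []
    # The topic asks for at least two vSwitch IDs (zones are not verified here).
    if len(set(vswitch_ids)) < 2:
        problems.append("zoneMappings needs at least two distinct vSwitch IDs")
    # The topic requires the Logstore name to start with "alb_".
    if not log_store.startswith("alb_"):
        problems.append("logStore must start with 'alb_'")
    return problems


# Placeholder values for illustration only.
print(check_alb_config(["vsw-a", "vsw-b"], "alb_demo"))
print(check_alb_config(["vsw-a"], "demo"))
```

The first call reports no problems, and the second call reports both.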

Step 3: Create an HPA

  1. Create a file named hpa.yaml and copy the following content to the file:
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: ingress-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx-deployment-basic
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: sls_alb_ingress_qps
              # The sls_alb_ingress_qps metric is used to configure auto scaling based on QPS values. 
              selector:
                matchLabels:
                  sls.project: "****"    # Replace the value of sls.project with the Log Service project that you want to use. 
                  sls.logstore: "alb_****"     # Replace the value with the Logstore that you want to use. 
                  sls.ingress.route: "default-tea-svc-80"
                  # Specify the value of the sls.ingress.route parameter in the <namespace>-<svc>-<port> format. Example: default-nginx-80. 
            target:
              type: AverageValue
              # The type parameter is set to AverageValue, which indicates that the average QPS value of each pod is used to determine whether to perform scaling activities. 
              averageValue: 2
  2. Run the following command to create an HPA:
    kubectl apply -f hpa.yaml
  3. Run the following command to query information about the HPA:
    kubectl get hpa

    Expected output:

    NAME          REFERENCE                           TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    ingress-hpa   Deployment/nginx-deployment-basic   0/2 (avg)   2         10        2          4h34m
  4. Run the following command to query detailed information about the HPA:
    kubectl describe hpa ingress-hpa

    Expected output:

    Name:                                            ingress-hpa
    Namespace:                                       default
    Labels:                                          <none>
    Annotations:                                     <none>
    CreationTimestamp:                               Tue, 31 Jan 2023 11:35:01 +0800
    Reference:                                       Deployment/nginx-deployment-basic
    Metrics:                                         ( current / target )
    "sls_alb_ingress_qps" (target average value):    0 / 2
    Min replicas:                                    2
    Max replicas:                                    10
    Deployment pods:                                 2 current / 2 desired
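
The sls.ingress.route label in hpa.yaml follows the <namespace>-<svc>-<port> format. The following short sketch shows how that value is composed; the function name ingress_route is a hypothetical helper used for illustration only.

```python
def ingress_route(namespace, service, port):
    """Build the sls.ingress.route value in <namespace>-<svc>-<port> format."""
    return f"{namespace}-{service}-{port}"


# Values taken from tea.yaml in this topic.
print(ingress_route("default", "tea-svc", 80))  # default-tea-svc-80
```

If the Service is renamed or moved to another namespace, the label in hpa.yaml needs to change to match.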

Step 4: Verify that the application can be automatically scaled based on the QPS value

  1. Verify that the application can be automatically scaled out based on the QPS value.
    1. Run the following command to run a stress test against the application:
      ab -c 5 -n 5000 -H Host:demo.ingress.top http://alb-110zvs5nhsvfv*****.cn-chengdu.alb.aliyuncs.com/tea
    2. After the stress test is complete, run the following command to check whether the application is scaled out:
      kubectl get hpa

      Expected output:

      NAME          REFERENCE                           TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
      ingress-hpa   Deployment/nginx-deployment-basic   12500m/2 (avg)   2         10        10         15m

      The REPLICAS value is 10, which indicates that the application was scaled out to the maximum of 10 pods as the QPS value increased.

  2. Verify that the application can be automatically scaled in based on the QPS value.

    After the stress test is complete, the QPS value drops to 0, which is below the target value. The HPA then automatically scales in the application. Scale-in is not immediate, because by default the HPA waits for a stabilization window before it removes pods.

    Run the following command to check whether the application is scaled in:

    kubectl get hpa

    Expected output:

    NAME          REFERENCE                           TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
    ingress-hpa   Deployment/nginx-deployment-basic   0/2 (avg)   2         10        2          60m

    The REPLICAS value is 2, which indicates that the application was scaled in to the minimum of 2 pods as the QPS value decreased.
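
The replica counts in both outputs can be reproduced from the general HPA rule, desired = ceil(current replicas x current average / target average), clamped to the minReplicas and maxReplicas values. The following is a simplified sketch, not the controller's implementation: it ignores the HPA tolerance and stabilization behavior, and it assumes that the 12500m average was observed while 10 replicas were running. The function names are hypothetical.

```python
import math


def parse_quantity(text):
    """Convert an HPA quantity such as '12500m' or '2' to a float.

    The 'm' suffix means milli-units, so '12500m' is 12.5.
    """
    if text.endswith("m"):
        return int(text[:-1]) / 1000
    return float(text)


def desired_replicas(current_replicas, current_avg, target_avg,
                     min_replicas, max_replicas):
    """Simplified HPA calculation for an AverageValue target."""
    raw = math.ceil(current_replicas * current_avg / target_avg)
    # Clamp the result to the configured replica bounds.
    return max(min_replicas, min(max_replicas, raw))


avg_qps = parse_quantity("12500m")               # 12.5 QPS per pod
print(desired_replicas(10, avg_qps, 2, 2, 10))   # 10 (63 before clamping)
print(desired_replicas(10, 0.0, 2, 2, 10))       # 2
```

Under these assumptions, the unclamped result during the stress test is far above maxReplicas, which is why the HPA stays at 10 pods, and a QPS value of 0 brings the result back to minReplicas.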