Container Service for Kubernetes: Configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller

Last Updated: Mar 15, 2024

You can deploy an application in multiple pods to improve its stability. However, this method increases costs and wastes resources during off-peak hours. You can also manually scale the pods of your application. However, this method increases your O&M workload and cannot scale pods in real time. To resolve these issues, you can configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller. This way, the pods of the applications are automatically scaled based on their loads, which improves the stability and resilience of the applications, optimizes resource usage, and reduces costs. This topic describes how to configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller.

Prerequisites

To configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller, you must convert Managed Service for Prometheus metrics to metrics that are supported by the Horizontal Pod Autoscaler (HPA) and deploy the required components.
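
For example, assuming that the component runs in the kube-system namespace (where ACK typically installs managed components), you can check whether the metrics adapter is deployed by running a command similar to the following:

  kubectl get pods -n kube-system | grep alibaba-cloud-metrics-adapter   # namespace assumed; adjust if the adapter is installed elsewhere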

Background information

In a production environment, you typically want to scale the pods of an application based on the number of requests that the application receives, which you can collect by using the http_requests_total metric. For this scenario, we recommend that you configure horizontal pod autoscaling based on the metrics of the NGINX Ingress controller.

An Ingress is a Kubernetes API object that forwards client requests to Services based on the hosts and URL paths of the requests. The Services then route the requests to the backend pods.

The NGINX Ingress controller is deployed in a Container Service for Kubernetes (ACK) cluster to control the Ingresses in the cluster and provides high-performance, customizable traffic management. The NGINX Ingress controller provided by ACK is developed based on the open source version and is integrated with various features of Alibaba Cloud services to provide a simplified user experience.

Procedure

In this example, two ClusterIP Services are created to receive the external requests that the NGINX Ingress controller forwards, and the HPA is configured to automatically scale the pods based on the nginx_ingress_controller_requests metric, which reflects the traffic load.
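
The nginx_ingress_controller_requests metric is a counter that increases with each request handled by the NGINX Ingress controller and carries labels such as the backend Service name. As an illustration, the per-second request rate of each Service corresponds to a Prometheus query similar to the following:

  sum(rate(nginx_ingress_controller_requests[2m])) by (service)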

  1. Use the following YAML template to create a Deployment and a Service.

    1. Create a file named nginx1.yaml based on the following content. Then, run the kubectl apply -f nginx1.yaml command to create an application named test-app and a Service also named test-app.

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: test-app
        labels:
          app: test-app
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: test-app
        template:
          metadata:
            labels:
              app: test-app
          spec:
            containers:
            - image: skto/sample-app:v2
              name: metrics-provider
              ports:
              - name: http
                containerPort: 8080
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: test-app
        namespace: default
        labels:
          app: test-app
      spec:
        ports:
          - port: 8080
            name: http
            protocol: TCP
            targetPort: 8080
        selector:
          app: test-app
        type: ClusterIP
    2. Create a file named nginx2.yaml based on the following content. Then, run the kubectl apply -f nginx2.yaml command to create an application named sample-app and a Service also named sample-app.

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: sample-app
        labels:
          app: sample-app
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: sample-app
        template:
          metadata:
            labels:
              app: sample-app
          spec:
            containers:
            - image: skto/sample-app:v2
              name: metrics-provider
              ports:
              - name: http
                containerPort: 8080
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: sample-app
        namespace: default
        labels:
          app: sample-app
      spec:
        ports:
          - port: 80
            name: http
            protocol: TCP
            targetPort: 8080
        selector:
          app: sample-app
        type: ClusterIP
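
    After both applications are deployed, you can verify that the Deployments and Services exist. The following check is a minimal example that relies on the app labels defined in the preceding templates:

    kubectl get deployments,services -l 'app in (test-app, sample-app)'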
  2. Create a file named ingress.yaml based on the following content. Then, run the kubectl apply -f ingress.yaml command to create an Ingress.

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: test-ingress
      namespace: default
    spec:
      ingressClassName: nginx
      rules:
        - host: test.example.com
          http:
            paths:
              - backend:
                  service:
                    name: sample-app
                    port:
                      number: 80
                path: /
                pathType: ImplementationSpecific
              - backend:
                  service:
                    name: test-app
                    port:
                      number: 8080
                path: /home
                pathType: ImplementationSpecific
    • host: the domain name that is used to enable external access to the backend Service. In this example, test.example.com is used.

    • path: the URL paths that are used to match requests. The requests received by the Ingress are matched against the Ingress rules and forwarded to the corresponding Service. Then, the Service routes the requests to the backend pods.

    • backend: the name and port of the Service to which the requests that match the path parameter are forwarded.
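
    • test.example.com is a sample domain. If it does not resolve in your environment, you can map it to the Ingress address (10.10.10.10 in the sample output of the next step) on the client that you use for testing. For example, on a Linux client:

      echo "10.10.10.10 test.example.com" | sudo tee -a /etc/hosts   # use your actual Ingress address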

  3. Run the following command to query the Ingress:

    kubectl get ingress -o wide

    Expected output:

    NAME           CLASS   HOSTS              ADDRESS       PORTS   AGE                                                  
    test-ingress   nginx   test.example.com   10.10.10.10   80      55s
  4. After you deploy the preceding resource objects, you can send requests to the / and /home URL paths to access the specified host. The NGINX Ingress controller automatically routes your requests to the test-app and sample-app applications based on the URL paths of the requests. You can obtain information about the requests to each application from the Managed Service for Prometheus metric nginx_ingress_controller_requests.
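
    For example, you can send requests similar to the following. The requests pass the Host header explicitly and use the sample Ingress address from the previous step:

    curl -H "Host: test.example.com" http://10.10.10.10/
    curl -H "Host: test.example.com" http://10.10.10.10/home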

  5. Modify the adapter.config file of the alibaba-cloud-metrics-adapter component to convert the Prometheus metric to a metric that is supported by the HPA.

    Note

    Before you modify the adapter.config file, make sure that the alibaba-cloud-metrics-adapter component is installed in your cluster and that the prometheus.url parameter is configured.
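
    To check the current configuration from the command line, you can also inspect the ConfigMap directly. The following command assumes that the component is installed in the kube-system namespace:

    kubectl get configmap adapter-config -n kube-system -o yaml   # namespace assumed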

    1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

    2. On the Clusters page, click the name of the cluster that you want to manage and choose Applications > Helm in the left-side navigation pane.

    3. Click ack-alibaba-cloud-metrics-adapter.

    4. In the Resource section, click adapter-config.

    5. On the adapter-config page, click Edit YAML in the upper-right corner.

    6. Replace the code in Value with the following content, and then click OK in the lower part of the page.

      For more information about how to configure the ConfigMap, see Horizontal pod scaling based on Managed Service for Prometheus metrics.

      rules:
      - metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))
        name:
          as: ${1}_per_second
          matches: ^(.*)_requests
        resources:
          namespaced: false
          overrides:
            controller_namespace:
              resource: namespace
        seriesQuery: nginx_ingress_controller_requests
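
      With this rule, the adapter discovers the nginx_ingress_controller_requests series, exposes it as nginx_ingress_controller_per_second, and computes its per-second rate over a 2-minute window. For an HPA whose selector matches service: sample-app, the label matchers are substituted into <<.LabelMatchers>>, so the adapter evaluates a Prometheus query similar to the following:

      sum(rate(nginx_ingress_controller_requests{service="sample-app"}[2m]))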
  6. Run the following command to query a metric:

    kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/nginx_ingress_controller_per_second" | jq .

    Expected output:

    {
      "kind": "ExternalMetricValueList",
      "apiVersion": "external.metrics.k8s.io/v1beta1",
      "metadata": {},
      "items": [
        {
          "metricName": "nginx_ingress_controller_per_second",
          "metricLabels": {},
          "timestamp": "2022-03-31T10:11:37Z",
          "value": "0"
        }
      ]
    }
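
    You can also filter the metric by label. For example, the following command queries only the values that match the sample-app Service; the labelSelector parameter is passed to the <<.LabelMatchers>> field in the adapter.config file:

    kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/nginx_ingress_controller_per_second?labelSelector=service%3Dsample-app" | jq .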
  7. Create a file named hpa.yaml based on the following content. Then, run the kubectl apply -f hpa.yaml command to configure the HPA for both the sample-app and test-app applications.

    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: sample-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: sample-app
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: nginx_ingress_controller_per_second
              selector:
                matchLabels:
                  # You can configure this parameter to filter metrics. The value of this parameter is passed to the <<.LabelMatchers>> field in the adapter.config file.
                  service: sample-app
            # You can specify only thresholds of the Value or AverageValue type for external metrics.
            target:
              type: AverageValue
              averageValue: 30
    ---
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: test-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: test-app
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: nginx_ingress_controller_per_second
              selector:
                matchLabels:
                  # You can configure this parameter to filter metrics. The value of this parameter is passed to the <<.LabelMatchers>> field in the adapter.config file.
                  service: test-app
            # You can specify only thresholds of the Value or AverageValue type for external metrics.
            target:
              type: AverageValue
              averageValue: 30
  8. Run the following command to query the HPA information:

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   0/30 (avg)   1         10        1          74s
    test-hpa     Deployment/test-app     0/30 (avg)   1         10        1          59m
  9. After you configure the HPA, perform stress tests to check whether the pods of the applications are automatically scaled out when the number of requests increases.
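
    The following tests use the ab benchmarking tool, which is part of the Apache HTTP server utilities. If the tool is not installed on your client, you can typically install it by running a command similar to the following (the package name applies to Debian- and Ubuntu-based systems):

    sudo apt-get install -y apache2-utils   # provides the ab tool on Debian/Ubuntu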

    1. Run the following command to perform stress tests on the /home URL path of the host:

      ab -c 50 -n 5000 http://test.example.com/home
    2. Run the following command to query the HPA information:

      kubectl get hpa

      Expected output:

      NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
      sample-hpa   Deployment/sample-app   0/30 (avg)        1         10        1          22m
      test-hpa     Deployment/test-app     22096m/30 (avg)   1         10        3          80m
    3. Run the following command to perform stress tests on the root path of the host:

      ab -c 50 -n 5000 http://test.example.com/
    4. Run the following command to query the HPA information:

      kubectl get hpa

      Expected output:

      NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
      sample-hpa   Deployment/sample-app   27778m/30 (avg)   1         10        2          38m
      test-hpa     Deployment/test-app     0/30 (avg)        1         10        1          96m

      The output shows that the pods of the applications are automatically scaled out when the number of requests exceeds the scaling threshold.
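
    To observe the scaling activity in real time while the stress tests run, you can watch the HPA status in a separate terminal:

    kubectl get hpa -w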

References

  • Multi-zone load balancing is a deployment solution commonly used in high availability scenarios for data services. If an application that is deployed across zones does not have sufficient resources to handle heavy workloads, you may want ACK to create a specific number of nodes in each zone of the application. For more information, see Configure auto scaling for cross-zone deployment.

  • For more information about how to create custom images to accelerate horizontal pod autoscaling in complex scenarios, see Create custom images.