
Container Service for Kubernetes:Configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller

Last Updated: Apr 17, 2025

You can deploy an application across multiple pods to improve its stability. However, keeping extra replicas running increases costs and wastes resources during off-peak hours. You can also scale the pods of your application manually, but this increases your O&M workload and cannot respond to load changes in real time. To resolve these issues, you can configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller. The pods of the applications are then automatically scaled based on actual loads, which improves application stability and resilience, optimizes resource usage, and reduces costs. This topic describes how to configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller.

Prerequisites

Before you configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller, you must deploy the required components, such as ack-alibaba-cloud-metrics-adapter, and convert Managed Service for Prometheus metrics to metrics that are supported by Horizontal Pod Autoscaler (HPA).

Feature description

An Ingress is a Kubernetes API object that forwards external requests to Services in a Kubernetes cluster. The Services route the requests to the backend pods. To automatically scale the pods of an application based on the number of requests in the production environment, you can expose a metric such as http_requests_total from the application to collect the number of requests. Alternatively, as this topic shows, you can implement horizontal pod autoscaling based on the metrics of the NGINX Ingress controller.

The NGINX Ingress controller is deployed in a Container Service for Kubernetes (ACK) cluster to manage the Ingresses in the cluster and provides high-performance, customizable traffic management. The NGINX Ingress controller provided by ACK is developed based on the open source version and offers enhanced capabilities and a simplified user experience.

In this topic, two Deployments are created, and each Deployment is exposed by its own Service. An Ingress is created to route external requests to the two Deployments based on different path matching rules. HPA is then configured to automatically scale pods based on the nginx_ingress_controller_requests metric, which reflects the traffic load. Each HPA has the selector.matchLabels.service field configured, which filters the metric by Service so that each application scales based on its own traffic.

Step 1: Create applications and Services

Use the following YAML templates to create the Deployments and Services.

  1. Create a file named nginx1.yaml and copy the following content to the file:


    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: test-app
      labels:
        app: test-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: test-app
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - image: registry-cn-hangzhou.ack.aliyuncs.com/acs/sample-app:v1-b070784-aliyun
            name: metrics-provider
            ports:
            - name: http
              containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: test-app
      namespace: default
      labels:
        app: test-app
    spec:
      ports:
        - port: 8080
          name: http
          protocol: TCP
          targetPort: 8080
      selector:
        app: test-app
      type: ClusterIP

    Run the following command to create an application named test-app and a Service that is used to expose the application:

    kubectl apply -f nginx1.yaml
  2. Create a file named nginx2.yaml and copy the following content to the file:


    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sample-app
      labels:
        app: sample-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sample-app
      template:
        metadata:
          labels:
            app: sample-app
        spec:
          containers:
          - image: registry-cn-hangzhou.ack.aliyuncs.com/acs/sample-app:v1-b070784-aliyun
            name: metrics-provider
            ports:
            - name: http
              containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: sample-app
      namespace: default
      labels:
        app: sample-app
    spec:
      ports:
        - port: 80
          name: http
          protocol: TCP
          targetPort: 8080
      selector:
        app: sample-app
      type: ClusterIP

    Run the following command to create an application named sample-app and a Service that is used to expose the application:

    kubectl apply -f nginx2.yaml

Step 2: Create an Ingress

  1. Create a file named ingress.yaml and copy the following content to the file:


    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: test-ingress
      namespace: default
    spec:
      ingressClassName: nginx
      rules:
        - host: test.example.com
          http:
            paths:
              - backend:
                  service:
                    name: sample-app
                    port:
                      number: 80
                path: /
                pathType: ImplementationSpecific
              - backend:
                  service:
                    name: test-app
                    port:
                      number: 8080
                path: /home
                pathType: ImplementationSpecific
    • host: the domain name that is used to enable external access to the backend Service. In this example, test.example.com is specified.

    • path: the URL path that is used to match requests. The requests received by the Ingress are matched against the Ingress rules and forwarded to the corresponding Service. Then, the Service routes the requests to the backend pods.

    • backend: the name and port of the Service to which the requests that match the path parameter are forwarded.

    Run the following command to deploy an Ingress:

    kubectl apply -f ingress.yaml
  2. Run the following command to query the Ingress:

    kubectl get ingress -o wide

    Expected output:

    NAME           CLASS   HOSTS              ADDRESS       PORTS   AGE                                                  
    test-ingress   nginx   test.example.com   10.10.10.10   80      55s
  3. After you deploy the preceding resources, you can send requests to the / and /home paths of the specified host. The NGINX Ingress controller routes the requests to the sample-app and test-app applications based on the URL paths of the requests. You can obtain information about the requests to each application from the nginx_ingress_controller_requests metric in Managed Service for Prometheus.
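To inspect the metric, you can run a query similar to the following in Managed Service for Prometheus. The label names below follow the defaults of the open source NGINX Ingress controller and are an assumption; verify them against the metrics reported by your cluster.

```promql
# Per-Service request rate over the last 2 minutes for the example host
# (label names assumed; check your Prometheus instance).
sum by (service) (rate(nginx_ingress_controller_requests{host="test.example.com"}[2m]))
```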

Step 3: Convert Managed Service for Prometheus metrics to metrics supported by HPA

  1. Modify the adapter.config file of the ack-alibaba-cloud-metrics-adapter component.

    Note

    Before you modify the adapter.config file, make sure that the ack-alibaba-cloud-metrics-adapter component is installed in your cluster and the prometheus.url parameter is configured.

    1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

    2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Applications > Helm.

    3. Click ack-alibaba-cloud-metrics-adapter. In the Resource section, click adapter-config. In the upper-right corner of the adapter-config page, click Edit YAML. Replace the code in the Value field with the following content, and then click OK in the lower part of the page.

      For more information about the fields in the following code block, see Implement horizontal auto scaling based on Prometheus metrics.

      rules:
      - metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))
        name:
          as: ${1}_per_second
          matches: ^(.*)_requests
        resources:
          namespaced: false  
        seriesQuery: nginx_ingress_controller_requests
  2. Run the following command to query a metric:

    kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/nginx_ingress_controller_per_second" | jq .

    Expected output:

    {
      "kind": "ExternalMetricValueList",
      "apiVersion": "external.metrics.k8s.io/v1beta1",
      "metadata": {},
      "items": [
        {
          "metricName": "nginx_ingress_controller_per_second",
          "metricLabels": {},
          "timestamp": "2025-02-28T11:34:56Z",
          "value": "0"
        }
      ]
    }
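Note that the value field in items is a Kubernetes quantity string that may carry the m (milli) suffix; for example, a later output in this topic shows 22096m, which means 22.096 requests per second. The following Python sketch is only an illustration of how such quantities map to plain numbers; it is not part of the setup:

```python
def parse_quantity(q: str) -> float:
    # Convert a Kubernetes decimal quantity to a float.
    # The "m" suffix denotes milli-units, so "22096m" means 22.096.
    if q.endswith("m"):
        return float(q[:-1]) / 1000
    return float(q)

print(parse_quantity("0"))       # 0.0
print(parse_quantity("22096m"))  # 22.096
```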
    

Step 4: Deploy HPA

  1. Create a file named hpa.yaml and copy the following content to the file:


    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: sample-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: sample-app
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: nginx_ingress_controller_per_second
              selector:
                matchLabels:
                  # You can configure this parameter to filter metrics. The value of this parameter is passed to the <<.LabelMatchers>> field in the adapter.config file.
                  service: sample-app
            # You can specify only thresholds of the Value or AverageValue type for external metrics.
            target:
              type: AverageValue
              averageValue: 30
    ---
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: test-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: test-app
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: nginx_ingress_controller_per_second
              selector:
                matchLabels:
                  # You can configure this parameter to filter metrics. The value of this parameter is passed to the <<.LabelMatchers>> field in the adapter.config file.
                  service: test-app
            # You can specify only thresholds of the Value or AverageValue type for external metrics.
            target:
              type: AverageValue
              averageValue: 30

    Run the following command to deploy HPA for the sample-app and test-app applications separately:

    kubectl apply -f hpa.yaml
  2. Run the following command to query the deployment progress of HPA:

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   0/30 (avg)   1         10        1          74s
    test-hpa     Deployment/test-app     0/30 (avg)   1         10        1          59m

Step 5: Verify the result

After you configure HPA, perform stress tests to check whether the pods of the applications are automatically scaled out when the number of requests increases.

  1. Run the following command to perform stress tests on the /home URL path of the host. Make sure that test.example.com resolves to the address of the Ingress, for example, by adding a record to the hosts file of the test machine:

    ab -c 50 -n 5000 http://test.example.com/home
  2. Run the following command to query the HPA information:

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   0/30 (avg)        1         10        1          22m
    test-hpa     Deployment/test-app     22096m/30 (avg)   1         10        3          80m
  3. Run the following command to perform stress tests on the root path of the host:

    ab -c 50 -n 5000 http://test.example.com/
  4. Run the following command to query the HPA information:

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   27778m/30 (avg)   1         10        2          38m
    test-hpa     Deployment/test-app     0/30 (avg)        1         10        1          96m

    The output shows that the pods of the applications are automatically scaled out when the number of requests exceeds the scaling threshold.
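As a reference for interpreting these numbers, HPA computes the desired replica count for an External metric with an AverageValue target roughly as the total metric value divided by the target average value, rounded up and clamped to the minReplicas and maxReplicas bounds. The following sketch uses illustrative numbers consistent with the outputs above:

```python
import math

def desired_replicas(total_metric: float, target_average: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    # External metric with an AverageValue target:
    # desired = ceil(totalMetricValue / targetAverageValue), clamped to bounds.
    desired = math.ceil(total_metric / target_average)
    return max(min_replicas, min(max_replicas, desired))

# A per-pod average of about 22.1 requests/s across 3 pods implies a total of
# about 66.3 requests/s; with a 30 requests/s per-pod target, HPA keeps 3 pods.
print(desired_replicas(66.3, 30))  # 3
print(desired_replicas(0, 30))     # 1 (clamped to minReplicas)
```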

References

  • Multi-zone load balancing is a deployment solution commonly used in high availability (HA) scenarios for data services. If an application that is deployed across zones does not have sufficient resources to handle heavy workloads, you may want ACK to create a specific number of nodes in each zone of the application. For more information, see Configure auto scaling for cross-zone deployment.

  • For more information about how to create custom images to accelerate horizontal pod autoscaling in complex scenarios, see Create custom images.