
Container Service for Kubernetes:Horizontally scale multiple applications based on Nginx Ingress traffic metrics

Last Updated:Sep 09, 2025

Deploying multiple replicas improves application stability but can leave resources idle and increase cluster costs, while manual scaling is labor-intensive and often lags behind traffic changes. Instead, you can use Nginx Ingress traffic metrics to implement Horizontal Pod Autoscaler (HPA) for multiple applications. HPA dynamically adjusts the number of pod replicas based on the workload, which keeps applications stable and responsive while optimizing resource utilization and reducing costs. This topic describes how to use Nginx Ingress to implement HPA for multiple applications.

An Ingress forwards external requests to a Service in the cluster. The Service then sends the requests to a pod. In a production environment, you can configure automatic scaling based on request volume. This volume is exposed by the nginx_ingress_controller_requests metric. You can use this built-in metric from the Nginx Ingress Controller to implement HPA. The Nginx Ingress Controller in ACK clusters is an enhanced version of the community edition and is easier to use.

Preparations

Before you use Nginx Ingress to implement HPA for multiple applications, complete the following preparations so that Alibaba Cloud Prometheus metrics can be converted into HPA-compatible metrics.

  • Deploy the Alibaba Cloud Prometheus monitoring component. For more information, see Use Alibaba Cloud Prometheus for monitoring.

  • Deploy the ack-alibaba-cloud-metrics-adapter component and configure its prometheus.url field.

    To configure the prometheus.url field, perform the following steps:

    1. Log on to the ACK console. In the navigation pane on the left, click Clusters.

    2. On the Clusters page, find the cluster you want and click its name. In the left-side pane, choose Applications > Helm.

    3. On the Helm page, find ack-alibaba-cloud-metrics-adapter and click Update in the Actions column.

    4. In the Update Release panel, set the alibabaCloudMetricsAdapter.prometheus.url field to the Prometheus data request URL that you obtained. Then, click OK.

      For more information, see How to retrieve the Prometheus data request URL.
      For more information, see Detailed description of the ack-alibaba-cloud-metrics-adapter component configuration file.
  • Install the Apache Benchmark stress testing tool.

    Sample installation commands:

    • macOS: Use Homebrew to install.

      brew install httpd
    • Windows: Download the Windows build of Apache from Apache Lounge and extract it. Open a command prompt, use the cd command to navigate to the extracted bin folder, and then run ab.exe.

    • Ubuntu or Debian:

      sudo apt update
      sudo apt install apache2-utils
    • CentOS 8 or RHEL:

      sudo yum install httpd-tools

    After the installation is complete, run ab -V to verify the installation.

In this tutorial, you will create two Deployments and their corresponding Services, and configure an Ingress with different access paths to route external traffic. Then, you will configure an HPA for each application based on the nginx_ingress_controller_requests metric, using the selector.matchLabels.service field to filter the metric so that each application's pods scale in and out in response to its own traffic.

Step 1: Create applications and services

Use the following YAML files to create the application Deployments and their corresponding Services.

  1. Create a file named nginx1.yaml and copy the following content into it.


    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: test-app
      namespace: default
      labels:
        app: test-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: test-app
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - image: registry-cn-hangzhou.ack.aliyuncs.com/acs/sample-app:v1-b070784-aliyun
            name: metrics-provider
            ports:
            - name: http
              containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: test-app
      namespace: default
      labels:
        app: test-app
    spec:
      ports:
        - port: 8080
          name: http
          protocol: TCP
          targetPort: 8080
      selector:
        app: test-app
      type: ClusterIP

    Run the following command to create the test-app application and its corresponding Service.

    kubectl apply -f nginx1.yaml
  2. Create a file named nginx2.yaml and copy the following content into it.


    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sample-app
      namespace: default
      labels:
        app: sample-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sample-app
      template:
        metadata:
          labels:
            app: sample-app
        spec:
          containers:
          - image: registry-cn-hangzhou.ack.aliyuncs.com/acs/sample-app:v1-b070784-aliyun
            name: metrics-provider
            ports:
            - name: http
              containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: sample-app
      namespace: default
      labels:
        app: sample-app
    spec:
      ports:
        - port: 80
          name: http
          protocol: TCP
          targetPort: 8080
      selector:
        app: sample-app
      type: ClusterIP

    Run the following command to create the sample-app application and its corresponding Service.

    kubectl apply -f nginx2.yaml

Step 2: Create an Ingress

  1. Create a file named ingress.yaml and copy the following content into it.


    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: test-ingress
      namespace: default
    spec:
      ingressClassName: nginx
      rules:
        - host: test.example.com
          http:
            paths:
              - backend:
                  service:
                    name: sample-app
                    port:
                      number: 80
                path: /
                pathType: ImplementationSpecific
              - backend:
                  service:
                    name: test-app
                    port:
                      number: 8080
                path: /home
                pathType: ImplementationSpecific
    • host: The domain name for accessing the Service. This example uses test.example.com.

    • path: The URL path for access. When a request arrives, it is matched with the corresponding Service based on the routing rule. The request is then sent to the corresponding pod through the Service.

    • backend: Consists of a Service name and a Service port. It specifies the Service to which the current path forwards requests.

    Run the following command to deploy the Ingress resource.

    kubectl apply -f ingress.yaml
  2. Run the following command to retrieve the Ingress resource.

    kubectl get ingress -o wide

    Expected output:

    NAME           CLASS   HOSTS              ADDRESS       PORTS   AGE                                                  
    test-ingress   nginx   test.example.com   10.XX.XX.10   80      55s

    After the deployment is successful, you can access the host through the / and /home paths. The Nginx Ingress Controller automatically routes requests to the sample-app and test-app applications based on the request paths. You can query the nginx_ingress_controller_requests metric in Alibaba Cloud Prometheus to retrieve information about the requests to each application.
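The routing behavior of the two path rules can be sketched in a few lines. The following Python snippet is only an illustration of longest-prefix matching for these two rules; actual matching with pathType: ImplementationSpecific depends on the Nginx Ingress Controller configuration, and the route helper is hypothetical.

```python
# Simplified illustration of the two routing rules: the longest matching
# path prefix selects the backend Service. This sketch only shows the
# intended routing outcome, not the controller's real matching logic.

ROUTES = {
    "/": ("sample-app", 80),      # root path goes to sample-app
    "/home": ("test-app", 8080),  # requests under /home go to test-app
}

def route(path: str) -> tuple:
    """Return (service, port) for the longest matching prefix."""
    prefix = max((p for p in ROUTES if path.startswith(p)), key=len)
    return ROUTES[prefix]

print(route("/home"))  # ('test-app', 8080)
print(route("/"))      # ('sample-app', 80)
```

Because both rules share one host, only the path decides which Service, and therefore which HPA-managed Deployment, receives the traffic.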

Step 3: Convert Prometheus metrics to HPA-compatible metrics

Modify the adapter-config file

  1. Log on to the ACK console. In the navigation pane on the left, click Clusters.

  2. On the Clusters page, find the cluster you want and click its name. In the left-side navigation pane, choose Applications > Helm.

  3. On the Helm page, click ack-alibaba-cloud-metrics-adapter. In the Resources section, click adapter-config, and then click Edit YAML in the upper-right corner of the page.

  4. Replace the values of the corresponding fields with the values in the following code. Then, click OK at the bottom of the page.

    For more information, see Horizontal pod autoscaling based on Alibaba Cloud Prometheus metrics.
    rules:
    - metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))
      name:
        as: ${1}_per_second
        matches: ^(.*)_requests
      resources:
        namespaced: false  
      seriesQuery: nginx_ingress_controller_requests

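The rule above turns the Prometheus series into an HPA-compatible external metric in two steps: the name rule renames the series, and the metricsQuery template renders the PromQL query. The following Python sketch mimics that expansion; the rename_metric and render_query helpers are hypothetical illustrations, not the adapter's actual implementation.

```python
import re

# Sketch of the adapter rule's behavior: rename the series with the
# name.matches/name.as regex, and substitute the template placeholders
# in metricsQuery to produce the PromQL query that is actually executed.

def rename_metric(series: str, matches: str, as_pattern: str) -> str:
    """Apply the rule's name.matches regex and name.as rewrite."""
    return re.sub(matches, as_pattern.replace("${1}", r"\1"), series)

def render_query(metrics_query: str, series: str, label_matchers: str) -> str:
    """Substitute the <<.Series>> and <<.LabelMatchers>> placeholders."""
    return (metrics_query
            .replace("<<.Series>>", series)
            .replace("<<.LabelMatchers>>", label_matchers))

series = "nginx_ingress_controller_requests"

print(rename_metric(series, r"^(.*)_requests", "${1}_per_second"))
# nginx_ingress_controller_per_second

print(render_query("sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))",
                   series, 'service="sample-app"'))
# sum(rate(nginx_ingress_controller_requests{service="sample-app"}[2m]))
```

The label matchers come from the selector.matchLabels field of each HPA, which is how one adapter rule can serve multiple applications.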

View the metric output

Run the following command to view the metric output.

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/nginx_ingress_controller_per_second" | jq .

The query result is as follows:

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "nginx_ingress_controller_per_second",
      "metricLabels": {},
      "timestamp": "2025-07-25T07:56:04Z",
      "value": "0"
    }
  ]
}
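The value field is a Kubernetes quantity string: it is 0 while the cluster is idle, and under load it takes milli-unit forms such as 22096m, which means 22.096 requests per second. The following Python sketch parses a trimmed, hypothetical sample of the response; the parse_quantity helper handles only the plain and milli forms this metric produces.

```python
import json

# Interpret the external metrics API response. parse_quantity is a
# minimal helper covering only the quantity forms seen for this metric.

def parse_quantity(quantity: str) -> float:
    """Convert a quantity such as '0' or '22096m' to requests per second."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # milli-units
    return float(quantity)

response = json.loads("""
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "items": [
    {"metricName": "nginx_ingress_controller_per_second", "value": "22096m"}
  ]
}
""")

for item in response["items"]:
    print(item["metricName"], parse_quantity(item["value"]))
# nginx_ingress_controller_per_second 22.096
```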

Step 4: Create HPAs

  1. Create a file named hpa.yaml and copy the following content into it.


    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: sample-hpa
      namespace: default
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: sample-app
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: nginx_ingress_controller_per_second
              selector:
                matchLabels:
                  # Filters the metric by label. The labels set here are passed to
                  # the <<.LabelMatchers>> placeholder in the adapter-config rule.
                  service: sample-app
            # The External metric type supports only the Value and AverageValue target types.
            target:
              type: AverageValue
              averageValue: 30
    ---
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: test-hpa
      namespace: default
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: test-app
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: nginx_ingress_controller_per_second
              selector:
                matchLabels:
                  # Filters the metric by label. The labels set here are passed to
                  # the <<.LabelMatchers>> placeholder in the adapter-config rule.
                  service: test-app
            # The External metric type supports only the Value and AverageValue target types.
            target:
              type: AverageValue
              averageValue: 30

    Run the following command to deploy the HPAs for the sample-app and test-app applications.

    kubectl apply -f hpa.yaml
  2. Run the following command to check the HPA deployment status.

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   0/30 (avg)   1         10        1          74s
    test-hpa     Deployment/test-app     0/30 (avg)   1         10        1          59m
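For an External metric with an AverageValue target, the HPA derives the desired replica count roughly by dividing the total metric value by the target average and rounding up, then clamping to the minReplicas/maxReplicas range. The following Python sketch is a simplified illustration that ignores the real controller's tolerance, stabilization windows, and other refinements.

```python
import math

# Simplified form of the HPA calculation for an External metric with an
# AverageValue target. Real HPA behavior includes a tolerance band and
# stabilization windows that this sketch omits.

def desired_replicas(total_value: float, average_value: float,
                     min_replicas: int, max_replicas: int) -> int:
    desired = math.ceil(total_value / average_value)
    return max(min_replicas, min(max_replicas, desired))

# Idle: 0 requests/s against a 30 requests/s-per-pod target.
print(desired_replicas(0, 30, 1, 10))       # 1

# Under load: ~66.288 requests/s total keeps 3 replicas
# (66.288 / 3 pods = 22.096, shown as 22096m in the TARGETS column).
print(desired_replicas(66.288, 30, 1, 10))  # 3
```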

Step 5: Verify the results

After the HPAs are deployed, use the Apache Benchmark tool to run a stress test. Observe whether the applications scale out as the number of requests increases.

  1. Run the following command to stress test the /home path of the host.

    ab -c 50 -n 5000 http://test.example.com/home
  2. Run the following command to check the HPA status.

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   0/30 (avg)        1         10        1          22m
    test-hpa     Deployment/test-app     22096m/30 (avg)   1         10        3          80m
  3. Run the following command to stress test the root path of the host.

    ab -c 50 -n 5000 http://test.example.com/
  4. Run the following command to check the HPA status.

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   27778m/30 (avg)   1         10        2          38m
    test-hpa     Deployment/test-app     0/30 (avg)        1         10        1          96m

    The results show that the applications successfully scaled out when the request volume exceeded the threshold.

References

  • Multi-zone balancing is a common deployment method for data-intensive services in high-availability scenarios. When the workload increases, applications that use a multi-zone balanced scheduling policy must automatically scale out instances across multiple zones to meet the scheduling demands of the cluster. For more information, see Implement rapid and simultaneous elastic scaling across multiple zones.

  • You can build custom operating system images to simplify elastic scaling in complex scenarios. For more information, see Elastic optimization with custom images.