Container Service for Kubernetes: Horizontal scaling for multiple applications with Nginx Ingress metrics

Last Updated: Jun 25, 2026

Deploying multiple instances improves application stability but can lead to idle resources and higher cluster costs. Manual scaling is labor-intensive and often lags behind changes in load. You can use metrics from an Nginx Ingress to drive the Horizontal Pod Autoscaler (HPA), which dynamically adjusts the number of pod replicas based on workload. This approach helps keep applications stable and responsive while improving resource utilization and reducing costs. This topic describes how to use Nginx Ingress traffic metrics to autoscale multiple applications.

An Ingress forwards external requests to a Service within the cluster, and the Service then routes the requests to a pod. In a production environment, you can configure automatic scaling based on request volume. The Nginx Ingress Controller exposes this volume through the nginx_ingress_controller_requests metric, which you can use as a metric source for HPA. The Nginx Ingress Controller in ACK clusters is an enhanced version of the community edition and offers a more streamlined user experience.

Prerequisites

Before you can autoscale applications based on Nginx Ingress traffic, you must configure the ack-alibaba-cloud-metrics-adapter component to expose Alibaba Cloud Prometheus metrics for the HPA.

  • The Alibaba Cloud Prometheus monitoring component is deployed. For more information, see Use Alibaba Cloud Prometheus for monitoring.

  • The ack-alibaba-cloud-metrics-adapter component is deployed and its prometheus.url field is configured.

    How to configure prometheus.url

    1. Log on to the ACK console. In the left navigation pane, click Clusters.

    2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Applications > Helm.

    3. On the Helm page, find ack-alibaba-cloud-metrics-adapter and click Update in the Actions column.

    4. In the Update Release panel, set the alibabaCloudMetricsAdapter.prometheus.url field to the Prometheus data request URL that you obtained. Then, click OK.

      For more information, see How to retrieve the Prometheus data request URL.
      For a detailed description of the configuration file, see ack-alibaba-cloud-metrics-adapter component configuration file details.
  • The Apache Benchmark stress test tool is installed.

    Sample commands

    • macOS: Use Homebrew to install the tool.

      brew install httpd
    • Windows: Download the Windows build of Apache from Apache Lounge and extract it. In Command Prompt, use the cd command to go to the bin directory of the extracted folder, and then run ab.exe.

    • Ubuntu or Debian:

      sudo apt update
      sudo apt install apache2-utils
    • CentOS 8 or RHEL:

      sudo yum install httpd-tools

    After the installation is complete, run ab -V to verify the installation.

In this tutorial, you will create two Deployments and their corresponding Services. You will then configure an Ingress with different access paths to route external traffic. Finally, you will configure an HPA for the applications based on the nginx_ingress_controller_requests metric and use the HPA's selector.matchLabels.service field to filter the metric. This enables pods to scale automatically based on traffic.

Step 1: Create applications and services

Use the following YAML manifests to create the application Deployments and their corresponding Services.

  1. Create a file named nginx1.yaml and copy the following content into it.

    YAML example

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: test-app
      namespace: default
      labels:
        app: test-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: test-app
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - image: registry-cn-hangzhou.ack.aliyuncs.com/acs/sample-app:v1-b070784-aliyun
            name: metrics-provider
            ports:
            - name: http
              containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: test-app
      namespace: default
      labels:
        app: test-app
    spec:
      ports:
        - port: 8080
          name: http
          protocol: TCP
          targetPort: 8080
      selector:
        app: test-app
      type: ClusterIP

    Run the following command to create the test-app application and its corresponding Service.

    kubectl apply -f nginx1.yaml
  2. Create a file named nginx2.yaml and copy the following content into it.

    YAML example

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sample-app
      namespace: default
      labels:
        app: sample-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sample-app
      template:
        metadata:
          labels:
            app: sample-app
        spec:
          containers:
          - image: registry-cn-hangzhou.ack.aliyuncs.com/acs/sample-app:v1-b070784-aliyun
            name: metrics-provider
            ports:
            - name: http
              containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: sample-app
      namespace: default
      labels:
        app: sample-app
    spec:
      ports:
        - port: 80
          name: http
          protocol: TCP
          targetPort: 8080
      selector:
        app: sample-app
      type: ClusterIP

    Run the following command to create the sample-app application and its corresponding Service.

    kubectl apply -f nginx2.yaml

Step 2: Create an Ingress

  1. Create a file named ingress.yaml and copy the following content into it.

    YAML example

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: test-ingress
      namespace: default
    spec:
      ingressClassName: nginx
      rules:
        - host: test.example.com
          http:
            paths:
              - backend:
                  service:
                    name: sample-app
                    port:
                      number: 80
                path: /
                pathType: ImplementationSpecific
              - backend:
                  service:
                    name: test-app
                    port:
                      number: 8080
                path: /home
                pathType: ImplementationSpecific
    • host: The domain name used to access the Service. This example uses test.example.com.

    • path: The URL path for access. When a request arrives, it is matched to the corresponding Service based on the routing rule. The request is then sent to the corresponding pod through the Service.

    • backend: Specifies the Service to which the current path forwards requests. It consists of a Service name and a Service port.

    Run the following command to deploy the Ingress resource.

    kubectl apply -f ingress.yaml
  2. Run the following command to get the Ingress resource.

    kubectl get ingress -o wide

    Expected output:

    NAME           CLASS   HOSTS              ADDRESS       PORTS   AGE                                                  
    test-ingress   nginx   test.example.com   10.XX.XX.10   80      55s

    After the Ingress is deployed, you can reach the applications at test.example.com through the / and /home paths. Based on the preceding configuration, the Nginx Ingress Controller routes requests for / to sample-app and requests for /home to test-app. You can query the nginx_ingress_controller_requests metric in Alibaba Cloud Prometheus to obtain request information for each application.
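As a rough illustration of such a query, the following Python sketch builds a PromQL expression for the per-second request rate grouped by backend Service and wraps it in a URL for the standard Prometheus instant-query endpoint (/api/v1/query). The base URL is a hypothetical placeholder, not a real endpoint, and the 2-minute window is an assumption chosen to match the adapter rule used later in this topic. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: replace with the HTTP API URL of your Prometheus instance.
PROMETHEUS_URL = "http://prometheus.example.internal:9090"


def per_service_rate_query(window="2m"):
    """Return PromQL for the per-second request rate, grouped by backend Service."""
    return f"sum by (service) (rate(nginx_ingress_controller_requests[{window}]))"


def instant_query_url(base_url, promql):
    """Build a URL for the Prometheus instant-query endpoint (/api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # Print the URL; open it with any HTTP client that can reach your Prometheus instance.
    print(instant_query_url(PROMETHEUS_URL, per_service_rate_query()))
```

Grouping by the service label is what later lets each HPA select only the traffic of its own application.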

Step 3: Convert Prometheus metrics for HPA

Configure metrics adapter

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Applications > Helm.

  3. On the Helm page, click ack-alibaba-cloud-metrics-adapter. In the Resource section, click adapter-config, and then click Edit YAML in the upper-right corner of the page.

  4. Replace the existing rules in the configuration with the following content. Then, click OK at the bottom of the page.

    rules:
    - metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))
      name:
        as: ${1}_per_second
        matches: ^(.*)_requests
      resources:
        namespaced: false
      seriesQuery: nginx_ingress_controller_requests

    In the adapter-config YAML, the rules are stored under data.config.yaml:

    apiVersion: v1
    data:
      config.yaml: |
        rules:
        - metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))
          name:
            as: ${1}_per_second
            matches: ^(.*)_requests
          resources:
            namespaced: false
          seriesQuery: nginx_ingress_controller_requests

    For more information, see Horizontal pod autoscaling based on Alibaba Cloud Prometheus metrics.
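To make the rule easier to read, the following Python sketch imitates its two transformations: the name block rewrites the series name into the metric name that the HPA sees, and metricsQuery is a template whose placeholders are filled with the series name and the labels passed by the HPA. This is only an approximation for illustration. The adapter itself performs these substitutions, and the function names here are hypothetical.

```python
import re

SERIES_QUERY = "nginx_ingress_controller_requests"
METRICS_QUERY = "sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))"


def exposed_metric_name(series):
    """Mimic 'matches: ^(.*)_requests' combined with 'as: ${1}_per_second'."""
    return re.sub(r"^(.*)_requests", r"\g<1>_per_second", series)


def render_query(series, labels):
    """Fill the template placeholders with the series name and label matchers."""
    matchers = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return (
        METRICS_QUERY
        .replace("<<.Series>>", series)
        .replace("<<.LabelMatchers>>", matchers)
    )


if __name__ == "__main__":
    print(exposed_metric_name(SERIES_QUERY))
    print(render_query(SERIES_QUERY, {"service": "sample-app"}))
```

Under these assumptions, an HPA that selects service: sample-app ends up with a query over the request rate of sample-app only, averaged over a 2-minute window.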

View the metric output

Run the following command to view the metric output.

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/nginx_ingress_controller_per_second" | jq .

The query result is as follows:

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "nginx_ingress_controller_per_second",
      "metricLabels": {},
      "timestamp": "2025-07-25T07:56:04Z",
      "value": "0"
    }
  ]
}

Step 4: Create HPAs

  1. Create a file named hpa.yaml and copy the following content into it.

    YAML example

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: sample-hpa
      namespace: default
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: sample-app
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: nginx_ingress_controller_per_second
              selector:
                matchLabels:
                  # Use this field to filter the metric. The labels specified here are passed to the <<.LabelMatchers>> placeholder in adapter-config.
                  service: sample-app
            # The External metric type supports only the Value and AverageValue target types.
            target:
              type: AverageValue
              averageValue: 30
    ---
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: test-hpa
      namespace: default
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: test-app
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: nginx_ingress_controller_per_second
              selector:
                matchLabels:
                  # Use this field to filter the metric. The labels specified here are passed to the <<.LabelMatchers>> placeholder in adapter-config.
                  service: test-app
            # The External metric type supports only the Value and AverageValue target types.
            target:
              type: AverageValue
              averageValue: 30

    Run the following command to deploy the HPAs for the sample-app and test-app applications.

    kubectl apply -f hpa.yaml
  2. Run the following command to check the HPA deployment status.

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   0/30 (avg)   1         10        1          74s
    test-hpa     Deployment/test-app     0/30 (avg)   1         10        1          59m
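The following Python sketch approximates how an HPA turns an External metric with an AverageValue target into a replica count: the total metric value is divided by the per-pod target and rounded up, unless the current ratio is already within the tolerance, and the result is clamped to minReplicas and maxReplicas. The function name and parameters are hypothetical. The 0.1 tolerance is the Kubernetes default, and scale-down stabilization and scaling policies are not modeled. The sketch assumes at least one current replica.

```python
import math


def desired_replicas(total_rate, target_average, current_replicas,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA decision for an External metric with an AverageValue target."""
    # Ratio of the observed load to the load that the current replicas are meant to carry.
    usage_ratio = total_rate / (target_average * current_replicas)
    if abs(1.0 - usage_ratio) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(total_rate / target_average)
    # Clamp to the bounds configured in the HPA.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # No traffic: the count stays at minReplicas.
    print(desired_replicas(total_rate=0, target_average=30, current_replicas=1))
    # 100 requests per second at 30 per pod needs 4 pods.
    print(desired_replicas(total_rate=100, target_average=30, current_replicas=1))
```

With averageValue set to 30, each HPA in this example aims for roughly 30 requests per second per pod.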

Step 5: Verify the results

After the HPAs are deployed, use the Apache Benchmark tool to run a stress test and verify that the application pods scale out as request volume increases. The commands in this step assume that test.example.com resolves to the Ingress address shown in Step 2.

  1. Run the following command to stress test the /home path of the host.

    ab -c 50 -n 5000 test.example.com/home
  2. Run the following command to check the HPA status.

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   0/30 (avg)        1         10        1          22m
    test-hpa     Deployment/test-app     22096m/30 (avg)   1         10        3          80m
  3. Run the following command to stress test the root path of the host.

    ab -c 50 -n 5000 test.example.com/
  4. Run the following command to check the HPA status.

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   27778m/30 (avg)   1         10        2          38m
    test-hpa     Deployment/test-app     0/30 (avg)        1         10        1          96m

    The results show that each application scaled out independently: only the application whose path received the test traffic added replicas, while the other remained at one replica.
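The TARGETS values use Kubernetes quantity notation, in which the m suffix means thousandths, so 22096m is 22.096 requests per second per pod. The following Python sketch parses a TARGETS cell and multiplies the per-pod average by the replica count to estimate the total request rate. The helper names are hypothetical, only plain and m-suffixed quantities are handled, and the sketch assumes that the displayed average is the total divided by the current replica count.

```python
from decimal import Decimal


def parse_quantity(text):
    """Convert a plain or milli-suffixed quantity ("30", "22096m") to a Decimal."""
    if text.endswith("m"):
        return Decimal(text[:-1]) / Decimal(1000)
    return Decimal(text)


def parse_targets(cell):
    """Split a TARGETS cell such as "22096m/30 (avg)" into (current, target)."""
    values = cell.split(" ")[0]  # drop the "(avg)" marker
    current, target = values.split("/")
    return parse_quantity(current), parse_quantity(target)


def total_rate(cell, replicas):
    """Estimate the total request rate from the per-pod average and replica count."""
    current, _ = parse_targets(cell)
    return current * replicas


if __name__ == "__main__":
    print(total_rate("22096m/30 (avg)", 3))
    print(total_rate("27778m/30 (avg)", 2))
```

For test-app, 22.096 × 3 is about 66.3 requests per second in total, and dividing by the target of 30 and rounding up gives 3, which matches the REPLICAS column. For sample-app, 27.778 × 2 is about 55.6, which needs 2 replicas.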

Related documents

  • Multi-zone balancing is a common deployment method for data-intensive services in high-availability scenarios. When the workload increases, applications that use a multi-zone balanced scheduling policy must automatically scale out instances across multiple zones to meet cluster scheduling demands. For more information, see Implement rapid and simultaneous elastic scaling across multiple zones.

  • You can build custom operating system images to simplify elastic scaling in complex scenarios. For more information, see Elastic optimization with custom images.