
Container Service for Kubernetes: Scale multiple applications with Nginx Ingress traffic metrics

Last Updated: Mar 26, 2026


Running multiple application instances improves stability, but idle replicas raise cluster costs. Horizontal Pod Autoscaler (HPA) dynamically adjusts pod replica counts based on live traffic, eliminating both under-provisioning and idle waste. This tutorial shows how to drive HPA for multiple applications simultaneously using the nginx_ingress_controller_requests metric exposed by the NGINX Ingress Controller in your ACK cluster. The NGINX Ingress Controller in ACK clusters is an enhanced version of the community edition and is easier to use.

Each application gets its own HPA that responds only to its own traffic. The selector.matchLabels.service field in each HPA spec acts as a filter, passing per-service label matchers to the adapter rule so that scaling decisions stay isolated between applications.

Prerequisites

Before you begin, ensure that you have:

  • Alibaba Cloud Prometheus deployed in your cluster. For setup instructions, see Use Alibaba Cloud Prometheus for monitoring.

  • The ack-alibaba-cloud-metrics-adapter component deployed, with its prometheus.url field configured

    How to configure prometheus.url

    1. Log on to the ACK console. In the left navigation pane, click Clusters.

    2. On the Clusters page, find the cluster you want and click its name. In the left-side pane, choose Applications > Helm.

    3. On the Helm page, find ack-alibaba-cloud-metrics-adapter and click Update in the Actions column.

    4. In the Update Release panel, set the alibabaCloudMetricsAdapter.prometheus.url field to your Prometheus data request URL. Then click OK.

    For instructions on retrieving the Prometheus data request URL, see How to retrieve the Prometheus data request URL. For a full description of the adapter configuration file, see Detailed description of the ack-alibaba-cloud-metrics-adapter component configuration file.
  • ApacheBench (ab) installed for load testing

    Sample commands

    Install ab using the package manager for your operating system:

    • macOS (Homebrew):

      brew install httpd

    • Windows: Download the Windows build from Apache Lounge. Extract the archive, navigate to the bin folder, and run ab.exe.

    • Ubuntu or Debian:

      sudo apt update
      sudo apt install apache2-utils

    • CentOS 8 or RHEL:

      sudo yum install httpd-tools

    After installation, run ab -V to confirm it is working.

    ab -V

How it works

An Ingress forwards external requests to a Service, which routes them to the matching pod. The NGINX Ingress Controller records per-service request counts in the nginx_ingress_controller_requests Prometheus metric.

HPA cannot consume Prometheus metrics directly — it requires metrics exposed through the Kubernetes external metrics API. The ack-alibaba-cloud-metrics-adapter bridges this gap by processing nginx_ingress_controller_requests in four steps:

| Step | Field | Role | Value used here |
|---|---|---|---|
| 1. Discovery | seriesQuery | Identifies which Prometheus metric to expose | nginx_ingress_controller_requests |
| 2. Association | resources | Specifies whether this metric is namespace-scoped | namespaced: false |
| 3. Naming | name | Renames the metric in the external API | Strips _requests and appends _per_second, producing nginx_ingress_controller_per_second |
| 4. Querying | metricsQuery | Defines the PromQL template used to compute the value | sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) |

The template variables in metricsQuery are filled in automatically by the adapter at query time:

  • <<.Series>> — replaced with the matched Prometheus series name (nginx_ingress_controller_requests)

  • <<.LabelMatchers>> — replaced with the label selectors from the HPA spec (for example, service="sample-app")

Each HPA uses the selector.matchLabels.service field to filter this metric to a single application, so the scaling of sample-app and test-app remains independent.
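To make the substitution concrete, the following sketch mimics the template expansion in Python (the adapter itself uses Go templates; `expand_metrics_query` is an illustrative name, not part of the adapter):

```python
# Sketch of how the adapter expands the metricsQuery template at query time.
# The real adapter uses Go templates; this only illustrates the substitution.

def expand_metrics_query(template: str, series: str, label_matchers: dict) -> str:
    """Fill <<.Series>> and <<.LabelMatchers>> into a metricsQuery template."""
    matchers = ",".join(f'{k}="{v}"' for k, v in sorted(label_matchers.items()))
    return template.replace("<<.Series>>", series).replace("<<.LabelMatchers>>", matchers)

query = expand_metrics_query(
    "sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))",
    series="nginx_ingress_controller_requests",
    label_matchers={"service": "sample-app"},  # from selector.matchLabels in the HPA spec
)
print(query)
# sum(rate(nginx_ingress_controller_requests{service="sample-app"}[2m]))
```

With `service: test-app` in the HPA selector instead, the same template yields a query scoped to test-app, which is why the two applications scale independently.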

Step 1: Create applications and services

Create two Deployments and their corresponding Services using the YAML files below.

  1. Create nginx1.yaml with the following content:

    YAML example

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: test-app
      namespace: default
      labels:
        app: test-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: test-app
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - image: registry-cn-hangzhou.ack.aliyuncs.com/acs/sample-app:v1-b070784-aliyun
            name: metrics-provider
            ports:
            - name: http
              containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: test-app
      namespace: default
      labels:
        app: test-app
    spec:
      ports:
        - port: 8080
          name: http
          protocol: TCP
          targetPort: 8080
      selector:
        app: test-app
      type: ClusterIP

    Apply the manifest:

    kubectl apply -f nginx1.yaml
  2. Create nginx2.yaml with the following content:

    YAML example

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sample-app
      namespace: default
      labels:
        app: sample-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sample-app
      template:
        metadata:
          labels:
            app: sample-app
        spec:
          containers:
          - image: registry-cn-hangzhou.ack.aliyuncs.com/acs/sample-app:v1-b070784-aliyun
            name: metrics-provider
            ports:
            - name: http
              containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: sample-app
      namespace: default
      labels:
        app: sample-app
    spec:
      ports:
        - port: 80
          name: http
          protocol: TCP
          targetPort: 8080
      selector:
        app: sample-app
      type: ClusterIP

    Apply the manifest:

    kubectl apply -f nginx2.yaml

Step 2: Create an Ingress

  1. Create ingress.yaml with the following content:

    | Field | Description |
    |---|---|
    | host | The domain name used to access the Services. This example uses test.example.com. |
    | path | The URL path. Incoming requests are matched against this path and forwarded to the corresponding Service. |
    | backend | The target Service name and port for each path. / routes to sample-app; /home routes to test-app. |

    YAML example

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: test-ingress
      namespace: default
    spec:
      ingressClassName: nginx
      rules:
        - host: test.example.com
          http:
            paths:
              - backend:
                  service:
                    name: sample-app
                    port:
                      number: 80
                path: /
                pathType: ImplementationSpecific
              - backend:
                  service:
                    name: test-app
                    port:
                      number: 8080
                path: /home
                pathType: ImplementationSpecific

    Apply the manifest:

    kubectl apply -f ingress.yaml
  2. Verify that the Ingress has been created:

    kubectl get ingress -o wide

    Expected output:

    NAME           CLASS   HOSTS              ADDRESS       PORTS   AGE
    test-ingress   nginx   test.example.com   10.XX.XX.10   80      55s

    The NGINX Ingress Controller now routes requests arriving at test.example.com/ to sample-app and requests at test.example.com/home to test-app. Both request streams are recorded in the nginx_ingress_controller_requests metric in Alibaba Cloud Prometheus, labeled by service name.

Step 3: Convert Prometheus metrics to HPA-compatible metrics

HPA reads metrics from the Kubernetes external metrics API, not from Prometheus directly. The ack-alibaba-cloud-metrics-adapter translates Prometheus metrics into this format using a rule you define in the adapter-config ConfigMap.

Modify the adapter-config file

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, find the cluster you want and click its name. In the left-side pane, choose Applications > Helm.

  3. On the Helm page, click ack-alibaba-cloud-metrics-adapter. In the Resource section, click adapter-config, then click Edit YAML in the upper-right corner.

  4. Replace the existing rules with the following, then click OK:

    For the full set of adapter configuration options, see Horizontal pod autoscaling based on Alibaba Cloud Prometheus metrics.
    rules:
    - metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))
      name:
        as: ${1}_per_second
        matches: ^(.*)_requests
      resources:
        namespaced: false
      seriesQuery: nginx_ingress_controller_requests

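The rename performed by the `name` rule (`matches: ^(.*)_requests`, `as: ${1}_per_second`) can be reproduced with an ordinary regex substitution. A minimal sketch (`rename_metric` is an illustrative name; the adapter applies the same capture-group rewrite internally):

```python
import re

# Mirror of the adapter's `name` rule: the capture group in
# `matches: ^(.*)_requests` is substituted into `as: ${1}_per_second`.
def rename_metric(series: str) -> str:
    return re.sub(r"^(.*)_requests", r"\1_per_second", series)

print(rename_metric("nginx_ingress_controller_requests"))
# nginx_ingress_controller_per_second
```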

Verify the metric is available

Run the following command to confirm the adapter is serving the converted metric:

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/nginx_ingress_controller_per_second" | jq .

Expected output:

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "nginx_ingress_controller_per_second",
      "metricLabels": {},
      "timestamp": "2025-07-25T07:56:04Z",
      "value": "0"
    }
  ]
}

A value of "0" is expected — there is no traffic yet.
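If you want to script this check, the response can be parsed directly. A sketch, using the sample zero-traffic payload shown above (in practice you would pipe the output of the kubectl command into the parser):

```python
import json

# Parse the ExternalMetricValueList returned by the external metrics API
# and print each metric's name and current value.
payload = json.loads("""
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "nginx_ingress_controller_per_second",
      "metricLabels": {},
      "timestamp": "2025-07-25T07:56:04Z",
      "value": "0"
    }
  ]
}
""")

for item in payload["items"]:
    print(item["metricName"], "=", item["value"])
# nginx_ingress_controller_per_second = 0
```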

Step 4: Create HPAs

Both HPAs use the same nginx_ingress_controller_per_second external metric but filter it to their respective service using selector.matchLabels.service. The adapter passes these labels as <<.LabelMatchers>> into the PromQL query, keeping each application's scaling decisions independent.

  1. Create hpa.yaml with the following content:

    YAML example

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: sample-hpa
      namespace: default
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: sample-app
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: nginx_ingress_controller_per_second
              selector:
                matchLabels:
                  # Filters the metric to requests for this service only.
                  # The labels set here are passed to <<.LabelMatchers>> in the adapter rule.
                  service: sample-app
            target:
              # External metrics support only Value and AverageValue target types.
              type: AverageValue
              averageValue: 30
    ---
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: test-hpa
      namespace: default
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: test-app
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: External
          external:
            metric:
              name: nginx_ingress_controller_per_second
              selector:
                matchLabels:
                  # Filters the metric to requests for this service only.
                  # The labels set here are passed to <<.LabelMatchers>> in the adapter rule.
                  service: test-app
            target:
              # External metrics support only Value and AverageValue target types.
              type: AverageValue
              averageValue: 30

    Apply the manifest:

    kubectl apply -f hpa.yaml
  2. Verify both HPAs are ready:

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   0/30 (avg)   1         10        1          74s
    test-hpa     Deployment/test-app     0/30 (avg)   1         10        1          59m

    Both HPAs show 0/30 (avg), meaning the current request rate is below the scale-out threshold of 30 requests per second per pod.
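The scale-out decision for an AverageValue target follows the standard HPA formula: desiredReplicas = ceil(currentReplicas × currentAverageValue / targetAverageValue), clamped between minReplicas and maxReplicas. A minimal sketch of that arithmetic, ignoring HPA's tolerance band and stabilization windows (`desired_replicas` is an illustrative name):

```python
import math

# Standard HPA scaling rule for an AverageValue target:
#   desired = ceil(current_replicas * current_average / target_average)
# clamped to [min_r, max_r]. Tolerance and stabilization windows are omitted.
def desired_replicas(current_replicas: int, current_avg: float,
                     target_avg: float, min_r: int = 1, max_r: int = 10) -> int:
    desired = math.ceil(current_replicas * current_avg / target_avg)
    return max(min_r, min(max_r, desired))

# One replica seeing 75 req/s against the 30 req/s per-pod target scales to 3 pods.
print(desired_replicas(1, 75.0, 30.0))  # 3
# At or below the target, the replica count holds at the minimum.
print(desired_replicas(1, 0.0, 30.0))   # 1
```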

Step 5: Verify autoscaling

Use ApacheBench to generate traffic and watch each HPA respond independently.

  1. Send 5,000 requests to the /home path (routed to test-app):

    ab -c 50 -n 5000 http://test.example.com/home
  2. Watch the HPA status update in real time:

    kubectl get hpa --watch

    After the load is applied, you will see test-hpa scale out while sample-hpa remains at one replica. The output transitions from the initial state to the scaled-out state:

    NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   0/30 (avg)        1         10        1          22m
    test-hpa     Deployment/test-app     22096m/30 (avg)   1         10        3          80m

    Press Ctrl+C to stop watching.

  3. Send 5,000 requests to the root path (routed to sample-app):

    ab -c 50 -n 5000 http://test.example.com/
  4. Check the HPA status again:

    kubectl get hpa

    Expected output:

    NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
    sample-hpa   Deployment/sample-app   27778m/30 (avg)   1         10        2          38m
    test-hpa     Deployment/test-app     0/30 (avg)        1         10        1          96m

    sample-hpa scaled out in response to the new load, while test-hpa scaled back in after the traffic to /home stopped. Each application scaled independently based on its own traffic.
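Values like 22096m in the TARGETS column use Kubernetes quantity notation: the m suffix denotes milli-units, so 22096m is about 22.1 requests per second per pod. A minimal parser sketch (`parse_quantity` is an illustrative name and handles only the plain and m-suffixed forms seen here):

```python
# Kubernetes quantity notation: an "m" suffix means milli-units,
# so "22096m" in the TARGETS column is 22.096 requests per second per pod.
def parse_quantity(q: str) -> float:
    if q.endswith("m"):
        return int(q[:-1]) / 1000.0
    return float(q)

print(parse_quantity("22096m"))  # 22.096
print(parse_quantity("0"))       # 0.0
```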

What's next