Alibaba Cloud Service Mesh: Configure global throttling for an ingress gateway

Last Updated: Mar 11, 2026

When multiple ingress gateway replicas handle traffic independently, a single replica cannot enforce cluster-wide rate limits. Global throttling solves this by routing all rate-limit decisions through a centralized gRPC service backed by Redis, so request counts are shared across all replicas. Use it to protect backend services from traffic bursts, overload, and abuse.

This topic walks through five throttling scenarios -- from broad route-level limits to fine-grained per-client-IP controls. All scenarios share the same base setup: a throttling service, a sample application, and the ASMGlobalRateLimiter custom resource.

How global throttling works

Global throttling relies on a dedicated gRPC-based throttling service (built on Envoy's rate limit service) and a Redis backend to track request counts across all gateway replicas.

  1. Create an ASMGlobalRateLimiter resource in the ASM control plane to define the throttling rule.

  2. ASM reconciles the resource and generates the throttling service configuration in the resource's status.config.yaml field.

  3. Copy the generated configuration into the ratelimit-config ConfigMap in the data plane cluster.

  4. On each inbound request, the ingress gateway queries the throttling service. When the limit is exceeded, the gateway returns 429 Too Many Requests.

The manual ConfigMap update is required because the throttling service runs in the data plane cluster, while the ASMGlobalRateLimiter resource is managed in the ASM control plane. Update the ConfigMap after every rule change to keep the two in sync.
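
To illustrate why a shared counter matters, here is a minimal Python sketch. It is not the Envoy rate limit service itself: a plain dictionary stands in for Redis, the names SharedCounterStore and check are hypothetical, and a simple fixed-window counter is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class SharedCounterStore:
    """Stand-in for Redis: one counter per (descriptor, window) key."""
    counts: dict = field(default_factory=dict)

    def increment(self, key: tuple) -> int:
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


def check(store: SharedCounterStore, descriptor: str, now: float,
          quota: int, window_seconds: int) -> int:
    """Return the HTTP status a gateway replica would send for one request."""
    window = int(now // window_seconds)  # requests in the same window share a key
    used = store.increment((descriptor, window))
    return 200 if used <= quota else 429


store = SharedCounterStore()
# Two replicas consult the same store, so the second request is rejected
# even though it arrives at a different replica.
replica_a = check(store, "productpage", now=0.0, quota=1, window_seconds=60)
replica_b = check(store, "productpage", now=10.0, quota=1, window_seconds=60)
next_window = check(store, "productpage", now=61.0, quota=1, window_seconds=60)
print(replica_a, replica_b, next_window)
```

If each replica kept its own store instead, both of the first two requests would be accepted, which is the gap that global throttling closes.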

Choose the right scenario

Scenario | Throttle by | Use case | ASM version
1. Route-level throttling | Specific route | Limit all traffic on a single route | 1.18.0.131+
2. Domain and port throttling | Domain + port | Limit all traffic to a host:port combination | 1.18.0.131+
3. Header and query parameter throttling | Request headers and query parameters | Limit only requests matching specific headers or query strings | 1.19.0+
4. Client IP throttling | Specific client IP or CIDR block | Block or limit a known abusive IP | 1.19.0+
5. Per-client-IP throttling | Each client IP independently | Enforce per-IP quotas across all clients | 1.25.0+
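
The version requirements in the table can be checked with a small helper. This is a sketch only: the function names are hypothetical, and it assumes ASM versions are purely numeric, dot-separated strings such as 1.18.0.131.

```python
def parse_version(text: str) -> tuple:
    """Turn '1.18.0.131' into (1, 18, 0, 131) for comparison."""
    return tuple(int(part) for part in text.split("."))


# Minimum ASM versions taken from the scenario table.
MIN_VERSIONS = {
    "Route-level throttling": "1.18.0.131",
    "Domain and port throttling": "1.18.0.131",
    "Header and query parameter throttling": "1.19.0",
    "Client IP throttling": "1.19.0",
    "Per-client-IP throttling": "1.25.0",
}


def supported_scenarios(asm_version: str) -> list:
    """List the scenarios whose minimum version is met by asm_version."""
    current = parse_version(asm_version)
    return [name for name, minimum in MIN_VERSIONS.items()
            if current >= parse_version(minimum)]


available = supported_scenarios("1.19.0")
print(available)
```

Comparing tuples rather than raw strings avoids ordering mistakes such as treating 1.9 as newer than 1.25.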

Prerequisites

Preparations

Before configuring throttling rules, deploy the throttling service and a sample application.

Deploy the throttling service

The throttling service has two components: a Redis instance that stores rate counters, and the Envoy rate limit service that evaluates requests against configured limits.

  1. Create a file named ratelimit-svc.yaml.

       apiVersion: v1
       kind: ServiceAccount
       metadata:
         name: redis
       ---
       apiVersion: v1
       kind: Service
       metadata:
         name: redis
         labels:
           app: redis
       spec:
         ports:
         - name: redis
           port: 6379
         selector:
           app: redis
       ---
       apiVersion: apps/v1
       kind: Deployment
       metadata:
         name: redis
       spec:
         replicas: 1
         selector:
           matchLabels:
             app: redis
         template:
           metadata:
             labels:
               app: redis
               sidecar.istio.io/inject: "false"
           spec:
             containers:
             - image: registry-cn-hangzhou.ack.aliyuncs.com/dev/redis:alpine
               imagePullPolicy: Always
               name: redis
               ports:
               - name: redis
                 containerPort: 6379
             restartPolicy: Always
             serviceAccountName: redis
       ---
       apiVersion: v1
       kind: ConfigMap
       metadata:
         name: ratelimit-config
       data:
         config.yaml: |
           {}
       ---
       apiVersion: v1
       kind: Service
       metadata:
         name: ratelimit
         labels:
           app: ratelimit
       spec:
         ports:
         - name: http-port
           port: 8080
           targetPort: 8080
           protocol: TCP
         - name: grpc-port
           port: 8081
           targetPort: 8081
           protocol: TCP
         - name: http-debug
           port: 6070
           targetPort: 6070
           protocol: TCP
         selector:
           app: ratelimit
       ---
       apiVersion: apps/v1
       kind: Deployment
       metadata:
         name: ratelimit
       spec:
         replicas: 1
         selector:
           matchLabels:
             app: ratelimit
         strategy:
           type: Recreate
         template:
           metadata:
             labels:
               app: ratelimit
               sidecar.istio.io/inject: "false"
           spec:
             containers:
               # Latest image from https://hub.docker.com/r/envoyproxy/ratelimit/tags
             - image: registry-cn-hangzhou.ack.aliyuncs.com/dev/envoyproxy/ratelimit:e059638d
               imagePullPolicy: Always
               name: ratelimit
               command: ["/bin/ratelimit"]
               env:
               - name: LOG_LEVEL
                 value: debug
               - name: REDIS_SOCKET_TYPE
                 value: tcp
               - name: REDIS_URL
                 value: redis.default.svc.cluster.local:6379
               - name: USE_STATSD
                 value: "false"
               - name: RUNTIME_ROOT
                 value: /data
               - name: RUNTIME_SUBDIRECTORY
                 value: ratelimit
               - name: RUNTIME_WATCH_ROOT
                 value: "false"
               - name: RUNTIME_IGNOREDOTFILES
                 value: "true"
               ports:
               - containerPort: 8080
               - containerPort: 8081
               - containerPort: 6070
               volumeMounts:
               - name: config-volume
                 # $RUNTIME_ROOT/$RUNTIME_SUBDIRECTORY/$RUNTIME_APPDIRECTORY/config.yaml
                 mountPath: /data/ratelimit/config
             volumes:
             - name: config-volume
               configMap:
                 name: ratelimit-config
  2. Connect to your ACK cluster with kubectl and deploy the throttling service. For details, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.

       kubectl apply -f ratelimit-svc.yaml

Deploy the Bookinfo sample application

  1. Download the bookinfo.yaml file from the Istio repository on GitHub.

  2. Deploy the Bookinfo application in the ACK cluster.

       kubectl apply -f bookinfo.yaml
  3. Create a file named bookinfo-gateway.yaml.

       apiVersion: networking.istio.io/v1beta1
       kind: Gateway
       metadata:
         name: bookinfo-gateway
         namespace: default
       spec:
         selector:
           istio: ingressgateway
         servers:
           - hosts:
               - bf2.example.com
             port:
               name: http
               number: 80
               protocol: http
       ---
       apiVersion: networking.istio.io/v1beta1
       kind: VirtualService
       metadata:
         name: bookinfo
         namespace: default
       spec:
         gateways:
           - bookinfo-gateway
         hosts:
           - bf2.example.com
         http:
           - match:
               - uri:
                   exact: /productpage
               - uri:
                   prefix: /static
               - uri:
                   exact: /login
               - uri:
                   exact: /logout
               - uri:
                   prefix: /api/v1/products
             name: productpage-route-name1
             route:
               - destination:
                   host: productpage
                   port:
                     number: 9080
  4. Connect to the ASM instance with kubectl and create the Gateway and VirtualService. For details, see Use kubectl on the control plane to access Istio resources.

       kubectl apply -f bookinfo-gateway.yaml

     This creates a route named productpage-route-name1 for the bf2.example.com domain. The route matches requests to /productpage, /static, /login, /logout, and /api/v1/products, and forwards them to the productpage service on port 9080.

Common procedure for applying throttling rules

Every scenario in this topic follows the same three-step process after creating the ASMGlobalRateLimiter YAML:

  1. Apply the rule in the ASM control plane:

       kubectl apply -f global-ratelimit-gw.yaml
  2. Extract the generated configuration from the reconciled resource:

       kubectl get asmglobalratelimiter global-test -n istio-system -o yaml

     Locate the status.config.yaml field in the output and copy its content exactly.
  3. Update the ConfigMap in the ACK cluster. Paste the copied content into the data.config.yaml field of the ratelimit-config ConfigMap, using the following format in a file named ratelimit-config.yaml:

       apiVersion: v1
       kind: ConfigMap
       metadata:
         name: ratelimit-config
       data:
         config.yaml: |
           # Paste the content from status.config.yaml here

     Then apply the file:

       kubectl apply -f ratelimit-config.yaml
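
The paste step is easy to get wrong because the pasted content must be indented under the block scalar. The following Python sketch shows one way to produce the manifest text; build_configmap is a hypothetical helper, and the sample input is the Scenario 1 status output from this topic.

```python
import textwrap


def build_configmap(generated_config: str, name: str = "ratelimit-config") -> str:
    """Embed the generated rate limit config in a ConfigMap manifest."""
    header = (
        "apiVersion: v1\n"
        "kind: ConfigMap\n"
        "metadata:\n"
        f"  name: {name}\n"
        "data:\n"
        "  config.yaml: |\n"
    )
    # A YAML block scalar needs every content line indented past its key.
    body = textwrap.indent(generated_config.strip("\n") + "\n", "    ")
    return header + body


generated = (
    "descriptors:\n"
    "- key: generic_key\n"
    "  rate_limit:\n"
    "    requests_per_unit: 1\n"
    "    unit: MINUTE\n"
    "  value: RateLimit[global-test.istio-system]-Id[597770312]\n"
    "domain: ratelimit.default.svc.cluster.local\n"
)
manifest = build_configmap(generated)
print(manifest)
```

The printed text can be saved as ratelimit-config.yaml and applied in the ACK cluster as described in step 3.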

Scenario 1: Route-level throttling

Limit all traffic on a specific virtual service route. This example limits the productpage-route-name1 route at bf2.example.com:80 to 1 request per minute.

Step 1: Create the throttling rule

Create a file named global-ratelimit-gw.yaml:

apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMGlobalRateLimiter
metadata:
  name: global-test
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  rateLimitService:
    host: ratelimit.default.svc.cluster.local
    port: 8081
    timeout:
      seconds: 5
  isGateway: true
  configs:
  - name: productpage
    limit:
      unit: MINUTE
      quota: 1
    match:
      vhost:
        name: bf2.example.com
        port: 80
        route:
          name_match: productpage-route-name1

Key fields:

Field | Description
workloadSelector | Targets the ingress gateway workload. Set to istio: ingressgateway.
isGateway | Must be true for gateway-level throttling.
rateLimitService | Address and timeout for the throttling service deployed in the Preparations section.
limit.unit / limit.quota | Time window and request count. MINUTE with quota: 1 allows 1 request per minute.
vhost | Matches the domain, port, and route. name and port must match the VirtualService host and gateway port. route.name_match must match the route name in the VirtualService.

For a complete field reference, see ASMGlobalRateLimiter CRD description.

Step 2: Apply and sync the configuration

Follow the common procedure to apply the rule, extract the generated configuration, and update the ConfigMap.

Expected status output

status:
  config.yaml: |
    descriptors:
    - key: generic_key
      rate_limit:
        requests_per_unit: 1
        unit: MINUTE
      value: RateLimit[global-test.istio-system]-Id[597770312]
    domain: ratelimit.default.svc.cluster.local
  message: ok
  status: successful

Step 3: Verify

Send two requests to the ingress gateway within one minute. Replace <ASM-gateway-IP> with your ingress gateway IP address. For details, see Obtain the ingress gateway address.

curl -H 'host: bf2.example.com' http://<ASM-gateway-IP>/productpage -v
curl -H 'host: bf2.example.com' http://<ASM-gateway-IP>/productpage -v

The first request succeeds. The second request returns a 429 response:

< HTTP/1.1 429 Too Many Requests
< x-envoy-ratelimited: true
< x-ratelimit-limit: 1, 1;w=60
< x-ratelimit-remaining: 0
< x-ratelimit-reset: 48

Response headers:

Header | Meaning
x-envoy-ratelimited | Indicates the request was throttled.
x-ratelimit-limit | The configured limit and window (1 request per 60-second window).
x-ratelimit-remaining | Remaining requests in the current window.
x-ratelimit-reset | Seconds until the rate limit window resets.
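
When scripting the verification, these headers can be pulled out of the curl -v output. The sketch below assumes the output format shown above, where response header lines start with "< ". The name parse_throttle_headers is hypothetical.

```python
def parse_throttle_headers(verbose_output: str) -> dict:
    """Collect rate limit response headers from `curl -v` output."""
    headers = {}
    for line in verbose_output.splitlines():
        if not line.startswith("< ") or ":" not in line:
            continue  # curl -v prefixes response header lines with "< "
        name, value = line[2:].split(":", 1)
        name = name.strip().lower()
        if name.startswith("x-ratelimit-") or name == "x-envoy-ratelimited":
            headers[name] = value.strip()
    return headers


sample = """< HTTP/1.1 429 Too Many Requests
< x-envoy-ratelimited: true
< x-ratelimit-limit: 1, 1;w=60
< x-ratelimit-remaining: 0
< x-ratelimit-reset: 48
"""
info = parse_throttle_headers(sample)
print(info)
```

A client could, for example, wait for the number of seconds in x-ratelimit-reset before retrying.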

Scenario 2: Domain and port throttling

Limit all traffic to a specific domain and port combination, regardless of route. This example limits all requests to bf2.example.com:80 to 1 request per minute.

Step 1: Create the throttling rule

Create a file named global-ratelimit-gw.yaml:

apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMGlobalRateLimiter
metadata:
  name: global-test
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  rateLimitService:
    host: ratelimit.default.svc.cluster.local
    port: 8081
    timeout:
      seconds: 5
  isGateway: true
  configs:
  - name: productpage
    limit:
      unit: MINUTE
      quota: 1
    match:
      vhost:
        name: bf2.example.com
        port: 80

The only difference from Scenario 1 is that vhost does not include a route.name_match field, so the limit applies to all routes under the specified domain and port.

Step 2: Apply and sync the configuration

Follow the common procedure.

Expected status output

status:
  config.yaml: |
    descriptors:
    - key: generic_key
      rate_limit:
        requests_per_unit: 1
        unit: MINUTE
      value: RateLimit[global-test.istio-system]-Id[2100900480]
    domain: ratelimit.default.svc.cluster.local
  message: ok
  status: successful

Step 3: Verify

Send two requests within one minute:

curl -H 'host: bf2.example.com' http://<ASM-gateway-IP>/productpage -v
curl -H 'host: bf2.example.com' http://<ASM-gateway-IP>/productpage -v

The first request succeeds. The second returns 429 Too Many Requests, confirming that domain-level throttling is active.

Scenario 3: Header and query parameter throttling

Requires ASM version 1.19.0 or later. See Update an ASM instance.

Throttle only requests that match specific headers and query parameters on a route. Other requests on the same route pass through unthrottled.

In this example, the productpage-route-name1 route allows 100,000 requests per second by default (effectively unlimited). Only requests containing both the ratelimit: "true" header and the ratelimit=enabled query parameter are limited to 1 request per minute.

Step 1: Create the throttling rule

Create a file named global-ratelimit-gw.yaml:

apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMGlobalRateLimiter
metadata:
  name: global-test
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      app: istio-ingressgateway
  rateLimitService:
    host: ratelimit.default.svc.cluster.local
    port: 8081
    timeout:
      seconds: 5
  isGateway: true
  configs:
  - name: productpage
    limit:
      unit: SECOND
      quota: 100000
    match:
      vhost:
        name: bf2.example.com
        port: 80
        route:
          name_match: productpage-route-name1
    limit_overrides:
    - request_match:
        header_match:
        - name: ratelimit
          exact_match: "true"
        query_match:
        - name: ratelimit
          exact_match: "enabled"
      limit:
        unit: MINUTE
        quota: 1

Key fields specific to this scenario:

Field | Description
limit (top-level) | Set to a high value (100,000/second) so normal traffic is not throttled.
limit_overrides | Overrides the base limit for requests matching specific conditions.
request_match.header_match | Matches requests with the ratelimit: "true" header (exact match).
request_match.query_match | Matches requests with the ratelimit=enabled query parameter (exact match).
limit_overrides.limit | The actual throttling limit for matched requests: 1 request per minute.
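
The way the override in this example is selected can be modeled in a few lines of Python. This is a simplified sketch of the behavior described above, not the gateway's implementation, and select_limit is a hypothetical name.

```python
def select_limit(headers: dict, query: dict) -> tuple:
    """Pick the (quota, unit) that applies to a request on the route."""
    base_limit = (100000, "SECOND")
    override_limit = (1, "MINUTE")
    # Both conditions must hold for the override to apply.
    header_ok = headers.get("ratelimit") == "true"
    query_ok = query.get("ratelimit") == "enabled"
    return override_limit if header_ok and query_ok else base_limit


both = select_limit({"ratelimit": "true"}, {"ratelimit": "enabled"})
header_only = select_limit({"ratelimit": "true"}, {})
neither = select_limit({}, {})
print(both, header_only, neither)
```

A request that carries only the header, or only the query parameter, stays on the base limit in this model.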

Step 2: Apply and sync the configuration

Follow the common procedure.

Expected status output

status:
  config.yaml: |
    descriptors:
    - descriptors:
      - descriptors:
        - key: query_match
          rate_limit:
            requests_per_unit: 1
            unit: MINUTE
          value: RateLimit[global-test.istio-system]-Id[1102463266]
        key: header_match
        value: RateLimit[global-test.istio-system]-Id[1102463266]
      key: generic_key
      rate_limit:
        requests_per_unit: 100000
        unit: SECOND
      value: RateLimit[global-test.istio-system]-Id[1102463266]
    domain: ratelimit.default.svc.cluster.local
  message: ok
  status: successful

Step 3: Verify

Test with matching headers and query parameters -- send two requests within one minute:

curl -H 'host: bf2.example.com' -H 'ratelimit: true' \
  'http://<ASM-gateway-IP>/productpage?ratelimit=enabled' -v
curl -H 'host: bf2.example.com' -H 'ratelimit: true' \
  'http://<ASM-gateway-IP>/productpage?ratelimit=enabled' -v

The first request succeeds. The second returns 429 Too Many Requests.

Test without matching headers -- send a request without the ratelimit header:

curl -H 'host: bf2.example.com' http://<ASM-gateway-IP>/productpage -v

This request succeeds, confirming that only requests matching the specified headers and query parameters are throttled.

Scenario 4: Client IP throttling

Requires ASM version 1.19.0 or later. See Update an ASM instance.
The External Traffic Policy of the ingress gateway must be set to Local to preserve client IP addresses. See Create an ingress gateway.
To find the client IP, check the downstream_remote_address field of the gateway access log.

Throttle requests from a specific client IP address or CIDR block on a route. Other client IPs are unaffected.

Step 1: Create the throttling rule

Create a file named global-ratelimit-gw.yaml. Replace the IP address and subnet mask placeholders:

apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMGlobalRateLimiter
metadata:
  name: global-test
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      app: istio-ingressgateway
  rateLimitService:
    host: ratelimit.default.svc.cluster.local
    port: 8081
    timeout:
      seconds: 5
  isGateway: true
  configs:
  - name: productpage
    limit:
      unit: SECOND
      quota: 100000
    match:
      vhost:
        name: bf2.example.com
        port: 80
        route:
          name_match: productpage-route-name1
    limit_overrides:
    - request_match:
        remote_address:
          address: <client-IP>          # Example: 106.11.XX.XX
          v4_prefix_mask_len: <mask>     # Example: 24 (for a /24 CIDR block)
      limit:
        unit: MINUTE
        quota: 1

Replace the following placeholders:

Placeholder | Description | Example
<client-IP> | The client IP address to throttle | 106.11.XX.XX
<mask> | IPv4 subnet mask length (optional) | 24

Key fields specific to this scenario:

Field | Description
remote_address.address | The client IP address to match.
remote_address.v4_prefix_mask_len | Optional. Specifies the subnet mask length to match a CIDR block instead of a single IP.
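
To check in advance which clients a given address and mask length would cover, the match can be approximated with Python's ipaddress module. This is a sketch under assumptions: in_throttled_block is a hypothetical helper, an omitted mask is treated as /32 (a single IP), and the addresses come from documentation-only ranges.

```python
import ipaddress


def in_throttled_block(client_ip: str, address: str,
                       v4_prefix_mask_len: int = 32) -> bool:
    """Return True if client_ip falls inside address/v4_prefix_mask_len."""
    # strict=False lets `address` carry host bits, e.g. 203.0.113.7/24.
    block = ipaddress.ip_network(f"{address}/{v4_prefix_mask_len}", strict=False)
    return ipaddress.ip_address(client_ip) in block


same_subnet = in_throttled_block("203.0.113.99", "203.0.113.7", 24)
other_subnet = in_throttled_block("198.51.100.1", "203.0.113.7", 24)
single_ip = in_throttled_block("203.0.113.99", "203.0.113.7")
print(same_subnet, other_subnet, single_ip)
```

With a mask length of 24, every client in the same /24 block shares the throttled quota, so choose the mask carefully.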

Step 2: Apply and sync the configuration

Follow the common procedure.

Expected status output

status:
  config.yaml: |
    descriptors:
    - descriptors:
      - key: masked_remote_address
        rate_limit:
          requests_per_unit: 1
          unit: MINUTE
        value: xxxxxx
      key: generic_key
      rate_limit:
        requests_per_unit: 100000
        unit: SECOND
      value: RateLimit[global-test.istio-system]-Id[1102463266]
    domain: ratelimit.default.svc.cluster.local
  message: ok
  status: successful

Step 3: Verify

Test from the throttled IP -- send two requests within one minute:

curl -H 'host: bf2.example.com' http://<ASM-gateway-IP>/productpage -v
curl -H 'host: bf2.example.com' http://<ASM-gateway-IP>/productpage -v

The first request succeeds. The second returns 429 Too Many Requests.

Test from a different IP -- send a request from a client with a different IP address:

curl -H 'host: bf2.example.com' http://<ASM-gateway-IP>/productpage -v

This request succeeds, confirming that only the specified client IP is throttled.

Scenario 5: Per-client-IP throttling

Requires ASM version 1.25.0 or later. See Update an ASM instance.
The External Traffic Policy of the ingress gateway must be set to Local to preserve client IP addresses. See Create an ingress gateway.
To find the client IP, check the downstream_remote_address field of the gateway access log.

Unlike Scenario 4 (which targets a specific IP), this scenario enforces independent rate limits for every client IP. Each IP gets its own quota.

Step 1: Create the throttling rule

Create a file named global-ratelimit-gw.yaml:

apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMGlobalRateLimiter
metadata:
  name: global-test
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      app: istio-ingressgateway
  rateLimitService:
    host: ratelimit.default.svc.cluster.local
    port: 8081
    timeout:
      seconds: 5
  isGateway: true
  configs:
  - name: productpage
    limit:
      unit: SECOND
      quota: 100000
    target_services:
    - name: bookinfo
      namespace: default
      kind: VirtualService
      port: 80
      section_name: productpage-route-name1
    limit_overrides:
    - request_match:
        remote_address:
          distinct: true
      limit:
        unit: MINUTE
        quota: 1

Key fields specific to this scenario:

Field | Description
remote_address.distinct | Set to true to maintain separate rate counters for each client IP.
target_services | An alternative to vhost matching. Targets a specific VirtualService by name, namespace, kind, port, and section_name (route name).

For a complete field reference, see ASMGlobalRateLimiter CRD description.

Step 2: Apply and sync the configuration

  1. Apply the rule in the ASM control plane:

       kubectl apply -f global-ratelimit-gw.yaml
  2. Get the reconciled status:

       kubectl get asmglobalratelimiter global-test -n istio-system -o yaml

    Expected status output

       status:
         config.yaml: |
           descriptors:
           - descriptors:
             - key: remote_address
               rate_limit:
                 requests_per_unit: 1
                 unit: MINUTE
             key: generic_key
             rate_limit:
               requests_per_unit: 100000
               unit: SECOND
             value: RateLimit[global-test.istio-system]-Id[537612397]
           domain: ratelimit.default.svc.cluster.local
         message: ok
         status: successful
  3. Update the ConfigMap with the generated configuration. Open the ConfigMap for editing in the ACK cluster:

       kubectl edit ConfigMap ratelimit-config

     Replace the data.config.yaml field with the content from status.config.yaml. The ConfigMap then looks like this:

       apiVersion: v1
       kind: ConfigMap
       metadata:
         name: ratelimit-config
       data:
         config.yaml: |
           descriptors:
           - descriptors:
             - key: remote_address
               rate_limit:
                 requests_per_unit: 1
                 unit: MINUTE
             key: generic_key
             rate_limit:
               requests_per_unit: 100000
               unit: SECOND
             value: RateLimit[global-test.istio-system]-Id[537612397]
           domain: ratelimit.default.svc.cluster.local

Step 3: Verify

Get the ingress gateway IP, then send two requests in a row:

export GATEWAY_URL=$(kubectl -n istio-system get service istio-ingressgateway \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl -H 'host: bf2.example.com' http://$GATEWAY_URL:80/productpage -v
curl -H 'host: bf2.example.com' http://$GATEWAY_URL:80/productpage -v

The first request succeeds. The second returns 429 Too Many Requests, confirming that per-client-IP throttling is active. Each client IP has its own independent rate limit counter.
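
The effect of independent per-IP counters can be pictured with a short Python sketch. It is an illustration only: PerClientLimiter is a hypothetical class, a dictionary stands in for the Redis backend, a fixed-window counter is assumed, and the client addresses come from documentation-only ranges.

```python
from collections import defaultdict


class PerClientLimiter:
    """Fixed-window limiter that keeps one counter per client IP."""

    def __init__(self, quota: int, window_seconds: int):
        self.quota = quota
        self.window_seconds = window_seconds
        self.counts = defaultdict(int)  # stand-in for the Redis backend

    def allow(self, client_ip: str, now: float) -> bool:
        window = int(now // self.window_seconds)
        self.counts[(client_ip, window)] += 1
        return self.counts[(client_ip, window)] <= self.quota


limiter = PerClientLimiter(quota=1, window_seconds=60)
results = [
    limiter.allow("203.0.113.7", now=0.0),   # first request from client A
    limiter.allow("203.0.113.7", now=5.0),   # second request from client A
    limiter.allow("198.51.100.9", now=6.0),  # first request from client B
]
print(results)
```

Client A exhausting its quota does not affect client B, which is the difference from Scenario 4, where one rule targets one address or block.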
