Alibaba Cloud Service Mesh: RateLimitingPolicy field reference

Last Updated: Mar 10, 2026

RateLimitingPolicy is a Custom Resource Definition (CRD) in the Service Mesh (ASM) traffic scheduling suite. Use it to declaratively configure global rate limiting for services in an ASM instance. Rate limiting is based on the token bucket algorithm.

How it works

RateLimitingPolicy uses a token bucket to control request rates:

  • The bucket holds a fixed number of tokens, defined by bucket_capacity.

  • Tokens are added at a steady rate: fill_amount tokens every interval.

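  • Each incoming request consumes one token by default (tokens_label_key can change the cost). When the bucket is empty, requests are rejected with HTTP 429, unless denied_response_status_code sets a different status code.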

  • Setting bucket_capacity equal to fill_amount prevents burst traffic. Setting bucket_capacity higher than fill_amount allows short bursts above the steady-state rate.

To apply separate rate limits per client or user, group requests by a label (such as a user ID header) so that each group gets its own independent token bucket.
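As a rough illustration of the burst behavior described above, the following spec fragment uses made-up values chosen for easy arithmetic. It omits metadata and selectors, so it is not a complete resource.

```yaml
# Illustrative fragment only; the numbers are assumptions, not recommended settings.
spec:
  rate_limiter:
    fill_amount: 10          # 10 tokens added per interval
    bucket_capacity: 30      # the bucket holds at most 30 tokens
    parameters:
      interval: 1s           # steady-state rate: 10 tokens / 1s = 10 requests per second
      limit_by_label_key: http.request.header.user_id   # one bucket per user_id header value
```

With these values, a user whose bucket is full could send up to 30 requests at once. After that, requests are admitted at about 10 per second until the user slows down enough for the bucket to refill.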

CRD structure

apiVersion: istio.alibabacloud.com/v1
kind: RateLimitingPolicy
metadata:
  name: ...
  namespace: ...
spec:
  rate_limiter:                        # RateLimiter (required)
    fill_amount: ...                   # double  - tokens added per interval
    bucket_capacity: ...               # double  - max tokens in bucket
    parameters:                        # RateLimiterParameters (required)
      interval: ...                    # Duration - refill interval
      limit_by_label_key: ...          # string  - group requests by label
      continuous_fill: ...             # bool    - smooth refill (default: true)
      delay_initial_fill: ...          # bool    - delay first fill (default: false)
      max_idle_time: ...               # Duration - idle bucket TTL (default: 7200s)
      lazy_sync:                       # RateLimiterParametersLazySync
        enabled: ...                   # bool    - enable lazy sync (default: false)
        num_sync: ...                  # int     - syncs per interval (default: 4)
    request_parameters:                # RateLimiterRequestParameters
      denied_response_status_code: ... # int     - override HTTP 429
      tokens_label_key: ...            # string  - override token cost per request
    selectors:                         # []Selector (required)
    - agent_group: ...
      control_point: ...
      service: ...

Examples

Basic rate limiting

The following configuration limits each user of the httpbin service to 2 requests every 30 seconds. Because bucket_capacity equals fill_amount, no burst traffic is allowed. Requests are grouped by the user_id header, so each unique user_id gets its own token bucket.

apiVersion: istio.alibabacloud.com/v1
kind: RateLimitingPolicy
metadata:
  name: ratelimit
  namespace: istio-system
spec:
  rate_limiter:
    bucket_capacity: 2                                    # Max 2 tokens in the bucket
    fill_amount: 2                                        # Add 2 tokens per interval
    parameters:
      interval: 30s                                       # Refill every 30 seconds
      limit_by_label_key: http.request.header.user_id     # One bucket per user_id header
    selectors:
    - agent_group: default
      control_point: ingress
      service: httpbin.default.svc.cluster.local

Per-user rate limiting with burst allowance

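The following configuration allows each user (identified by the user_id header) 100 requests per minute with a burst capacity of 150, and returns HTTP 503 instead of 429 when a request is denied. Lazy sync is enabled to reduce latency at the cost of slightly less accurate enforcement.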

apiVersion: istio.alibabacloud.com/v1
kind: RateLimitingPolicy
metadata:
  name: per-user-ratelimit
  namespace: istio-system
spec:
  rate_limiter:
    bucket_capacity: 150                                  # Allow bursts up to 150 requests
    fill_amount: 100                                      # Steady-state: 100 requests per minute
    parameters:
      interval: 60s
      limit_by_label_key: http.request.header.user_id
      continuous_fill: true                               # Smooth token distribution
      lazy_sync:
        enabled: true                                     # Local decisions, periodic sync
        num_sync: 4                                       # Sync 4 times per interval
    request_parameters:
      denied_response_status_code: 503                    # Return 503 instead of 429
    selectors:
    - agent_group: default
      control_point: ingress
      service: my-api.production.svc.cluster.local

Field reference

RateLimitingPolicySpec

The top-level spec field of a RateLimitingPolicy resource.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| rate_limiter | RateLimiter | Yes | Rate limiter configuration. |

RateLimiter

Defines the token bucket parameters, request handling, and target selectors.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| fill_amount | double | Yes | -- | Number of tokens added to the bucket each interval. Together with interval, this defines the steady-state request rate. |
| bucket_capacity | double | Yes | -- | Maximum number of tokens the bucket can hold. Set equal to fill_amount to prevent bursts. Set higher to allow short traffic spikes. |
| parameters | RateLimiterParameters | Yes | -- | Rate limiter runtime parameters. |
| request_parameters | RateLimiterRequestParameters | No | -- | Custom request handling configuration. |
| selectors | []Selector | Yes | -- | Services and traffic to which rate limiting applies. |

RateLimiterParameters

Controls how the rate limiter fills the token bucket and groups requests.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| interval | Duration | Yes | -- | Token bucket refill interval. Example: 30s adds fill_amount tokens every 30 seconds. |
| limit_by_label_key | string | No | -- | Groups requests by a request label. Each unique label value gets its own token bucket. See Request labels for available label keys. |
| continuous_fill | bool | No | true | When true, tokens are added smoothly over the interval rather than all at once when the interval elapses. |
| delay_initial_fill | bool | No | false | When false, the bucket starts at full capacity on the first request. This may allow more requests than the configured rate during the first interval. |
| lazy_sync | RateLimiterParametersLazySync | No | -- | Lazy synchronization configuration. See Lazy sync: accuracy vs. latency. |
| max_idle_time | Duration | No | 7200s | Time to keep a per-label token bucket after its last request. Only applies when limit_by_label_key is set. |
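A minimal sketch of how the optional fill and idle settings might be overridden is shown below. The values are illustrative assumptions, and the fragment omits metadata and selectors.

```yaml
# Illustrative fragment only; the values are assumptions, not recommended settings.
spec:
  rate_limiter:
    fill_amount: 60
    bucket_capacity: 60
    parameters:
      interval: 60s
      limit_by_label_key: http.request.header.user_id
      continuous_fill: false   # add tokens all at once when each interval elapses
      max_idle_time: 600s      # keep a user's bucket for 600s after that user's last request
```

Shortening max_idle_time from the 7200s default is one way to limit how long buckets for inactive label values are retained, which may matter when the label has many distinct values.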

RateLimiterRequestParameters

Overrides default request handling behavior.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| denied_response_status_code | int | No | 429 | HTTP status code returned when a request is rate-limited. |
| tokens_label_key | string | No | -- | Request label whose value determines the number of tokens consumed per request, overriding the default of 1. |
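The following sketch shows one way tokens_label_key could be used to charge some requests more than others. The policy name and the request_cost header are hypothetical, introduced only for illustration; the remaining fields reuse values from the examples earlier in this document.

```yaml
apiVersion: istio.alibabacloud.com/v1
kind: RateLimitingPolicy
metadata:
  name: weighted-ratelimit                                  # hypothetical name
  namespace: istio-system
spec:
  rate_limiter:
    bucket_capacity: 100
    fill_amount: 100
    parameters:
      interval: 60s
      limit_by_label_key: http.request.header.user_id       # one bucket per user_id header
    request_parameters:
      tokens_label_key: http.request.header.request_cost    # hypothetical header carrying the token cost
      denied_response_status_code: 503                      # return 503 instead of 429
    selectors:
    - agent_group: default
      control_point: ingress
      service: httpbin.default.svc.cluster.local
```

Under this sketch, a request that carries request_cost: 5 would consume 5 tokens from that user's bucket instead of 1. How requests without the header, or with a non-numeric value, are handled is not covered by this reference, so verify that behavior before relying on it.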

RateLimiterParametersLazySync

Controls lazy synchronization between Envoy and the remote agent. See Lazy sync: accuracy vs. latency for guidance on when to enable this feature.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| enabled | bool | No | false | Enables lazy synchronization. |
| num_sync | int | No | 4 | Number of times Envoy syncs with the remote agent within each interval. |

Lazy sync: accuracy vs. latency

By default, Envoy contacts the remote agent on every request for a precise rate limiting decision. Lazy sync changes this behavior: Envoy decides locally and syncs with the remote agent periodically.

| Mode | Behavior | Accuracy | Latency |
| --- | --- | --- | --- |
| Default (lazy sync disabled) | Envoy checks the remote agent per request | High -- exact token count enforcement | Higher -- remote call on every request |
| Lazy sync enabled | Envoy decides locally, syncs num_sync times per interval | Lower -- temporary over/under-counting between syncs | Lower -- most requests skip the remote call |

Enable lazy sync for high-throughput APIs where approximate rate enforcement is acceptable and low latency matters more than exact counting.

Request labels

The ASM traffic scheduling suite assigns labels to each request as key-value pairs. Use these labels with limit_by_label_key to group requests for per-group rate limiting, or with tokens_label_key to vary the token cost per request.

HTTP request metadata

Each HTTP request is automatically labeled with the following metadata:

| Label key | Value | Example |
| --- | --- | --- |
| http.method | HTTP method | POST |
| http.flavor | HTTP protocol version | 1.1 |
| http.host | Request host | httpbin.default.svc.cluster.local |
| http.target | Request path | /get |
| http.request_content_length | Request body size in bytes | 431 |
| http.request.header.<header_name> | Value of the specified request header | http.request.header.user_agent |

Baggage header

Baggage is an OpenTelemetry mechanism for propagating context across distributed systems, carried in the W3C baggage HTTP header. If a request includes a baggage HTTP header, each key-value pair is converted into a request label.

Example header:

baggage: userId=alice,isProduction=false

This produces two labels: userId: alice and isProduction: false. To rate-limit per user based on Baggage, set limit_by_label_key to userId.
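A minimal sketch of a policy that groups requests by the userId Baggage entry from the example header is shown below. The policy name and the rate values are assumptions for illustration; the selector reuses the httpbin service from the basic example.

```yaml
apiVersion: istio.alibabacloud.com/v1
kind: RateLimitingPolicy
metadata:
  name: baggage-ratelimit              # hypothetical name
  namespace: istio-system
spec:
  rate_limiter:
    bucket_capacity: 10                # illustrative values
    fill_amount: 10
    parameters:
      interval: 60s
      limit_by_label_key: userId       # label derived from the baggage header entry userId=alice
    selectors:
    - agent_group: default
      control_point: ingress
      service: httpbin.default.svc.cluster.local
```

With the example header, requests carrying userId=alice would share one bucket, and requests with a different userId value would each draw from their own bucket.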