
Alibaba Cloud Service Mesh: QuotaSchedulingPolicy field reference

Last Updated: Mar 11, 2026

The QuotaSchedulingPolicy CustomResourceDefinition (CRD), part of the Service Mesh (ASM) traffic scheduling suite, defines a policy for priority-based request scheduling after a specified request quota is reached.

How it works

QuotaSchedulingPolicy combines two mechanisms:

  • Rate limiter: Controls the request rate using a token bucket algorithm. Tokens are added to a bucket at a fixed rate. When the bucket is empty, the rate limit takes effect.

  • Priority scheduler: Queues requests that exceed the rate limit and processes them in priority order. Higher-priority requests are dequeued first.

When a request arrives, the rate limiter checks the token bucket. If a token is available, the request passes through immediately. Otherwise, the priority scheduler queues the request and processes it based on its priority level.
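The flow described above can be sketched in a few lines of Python. This is a toy model for illustration only, not the ASM implementation: the class name QuotaSchedulerSketch, its methods, and the strict highest-priority-first ordering with FIFO tie-breaking are assumptions made for this example.

```python
import heapq
import itertools


class QuotaSchedulerSketch:
    """Toy model: admit while tokens remain, otherwise queue by priority."""

    def __init__(self, tokens):
        self.tokens = tokens            # tokens currently in the bucket
        self._queue = []                # min-heap of (-priority, seq, request)
        self._seq = itertools.count()   # tie-breaker: FIFO within one priority

    def submit(self, request, priority):
        """Return True if the request passes immediately, False if queued."""
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        heapq.heappush(self._queue, (-priority, next(self._seq), request))
        return False

    def add_tokens(self, amount):
        """Add tokens, then release queued requests, highest priority first."""
        self.tokens += amount
        released = []
        while self._queue and self.tokens >= 1:
            _, _, request = heapq.heappop(self._queue)
            self.tokens -= 1
            released.append(request)
        return released


sched = QuotaSchedulerSketch(tokens=1)
passed = sched.submit("a", priority=50)        # takes the only token
queued_low = sched.submit("b", priority=50)    # bucket empty, so queued
queued_high = sched.submit("c", priority=200)  # queued with higher priority
released = sched.add_tokens(2)                 # "c" is dequeued before "b"
```

In this sketch, request "c" arrives after "b" but leaves the queue first because its priority value is higher.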

Token bucket behavior

The rate limiter uses a token bucket to control the request rate. The fill_amount and bucket_capacity fields interact as follows:

  1. At each fill interval (the interval setting in rate_limiter), fill_amount tokens are added to the bucket.

  2. Tokens accumulate up to the bucket_capacity limit. Excess tokens are discarded.

  3. When the bucket is empty, new requests enter the priority scheduler queue.

Setting bucket_capacity higher than fill_amount allows short bursts of traffic that exceed the steady-state rate, because tokens saved during quiet periods can be spent at once. To prevent bursts above the steady-state rate, set bucket_capacity equal to fill_amount.
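To make the interaction between fill_amount and bucket_capacity concrete, the following Python sketch simulates an idle period followed by a sudden burst. The helper names refill and admitted_after_idle, and the assumption that each request costs exactly one token, are made up for this example and are not part of the CRD.

```python
def refill(tokens, fill_amount, bucket_capacity):
    """One fill cycle: add fill_amount tokens, discarding any overflow."""
    return min(bucket_capacity, tokens + fill_amount)


def admitted_after_idle(fill_amount, bucket_capacity, idle_cycles, burst_size):
    """Count how many requests of a sudden burst pass without queuing."""
    tokens = 0.0
    for _ in range(idle_cycles):        # no traffic: tokens accumulate
        tokens = refill(tokens, fill_amount, bucket_capacity)
    admitted = 0
    for _ in range(burst_size):         # assumed cost: one token per request
        if tokens >= 1:
            tokens -= 1
            admitted += 1
    return admitted


# bucket_capacity > fill_amount: saved-up tokens absorb the burst.
bursty = admitted_after_idle(fill_amount=2, bucket_capacity=10,
                             idle_cycles=5, burst_size=8)
# bucket_capacity == fill_amount: only one cycle's worth of tokens is available.
strict = admitted_after_idle(fill_amount=2, bucket_capacity=2,
                             idle_cycles=5, burst_size=8)
print(bursty, strict)
```

With the larger bucket, all eight burst requests pass immediately. With bucket_capacity equal to fill_amount, only two pass and the remaining six would enter the priority scheduler queue.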

QuotaSchedulingPolicySpec

QuotaSchedulingPolicySpec is the top-level configuration, defined in the spec field of the CRD.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| quota_scheduler | QuotaScheduler | Yes | Rate limiter and priority scheduler configuration for quota-based request scheduling. |

QuotaScheduler

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| fill_amount | double | Yes | Number of tokens added to the bucket per fill cycle. Combined with interval in rate_limiter, this value determines the sustained request rate. |
| bucket_capacity | double | Yes | Maximum number of tokens the bucket can hold. When the request rate is lower than the fill rate, tokens accumulate up to this limit. Set this value higher than fill_amount to allow burst traffic, or equal to fill_amount to enforce a strict rate limit with no bursts. |
| rate_limiter | RateLimiterParameters | Yes | Rate limiter settings, including the fill interval that works with fill_amount to set the effective rate limit. For the full type definition, see RateLimitingPolicy field reference. |
| scheduler | Scheduler | Yes | Priority-based scheduler that queues and reorders requests when the rate limit is exceeded. For the full type definition, see AverageLatencyScheduledPolicy field reference. |
| selectors | []Selector | Yes | Match rules that determine which requests are subject to quota-based scheduling. For the full type definition, see AverageLatencyScheduledPolicy field reference. |
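As a quick sanity check on a draft configuration, the Python sketch below lists any required QuotaScheduler fields that are missing and computes the sustained rate as fill_amount divided by the fill interval. The helper names and the example values are hypothetical. Representing the interval as a plain number of seconds is an assumption made here; this page does not specify the format of the interval value, and the nested rate_limiter, scheduler, and selectors contents are left as empty placeholders because their types are defined in the referenced documents.

```python
REQUIRED_FIELDS = (
    "fill_amount",
    "bucket_capacity",
    "rate_limiter",
    "scheduler",
    "selectors",
)


def missing_fields(quota_scheduler):
    """Return the required QuotaScheduler fields absent from a draft dict."""
    return [name for name in REQUIRED_FIELDS if name not in quota_scheduler]


def sustained_rate(fill_amount, interval_seconds):
    """Tokens per second once the bucket has drained (steady state)."""
    return fill_amount / interval_seconds


# Hypothetical draft; nested values are placeholders, not real settings.
draft = {
    "fill_amount": 10,
    "bucket_capacity": 20,
    "rate_limiter": {},
    "scheduler": {},
}

missing = missing_fields(draft)                 # selectors was left out
rate = sustained_rate(draft["fill_amount"], 2)  # assumed 2-second interval
```

In this example, 10 tokens every 2 seconds gives a sustained rate of 5 requests per second, while a bucket_capacity of 20 would let up to 20 requests pass at once after a quiet period.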
