The QuotaSchedulingPolicy CustomResourceDefinition (CRD), part of the Service Mesh (ASM) traffic scheduling suite, defines a policy for priority-based request scheduling after a specified request quota is reached.
How it works
QuotaSchedulingPolicy combines two mechanisms:
Rate limiter: Controls the request rate using a token bucket algorithm. Tokens are added to a bucket at a fixed rate. When the bucket is empty, the rate limit takes effect.
Priority scheduler: Queues requests that exceed the rate limit and processes them in priority order. Higher-priority requests are dequeued first.
When a request arrives, the rate limiter checks the token bucket. If a token is available, the request passes through immediately. Otherwise, the priority scheduler queues the request and processes it based on its priority level.
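The admission flow described above can be sketched as a small Python model. This is an illustrative toy, not the ASM implementation: the class name `QuotaSchedulerSketch`, its methods, and the first-in-first-out tie-break within a priority level are all assumptions made for the example, and bucket capacity is left out to keep the focus on the pass-or-queue decision.

```python
import heapq
import itertools


class QuotaSchedulerSketch:
    """Toy model (hypothetical): a token bucket in front of a priority queue."""

    def __init__(self, tokens):
        self.tokens = tokens            # tokens currently in the bucket
        self._queue = []                # min-heap of (-priority, seq, request)
        self._seq = itertools.count()   # assumed tie-break: FIFO within a priority

    def on_request(self, request, priority):
        """Pass the request if a token is available, otherwise queue it."""
        if self.tokens >= 1:
            self.tokens -= 1
            return "passed"
        # Negate the priority so the highest priority sits at the top of the min-heap.
        heapq.heappush(self._queue, (-priority, next(self._seq), request))
        return "queued"

    def on_refill(self, amount):
        """Add tokens, then release queued requests, highest priority first."""
        self.tokens += amount
        released = []
        while self._queue and self.tokens >= 1:
            _, _, request = heapq.heappop(self._queue)
            self.tokens -= 1
            released.append(request)
        return released


if __name__ == "__main__":
    s = QuotaSchedulerSketch(tokens=1)
    print(s.on_request("a", priority=50))    # passed
    print(s.on_request("b", priority=50))    # queued
    print(s.on_request("c", priority=200))   # queued
    print(s.on_refill(2))                    # ['c', 'b']
```

In this model, request `c` arrives after `b` but is released first because its priority is higher. The real scheduler may order requests differently in detail; see the Scheduler type definition for the authoritative behavior.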
Token bucket behavior
The rate limiter uses a token bucket to control the request rate. The fill_amount and bucket_capacity fields interact as follows:
The bucket receives fill_amount tokens at each interval (configured in rate_limiter).
Tokens accumulate up to the bucket_capacity limit. Excess tokens are discarded.
When the bucket is empty, new requests enter the priority scheduler queue.
Setting bucket_capacity higher than fill_amount allows short bursts of traffic that exceed the steady-state rate. To disallow bursts entirely, set bucket_capacity equal to fill_amount.
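To make the interaction between fill_amount and bucket_capacity concrete, the following Python sketch simulates an idle period followed by a burst. The `TokenBucket` class and the `admitted_after_idle` helper are hypothetical names used only for this illustration, and the sketch assumes that the bucket starts empty and that each request consumes one token.

```python
class TokenBucket:
    """Minimal token bucket (illustrative only, not the ASM implementation)."""

    def __init__(self, fill_amount, bucket_capacity, initial_tokens=0.0):
        self.fill_amount = fill_amount
        self.bucket_capacity = bucket_capacity
        # Assumption: the bucket starts empty unless told otherwise.
        self.tokens = min(initial_tokens, bucket_capacity)

    def tick(self):
        """One fill cycle: add fill_amount tokens, discard any excess over capacity."""
        self.tokens = min(self.tokens + self.fill_amount, self.bucket_capacity)

    def try_take(self):
        """Consume one token if available; return whether the request passes."""
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def admitted_after_idle(fill_amount, bucket_capacity, idle_intervals, burst_size):
    """Count how many requests of a sudden burst pass after an idle period."""
    bucket = TokenBucket(fill_amount, bucket_capacity)
    for _ in range(idle_intervals):
        bucket.tick()
    return sum(bucket.try_take() for _ in range(burst_size))


if __name__ == "__main__":
    # Capacity above fill_amount: saved-up tokens absorb part of the burst.
    print(admitted_after_idle(10, 30, idle_intervals=5, burst_size=50))  # 30
    # Capacity equal to fill_amount: at most one cycle's worth of tokens passes.
    print(admitted_after_idle(10, 10, idle_intervals=5, burst_size=50))  # 10
```

In both runs the remaining requests of the burst would enter the priority scheduler queue rather than pass immediately.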
QuotaSchedulingPolicySpec
QuotaSchedulingPolicySpec is the top-level configuration, set in the spec field of a QuotaSchedulingPolicy resource.
| Field | Type | Required | Description |
|---|---|---|---|
| quota_scheduler | QuotaScheduler | Yes | Rate limiter and priority scheduler configuration for quota-based request scheduling. |
QuotaScheduler
| Field | Type | Required | Description |
|---|---|---|---|
| fill_amount | double | Yes | Number of tokens added to the bucket per fill cycle. Combined with interval in rate_limiter, this value determines the sustained request rate. |
| bucket_capacity | double | Yes | Maximum number of tokens the bucket can hold. When the request rate is lower than the fill rate, tokens accumulate up to this limit. Set this value higher than fill_amount to allow burst traffic, or equal to fill_amount to enforce a strict rate limit with no bursts. |
| rate_limiter | RateLimiterParameters | Yes | Rate limiter settings, including the fill interval that works with fill_amount to set the effective rate limit. For the full type definition, see RateLimitingPolicy field reference. |
| scheduler | Scheduler | Yes | Priority-based scheduler that queues and reorders requests when the rate limit is exceeded. For the full type definition, see AverageLatencyScheduledPolicy field reference. |
| selectors | []Selector | Yes | Match rules that determine which requests are subject to quota-based scheduling. For the full type definition, see AverageLatencyScheduledPolicy field reference. |
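As a rough aid for reading the two tables together, the following Python sketch mirrors the nesting of the spec as a plain dictionary and checks that every field marked as required is present. The helper names `missing_required_fields` and `sustained_rate` are hypothetical. The `"1s"` interval string is an assumed format, and the scheduler and selectors values are left empty because their types are defined in other references.

```python
# Fields of QuotaScheduler that the table above marks as required.
REQUIRED_QUOTA_SCHEDULER_FIELDS = (
    "fill_amount",
    "bucket_capacity",
    "rate_limiter",
    "scheduler",
    "selectors",
)


def missing_required_fields(spec):
    """Return dotted paths of required fields that are absent from a spec dict."""
    quota_scheduler = spec.get("quota_scheduler")
    if quota_scheduler is None:
        return ["quota_scheduler"]
    return [
        f"quota_scheduler.{name}"
        for name in REQUIRED_QUOTA_SCHEDULER_FIELDS
        if name not in quota_scheduler
    ]


def sustained_rate(fill_amount, interval_seconds):
    """Sustained request rate in requests per second: tokens per cycle / cycle length."""
    return fill_amount / interval_seconds


# Illustrative spec layout; values are placeholders, not recommendations.
example_spec = {
    "quota_scheduler": {
        "fill_amount": 10,
        "bucket_capacity": 20,
        "rate_limiter": {"interval": "1s"},  # assumed duration format
        "scheduler": {},                      # see the Scheduler type definition
        "selectors": [],                      # see the Selector type definition
    }
}

if __name__ == "__main__":
    print(missing_required_fields(example_spec))                             # []
    print(missing_required_fields({"quota_scheduler": {"fill_amount": 10}}))
    print(sustained_rate(10, 1.0))                                           # 10.0
```

This check only covers presence of the top-level fields; it does not validate the nested RateLimiterParameters, Scheduler, or Selector types.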
See also
RateLimitingPolicy field reference: RateLimiterParameters type definition and interval configuration
AverageLatencyScheduledPolicy field reference: Scheduler and Selector type definitions