The AverageLatencySchedulingPolicy Custom Resource Definition (CRD) configures latency-based adaptive load scheduling in Alibaba Cloud Service Mesh (ASM). It uses an Additive Increase Multiplicative Decrease (AIMD) algorithm to throttle request rates based on observed latency, protecting services from overload while maximizing throughput during recovery.
This reference describes every field in the CRD, grouped by object type.
AverageLatencySchedulingPolicySpec
Top-level specification for the policy resource.
| Field | Type | Required | Description |
|---|---|---|---|
| load_scheduling_core | LoadSchedulingCore | Yes | Core scheduler configuration. |
LoadSchedulingCore
Container for the load scheduling algorithm configuration.
| Field | Type | Required | Description |
|---|---|---|---|
| aimd_load_scheduler | AimdLoadScheduler | Yes | AIMD-based load scheduler configuration. |
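The nesting of these objects can be sketched as the following spec fragment. It is an illustration assembled only from fields documented in this reference: the top-level spec key is assumed from standard Kubernetes resource layout, apiVersion, kind, and metadata are omitted, and the numeric values are the documented defaults rather than recommendations.

```yaml
# Sketch of the object nesting only; values are the documented defaults.
spec:
  load_scheduling_core:
    aimd_load_scheduler:
      gradient:
        slope: -1
        min_gradient: 0.1
        max_gradient: 1
      load_multiplier_linear_increment: 0.025
      max_load_multiplier: 2
      load_scheduler:
        workload_latency_based_tokens: false
        selectors:
          - control_point: ingress
```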
Selector
Determines which traffic flows the policy acts on, based on control point, flow labels, agent group, and service identity.

    control_point: ingress
    label_matcher:
      match_labels:
        user_tier: gold
      match_list:
        - key: query
          operator: In
          values:
            - insert
            - delete
      expression:
        label_matches:
          label: user_agent
          regex: ^(?!.*Chrome).*Safari

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| control_point | string | No | ingress | Location within the service where the policy takes effect. Valid values: ingress, egress, a specific listener, or a specific filter chain. |
| label_matcher | LabelMatcher | No | -- | Matches traffic flows based on flow labels. |
| service | string | No | any | Fully Qualified Domain Name (FQDN) of the target service. |
LabelMatcher
Matches traffic flows using one or more of three methods:
- Match labels: exact key-value matching.
- Match list: operator-based matching (In, NotIn, Exists, DoesNotExist).
- Arbitrary expression: logical combinations with regex support.
When multiple methods are specified, they are combined with logical AND. An empty LabelMatcher matches all requests.
The following example excludes health check and metrics endpoints from scheduling. These endpoints do not represent real user traffic and can skew latency calculations:

    label_matcher:
      match_list:
        - key: http.target
          operator: NotIn
          values:
            - /health
            - /live
            - /ready
            - /metrics

| Field | Type | Required | Description |
|---|---|---|---|
| expression | Expression | No | Logical expression evaluated on flow labels. |
| match_labels | map[string]string | No | Key-value pairs for exact label matching. |
| match_list | []MatchRequirement | No | List of operator-based matching requirements. |
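As a hedged illustration of the AND semantics described for LabelMatcher, the fragment below sets two matching methods at once. The label keys are reused from the examples in this reference, and the /checkout path mentioned in the comments is hypothetical.

```yaml
label_matcher:
  # Condition 1: the flow label user_tier must equal gold.
  match_labels:
    user_tier: gold
  # Condition 2: the request path must not be a probe endpoint.
  match_list:
    - key: http.target
      operator: NotIn
      values:
        - /health
        - /ready
# A gold-tier request to /checkout satisfies both conditions and is matched.
# A gold-tier request to /health fails condition 2 and is not matched.
```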
Expression
A composable logical expression. Only one field can be set per Expression object.

    # Match flows where label "foo" exists AND label "app" equals "frobnicator"
    all:
      of:
        - label_exists: foo
        - label_equals:
            label: app
            value: frobnicator

| Field | Type | Description |
|---|---|---|
| all | ExpressionList | True when all subexpressions are true (logical AND). |
| any | ExpressionList | True when any subexpression is true (logical OR). |
| label_equals | EqualExpression | True when the specified label equals the given value. |
| label_exists | string | True when a label with the given name exists. |
| label_matches | MatchesExpression | True when the specified label matches the given regular expression. |
| not | Expression | Negates the result of the subexpression. |
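Because each Expression sets exactly one field, richer conditions are built by nesting. The sketch below combines any and not, using only the fields listed in the table; the label names user_tier and bot are illustrative assumptions.

```yaml
# Match flows where user_tier equals gold OR the label "bot" is absent.
any:
  of:
    - label_equals:
        label: user_tier
        value: gold
    - not:
        label_exists: bot
```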
ExpressionList
| Field | Type | Description |
|---|---|---|
| of | []Expression | List of subexpressions. |
EqualExpression
| Field | Type | Required | Description |
|---|---|---|---|
| label | string | Yes | Label key. |
| value | string | No | Label value to match. |
MatchesExpression
| Field | Type | Required | Description |
|---|---|---|---|
| label | string | Yes | Label key to match against. |
| regex | string | Yes | Regular expression pattern. Uses Go RE2 syntax. |
MatchRequirement
Operator-based label matching, following the same pattern as Kubernetes label selectors.
| Field | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | Label key. |
| operator | enum | Yes | Matching operator. Valid values: In, NotIn, Exists, DoesNotExist. |
| values | []string | Conditional | String values to match against. Required for In and NotIn. Must be empty for Exists and DoesNotExist. |
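The rule for the values field can be illustrated with a short match_list fragment. The label keys and values are hypothetical, and treating multiple entries as a logical AND is an assumption carried over from Kubernetes label selectors rather than something this reference states.

```yaml
match_list:
  # In requires a non-empty values list.
  - key: user_tier
    operator: In
    values:
      - gold
      - silver
  # DoesNotExist tests only for the absence of the key, so values is omitted.
  - key: debug
    operator: DoesNotExist
```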
AimdLoadScheduler
The AIMD load scheduler uses a gradient controller to adjust the token rate based on how far the observed latency signal deviates from the setpoint. When the system is overloaded, it multiplicatively decreases the token rate. During recovery, it linearly increases the rate until the system reaches steady state.
| Field | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| gradient | GradientControllerParameters | Yes | -- | -- | Gradient controller parameters. See GradientControllerParameters. |
| load_multiplier_linear_increment | float64 | No | 0.025 | Minimum: 0 | Linear increment applied to the load multiplier every 10 seconds while the system is not overloaded, until max_load_multiplier is reached. |
| load_scheduler | LoadSchedulerParameters | Yes | -- | -- | Workload scheduler and selector configuration. See LoadSchedulerParameters. |
| max_load_multiplier | float64 | No | 2 | Minimum: 0 | Maximum load multiplier during recovery from an overloaded state. When this value is reached, the scheduler enters pass-through mode: requests bypass the scheduler and go directly to the service. Pass-through mode is disabled if the system becomes overloaded again. This field protects the service from request bursts while recovering. |
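A rough worked example of the recovery timing implied by the defaults: assume, hypothetically, that the load multiplier starts recovery at $m_0 = 1.0$ (this reference does not state the starting value). With the default increment $\Delta = 0.025$ applied every 10 seconds and the default maximum $m_{\max} = 2$:

$$
n = \frac{m_{\max} - m_0}{\Delta} = \frac{2 - 1.0}{0.025} = 40,
\qquad
t \approx n \times 10\,\text{s} = 400\,\text{s}
$$

Under that assumption, the scheduler would need about 40 increments, or a little under 7 minutes without overload, before it reaches pass-through mode. A larger increment shortens this window at the cost of a steeper ramp.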
GradientControllerParameters
The gradient controller adjusts the load multiplier based on the ratio of the observed signal to the setpoint. The slope field is the exponent applied to this ratio, controlling how aggressively the controller responds.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| max_gradient | float64 | No | 1 | Upper bound for the gradient value. |
| min_gradient | float64 | No | 0.1 | Lower bound for the gradient value. |
| slope | float64 | No | -1 | Exponent on the signal-to-setpoint ratio. Controls the direction and aggressiveness of the response. |
How slope works:
| Value | Behavior |
|---|---|
| 1 | Signal too high --> increase the control variable. |
| -1 | Signal too high --> decrease the control variable. |
| -0.5 | Signal too high --> decrease the control variable gradually. |
Keep the absolute value of slope at or below 1.
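The description above can be written as a formula. This is a sketch based on the field descriptions only; reading min_gradient and max_gradient as a clamp on the result is an assumption, not a statement of the exact implementation.

$$
\text{gradient} = \min\!\Bigl(\text{max\_gradient},\ \max\!\Bigl(\text{min\_gradient},\ \Bigl(\frac{\text{signal}}{\text{setpoint}}\Bigr)^{\text{slope}}\Bigr)\Bigr)
$$

For an illustrative case where the observed latency is twice the setpoint (ratio 2), with the default bounds of 0.1 and 1:

$$
\text{slope} = -1:\quad 2^{-1} = 0.5
\qquad\qquad
\text{slope} = -0.5:\quad 2^{-0.5} \approx 0.71
$$

When the latency is half the setpoint (ratio 0.5) and slope is -1, the raw value is $0.5^{-1} = 2$, which the default max_gradient would cap at 1.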
LoadSchedulerParameters
Configures the Weighted Fair Queuing (WFQ)-based workload scheduler and specifies which traffic flows the policy applies to.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| scheduler | Scheduler | No | -- | WFQ-based workload scheduler configuration. |
| selectors | []Selector | Yes | -- | Selectors that scope which traffic flows the scheduler manages. |
| workload_latency_based_tokens | bool | No | false | Automatically estimate token cost per flow based on historical latency. Tokens are set to the average latency of flows in each workload over the last few seconds. Useful for concurrency limiting, where concurrency = average latency × in-flight flows. Explicit tokens in flow labels take highest precedence, followed by tokens in the workload definition, then estimated tokens. |
Scheduler
| Field | Type | Required | Description |
|---|---|---|---|
| workloads | []SchedulerWorkload | Yes | Workload definitions for the WFQ scheduler. |
SchedulerWorkload
Defines a workload -- a group of traffic flows that share the same scheduling parameters.
| Field | Type | Required | Description |
|---|---|---|---|
| label_matcher | LabelMatcher | No | Matches traffic flows to this workload based on flow labels. |
| name | string | Yes | Workload name. |
| parameters | SchedulerWorkloadParameters | Yes | Scheduling parameters for flows matching the label matcher. |
SchedulerWorkloadParameters
Controls how traffic flows within a workload are prioritized and queued.
| Field | Type | Required | Description |
|---|---|---|---|
| priority | float64 | No | Priority of the traffic flows in the workload, used by the WFQ scheduler. |
| queue_timeout | string | No | Timeout for traffic flows in the workload while they wait in the scheduler queue. |
| tokens | float64 | No | Cost of admitting a single flow. Typically defined as milliseconds of expected latency (time to response), or set to 1 when the resource is constrained by flow count (for example, third-party rate limiters). This value is used only when tokens are not specified in flow labels. |
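Putting the scheduler objects together, a load_scheduler block might look like the sketch below. The workload names, label values, and numbers are hypothetical. The lowercase name and parameters keys and the duration-style queue_timeout strings are assumptions made for consistency with the other fields in this reference.

```yaml
load_scheduler:
  selectors:
    - control_point: ingress
  scheduler:
    workloads:
      - name: interactive            # hypothetical workload name
        label_matcher:
          match_labels:
            user_tier: gold
        parameters:
          priority: 200              # illustrative value
          tokens: 50                 # about 50 ms of expected latency per flow
          queue_timeout: 2s          # duration format is an assumption
      - name: background             # hypothetical workload name
        label_matcher:
          match_labels:
            user_tier: free
        parameters:
          priority: 50               # illustrative value
          tokens: 1                  # count-constrained: each flow costs one token
          queue_timeout: 500ms       # duration format is an assumption
```

Because tokens are set explicitly on both workloads, they would take precedence over latency-based estimates, and would themselves be overridden by any tokens carried in flow labels.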