LLMRoute is a Kubernetes Custom Resource Definition (CRD) provided by Alibaba Cloud Service Mesh (ASM) for declarative LLM traffic routing. Define match conditions based on request headers, source labels, or gateways, then route traffic to external Large Language Model (LLM) providers or in-cluster inference pools.
Unlike a standard Istio VirtualService, LLMRoute is purpose-built for LLM workloads. It natively supports provider-host routing and InferencePool backends, eliminating the need for complex virtual service configurations.
For a complete walkthrough, see Traffic routing: Use ASM to manage LLM traffic.
Sample configuration
The following manifest creates an LLMRoute that sends subscriber traffic to DashScope through a dedicated rule. All other traffic falls through to the default backend:
    apiVersion: istio.alibabacloud.com/v1beta1
    kind: LLMRoute
    metadata:
      name: dashscope-route
    spec:
      host: dashscope.aliyuncs.com      # Must be unique across LLM providers
      rules:
      - name: vip-route
        matches:
        - headers:
            user-type:
              exact: subscriber         # Match only subscriber requests
        backendRefs:
        - providerHost: dashscope.aliyuncs.com
      - backendRefs:                    # Default rule (no match conditions)
        - providerHost: dashscope.aliyuncs.com

Type hierarchy
The following diagram shows how the CRD types relate to each other:
    LLMRoute
    ├── host (String)
    ├── gateways ([]String)
    └── rules ([]LLMRule)
        ├── name (String)
        ├── matches ([]LLMRequestMatch)
        │   ├── Headers (map[String]StringMatch)
        │   ├── SourceLabels (map[String]String)
        │   └── Gateways ([]String)
        └── backendRefs ([]LLMBackendRef)
            ├── ProviderHost (String) * mutually exclusive
            ├── Weight (Int32)
            └── BackendRef (BackendObjectReference) * mutually exclusive
                ├── Group (String)
                ├── Kind (String)
                ├── Name (String)
                ├── Namespace (String)
                └── Port (Int32)

LLMRoute
Top-level resource that binds a destination host to a set of routing rules.
| Field | Type | Description |
|---|---|---|
| host | String | Destination host URL. Must be unique across different LLM providers. |
| gateways | []String | Gateways to which these rules apply. Behaves the same as gateways in an Istio VirtualService. |
| rules | []LLMRule | Ordered list of routing rules. |
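As a minimal sketch of the gateways field, the following manifest scopes a route to a single ingress gateway. The gateway name llm-gateway and its istio-system namespace are hypothetical, and the namespace/name form is an assumption carried over from how an Istio VirtualService references gateways:

```yaml
apiVersion: istio.alibabacloud.com/v1beta1
kind: LLMRoute
metadata:
  name: dashscope-route
spec:
  host: dashscope.aliyuncs.com
  gateways:
  - istio-system/llm-gateway   # Hypothetical gateway; namespace/name form assumed from VirtualService
  rules:
  - name: default
    backendRefs:
    - providerHost: dashscope.aliyuncs.com
```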
LLMRule
A single routing rule that pairs match conditions with backend targets.
Appears in: LLMRoute.rules
| Field | Type | Description |
|---|---|---|
| name | String | Rule name. Use a descriptive name (for example, vip-route or default). |
| matches | []LLMRequestMatch | Conditions a request must satisfy for this rule to apply. |
| backendRefs | []LLMBackendRef | Backends that receive traffic matched by this rule. |
Example -- route subscriber traffic to a specific provider:
    rules:
    - name: vip-route
      matches:
      - headers:
          user-type:
            exact: subscriber
      backendRefs:
      - providerHost: dashscope.aliyuncs.com

LLMRequestMatch
Conditions a request must meet.
Appears in: LLMRule.matches
| Field | Type | Description |
|---|---|---|
| Headers | map[String]StringMatch | HTTP headers to match against. Each entry defines a header name and a match criterion (exact, prefix, or regex). |
| SourceLabels | map[String]String | Labels on the source workload that requests must match. |
| Gateways | []String | Gateways to scope this match to. Limits the rule to traffic entering through specific gateways. |
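The three match fields can be sketched together as follows. The lowercase keys sourceLabels and gateways are assumed to mirror the headers key used in the sample configuration, and the header name, workload label, and gateway name are hypothetical. How multiple conditions inside one match entry combine is not stated in this reference, so verify the behavior in a test environment before relying on it:

```yaml
rules:
- name: internal-batch
  matches:
  - headers:
      x-model-tier:              # Hypothetical header name
        prefix: premium          # StringMatch also accepts exact or regex
    sourceLabels:
      app: batch-client          # Hypothetical label on the calling workload
    gateways:
    - istio-system/llm-gateway   # Hypothetical gateway name
  backendRefs:
  - providerHost: dashscope.aliyuncs.com
```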
LLMBackendRef
References a backend that serves matched requests. List multiple backends with different weights to split traffic.
Appears in: LLMRule.backendRefs
| Field | Type | Description |
|---|---|---|
| ProviderHost | String | Host URL of an external LLM provider (for example, dashscope.aliyuncs.com). Mutually exclusive with BackendRef. |
| Weight | Int32 | Relative traffic weight. When multiple backends are listed, traffic is distributed proportionally by weight. |
| BackendRef | BackendObjectReference | Reference to an in-cluster backend object. Mutually exclusive with ProviderHost. |
Set either ProviderHost or BackendRef per entry, not both:
- Use ProviderHost to route to an external LLM API.
- Use BackendRef to route to an in-cluster InferencePool.
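A weighted split might look like the sketch below, which sends 80 of every 100 weight units of matched traffic to the external provider and the remaining 20 to an in-cluster pool. The lowercase keys weight and backendRef (and its nested keys) are assumed to follow the same convention as providerHost in the sample configuration. The pool name is hypothetical and the API group is an assumption:

```yaml
rules:
- name: split-route
  backendRefs:
  - providerHost: dashscope.aliyuncs.com
    weight: 80                               # 80 / (80 + 20) of matched traffic
  - backendRef:
      group: inference.networking.x-k8s.io   # Assumed InferencePool API group
      kind: InferencePool
      name: example-pool                     # Hypothetical pool name
    weight: 20                               # 20 / (80 + 20) of matched traffic
```

Each entry sets exactly one of providerHost or backendRef, in line with the mutual-exclusion rule above.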
BackendObjectReference
Identifies an in-cluster backend object by its Kubernetes coordinates.
Appears in: LLMBackendRef.BackendRef
| Field | Type | Description |
|---|---|---|
| Group | String | The group to which the backend object belongs. |
| Kind | String | The type of the backend object. |
| Name | String | Resource name. |
| Namespace | String | Namespace where the resource resides. |
| Port | Int32 | Port exposed by the backend. |
Currently, only InferencePool resources can be referenced as backend objects.
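For illustration, a rule that targets an InferencePool with all five fields set could be written as below. The lowercase key names are assumed to match the convention used in the sample configuration, the API group is an assumption to check against the InferencePool CRD installed in the cluster, and the name, namespace, and port are hypothetical values:

```yaml
rules:
- name: in-cluster
  backendRefs:
  - backendRef:
      group: inference.networking.x-k8s.io   # Assumed API group; check the installed CRD
      kind: InferencePool                    # Only supported kind per this reference
      name: example-pool                     # Hypothetical pool name
      namespace: default                     # Hypothetical namespace
      port: 8000                             # Hypothetical port
```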
What's next
- Traffic routing: Use ASM to manage LLM traffic -- end-to-end example of configuring LLMRoute for production LLM traffic
- Istio StringMatch reference -- supported match types for the Headers field (exact, prefix, regex)