Alibaba Cloud Service Mesh: LLMRoute CRD

Last Updated: Mar 11, 2026

LLMRoute is a Kubernetes Custom Resource Definition (CRD) provided by Alibaba Cloud Service Mesh (ASM) for declarative LLM traffic routing. Define match conditions based on request headers, source labels, or gateways, then route traffic to external Large Language Model (LLM) providers or in-cluster inference pools.

Unlike a standard Istio VirtualService, LLMRoute is purpose-built for LLM workloads. It natively supports provider-host routing and InferencePool backends, eliminating the need for complex virtual service configurations.

For a complete walkthrough, see Traffic routing: Use ASM to manage LLM traffic.

Sample configuration

The following manifest creates an LLMRoute that sends subscriber traffic to DashScope through a dedicated rule. All other traffic falls through to the default backend:

apiVersion: istio.alibabacloud.com/v1beta1
kind: LLMRoute
metadata:
  name: dashscope-route
spec:
  host: dashscope.aliyuncs.com      # Must be unique across LLM providers
  rules:
  - name: vip-route
    matches:
    - headers:
        user-type:
          exact: subscriber          # Match only subscriber requests
    backendRefs:
    - providerHost: dashscope.aliyuncs.com
  - backendRefs:                     # Default rule (no match conditions)
    - providerHost: dashscope.aliyuncs.com

Type hierarchy

The following diagram shows how the CRD types relate to each other:

LLMRoute
 ├── host (String)
 ├── gateways ([]String)
 └── rules ([]LLMRule)
      ├── name (String)
      ├── matches ([]LLMRequestMatch)
      │    ├── Headers (map[String]StringMatch)
      │    ├── SourceLabels (map[String]String)
      │    └── Gateways ([]String)
      └── backendRefs ([]LLMBackendRef)
           ├── ProviderHost (String)               * mutually exclusive
           ├── Weight (Int32)
           └── BackendRef (BackendObjectReference)  * mutually exclusive
                ├── Group (String)
                ├── Kind (String)
                ├── Name (String)
                ├── Namespace (String)
                └── Port (Int32)

LLMRoute

Top-level resource that binds a destination host to a set of routing rules.

| Field | Type | Description |
| --- | --- | --- |
| host | String | Destination host URL. Must be unique across different LLM providers. |
| gateways | []String | Gateways to which these rules apply. Behaves the same as gateways in an Istio VirtualService. |
| rules | []LLMRule | Ordered list of routing rules. |
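
To illustrate the gateways field, the sketch below limits a route to traffic that enters through a single gateway. The resource name and the gateway name istio-system/ingressgateway are hypothetical, and the namespace/name format is assumed from the Istio VirtualService convention that this field is described as following.

```yaml
apiVersion: istio.alibabacloud.com/v1beta1
kind: LLMRoute
metadata:
  name: dashscope-gateway-route        # hypothetical resource name
spec:
  host: dashscope.aliyuncs.com
  gateways:
  - istio-system/ingressgateway        # hypothetical gateway; format assumed from VirtualService
  rules:
  - name: default
    backendRefs:
    - providerHost: dashscope.aliyuncs.com
```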

LLMRule

A single routing rule that pairs match conditions with backend targets.

Appears in: LLMRoute.rules

| Field | Type | Description |
| --- | --- | --- |
| name | String | Rule name. Use a descriptive name (for example, vip-route or default). |
| matches | []LLMRequestMatch | Conditions a request must satisfy for this rule to apply. |
| backendRefs | []LLMBackendRef | Backends that receive traffic matched by this rule. |

Example: route subscriber traffic to a specific provider.

rules:
- name: vip-route
  matches:
  - headers:
      user-type:
        exact: subscriber
  backendRefs:
  - providerHost: dashscope.aliyuncs.com

LLMRequestMatch

Conditions a request must meet.

Appears in: LLMRule.matches

| Field | Type | Description |
| --- | --- | --- |
| Headers | map[String]StringMatch | HTTP headers to match against. Each entry defines a header name and a match criterion (exact, prefix, or regex). |
| SourceLabels | map[String]String | Labels on the source workload that requests must match. |
| Gateways | []String | Gateways to scope this match to. Limits the rule to traffic entering through specific gateways. |
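
A rough sketch of how the three match fields might be combined in one rule follows. The headers key and the exact/prefix/regex criteria come from this reference. The sourceLabels and gateways key spellings are assumed to follow the lowerCamelCase style of the sample configuration, and the rule name, header name x-model, workload label, and gateway name are all hypothetical.

```yaml
rules:
- name: internal-premium               # hypothetical rule name
  matches:
  - headers:
      user-type:
        prefix: sub                    # header value starts with "sub"
    sourceLabels:                      # assumed key spelling
      app: chat-frontend               # hypothetical label on the calling workload
  - headers:
      x-model:                         # hypothetical header name
        regex: "qwen-.*"
    gateways:                          # assumed key spelling
    - istio-system/ingressgateway      # hypothetical gateway name
  backendRefs:
  - providerHost: dashscope.aliyuncs.com
```

This reference does not state how conditions combine. Istio VirtualService requires all conditions within one match entry and accepts any one entry, but treat that as an assumption for LLMRoute until verified.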

LLMBackendRef

References a backend that serves matched requests. List multiple backends with different weights to split traffic.

Appears in: LLMRule.backendRefs

| Field | Type | Description |
| --- | --- | --- |
| ProviderHost | String | Host URL of an external LLM provider (for example, dashscope.aliyuncs.com). Mutually exclusive with BackendRef. |
| Weight | Int32 | Relative traffic weight. When multiple backends are listed, traffic is distributed proportionally by weight. |
| BackendRef | BackendObjectReference | Reference to an in-cluster backend object. Mutually exclusive with ProviderHost. |

Important

Set exactly one of ProviderHost or BackendRef per entry, not both.

  • Use ProviderHost to route to an external LLM API.

  • Use BackendRef to route to an in-cluster InferencePool.
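
As a hedged illustration of weighted splitting, the fragment below sends most matched traffic to an external provider and the remainder to an in-cluster pool. Each entry sets only one of the two mutually exclusive fields. The weight and backendRef key spellings, the API group, the pool name, and the port are assumptions, not values confirmed by this reference.

```yaml
backendRefs:
- providerHost: dashscope.aliyuncs.com
  weight: 80                           # 80 / (80 + 20) = 80% of matched traffic
- backendRef:                          # assumed key spelling
    group: inference.networking.x-k8s.io   # assumed API group of the InferencePool CRD
    kind: InferencePool
    name: qwen-pool                    # hypothetical pool name
    port: 8000                         # hypothetical port
  weight: 20                           # 20 / (80 + 20) = 20% of matched traffic
```

Because the weights are described as relative, 4 and 1 should yield the same 80/20 split as 80 and 20.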

BackendObjectReference

Identifies an in-cluster backend object by its Kubernetes coordinates.

Appears in: LLMBackendRef.BackendRef

| Field | Type | Description |
| --- | --- | --- |
| Group | String | The group to which the backend object belongs. |
| Kind | String | The type of the backend object. |
| Name | String | Resource name. |
| Namespace | String | Namespace where the resource resides. |
| Port | Int32 | Port exposed by the backend. |

Note

Currently, only InferencePool resources can be referenced as backend objects.
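
A minimal sketch of a route whose only backend is an in-cluster InferencePool might look like the following. All five reference fields are filled in for illustration. The lowerCamelCase key spellings, the API group, and the resource, pool, namespace, and port values are assumptions; check the group against the InferencePool CRD installed in your cluster (for example, with kubectl api-resources).

```yaml
apiVersion: istio.alibabacloud.com/v1beta1
kind: LLMRoute
metadata:
  name: local-inference-route          # hypothetical resource name
spec:
  host: dashscope.aliyuncs.com
  rules:
  - name: default
    backendRefs:
    - backendRef:                      # assumed key spelling
        group: inference.networking.x-k8s.io   # assumed API group; verify in your cluster
        kind: InferencePool            # the only kind currently supported
        name: qwen-pool                # hypothetical pool name
        namespace: default             # hypothetical namespace
        port: 8000                     # hypothetical port
```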
