Access LLM services through an ASM ingress gateway - Alibaba Cloud Service Mesh

Service Mesh (ASM) allows you to access external Large Language Model (LLM) services through an ingress gateway. Routing requests through an ASM gateway lets you apply gateway features such as traffic splitting, request observability, and authentication and authorization to LLM traffic. This topic describes how a client outside a cluster can access an external LLM service through an ASM ingress gateway.

Overview

Use this approach when clients outside the cluster need to reach external LLM services. The ASM gateway provides routing, security, and observability features and supports LLM traffic management, which lets you integrate with external LLM services quickly and securely.

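In the example in this topic, a request travels from a client outside the cluster to the ASM ingress gateway, which forwards it to the external LLM service at dashscope.aliyuncs.com.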

Prerequisites

You have added a cluster to an ASM instance. The ASM instance must be version 1.22 or later.
You have configured sidecar injection policies.
You have created an ingress gateway and obtained its IP address.
You have activated Alibaba Cloud Model Studio and obtained a valid API key. For more information, see Get an API key.

Step 1: Create the LLMProvider

Create a file named 'LLMProvider.yaml' with the following content.

```
apiVersion: istio.alibabacloud.com/v1beta1
kind: LLMProvider
metadata:
  name: dashscope-qwen
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  host: dashscope.aliyuncs.com
  path: /compatible-mode/v1/chat/completions
  configs:
    defaultConfig:
      openAIConfig:
        model: qwen1.5-72b-chat  # Qwen open-source series large language model
        stream: false
        apiKey: ${API_KEY}  # Replace with your Model Studio API key
```

To create the LLMProvider, run the following command using the kubeconfig file for your ASM cluster.
```
kubectl apply -f LLMProvider.yaml
```
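kubectl apply sends the manifest as written and does not expand shell-style placeholders, so ${API_KEY} has to be replaced with a real key before the command above is run. The following is a minimal Python sketch of one way to do that. The function names `render_manifest` and `render_file` are illustrative and not part of ASM.

```python
from string import Template


def render_manifest(template_text: str, api_key: str) -> str:
    """Return the manifest text with ${API_KEY} replaced by api_key."""
    if not api_key:
        raise ValueError("API key is empty")
    # safe_substitute leaves any other $-placeholders untouched instead of raising.
    return Template(template_text).safe_substitute(API_KEY=api_key)


def render_file(path: str, api_key: str) -> None:
    """Rewrite the manifest at `path` in place with the key filled in."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(render_manifest(text, api_key))
```

For example, `render_file("LLMProvider.yaml", "<your key>")` fills in the placeholder, after which the file can be applied as usual. Because the rendered file contains the key in plain text, it is safer not to commit it to version control.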

Step 2: Create the gateway rule

Create a file named 'ingress-gw.yaml' with the following content.

```
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: ingress-gw
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - hosts:
        - '*'
      port:
        name: http
        number: 80
        protocol: HTTP
```

Run the following command to create the gateway rule.
```
kubectl apply -f ingress-gw.yaml
```

Step 3: Create the LLMRoute

Create a file named 'dashscope-route.yaml' with the following content.

```
apiVersion: istio.alibabacloud.com/v1beta1
kind: LLMRoute
metadata:
  name: dashscope-route
spec:
  host: "*"
  gateways:
  - istio-system/ingress-gw
  rules:
  - name: ingress-route
    matches:
    # Matches requests whose Host is dashscope.aliyuncs.com. Without this entry, a 404 error is returned.
    - headers:
        host:
          exact: dashscope.aliyuncs.com
    # Matches client requests sent with Host test.com. After the ASM LLM plugin processes the request,
    # route matching is triggered again and the request matches the entry above.
    - headers:
        host:
          exact: test.com
    backendRefs:
    - providerHost: dashscope.aliyuncs.com
```

Run the following command to create the LLMRoute.
```
kubectl apply -f dashscope-route.yaml
```
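The route above lists two host matches that lead to the same provider, and any other host falls through to a 404. The following Python sketch is a simplified model of that decision, based only on the comments in the manifest. It is not how ASM implements matching, it does not model the second matching pass after the LLM plugin runs, and the names `ROUTE_HOSTS` and `select_provider` are hypothetical.

```python
from typing import Optional

# Hosts listed under `matches` in dashscope-route.yaml.
ROUTE_HOSTS = {"dashscope.aliyuncs.com", "test.com"}
# Value of `providerHost` under `backendRefs`.
PROVIDER_HOST = "dashscope.aliyuncs.com"


def select_provider(host_header: str) -> Optional[str]:
    """Return the provider host for a request, or None when no match applies.

    None stands for the 404 response that the gateway returns when the
    Host header matches neither entry.
    """
    if host_header in ROUTE_HOSTS:  # exact string comparison, as in `exact:`
        return PROVIDER_HOST
    return None
```

With this model, `select_provider("test.com")` yields the DashScope host, while `select_provider("example.org")` yields `None`.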

Step 4: Test the setup

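Run the following command in your local terminal to test the configuration. Set the INGRESS_GATEWAY_IP shell variable to the IP address of your ingress gateway first, or replace ${INGRESS_GATEWAY_IP} in the command with that address.

```
curl --location "http://${INGRESS_GATEWAY_IP}:80" \
  --header 'Content-Type: application/json' \
  --header 'Host: test.com' \
  --data '{
    "messages": [
        {"role": "user", "content": "Tell me about yourself"}
    ]
}'
```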

Expected output:

```
{"choices":[{"message":{"role":"assistant","content":"I am a Large Language Model from Alibaba Cloud, and my name is Qwen. My main function is to answer user questions, provide information, and engage in conversation. I can understand user queries and generate corresponding answers or suggestions based on natural language. I can also learn new knowledge and apply it to various scenarios. If you have any questions or need help, please feel free to let me know, and I will do my best to support you."},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion","usage":{"prompt_tokens":3,"completion_tokens":72,"total_tokens":75},"created":1720682745,"system_fingerprint":null,"model":"qwen1.5-72b-chat","id":"chatcmpl-3d117bd7-9bfb-9121-9fc2-xxxxxxxxxxxx"}
```

The output indicates that the gateway successfully routed the request.
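
The same request can also be sent from a script. The following Python sketch uses only the standard library to build the request from the curl command above and to read the assistant's reply out of the response. The function names and the 30-second timeout are assumptions made for illustration.

```python
import json
import urllib.request


def build_request(gateway_ip: str, prompt: str, host: str = "test.com") -> urllib.request.Request:
    """Build the POST request that the curl command in this step sends."""
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{gateway_ip}:80/",
        data=body,
        # The Host header selects the LLMRoute match; the URL only carries the gateway IP.
        headers={"Content-Type": "application/json", "Host": host},
        method="POST",
    )


def extract_reply(response_text: str) -> str:
    """Return the assistant message from an OpenAI-compatible chat completion body."""
    payload = json.loads(response_text)
    return payload["choices"][0]["message"]["content"]


def ask(gateway_ip: str, prompt: str, timeout: float = 30.0) -> str:
    """Send the prompt through the gateway and return the reply text."""
    with urllib.request.urlopen(build_request(gateway_ip, prompt), timeout=timeout) as resp:
        return extract_reply(resp.read().decode("utf-8"))
```

Calling `ask("<gateway IP>", "Tell me about yourself")` should return the `content` field shown in the expected output. A non-2xx response raises `urllib.error.HTTPError`, which the caller can catch.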

Step 5: Use ASM gateway security features

This step demonstrates how to create a simple authorization policy to deny access to the LLM service from your local IP address through the ASM gateway.

Create a file named 'auth-policy.yaml' with the following content.

```
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  labels:
    gateway: ingressgateway
  name: auth-policy
  namespace: istio-system
spec:
  action: DENY
  rules:
    - from:
        - source:
            ipBlocks:
              - ${YOUR_LOCAL_IP}  # Replace with your local IP address
      to:
        - operation:
            hosts:
              - test.com
  selector:
    matchLabels:
      istio: ingressgateway
```

Run the following command to deploy the authorization policy.
```
kubectl apply -f auth-policy.yaml
```
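
The policy only behaves as intended if ${YOUR_LOCAL_IP} was replaced with a valid address or CIDR block before the file was applied. The following Python sketch shows one way to validate the value with the standard `ipaddress` module. The function name `to_ip_block` is illustrative.

```python
import ipaddress


def to_ip_block(value: str) -> str:
    """Validate an IP address or CIDR block and return it in CIDR notation.

    Raises ValueError if the value is neither a valid address nor a valid block.
    """
    # strict=False accepts inputs with host bits set (for example 203.0.113.7/24)
    # and normalizes them to the enclosing network.
    network = ipaddress.ip_network(value.strip(), strict=False)
    return str(network)
```

A single address is returned with a full-length prefix, so the result can be used directly as an `ipBlocks` entry.
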
Run the test command from Step 4 again. You should see the following result:
```
RBAC: access denied
```
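
Istio returns this message with HTTP status 403 when a DENY policy matches. The following Python sketch shows how a script could check for that outcome instead of reading the output by eye. The function names and the 10-second timeout are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def is_rbac_denied(status: int, body: str) -> bool:
    """Return True if the response looks like an authorization policy denial."""
    return status == 403 and "RBAC: access denied" in body


def check_denied(request: urllib.request.Request, timeout: float = 10.0) -> bool:
    """Send the request and report whether the gateway denied it."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return is_rbac_denied(resp.status, resp.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses; the body is still readable.
        return is_rbac_denied(err.code, err.read().decode("utf-8", "replace"))
```

Pass `check_denied` a `urllib.request.Request` built with the same URL, Host header, and JSON body as the Step 4 command. It should return `True` once the policy is in effect.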

Note

The security features available for standard HTTP requests on the ASM gateway, such as comprehensive authorization policies, JWT authentication, and custom external authorization services, also apply to LLM requests. By applying these policies to the ingress gateway, you can more effectively secure your applications.