Service Mesh (ASM) lets you access external large language model (LLM) services through an ingress gateway. Routing requests through an ASM gateway gives you traffic splitting, request observability, and authentication and authorization for LLM traffic. This topic describes how a client outside a cluster can access an external LLM service through an ASM ingress gateway.
Overview
Use an ingress gateway when clients outside the cluster need to reach an external LLM service. The ASM gateway applies its routing, security, and observability features to LLM traffic, so you can integrate with external LLM services quickly and securely.
In the example in this topic, a client outside the cluster sends a request with the Host header test.com to the ASM ingress gateway, and the gateway forwards the request to the external LLM service at dashscope.aliyuncs.com.
Prerequisites
- You have added a cluster to an ASM instance. The ASM instance must be version 1.22 or later.
- You have configured sidecar injection policies.
- You have created an ingress gateway and obtained its IP address.
- You have activated Alibaba Cloud Model Studio and obtained a valid API key. For more information, see Get an API key.
Step 1: Create the LLMProvider
- Create a file named LLMProvider.yaml with the following content. Replace ${API_KEY} with the API key that you obtained from Model Studio.

    apiVersion: istio.alibabacloud.com/v1beta1
    kind: LLMProvider
    metadata:
      name: dashscope-qwen
      namespace: istio-system
    spec:
      workloadSelector:
        labels:
          istio: ingressgateway
      host: dashscope.aliyuncs.com
      path: /compatible-mode/v1/chat/completions
      configs:
        defaultConfig:
          openAIConfig:
            model: qwen1.5-72b-chat  # Qwen open-source series large language model
            stream: false
            apiKey: ${API_KEY}

- Run the following command with the kubeconfig file of your ASM instance to create the LLMProvider.

    kubectl apply -f LLMProvider.yaml
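kubectl apply does not expand shell-style placeholders, so the literal ${API_KEY} in LLMProvider.yaml has to be replaced before the manifest is applied. The following is a minimal sketch of one way to do that without writing the key into the file. The function name render_manifest is hypothetical and is not part of ASM or kubectl; the sketch assumes that the key is exported in the API_KEY environment variable.

```shell
#!/usr/bin/env bash
# render_manifest (hypothetical helper): print a manifest with every literal
# ${API_KEY} placeholder replaced by the value of the API_KEY environment variable.
render_manifest() {
  local file="$1"
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY must be set" >&2
    return 1
  fi
  # Escape characters that are special in a sed replacement: \, &, and the | delimiter.
  local escaped
  escaped=$(printf '%s' "$API_KEY" | sed -e 's/[\\&|]/\\&/g')
  # [$] matches a literal dollar sign, so the pattern is the text ${API_KEY}.
  sed -e 's|[$]{API_KEY}|'"$escaped"'|g' "$file"
}

# Usage (requires access to the cluster through your kubeconfig file):
#   API_KEY='<your-api-key>' render_manifest LLMProvider.yaml | kubectl apply -f -
```

Piping the rendered manifest to kubectl apply -f - keeps the key out of the file on disk, which may be preferable if the manifest is stored in version control.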
Step 2: Create the gateway rule
- Create a file named ingress-gw.yaml with the following content.

    apiVersion: networking.istio.io/v1beta1
    kind: Gateway
    metadata:
      name: ingress-gw
      namespace: istio-system
    spec:
      selector:
        istio: ingressgateway
      servers:
      - hosts:
        - '*'
        port:
          name: http
          number: 80
          protocol: HTTP

- Run the following command to create the gateway rule.

    kubectl apply -f ingress-gw.yaml
Step 3: Create the LLMRoute
- Create a file named dashscope-route.yaml with the following content.

    apiVersion: istio.alibabacloud.com/v1beta1
    kind: LLMRoute
    metadata:
      name: dashscope-route
    spec:
      host: "*"
      gateways:
      - istio-system/ingress-gw
      rules:
      - name: ingress-route
        matches:
        # Routes requests for dashscope.aliyuncs.com. Otherwise, a 404 error is returned.
        - headers:
            host:
              exact: dashscope.aliyuncs.com
        # Routes requests for test.com. After the request is processed by the ASM LLM plugin,
        # it re-triggers route matching, causing the request to match the rule above.
        - headers:
            host:
              exact: test.com
        backendRefs:
        - providerHost: dashscope.aliyuncs.com

- Run the following command to create the LLMRoute.

    kubectl apply -f dashscope-route.yaml
Step 4: Test the setup
Run the following command in your local terminal to test the configuration. Replace ${INGRESS_GATEWAY_IP} with the IP address of the ingress gateway.

    curl --location '${INGRESS_GATEWAY_IP}:80' \
      --header 'Content-Type: application/json' \
      --header "host: test.com" \
      --data '{
        "messages": [
          {"role": "user", "content": "Tell me about yourself"}
        ]
      }'
Expected output:
{"choices":[{"message":{"role":"assistant","content":"I am a Large Language Model from Alibaba Cloud, and my name is Qwen. My main function is to answer user questions, provide information, and engage in conversation. I can understand user queries and generate corresponding answers or suggestions based on natural language. I can also learn new knowledge and apply it to various scenarios. If you have any questions or need help, please feel free to let me know, and I will do my best to support you."},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion","usage":{"prompt_tokens":3,"completion_tokens":72,"total_tokens":75},"created":1720682745,"system_fingerprint":null,"model":"qwen1.5-72b-chat","id":"chatcmpl-3d117bd7-9bfb-9121-9fc2-xxxxxxxxxxxx"}
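The response comes from the qwen1.5-72b-chat model that is configured in the LLMProvider, which indicates that the gateway routed the request for test.com to the external LLM service.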
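The test request carries only a messages array; the model name and API key come from the LLMProvider. For repeated tests with different prompts, the request can be wrapped in small shell functions. The following is a sketch under stated assumptions: the function names json_escape, build_chat_body, and ask_llm are hypothetical, the gateway IP address is assumed to be exported as INGRESS_GATEWAY_IP, and the escaping handles only backslashes and double quotes in single-line prompts.

```shell
#!/usr/bin/env bash
# json_escape (hypothetical helper): escape backslashes and double quotes so that
# a single-line string can be embedded in a JSON string literal.
# Newlines and control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_chat_body (hypothetical helper): print a chat request body that
# contains one user message, in the same shape as the Step 4 request.
build_chat_body() {
  printf '{"messages":[{"role":"user","content":"%s"}]}' "$(json_escape "$1")"
}

# ask_llm (hypothetical helper): send a prompt through the ingress gateway.
# Assumes INGRESS_GATEWAY_IP holds the gateway IP address from the prerequisites.
ask_llm() {
  curl --silent --show-error --location "http://${INGRESS_GATEWAY_IP}:80" \
    --header 'Content-Type: application/json' \
    --header 'host: test.com' \
    --data "$(build_chat_body "$1")"
}

# Usage (requires network access to the gateway):
#   INGRESS_GATEWAY_IP='<gateway-ip>' ask_llm 'Tell me about yourself'
```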
Step 5: Use ASM gateway security features
This step demonstrates how to create a simple authorization policy to deny access to the LLM service from your local IP address through the ASM gateway.
- Create a file named auth-policy.yaml with the following content.

    apiVersion: security.istio.io/v1beta1
    kind: AuthorizationPolicy
    metadata:
      labels:
        gateway: ingressgateway
      name: auth-policy
      namespace: istio-system
    spec:
      action: DENY
      rules:
      - from:
        - source:
            ipBlocks:
            - ${YOUR_LOCAL_IP}
        to:
        - operation:
            hosts:
            - test.com
      selector:
        matchLabels:
          istio: ingressgateway

- Run the following command to deploy the authorization policy.

    kubectl apply -f auth-policy.yaml

- Run the test command from Step 4 again. You should see the following result:

    RBAC: access denied
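To check the policy from a script rather than by reading the response body, you can compare HTTP status codes. The following is a rough sketch, not an ASM tool: the function names describe_status and gateway_status are hypothetical, and the mapping assumes that a denied request returns HTTP 403 (the status that Istio normally sends with RBAC: access denied) and that an unmatched Host header returns 404, as noted in the LLMRoute comments.

```shell
#!/usr/bin/env bash
# describe_status (hypothetical helper): map an HTTP status code returned by the
# gateway to its likely cause in this walkthrough.
describe_status() {
  case "$1" in
    200) echo "routed: the request reached the LLM service" ;;
    403) echo "denied: blocked by an authorization policy" ;;
    404) echo "no route: the Host header matched no LLMRoute rule" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# gateway_status (hypothetical helper): send the Step 4 request and print only
# the HTTP status code. Assumes INGRESS_GATEWAY_IP holds the gateway IP address.
gateway_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
    --location "http://${INGRESS_GATEWAY_IP}:80" \
    --header 'Content-Type: application/json' \
    --header 'host: test.com' \
    --data '{"messages":[{"role":"user","content":"Tell me about yourself"}]}'
}

# Usage (requires network access to the gateway):
#   describe_status "$(gateway_status)"
```

Run from the blocked IP address, the usage line should report the denied case once the policy takes effect; from an address that the policy does not cover, it should still report a routed request.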
The security features available for standard HTTP requests on the ASM gateway, such as comprehensive authorization policies, JWT authentication, and custom external authorization services, also apply to LLM requests. By applying these policies to the ingress gateway, you can more effectively secure your applications.