Alibaba Cloud Service Mesh (ASM) enables access to external LLM services through the ASM ingress gateway, offering capabilities such as traffic splitting, request observability, and comprehensive authentication and authorization. This topic describes how clients outside the cluster connect to LLM services through the ASM ingress gateway.
Overview
The ASM ingress gateway lets clients outside the cluster reach external LLM services. It provides advanced routing, security, observability, and LLM traffic management features, enabling efficient and secure access to those services.
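The request path is as follows: an external client sends a request to the ASM ingress gateway, and the gateway forwards it to the external LLM service.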
Prerequisites
Add a cluster to an ASM instance of version 1.22 or later.
Create an ingress gateway and retrieve the gateway IP.
Activate Alibaba Cloud Model Studio and obtain a valid API key (API_KEY). For more information, see Obtain an API key.
Step 1: Create an LLMProvider
Create a file named LLMProvider.yaml with the following content.
apiVersion: istio.alibabacloud.com/v1beta1
kind: LLMProvider
metadata:
  name: dashscope-qwen
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  host: dashscope.aliyuncs.com
  path: /compatible-mode/v1/chat/completions
  configs:
    defaultConfig:
      openAIConfig:
        model: qwen1.5-72b-chat  # Open-source Qwen LLM
        stream: false
        apiKey: ${API_KEY}
Run the following command to create an LLMProvider by using the kubeconfig file of the cluster on the data plane.
kubectl apply -f LLMProvider.yaml
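The ${API_KEY} value in LLMProvider.yaml is a placeholder that has to be replaced with your real key before the manifest is applied. The following is a minimal sketch of one way to do that, not part of ASM itself. The helper name render_manifest is hypothetical, and where the key comes from (for example, an environment variable) is an assumption.

```python
from string import Template


def render_manifest(template_text: str, api_key: str) -> str:
    """Replace the ${API_KEY} placeholder in a manifest with a real key.

    Hypothetical helper for illustration; it is not an ASM or kubectl feature.
    """
    if not api_key:
        raise ValueError("API key is empty; provide it before rendering the manifest")
    # safe_substitute replaces ${API_KEY} and leaves any other ${...} text untouched.
    return Template(template_text).safe_substitute(API_KEY=api_key)
```

One possible usage is to read LLMProvider.yaml, pass its text and the key to render_manifest, write the result to a separate file, and apply that file with kubectl. Keeping the rendered file out of version control avoids committing the key.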
Step 2: Create a gateway rule
Create a file named ingress-gw.yaml with the following content.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: ingress-gw
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - hosts:
        - '*'
      port:
        name: http
        number: 80
        protocol: HTTP
Run the following command to create a gateway rule.
kubectl apply -f ingress-gw.yaml
Step 3: Create an LLMRoute
Create a file named dashscope-route.yaml with the following content.
apiVersion: istio.alibabacloud.com/v1beta1
kind: LLMRoute
metadata:
  name: dashscope-route
spec:
  host: "*"
  gateways:
    - istio-system/ingress-gw
  rules:
    - name: ingress-route
      matches:
        # Process requests routed to dashscope.aliyuncs.com. Otherwise, 404 is returned.
        - headers:
            host:
              exact: dashscope.aliyuncs.com
        # Process requests routed to test.com. After the ASM LLM plug-in processes
        # the request, route matching is triggered again and the preceding
        # condition is met.
        - headers:
            host:
              exact: test.com
      backendRefs:
        - providerHost: dashscope.aliyuncs.com
Run the following command to create an LLMRoute.
kubectl apply -f dashscope-route.yaml
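The route lists two host matches because, as the comments in the manifest describe, a request is matched once on arrival and again after the ASM LLM plug-in has processed it. The following toy model illustrates why both entries are needed. It is a sketch only and does not reproduce how the gateway is implemented. The function name is_routed is hypothetical, and the model assumes that the plug-in sets the host to the provider host before the second match.

```python
from typing import Sequence

# Host values listed under `matches` in dashscope-route.yaml.
ROUTE_HOSTS = ("dashscope.aliyuncs.com", "test.com")
# Value of `providerHost` in the same file.
PROVIDER_HOST = "dashscope.aliyuncs.com"


def is_routed(host_header: str, route_hosts: Sequence[str] = ROUTE_HOSTS) -> bool:
    """Return True if the modeled route accepts the request; False models a 404."""
    # First pass: the incoming Host header must exactly equal a listed host.
    if host_header not in route_hosts:
        return False
    # Assumption: the LLM plug-in rewrites the host to the provider host,
    # and route matching then runs a second time against the same list.
    return PROVIDER_HOST in route_hosts
```

In this model, is_routed("test.com") is accepted, a request for an unlisted host is rejected, and a route that lists only test.com also rejects the request because the second match has nothing to hit.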
Step 4: Verify
Run the following command locally.
curl --location '${ASM Gateway IP}:80' \
--header 'Content-Type: application/json' \
--header "host: test.com" \
--data '{
"messages": [
{"role": "user", "content": "Please introduce yourself"}
]
}'
Expected output:
{"choices":[{"message":{"role":"assistant","content":"Hello! I am Qwen, a pre-trained language model developed by Alibaba Cloud. My purpose is to assist users in generating various types of text, such as articles, stories, poems, and answering questions by leveraging my extensive knowledge and understanding of context. Although I'm an AI, I don't have a physical body or personal experiences like human beings do, but I've been trained on a vast corpus of text data, which allows me to engage in conversations, provide information, or help with various tasks to the best of my abilities. So, feel free to ask me anything, and I'll do my best to provide helpful and informative responses!"},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion","usage":{"prompt_tokens":12,"completion_tokens":130,"total_tokens":142},"created":1720682745,"system_fingerprint":null,"model":"qwen1.5-72b-chat","id":"chatcmpl-3608dcd5-e3ad-9ade-bc70-xxxxxxxxxxxxxx"}
The output shows that the request trace has been successfully created.
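The same verification can be scripted from a client application. The following is a minimal sketch using only the Python standard library. The function names build_chat_request and extract_reply are hypothetical, and the gateway address is whatever you retrieved in the prerequisites. The request mirrors the curl command: a JSON body with a messages array, and a Host header set to test.com so that the LLMRoute matches.

```python
import json
import urllib.request


def build_chat_request(gateway_ip: str, prompt: str, host: str = "test.com") -> urllib.request.Request:
    """Build (but do not send) the POST request that the curl command issues."""
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{gateway_ip}:80",
        data=body,
        # The Host header must match a host listed in the LLMRoute.
        headers={"Content-Type": "application/json", "Host": host},
        method="POST",
    )


def extract_reply(response_body: str) -> str:
    """Return the assistant text from a chat.completion response body."""
    payload = json.loads(response_body)
    return payload["choices"][0]["message"]["content"]
```

To send the request, pass the object to urllib.request.urlopen with a timeout, decode the response body as UTF-8, and hand it to extract_reply.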
Step 5: Use the security capabilities of ASM gateway
This step creates a simple authorization policy that blocks requests from a specified local IP address from reaching LLM services through the ASM gateway.
Create a file named auth-policy.yaml with the following content.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  labels:
    gateway: ingressgateway
  name: auth-policy
  namespace: istio-system
spec:
  action: DENY
  rules:
    - from:
        - source:
            ipBlocks:
              - ${Local IP}
      to:
        - operation:
            hosts:
              - test.com
  selector:
    matchLabels:
      istio: ingressgateway
Run the following command to deploy an authorization policy.
kubectl apply -f auth-policy.yaml
Re-run the test command in Step 4. The output is:
RBAC: access denied
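The rule in auth-policy.yaml combines a source condition (ipBlocks) with an operation condition (hosts), and a request is denied only when both match. The following sketch models that logic so that you can reason about which clients a given policy would block. It is an illustration, not the gateway's implementation. The function name is_denied is hypothetical, and the model assumes that the gateway sees the client's real source IP address.

```python
import ipaddress
from typing import Sequence


def is_denied(client_ip: str, host: str, ip_blocks: Sequence[str], denied_hosts: Sequence[str]) -> bool:
    """Model the DENY rule: the source IP is in ipBlocks AND the host is in hosts."""
    address = ipaddress.ip_address(client_ip)
    # Each entry may be a single address or a CIDR range; strict=False
    # tolerates CIDR entries whose host bits are set.
    in_block = any(
        address in ipaddress.ip_network(block, strict=False) for block in ip_blocks
    )
    return in_block and host in denied_hosts
```

For example, with ip_blocks set to a single documentation address such as 203.0.113.7 and denied_hosts set to test.com, only requests from that address for test.com are denied. Requests from other addresses, or for other hosts, are not affected by this rule.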
The security capabilities of the ASM gateway that are configured for regular HTTP requests also apply to LLM requests. These capabilities include authorization policies, JWT authentication, and custom authorization services. Implementing these policies on the ingress gateway can significantly enhance the security of your applications.