This topic describes how to use an Alibaba Cloud Service Mesh (ASM) Egress Gateway to access an external LLM Service. This method suits scenarios in which the Client runs inside a Cluster and has a Sidecar injected.
Prerequisites
-
An ACK Cluster has been added to an ASM instance of version 1.22 or later.
-
A Sidecar Injection Policy has been configured.
-
An Ingress Gateway has been created.
-
The sleep sample Application has been deployed. For more information, see Create the sleep test Application.
-
You have activated Alibaba Cloud Model Studio and obtained a valid API key.
Overview
Without an Egress Gateway, the Sidecar Proxy intercepts Client requests and forwards them directly to the LLM Service provider. Because the Sidecar runs in the same Pod as the Client, this approach introduces security risks such as API key leaks and unauthorized access. If these risks concern you, we strongly recommend using an Egress Gateway.
Introducing an ASM Egress Gateway into the request path enhances security. The Egress Gateway is deployed independently of the Client, so you can use the RBAC mechanism of the ACK Cluster to restrict who can manage the gateway. The Egress Gateway injects the API key dynamically and enforces Authentication and Authorization Policies, so the full set of ASM Gateway security features applies to LLM requests.
The following diagram shows the request path used in this topic:
Step 1: Create an Egress Gateway and a Gateway resource
-
Create an Egress Gateway, configure port 80, and enable Mutual TLS (mTLS) Authentication. For more information, see Create an Egress Gateway.
-
Create a file named egress-gw.yaml with the following content.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: egress-gw
  namespace: istio-system
spec:
  selector:
    istio: egressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTPS
    tls:
      mode: ISTIO_MUTUAL
-
Run the following command using the kubeconfig of the ASM instance to create the Gateway:
kubectl apply -f egress-gw.yaml
Step 2: Create a gateway-scoped LLMProvider
Scoping the LLMProvider to the Egress Gateway ensures the API key is stored only in the Egress Gateway's memory and is not accessible to the Client.
-
Create a file named dashscope-qwen.yaml with the following content.
apiVersion: istio.alibabacloud.com/v1beta1
kind: LLMProvider
metadata:
  name: dashscope-qwen
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: egressgateway
  host: dashscope.aliyuncs.com
  path: /compatible-mode/v1/chat/completions
  configs:
    defaultConfig:
      openAIConfig:
        model: qwen1.5-72b-chat  # Qwen open-source large model series
        stream: false
        apiKey: ${API_KEY}
-
Run the following command to create the LLMProvider.
kubectl apply -f dashscope-qwen.yaml
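The manifest above leaves the key as the ${API_KEY} placeholder. This topic does not say how to fill it in, so the following is only a minimal sketch of one option: render the manifest in memory so that the real key is never written into the YAML file. The helper name render_manifest is hypothetical and is not part of ASM or kubectl.

```python
from string import Template


def render_manifest(manifest_text: str, api_key: str) -> str:
    """Return manifest_text with the ${API_KEY} placeholder filled in.

    Template.substitute raises KeyError if the text contains any other
    $-placeholder, so an unexpected variable fails loudly instead of
    reaching the cluster unresolved.
    """
    if not api_key:
        raise ValueError("API key is empty")
    return Template(manifest_text).substitute(API_KEY=api_key)
```

One possible use is to read the key from an environment variable of your choosing, pass the contents of dashscope-qwen.yaml through render_manifest, and pipe the result to kubectl apply -f - so that the rendered manifest is read from standard input.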
Step 3: Create an LLMRoute
-
Create a file named dashscope-route.yaml with the following content to route traffic to the Egress Gateway.
apiVersion: istio.alibabacloud.com/v1beta1
kind: LLMRoute
metadata:
  name: dashscope-route
spec:
  host: dashscope.aliyuncs.com
  gateways:
  - mesh
  - istio-system/egress-gw
  rules:
  - name: mesh-route  # After the Sidecar receives a request for dashscope.aliyuncs.com, it forwards the request to the Egress Gateway.
    matches:
    - gateways:
      - mesh
    backendRefs:
    - providerHost: istio-egressgateway.istio-system.svc.cluster.local
  - name: egress-gw-route  # After the Egress Gateway receives a request for dashscope.aliyuncs.com, it forwards the request to the actual provider.
    matches:
    - gateways:
      - istio-system/egress-gw
    backendRefs:
    - providerHost: dashscope.aliyuncs.com
-
Run the following command to create the LLMRoute.
kubectl apply -f dashscope-route.yaml
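The two rules in dashscope-route.yaml describe a two-hop path. The sketch below is a simplified model of that path for illustration only. It is not how ASM evaluates an LLMRoute, and the function names are hypothetical. It assumes that the first rule whose gateway matches is the one that applies.

```python
from typing import List

# Simplified model of the two rules in dashscope-route.yaml: each rule pairs
# the gateway on which a request is seen with the backend it is sent to.
RULES = [
    {"name": "mesh-route", "gateway": "mesh",
     "backend": "istio-egressgateway.istio-system.svc.cluster.local"},
    {"name": "egress-gw-route", "gateway": "istio-system/egress-gw",
     "backend": "dashscope.aliyuncs.com"},
]


def next_hop(gateway: str) -> str:
    """Return the backend of the first rule whose gateway matches."""
    for rule in RULES:
        if rule["gateway"] == gateway:
            return rule["backend"]
    raise LookupError(f"no rule matches gateway {gateway!r}")


def trace_path() -> List[str]:
    """Follow a request from the Sidecar ("mesh") to the provider."""
    first = next_hop("mesh")                     # Sidecar -> Egress Gateway
    second = next_hop("istio-system/egress-gw")  # Egress Gateway -> provider
    return [first, second]
```

Calling trace_path() returns the Egress Gateway service first and dashscope.aliyuncs.com second, which mirrors the order described in the rule comments.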
Step 4: Test the configuration
Run the following command using the kubeconfig for your ACK Cluster to test the configuration:
kubectl exec deployment/sleep -it -- curl --location 'http://dashscope.aliyuncs.com' --header 'Content-Type: application/json' --data '{
"messages": [
{"role": "user", "content": "Tell me about yourself"}
]
}'
Expected output:
{"choices":[{"message":{"role":"assistant","content":"I am a large language model from Alibaba Cloud, and my name is Qwen. My main function is to answer users' questions, provide information, and engage in conversations. I can understand users' questions and generate corresponding answers or suggestions based on natural language. I can also learn new knowledge and apply it to various scenarios. If you have any questions or need help, please feel free to let me know, and I will do my best to support you."},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion","usage":{"prompt_tokens":3,"completion_tokens":72,"total_tokens":75},"created":1720680044,"system_fingerprint":null,"model":"qwen1.5-72b-chat","id":"chatcmpl-1c33b950-3220-9bfe-9066-xxxxxxxxxxxx"}
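The curl command sends only the messages. The request contains no model name, request path, or API key, because those come from the LLMProvider at the Egress Gateway. As a rough sketch of what an application in a Sidecar-injected Pod might do instead of curl, the following builds the same request and reads the reply from a response shaped like the expected output above. The function names are hypothetical, and only the Python standard library is used.

```python
import json
import urllib.request


def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build the same POST that the curl command sends.

    The body carries only the messages: no model, path, or API key.
    """
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]})
    return urllib.request.Request(
        "http://dashscope.aliyuncs.com",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_text: str) -> tuple:
    """Return (assistant text, total tokens) from a chat.completion body."""
    payload = json.loads(response_text)
    content = payload["choices"][0]["message"]["content"]
    return content, payload["usage"]["total_tokens"]
```

To send the request, pass the result of build_chat_request to urllib.request.urlopen, decode the response body, and hand it to extract_reply. This works only from a workload whose traffic is routed through the mesh as configured in the previous steps.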
Step 5: Configure an authorization policy
-
Create a file named authpolicy.yaml with the following content.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: test
  namespace: istio-system
spec:
  action: DENY
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/default/sa/sleep
    to:
    - operation:
        hosts:
        - dashscope.aliyuncs.com
  selector:
    matchLabels:
      istio: egressgateway
-
Run the following command using the ASM kubeconfig to apply the Authorization Policy:
kubectl apply -f authpolicy.yaml
-
Run the command in Step 4 again.
Expected output:
RBAC: access denied
The request is denied.
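The rule in authpolicy.yaml denies a request only when both its source and its operation match: the caller is the sleep service account and the requested host is dashscope.aliyuncs.com. The following sketch models that condition to show why the test request is rejected. It is a simplification, not the gateway's implementation: it uses exact matching only, whereas Istio also supports wildcard host patterns, and the function name is hypothetical.

```python
# Values copied from authpolicy.yaml.
DENIED_PRINCIPALS = {"cluster.local/ns/default/sa/sleep"}
DENIED_HOSTS = {"dashscope.aliyuncs.com"}


def is_denied(principal: str, host: str) -> bool:
    """True when both the source and the operation of the DENY rule match.

    Host names are compared case-insensitively. A request that does not
    match is not denied by this policy.
    """
    return principal in DENIED_PRINCIPALS and host.lower() in DENIED_HOSTS
```

With these values, the sleep workload calling dashscope.aliyuncs.com is denied, while a different service account or a different host is not matched by the rule. For HTTP traffic, Istio reports such a denial as a 403 response with the body shown above.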
ASM Gateways provide the same security capabilities for LLM requests as they do for regular HTTP requests, including Authorization Policies, JWT authentication, and custom authorization services. By applying these policies at the Egress Gateway, you can more effectively secure your Applications.