Alibaba Cloud Service Mesh: External clients access LLM services through ASM ingress gateway

Last Updated: Feb 10, 2025

Alibaba Cloud Service Mesh (ASM) lets clients reach external LLM services through an ASM ingress gateway, with traffic splitting, request observability, and authentication and authorization applied at the gateway. This topic describes how clients outside the cluster connect to LLM services through the ASM ingress gateway.

Overview

The ASM ingress gateway suits scenarios in which clients outside the cluster need to call external LLM services. Requests pass through the gateway, where the routing, security, observability, and LLM traffic management features of ASM apply.

The request path is as follows:

[Figure: request path diagram]

Prerequisites

Step 1: Create an LLMProvider

  1. Create a file named LLMProvider.yaml with the following content.

    apiVersion: istio.alibabacloud.com/v1beta1
    kind: LLMProvider
    metadata:  
      name: dashscope-qwen
      namespace: istio-system
    spec:
      workloadSelector:
        labels:
          istio: ingressgateway
      host: dashscope.aliyuncs.com
      path: /compatible-mode/v1/chat/completions
      configs:
        defaultConfig:
          openAIConfig:
            model: qwen1.5-72b-chat  # Open-source Qwen LLM
            stream: false
            apiKey: ${API_KEY}
  2. Use the kubeconfig file of the data plane cluster to run the following command, which creates the LLMProvider.

    kubectl apply -f LLMProvider.yaml
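
The `${API_KEY}` value in LLMProvider.yaml is a placeholder for your DashScope API key. One way to avoid saving the key in the file is to substitute it when you apply the resource. The following is a minimal sketch, not an ASM feature: the helper name `render_llmprovider` is hypothetical, and it assumes that the key is available in the `API_KEY` shell variable and contains no characters that are special to sed, such as `|`, `&`, or `\`.

```shell
# Hypothetical helper: print LLMProvider.yaml with the ${API_KEY} placeholder
# replaced by the value of the API_KEY variable. The file itself is not modified.
render_llmprovider() {
  file="${1:-LLMProvider.yaml}"
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set" >&2
    return 1
  fi
  # [$] matches a literal dollar sign, so only the placeholder text is replaced.
  sed "s|[$]{API_KEY}|${API_KEY}|" "$file"
}
```

The rendered manifest can then be piped to kubectl, for example `render_llmprovider LLMProvider.yaml | kubectl apply -f -`.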

Step 2: Create a gateway rule

  1. Create a file named ingress-gw.yaml with the following content.

    apiVersion: networking.istio.io/v1beta1
    kind: Gateway
    metadata:
      name: ingress-gw
      namespace: istio-system
    spec:
      selector:
        istio: ingressgateway
      servers:
        - hosts:
            - '*'
          port:
            name: http
            number: 80
            protocol: HTTP
  2. Run the following command to create a gateway rule.

    kubectl apply -f ingress-gw.yaml

Step 3: Create an LLMRoute

  1. Create a file named dashscope-route.yaml with the following content.

    apiVersion: istio.alibabacloud.com/v1beta1
    kind: LLMRoute
    metadata:  
      name: dashscope-route
    spec:
      host: "*"
      gateways:
      - istio-system/ingress-gw
      rules:
      - name: ingress-route
        matches:
        - headers:
            host:
              exact: dashscope.aliyuncs.com  # Match requests whose host is dashscope.aliyuncs.com. Requests that match no rule receive a 404 response.
        - headers:
            host:
              exact: test.com  # Match requests whose host is test.com. After the ASM LLM plug-in processes the request, route matching runs again and the request then meets the preceding condition.
        backendRefs:
        - providerHost: dashscope.aliyuncs.com
  2. Run the following command to create an LLMRoute.

    kubectl apply -f dashscope-route.yaml

Step 4: Verify

Run the following command on your local machine. Replace ${ASM Gateway IP} with the IP address of the ASM ingress gateway.

curl --location '${ASM Gateway IP}:80' \
--header 'Content-Type: application/json' \
--header "host: test.com" \
--data '{
    "messages": [
        {"role": "user", "content": "Please introduce yourself"}
    ]
}'

Expected output:

{"choices":[{"message":{"role":"assistant","content":"Hello! I am Qwen, a pre-trained language model developed by Alibaba Cloud. My purpose is to assist users in generating various types of text, such as articles, stories, poems, and answering questions by leveraging my extensive knowledge and understanding of context. Although I'm an AI, I don't have a physical body or personal experiences like human beings do, but I've been trained on a vast corpus of text data, which allows me to engage in conversations, provide information, or help with various tasks to the best of my abilities. So, feel free to ask me anything, and I'll do my best to provide helpful and informative responses!"},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion","usage":{"prompt_tokens":12,"completion_tokens":130,"total_tokens":142},"created":1720682745,"system_fingerprint":null,"model":"qwen1.5-72b-chat","id":"chatcmpl-3608dcd5-e3ad-9ade-bc70-xxxxxxxxxxxxxx"}

The output shows that the request reached the LLM service through the ASM ingress gateway and that the model returned a response.
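
The `${ASM Gateway IP}` value in the verification command is a placeholder for the external address of the ingress gateway. The sketch below looks the address up with kubectl and then sends the same request. It assumes that the gateway is exposed by a LoadBalancer Service named `istio-ingressgateway` in the `istio-system` namespace, which is an assumption about your deployment and is not stated in this topic. The function names are hypothetical.

```shell
# Hypothetical helper: print the external IP of the ingress gateway Service.
# The Service name defaults to istio-ingressgateway (an assumption).
asm_gateway_ip() {
  kubectl get service "${1:-istio-ingressgateway}" -n istio-system \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Hypothetical helper: send the chat request from Step 4 to the given gateway IP.
ask_llm() {
  curl --silent --location "http://$1:80" \
    --header 'Content-Type: application/json' \
    --header 'host: test.com' \
    --data '{"messages": [{"role": "user", "content": "Please introduce yourself"}]}'
}
```

For example, `ask_llm "$(asm_gateway_ip)"` sends the request to whatever address the lookup returns. If your gateway Service has a different name, pass it as the first argument to `asm_gateway_ip`.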

Step 5: Use the security capabilities of the ASM gateway

This step creates a simple authorization policy that blocks a specified local IP address from accessing LLM services through the ASM gateway.

  1. Create a file named auth-policy.yaml with the following content.

    apiVersion: security.istio.io/v1beta1
    kind: AuthorizationPolicy
    metadata:
      labels:
        gateway: ingressgateway
      name: auth-policy
      namespace: istio-system
    spec:
      action: DENY
      rules:
        - from:
            - source:
                ipBlocks:
                  - ${Local IP}
          to:
            - operation:
                hosts:
                  - test.com
      selector:
        matchLabels:
          istio: ingressgateway
  2. Run the following command to deploy the authorization policy.

    kubectl apply -f auth-policy.yaml
  3. Run the test command from Step 4 again. The following output is returned:

    RBAC: access denied
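
To check the policy from a script instead of reading the response body, you can inspect the HTTP status code. A request rejected by an Istio DENY authorization policy is returned with status 403 together with the `RBAC: access denied` body. The following is a minimal sketch: the function name `is_denied` is hypothetical, and the gateway IP address is passed as its first argument.

```shell
# Hypothetical helper: succeed if the gateway rejects the Step 4 request with 403.
is_denied() {
  status="$(curl --silent --output /dev/null --write-out '%{http_code}' \
    --header 'Content-Type: application/json' \
    --header 'host: test.com' \
    --data '{"messages": [{"role": "user", "content": "Please introduce yourself"}]}' \
    "http://$1:80")"
  [ "$status" = "403" ]
}
```

Called with the gateway address as its only argument, the function returns success only when the request is blocked, so it can be used directly in an `if` statement.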

Note

The security capabilities of the ASM gateway that are configured for regular HTTP requests also apply to LLM requests. These capabilities include authorization policies, JWT authentication, and custom authorization services. Applying these policies at the gateway can improve the security of your applications.