API Gateway:ai-rag

Last Updated:Feb 05, 2025

This topic describes the ai-rag plug-in.

The ai-rag plug-in can work with Alibaba Cloud DashVector to quickly implement Retrieval-Augmented Generation (RAG) based on Large Language Models (LLMs). The following figure shows the process:

[Figure: RAG processing flow of the ai-rag plug-in]
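A RAG request of this kind typically vectorizes the question, retrieves similar documents, and adds them to the prompt before the LLM is called. The following Python code is a minimal sketch of that flow, not the plug-in's implementation. The names `build_prompt` and `rag_answer`, the callables `embed`, `search`, and `chat`, and the prompt wording are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def build_prompt(question: str, documents: Sequence[str]) -> str:
    """Prepend retrieved documents to the question (prompt wording is an assumption)."""
    if not documents:
        # Nothing relevant was retrieved: forward the question unchanged.
        return question
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        "Answer the question using the following reference material:\n"
        f"{context}\n"
        f"Question: {question}"
    )


def rag_answer(
    question: str,
    embed: Callable[[str], List[float]],
    search: Callable[[List[float]], List[str]],
    chat: Callable[[str], str],
) -> str:
    """Hypothetical RAG flow: embed the question, retrieve documents, ask the LLM."""
    vector = embed(question)                    # 1. vectorize the question
    documents = search(vector)                  # 2. retrieve similar documents
    prompt = build_prompt(question, documents)  # 3. augment the prompt
    return chat(prompt)                         # 4. call the LLM


if __name__ == "__main__":
    # Stubs stand in for the embedding service, the vector search, and the LLM.
    print(
        rag_answer(
            "Where did the accident occur?",
            embed=lambda text: [0.1, 0.2],
            search=lambda vector: ["The accident occurred on the Haiwen Expressway."],
            chat=lambda prompt: prompt,
        )
    )
```

Passing the three services in as callables keeps the sketch runnable without network access. In a real deployment, the gateway performs these calls based on the plug-in configuration.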

Running attributes

Plug-in execution stage: default stage. Plug-in execution priority: 200.

Configuration description

| Field | Data type | Required | Default value | Description |
| --- | --- | --- | --- | --- |
| dashscope.apiKey | string | Yes | - | The token that is used for authentication when a gateway accesses Qwen. |
| dashscope.serviceFQDN | string | Yes | - | The service name of Qwen. |
| dashscope.servicePort | int | Yes | - | The service port of Qwen. |
| dashscope.serviceHost | string | Yes | - | The domain name that is used to access Qwen. |
| dashvector.apiKey | string | Yes | - | The token that is used for authentication when a gateway accesses Alibaba Cloud DashVector. |
| dashvector.serviceFQDN | string | Yes | - | The service name of Alibaba Cloud DashVector. |
| dashvector.servicePort | int | Yes | - | The service port of Alibaba Cloud DashVector. |
| dashvector.serviceHost | string | Yes | - | The domain name that is used to access Alibaba Cloud DashVector. |
| dashvector.topk | int | Yes | - | The number of vectors that are obtained during the vector search performed by Alibaba Cloud DashVector. |
| dashvector.threshold | float | Yes | - | The vector distance threshold. The system filters out the documents whose vector distances exceed the specified threshold. |
| dashvector.field | string | Yes | - | The field name of the document that is stored in Alibaba Cloud DashVector. |
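The way `dashvector.topk`, `dashvector.threshold`, and `dashvector.field` interact can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the plug-in's code: the function name `select_documents` and the shape of each hit (a dictionary with `id`, `distance`, and `fields` keys) are hypothetical.

```python
from typing import Dict, List


def select_documents(
    hits: List[Dict], topk: int, threshold: float, field: str
) -> List[str]:
    """Keep at most `topk` nearest hits whose distance does not exceed `threshold`."""
    # Smaller distance means more similar, so sort ascending and cut at topk.
    nearest = sorted(hits, key=lambda hit: hit["distance"])[:topk]
    return [
        hit["fields"][field]  # text stored under the configured field name
        for hit in nearest
        if hit["distance"] <= threshold and field in hit["fields"]
    ]


if __name__ == "__main__":
    # Hypothetical search results; the hit structure is an assumption.
    hits = [
        {"id": "a", "distance": 0.25, "fields": {"raw": "A"}},
        {"id": "b", "distance": 0.55, "fields": {"raw": "B"}},
        {"id": "c", "distance": 0.35, "fields": {"raw": "C"}},
    ]
    print(select_documents(hits, topk=2, threshold=0.4, field="raw"))  # ['A', 'C']
```

A hit whose distance is larger than the threshold is dropped even when fewer than `topk` documents remain, so the LLM may receive no extra context at all.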

Note

After the plug-in is enabled, if you use the tracing analysis feature, the ID of each document retrieved by the ai-rag plug-in is added to the span attributes to help with troubleshooting.

Example

dashscope:
    apiKey: xxxxxxxxxxxxxxx
    serviceFQDN: dashscope
    servicePort: 443
    serviceHost: dashscope.aliyuncs.com
dashvector:
    apiKey: xxxxxxxxxxxxxxxxxxxx
    serviceFQDN: dashvector
    servicePort: 443
    serviceHost: vrs-cn-xxxxxxxxxxxxxxx.dashvector.cn-hangzhou.aliyuncs.com
    collection: xxxxxxxxxxxxxxx
    topk: 1
    threshold: 0.4
    field: raw
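
Before such a configuration is applied, the required fields can be checked with a few lines of Python. The following sketch is not part of the plug-in, and the helper name `validate_config` is hypothetical. It assumes that the YAML has already been loaded into a dictionary, and it checks only the fields and data types listed in the configuration table.

```python
from typing import Any, Dict, List

# Required fields and expected types, taken from the configuration table.
# `collection` appears only in the example configuration, so it is not checked.
REQUIRED_FIELDS = {
    "dashscope": {
        "apiKey": str, "serviceFQDN": str, "servicePort": int, "serviceHost": str,
    },
    "dashvector": {
        "apiKey": str, "serviceFQDN": str, "servicePort": int, "serviceHost": str,
        "topk": int, "threshold": float, "field": str,
    },
}


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means every required field looks valid."""
    problems = []
    for section, fields in REQUIRED_FIELDS.items():
        values = config.get(section)
        if not isinstance(values, dict):
            problems.append(f"{section} section is missing")
            continue
        for name, expected in fields.items():
            key = f"{section}.{name}"
            if name not in values:
                problems.append(f"{key} is missing")
                continue
            value = values[name]
            # Accept an int where a float is expected; reject bool (an int subclass).
            allowed = (int, float) if expected is float else (expected,)
            if isinstance(value, bool) or not isinstance(value, allowed):
                problems.append(f"{key} should be {expected.__name__}")
    return problems
```

For example, a configuration that omits `dashvector.topk` or sets `dashscope.servicePort` to the string `"443"` produces one problem entry for each field.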

This example uses the CEC-Corpus dataset, which contains the corpus and labeled data of 332 news reports about emergencies. The original text of the news reports is extracted, vectorized, and then added to Alibaba Cloud DashVector.
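The preparation step can be sketched in Python as follows. This is an illustration only: the function name `prepare_records`, the record layout, and the ID scheme are hypothetical, and `embed` stands in for whichever embedding model is used. The original text is stored under the field name that `dashvector.field` refers to (`raw` in the example configuration).

```python
from typing import Callable, Dict, List


def prepare_records(
    texts: List[str],
    embed: Callable[[str], List[float]],
    field: str = "raw",
) -> List[Dict]:
    """Pair each text with its vector and keep the original text under `field`."""
    records = []
    for index, text in enumerate(texts):
        records.append(
            {
                "id": str(index),         # hypothetical ID scheme
                "vector": embed(text),    # embedding of the news text
                "fields": {field: text},  # original text, read back at query time
            }
        )
    return records
```

Each record can then be written to the vector store with the client library of your choice.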

The following example shows the effect of the ai-rag plug-in on the result of a request. The original request is: Where did the rear-end collision in Hainan occur? What was the cause? What were the casualties?

  • The result returned by the LLM without processing by the ai-rag plug-in:

    I'm sorry. As an AI model, I cannot obtain or update specific information about news events in real time, including details such as locations, causes, and casualties. For such events, you are advised to check the latest news reports or official announcements for accurate information. You can also visit mainstream media websites, use news applications, or follow the announcements of relevant government departments to obtain the latest news.

  • The result returned by the LLM after processing by the ai-rag plug-in:

    The rear-end collision in Hainan occurred at the 37-kilometer point of the Haiwen Expressway, in the direction from Wenchang to Haikou. As for the specific cause of the accident, the traffic police department was still investigating at the time, so the exact cause cannot be determined from the provided information. One person died (the driver, who died on the spot) and eight people were injured (two children and six adults). All injured persons were rescued and sent to the hospital for treatment.