This topic describes the ai-rag plug-in.
The ai-rag plug-in can work with Alibaba Cloud DashVector to quickly implement Retrieval-Augmented Generation (RAG) based on Large Language Models (LLMs). The following figure shows the process:
Running attributes
Plug-in execution stage: default stage. Plug-in execution priority: 200.
Configuration description
| Field | Data type | Required | Default value | Description |
| --- | --- | --- | --- | --- |
| dashscope.apiKey | string | Yes | - | The token that is used for authentication when the gateway accesses Qwen. |
| dashscope.serviceFQDN | string | Yes | - | The service name of Qwen. |
| dashscope.servicePort | int | Yes | - | The service port of Qwen. |
| dashscope.serviceHost | string | Yes | - | The domain name that is used to access Qwen. |
| dashvector.apiKey | string | Yes | - | The token that is used for authentication when the gateway accesses Alibaba Cloud DashVector. |
| dashvector.serviceFQDN | string | Yes | - | The service name of Alibaba Cloud DashVector. |
| dashvector.servicePort | int | Yes | - | The service port of Alibaba Cloud DashVector. |
| dashvector.serviceHost | string | Yes | - | The domain name that is used to access Alibaba Cloud DashVector. |
| dashvector.topk | int | Yes | - | The number of vectors returned by the vector search in Alibaba Cloud DashVector. |
| dashvector.threshold | float | Yes | - | The vector distance threshold. Documents whose vector distance exceeds this threshold are filtered out. |
| dashvector.field | string | Yes | - | The name of the field that stores the document text in Alibaba Cloud DashVector. |
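The interaction between the number of retrieved vectors, the distance threshold, and the document field can be illustrated with a minimal Python sketch. It is not the plug-in's implementation. The function name `select_documents`, the shape of each hit (`id`, `score`, `fields`), and the sample hits are assumptions made for illustration only.

```python
from typing import Any


def select_documents(
    hits: list[dict[str, Any]], topk: int, threshold: float, field: str
) -> list[str]:
    """Return the text of at most `topk` hits whose distance is within `threshold`.

    Assumption: each hit is a dict with a numeric "score" (the vector distance)
    and a "fields" dict that holds the stored document text.
    """
    # Sort by ascending distance so the closest documents come first.
    ranked = sorted(hits, key=lambda hit: hit["score"])
    # Keep only the top-k hits, then drop those that exceed the threshold.
    kept = [hit for hit in ranked[:topk] if hit["score"] <= threshold]
    return [hit["fields"][field] for hit in kept]


# Made-up sample hits, used only to exercise the function.
hits = [
    {"id": "doc-2", "score": 0.55, "fields": {"raw": "sample text B"}},
    {"id": "doc-1", "score": 0.21, "fields": {"raw": "sample text A"}},
]

selected = select_documents(hits, topk=1, threshold=0.4, field="raw")
print(selected)
```

With a threshold of 0.4, the hit at distance 0.55 would be dropped even if the number of retrieved vectors were raised to 2.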
After the plug-in is enabled, when you use the tracing analysis feature, the IDs of the documents retrieved by the ai-rag plug-in are added to the span attributes for troubleshooting.
Example
dashscope:
  apiKey: xxxxxxxxxxxxxxx
  serviceFQDN: dashscope
  servicePort: 443
  serviceHost: dashscope.aliyuncs.com
dashvector:
  apiKey: xxxxxxxxxxxxxxxxxxxx
  serviceFQDN: dashvector
  servicePort: 443
  serviceHost: vrs-cn-xxxxxxxxxxxxxxx.dashvector.cn-hangzhou.aliyuncs.com
  collection: xxxxxxxxxxxxxxx
  topk: 1
  threshold: 0.4
  field: raw
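Before you apply a configuration like the one above, it can help to check that no key is missing. The following Python sketch shows one way to do this on a configuration that has already been loaded into a dictionary. The helper `missing_keys` is hypothetical and is not part of the plug-in. The sketch treats every key that appears in the example as required, which is an assumption for `collection`.

```python
# Keys taken from the example configuration. Treating all of them as
# required is an assumption made for this sketch.
REQUIRED_KEYS = {
    "dashscope": ["apiKey", "serviceFQDN", "servicePort", "serviceHost"],
    "dashvector": [
        "apiKey", "serviceFQDN", "servicePort", "serviceHost",
        "collection", "topk", "threshold", "field",
    ],
}


def missing_keys(config: dict) -> list[str]:
    """Return the dotted names of expected keys that are absent from `config`."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        values = config.get(section) or {}
        for key in keys:
            if key not in values:
                missing.append(f"{section}.{key}")
    return missing


# The example configuration expressed as a Python dict (placeholder values).
example_config = {
    "dashscope": {
        "apiKey": "xxxxxxxxxxxxxxx",
        "serviceFQDN": "dashscope",
        "servicePort": 443,
        "serviceHost": "dashscope.aliyuncs.com",
    },
    "dashvector": {
        "apiKey": "xxxxxxxxxxxxxxxxxxxx",
        "serviceFQDN": "dashvector",
        "servicePort": 443,
        "serviceHost": "vrs-cn-xxxxxxxxxxxxxxx.dashvector.cn-hangzhou.aliyuncs.com",
        "collection": "xxxxxxxxxxxxxxx",
        "topk": 1,
        "threshold": 0.4,
        "field": "raw",
    },
}

print(missing_keys(example_config))
```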
The CEC-Corpus dataset contains the corpus and labeled data of 332 news reports about emergencies. The original press release text is extracted, vectorized, and then added to Alibaba Cloud DashVector.
The following content provides an example of using the ai-rag plug-in to implement the RAG feature. The original request is: Where did the Hainan rear-end collision occur? What is the reason? What are the casualties?
The result returned by an LLM without the processing of the ai-rag plug-in:
I'm sorry. As an AI model, I cannot obtain and update the specific information about news events in real time, including details such as locations, causes, and casualties. For such specific events, you are advised to view the latest news reports or official announcements for accurate information. You can also visit mainstream media websites, use news applications, or follow the announcements of relevant government departments to obtain the latest news.
The result returned by an LLM after processing by the ai-rag plug-in:
The rear-end collision in Hainan occurred at the 37-kilometer point from Wenchang to Haikou on the Haiwen Expressway. As for the specific cause of the accident, the traffic police department was still investigating at the time, so the exact cause cannot be determined from the provided information. The casualties were 1 death (the driver died on the spot) and 8 injuries (2 children and 6 adults). All injured persons were rescued and sent to the hospital for treatment.
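The second answer differs from the first because the retrieved documents are supplied to the LLM together with the original question. The following Python sketch shows this augmentation step in its simplest form. The function `build_augmented_prompt`, the prompt wording, and the placeholder document text are assumptions for illustration and do not reproduce the template that the plug-in uses.

```python
def build_augmented_prompt(question: str, documents: list[str]) -> str:
    """Prepend retrieved documents to the question as reference material."""
    # Without retrieved documents, forward the question unchanged.
    if not documents:
        return question
    # Number each document so that the reference material is easy to read.
    context = "\n".join(
        f"[{index}] {text}" for index, text in enumerate(documents, start=1)
    )
    return (
        "Answer the question using only the reference material below.\n"
        f"Reference material:\n{context}\n"
        f"Question: {question}"
    )


question = (
    "Where did the Hainan rear-end collision occur? "
    "What is the reason? What are the casualties?"
)
# Placeholder standing in for the text of a document returned by the vector search.
prompt = build_augmented_prompt(question, ["<text of the retrieved news report>"])
print(prompt)
```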