Microservices Engine: AI RAG

Last Updated: Aug 02, 2024

This topic describes the AI RAG plug-in.

Description

The knowledge of a large language model (LLM) is limited to the data that was used to train it. After training is complete, the LLM cannot acquire or learn new information. In addition, even an LLM trained on massive data may lack information about specific fields or cover specific topics only superficially, so its answers in those fields may be inaccurate or lack depth. The Retrieval-Augmented Generation (RAG) technique addresses this limitation: a retrieval system finds relevant information in a large-scale database and provides it to the text generation model, which helps the model generate more accurate, richer, and more realistic text.
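The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration only, not how the plug-in is implemented: the bag-of-words `embed` function, the sample documents, and the prompt wording are all assumptions made for the example. A real deployment would call an embedding model and a vector database instead.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vector, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user question with the retrieved passages (hypothetical wording)."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return f"Answer using only the context below.\nContext:\n{joined}\nQuestion: {query}"


# Hypothetical sample data, used only to exercise the functions.
documents = [
    "A fire broke out in a warehouse on Monday.",
    "The football match ended in a draw.",
]
query = "Where did the warehouse fire start?"
context = retrieve(query, documents)
prompt = build_prompt(query, context)  # this text would be sent to the LLM
```

The key design point is that the model itself is unchanged: only the prompt is enriched with retrieved text, so new information can be added by updating the database rather than retraining.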

Higress can connect to Alibaba Cloud DashVector to quickly implement the RAG feature. The following figure shows the process of implementing the RAG feature.

[Figure: Process of implementing the RAG feature with Higress and Alibaba Cloud DashVector]

Configuration description

| Name | Data type | Required | Default value | Description |
| --- | --- | --- | --- | --- |
| dashscope.apiKey | STRING | Yes | - | The token that is used for authentication when a gateway accesses Qwen. |
| dashscope.serviceName | STRING | Yes | - | The service name of Qwen. |
| dashscope.servicePort | INT | Yes | - | The service port of Qwen. |
| dashscope.domain | STRING | Yes | - | The domain name that is used to access Qwen. |
| dashvector.apiKey | STRING | Yes | - | The token that is used for authentication when a gateway accesses Alibaba Cloud DashVector. |
| dashvector.serviceName | STRING | Yes | - | The service name of Alibaba Cloud DashVector. |
| dashvector.servicePort | INT | Yes | - | The service port of Alibaba Cloud DashVector. |
| dashvector.domain | STRING | Yes | - | The domain name that is used to access Alibaba Cloud DashVector. |
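Because every parameter listed above is required and has no default value, it can help to check a configuration before applying it. The following Python sketch validates a configuration dictionary against the listed names and data types. The `validate_config` helper and all values in `example_config` (API keys, service names, port 443, and the `example.com` domains) are hypothetical placeholders, not values from this topic.

```python
# Required fields per section and their expected Python types,
# mirroring the STRING / INT data types listed in this topic.
REQUIRED_FIELDS = {"apiKey": str, "serviceName": str, "servicePort": int, "domain": str}
SECTIONS = ("dashscope", "dashvector")


def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the config is valid."""
    errors = []
    for section in SECTIONS:
        values = config.get(section)
        if not isinstance(values, dict):
            errors.append(f"{section}: section is missing")
            continue
        for field, expected in REQUIRED_FIELDS.items():
            value = values.get(field)
            if value is None:
                errors.append(f"{section}.{field}: required field is missing")
            elif isinstance(value, bool) or not isinstance(value, expected):
                # bool is rejected explicitly because it is a subclass of int in Python.
                errors.append(
                    f"{section}.{field}: expected {expected.__name__}, got {type(value).__name__}"
                )
    return errors


# Placeholder values for illustration only (assumptions, not real endpoints or keys).
example_config = {
    "dashscope": {
        "apiKey": "YOUR_DASHSCOPE_API_KEY",
        "serviceName": "dashscope",
        "servicePort": 443,
        "domain": "dashscope.example.com",
    },
    "dashvector": {
        "apiKey": "YOUR_DASHVECTOR_API_KEY",
        "serviceName": "dashvector",
        "servicePort": 443,
        "domain": "dashvector.example.com",
    },
}

problems = validate_config(example_config)
```

A check like this catches the most common mistakes early, such as writing the port as a string or omitting one of the two sections.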

Example

The CEC-Corpus dataset contains the corpus and labeled data of 332 news reports about emergencies. In this example, the original text of the news reports is extracted, vectorized, and then added to Alibaba Cloud DashVector.
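The ingestion step described above (extract text, vectorize it, add it to the vector store) can be sketched as follows. This is an illustration under stated assumptions: `vectorize` is a toy feature-hashing function standing in for a real embedding model, the in-memory `index` dictionary stands in for a DashVector collection, and the two sample reports are hypothetical placeholders rather than actual CEC-Corpus content.

```python
import hashlib
import re

DIMENSION = 8  # assumption: real embedding models use far larger dimensions


def vectorize(text: str) -> list[float]:
    """Toy feature-hashing vectorizer: each token increments one of DIMENSION buckets."""
    vector = [0.0] * DIMENSION
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % DIMENSION] += 1.0
    return vector


def ingest(reports: dict[str, str], index: dict[str, dict]) -> None:
    """Vectorize each report and store the vector together with the raw text."""
    for report_id, text in reports.items():
        index[report_id] = {"vector": vectorize(text), "fields": {"raw": text}}


# Hypothetical placeholder reports (not taken from the CEC-Corpus dataset).
reports = {
    "report-001": "An earthquake struck the coastal region.",
    "report-002": "A fire broke out in a warehouse on Monday.",
}

index: dict[str, dict] = {}  # stands in for a vector database collection
ingest(reports, index)
```

Storing the raw text next to each vector matters at query time: the retrieval step returns vectors by similarity, but it is the stored text that is passed to the model as context.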