
Integrate OpenSearch with an LLM to build an enterprise-specific intelligent Q&A system

Last Updated: Jun 27, 2023

Background information

With the rise of generative AI, which points to the future direction of artificial intelligence, more and more enterprises are exploring how to apply this technology to real business. Although generative AI performs well on general tasks, it still cannot give accurate answers in vertical business domains. Building an enterprise-specific Q&A system on domain knowledge, with controllable generated content, is therefore the key to serving vertical fields.

Alibaba Cloud OpenSearch is a one-stop intelligent search development platform with high-performance vector retrieval capabilities. By combining vector retrieval with large language models (LLMs), it can support reliable intelligent Q&A solutions in vertical domains that can be quickly applied to business scenarios. This article describes in detail how to quickly build an enterprise intelligent Q&A system with the "OpenSearch Vector Search Edition + LLM" solution.

Enterprise-specific model solution

1. Introduction to the Solution

The OpenSearch + LLM solution consists of two parts: first, business data is preprocessed and vectorized; second, an online search service performs retrieval and content generation.

1.1. Preprocessing of Business Data


First, it is necessary to vectorize the business data and then construct a vector index.

Step 1: Import the textual business data into the text vectorization model to obtain the vector form of the business data.

Step 2: Import the vectorized business data into the OpenSearch vector retrieval version to construct a vector index.
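The two preprocessing steps above can be sketched in Python. Here, embed_text is a deterministic stand-in for a real text-vectorization model, and the record layout (a primary key plus a multi-value float vector field) mirrors the index schema configured later in the OpenSearch console; the field names are illustrative:

```python
import hashlib

def embed_text(text: str, dim: int = 8) -> list[float]:
    # Placeholder for a real text-vectorization model (e.g. an embedding API).
    # Deterministically maps text to a fixed-length float vector for illustration.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]

def build_vector_records(docs: dict[str, str]) -> list[dict]:
    # Step 1: vectorize each business document.
    # Step 2: shape records for pushing into the vector index
    #         (primary key field + multi-value float vector field).
    return [{"id": doc_id, "vector": embed_text(body)} for doc_id, body in docs.items()]

records = build_vector_records({"doc-1": "OpenSearch is a search platform."})
print(len(records[0]["vector"]))  # vector length must match the index's configured dimension
```

In a real deployment, the vector length must equal the dimension configured on the index (1536 for text-embedding-ada-002, as shown later in this article).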

1.2. Online Q&A search service


At query time, after retrieval is performed, the Top N search results are combined into a prompt and passed to the LLM Q&A model, which returns the final Q&A result.

Step 1: Vectorize the query input text from the end user to obtain the vectorized form of the user query.

Step 2: Input the vectorized user query into the OpenSearch Vector Retrieval Edition.

Step 3: Use the built-in vector retrieval engine in OpenSearch Vector Retrieval Edition to obtain the Top N search results in the business data.

Step 4: Integrate the Top N search results as prompts and input them into the LLM Q&A model.

Step 5: Return the Q&A results generated by the Q&A model and the search results obtained by vector retrieval to the end user.
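Steps 4 and 5 can be sketched as follows. The prompt template loosely follows the reference prompt shown later in this article, and call_llm is a hypothetical stand-in for the LLM Q&A model:

```python
def build_prompt(query: str, top_n_results: list[str]) -> str:
    # Step 4: integrate the Top N vector-retrieval results into a single prompt.
    context = "\n".join(top_n_results)
    return (
        "Human: Answer questions based on search results.\n"
        f"Search Results: {context}\n"
        f"Query: {query}\nAssistant:"
    )

def answer(query: str, top_n_results: list[str], call_llm) -> dict:
    # Step 5: return both the generated answer and the raw search results.
    prompt = build_prompt(query, top_n_results)
    return {"result": call_llm(prompt), "search_results": top_n_results}

# Usage with a dummy LLM callable standing in for the real model:
out = answer("What is OpenSearch?", ["OpenSearch is a search platform."], lambda p: "stub answer")
```

Returning the raw search results alongside the generated text lets the end user verify the answer against its sources.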

2. Benefits

Advantage 1: High performance, powered by a self-developed high-performance vector retrieval engine.

OpenSearch Vector Retrieval Edition delivers millisecond-level responses on datasets with hundreds of billions of entries, and real-time data updates become visible within seconds. Its retrieval performance is several times higher than that of open-source vector search engines, and its recall rate is significantly better in high-QPS scenarios.

OpenSearch vector search version vs. open source engine performance: medium data scenarios


Data source: Alibaba Intelligent Engine Division team, November 2022

OpenSearch vector search version vs. open source engine performance: big data scenarios


Data source: Alibaba Intelligent Engine Division team, November 2022

Advantage 2: Low cost, achieved through multiple storage and resource optimizations.

  • Data compression: original data is stored in float format and then compressed with efficient algorithms such as zstd to reduce storage costs.

  • Refined index structure design: different optimization strategies are applied to different index types to reduce index size.

  • Non-full memory loading: indexes are loaded through non-locking mmap, which effectively reduces memory overhead.

  • Engine advantages: the OpenSearch Vector Retrieval Edition engine itself produces smaller indexes and consumes fewer GPU resources. With the same data, its memory usage is only about 50% of that of open-source vector retrieval engines.
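The mmap-based, non-locking loading mentioned above can be illustrated with a generic sketch (an illustration of the operating-system mechanism, not OpenSearch internals): the index file is mapped into the address space, and pages are read on demand instead of loading the whole file into memory up front:

```python
import mmap
import os
import tempfile

# Write a stand-in "index file" to disk.
path = os.path.join(tempfile.mkdtemp(), "index.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 1024 * 1024)  # 1 MiB of index data

# Map it read-only: pages are paged in lazily as they are touched,
# so resident memory stays far below the file size for sparse access.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    chunk = mm[4096:4100]  # touching this slice faults in only the pages it spans
    mm.close()
```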

Advantage 3: Flexible and fast construction of enterprise-specific intelligent Q&A solutions.

  • Stable and reliable: content is generated from customer business data rather than public data, so the output is more stable and reliable.

  • Interactive upgrade: it meets users' needs for both search and Q&A, and the Q&A-style interaction can replace the conventional search form.

Practices

1. Purchase an OpenSearch Vector Search Edition instance

For more information about how to purchase an OpenSearch Vector Search Edition instance, see Getting started for common scenarios.

2. Configure the OpenSearch Vector Search Edition instance

For a newly purchased instance, the status on the instance details page is "To be configured", and an empty cluster with the purchased number and specifications of query nodes and data nodes is automatically deployed. Before normal searches can be performed, the cluster must be configured with a data source and an index schema, and the index must be rebuilt.


2.1 Configure the API data source


After the data source is configured, click Next to configure the index schema.

2.2 Add an index table.

2.3 Select the newly created data source, configure the index table, and select the general template.

2.4 Set the fields. At least two fields must be defined: a primary key field and a vector field. The vector field must be a multi-value float type.

Note: The field names and types must be set as specified; otherwise, data cannot be pushed automatically.

2.5 Set the index. Set the index type of the primary key field to PRIMARYKEY64 and select CUSTOMIZED as the vector index type. Then add the index fields, including the vector field.

2.6 Perform the advanced configuration of the vector index. You can refer to the following vector index parameter settings; for details, see Vector indexes.


Configure the vector dimension according to the selected embedding model. In this example, the text-embedding-ada-002 model is used, so the dimension is set to 1536. Set enable_rt_build to true to enable real-time index building.
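Based on the parameters named in this example, the advanced configuration contains entries along the following lines (illustrative only; see Vector indexes for the exact parameter names and schema):

```
dimension: 1536        # must match the embedding model; text-embedding-ada-002 outputs 1536-dimensional vectors
enable_rt_build: true  # enables real-time index building
```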

2.7 After completing the configuration in step 2.6, click Save Version, optionally fill in a comment in the pop-up window, and click Publish.

After the index is published, click Next to rebuild the index.

2.8 Rebuild the index: select the parameters to be configured for the rebuild and click Next.

2.9 You can check the progress of index rebuilding under Operations Center > Historical Changes > Data Source Changes. After it completes, you can run query tests.

3. Vector Retrieval Edition + Large Model System Construction

3.1 Download the large model tool OpenSearch-LLM and unzip it to the llm directory.

3.2 Configure the Chat-related information in the .env file in the llm directory. You must purchase the Chat-related services yourself.

  • Using OpenAI:

LLM_NAME=OpenAI

OPENAI_API_KEY=***
OPENAI_API_BASE=***

OPENAI_EMBEDDING_ENGINE=text-embedding-ada-002
OPENAI_CHAT_ENGINE=gpt-3.5-turbo

VECTOR_STORE=OpenSearch
# OpenSearch information
OPENSEARCH_ENDPOINT=http://ha-cn-wwo38nf8q01.ha.aliyuncs.com/
OPENSEARCH_INSTANCE_ID=ha-cn-wwo38nf8q01
OPENSEARCH_TABLE_NAME=llm
OPENSEARCH_DATA_SOURCE=ha-cn-wwo38nf8q01_data
OPENSEARCH_USER_NAME=opensearch         #The username is the one set when purchasing the Vector Retrieval Edition.      
OPENSEARCH_PASSWORD=chat001             #The password is the one set when purchasing the Vector Retrieval Edition.
  • Using Microsoft Azure OpenAI

LLM_NAME=OpenAI

OPENAI_API_KEY=***
OPENAI_API_BASE=***
OPENAI_API_VERSION=2023-03-15-preview
OPENAI_API_TYPE=azure

# Fill in the deployment IDs of the OpenAI models in Azure.
OPENAI_EMBEDDING_ENGINE=embedding_deployment_id
OPENAI_CHAT_ENGINE=chat_deployment_id

VECTOR_STORE=OpenSearch
# OpenSearch information
OPENSEARCH_ENDPOINT=http://ha-cn-wwo38nf8q01.ha.aliyuncs.com/
OPENSEARCH_INSTANCE_ID=ha-cn-wwo38nf8q01
OPENSEARCH_TABLE_NAME=llm
OPENSEARCH_DATA_SOURCE=ha-cn-wwo38nf8q01_data
OPENSEARCH_USER_NAME=opensearch      #The username is the one set when purchasing the Vector Retrieval Edition.
OPENSEARCH_PASSWORD=chat001          #The password is the one set when purchasing the Vector Retrieval Edition.

Note: The OpenSearch information in the configuration should correspond to the Vector Retrieval Edition instance purchased earlier.
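The large model tool reads these settings from the .env file. As an illustration of how such KEY=VALUE settings can be parsed and exposed to the process (the actual tool may use a library such as python-dotenv; this parser is a simplified sketch that also strips inline # comments):

```python
import os

def load_env(text: str) -> dict[str, str]:
    # Parse KEY=VALUE lines, skipping blanks, comments, and inline comments.
    env = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

# A fragment in the same shape as the .env file above.
sample = """
LLM_NAME=OpenAI
VECTOR_STORE=OpenSearch
OPENSEARCH_TABLE_NAME=llm  # table name
"""
for key, value in load_env(sample).items():
    os.environ[key] = value  # expose the settings to the current process
```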

3.3 Data Processing and Pushing

Use the embed_files.py script in the llm directory to process user data files; Markdown and PDF formats are currently supported. After processing, the data is automatically pushed to the vector retrieval instance configured above. The following example pushes the documents in the user directory ${doc_dir} to the ha-cn-wwo38nf8q01 instance and automatically builds the index.

python -m script.embed_files -f ${doc_dir}

  • Use the -f option to specify the directory that contains the documents to be processed.
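Before embedding, long documents are typically split into chunks. The following is a simplified sketch of what a script like embed_files.py might do for Markdown input; chunking at top-level headings is an assumption for illustration, not the script's documented behavior:

```python
def chunk_markdown(text: str) -> list[str]:
    # Split a Markdown document into chunks at top-level headings,
    # so each chunk is a coherent unit to embed and index.
    chunks, current = [], []
    for line in text.splitlines():
        if line.startswith("# ") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = "# Intro\nOpenSearch overview.\n# Usage\nHow to query."
print(chunk_markdown(doc))  # one chunk per top-level heading
```

Each resulting chunk would then be vectorized and pushed to the instance as a separate record.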

3.4 Launching the Question Answering Service

cd ~/llm 
python api_demo.py

3.5 Testing with the curl Command

Test Request:

curl -H "Content-Type: application/json" http://127.0.0.1:8000/chat -d '{"query": "介绍一下opensearch"}'

Output:

{
    "success": true,
    "result": "OpenSearch is a distributed search engine developed by Alibaba. It can be used to store, process, and analyze large-scale data with high availability,scalability,and high performance.",
    "prompt": "Human: Answer questions based on search results.    Search Results: statements and provides built-in User Defined Function (UDF), allowing customers to develop their own UDFs in the form of plug-ins. OpenSearch can be used to store, process, and analyze large-scale data with high availability, scalability, and performance. You can use distributed O & M tools to deploy distributed clusters on physical machines, or use cloud-native architectures on cloud platforms. \n\n According to the above known information, concise and professional to answer the user's questions. If you can't get the answer from it, please say you don't know. You are not allowed to add fabricated elements to the answer. Please use Chinese for the answer. \n Query: Describe the OpenSearch Assistant:"
}
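A client can parse such a response as ordinary JSON; the field names below follow the sample output above:

```python
import json

# A canned response body in the same shape as the sample output above.
response_body = '''
{"success": true,
 "result": "OpenSearch is a distributed search engine developed by Alibaba.",
 "prompt": "Human: Answer questions based on search results. ..."}
'''

data = json.loads(response_body)
if data["success"]:
    answer = data["result"]  # the generated answer
else:
    answer = None            # handle the failure case in real code
```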

4. Reference prompt

{
    "prompt": "Human: Answer questions based on search results.    Search Results: 
Comparison of OpenSearch specifications and types: 1. Difference between industry algorithm version and high-performance search version: Product overview: OpenSearch introduction OpenSearch is a one-stop intelligent search business development platform based on Alibaba's independently developed large-scale distributed search engine. It realizes millisecond response of hundreds of billions of data in big data scenarios and provides search solutions for orders, logistics, insurance policies and other scenarios. Product architecture SaaS platform, developers can interact with the system through the console or API. For scenario-based configuration, developers only need to create application instances, configure data sources, configure field structures, search properties, and wait for index reconstruction to complete, and then perform search tests through the SDK or console. Compared with the industry algorithm version, the high-performance search version eliminates the heavy industry algorithm capability. On the basis of supporting the general search capability (analyzer and sorting), the high-performance search version focuses on the throughput of business queries and writes, providing developers with the capability of second-level response and real-time queries in large data set search scenarios. Product features high throughput, single table supports 10,000-level write TPS, second-level update. Safe and stable operation and maintenance for 7 × 24 hours, technical support is provided by means of online work order and telephone alarm, and a series of fault emergency response mechanisms such as perfect fault monitoring, automatic alarm, rapid positioning, etc. Based on Aliyun's AccessKeyId and AccessKeySecret security encryption pair, permission control and isolation are performed on the access interface to ensure user-level data isolation and user data security. Data redundancy backup to ensure that data will not be lost. 
Query: What versions of the OpenSearch Assistant: "
}

Demo


Summary and prospect

This topic describes how to use OpenSearch Vector Search Edition and an LLM to build an enterprise-specific Q&A system. For more information about search solutions, visit the OpenSearch product page.

In the future, OpenSearch will release an edition for intelligent search and Q&A scenarios. In addition, a SaaS-based enterprise-specific LLM that uses one-stop training will be available to build intelligent search and Q&A systems.

Note

The "open-source vector model", "large model", and similar components used in this solution come from third-party sources ("third-party models"). Alibaba Cloud cannot guarantee the compliance or accuracy of third-party models, and is not responsible for the models themselves or for any behavior or results arising from your use of them. Please evaluate them carefully before accessing and using these models. In addition, note that third-party models are subject to open-source licenses and other agreements, and you should carefully read and strictly comply with their terms.