Background information
Generative AI has gone viral, signaling the direction of AI development and revitalizing the AI industry in China. More enterprises are focusing on how to apply generative AI to actual business scenarios. Generative AI performs well on general tasks, but it still cannot provide accurate answers in vertical fields. The key to the success of generative AI in vertical fields is to build a dedicated enterprise conversational solution that is grounded in domain-specific knowledge and generates controllable results.
Alibaba Cloud OpenSearch is a one-stop development platform for intelligent search that provides high-performance vector search capabilities. Based on these capabilities and a large language model (LLM), you can build a reliable and intelligent conversational search solution for a vertical field and apply it to actual business scenarios.
This topic describes how to build an intelligent conversational search system for enterprises based on OpenSearch Vector Search Edition and an LLM.
Dedicated enterprise model-based solution
1. Overview
To build a conversational search solution based on OpenSearch Vector Search Edition and an LLM, you need to perform two steps: first, vectorize your business data; second, use an online search service to retrieve the required content and return results.
1.1. Vectorize business data
You need to vectorize business data and build a vector index.
Step 1: Feed the business data in the TEXT format into the text vectorization model to obtain vector representations of the data.
Step 2: Import the vectors into OpenSearch Vector Search Edition to build a vector index.
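The two offline steps can be sketched as follows. This is illustrative only: embed_text is a hash-based stand-in for a real text vectorization model such as text-embedding-ada-002 (which produces 1536-dimensional vectors), the document shape mirrors the index schema described later (a primary key field and a multi-value FLOAT vector field), and the actual push call to the OpenSearch Vector Search Edition instance is omitted.

```python
import hashlib
import struct

DIM = 8  # Toy dimension. A real model such as text-embedding-ada-002 uses 1536.

def embed_text(text: str) -> list[float]:
    """Stand-in for a real text vectorization model: derives a deterministic
    pseudo-vector from a hash of the text. Replace with a real embedding call."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Map each 4-byte chunk of the digest to a float in [0, 1).
    return [struct.unpack(">I", digest[i:i + 4])[0] / 2**32
            for i in range(0, DIM * 4, 4)]

def to_push_doc(doc_id: str, text: str) -> dict:
    """Shape one record for the vector index: a primary key field plus a
    multi-value FLOAT vector field, matching the index schema in this topic."""
    return {"id": doc_id, "content": text, "embedding": embed_text(text)}

docs = [to_push_doc("1", "OpenSearch is a distributed search engine."),
        to_push_doc("2", "Vector search retrieves nearest neighbors.")]
```

In the real flow, each record in docs would then be pushed to the instance so that the vector index can be built.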
1.2. Use an online search service to perform searches
After the search service is implemented, the system can answer queries based on the top N search results and the LLM.
Step 1: Feed the query of an end user into the text vectorization model to obtain the query vector.
Step 2: Send the query vector to OpenSearch Vector Search Edition.
Step 3: The built-in vector search engine of OpenSearch Vector Search Edition returns the top N search results from the business data.
Step 4: Integrate the top N search results into a prompt and pass the prompt to the integrated LLM.
Step 5: Return the answer generated by the LLM, together with the search results retrieved based on vectors, to the end user.
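The online steps can be sketched as follows, using toy three-dimensional vectors in place of real embeddings. In the actual solution the nearest-neighbor search runs inside OpenSearch Vector Search Edition; here a local cosine-similarity ranking stands in for it, and the prompt format loosely follows the reference prompt shown later in this topic.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_n(query_vec: list[float], corpus: list[tuple[str, list[float]]],
          n: int = 2) -> list[str]:
    """Return the n texts most similar to the query vector, mimicking what
    the built-in vector search engine does server-side (Step 3)."""
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:n]]

def build_prompt(query: str, results: list[str]) -> str:
    """Integrate the top N results into a prompt for the LLM (Step 4)."""
    context = " ".join(results)
    return (f"Human: Answer the question based on search results. "
            f"Search Results: {context} \nQuery: {query} Assistant: ")

# Toy vectors stand in for real embeddings of the corpus and the query.
corpus = [("OpenSearch is a distributed search engine.", [0.9, 0.1, 0.0]),
          ("Vector indexes support real-time builds.",   [0.2, 0.8, 0.1]),
          ("Unrelated text about cooking.",              [0.0, 0.1, 0.9])]
query_vec = [0.8, 0.2, 0.0]  # Pretend embedding of the end user's query.
prompt = build_prompt("Can you introduce OpenSearch?", top_n(query_vec, corpus))
```

The resulting prompt would then be sent to the LLM, and the generated answer returned to the end user (Step 5).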
2. Benefits
High performance: Alibaba Cloud-developed high-performance vector search engine
OpenSearch Vector Search Edition can retrieve from hundreds of billions of data records within milliseconds, update data in real time, and return results within seconds.
The search performance of OpenSearch Vector Search Edition is several times higher than that of open source vector search engines, and its recall rate is significantly higher in high-queries-per-second (QPS) scenarios.
Comparison of OpenSearch Vector Search Edition and an open source vector search engine in a medium data scenario
Source: Alibaba intelligent engine team, November 2022.
Comparison of OpenSearch Vector Search Edition and an open source vector search engine in a big data scenario
Source: Alibaba intelligent engine team, November 2022.
Low costs: multiple methods to reduce storage costs and resource consumption
Data compression: OpenSearch Vector Search Edition can convert raw data to the FLOAT data type for storage and then compress the data by using efficient algorithms such as ZSTD. This reduces storage costs.
Fine-grained index schema design: A variety of optimization strategies are provided to reduce the size of different types of indexes.
Partial index loading: You can use the mmap loading policy to load indexes without locking them in memory, which effectively reduces memory overhead.
Fully featured engine: Compared with open source vector search engines, the engine of OpenSearch Vector Search Edition builds smaller indexes and consumes fewer GPU resources. With the same data, OpenSearch Vector Search Edition occupies only about 50% of the memory that an open source vector search engine occupies.
Flexibility: quickly build an intelligent conversational search system for your enterprise
Stability and reliability: The system generates content from your business data instead of public data. The output results are more stable and reliable.
Upgraded interaction: The system supports both search and Q&A. End users can search for content in a conversational way instead of a traditional way.
Procedure
1. Purchase an OpenSearch Vector Search Edition instance
For more information, see Getting started for common scenarios. Record the username and password that you specify; they are required in later steps.
2. Configure the OpenSearch Vector Search Edition instance
On the details page of the purchased instance, the instance is in the Pending Configuration state, and an empty cluster is automatically deployed for it. The numbers and specifications of Query Result Searcher (QRS) workers and Searcher workers in the cluster are those that you specified when you purchased the instance. Before you can use the search service, you must configure a data source, configure an index schema, and rebuild indexes for the cluster.
2.1 Configure an API data source.
After the data source is configured, click Next to configure the index schema.
2.2 Add an index table.
2.3 Select the created data source and configure the index table. Set the Select Template parameter to Common Template.
2.4 Configure fields. You must define at least a primary key field and a vector field. The vector field must be of the multi-value FLOAT type.
Note: You must configure the field names and types based on the preceding figure. Otherwise, the data cannot be automatically pushed.
2.5 Set the Type parameter of the primary key field to PRIMARYKEY64. Set the Type parameter of the vector field to CUSTOMIZED.
Note: The name of the vector index must be embedding_index.
Add fields to the vector field.
2.6 Configure the advanced settings of the vector index. You can refer to the parameter settings shown in the following figure to configure the vector index. For more information, see Vector indexes.
Set the dimension parameter based on the vector model that you use. In this example, the text-embedding-ada-002 embedding model is used, so the dimension parameter is set to 1536. The enable_rt_build parameter is set to true so that OpenSearch can build indexes in real time.
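For reference, the two advanced settings called out above can be written as the following fragment. This is illustrative only: the exact set of keys and their format depend on the console version and on the Vector indexes documentation, and only the dimension and enable_rt_build parameters are taken from this topic.

```json
{
  "dimension": "1536",
  "enable_rt_build": "true"
}
```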
2.7 After you perform the preceding steps, click Save Edition. In the dialog box that appears, configure the Description parameter based on your business requirements. Then, click Publish.
After the index is published, click Next to rebuild the index.
2.8 Configure the parameters that are required to rebuild the index. Then, click Next.
2.9 In the left-side navigation pane, choose O&M Center > Change History. On the page that appears, click the Data Source Changes tab. You can check the rebuild progress on the page that appears. After the index is rebuilt, you can perform query tests.
3. Build a search service based on OpenSearch Vector Search Edition and an LLM
3.1 Download OpenSearch-LLM and decompress the package to the llm directory.
3.2 Configure the Chat-related parameters in the .env file in the llm directory. Before you configure the parameters, you must purchase the Chat service.
Use Chat
LLM_NAME=OpenAI
OPENAI_API_KEY=***
OPENAI_API_BASE=***
OPENAI_EMBEDDING_ENGINE=text-embedding-ada-002
OPENAI_CHAT_ENGINE=gpt-3.5-turbo
VECTOR_STORE=OpenSearch
# OpenSearch information
OPENSEARCH_ENDPOINT=ha-cn-wwo38nf8q01.ha.aliyuncs.com
OPENSEARCH_INSTANCE_ID=ha-cn-wwo38nf8q01
OPENSEARCH_TABLE_NAME=llm
OPENSEARCH_DATA_SOURCE=ha-cn-wwo38nf8q01_data
OPENSEARCH_USER_NAME=opensearch # The username that you specified when you purchased the OpenSearch Vector Search Edition instance.
OPENSEARCH_PASSWORD=chat001 # The password that you specified when you purchased the OpenSearch Vector Search Edition instance.
Use Azure OpenAI
LLM_NAME=OpenAI
OPENAI_API_KEY=***
OPENAI_API_BASE=***
OPENAI_API_VERSION=2023-03-15-preview
OPENAI_API_TYPE=azure
# Specify the deployment of the OpenAI model in Azure.
OPENAI_EMBEDDING_ENGINE=embedding_deployment_id
OPENAI_CHAT_ENGINE=chat_deployment_id
VECTOR_STORE=OpenSearch
# OpenSearch information
OPENSEARCH_ENDPOINT=ha-cn-wwo38nf8q01.ha.aliyuncs.com
OPENSEARCH_INSTANCE_ID=ha-cn-wwo38nf8q01
OPENSEARCH_TABLE_NAME=llm
OPENSEARCH_DATA_SOURCE=ha-cn-wwo38nf8q01_data
OPENSEARCH_USER_NAME=opensearch # The username that you specified when you purchased the OpenSearch Vector Search Edition instance.
OPENSEARCH_PASSWORD=chat001 # The password that you specified when you purchased the OpenSearch Vector Search Edition instance.
Note:
You need to specify the parameters related to OpenSearch based on the purchased OpenSearch Vector Search Edition instance.
The OPENSEARCH_ENDPOINT parameter specifies the endpoint used to access the OpenSearch Vector Search Edition instance over the Internet. You must add the IP address that is used to access the instance to the IP address whitelist of the instance. The value of the OPENSEARCH_ENDPOINT parameter cannot contain the http:// prefix.
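Before you start the service, it can help to sanity-check the OpenSearch-related settings against the notes above. The following sketch is not part of OpenSearch-LLM; the parameter names come from the .env example, and the checks mirror the two notes (all parameters must be set, and the endpoint must not carry an http:// prefix).

```python
def validate_opensearch_env(env: dict) -> list[str]:
    """Return a list of problems found in the OpenSearch-related settings.
    An empty list means the settings pass these basic checks."""
    problems = []
    required = ["OPENSEARCH_ENDPOINT", "OPENSEARCH_INSTANCE_ID",
                "OPENSEARCH_TABLE_NAME", "OPENSEARCH_DATA_SOURCE",
                "OPENSEARCH_USER_NAME", "OPENSEARCH_PASSWORD"]
    # Every OpenSearch parameter must be set to a non-empty value.
    for key in required:
        if not env.get(key):
            problems.append(f"{key} is not set")
    # The endpoint must not contain a scheme prefix such as http://.
    endpoint = env.get("OPENSEARCH_ENDPOINT", "")
    if endpoint.startswith(("http://", "https://")):
        problems.append("OPENSEARCH_ENDPOINT must not contain the http:// prefix")
    return problems
```

Pass the parsed .env values (for example, a dict built from os.environ) to the function and fix any problems it reports before starting the service.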
3.3 Process and push data.
Use the embed_files.py script in the llm directory to process your data files. Only Markdown and PDF files are supported. After the files are processed, they are pushed to the OpenSearch Vector Search Edition instance that you configured. In the following example, the files in the ${doc_dir} directory are processed and pushed to the ha-cn-wwo38nf8q01 instance, and an index is automatically built.
python -m script.embed_files -f ${doc_dir}
You can configure the -f parameter to specify the directory in which the files to be processed reside.
3.4 Start the intelligent conversational service.
cd ~/llm
python api_demo.py
3.5 Use the curl command to test the service.
Run the following command to send a test query:
curl -H "Content-Type: application/json" http://127.0.0.1:8000/chat -d '{"query": "Can you introduce OpenSearch?"}'
Returned result:
{
"success": true,
"result": "OpenSearch is a distributed search engine that is developed by Alibaba. OpenSearch can be used to store, process, and analyze large-scale data and features high availability, scalability, and performance.",
"prompt": "Human: Answer the question based on search results. Search Results: OpenSearch is a distributed search engine that is developed by Alibaba. OpenSearch supports SQL query statements and provides built-in user-defined functions (UDFs). OpenSearch also allows you to develop your own UDFs in the form of plug-ins. OpenSearch can be used to store, process, and analyze large-scale data and features high availability, scalability, and performance. You can deploy OpenSearch by using a distributed O&M tool. You can deploy a distributed cluster on physical machines or use a cloud-native architecture to deploy OpenSearch on a cloud platform. \n\n Briefly and professionally answer the question based on the preceding information. If you do not know the answer, return \"I do not know\". Do not fabricate a fake answer. \n Query: Can you introduce OpenSearch? Assistant: "
}
4. Reference prompt
{
"prompt": "Human: Answer the question based on search results. Search Results: Comparison of OpenSearch Industry Algorithm Edition and OpenSearch High-performance Search Edition: Overview: OpenSearch is a one-stop intelligent search development platform based on the large-scale distributed search engine independently developed by Alibaba Cloud. In big data scenarios, OpenSearch can retrieve hundreds of billions of data within milliseconds. OpenSearch can provide search solutions for multiple scenarios, such as the order, logistics, and insurance scenarios. OpenSearch is a Software as a Service (SaaS) platform. You can interact with the system by using the OpenSearch console or API operations. You can configure scenarios in an easy manner by creating application instances. After you configure data sources, field structure, and search attributes, you can perform search tests by using OpenSearch SDK or the OpenSearch console. In terms of big data search, OpenSearch High-performance Search Edition removes complex industry algorithm capabilities and supports general search capabilities compared with OpenSearch Industry Algorithm Edition. The general search capabilities are based on analyzers and sorting capabilities. OpenSearch High-performance Search Edition focuses on the throughput of business queries and writes. This way, OpenSearch can respond within seconds and you can query data in real time in a big data search scenario. OpenSearch provides high throughput. A single table supports ten thousand write transactions per second (TPS) and can be updated within seconds. OpenSearch offers 24/7 support to ensure service stability and security by using tickets and phone calls. OpenSearch provides comprehensive fault and emergency response mechanisms, such as fault monitoring, automatic alerting, fast troubleshooting, and fast recovery. Alibaba Cloud assigns AccessKey IDs and AccessKey secrets to users to control permissions on OpenSearch. 
This ensures data security by isolating the data of different users. OpenSearch uses redundant and backup data to prevent data loss. Query: How many editions does OpenSearch have? Assistant: "
}
Summary and prospect
This topic describes how to use OpenSearch Vector Search Edition and an LLM to build an enterprise-specific conversational search system. For more information about search solutions, visit the OpenSearch product page.
In the future, OpenSearch will release an edition that is dedicated to conversational search. In addition, a SaaS-based, enterprise-specific LLM with one-stop training will be available for building intelligent conversational search systems.
The open source vector model and the LLM used in this solution are third-party models. Alibaba Cloud cannot guarantee the compliance or accuracy of third-party models and assumes no responsibility for them, or for your use of them and the results of that use. Therefore, proceed with caution before you access or use third-party models. In addition, note that third-party models are subject to agreements such as open source licenses, and you should carefully read and strictly abide by the provisions of these agreements.