Build a custom chatbot for free with Hologres and a large language model

This topic describes how to use the vector computing capabilities of Hologres to quickly build a custom chatbot for free, without training a model.

Background information

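Large language models (LLMs) can be applied in many industries. To customize a chatbot with an LLM, you can use prompt tuning instead of retraining the model. With prompt tuning, relevant industry-specific text is supplied to the LLM together with each question, so that the model can apply its reading comprehension and information synthesis abilities to give more accurate, domain-specific answers.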
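The prompt tuning idea above can be sketched in a few lines of Python. This is only an illustration, not the code in the sample repository: the function name `build_prompt` and the prompt wording are assumptions made for this example.

```python
from typing import List


def build_prompt(question: str, context_docs: List[str]) -> str:
    """Place retrieved domain text in front of the user's question.

    Hypothetical helper for illustration; the sample repository may
    assemble its prompt differently.
    """
    # Number each retrieved passage so the model can tell them apart.
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(context_docs))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    docs = [
        "Starting from V1.3, Hologres supports column storage optimization for the JSONB type."
    ]
    print(build_prompt("Is there a way to accelerate jsonb queries in Hologres?", docs))
```

The model itself is unchanged. Only the text sent with each question differs, which is why no training is required.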

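Hologres is a one-stop real-time data warehouse engine. It is deeply integrated with Proxima, a high-performance vector computing library developed by Alibaba DAMO Academy, which gives Hologres efficient and easy-to-use vector computing capabilities. For more information, see "Proxima vector computing".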

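In this topic, Hologres serves as the real-time vector storage and retrieval engine. You store your custom text as vectors, retrieve the text that is most relevant to each question, and feed it to the LLM. This lets you build a custom chatbot quickly.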
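To make the retrieval step concrete, the following sketch ranks stored texts by cosine similarity to a query vector. It is a brute-force, in-memory illustration with made-up two-dimensional vectors. Hologres with Proxima performs approximate search over much larger, higher-dimensional data, and the names `cosine_similarity` and `top_k` are assumptions for this example only.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[str]:
    """Return the k stored texts whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy vectors; a real embedding model produces hundreds of dimensions.
    store = [
        ("doc about JSONB", [1.0, 0.0]),
        ("doc about billing", [0.0, 1.0]),
        ("doc about column store", [0.9, 0.1]),
    ]
    print(top_k([1.0, 0.0], store, k=2))
```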

Prerequisites

You have access to an LLM. This topic applies to a variety of LLMs.
A Hologres instance is activated.

Note
We recommend an 8 CU Hologres instance, which can handle millions of vectors. To handle more vectors than that, use a Hologres instance with more than 8 CUs.
Anaconda is installed. For more information, see "Anaconda".

Procedure

Prepare the environment.
- Run the following commands in the Anaconda environment to install the required dependencies.
```
conda create --name chatbot python=3.8
conda activate chatbot

pip install langchain psycopg2-binary torch transformers modelscope bottle
pip install <LLM_dependencies>
```
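- Prepare an LLM. This topic uses an LLM that is accessible through an API as an example. Prepare the API AccessKey pair for the LLM.
- Download the code and sample data for this topic from the "Code and sample data" link, or run the following command to clone them.
  Note
  
  If you download the code and sample data from the link, decompress the package before use.
  
  If you clone the code and sample data with the following command, download and install Git first.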
```
git clone https://github.com/aliyun/alibabacloud-hologres-connectors.git
```

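Test the LLM.

First, test the original LLM by asking it specialized questions about Hologres.

In the main function of the holo-chatbot/chatbot.py file (line 107), change chat_model_name to the LLM that you want to use. The following code shows an example.

Note
You can use any LLM supported by LangChain. For more information, see "LangChain chat models". You can also implement and call the corresponding interface for other LLMs.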
```
# Select an LLM.
from langchain.chat_models import <chat_model_name>

chat_model = <chat_model_name>(<api_key>='xxx', <api_base>='xxx', temperature=0.1)
bot = Chatbot(chat_model, args.clear, args.no_vector_store)
```

Run the following commands to start the LLM.

Note

The first time you run the command, the embedding model (about 400 MB) is downloaded automatically, which can take some time. Subsequent runs skip this step.

```
# Switch to the directory where the code is located.
cd alibabacloud-hologres-connectors/holo-chatbot
# Run the following command to open the setup wizard. Press Enter to skip each setting.
python generate_config.py

# -n --no-vector-store: Directly calls the LLM for Q&A without using Hologres for vector retrieval.
python chatbot.py -n
```

The following example shows the questions and the corresponding answers from the LLM.

Human: What is Hologres?
Chatbot: Hologres is a cloud-native analytical data warehouse provided by Alibaba Cloud. It can process petabytes of data and provides high-performance query and analysis capabilities. Hologres supports the SQL and PostgreSQL protocols and is deeply integrated with the Alibaba Cloud ecosystem, such as MaxCompute and DataWorks.

Human: Is there a way to accelerate jsonb queries in Hologres?
Chatbot: Yes, Hologres provides a method called "Gin Index" to accelerate queries on the jsonb type. A Gin Index is an inverted index-based data structure that can quickly locate rows containing specific jsonb key-value pairs. In Hologres, you can use the CREATE INDEX statement to create a Gin Index. For example:
CREATE INDEX idx_gin ON my_table USING gin(my_jsonb_column);
This creates a Gin Index on the my_jsonb_column column of the my_table table. After that, you can use jsonb query operators (such as ->, ->>, #>, #>>) to query the column, and the query will be accelerated by the Gin Index.

Human: What is column-store jsonb in Hologres?
Chatbot: Column-store jsonb in Hologres is a data type that can store data in JSON format. JSON is a lightweight data interchange format commonly used in web applications. jsonb is a binary format of JSON that provides faster query speeds and requires less storage space. In Hologres, you can use jsonb columns to store and query JSON data.

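Prepare vector data in Hologres.

Create a database in Hologres. For more information, see "Create a database".

Enter the connection information.

Run the following command to open the setup wizard and enter the connection information. You can also edit the config/config.yaml file later to update the connection information manually.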

```
python generate_config.py
```

Connection information:

Parameter	Description
HOLO_ENDPOINT	The network address of the Hologres instance. To obtain it, go to the Instance Details page in the Hologres console.
HOLO_PORT	The port number of the Hologres instance. To obtain it, go to the Instance Details page in the Hologres console.
HOLO_USER	The AccessKey ID of your Alibaba Cloud account. To obtain it, go to AccessKey Management.
HOLO_PASSWORD	The AccessKey Secret of your Alibaba Cloud account. To obtain it, go to AccessKey Management.
HOLO_DATABASE	The name of the database in the Hologres instance.
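As an illustration of how these parameters map to a PostgreSQL-compatible connection, the sketch below converts them into the keyword arguments accepted by `psycopg2.connect`. The helper name `to_connect_kwargs` and the flat dictionary layout are assumptions for this example. The actual structure of config/config.yaml and the way the sample code opens its connection may differ.

```python
from typing import Dict, Mapping

# Parameter names taken from the table above.
REQUIRED_KEYS = (
    "HOLO_ENDPOINT",
    "HOLO_PORT",
    "HOLO_USER",
    "HOLO_PASSWORD",
    "HOLO_DATABASE",
)


def to_connect_kwargs(cfg: Mapping[str, object]) -> Dict[str, object]:
    """Translate the HOLO_* settings into psycopg2-style connection arguments.

    Hypothetical helper; it assumes cfg is a flat mapping of the five settings.
    """
    missing = [key for key in REQUIRED_KEYS if not cfg.get(key)]
    if missing:
        raise ValueError(f"missing connection settings: {', '.join(missing)}")
    return {
        "host": cfg["HOLO_ENDPOINT"],
        "port": int(cfg["HOLO_PORT"]),
        "user": cfg["HOLO_USER"],
        "password": cfg["HOLO_PASSWORD"],
        "dbname": cfg["HOLO_DATABASE"],
    }


if __name__ == "__main__":
    example = {
        "HOLO_ENDPOINT": "example-endpoint.invalid",  # placeholder, not a real address
        "HOLO_PORT": "80",
        "HOLO_USER": "your-access-key-id",
        "HOLO_PASSWORD": "your-access-key-secret",
        "HOLO_DATABASE": "chatbot_db",
    }
    kwargs = to_connect_kwargs(example)
    print({k: v for k, v in kwargs.items() if k != "password"})
    # With psycopg2 installed and real values, a connection could be opened with:
    #   import psycopg2
    #   conn = psycopg2.connect(**kwargs)
```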

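Prepare the corpus.

This topic uses the example_data.csv file in the data directory of the downloaded code and sample data as an example.

The corpus file must be in CSV format and contain a title column and a content column.

This topic uses selected parts of the Hologres user manual as the corpus. The following table shows sample content. You can also build your own corpus in the same format as the sample file.

title	content
Who are you	I am the Holo chatbot, an AI chat robot. I do my best to answer your questions based on the document database that I am given.
How are you implemented and what is your architecture	I am developed using the Langchain framework. Langchain is an easy-to-use Python development framework for large language models that integrates many practical tools. My backend uses Hologres for vector data storage, which can store a large amount of text and vector data and perform fast approximate vector queries to retrieve the most relevant documents. Finally, I implement Q&A by calling the LLM API.
What is the real-time data warehouse Hologres	Hologres is a one-stop real-time data warehouse (Real-Time Data Warehouse) engine developed by Alibaba. It supports real-time writing, updating, processing, and analysis of massive data...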
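Before importing a custom corpus, it can help to confirm that the file has the two required columns. The following sketch uses only the Python standard library. The function name `load_corpus` is an assumption for this example and is not part of the sample repository.

```python
import csv
import io
from typing import Dict, List, TextIO

REQUIRED_COLUMNS = {"title", "content"}


def load_corpus(handle: TextIO) -> List[Dict[str, str]]:
    """Read a corpus CSV and fail early if title or content is missing."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"corpus is missing columns: {sorted(missing)}")
    # Keep only the two columns the chatbot corpus format requires.
    return [{"title": row["title"], "content": row["content"]} for row in reader]


if __name__ == "__main__":
    sample = io.StringIO("title,content\nWho are you,I am the Holo chatbot.\n")
    print(load_corpus(sample))
```

To check a file on disk, open it with `open("data/example_data.csv", newline="", encoding="utf-8")` and pass the handle to `load_corpus`. The UTF-8 encoding is an assumption about the sample file.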

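Vectorize the corpus data and import it into Hologres.

Run the following command to convert the corpus data into vectors (text embeddings) and write them to Hologres.

Note

This topic uses CoROM, an open source Chinese embedding model from DAMO Academy on the ModelScope platform, to convert the corpus into 768-dimensional vector data. For more information, see "CoROM".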

```
python chatbot.py -l
```

You can query the converted vector data in the langchain_embedding table in Hologres:

```
SELECT * FROM langchain_embedding LIMIT 1;

-- Sample result
id|embedding|metadata|document
2419815e-1407-11ee-bee5-acde48001122|{0.395261,0.123794,0.761932,0.413286,...}|{"source": "data/example_data.csv", "row": 1}|title: How are you implemented and what is your architecture content: I am developed using the Langchain framework. Langchain is an easy-to-use Python development framework for large language models that integrates many practical tools. My backend uses Hologres for vector data storage, which can store a large amount of text and vector data and perform fast approximate vector queries to retrieve the most relevant documents. Finally, I implement Q&A by calling the LLM API.
```

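Verify the effect of combining the LLM with the corpus.

Use the vector data from the preceding corpus to supply relevant text to the LLM for prompt tuning. Run the following command to start the LLM, and then ask the same questions again.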

```
python chatbot.py
```

Question 1: Compared with the original LLM, the corpus-optimized answer summarizes the advantages of Hologres more fully and more accurately.

Human: What is Hologres
Chatbot: Hologres is a one-stop real-time data warehouse (Real-Time Data Warehouse) engine developed by Alibaba. It supports real-time writing, updating, processing, and analysis of massive data. Hologres supports standard SQL and is compatible with the PostgreSQL protocol, syntax, and most functions. It also supports petabyte-level multidimensional analysis (OLAP), ad hoc analysis, and online data services (Serving) with high concurrency and low latency. The engine provides fine-grained isolation for various workloads and enterprise-level security. Hologres is deeply integrated with MaxCompute, Flink, and DataWorks to offer an all-in-one data warehouse solution for both offline and online scenarios. Hologres is designed to be a high-performance, reliable, low-cost, and scalable real-time data warehouse engine. It provides solutions for massive real-time data and interactive query services with sub-second latency. Common use cases include building real-time data mid-ends, fine-grained analysis, self-service analysis, marketing profiles, audience segmentation, and real-time risk control.

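Question 2: The optimized LLM picks up the core information about how Hologres accelerates queries on the JSONB type, and gives a complete summary that includes the version information and the underlying principle.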

Human: Is there a way to accelerate jsonb queries in Hologres?
Chatbot: Yes. Starting from V1.3, Hologres supports column storage optimization for the JSONB type. This optimization reduces the storage size of JSONB data and accelerates queries. When you enable this feature, the system automatically converts the JSONB column to a column store with a strong schema at the underlying layer. When you query a value in the JSONB data, the specified column can be directly hit, which improves query performance. Because the values are stored in a columnar format, the storage layer achieves the same storage and compression efficiency as regular structured data. This effectively reduces storage costs.

Question 3: The optimized LLM correctly interprets "column store" as column storage optimization and gives a complete summary of how column-store JSONB works.

Human: What is column-store jsonb in Hologres?
Chatbot: Column-store JSONB in Hologres means that data of the JSONB type is stored in a columnar format. Each key-value pair in the JSONB data is split into a separate column for storage, instead of storing the entire JSONB object as a whole. This storage method improves query efficiency because queries only need to access the required columns, rather than scanning the entire JSONB object. Column-store JSONB also effectively reduces storage space because the same value can be repeatedly stored in different rows, achieving data compression. Starting from V1.3, Hologres supports column storage optimization for the JSONB type.

As these answers show, after optimization the LLM answers specialized questions about Hologres much more accurately than the original LLM.

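(Optional) Connect the chatbot to a DingTalk group chat.

You can connect the chatbot to a DingTalk group chat. You can also follow the approach in dingding_server.py to connect the chatbot to other services.
1. Open the DingTalk group chat and add a custom bot.
  
  In Security Settings, select Add Signature and enable the Outgoing mechanism. For the Outgoing mechanism, enter the IP address of the server where the custom chatbot is deployed and a token of your choice. Use the access_token of the Webhook and the secret as the DINGDING_TOKEN and DINGDING_SECRET environment variables in the connection information (config/config.yaml).
2. Run the following command to deploy an HTTP service on the server.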
```
# This script runs in the background to serve HTTP requests from the DingTalk robot. The URL is http://localhost:8889/chat
python dingding_server.py > log.txt 2>&1 &
```
3. Connect to the DingTalk group chat and start chatting.
  
  In the DingTalk group chat, @mention the chatbot to start a conversation.
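For background on the Add Signature setting, the sketch below shows the signing scheme that DingTalk documents for custom bots, as far as this example assumes: the timestamp in milliseconds and the secret are joined with a newline, signed with HMAC-SHA256 using the secret as the key, Base64-encoded, and URL-encoded. Whether dingding_server.py computes the signature in exactly this way is not confirmed here, and the function name `dingtalk_sign` is hypothetical.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def dingtalk_sign(timestamp_ms: int, secret: str) -> str:
    """Compute a URL-safe signature from a millisecond timestamp and the bot secret.

    Assumed scheme: Base64(HMAC-SHA256(key=secret, msg=f"{timestamp}\\n{secret}")),
    then URL-encoded.
    """
    string_to_sign = f"{timestamp_ms}\n{secret}"
    digest = hmac.new(
        secret.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return urllib.parse.quote_plus(base64.b64encode(digest))


if __name__ == "__main__":
    now_ms = int(time.time() * 1000)
    # "SEC-example-secret" is a placeholder; use the value stored as DINGDING_SECRET.
    print(now_ms, dingtalk_sign(now_ms, "SEC-example-secret"))
```

A receiving service can recompute the signature from the timestamp it is given and compare the two values with `hmac.compare_digest` to avoid timing differences.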