
Hologres: Build a private chatbot with Hologres

Last Updated: Mar 11, 2026

This topic describes how to quickly build a custom chatbot for free by leveraging the vector computation capabilities of Hologres. No model training is required.

Background information

Large language models (LLMs) can be applied across many industries. To customize a chatbot with an LLM, you can use prompt tuning instead of retraining the model. Prompt tuning supplies the LLM with relevant industry-specific text when a question is asked, enabling it to leverage its reading comprehension and information integration abilities to deliver more accurate answers tailored to that specific domain.

Hologres is a one-stop real-time data warehouse engine. It is deeply integrated with Proxima, a high-performance vector computation software library developed by Alibaba DAMO Academy. This integration delivers efficient and easy-to-use vector computation capabilities. For more information, see Proxima vector computation.

You can use Hologres as a real-time vector storage and retrieval engine: store your custom text as vectors, retrieve the entries most relevant to each question, and feed them to an LLM to quickly build a custom chatbot.
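The retrieve-then-prompt pattern described above can be sketched in a few lines of plain Python. The snippet below is an illustrative sketch only: it uses toy 3-dimensional vectors and an in-memory list in place of Hologres and the real 768-dimensional embeddings, and all function and variable names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(question_vec, store, top_k=1):
    """Return the top_k documents whose vectors are closest to the question."""
    ranked = sorted(store,
                    key=lambda d: cosine_similarity(question_vec, d["embedding"]),
                    reverse=True)
    return [d["document"] for d in ranked[:top_k]]

# Toy 3-dimensional vectors stand in for real 768-dimensional embeddings.
store = [
    {"embedding": [0.9, 0.1, 0.0], "document": "Hologres is a real-time data warehouse."},
    {"embedding": [0.0, 0.2, 0.9], "document": "DingTalk is a messaging app."},
]

# Embed the question (here: a pre-made toy vector), fetch the nearest
# document, and splice it into the prompt sent to the LLM.
question_vec = [0.8, 0.2, 0.1]
context = retrieve(question_vec, store)[0]
prompt = f"Answer based on this context: {context}\nQuestion: What is Hologres?"
```

In the real project, the vector store is a Hologres table queried via Proxima rather than an in-memory list, but the flow is the same.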

Prerequisites

  • You have access to an LLM. This topic applies to various LLMs.

  • A Hologres instance is activated.

    Note

    We recommend that you use a Hologres instance with 8 CUs, which can handle millions of vectors. If you need to process vectors beyond that scale, use an instance with more than 8 CUs.

  • You have installed Anaconda. For more information, see Anaconda.

Procedure

  1. Prepare the environment.

    • Run the following commands in the Anaconda environment to install the required dependencies:

      conda create --name chatbot python=3.8
      conda activate chatbot
      
      pip install langchain psycopg2-binary torch transformers modelscope bottle
      pip install <LLM_dependencies>
    • Prepare the LLM. This topic uses an LLM accessible through an API as an example. Prepare the API key (AccessKey pair) required to call the LLM.

    • Click Code and sample data or run the following command to clone the code and sample data for this topic:

      Note
      • If you download the code and sample data from the link, decompress the package before using it.

      • If you use the following command to clone the code and sample data, first download and install Git.

      git clone https://github.com/aliyun/alibabacloud-hologres-connectors.git
  2. Test the LLM.

    First, test the original LLM by asking it professional questions about Hologres.

    1. In the main function (line 107) of the holo-chatbot/chatbot.py file, change chat_model_name to the LLM you want to use. The following code provides an example:

      Note

      You can use any LLM supported by LangChain. For more information, see LangChain chat models. You can also implement the corresponding interfaces for other LLMs and then call them.

      # Select an LLM.
      from langchain.chat_models import <chat_model_name>
      
      chat_model = <chat_model_name>(<api_key>='xxx', <api_base>='xxx', temperature=0.1)
      bot = Chatbot(chat_model, args.clear, args.no_vector_store)
    2. Run the following command to start the LLM:

      Note

      The first time you run the command, the embedding model (approximately 400 MB) is automatically downloaded. This may take a long time. Subsequent calls do not require this step.

      # Switch to the directory where the code is located.
      cd alibabacloud-hologres-connectors/holo-chatbot
      # Run the following command to open the setup wizard. Press Enter to skip each setting.
      python generate_config.py
      
      # -n --no-vector-store: Directly calls the LLM for Q&A without using Hologres for vector retrieval.
      python chatbot.py -n
    3. The following examples show questions and the corresponding answers from the LLM:

      • Human: What is Hologres?
        Chatbot: Hologres is a cloud-native analytical data warehouse provided by Alibaba Cloud. It can process petabytes of data and provides high-performance query and analysis capabilities. Hologres supports the SQL and PostgreSQL protocols and is deeply integrated with the Alibaba Cloud ecosystem, such as MaxCompute and DataWorks.
      • Human: Is there a way to accelerate jsonb queries in Hologres?
        Chatbot: Yes, Hologres provides a method called "Gin Index" to accelerate queries on the jsonb type. A Gin Index is an inverted index-based data structure that can quickly locate rows containing specific jsonb key-value pairs. In Hologres, you can use the CREATE INDEX statement to create a Gin Index. For example:
        CREATE INDEX idx_gin ON my_table USING gin(my_jsonb_column);
        This creates a Gin Index on the my_jsonb_column column of the my_table table. After that, you can use jsonb query operators (such as ->, ->>, #>, #>>) to query the column, and the query will be accelerated by the Gin Index.
      • Human: What is column-store jsonb in Hologres?
        Chatbot: Column-store jsonb in Hologres is a data type that can store data in JSON format. JSON is a lightweight data interchange format commonly used in web applications. jsonb is a binary format of JSON that provides faster query speeds and requires less storage space. In Hologres, you can use jsonb columns to store and query JSON data.
  3. Prepare vector data in Hologres.

    1. Create a database in Hologres. For more information, see Create a database.

    2. Complete the connection information.

      Run the following command to open the setup wizard and enter the connection information. You can also manually update the connection information later by editing the config/config.yaml file.

      python generate_config.py

      Connection information:

      Parameter       Description
      HOLO_ENDPOINT   The network address of the Hologres instance. You can obtain it on the Instance Details page in the Hologres console.
      HOLO_PORT       The port number of the Hologres instance. You can obtain it on the Instance Details page in the Hologres console.
      HOLO_USER       The AccessKey ID of your Alibaba Cloud account. You can obtain it in AccessKey Management.
      HOLO_PASSWORD   The AccessKey secret of your Alibaba Cloud account. You can obtain it in AccessKey Management.
      HOLO_DATABASE   The name of the database in the Hologres instance.
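      To illustrate how these settings are used, the following sketch assembles a libpq-style connection string of the kind psycopg2 accepts. All values shown are hypothetical placeholders; in the actual project, the values come from config/config.yaml after you run generate_config.py.

```python
# All values below are placeholders for illustration only; real values
# come from config/config.yaml after running generate_config.py.
config = {
    "HOLO_ENDPOINT": "hgprecn-cn-example-cn-hangzhou.hologres.aliyuncs.com",
    "HOLO_PORT": 80,
    "HOLO_USER": "your-accesskey-id",
    "HOLO_PASSWORD": "your-accesskey-secret",
    "HOLO_DATABASE": "chatbot_db",
}

def build_dsn(cfg):
    """Build a libpq-style connection string accepted by psycopg2.connect()."""
    return (
        f"host={cfg['HOLO_ENDPOINT']} port={cfg['HOLO_PORT']} "
        f"dbname={cfg['HOLO_DATABASE']} user={cfg['HOLO_USER']} "
        f"password={cfg['HOLO_PASSWORD']}"
    )
```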

    3. Prepare the corpus.

      This topic uses the example_data.csv file in the data directory of the downloaded code and sample data as an example.

      • The corpus file must be in CSV format and contain the title and content columns.

      • This topic uses selected Hologres user documentation as the corpus. The following table shows sample content. You can also customize the corpus based on the format of the sample file.

        title: Who are you
        content: I am Holo chatbot, an AI chat robot. I will do my best to answer your questions based on the provided document database.

        title: How are you implemented and what is your architecture
        content: I am developed using the Langchain framework. Langchain is an easy-to-use Python development framework for large language models that integrates many practical tools. My backend uses Hologres for vector data storage, which can store a large amount of text and vector data and perform fast approximate vector queries to retrieve the most relevant documents. Finally, I implement Q&A by calling the LLM API.

        title: What is the real-time data warehouse Hologres
        content: Hologres is a one-stop real-time data warehouse (Real-Time Data Warehouse) engine independently developed by Alibaba. It supports real-time writing, real-time updates, real-time processing, and real-time analysis of massive data...

      • Vectorize the corpus data and import it into Hologres:

        Run the following command to convert the corpus data into vectors (text embeddings) and write them to Hologres.

        Note

        This topic uses CoROM, an open source Chinese embedding model from DAMO Academy on the ModelScope platform, to convert the corpus into 768-dimensional vector data. For more information, see CoROM.

        python chatbot.py -l

        You can query the converted vector data in the langchain_embedding table of Hologres:

        SELECT * FROM langchain_embedding LIMIT 1;
        
        --Sample result
        id|embedding|metadata|document
        2419815e-1407-11ee-bee5-acde48001122|{0.395261,0.123794,0.761932,0.413286,...}|{"source": "data/example_data.csv", "row": 1}|title: How are you implemented and what is your architecture content: I am developed using the Langchain framework. Langchain is an easy-to-use Python development framework for large language models that integrates many practical tools. My backend uses Hologres for vector data storage, which can store a large amount of text and vector data and perform fast approximate vector queries to retrieve the most relevant documents. Finally, I implement Q&A by calling the LLM API.
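        For illustration, the following self-contained sketch mimics what `chatbot.py -l` does conceptually: it reads title/content records and shapes them into rows with the same fields as the langchain_embedding table. The `embed` function here is a hypothetical stand-in for the real CoROM model and returns a tiny fake vector so the sketch stays runnable.

```python
import csv
import io
import uuid

def embed(text):
    # Stand-in for the real 768-dimensional CoROM embedding model.
    # Returns a tiny fake vector so this sketch stays self-contained.
    return [float(len(text) % 7), 0.5, 0.25]

# Inline sample mirroring the format of data/example_data.csv
# (CSV with title and content columns).
sample_csv = "title,content\nWho are you,I am Holo chatbot.\n"

rows = []
for i, rec in enumerate(csv.DictReader(io.StringIO(sample_csv))):
    # Concatenate the columns the same way the sample result above shows.
    document = f"title: {rec['title']} content: {rec['content']}"
    rows.append({
        "id": str(uuid.uuid4()),
        "embedding": embed(document),
        "metadata": {"source": "data/example_data.csv", "row": i},
        "document": document,
    })
```

        In the real pipeline, each row would then be inserted into the langchain_embedding table in Hologres instead of appended to a list.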
  4. Verify the effects of the LLM and the corpus.

    Provide the LLM with the vector data corresponding to the preceding corpus for prompt tuning. Run the following command to start the LLM and ask it the same questions again.

    python chatbot.py
    • Question 1: Compared with the original LLM, the answer optimized by the corpus summarizes the advantages of Hologres more richly and accurately.

      Human: What is Hologres
      Chatbot: Hologres is a one-stop real-time data warehouse (Real-Time Data Warehouse) engine developed by Alibaba. It supports real-time writing, updating, processing, and analysis of massive data. Hologres supports standard SQL and is compatible with the PostgreSQL protocol, syntax, and most functions. It also supports petabyte-level multidimensional analysis (OLAP), ad hoc analysis, and online data services (Serving) with high concurrency and low latency. The engine provides fine-grained isolation for various workloads and enterprise-level security. Hologres is deeply integrated with MaxCompute, Flink, and DataWorks to offer an all-in-one data warehouse solution for both offline and online scenarios. Hologres is designed to be a high-performance, reliable, low-cost, and scalable real-time data warehouse engine. It provides solutions for massive real-time data and interactive query services with sub-second latency. Common use cases include building real-time data mid-ends, fine-grained analysis, self-service analysis, marketing profiles, audience segmentation, and real-time risk control.
    • Question 2: The optimized LLM can understand the core information about how Hologres accelerates queries on the JSONB type and provides a complete summary of the version information and principles.

      Human: Is there a way to accelerate jsonb queries in Hologres?
      Chatbot: Yes. Starting from V1.3, Hologres supports column storage optimization for the JSONB type. This optimization reduces the storage size of JSONB data and accelerates queries. When you enable this feature, the system automatically converts the JSONB column to a column store with a strong schema at the underlying layer. When you query a value in the JSONB data, the specified column can be directly hit, which improves query performance. Because the values are stored in a columnar format, the storage layer achieves the same storage and compression efficiency as regular structured data. This effectively reduces storage costs.
    • Question 3: The optimized LLM correctly interprets column store as column storage optimization and provides a complete summary of the principles of JSONB column storage.

      Human: What is column-store jsonb in Hologres?
      Chatbot: Column-store JSONB in Hologres means that data of the JSONB type is stored in a columnar format. Each key-value pair in the JSONB data is split into a separate column for storage, instead of storing the entire JSONB object as a whole. This storage method improves query efficiency because queries only need to access the required columns, rather than scanning the entire JSONB object. Column-store JSONB also effectively reduces storage space because the same value can be repeatedly stored in different rows, achieving data compression. Starting from V1.3, Hologres supports column storage optimization for the JSONB type.

    As you can see, after optimization the LLM answers Hologres-specific questions with high accuracy.

  5. (Optional) Connect the chatbot to a DingTalk group chat.

    You can connect the chatbot to a DingTalk group chat. You can also follow the method in dingding_server.py to connect the chatbot to other services.

    1. Open a DingTalk group chat and add a custom bot.

      For Security Settings, select Add Signature and enable the Outgoing mechanism. In the Outgoing mechanism, enter the IP address of the server where you deployed the custom chatbot and an arbitrary token. Then set the access_token and secret from the Webhook as the DINGDING_TOKEN and DINGDING_SECRET values in the connection information (config/config.yaml).
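      With Add Signature enabled, each webhook request must carry a signature derived from the secret. The sketch below shows the commonly documented scheme (HMAC-SHA256 over "{timestamp}\n{secret}", then Base64 and URL encoding); treat the exact scheme as an assumption and verify it against the current DingTalk documentation before relying on it.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse

def dingtalk_sign(secret, timestamp_ms=None):
    """Compute a DingTalk "Add Signature" value (assumed scheme):
    HMAC-SHA256 over "{timestamp}\n{secret}" keyed by the secret,
    then Base64- and URL-encoded. Returns (timestamp_ms, sign)."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    string_to_sign = f"{timestamp_ms}\n{secret}"
    digest = hmac.new(secret.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    sign = urllib.parse.quote_plus(base64.b64encode(digest).decode("utf-8"))
    return timestamp_ms, sign
```

      The resulting values are typically appended to the webhook URL as `&timestamp=...&sign=...` when posting replies back to the group.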

    2. Run the following command to deploy an HTTP service on the server.

      # This script runs in the background to serve HTTP requests from the DingTalk robot. The URL is http://localhost:8889/chat
      python dingding_server.py > log.txt 2>&1 &
    3. Connect to the DingTalk group chat and start chatting.

      You can @ the chatbot in the DingTalk group chat to start a conversation.