
AnalyticDB: Create a dedicated chatbot using Compute Nest

Last Updated: Mar 28, 2026

Deploy a private enterprise chatbot powered by a large language model (LLM) and AnalyticDB for PostgreSQL as the vector database — without writing any infrastructure code. Compute Nest provisions all required resources, giving you a working Retrieval-Augmented Generation (RAG) chatbot with a web UI in about 10 minutes.

How it works

The chatbot uses RAG to answer questions from your private documents:

  1. Upload documents — PDF, Markdown, TXT, or Word files go into a knowledge base.

  2. Chunk and embed — The system splits documents into segments and converts them to vector embeddings stored in AnalyticDB for PostgreSQL.

  3. Retrieve and generate — When a user asks a question, the system retrieves the most relevant chunks from the vector database, then passes them to the LLM to generate a grounded answer.

AnalyticDB for PostgreSQL handles vector storage and similarity search. The LLM (running on Platform for AI (PAI)) generates responses. LangChain on Elastic Compute Service (ECS) orchestrates the pipeline and serves the web UI.
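The retrieval step can be sketched with a toy example: rank stored chunk embeddings by cosine similarity to the query embedding. Everything below (3-dimensional vectors, the chunk names) is illustrative only; real embeddings have hundreds of dimensions, and the similarity search runs inside AnalyticDB for PostgreSQL rather than in awk.

```shell
# Toy sketch of retrieval: score each stored "chunk embedding" against a
# query embedding by cosine similarity, then rank the chunks.
awk 'BEGIN {
  split("1 0 0", q, " ")               # query embedding
  docs["chunk-A"] = "0.9 0.1 0"        # close to the query
  docs["chunk-B"] = "0 1 0"            # orthogonal to the query
  for (d in docs) {
    split(docs[d], v, " ")
    dot = nq = nv = 0
    for (i = 1; i <= 3; i++) { dot += q[i]*v[i]; nq += q[i]*q[i]; nv += v[i]*v[i] }
    printf "%s %.3f\n", d, dot / (sqrt(nq) * sqrt(nv))
  }
}' | sort -k2 -rn | tee /tmp/retrieval-demo.txt
```

chunk-A ranks first (cosine about 0.994), so in a real pipeline its text would be the context passed to the LLM.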

What you'll do

  1. Review billing and prerequisites.

  2. Create a service instance using the GenAI-LLM-RAG template.

  3. Upload documents to build a knowledge base.

  4. Ask questions through the web UI.

  5. (Optional) Manage resources: scale PAI-EAS, switch models, or inspect database tables.

Prerequisites

Before you begin, make sure your Alibaba Cloud account can create the resources described in the following billing section.

Billing

When you create the One-stop Enterprise-specific Chatbot Community Edition (Large Language Model + Vector Database) service, the system automatically creates an ECS instance and an AnalyticDB for PostgreSQL instance in elastic storage mode. You are charged for these resources.

For pricing details, see the pricing pages for ECS and AnalyticDB for PostgreSQL.


Create a service instance

This guide uses the GenAI-LLM-RAG template as an example.

  1. Go to the Create Service Instance page. In the Quick Trial section, click GenAI-LLM-RAG.

  2. On the Create Service Instance page, configure the following parameters.

    • Service instance name: Enter a name that is easy to identify. The system generates a random name by default.
    • Region: The region where all resources (service instance, ECS, and AnalyticDB for PostgreSQL) will be created.

    Billing method configuration:
      • Billing method: Select Pay-As-You-Go or Subscription. This guide uses pay-as-you-go.

    ECS configuration:
      • Instance type: Select the ECS instance specifications.
      • Instance password: The logon password for the ECS instance.
      • Whitelist settings: Add the IP addresses of servers that need to call the LLM API.

    PAI-EAS model configuration:
      • Select large model: Select a pre-configured LLM. This guide uses llama2-7b.
      • PAI instance type: Select the GPU specifications for PAI. Unavailable specifications are grayed out.

    AnalyticDB for PostgreSQL:
      • Instance type: The node specifications for the AnalyticDB for PostgreSQL instance.
      • Segment storage size: Storage space for compute nodes, in GB.
      • Database account name: The initial database account name.
      • Database password: The password for the initial database account.

    Application configuration:
      • Software logon name: The username for logging in to the LangChain web service.
      • Software logon password: The password for the LangChain web service.

    Zone configuration:
      • vSwitch zone: The zone where the service instance will be created.

    Network configuration:
      • Create a new VPC: Create a new VPC or use an existing one. This guide uses a new VPC.
      • VPC IPv4 CIDR block: The IPv4 CIDR block for the VPC.
      • vSwitch subnet CIDR block: The CIDR block for the vSwitch.

    Tags and resource groups:
      • Tag: Attach a tag to the service instance.
      • Resource group: The resource group for the service instance. See What is Resource Management?
  3. Click Next: Confirm Order.

  4. Review the Dependency Check, Service Instance Information, and Price Preview sections.

    If a role permission shows as disabled in Dependency Check, click Enable Now on the right, then click the refresh button in that section.
  5. Click Create Now.

  6. Click View Service.

The service instance takes about 10 minutes to create. Its status changes to Deployed when ready.

The LLM is downloaded asynchronously from Hugging Face after the service instance is deployed. This download takes an additional 30 to 60 minutes. The chatbot is not usable until the download completes.

Set up the knowledge base and use the chatbot

Before the chatbot can answer questions, upload your documents to a knowledge base.

  1. In the Compute Nest console, go to Service Instance and click the ID of your service instance.

  2. On the service instance details page, in the Use Now section, click the link next to Endpoint.

  3. In the Log On dialog box, enter the Software Logon Name and Software Logon Password you set during service creation, then click Log On.

  4. In the upper-right corner, under Please select a usage mode, select Knowledge Base Q&A.

  5. In the Configure Knowledge Base section on the right, under Please select a knowledge base to load, select Create Knowledge Base. Enter a name for the new knowledge base and click Add to Knowledge Base Options.

  6. Set Sentence Length Limit for Text Storage based on your requirements. The recommended value is 500. Longer segments reduce chunking granularity and can lower retrieval accuracy.

  7. Upload documents to the knowledge base. Supported file formats: PDF, Markdown, TXT, and Word. Choose one of the following upload methods:

    • Upload File — upload individual files

    • Upload File and URL — upload files or fetch from a URL

    • Upload Folder — upload an entire folder

    Tip: Documents with complex layouts (tables, multi-column text, or heavy formatting) may produce lower-quality chunks. For best retrieval accuracy, convert such documents to plain text or structured Markdown before uploading. To remove an uploaded file, use the Delete File interface.

  8. After the upload completes, type a question in the lower-left corner and click Submit.
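The Sentence Length Limit in step 6 caps how much text goes into each stored segment. The sketch below illustrates the effect of a 500-character cap on a 1200-character "document"; the real splitter also respects sentence boundaries before embedding each chunk.

```shell
# Illustration only: split a 1200-character document into segments of at
# most 500 characters, mimicking a Sentence Length Limit of 500.
{ head -c 1200 /dev/zero | tr '\0' 'x'; echo; } > /tmp/chunk-demo.txt
fold -w 500 /tmp/chunk-demo.txt | nl | tee /tmp/chunk-demo-chunks.txt
```

Three segments result: two of 500 characters and a final one of 200. A smaller limit yields more, finer-grained segments, which is why longer segments can lower retrieval accuracy.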

Resource management

View associated resources

  1. In the Compute Nest console, go to Service Instance and click the ID of your service instance.

  2. Click the Resources tab.

Manage AnalyticDB for PostgreSQL

On the Resources tab, find the resource with Product set to AnalyticDB for PostgreSQL and click its Resource ID to open the instance management page.

For more information about vector capabilities and instructions for scaling resources, see the AnalyticDB for PostgreSQL documentation.

View knowledge base data in the database

  1. On the AnalyticDB for PostgreSQL instance management page, click Log On to Database in the upper-right corner. For connection instructions, see Use DMS to log on to a database.

    Use the Database Account Name and Database Password you specified when creating the service instance.
  2. In the Logged-in Instances list on the left, find your AnalyticDB for PostgreSQL instance and double-click the public schema under the chatglmuser database.

    • The langchain_collections table lists all knowledge bases.

    • Each knowledge base has its own table (named after the knowledge base) containing embeddings, chunks, file metadata, and original file names.

For more information about Data Management (DMS), see What is Data Management (DMS).

Manage PAI-EAS resources

Enable auto scaling

PAI-EAS supports horizontal auto scaling, scheduled scaling, and elastic resource pools. For workloads with significant traffic peaks, enable horizontal auto scaling to avoid over-provisioning at low traffic and prevent resource exhaustion at peak traffic.

  1. On the Resources tab, find the resource with Product set to Platform for AI (PAI) and click its Resource ID to open the service details page.

  2. Click the Auto Scaling tab.

  3. In the Elastic Scaling section, click Enable Auto Scaling.

  4. In the Auto Scaling Settings dialog box, configure the parameters based on your workload:

    Low-traffic scenario (start on demand, stop when idle):
      • Minimum instances: 0
      • Maximum instances: 1
      • Scaling metric: QPS-based scaling threshold per instance
      • QPS threshold: 1

    High-traffic scenario (large daily volume with fluctuations):
      • Minimum instances: 5
      • Maximum instances: 50
      • Scaling metric: QPS-based scaling threshold per instance
      • QPS threshold: 2
  5. Click Enable.

For a description of each scaling type, see horizontal auto scaling, scheduled scaling, and elastic resource pools.
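As a back-of-envelope check of the high-traffic settings above (a threshold of 2 QPS per instance, bounded at 5 to 50 instances), the sketch below estimates the instance count auto scaling would settle on for a sustained load. The 23 QPS figure is an arbitrary example, not a measured workload.

```shell
# Estimate the target instance count for a sustained load: scale out while
# observed QPS exceeds instances * per-instance threshold, within the bounds.
qps=23; threshold=2; min=5; max=50
need=$(( (qps + threshold - 1) / threshold ))   # ceil(qps / threshold)
if [ "$need" -lt "$min" ]; then need=$min; fi
if [ "$need" -gt "$max" ]; then need=$max; fi
echo "target instances: $need" | tee /tmp/scaling-demo.txt
```

For 23 QPS this yields 12 instances, comfortably inside the 5 to 50 bounds; a load above 100 QPS would pin the service at the 50-instance cap.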

Switch to a different LLM

  1. On the Resources tab, find the resource with Product set to Platform for AI (PAI) and click its Resource ID.

  2. Click Update Service in the upper-right corner.

  3. On the deployment page, update the Run Command and GPU Instance Type using the values in the following table. Leave all other parameters at their default values.

    Llama 2 13B:
      • Run command: python api/api_server.py --port=8000 --model-path=meta-llama/Llama-2-13b-chat-hf --precision=fp16
      • Recommended instance type: V100 (gn6e)

    Llama 2 7B:
      • Run command: python api/api_server.py --port=8000 --model-path=meta-llama/Llama-2-7b-chat-hf
      • Recommended instance type: GU30, A10

    Qwen 7B:
      • Run command: python api/api_server.py --port=8000 --model-path=Qwen/Qwen-7B-Chat
      • Recommended instance type: GU30, A10
  4. Click Deploy.

  5. In the Deploy Service dialog box, click OK.

FAQ

How do I check whether the LLM has finished downloading?

After the service instance is deployed, the LLM downloads asynchronously from Hugging Face to the ECS instance. This takes 30 to 60 minutes. To monitor progress, log on to the ECS instance and run:

journalctl -ef -u langchain-chatglm

When you see a log entry indicating the service is listening (for example, the Uvicorn startup message), the model is loaded and the chatbot is ready. Then log on to the web UI to use the chatbot.
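Instead of following the log with -ef, you can run a one-shot check for the Uvicorn startup line. The sketch below greps a two-line sample log for illustration; on the ECS instance you would pipe `journalctl -u langchain-chatglm --no-pager` into the same grep.

```shell
# Sample journal content standing in for the real service log.
cat <<'LOG' > /tmp/langchain-sample.log
INFO:     Started server process [1234]
INFO:     Uvicorn running on http://0.0.0.0:7861 (Press CTRL+C to quit)
LOG
# If the startup line is present, the model is loaded and the chatbot is ready.
grep -q "Uvicorn running" /tmp/langchain-sample.log && echo "model loaded"
```

If grep finds no match, the download is still in progress; check again later rather than restarting the service.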

Why does the model fail to load after deployment?

The model download from Hugging Face takes 30 to 60 minutes and may be slower in some regions. The chatbot is unavailable until the download completes. Use the log command above to monitor download progress. Wait for the model-loaded message before accessing the web UI.

Why do I see a blank page when I access the service?

This service runs on the Alibaba Cloud China site (www.aliyun.com). If you access it through a proxy from outside China, the page may appear blank. Disable the proxy before creating and accessing the service.

How do I log on to the ECS instance?

On the Resources tab of your service instance, find the resource with Product set to Elastic Compute Service (ECS) and click its Resource ID. On the ECS basic information page, click Remote Connection. For details, see Connect to an instance.

How do I restart the LangChain service?

Log on to the ECS instance and run:

systemctl restart langchain-chatglm

How do I view LangChain logs?

Log on to the ECS instance and run:

journalctl -ef -u langchain-chatglm

How do I enable the LangChain API?

Log on to the ECS instance and run the following commands:

# Copy the systemd unit file for the API service
cp /lib/systemd/system/langchain-chatglm.service /lib/systemd/system/langchain-chatglm-api.service

# Edit ExecStart in the new unit file:
# For PAI-EAS:
#   ExecStart=/usr/bin/python3.9 /home/langchain/langchain-ChatGLM/api.py
# For a single GPU-accelerated instance:
#   ExecStart=/usr/bin/python3.9 /home/admin/langchain-ChatGLM/api.py

# Reload systemd and start the API service
systemctl daemon-reload
systemctl restart langchain-chatglm-api

# Verify the API is running (look for the following log entry):
# INFO:     Uvicorn running on http://0.0.0.0:7861 (Press CTRL+C to quit)

# List all available API endpoints:
curl http://0.0.0.0:7861/openapi.json
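Once /openapi.json is reachable, you can pull the list of exposed paths out of the response. The sketch below runs against a small inline sample; the two path names in it are placeholders for illustration, not a guarantee of what your deployment exposes, so always check the real openapi.json.

```shell
# Inline sample standing in for the response from http://0.0.0.0:7861/openapi.json.
cat <<'JSON' > /tmp/openapi-sample.json
{"openapi": "3.0.2", "paths": {"/chat": {}, "/local_doc_qa/upload_file": {}}}
JSON
# Crude extraction of path names (prefer a JSON-aware tool such as jq if
# available): pull every quoted string that starts with "/".
grep -o '"/[^"]*"' /tmp/openapi-sample.json | tr -d '"' | sort | tee /tmp/openapi-paths.txt
```

Against the live service, replace the sample file with the output of the curl command above.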

Where is LangChain deployed on the ECS instance?

LangChain is deployed at /home/admin/langchain-ChatGLM.

How do I use the vector search APIs?

See Import and query vector data through APIs (Java).

How do I get backend support from the product team?

Subscribe to the One-stop Enterprise-specific Chatbot Managed Service to request support.

Where can I find the deployment source code?

See the langchain-ChatGLM repository on GitHub.

What's next