
Applicable to images of versions earlier than v0.3.0

Last Updated: Dec 01, 2025

Retrieval-Augmented Generation (RAG) of Platform for AI (PAI) provides APIs for features such as service management and chat. This topic describes the APIs and calling methods supported by RAG services that are deployed by using images of versions earlier than v0.3.0.

Limits

This topic applies only to RAG services that are deployed by using images of versions earlier than v0.3.0.

To view the image version, perform the following operations: On the Elastic Algorithm Service (EAS) page, click the name of the desired RAG service. In the Environment Information section of the Overview tab, view the image version.


Obtain the access address and token of the service

Before you call a RAG service by using an API, obtain the access address and token of the service.

  1. Log on to the PAI console. Select a region on the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).

  2. Click the name of the desired service. In the Basic Information section of the Overview tab, click View Endpoint Information.

  3. On the Shared Gateway tab of the Invocation Method dialog box, obtain the service access address (EAS_SERVICE_URL) and token (EAS_TOKEN).

    Important
    • Remove the forward slash (/) from the end of EAS_SERVICE_URL.

    • To call the service by using a public endpoint, the client that you use must support access over the Internet.

    • To call the service by using a VPC endpoint, the client that you use must be in the same virtual private cloud (VPC) as the service.
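The notes above can be applied once in client code before any request is sent. The following is a minimal sketch, not part of the RAG service: the helper names normalize_service_url and auth_headers and the example address are hypothetical, and the token is passed as-is in the Authorization header, as in the request examples in this topic.

```python
def normalize_service_url(url: str) -> str:
    """Strip surrounding whitespace and any trailing forward slashes from EAS_SERVICE_URL."""
    return url.strip().rstrip("/")


def auth_headers(token: str) -> dict:
    """Build the Authorization header used by every API in this topic."""
    return {"Authorization": token}


# Placeholder values (assumptions); replace them with the values shown in the
# Invocation Method dialog box.
EAS_SERVICE_URL = normalize_service_url("http://example.invalid/api/predict/my_rag/")
EAS_TOKEN = "<EAS_TOKEN>"

# With the trailing slash removed, API paths can be appended directly.
print(EAS_SERVICE_URL + "/v1/chat/completions")
print(auth_headers(EAS_TOKEN))
```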


Chat API

The chat interface is compatible with the OpenAI API, so you can call it by using an OpenAI client. Before you call the service, complete the required configurations on the WebUI page of the RAG service based on your business requirements.

Supported features

  • Web search: You need to configure web search parameters.

  • Knowledge base query: You need to upload knowledge base files.

  • LLM chat: Use large language model (LLM) services to provide answers. You need to configure LLM services.

  • Agent chat: You need to complete agent-related code configurations on the WebUI page of the RAG service.

  • Database or table query: You need to configure chat_db-related parameters on the WebUI page of the RAG service.

Method

URL

{EAS_SERVICE_URL}/v1/chat/completions

Request method

POST

Request header

Authorization: EAS_TOKEN # The token of the Elastic Algorithm Service (EAS) service.

HTTP body

{
    "model": "default",  # The model name. Set the value to default.
    "messages": [
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hello, how can I help you?"},
        {"role": "user", "content": "What is the capital of Zhejiang province"},
        {"role": "assistant", "content": "Hangzhou is the capital of Zhejiang province."},
        {"role": "user", "content": "What are some interesting places to visit"}
    ],
    "stream": true,  # Specifies whether to use the streaming mode.
    "chat_knowledgebase": true,  # Specifies whether to query the local knowledge base.
    "search_web": false,  # Specifies whether to use web search.
    "chat_llm": false,  # Specifies whether to use the LLM chat only.
    "chat_agent": false,  # Specifies whether to use the agent.
    "chat_db": false,  # Specifies whether to query the database.
    "index_name": "default_index",  # The index name used for RAG scenarios. Only 1 index name is supported. If you leave this parameter empty, the system will use the default index name.
    "max_tokens": 1024,  # The maximum number of tokens, such as 1024.
    "temperature": 0.1  # Controls the randomness of the generated content. Valid values: [0,1]. The smaller the value, the more deterministic the generated content. The larger the value, the higher the randomness of the generated content.
}

Important
  • If you configure multiple features at the same time, the system calls them based on the priority from highest to lowest: search_web, chat_knowledgebase, chat_agent, chat_db, chat_llm. Additionally, for each feature, the system performs preliminary intent recognition to determine whether to call that feature or directly use the LLM to generate a response.

  • If you set all features to false or do not configure the features, the local knowledge base ("chat_knowledgebase": true) is queried by default.
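The priority and default rules above can be mirrored on the client side to check which features a request body asks for. The following is a minimal sketch; the function name enabled_features is hypothetical, and it only restates the documented order. It does not reproduce the intent recognition that the service performs, so the service may still answer directly with the LLM.

```python
# Priority order as documented above, from highest to lowest.
FEATURE_PRIORITY = ["search_web", "chat_knowledgebase", "chat_agent", "chat_db", "chat_llm"]


def enabled_features(body: dict) -> list:
    """Return the feature flags set to true in a request body, in documented priority order.

    If no flag is true, the documented default (chat_knowledgebase) applies.
    """
    enabled = [name for name in FEATURE_PRIORITY if body.get(name) is True]
    return enabled or ["chat_knowledgebase"]


print(enabled_features({"chat_db": True, "search_web": True}))  # ['search_web', 'chat_db']
print(enabled_features({"chat_llm": False}))                    # ['chat_knowledgebase']
```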

Request example

Web search

from openai import OpenAI

##### API configuration #####
# Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.

openai_api_key = "<EAS_TOKEN>"
openai_api_base = "<EAS_SERVICE_URL>/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)


#### Chat ######
def chat():
    stream = True
    chat_completion = client.chat.completions.create(
        model="default",
        stream=stream,
        messages=[
            {"role": "user", "content": "Hello"},
            {"role": "assistant", "content": "Hello, how can I help you?"},
            {"role": "user", "content": "What is the capital of Zhejiang province"},
            {"role": "assistant", "content": "Hangzhou is the capital of Zhejiang province."},
            {"role": "user", "content": "What are some interesting places to visit"},
        ],
        extra_body={
            "search_web": True,
        },
    )

    if stream:
        for chunk in chat_completion:
            # delta.content can be None for some chunks; print an empty string instead.
            print(chunk.choices[0].delta.content or "", end="")
    else:
        result = chat_completion.choices[0].message.content
        print(result)


chat()

Database query

from openai import OpenAI

##### API configuration #####
# Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.

openai_api_key = "<EAS_TOKEN>"
openai_api_base = "<EAS_SERVICE_URL>/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)


#### Chat ######
def chat():
    stream = True
    chat_completion = client.chat.completions.create(
        model="default",
        stream=stream,
        messages=[
            {"role": "user", "content": "How many cats are there"},
            {"role": "assistant", "content": "There are 2 cats"},
            {"role": "user", "content": "What about dogs"},
        ],
        extra_body={
            "chat_db": True,
        },
    )

    if stream:
        for chunk in chat_completion:
            # delta.content can be None for some chunks; print an empty string instead.
            print(chunk.choices[0].delta.content or "", end="")
    else:
        result = chat_completion.choices[0].message.content
        print(result)


chat()
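A knowledge base query can also be sent without the OpenAI client. The following is a minimal sketch that uses only the Python standard library and the request body fields described in the Method section. The function names build_chat_request and ask are hypothetical, and the response is assumed to follow the OpenAI-compatible non-streaming shape (choices[0].message.content) used in the examples above.

```python
import json
import urllib.request


def build_chat_request(service_url: str, token: str, question: str,
                       index_name: str = "default_index") -> urllib.request.Request:
    """Build a non-streaming POST request that queries the local knowledge base."""
    body = {
        "model": "default",
        "messages": [{"role": "user", "content": question}],
        "stream": False,
        "chat_knowledgebase": True,
        "index_name": index_name,
    }
    return urllib.request.Request(
        url=service_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


def ask(service_url: str, token: str, question: str) -> str:
    """Send the request and return the answer text (requires a reachable service)."""
    request = build_chat_request(service_url, token, question)
    with urllib.request.urlopen(request, timeout=60) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

For example, ask("<EAS_SERVICE_URL>", "<EAS_TOKEN>", "What are some interesting places to visit") returns the generated answer as a string once the placeholders are replaced.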

Management API

Upload knowledge base files

Method

URL

{EAS_SERVICE_URL}/api/v1/upload_data

Request method

POST

Request header

  • Authorization: EAS_TOKEN # The token of the EAS service.

  • Content-Type: multipart/form-data

Request parameter

  • files: the file.

  • oss_path: the Object Storage Service (OSS) path.

  • index_name: the index name. The default value is default_index.

cURL request example

  • Upload a single file

      # Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
      # Replace the path after "-F 'files=@" with the path to your file.
      # Configure index_name as the name of your knowledge base index.
      curl -X 'POST' '<EAS_SERVICE_URL>/api/v1/upload_data' \
      -H 'Authorization: <EAS_TOKEN>' \
      -H 'Content-Type: multipart/form-data' \
      -F 'files=@example_data/paul_graham/paul_graham_essay.txt' \
      -F 'index_name=default_index'
  • Upload multiple files. Use multiple -F 'files=@path' parameters with each parameter corresponding to a file to be uploaded, as shown in the example:

      # Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
      # Replace the path after "-F 'files=@" with the path to your file.
      # Configure index_name as the name of your knowledge base index.
      curl -X 'POST' '<EAS_SERVICE_URL>/api/v1/upload_data' \
      -H 'Authorization: <EAS_TOKEN>' \
      -H 'Content-Type: multipart/form-data' \
      -F 'files=@example_data/paul_graham/paul_graham_essay.txt' \
      -F 'files=@example_data/another_file1.md' \
      -F 'files=@example_data/another_file2.pdf' \
      -F 'index_name=default_index'

Response example

  { "task_id": "2c1e557733764fdb9fefa0635389****" }
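The multi-file upload above can be reproduced in Python by repeating the files form field once per file. The following is a minimal sketch that uses only the standard library. The function names encode_multipart and build_upload_request are hypothetical, and the application/octet-stream part type is an assumption, because this topic does not state which part content types the service expects.

```python
import uuid
import urllib.request
from pathlib import Path


def encode_multipart(fields: dict, file_paths: list, file_field: str = "files"):
    """Return (body, content_type) for a multipart/form-data request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields such as index_name.
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    # One part per file, all sharing the same field name.
    for path in map(Path, file_paths):
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; filename="{path.name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"  # assumed part type
        )
        parts.append(header.encode("utf-8") + path.read_bytes() + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload_request(service_url: str, token: str, file_paths: list,
                         index_name: str = "default_index") -> urllib.request.Request:
    """Build the POST request for /api/v1/upload_data."""
    body, content_type = encode_multipart({"index_name": index_name}, file_paths)
    return urllib.request.Request(
        url=service_url.rstrip("/") + "/api/v1/upload_data",
        data=body,
        headers={"Authorization": token, "Content-Type": content_type},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen sends the upload. The JSON response contains the task_id shown in the response example.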

Query the upload status

Method

URL

{EAS_SERVICE_URL}/api/v1/get_upload_state

Request method

GET

Request header

Authorization: EAS_TOKEN # The token of the EAS service.

cURL request example

# Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
# Configure task_id as the task_id value returned after you upload knowledge base files.
curl -X 'GET' '<EAS_SERVICE_URL>/api/v1/get_upload_state?task_id=2c1e557733764fdb9fefa0635389****' -H 'Authorization: <EAS_TOKEN>'

Response example

  {
    "task_id": "2c1e557733764fdb9fefa0635389****",
    "status": "completed",
    "detail": null
  }
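Because an upload runs as a task, a client typically polls this API until the task finishes. The following is a minimal sketch of such a loop. The function names fetch_upload_state and wait_for_upload are hypothetical. Only the "completed" status appears in this topic, so the failure status "failed" is an assumption that you should adjust to the values your service returns.

```python
import json
import time
import urllib.parse
import urllib.request


def fetch_upload_state(service_url: str, token: str, task_id: str) -> dict:
    """Call get_upload_state once and return the parsed JSON response."""
    query = urllib.parse.urlencode({"task_id": task_id})
    request = urllib.request.Request(
        f"{service_url.rstrip('/')}/api/v1/get_upload_state?{query}",
        headers={"Authorization": token},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_upload(fetch_state, task_id: str, interval: float = 2.0,
                    timeout: float = 300.0, failure_statuses=("failed",)) -> dict:
    """Poll fetch_state(task_id) until the status is "completed".

    failure_statuses holds assumed values; this topic documents only "completed".
    """
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_state(task_id)
        status = state.get("status")
        if status == "completed":
            return state
        if status in failure_statuses:
            raise RuntimeError(f"Upload task {task_id} ended with status {status!r}: {state.get('detail')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Upload task {task_id} is still {status!r} after {timeout} seconds")
        time.sleep(interval)
```

A typical call is wait_for_upload(lambda task: fetch_upload_state("<EAS_SERVICE_URL>", "<EAS_TOKEN>", task), task_id). Passing the fetcher as a parameter keeps the loop independent of the HTTP client.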

Query a knowledge base

Method

URL

{EAS_SERVICE_URL}/api/v1/query/retrieval

Request method

POST

Request header

  • Authorization: EAS_TOKEN # The token of the EAS service.

  • Content-Type: application/json

Request parameter

  • question: the user question

  • index_name: the name of the index. The default value is default_index.

cURL request example

# Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
# Configure the question parameter as the user question. 
# Specify an index name for the index_name parameter.  
  curl -X 'POST' '<EAS_SERVICE_URL>/api/v1/query/retrieval' \
  -H 'Authorization: <EAS_TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{
      "question": "What can I do when the x13-auto-arima component reports an error?",
      "index_name": "default_index"
  }'

Response example

{
  "docs": [
    {
      "text": "2.PAl-Studio/Designer FAQ 2.1. FAQ about algorithm components : \nCharacters that cannot be transcoded are displayed as \"blob.\" Ignore this error, because nodes in the downstream can read and process the data.\nWhat can I do when the x13-auto-arima component reports an error?\nMake sure that up to 1,200 training data samples are imported into the x13-auto-arima component.\nWhat can I do when the Doc2Vec component reports the CallExecutorToParseTaskFail error?",
      "score": 0.83608,
      "metadata": {
        "file_path": "***/pai_document.md",
        "file_name": "pai_document.md",
        "file_size": 3794,
        "creation_date": "2025-03-20",
        "last_modified_date": "2025-03-20"
      },
      "image_url": null
    }
  ]
}
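The retrieval response can be reduced to the passages that matter to the caller. The following is a minimal sketch that reads the fields shown in the response example (docs, text, score, and metadata.file_name). The function name top_passages and the 0.5 cutoff are hypothetical choices for illustration.

```python
def top_passages(response: dict, min_score: float = 0.5) -> list:
    """Return (score, file_name, text) tuples for docs that score at least min_score, best first."""
    hits = [
        (doc["score"], (doc.get("metadata") or {}).get("file_name"), doc["text"])
        for doc in response.get("docs", [])
        if doc.get("score") is not None and doc["score"] >= min_score
    ]
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


# Abbreviated sample that follows the response example above.
sample = {
    "docs": [
        {
            "text": "Make sure that up to 1,200 training data samples are imported into the x13-auto-arima component.",
            "score": 0.83608,
            "metadata": {"file_name": "pai_document.md"},
            "image_url": None,
        }
    ]
}

for score, file_name, text in top_passages(sample):
    print(f"{score:.3f} {file_name}: {text[:60]}")
```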

Upload Excel or CSV files to query chat_db tables

Method

URL

{EAS_SERVICE_URL}/api/v1/upload_datasheet

Request method

POST

Request header

  • Authorization: EAS_TOKEN # The token of the EAS service.

  • Content-Type: multipart/form-data

Request parameter

file: the Excel or CSV file.

cURL request example

  # Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
  # Replace the path after "-F 'file=@" with the actual file path.
  curl -X 'POST' '<EAS_SERVICE_URL>/api/v1/upload_datasheet' \
  -H 'Authorization: <EAS_TOKEN>' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@example_data/titanic_train.csv'

Response example

  {
    "task_id": "3b12cf5fabee4a99a32895d2f693****",
    "destination_path": "./localdata/data_analysis/titanic_train.csv",
    "data_preview": "xxx"
  }

Upload JSON files of Q&A pairs to supplement chat_db database information

Method

URL

{EAS_SERVICE_URL}/api/v1/upload_db_history

Request method

POST

Request header

  • Authorization: EAS_TOKEN # The token of the EAS service.

  • Content-Type: multipart/form-data

Request parameter

  • file: the JSON file.

  • db_name: the database name.

cURL request example

  # Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
  # Replace the path after "-F 'file=@" with the JSON file path.
  # Configure db_name as the name of your database.
  curl -X 'POST' '<EAS_SERVICE_URL>/api/v1/upload_db_history' \
  -H 'Authorization: <EAS_TOKEN>' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@example_data/db_query_history.json' \
  -F 'db_name=my_pets'

Response example

  {
    "task_id": "204191f946384a54a48b13ec00fd****",
    "destination_path": "./localdata/data_analysis/nl2sql/history/my_pets_db_query_history.json"
  } 

Load database information

Method

URL

{EAS_SERVICE_URL}/api/v1/query/load_db_info

Request method

POST

Request header

Authorization: EAS_TOKEN # The token of the EAS service.

cURL request example

# Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
curl -X 'POST' <EAS_SERVICE_URL>/api/v1/query/load_db_info -H 'Authorization: <EAS_TOKEN>'

Response example

"Load database info successfully."

Query all knowledge base indexes

Method

URL

{EAS_SERVICE_URL}/api/v1/indexes

Request method

GET

Request header

Authorization: EAS_TOKEN # The token of the EAS service.

cURL request example

# Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
curl -X 'GET' '<EAS_SERVICE_URL>/api/v1/indexes' -H 'Authorization: <EAS_TOKEN>'

Response example

  {
    "indexes": {
      "default_index": {
        "index_name": "default_index",
        "vector_store_config": {
          "persist_path": "localdata/storage",
          "type": "faiss",
          "is_image_store": false
        },
        "embedding_config": {
          "source": "huggingface",
          "model": "bge-m3",
          "embed_batch_size": 10,
          "enable_sparse": false
        }
      }
    },
    "current_index_name": "default_index"
  }
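The indexes response is keyed by index name, which makes it straightforward to flatten into one row per index. The following is a minimal sketch that reads only the fields shown in the response example. The function name summarize_indexes and the row layout are hypothetical.

```python
def summarize_indexes(response: dict) -> list:
    """Return one summary dict per index, marking the index that is currently in use."""
    current = response.get("current_index_name")
    rows = []
    for name, config in response.get("indexes", {}).items():
        rows.append({
            "index_name": name,
            "vector_store": config.get("vector_store_config", {}).get("type"),
            "embedding_model": config.get("embedding_config", {}).get("model"),
            "is_current": name == current,
        })
    return rows


# Abbreviated sample that follows the response example above.
sample = {
    "indexes": {
        "default_index": {
            "index_name": "default_index",
            "vector_store_config": {"type": "faiss"},
            "embedding_config": {"source": "huggingface", "model": "bge-m3"},
        }
    },
    "current_index_name": "default_index",
}

for row in summarize_indexes(sample):
    print(row)
```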

Create a knowledge base index

Method

URL

{EAS_SERVICE_URL}/api/v1/indexes/{index_name}

Request method

POST

Request header

  • Authorization: EAS_TOKEN # The token of the EAS service.

  • Content-Type: application/json

Request parameter

  • index_name: the index name.

  • vector_store_config: the vector database configuration.

  • embedding_config: the embedding model configuration.

cURL request example

    # Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
    # Replace <my_index> with the name of the new knowledge base index.
    # Configure vector_store_config as the vector database configuration.
    curl -X 'POST' '<EAS_SERVICE_URL>/api/v1/indexes/<my_index>' \
    -H 'Authorization: <EAS_TOKEN>' \
    -H 'Content-Type: application/json' \
    -d '{
        "index_name": "<my_index>",
        "vector_store_config": {
            "type": "faiss"
        },
        "embedding_config": {
            "model": "bge-m3",
            "source": "huggingface"
        }
    }'

The vector_store_config parameter in the preceding code uses Faiss as an example. The vector_store_config configurations for other vector databases are as follows:

Milvus

"vector_store_config":
      {
          "type":"milvus",
          "host":"c-xxxxx.milvus.aliyuncs.com",
          "port":19530,
          "user":"root",
          "password":"xxx",
          "database":"default",
          "collection_name":"test",
          "reranker_weights":[0.5,0.5]
      }

Hologres

"vector_store_config":
      {
          "type":"hologres",
          "host":"xxx",
          "port":xxx,
          "user":"xxx",
          "password":"xxx",
          "database":"default",
          "table_name":"test",
          "pre_delete_table":"false"
      }

Elasticsearch

"vector_store_config":
      {
          "type":"elasticsearch",
          "es_url":"xxx",
          "es_user":"xxx",
          "es_password":"xxx",
          "es_index":"xxx"
      }

OpenSearch

"vector_store_config":
      {
          "type":"opensearch",
          "endpoint":"xxx",
          "instance_id":"xxx",
          "username":"xxx",
          "password":"xxx",
          "table_name":"xxx"
      }

AnalyticDB

"vector_store_config":
      {
          "type":"analyticdb",
          "ak":"xxx",
          "sk":"xxx",
          "region_id":"xxx",
          "instance_id":"xxx",
          "account":"xxx",
          "account_password":"xxx",
          "namespace":"xxx",
          "collection":"xxx"
      }

Tablestore

"vector_store_config":
      {
          "type":"tablestore",
          "endpoint":"xxx",
          "instance_name":"xxx",
          "access_key_id":"xxx",
          "access_key_secret":"xxx",
          "table_name":"xxx"
      }

DashVector

"vector_store_config":
      {
          "type":"dashvector",
          "endpoint":"xxx",
          "api_key":"xxx",
          "collection_name":"xxx",
          "partition_name":"xxx"
      }

Response example

  { "msg": "Add index 'my_index' successfully." }

Update a knowledge base index

Method

URL

{EAS_SERVICE_URL}/api/v1/indexes/{index_name}

Request method

PATCH

Request header

  • Authorization: EAS_TOKEN # The token of the EAS service.

  • Content-Type: application/json

Request parameter

  • index_name: the index name.

  • vector_store_config: the vector database configuration.

  • embedding_config: the embedding model configuration.

cURL request example

    # Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
    # Replace <my_index> with the name of the knowledge base index to be updated.
    # Configure vector_store_config as the vector database configuration to be updated.
    curl -X 'PATCH' '<EAS_SERVICE_URL>/api/v1/indexes/<my_index>' \
    -H 'Authorization: <EAS_TOKEN>' \
    -H 'Content-Type: application/json' \
    -d '{
        "index_name": "<my_index>",
        "vector_store_config": {
            "type": "faiss"
        },
        "embedding_config": {
            "model": "bge-m3",
            "source": "huggingface"
        }
    }'

The vector_store_config parameter in the preceding code uses Faiss as an example. For vector_store_config configurations of other vector databases, see Create a knowledge base index.

Response example

  { "msg": "Update index 'my_index' successfully." }

Delete a knowledge base index

Method

URL

{EAS_SERVICE_URL}/api/v1/indexes/{index_name}

Request method

DELETE

Request header

  • Authorization: EAS_TOKEN # The token of the EAS service.

  • Content-Type: application/json

Request parameter

index_name: the index name.

cURL request example

# Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
# Replace <my_index> with the name of the knowledge base index to be deleted.
curl -X 'DELETE' '<EAS_SERVICE_URL>/api/v1/indexes/<my_index>' -H 'Authorization: <EAS_TOKEN>' -H 'Content-Type: application/json' -d '{"index_name":"<my_index>"}'

Response example

  { "msg": "Delete index 'my_index' successfully." }

Query the configurations of a RAG service

Method

URL

{EAS_SERVICE_URL}/api/v1/config

Request method

GET

Request header

Authorization: EAS_TOKEN # The token of the EAS service.

cURL request example

# Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
curl -X 'GET' '<EAS_SERVICE_URL>/api/v1/config' -H 'Authorization: <EAS_TOKEN>'

Response example

  {
    "system": {
      "default_web_search": false,
      "query_type": "websearch"
    },
    "data_reader": {
      "concat_csv_rows": false,
      "enable_mandatory_ocr": false,
      "format_sheet_data_to_json": false,
      "sheet_column_filters": null,
      "number_workers": 4
    },
    "node_parser": {
      "type": "Sentence",
      "chunk_size": 500,
      "chunk_overlap": 10,
      "enable_multimodal": true,
      "paragraph_separator": "\n\n\n",
      "sentence_window_size": 3,
      "sentence_chunk_overlap": 200,
      "breakpoint_percentile_threshold": 95,
      "buffer_size": 1
    },
    "index": {
      "vector_store": {
        "persist_path": "localdata/storage",
        "type": "faiss",
        "is_image_store": false
      },
      "enable_multimodal": true,
      "persist_path": "localdata/storage"
    },
    "embedding": {
      "source": "huggingface",
      "model": "bge-m3",
      "embed_batch_size": 10,
      "enable_sparse": false
    },
    "multimodal_embedding": {
      "source": "cnclip",
      "model": "ViT-L-14",
      "embed_batch_size": 10,
      "enable_sparse": false
    },
    "llm": {
      "source": "openai_compatible",
      "temperature": 0.1,
      "system_prompt": null,
      "max_tokens": 4000,
      "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
      "api_key": "sk-xxx",
      "model": "qwen-max"
    },
    "multimodal_llm": {
      "source": "openai_compatible",
      "temperature": 0.1,
      "system_prompt": null,
      "max_tokens": 4000,
      "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
      "api_key": "sk-xxx",
      "model": ""
    },
    "functioncalling_llm": null,
    "agent": {
      "system_prompt": "You are a travel assistant, xxx",
      "python_scripts": "xxx",
      "function_definition": "xxx",
      "api_definition": "xxx"
    },
    "chat_store": {
      "type": "local",
      "persist_path": "localdata/storage"
    },
    "data_analysis": {
      "type": "mysql",
      "nl2sql_prompt": "Give a question, xxx",
      "synthesizer_prompt": "Give a question, xxx",
      "database": "my_pets",
      "tables": [],
      "descriptions": {},
      "enable_enhanced_description": false,
      "enable_db_history": true,
      "enable_db_embedding": true,
      "max_col_num": 100,
      "max_val_num": 1000,
      "enable_query_preprocessor": true,
      "enable_db_preretriever": true,
      "enable_db_selector": true,
      "user": "root",
      "password": "xxx",
      "host": "127.0.0.1",
      "port": 3306
    },
    "intent": {
      "descriptions": {
        "rag": "\nThis tool can help you get more specific information from the knowledge base.\n",
        "tool": "\nThis tool can help you get travel information about time, weather, flights, train and hotels.\n"
      }
    },
    "node_enhancement": {
      "tree_depth": 3,
      "max_clusters": 52,
      "proba_threshold": 0.1
    },
    "oss_store": {
      "bucket": "",
      "endpoint": "oss-cn-hangzhou.aliyuncs.com",
      "ak": null,
      "sk": null
    },
    "postprocessor": {
      "reranker_type": "no-reranker",
      "similarity_threshold": 0.5
    },
    "retriever": {
      "vector_store_query_mode": "default",
      "similarity_top_k": 3,
      "image_similarity_top_k": 2,
      "search_image": false,
      "hybrid_fusion_weights": [0.7, 0.3]
    },
    "search": {
      "source": "google",
      "search_count": 10,
      "serpapi_key": "142xxx",
      "search_lang": "zh-CN"
    },
    "synthesizer": {
      "use_multimodal_llm": false,
      "system_role_template": "You are xxx",
      "custom_prompt_template": "Your goal is to provide accurate, useful, and easy-to-understand information. xxx"
    },
    "query_rewrite": {
      "enabled": true,
      "rewrite_prompt_template": "# Role\nYou are a professional information retrieval expert, xxx",
      "llm": {
        "source": "openai_compatible",
        "temperature": 0.1,
        "system_prompt": null,
        "max_tokens": 4000,
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "api_key": null,
        "model": ""
      }
    },
    "guardrail": {
      "endpoint": null,
      "region": null,
      "access_key_id": null,
      "access_key_secret": null,
      "custom_advice": null
    }
  }

Update the configurations of a RAG service

Method

URL

{EAS_SERVICE_URL}/api/v1/config

Request method

PATCH

Request header

  • Authorization: EAS_TOKEN # The token of the EAS service.

  • Content-Type: application/json

Request parameter

new_config: the updated configuration.

cURL request example

    # Replace <EAS_TOKEN> and <EAS_SERVICE_URL> with the service token and the service access address, respectively.
    curl -X 'PATCH' '<EAS_SERVICE_URL>/api/v1/config' \
    -H 'Authorization: <EAS_TOKEN>' \
    -H 'Content-Type: application/json' \
    -d '{
        "system": {
          "default_web_search": false,
          "query_type": "websearch"
        },
        "data_reader": {
          "concat_csv_rows": false,
          "enable_mandatory_ocr": false,
          "format_sheet_data_to_json": false,
          "sheet_column_filters": null,
          "number_workers": 4
        },
        "node_parser": {
          "type": "Sentence",
          "chunk_size": 500,
          "chunk_overlap": 10,
          "enable_multimodal": true,
          "paragraph_separator": "\n\n\n",
          "sentence_window_size": 3,
          "sentence_chunk_overlap": 200,
          "breakpoint_percentile_threshold": 95,
          "buffer_size": 1
        },
        ...
    }' # (For more information, see the response example in the "Query the configurations of a RAG service" section.)

Response example

  { "msg": "Update RAG configuration successfully." }
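Because the request body mirrors the structure returned by the configuration query, one cautious way to change a few settings is to fetch the current configuration, merge the changes into it, and send the merged result. The following is a minimal sketch of the merge step. The function name merge_config is hypothetical, and sending the full merged configuration is an assumption, because this topic does not state whether partial bodies are accepted.

```python
import copy


def merge_config(current: dict, changes: dict) -> dict:
    """Return a copy of current with changes applied recursively; current is not modified."""
    merged = copy.deepcopy(current)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing the whole section.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Abbreviated configuration that follows the query response in this topic.
current = {
    "node_parser": {"type": "Sentence", "chunk_size": 500, "chunk_overlap": 10},
    "retriever": {"similarity_top_k": 3},
}
new_config = merge_config(current, {"node_parser": {"chunk_size": 800}})
print(new_config["node_parser"])
# {'type': 'Sentence', 'chunk_size': 800, 'chunk_overlap': 10}
```

The merged dictionary can then be serialized with json.dumps and sent as the body of the PATCH request.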