
AnalyticDB:ChatWithKnowledgeBase

Last Updated: Feb 10, 2026

This service combines a knowledge base with a large language model to deliver an AI chat experience.

Operation description

This API enables you to interact with a large language model using specified knowledge base collections, so that responses are grounded in the knowledge base content. You can customize requests by configuring parameters such as the database instance ID, knowledge retrieval settings, and model inference settings. The API provides a default system prompt template and supports custom system prompts.

  • DBInstanceId: Required. Specifies the database instance ID.

  • KnowledgeParams: Optional. Contains parameters for knowledge retrieval, such as the retrieval content and merge policy.

  • ModelParams: Required. Contains parameters for model inference, such as the message list and the model name.

  • PromptParams: Optional. Specifies a custom system prompt template.
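
As a rough illustration of how these top-level parameters fit together, the following Python sketch assembles a request body as a plain dictionary. The helper name `build_chat_request` and the sample values are hypothetical; the field names follow the request parameters table on this page. How the body is signed and sent (for example through an SDK) is deliberately left out.

```python
import json

def build_chat_request(db_instance_id, region_id, question,
                       collection=None, namespace_password=None,
                       model="qwen-plus"):
    """Assemble a ChatWithKnowledgeBase request body (hypothetical helper)."""
    if not db_instance_id:
        raise ValueError("DBInstanceId is required")
    request = {
        "DBInstanceId": db_instance_id,
        "RegionId": region_id,
        # ModelParams is required: a model name and a message list.
        "ModelParams": {
            "Model": model,
            "Messages": [{"Role": "user", "Content": question}],
        },
    }
    # KnowledgeParams is optional; without it only the chat feature is used.
    if collection is not None:
        request["KnowledgeParams"] = {
            "SourceCollection": [
                {"Collection": collection,
                 "NamespacePassword": namespace_password}
            ],
        }
    return request

body = build_chat_request("gp-xxxxxxxxx", "cn-hangzhou",
                          "What is AnalyticDB?",
                          collection="adbpg_document_collection",
                          namespace_password="namespacePasswd")
print(json.dumps(body, indent=2))
```

Omitting the `collection` argument produces a chat-only request with no KnowledgeParams object.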


RAM authorization

The table below describes the authorization required to call this API. You can define it in a Resource Access Management (RAM) policy. The table's columns are detailed below:

  • Action: The action that you can specify in the Action element of a RAM policy statement to grant permission to perform the operation.

  • API: The API that you can call to perform the action.

  • Access level: The predefined level of access granted for each API. Valid values: create, list, get, update, and delete.

  • Resource type: The type of the resource that supports authorization to perform the action. It indicates if the action supports resource-level permission. The specified resource must be compatible with the action. Otherwise, the policy will be ineffective.

    • For APIs with resource-level permissions, required resource types are marked with an asterisk (*). Specify the corresponding Alibaba Cloud Resource Name (ARN) in the Resource element of the policy.

    • For APIs without resource-level permissions, it is shown as All Resources. Use an asterisk (*) in the Resource element of the policy.

  • Condition key: The condition keys defined by the service. The key allows for granular control, applying to either actions alone or actions associated with specific resources. In addition to service-specific condition keys, Alibaba Cloud provides a set of common condition keys applicable across all RAM-supported services.

  • Dependent action: The dependent actions required to run the action. To complete the action, the RAM user or the RAM role must have the permissions to perform all dependent actions.

| Action | Access level | Resource type | Condition key | Dependent action |
| --- | --- | --- | --- | --- |
| gpdb:ChatWithKnowledgeBase | create | *DBInstance acs:gpdb:{#regionId}:{#accountId}:dbinstance/{#DBInstanceId} | None | None |
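
The authorization entry above could translate into a policy document along the following lines. This is a minimal sketch: the helper name `chat_policy` and the account ID are hypothetical placeholders, and the policy grants only this one action on a single instance.

```python
import json

def chat_policy(region_id, account_id, db_instance_id):
    """Build a RAM policy allowing ChatWithKnowledgeBase on one instance."""
    # ARN pattern taken from the Resource type column of the table.
    resource = f"acs:gpdb:{region_id}:{account_id}:dbinstance/{db_instance_id}"
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "gpdb:ChatWithKnowledgeBase",
                "Resource": resource,
            }
        ],
    }

# The account ID below is a made-up placeholder.
policy = chat_policy("cn-hangzhou", "1234567890123456", "gp-xxxxxxxxx")
print(json.dumps(policy, indent=2))
```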

Request parameters

Parameter

Type

Required

Description

Example

DBInstanceId

string

Yes

The ID of the instance.

Note

You can call the DescribeDBInstances operation to view the details of all instances in the destination region, including instance IDs.

gp-xxxxxxxxx

RegionId

string

Yes

The region ID of the instance.

cn-hangzhou

KnowledgeParams

object

No

The knowledge retrieval parameter object. If you do not specify this parameter, only the chat feature is used.

MergeMethod

string

No

The method to merge multiple knowledge bases. The default value is RRF. Valid values:

  • RRF

  • Weight

"RRF"

MergeMethodArgs

object

No

The parameters for merging multiple knowledge bases.

Rrf

object

No

The parameters to configure when you set MergeMethod to RRF.

K

integer

No

The constant k in the score calculation formula 1/(k + rank_i). It must be a positive integer greater than 1.

60
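
To make the role of K concrete, here is a small Python sketch of reciprocal rank fusion over several ranked result lists. It assumes 1-based ranks and sums 1/(k + rank_i) across lists, which is the usual RRF formulation; the function name `rrf_merge` is hypothetical, and the service's internal implementation is not described on this page.

```python
from collections import defaultdict

def rrf_merge(rankings, k=60):
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank_i(d))."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are assumed to start at 1 for the best result.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

merged = rrf_merge([["a", "b", "c"], ["b", "a", "d"]], k=60)
for doc_id, score in merged:
    print(doc_id, round(score, 5))
```

A larger K flattens the difference between adjacent ranks, so no single collection's top result dominates the merged list.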

Weight

object

No

The parameters to configure when you set MergeMethod to Weight.

Weights

array

No

An array of weights for each SourceCollection.

number

No

The weight of each SourceCollection.

0.01

RerankFactor

number

No

The reranking factor. If this parameter is specified, the vector retrieval results are reranked. The value must be in the range of (1, 5].

Note
  • If document chunks are sparse, reranking is less efficient.

  • The number of results for reranking, which is calculated as TopK × Factor and rounded up, must not exceed 50.

1.0001
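
The two constraints in this row can be checked on the client before a request is sent. The sketch below is illustrative only; the function name `rerank_candidates` is hypothetical, and it simply applies the documented range (1, 5] and the limit of 50 on TopK × Factor rounded up.

```python
import math

MAX_RERANK_CANDIDATES = 50  # limit stated in the note for RerankFactor

def rerank_candidates(top_k, rerank_factor):
    """Return ceil(TopK * RerankFactor), raising if a documented limit is violated."""
    if not (1 < rerank_factor <= 5):
        raise ValueError("RerankFactor must be in (1, 5]")
    count = math.ceil(top_k * rerank_factor)
    if count > MAX_RERANK_CANDIDATES:
        raise ValueError(f"{count} candidates exceeds {MAX_RERANK_CANDIDATES}")
    return count

print(rerank_candidates(10, 1.5))
```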

SourceCollection

array<object>

Yes

The knowledge base.

array<object>

No

Collection

string

Yes

The name of the collection for retrieval.

adbpg_document_collection

Namespace

string

No

The namespace. The default value is public.

Note

For more information, see CreateNamespace and ListNamespaces.

dukang

NamespacePassword

string

Yes

The password of the namespace.

Note

This value is specified when you call the CreateNamespace operation.

namespacePasswd

QueryParams

object

No

The parameters for retrieving this knowledge base.

Filter

string

No

The filter condition for the data to retrieve. The format is the same as a SQL WHERE clause.

id = 'llm-t87l87fxuhn56woc_8anu8j2d3f_file_e74635e2cc314e838543e724f6b3b1f2_10658020'

GraphEnhance

boolean

No

Specifies whether to enable knowledge graph enhancement. Default value: false.

false

GraphSearchArgs

object

No

The parameters for knowledge graph search.

GraphTopK

integer

No

The number of top entities and relationship edges to return. Default value: 60.

60

HybridSearch

string

No

The multi-channel recall algorithm. If you leave this empty, the system directly compares and sorts the scores from dense vector retrieval and full-text search.

Valid values:

  • RRF: Reciprocal rank fusion. A parameter k controls the fusion effect. For more information, see the HybridSearchArgs configuration.

  • Weight: Weighted sorting. This method uses parameters to control the weight of vector scores and full-text search scores before sorting. For more information about the parameters, see the HybridSearchArgs configuration.

  • Cascaded: Performs a full-text search first, and then performs a vector retrieval based on the results.

RRF

HybridSearchArgs

object

No

The parameters for the multi-channel recall algorithm. This supports RRF and Weight. HybridPathsSetting can specify the recall of dense vectors (dense), sparse vectors (sparse), and full-text search results (fulltext). If this parameter is empty, dense vectors and full-text search results are recalled by default.

  • RRF: Specifies the constant k in the formula 1/(k + rank_i) that is used to calculate the score. It must be a positive integer greater than 1. The format is as follows:

{
  "HybridPathsSetting": {
    "paths": "dense,fulltext"
  },
  "RRF": {
    "k": 60
  }
}
  • Weight:
    • Two-channel recall (HybridPathsSetting is not specified, only alpha is specified):
      • Formula: alpha * dense_score + (1 - alpha) * fulltext_score. The alpha parameter represents the weight of the dense vector retrieval score relative to the full-text search score. The value ranges from 0 to 1. A value of 0 indicates that only full-text search is used. A value of 1 indicates that only dense vector retrieval is used.

{
  "Weight": {
    "alpha": 0.5
  }
}
    • Three-channel recall:
      • Formula: normalized_dense * dense_score + normalized_sparse * sparse_score + normalized_fulltext * fulltext_score. The dense, sparse, and fulltext parameters represent the weights of dense vector, sparse vector, and full-text search retrieval, respectively. The values must be greater than or equal to 0. The system automatically normalizes the weights to a range of 0 to 1 (normalized_x = x / (dense + sparse + fulltext)).

{
  "HybridPathsSetting": {
    "paths": "dense,sparse,fulltext"
  },
  "Weight": {
    "dense": 0.5,
    "sparse": 0.3,
    "fulltext": 0.2
  }
}
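
The two Weight formulas can be written out as follows. This Python sketch only mirrors the arithmetic given in the description; the function names are hypothetical, and rejecting an all-zero weight set is an assumption, since the page does not say how that case is handled.

```python
def two_channel_score(dense_score, fulltext_score, alpha):
    """alpha * dense_score + (1 - alpha) * fulltext_score, with alpha in [0, 1]."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * dense_score + (1 - alpha) * fulltext_score

def three_channel_score(scores, weights):
    """Weighted sum with normalized_x = x / (dense + sparse + fulltext)."""
    paths = ("dense", "sparse", "fulltext")
    if any(weights[p] < 0 for p in paths):
        raise ValueError("weights must be >= 0")
    total = sum(weights[p] for p in paths)
    if total == 0:
        # Assumption: an all-zero weight set is treated as invalid here.
        raise ValueError("at least one weight must be positive")
    return sum(weights[p] / total * scores[p] for p in paths)

print(two_channel_score(0.8, 0.4, alpha=0.5))
print(three_channel_score(
    {"dense": 0.8, "sparse": 0.6, "fulltext": 0.4},
    {"dense": 5, "sparse": 3, "fulltext": 2},
))
```

Because the weights are normalized, passing 5, 3, and 2 gives the same result as passing 0.5, 0.3, and 0.2.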

any

No

The parameter configuration value.

Metrics

string

No

The distance metric used to build the vector index. Valid values:

  • l2: Euclidean distance.

  • ip: Dot product (inner product) distance.

  • cosine: Cosine similarity.

cosine

RecallWindow

array

No

The recall window. If this parameter is specified, the context of the retrieval results is expanded. The format is an array of two elements: List<A, B>, where -10 <= A <= 0 and 0 <= B <= 10.

Note
  • Use this parameter if document chunks are too small and retrieval might lose context.

  • Reranking occurs before windowing. The system first reranks the results and then applies the window.

integer

No

One boundary of the recall window: A (the first element) or B (the second element).

[-1,1]
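
One way to picture the window is as a range of neighboring chunks around each hit. The sketch below assumes that A counts chunks before the hit and B counts chunks after it within the same document, and that the range is clamped at the document boundaries; these semantics and the function name `expand_window` are assumptions for illustration, not confirmed behavior of the service.

```python
def expand_window(hit_index, window, chunk_count):
    """Return the chunk indices covered by window [A, B] around a hit."""
    a, b = window
    if not (-10 <= a <= 0 <= b <= 10):
        raise ValueError("window must satisfy -10 <= A <= 0 and 0 <= B <= 10")
    start = max(0, hit_index + a)              # clamp at the first chunk
    end = min(chunk_count - 1, hit_index + b)  # clamp at the last chunk
    return list(range(start, end + 1))

print(expand_window(5, [-1, 1], chunk_count=10))  # prints [4, 5, 6]
print(expand_window(0, [-2, 1], chunk_count=10))  # prints [0, 1]
```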

RerankFactor

number

No

The reranking factor. If this parameter is specified, the vector retrieval results are reranked. The value must be in the range of (1, 5].

Note
  • If document chunks are sparse, reranking is less efficient.

  • The number of results for reranking, which is calculated as TopK × Factor and rounded up, must not exceed 50.

1.5

TopK

integer

No

The number of top results to return.

10

UseFullTextRetrieval

boolean

No

Specifies whether to use full-text search (two-channel recall). The default value is false, which means only vector retrieval is used.

true

TopK

integer

No

The number of top results to return after the results from multiple vector collection recalls are merged.

10

PromptParams

string

No

The system prompt template. It must include the `{{ text_chunks }}`, `{{ user_system_prompt }}`, `{{ graph_entities }}`, and `{{ graph_relations }}` placeholders. If you do not specify this parameter, the template is not used.

"Answer the question based on the following knowledge: {{ text_chunks }}"

ModelParams

object

Yes

The parameter object for calling the Large Language Model (LLM).

MaxTokens

integer

No

The maximum number of tokens to generate.

8192

Messages

array<object>

Yes

The list of messages.

object

Yes

A message in the list.

Content

string

Yes

The content of the message.

You are a helpful assistant.

Role

string

Yes

The role of the message. Valid values:

  • system

  • user

  • assistant

user

Model

string

Yes

The name of the large language model to use. For valid values, see the Alibaba Cloud Model Studio documentation.

qwen-plus

N

integer

No

The number of candidate replies to generate.

1

PresencePenalty

number

No

The presence penalty coefficient. The value ranges from -2.0 to 2.0.

1.0

Seed

integer

No

The random seed.

42

Stop

array

No

A list of stop words.

string

No

A stop word.

"\n"

Temperature

number

No

The sampling temperature. The value ranges from 0 to 2.

0.6

Tools

array<object>

No

The list of tools.

array<object>

No

The details of the tool.

Function

object

No

The function information.

Description

string

No

The description of the function tool.

Gets the weather.

Name

string

No

The name of the function tool.

get_weather

Parameters

any

No

The JSON schema of the function parameters.

{"type": "object", ...}

TopP

number

No

The probability threshold for nucleus sampling. The value ranges from 0 to 1.

0.9

IncludeKnowledgeBaseResults

boolean

No

Specifies whether to return the recall results. Default value: false.

false

Response elements

Element

Type

Description

Example

object

The response body.

RequestId

string

The request ID.

ABB39CC3-4488-4857-905D-2E4A051D0521

MultiCollectionRecallResult

object

The information about the multi-knowledge base recall.

Entities

array

The details of the entities.

string

The entity type.

{'entities': []}

Matches

array<object>

The recalled items.

array<object>

The recalled items.

Content

string

The document content.

ADBPG vector database.

FileName

string

The file name.

process_info_19b9df4dc9ad4bf2b30eb2faa4a9a987.txt

FileURL

string

The public URL of the image in the query result. The default validity period is 2 hours.

You can specify the validity period using the UrlExpiration request parameter.

http://viapi-customer-pop.oss-cn-shanghai.aliyuncs.com/b4d8_207196811002111319_570c0e199f03428f812ab21fcc00dd6a

Id

string

The unique ID of the vector data.

273e3fc7-8f56-4167-a1bb-d35d2f3b9043

LoaderMetadata

any

The metadata from when the document was loaded.

{"page":1}

Metadata

object

The metadata.

any

The metadata information.

RerankScore

number

The reranking score.

0.1

RetrievalSource

integer

The source of the retrieval result. 1 indicates vector retrieval, 2 indicates full-text search, and 3 indicates two-channel recall.

3

Score

number

The similarity score of this data entry. The scoring algorithm depends on the metric (l2, ip, or cosine) specified when the index was built.

12

Vector

array

The vector data.

number

The vector data.

[]

Relations

array

The details of the relationship edges.

string

A relationship edge.

{'relations': []}

RequestId

string

The request ID.

6B9E3255-4543-5B3B-9E00-6490CA64742B

Status

string

The status of the API execution. Valid values:

  • success: The execution was successful.

  • fail: The execution failed.

success

Tokens

integer

The number of tokens consumed.

42

Usage

object

The number of tokens or items consumed for document understanding or embedding.

EmbeddingTokens

integer

The number of tokens used for vectorization.

Note

A token is the smallest unit into which the input text is divided. A token can be a word, a phrase, a punctuation mark, or a character.

21

ChatCompletion

object

The model response.

Choices

array<object>

The list of replies generated by the model.

array<object>

A generated reply.

FinishReason

string

The reason for stopping.

finish

Index

integer

The ordinal number of the reply.

0

Message

object

The response from the large language model.

Content

string

The content of the message.

The weather in Hangzhou is sunny.

Role

string

The role of the message:

  • system

  • user

  • assistant

user

ToolCalls

array<object>

The tool calling response.

array<object>

Id

string

The ID.

"chatcmpl-c1bebafa-cc48-44e2-88c6-1a3572952f8e"

Function

object

The information about the called function.

Arguments

string

The arguments of the called function.

{"city":"hangzhou"}

Name

string

The name of the called function.

"get_weather"

Index

integer

The ordinal number of the tool call.

1

ReasoningContent

string

The reasoning content of the model.

Logical reasoning process

Created

integer

The creation time, as a UNIX timestamp in seconds.

1758529748

Id

string

The response ID.

273e3fc7-8f56-4167-a1bb-d35d2f3b9043

Model

string

The name of the model used.

qwen-plus

Usage

object

The token usage of the large language model call.

CompletionTokens

integer

The number of tokens consumed for generating content.

42

PromptTokens

integer

The number of tokens consumed by the input prompt.

42

PromptTokensDetails

object

The details of the prompt tokens.

CachedTokens

integer

The number of tokens that hit the cache.

24

TotalTokens

integer

The total number of tokens.

42

Message

string

The returned message.

Successful

Status

string

The status. Valid values:

  • success: The operation was successful.

  • fail: The operation failed.

success

Examples

Success response

JSON format

{
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "MultiCollectionRecallResult": {
    "Entities": [
      "{'entities': []}"
    ],
    "Matches": [
      {
        "Content": "ADBPG vector database.",
        "FileName": "process_info_19b9df4dc9ad4bf2b30eb2faa4a9a987.txt",
        "FileURL": "http://viapi-customer-pop.oss-cn-shanghai.aliyuncs.com/b4d8_207196811002111319_570c0e199f03428f812ab21fcc00dd6a",
        "Id": "273e3fc7-8f56-4167-a1bb-d35d2f3b9043",
        "LoaderMetadata": "{\"page\":1}",
        "Metadata": {
          "key": ""
        },
        "RerankScore": 0.1,
        "RetrievalSource": 3,
        "Score": 12,
        "Vector": [
          0
        ]
      }
    ],
    "Relations": [
      "{'relations': []}"
    ],
    "RequestId": "6B9E3255-4543-5B3B-9E00-6490CA64742B",
    "Status": "success",
    "Tokens": 42,
    "Usage": {
      "EmbeddingTokens": 21
    }
  },
  "ChatCompletion": {
    "Choices": [
      {
        "FinishReason": "finish",
        "Index": 0,
        "Message": {
          "Content": "The weather in Hangzhou is sunny.",
          "Role": "user",
          "ToolCalls": [
            {
              "Id": "\"chatcmpl-c1bebafa-cc48-44e2-88c6-1a3572952f8e\"",
              "Function": {
                "Arguments": "{\"city\":\"hangzhou\"}",
                "Name": "\"get_weather\""
              },
              "Index": 1
            }
          ],
          "ReasoningContent": "Logical reasoning process"
        }
      }
    ],
    "Created": 1758529748,
    "Id": "273e3fc7-8f56-4167-a1bb-d35d2f3b9043",
    "Model": "qwen-plus",
    "Usage": {
      "CompletionTokens": 42,
      "PromptTokens": 42,
      "PromptTokensDetails": {
        "CachedTokens": 24
      },
      "TotalTokens": 42
    }
  },
  "Message": "Successful",
  "Status": "success"
}
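
A client would typically pull the generated reply and the recalled sources out of a body shaped like the one above. The following Python sketch shows one way to do that; the helper name `summarize_response` and the trimmed sample body are hypothetical, and treating MultiCollectionRecallResult as possibly absent is an assumption based on IncludeKnowledgeBaseResults defaulting to false.

```python
import json

def summarize_response(raw):
    """Return the first reply and the recalled file names from a response body."""
    data = json.loads(raw)
    if data.get("Status") != "success":
        raise RuntimeError(data.get("Message", "request failed"))
    choices = data.get("ChatCompletion", {}).get("Choices", [])
    answer = choices[0]["Message"]["Content"] if choices else None
    # Assumed to be absent when recall results are not requested.
    matches = data.get("MultiCollectionRecallResult", {}).get("Matches", [])
    return answer, [m.get("FileName") for m in matches]

# A trimmed, made-up body with the same field names as the example above.
sample = json.dumps({
    "Status": "success",
    "Message": "Successful",
    "ChatCompletion": {"Choices": [
        {"Index": 0, "Message": {"Role": "assistant", "Content": "Sunny."}}
    ]},
    "MultiCollectionRecallResult": {"Matches": [
        {"FileName": "process_info.txt", "Score": 12}
    ]},
})
print(summarize_response(sample))
```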

Error codes

See Error Codes for a complete list.

Release notes

See Release Notes for a complete list.