This service combines a knowledge base with a large language model to deliver an AI chat experience.
Operation description
This API enables you to interact with a large language model using specified knowledge base collections, so that responses are grounded in the knowledge base content. You can customize requests by configuring parameters such as the database instance ID, knowledge retrieval settings, and model inference settings. The API provides a default system prompt template and supports custom system prompts.
DBInstanceId: Required. Specifies the database instance ID.
KnowledgeParams: Optional. Contains parameters for knowledge retrieval, such as the retrieval content and merge policy.
ModelParams: Required. Contains parameters for model inference, such as the message list and the model name.
PromptParams: Optional. Specifies a custom system prompt template.
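The parameter groups above can be sketched as a plain request body. This is a minimal illustration, not an SDK call: the helper name `build_chat_request` is hypothetical, and the nesting (SourceCollection inside KnowledgeParams, Model and Messages inside ModelParams) is an assumption read from the request parameter table.

```python
import json
from typing import Any, Optional


def build_chat_request(
    db_instance_id: str,
    region_id: str,
    model: str,
    messages: list[dict[str, str]],
    knowledge_params: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a request body (hypothetical helper, not part of any SDK)."""
    # DBInstanceId and ModelParams are documented as required.
    if not db_instance_id:
        raise ValueError("DBInstanceId is required")
    if not model or not messages:
        raise ValueError("ModelParams needs both Model and Messages")

    request: dict[str, Any] = {
        "DBInstanceId": db_instance_id,
        "RegionId": region_id,
        "ModelParams": {"Model": model, "Messages": messages},
    }
    # KnowledgeParams is optional; when omitted, only the chat feature is used.
    if knowledge_params is not None:
        request["KnowledgeParams"] = knowledge_params
    return request


request = build_chat_request(
    db_instance_id="gp-xxxxxxxxx",
    region_id="cn-hangzhou",
    model="qwen-plus",
    messages=[{"Role": "user", "Content": "What is ADBPG?"}],
    knowledge_params={
        # Assumed nesting: one entry per knowledge base collection.
        "SourceCollection": [
            {
                "Collection": "adbpg_document_collection",
                "NamespacePassword": "namespacePasswd",
            }
        ],
        "TopK": 10,
    },
)
print(json.dumps(request, indent=2))
```

Leaving out `knowledge_params` yields a chat-only request, which matches the documented behavior when KnowledgeParams is not specified.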
RAM authorization
| Action | Access level | Resource type | Condition key | Dependent action |
| --- | --- | --- | --- | --- |
| gpdb:ChatWithKnowledgeBase | create | *DBInstance | None | None |
Request parameters
| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| DBInstanceId | string | Yes | The ID of the instance. Note: You can call the DescribeDBInstances operation to view the details of all instances in the destination region, including instance IDs. | gp-xxxxxxxxx |
| RegionId | string | Yes | The region ID of the instance. | cn-hangzhou |
| KnowledgeParams | object | No | The knowledge retrieval parameter object. If you do not specify this parameter, only the chat feature is used. | |
| MergeMethod | string | No | The method to merge the results from multiple knowledge bases. The default value is RRF. Supported values include RRF and Weight (see MergeMethodArgs). | "RRF" |
| MergeMethodArgs | object | No | The parameters for merging multiple knowledge bases. | |
| Rrf | object | No | The parameters to configure when you set MergeMethod to RRF. | |
| K | integer | No | The constant k in the score calculation formula 1/(k + rank_i). It must be a positive integer greater than 1. | 60 |
| Weight | object | No | The parameters to configure when you set MergeMethod to Weight. | |
| Weights | array | No | An array of weights, one for each SourceCollection. | |
| | number | No | The weight of each SourceCollection. | 0.01 |
| RerankFactor | number | No | The reranking factor. If this parameter is specified, the vector retrieval results are reranked. The value must be in the range of (1, 5]. | 1.0001 |
| SourceCollection | array<object> | Yes | The list of knowledge bases. | |
| | array<object> | No | | |
| Collection | string | Yes | The name of the collection for retrieval. | adbpg_document_collection |
| Namespace | string | No | The namespace. The default value is public. Note: For more information, see CreateNamespace and ListNamespaces. | dukang |
| NamespacePassword | string | Yes | The password of the namespace. Note: This value is specified when you call the CreateNamespace operation. | namespacePasswd |
| QueryParams | object | No | The parameters for retrieving this knowledge base. | |
| Filter | string | No | The filter condition for the data to retrieve. The format is the same as a SQL WHERE clause. | id = 'llm-t87l87fxuhn56woc_8anu8j2d3f_file_e74635e2cc314e838543e724f6b3b1f2_10658020' |
| GraphEnhance | boolean | No | Specifies whether to enable knowledge graph enhancement. Default value: false. | false |
| GraphSearchArgs | object | No | The parameters for knowledge graph search. | |
| GraphTopK | integer | No | The number of top entities and relationship edges to return. Default value: 60. | 60 |
| HybridSearch | string | No | The multi-channel recall algorithm. If you leave this empty, the system directly compares and sorts the scores from dense vector retrieval and full-text search. Supported values include RRF and Weight (see HybridSearchArgs). | RRF |
| HybridSearchArgs | object | No | The parameters for the multi-channel recall algorithm. RRF and Weight are supported. HybridPathsSetting specifies which results to recall: dense vectors (dense), sparse vectors (sparse), and full-text search results (fulltext). If this parameter is empty, dense vectors and full-text search results are recalled by default. | |
| | any | No | The parameter configuration value. | |
| Metrics | string | No | The similarity metric used to build the vector index, such as cosine. | cosine |
| RecallWindow | array | No | The recall window. If this parameter is specified, the context of the retrieval results is expanded. The format is an array of two elements: List<A, B>, where -10 <= A <= 0 and 0 <= B <= 10. | |
| | integer | No | One bound of the recall window (A or B). | [-1,1] |
| RerankFactor | number | No | The reranking factor. If this parameter is specified, the vector retrieval results are reranked. The value must be in the range of (1, 5]. | 1.5 |
| TopK | integer | No | The number of top results to return. | 10 |
| UseFullTextRetrieval | boolean | No | Specifies whether to use full-text search (two-channel recall). The default value is false, which means only vector retrieval is used. | true |
| TopK | integer | No | The number of top results to return after the results recalled from multiple vector collections are merged. | 10 |
| PromptParams | string | No | The system prompt template. It must include the `{{ text_chunks }}`, `{{ user_system_prompt }}`, `{{ graph_entities }}`, and `{{ graph_relations }}` placeholders. If you do not specify this parameter, the template is not used. | "Answer the question with reference to the following knowledge: {{ text_chunks }}" |
| ModelParams | object | Yes | The parameter object for calling the large language model (LLM). | |
| MaxTokens | integer | No | The maximum number of tokens to generate. | 8192 |
| Messages | array<object> | Yes | The list of messages. | |
| | object | Yes | A message. | |
| Content | string | Yes | The content of the message. | You are a helpful assistant. |
| Role | string | Yes | The role of the message, such as user. | user |
| Model | string | Yes | The name of the large language model to use. For valid values, see the Alibaba Cloud Model Studio documentation. | qwen-plus |
| N | integer | No | The number of candidate replies to generate. | 1 |
| PresencePenalty | number | No | The presence penalty coefficient. The value ranges from -2.0 to 2.0. | 1.0 |
| Seed | integer | No | The random seed. | 42 |
| Stop | array | No | A list of stop words. | |
| | string | No | A stop word. | "\n" |
| Temperature | number | No | The sampling temperature. The value ranges from 0 to 2. | 0.6 |
| Tools | array<object> | No | The list of tools. | |
| | array<object> | No | The details of the tool. | |
| Function | object | No | The function information. | |
| Description | string | No | The description of the function tool. | Gets the weather. |
| Name | string | No | The name of the function tool. | get_weather |
| Parameters | any | No | The JSON schema of the function parameters. | {"type": "object", ...} |
| TopP | number | No | The probability threshold for nucleus sampling. The value ranges from 0 to 1. | 0.9 |
| IncludeKnowledgeBaseResults | boolean | No | Specifies whether to return the recall results. Default value: false. | false |
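The K parameter refers to the score formula 1/(k + rank_i). The following is a minimal sketch of textbook reciprocal rank fusion built on that formula; it is not the service's actual implementation. The function name `rrf_merge`, the document IDs, and the assumption that ranks start at 1 are all illustrative.

```python
from collections import defaultdict


def rrf_merge(
    ranked_lists: list[list[str]], k: int = 60, top_k: int = 10
) -> list[tuple[str, float]]:
    """Fuse several ranked ID lists with reciprocal rank fusion (illustrative)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        # Assumption: ranks are 1-based, so the best item contributes 1/(k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for determinism.
    merged = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return merged[:top_k]


# Hypothetical per-collection rankings, best match first.
collection_a = ["doc1", "doc2", "doc3"]
collection_b = ["doc2", "doc4", "doc1"]

merged = rrf_merge([collection_a, collection_b], k=60, top_k=3)
for doc_id, score in merged:
    print(f"{doc_id}: {score:.6f}")
```

A larger k flattens the gap between adjacent ranks, so an item that appears in several lists gains more relative to one that ranks first in a single list. When every collection returns distinct items, each fused score reduces to 1/(k + rank_i) within its own list.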
Response elements
| Element | Type | Description | Example |
| --- | --- | --- | --- |
| | object | The response body. | |
| RequestId | string | The request ID. | ABB39CC3-4488-4857-905D-2E4A051D0521 |
| MultiCollectionRecallResult | object | The information about the multi-knowledge base recall. | |
| Entities | array | The details of the entities. | |
| | string | The entity type. | {'entities': []} |
| Matches | array<object> | The recalled items. | |
| | array<object> | A recalled item. | |
| Content | string | The document content. | ADBPG vector database. |
| FileName | string | The file name. | process_info_19b9df4dc9ad4bf2b30eb2faa4a9a987.txt |
| FileURL | string | The public URL of the image in the query result. The default validity period is 2 hours. You can specify the validity period using the UrlExpiration request parameter. | http://viapi-customer-pop.oss-cn-shanghai.aliyuncs.com/b4d8_207196811002111319_570c0e199f03428f812ab21fcc00dd6a |
| Id | string | The unique ID of the vector data. | 273e3fc7-8f56-4167-a1bb-d35d2f3b9043 |
| LoaderMetadata | any | The metadata generated when the document was loaded. | {"page":1} |
| Metadata | object | The metadata. | |
| | any | The metadata information. | |
| RerankScore | number | The reranking score. | 0.1 |
| RetrievalSource | integer | The source of the retrieval result. 1 indicates vector retrieval, 2 indicates full-text search, and 3 indicates two-channel recall. | 3 |
| Score | number | The similarity score of this data entry. The scoring algorithm depends on the metric used to build the vector index (see the Metrics request parameter). | 12 |
| Vector | array | The vector data. | |
| | number | An element of the vector. | [] |
| Relations | array | The list of relationship edges. | |
| | string | The details of a relationship edge. | {'relations': []} |
| RequestId | string | The request ID. | 6B9E3255-4543-5B3B-9E00-6490CA64742B |
| Status | string | The status of the API execution, for example success. | success |
| Tokens | integer | The number of tokens consumed. | 42 |
| Usage | object | The number of tokens or items consumed for document understanding or embedding. | |
| EmbeddingTokens | integer | The number of tokens used for vectorization. Note: A token is the smallest unit into which the input text is divided. A token can be a word, a phrase, a punctuation mark, or a character. | 21 |
| ChatCompletion | object | The model response. | |
| Choices | array<object> | The generated replies. | |
| | array<object> | A candidate reply. | |
| FinishReason | string | The reason generation stopped. | finish |
| Index | integer | The ordinal number of the reply. | 0 |
| Message | object | The response from the large language model. | |
| Content | string | The content of the message. | The weather in Hangzhou is sunny. |
| Role | string | The role of the message, for example user. | user |
| ToolCalls | array<object> | The tool calls returned by the model. | |
| | array<object> | | |
| Id | string | The ID of the tool call. | "chatcmpl-c1bebafa-cc48-44e2-88c6-1a3572952f8e" |
| Function | object | The information about the called function. | |
| Arguments | string | The arguments of the called function. | {"city":"hangzhou"} |
| Name | string | The name of the called function. | "get_weather" |
| Index | integer | The ordinal number of the tool call. | 1 |
| ReasoningContent | string | The reasoning content of the model. | Logical reasoning process |
| Created | integer | The creation time. | 1758529748 |
| Id | string | The response ID. | 273e3fc7-8f56-4167-a1bb-d35d2f3b9043 |
| Model | string | The name of the model used. | qwen-plus |
| Usage | object | The token usage of the large language model call. | |
| CompletionTokens | integer | The number of tokens consumed for generating content. | 42 |
| PromptTokens | integer | The number of tokens consumed by the input prompt. | 42 |
| PromptTokensDetails | object | The details of the prompt tokens. | |
| CachedTokens | integer | The number of tokens that hit the cache. | 24 |
| TotalTokens | integer | The total number of tokens. | 42 |
| Message | string | The returned message. | Successful |
| Status | string | The status, for example success. | success |
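The RetrievalSource codes can be turned into readable labels when inspecting recall results. The sketch below uses only the three codes documented for that element; the helper name `summarize_sources` and the sample matches are hypothetical.

```python
from collections import Counter

# Codes as documented for the RetrievalSource response element.
RETRIEVAL_SOURCE_LABELS = {
    1: "vector retrieval",
    2: "full-text search",
    3: "two-channel recall",
}


def summarize_sources(matches: list[dict]) -> Counter:
    """Count recalled items per retrieval source (illustrative helper)."""
    return Counter(
        # Any code outside the documented set is reported as "unknown".
        RETRIEVAL_SOURCE_LABELS.get(match.get("RetrievalSource"), "unknown")
        for match in matches
    )


# Hypothetical entries shaped like items of the Matches array.
sample_matches = [
    {"Id": "a", "RetrievalSource": 3, "Score": 12},
    {"Id": "b", "RetrievalSource": 1, "Score": 9},
    {"Id": "c", "RetrievalSource": 1, "Score": 7},
]

summary = summarize_sources(sample_matches)
for label, count in summary.items():
    print(f"{label}: {count}")
```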
Examples
Success response
JSON format
{
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "MultiCollectionRecallResult": {
    "Entities": [
      "{'entities': []}"
    ],
    "Matches": [
      {
        "Content": "ADBPG vector database.",
        "FileName": "process_info_19b9df4dc9ad4bf2b30eb2faa4a9a987.txt",
        "FileURL": "http://viapi-customer-pop.oss-cn-shanghai.aliyuncs.com/b4d8_207196811002111319_570c0e199f03428f812ab21fcc00dd6a",
        "Id": "273e3fc7-8f56-4167-a1bb-d35d2f3b9043",
        "LoaderMetadata": "{\"page\":1}",
        "Metadata": {
          "key": ""
        },
        "RerankScore": 0.1,
        "RetrievalSource": 3,
        "Score": 12,
        "Vector": [
          0
        ]
      }
    ],
    "Relations": [
      "{'relations': []}"
    ],
    "RequestId": "6B9E3255-4543-5B3B-9E00-6490CA64742B",
    "Status": "success",
    "Tokens": 42,
    "Usage": {
      "EmbeddingTokens": 21
    }
  },
  "ChatCompletion": {
    "Choices": [
      {
        "FinishReason": "finish",
        "Index": 0,
        "Message": {
          "Content": "The weather in Hangzhou is sunny.",
          "Role": "user",
          "ToolCalls": [
            {
              "Id": "\"chatcmpl-c1bebafa-cc48-44e2-88c6-1a3572952f8e\"",
              "Function": {
                "Arguments": "{\"city\":\"hangzhou\"}",
                "Name": "\"get_weather\""
              },
              "Index": 1
            }
          ],
          "ReasoningContent": "Logical reasoning process"
        }
      }
    ],
    "Created": 1758529748,
    "Id": "273e3fc7-8f56-4167-a1bb-d35d2f3b9043",
    "Model": "qwen-plus",
    "Usage": {
      "CompletionTokens": 42,
      "PromptTokens": 42,
      "PromptTokensDetails": {
        "CachedTokens": 24
      },
      "TotalTokens": 42
    }
  },
  "Message": "Successful",
  "Status": "success"
}
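A response of this shape can be unpacked with the standard library alone. The sketch below works on a trimmed-down, simplified copy of the success response; the helper name `extract_reply` is hypothetical, and treating `"success"` as the only successful Status value is an assumption based on the example. Note that Function.Arguments arrives as a JSON string and needs a second decoding step.

```python
import json
from typing import Any

# Simplified sample shaped like the success response (not real output).
sample_response: dict[str, Any] = {
    "ChatCompletion": {
        "Choices": [
            {
                "FinishReason": "finish",
                "Index": 0,
                "Message": {
                    "Content": "The weather in Hangzhou is sunny.",
                    "Role": "user",
                    "ToolCalls": [
                        {
                            "Id": "chatcmpl-c1bebafa-cc48-44e2-88c6-1a3572952f8e",
                            "Function": {
                                "Arguments": '{"city":"hangzhou"}',
                                "Name": "get_weather",
                            },
                            "Index": 1,
                        }
                    ],
                },
            }
        ],
        "Usage": {"CompletionTokens": 42, "PromptTokens": 42, "TotalTokens": 42},
    },
    "Message": "Successful",
    "Status": "success",
}


def extract_reply(response: dict[str, Any]) -> dict[str, Any]:
    """Pull the first reply, its tool calls, and token usage (illustrative)."""
    # Assumption: any Status other than "success" means the call failed.
    if response.get("Status") != "success":
        raise RuntimeError(f"request failed: {response.get('Message')}")
    completion = response["ChatCompletion"]
    choice = completion["Choices"][0]
    message = choice["Message"]
    tool_calls = [
        {
            "name": call["Function"]["Name"],
            # Arguments is a JSON-encoded string, so decode it separately.
            "arguments": json.loads(call["Function"]["Arguments"]),
        }
        for call in message.get("ToolCalls") or []
    ]
    return {
        "content": message.get("Content"),
        "finish_reason": choice.get("FinishReason"),
        "tool_calls": tool_calls,
        "total_tokens": completion.get("Usage", {}).get("TotalTokens"),
    }


reply = extract_reply(sample_response)
print(reply["content"])
print(reply["tool_calls"])
```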
Error codes
See Error Codes for a complete list.
Release notes
See Release Notes for a complete list.