This service integrates a Knowledge Base with a Large Language Model (LLM) to provide intelligent Q&A. You can call its Streaming API using server-sent events (SSE) or the Java async SDK.
Operation description
This API interacts with a Large Language Model (LLM) to generate answers grounded in content from specified Knowledge Bases. You can customize the request by configuring parameters for the Database Instance, Knowledge Retrieval, and Model Inference. The API includes a default System Prompt Template but also supports a custom one.
- DBInstanceId: Required. The ID of the Database Instance.
- KnowledgeParams: Optional. Parameters for Knowledge Retrieval, including the content to retrieve and the Merge Strategy.
- ModelParams: Required. Parameters for Model Inference, including the message list and the model name.
- PromptParams: Optional. A custom System Prompt Template.
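The parameter groups described above can be pictured as one nested request body. The following is a minimal sketch, assuming the body is represented as a plain dictionary keyed by the parameter names in the request parameters table; the helper name `build_request` and the collection name `my_collection` are hypothetical, and how the body is serialized on the wire depends on the SDK or client you use.

```python
import json


def build_request(db_instance_id, question, model, collections=None, prompt_template=None):
    """Assemble a request body from the parameter names documented on this page.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    body = {
        "DBInstanceId": db_instance_id,  # required
        "ModelParams": {  # required: model name and message list
            "Model": model,
            "Messages": [{"Role": "user", "Content": question}],
        },
    }
    if collections:
        # Optional: without KnowledgeParams, only a chat is performed.
        body["KnowledgeParams"] = {"SourceCollection": collections}
    if prompt_template is not None:
        # Optional: custom system prompt template.
        body["PromptParams"] = prompt_template
    return body


if __name__ == "__main__":
    request_body = build_request(
        "gp-xxxxxxxxx",
        "What is ADBPG?",
        "qwen-plus",
        collections=[{"Collection": "my_collection", "NamespacePassword": "namespacePassword"}],
    )
    print(json.dumps(request_body, indent=2))
```

Leaving out `collections` produces a body with only the two required groups, which corresponds to the chat-only case.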
RAM authorization
| Action | Access level | Resource type | Condition key | Dependent action |
| --- | --- | --- | --- | --- |
| gpdb:ChatWithKnowledgeBaseStream | get | *DBInstance | None | None |
Request syntax
POST / HTTP/1.1
Request parameters
Nested parameters are written as dotted paths; [] denotes an element of an array.

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| DBInstanceId | string | Yes | The ID of the instance. Note: You can call the DescribeDBInstances operation to query the IDs of all AnalyticDB for PostgreSQL instances in a specific region. | gp-xxxxxxxxx |
| RegionId | string | No | The region ID of the instance. | cn-hangzhou |
| KnowledgeParams | object | No | The parameters for knowledge retrieval. If this parameter is not specified, only a chat is performed. | |
| KnowledgeParams.MergeMethod | string | No | The method for merging the results from multiple knowledge bases. Default value: | "RRF" |
| KnowledgeParams.MergeMethodArgs | object | No | The parameters for merging the results from multiple knowledge bases. | |
| KnowledgeParams.MergeMethodArgs.Rrf | object | No | The parameters that can be configured when MergeMethod is set to RRF. | |
| KnowledgeParams.MergeMethodArgs.Rrf.K | integer | No | The constant k in the score formula 1/(k + rank_i). The value must be an integer greater than 1. | 60 |
| KnowledgeParams.MergeMethodArgs.Weight | object | No | The parameters that can be configured when MergeMethod is set to Weight. | |
| KnowledgeParams.MergeMethodArgs.Weight.Weights | array | No | The array of weights, one for each SourceCollection. | |
| KnowledgeParams.MergeMethodArgs.Weight.Weights[] | number | No | The weight of each SourceCollection. | 0.01 |
| KnowledgeParams.RerankFactor | number | No | The reranking factor. If this parameter is not empty, the vector retrieval results are reranked. Valid values: (1, 5]. | 5.0 |
| KnowledgeParams.SourceCollection | array<object> | Yes | The list of knowledge bases. | |
| KnowledgeParams.SourceCollection[] | array<object> | No | The knowledge base. | |
| KnowledgeParams.SourceCollection[].Collection | string | Yes | The name of the collection to be recalled. | cloud_index_adb_50943_prod |
| KnowledgeParams.SourceCollection[].Namespace | string | No | The namespace. Note: You can call the ListNamespaces operation to view the list. | ddstar_vector |
| KnowledgeParams.SourceCollection[].NamespacePassword | string | Yes | The password of the namespace. Note: This parameter is specified when you call the CreateNamespace operation. | namespacePassword |
| KnowledgeParams.SourceCollection[].QueryParams | object | No | The parameters related to the retrieval of the knowledge base. | |
| KnowledgeParams.SourceCollection[].QueryParams.Filter | string | No | The filter condition applied to the data to be retrieved, in the format of a SQL WHERE clause. | method_id='e41695f0-2851-40ac-b21d-dd337b60d71c' |
| KnowledgeParams.SourceCollection[].QueryParams.GraphEnhance | boolean | No | Specifies whether to enable knowledge graph enhancement. Default value: false. | true |
| KnowledgeParams.SourceCollection[].QueryParams.GraphSearchArgs | object | No | The parameters for knowledge graph retrieval. | |
| KnowledgeParams.SourceCollection[].QueryParams.GraphSearchArgs.GraphTopK | integer | No | The number of top entities and relationship edges to be returned. Default value: 60. | 60 |
| KnowledgeParams.SourceCollection[].QueryParams.HybridSearch | string | No | The dual-retrieval algorithm. By default, this parameter is empty, and the system directly compares the scores of vector retrieval and full-text retrieval and then sorts the results by score. Valid values: | Cascaded |
| KnowledgeParams.SourceCollection[].QueryParams.HybridSearchArgs | object | No | The algorithm parameters for dual-retrieval. | |
| KnowledgeParams.SourceCollection[].QueryParams.HybridSearchArgs.(key) | any | No | | {"RRF":{"k":60}} |
| KnowledgeParams.SourceCollection[].QueryParams.Metrics | string | No | The method for building a vector index. Valid values: | cosine |
| KnowledgeParams.SourceCollection[].QueryParams.RecallWindow | array | No | The recall window. If this parameter is not empty, the context of the returned retrieval results is added. The value is a two-element array. | |
| KnowledgeParams.SourceCollection[].QueryParams.RecallWindow[] | integer | No | An element of the two-element recall window array. | [-1,1] |
| KnowledgeParams.SourceCollection[].QueryParams.RerankFactor | number | No | The reranking factor. If this parameter is not empty, the vector retrieval results are reranked. Valid values: (1, 5]. | 2.0 |
| KnowledgeParams.SourceCollection[].QueryParams.TopK | integer | No | The number of top results to be returned. | 101 |
| KnowledgeParams.SourceCollection[].QueryParams.UseFullTextRetrieval | boolean | No | Specifies whether to use full-text retrieval in addition to vector retrieval (dual-retrieval). Default value: false, which means that only vector retrieval is used. | true |
| KnowledgeParams.TopK | integer | No | The number of top results to be returned after the results retrieved from multiple vector collections are merged. | 10 |
| PromptParams | string | No | The system prompt template, which must include {{ text_chunks }}, {{ user_system_prompt }}, {{ graph_entities }}, and {{ graph_relations }}. If this parameter is not specified, no custom template is applied. | "Answer the question based on the following knowledge: {{ text_chunks }}" |
| ModelParams | object | Yes | The parameters for calling the large language model (LLM). | |
| ModelParams.MaxTokens | integer | No | The maximum number of tokens to be generated. | 8192 |
| ModelParams.Messages | array<object> | Yes | The message list. | |
| ModelParams.Messages[] | object | No | A message in the list. | |
| ModelParams.Messages[].Content | string | No | The content of the message. | You are a helpful assistant. |
| ModelParams.Messages[].Role | string | No | The role of the message. Valid values: | user |
| ModelParams.Model | string | Yes | The name of the large model to be used. For more information about the options, see the Bailian documentation. | qwen-plus |
| ModelParams.N | integer | No | The number of candidate replies to be generated. | 1 |
| ModelParams.PresencePenalty | number | No | The presence penalty. Value range: [-2.0, 2.0]. | 1.0 |
| ModelParams.Seed | integer | No | The random seed. | 42 |
| ModelParams.Stop | array | No | The list of stop words. | |
| ModelParams.Stop[] | string | No | The stop word. | "\n" |
| ModelParams.Temperature | number | No | The sampling temperature. Value range: (0, 2). | 0.6 |
| ModelParams.Tools | array<object> | No | The list of tools. | |
| ModelParams.Tools[] | array<object> | No | The details of a tool. | |
| ModelParams.Tools[].Function | object | No | The function information. | |
| ModelParams.Tools[].Function.Description | string | No | The description of the function tool. | Get the weather. |
| ModelParams.Tools[].Function.Name | string | No | The name of the function tool. | get_weather |
| ModelParams.Tools[].Function.Parameters | any | No | The JSON Schema of the function parameters. | {"type": "object", ...} |
| ModelParams.TopP | number | No | The probability threshold for nucleus sampling. Value range: (0, 1). | 0.9 |
| IncludeKnowledgeBaseResults | boolean | No | Specifies whether to return the retrieval results. Default value: false. | false |
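The RRF merge method scores each result with the term 1/(k + rank_i), where k is the K parameter. The following is a minimal sketch of reciprocal rank fusion under stated assumptions: ranks start at 1, a document's score is the sum of its terms across all knowledge bases, and ties are broken by document ID. The page only gives the per-result term, so the service's exact aggregation may differ; the function name `rrf_merge` is hypothetical.

```python
from collections import defaultdict


def rrf_merge(ranked_lists, k=60, top_k=10):
    """Merge several ranked lists of document IDs with reciprocal rank fusion.

    Assumption: ranks are 1-based and per-list scores are summed.
    """
    if k <= 1:
        # The parameter table requires an integer greater than 1.
        raise ValueError("k must be greater than 1")
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism.
    merged = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return merged[:top_k]


if __name__ == "__main__":
    collection_a = ["a", "b", "c"]  # results from one knowledge base, best first
    collection_b = ["b", "c", "d"]  # results from another knowledge base
    for doc_id, score in rrf_merge([collection_a, collection_b], k=60, top_k=4):
        print(doc_id, round(score, 6))
```

A larger k flattens the difference between adjacent ranks, so documents that appear in several knowledge bases gain relative to a document that ranks first in only one.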
Response elements
Nested elements are written as dotted paths; [] denotes an element of an array.

| Element | Type | Description | Example |
| --- | --- | --- | --- |
| (root) | object | The response schema. | |
| RequestId | string | The ID of the request. | ABB39CC3-4488-4857-905D-2E4A051D0521 |
| MultiCollectionRecallResult | object | The recall results from multiple knowledge bases. | |
| MultiCollectionRecallResult.Entities | array | Details of the entities. | |
| MultiCollectionRecallResult.Entities[] | string | Details of an entity. | {'entities': []} |
| MultiCollectionRecallResult.Matches | array<object> | A list of retrieved items. | |
| MultiCollectionRecallResult.Matches[] | array<object> | A retrieved item. | |
| MultiCollectionRecallResult.Matches[].Content | string | The document content. | ADBPG vector database. |
| MultiCollectionRecallResult.Matches[].FileName | string | The file name. | a14b0221-e3f2-4cf2-96cd-b3c293510770.jpg |
| MultiCollectionRecallResult.Matches[].FileURL | string | The public URL of the retrieved file. This URL is valid for two hours by default. You can customize this duration with the | http://dailyshort-sh.oss-cn-shanghai.aliyuncs.com/vod-8efba5/f06147795c6c71f080605420848d0302/0ca34d5743a84bf7c68f489a60715dac-ld.mp4 |
| MultiCollectionRecallResult.Matches[].Id | string | The unique ID of the retrieved data entry. | 273e3fc7-8f56-4167-a1bb-d35d2f3b9043 |
| MultiCollectionRecallResult.Matches[].LoaderMetadata | any | The metadata generated when the document is loaded. | {"page":1} |
| MultiCollectionRecallResult.Matches[].Metadata | object | The custom metadata associated with the data entry. | |
| MultiCollectionRecallResult.Matches[].Metadata.(key) | any | | |
| MultiCollectionRecallResult.Matches[].RerankScore | number | The reranking score. | 0.12 |
| MultiCollectionRecallResult.Matches[].RetrievalSource | integer | The retrieval method used for this item. Valid values: | 0.12 |
| MultiCollectionRecallResult.Matches[].Score | number | The similarity score of this data entry, which is determined by the distance metric ( | 10 |
| MultiCollectionRecallResult.Matches[].Vector | array | The vector data. | |
| MultiCollectionRecallResult.Matches[].Vector[] | number | A single value in the vector array. | [] |
| MultiCollectionRecallResult.Relations | array | The names of the relations. | |
| MultiCollectionRecallResult.Relations[] | string | Details of a relationship edge. | {'relations': []} |
| MultiCollectionRecallResult.RequestId | string | The ID of the request. | ABB39CC3-4488-4857-905D-2E4A051D0521 |
| MultiCollectionRecallResult.Status | string | The status of the recall operation. Valid values: | success |
| MultiCollectionRecallResult.Tokens | integer | The number of tokens consumed. | 42 |
| MultiCollectionRecallResult.Usage | object | Usage statistics for the document embedding process. | |
| MultiCollectionRecallResult.Usage.EmbeddingTokens | integer | The number of tokens used for vectorization. Note: A token is the smallest unit of text that is processed. It can be a word, a phrase, a punctuation mark, or a character. | 158 |
| ChatCompletion | object | The response from the model. | |
| ChatCompletion.Choices | array<object> | A list of generated completion choices. | |
| ChatCompletion.Choices[] | array<object> | A generated completion choice. | |
| ChatCompletion.Choices[].FinishReason | string | The reason the model stopped generating output. | finish |
| ChatCompletion.Choices[].Index | integer | The index of the response choice. | 0 |
| ChatCompletion.Choices[].Message | object | The message object returned by the model. | |
| ChatCompletion.Choices[].Message.Content | string | The message content. | The weather in Hangzhou is sunny. |
| ChatCompletion.Choices[].Message.Role | string | The role of the message author. Valid values: | user |
| ChatCompletion.Choices[].Message.ToolCalls | array<object> | The tool calls generated by the model for the client to invoke. | |
| ChatCompletion.Choices[].Message.ToolCalls[] | array<object> | A single tool call generated by the model. | |
| ChatCompletion.Choices[].Message.ToolCalls[].Id | string | The ID of the tool call. | "chatcmpl-c1bebafa-cc48-44e2-88c6-1a3572952f8e" |
| ChatCompletion.Choices[].Message.ToolCalls[].Function | object | The function that the model chose to invoke. | |
| ChatCompletion.Choices[].Message.ToolCalls[].Function.Arguments | string | The arguments to pass to the function. | {"city":"hangzhou"} |
| ChatCompletion.Choices[].Message.ToolCalls[].Function.Name | string | The name of the function to invoke. | "get_weather" |
| ChatCompletion.Choices[].Message.ToolCalls[].Index | integer | The index of the tool definition in the | 1 |
| ChatCompletion.Choices[].Message.ReasoningContent | string | The content of the model's Chain of Thought (CoT). | Logical reasoning process |
| ChatCompletion.Created | integer | The Unix timestamp (in seconds) when the response was created. | 1758529748 |
| ChatCompletion.Id | string | The ID of the response. | 273e3fc7-8f56-4167-a1bb-d35d2f3b9043 |
| ChatCompletion.Model | string | The model that generated the response. | qwen-plus |
| ChatCompletion.Usage | object | Usage statistics for the completion request. | |
| ChatCompletion.Usage.CompletionTokens | integer | The number of tokens in the generated response. | 42 |
| ChatCompletion.Usage.PromptTokens | integer | The number of tokens in the input prompt. | 42 |
| ChatCompletion.Usage.PromptTokensDetails | object | Details about the prompt tokens. | |
| ChatCompletion.Usage.PromptTokensDetails.CachedTokens | integer | The number of prompt tokens that were retrieved from the cache. | 24 |
| ChatCompletion.Usage.TotalTokens | integer | The total number of tokens used in the request, including both prompt and completion tokens. | 42 |
| Message | string | A summary message for the operation. | Successful |
| Status | string | The status of the operation. Valid values: | success |
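The response elements above can be read defensively, because MultiCollectionRecallResult is presumably only populated when retrieval results are requested, and any nested object may be missing. The following is a minimal sketch, assuming the response has already been decoded into a Python dictionary with the element names from the table; the helper name `summarize_response` is hypothetical.

```python
def summarize_response(resp):
    """Extract the answer text, source files, and token usage from a decoded response.

    Hypothetical helper; every nested element is treated as optional.
    """
    completion = resp.get("ChatCompletion") or {}
    choices = completion.get("Choices") or []
    # Use the first choice only; N > 1 would produce additional candidates.
    message = (choices[0].get("Message") or {}) if choices else {}
    recall = resp.get("MultiCollectionRecallResult") or {}
    sources = [
        (match.get("FileName"), match.get("Score"))
        for match in recall.get("Matches") or []
    ]
    usage = completion.get("Usage") or {}
    return {
        "status": resp.get("Status"),
        "answer": message.get("Content") or "",
        "sources": sources,
        "total_tokens": usage.get("TotalTokens"),
    }


if __name__ == "__main__":
    sample = {
        "Status": "success",
        "MultiCollectionRecallResult": {
            "Matches": [{"FileName": "a14b0221-e3f2-4cf2-96cd-b3c293510770.jpg", "Score": 10}]
        },
        "ChatCompletion": {
            "Choices": [{"Message": {"Content": "The weather in Hangzhou is sunny."}}],
            "Usage": {"TotalTokens": 42},
        },
    }
    print(summarize_response(sample))
```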
Examples
Success response
JSON format
{
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "MultiCollectionRecallResult": {
    "Entities": [
      "{'entities': []}"
    ],
    "Matches": [
      {
        "Content": "ADBPG vector database.\n",
        "FileName": "a14b0221-e3f2-4cf2-96cd-b3c293510770.jpg",
        "FileURL": "http://dailyshort-sh.oss-cn-shanghai.aliyuncs.com/vod-8efba5/f06147795c6c71f080605420848d0302/0ca34d5743a84bf7c68f489a60715dac-ld.mp4",
        "Id": "273e3fc7-8f56-4167-a1bb-d35d2f3b9043",
        "LoaderMetadata": "{\"page\":1}\n",
        "Metadata": {
          "key": ""
        },
        "RerankScore": 0.12,
        "RetrievalSource": 0.12,
        "Score": 10,
        "Vector": [
          0
        ]
      }
    ],
    "Relations": [
      "{'relations': []}"
    ],
    "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
    "Status": "success",
    "Tokens": 42,
    "Usage": {
      "EmbeddingTokens": 158
    }
  },
  "ChatCompletion": {
    "Choices": [
      {
        "FinishReason": "finish",
        "Index": 0,
        "Message": {
          "Content": "The weather in Hangzhou is sunny.\n",
          "Role": "user",
          "ToolCalls": [
            {
              "Id": "\"chatcmpl-c1bebafa-cc48-44e2-88c6-1a3572952f8e\"\n",
              "Function": {
                "Arguments": "{\"city\":\"hangzhou\"}\n",
                "Name": "\"get_weather\"\n"
              },
              "Index": 1
            }
          ],
          "ReasoningContent": "Logical reasoning process"
        }
      }
    ],
    "Created": 1758529748,
    "Id": "273e3fc7-8f56-4167-a1bb-d35d2f3b9043\n",
    "Model": "qwen-plus\n",
    "Usage": {
      "CompletionTokens": 42,
      "PromptTokens": 42,
      "PromptTokensDetails": {
        "CachedTokens": 24
      },
      "TotalTokens": 42
    }
  },
  "Message": "Successful",
  "Status": "success"
}
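When the operation is called over server-sent events, the client has to split the stream into events before decoding any JSON. The following is a minimal sketch, assuming each event's `data` field carries a JSON object shaped like the success response above, with incremental text in ChatCompletion.Choices[].Message.Content. This page does not specify the per-event payload, so treat that shape as an assumption; the function names `iter_sse_data` and `collect_answer` are hypothetical.

```python
import json


def iter_sse_data(lines):
    """Yield the data payload of each server-sent event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:
        # Lenient choice: also emit a trailing event that lacks a final blank line.
        yield "\n".join(buffer)


def collect_answer(lines):
    """Concatenate message content across events (assumed payload shape, see above)."""
    parts = []
    for payload in iter_sse_data(lines):
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip payloads that are not JSON
        if not isinstance(chunk, dict):
            continue
        for choice in (chunk.get("ChatCompletion") or {}).get("Choices") or []:
            content = (choice.get("Message") or {}).get("Content")
            if content:
                parts.append(content)
    return "".join(parts)


if __name__ == "__main__":
    stream = [
        'data: {"ChatCompletion": {"Choices": [{"Message": {"Content": "Hello"}}]}}\n',
        "\n",
        ": keep-alive\n",
        'data: {"ChatCompletion": {"Choices": [{"Message": {"Content": ", world"}}]}}\n',
        "\n",
    ]
    print(collect_answer(stream))
```

In practice, `lines` would come from a streaming HTTP response decoded as text; the parser itself does not depend on any particular HTTP client.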
Error codes
See Error Codes for a complete list.
Release notes
See Release Notes for a complete list.