Provides intelligent question-and-answer services by combining a knowledge base with a large language model. This is a streaming API operation that you call by using Server-Sent Events (SSE) or the Java asynchronous SDK.
Operation description
This operation lets you query a large language model and receive answers grounded in a specified knowledge base collection. You can customize a request with multiple parameters, including the database instance ID, knowledge retrieval parameters, and model inference parameters. A default system prompt template is provided, and you can replace it with a custom one. The main parameters are:
- DBInstanceId: required. This parameter specifies the ID of the database instance.
- KnowledgeParams: optional. It contains parameters related to knowledge retrieval, such as retrieval content and merge policy.
- ModelParams: required. It contains parameters related to model inference, such as the message list and the name of the model.
- PromptTemplate: optional. It is used to customize a system prompt template.
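The required and optional parameters listed above can be sketched as a plain request structure. The following Python snippet is only an illustration: `build_chat_request` is a hypothetical helper, not part of any SDK, and how a real SDK serializes nested objects is not shown in this document. The field names are taken from the request parameter table in this topic.

```python
import json


def build_chat_request(db_instance_id, model, messages, knowledge_params=None):
    """Assemble the request parameters as a plain dict (hypothetical helper)."""
    # DBInstanceId and ModelParams are marked as required in this topic.
    if not db_instance_id:
        raise ValueError("DBInstanceId is required")
    if not model or not messages:
        raise ValueError("ModelParams requires Model and Messages")

    request = {
        "DBInstanceId": db_instance_id,
        "ModelParams": {"Model": model, "Messages": messages},
    }
    # KnowledgeParams is optional; without it only chat mode is enabled.
    if knowledge_params is not None:
        request["KnowledgeParams"] = knowledge_params
    return request


request = build_chat_request(
    db_instance_id="gp-xxxxxxxxx",
    model="qwen-plus",
    messages=[{"Role": "user", "Content": "What is AnalyticDB for PostgreSQL?"}],
    knowledge_params={
        "SourceCollection": [
            {
                "Collection": "cloud_index_adb_50943_prod",
                "Namespace": "ddstar_vector",
                "NamespacePassword": "namespacePassword",
            }
        ],
        "TopK": 10,
    },
)
print(json.dumps(request, indent=2))
```

Omitting `knowledge_params` yields a request that contains only DBInstanceId and ModelParams, which corresponds to chat mode without retrieval.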
Authorization information
The following table shows the authorization information for this API operation. You can use it in the Action element of a policy to grant a RAM user or RAM role the permissions to call this operation. The columns are described as follows:
- Operation: the value that you can use in the Action element to specify the operation on a resource.
- Access level: the access level of each operation. The levels are read, write, and list.
- Resource type: the type of the resource on which you can authorize the RAM user or the RAM role to perform the operation. Take note of the following items:
  - Required resource types are marked with an asterisk (*) prefix.
  - If the permissions cannot be granted at the resource level, All Resources is used in the Resource type column of the operation.
- Condition Key: the condition key that is defined by the cloud service.
- Associated operation: other operations that the RAM user or the RAM role must have permissions to perform to complete the operation. To complete the operation, the RAM user or the RAM role must have the permissions to perform the associated operations.
| Operation | Access level | Resource type | Condition key | Associated operation |
|---|---|---|---|---|
| gpdb:ChatWithKnowledgeBaseStream | get | *DBInstance acs:gpdb:{#regionId}:{#accountId}:dbinstance/{#DBInstanceId} | none | |
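The authorization information above can be turned into a policy document. The following Python snippet is a minimal sketch that assumes the standard RAM policy layout (`Version`, `Statement`, `Effect`, `Action`, `Resource`). The `build_policy` helper and the account ID are hypothetical placeholders.

```python
import json


def build_policy(region_id, account_id, db_instance_id):
    """Build a policy that allows calling ChatWithKnowledgeBaseStream on one instance."""
    # Resource ARN pattern taken from the Resource type column of the table.
    resource = f"acs:gpdb:{region_id}:{account_id}:dbinstance/{db_instance_id}"
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["gpdb:ChatWithKnowledgeBaseStream"],
                "Resource": [resource],
            }
        ],
    }


# The account ID below is a made-up placeholder.
policy = build_policy("cn-hangzhou", "1234567890123456", "gp-xxxxxxxxx")
print(json.dumps(policy, indent=2))
```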
Request parameters
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| DBInstanceId | string | Yes | The instance ID. Note: You can call the DescribeDBInstances operation to query the information about all AnalyticDB for PostgreSQL instances within a region, including instance IDs. | gp-xxxxxxxxx |
| RegionId | string | No | The region ID of the instance. | cn-hangzhou |
| KnowledgeParams | object | No | The knowledge retrieval parameter object. If you do not specify this parameter, only chat mode is enabled. | |
| MergeMethod | string | No | The method used to merge the results retrieved from multiple knowledge bases. Default value: RRF. Valid values: RRF and Weight. | RRF |
| MergeMethodArgs | object | No | The parameters for merging results from multiple knowledge bases. | |
| Rrf | object | No | The parameter to be configured when the MergeMethod parameter is set to RRF. | |
| K | long | No | The smoothing constant k in the score formula 1/(k + rank_i). The value must be a positive integer greater than 1. | 60 |
| Weight | object | No | The parameter to be configured when the MergeMethod parameter is set to Weight. | |
| Weights | array | No | An array of weights, one for each SourceCollection. | |
| | double | No | The weight of a SourceCollection. | 0.01 |
| RerankFactor | double | No | The rerank factor. If you specify this parameter, the search results are reranked once more. Valid values: 1<RerankFactor<=5. | 5.0 |
| SourceCollection | array<object> | Yes | The knowledge base collections to query. | |
| | object | No | A knowledge base collection. | |
| Collection | string | Yes | The name of the collection to be recalled. | cloud_index_adb_50943_prod |
| Namespace | string | No | The namespace. Note: You can call the ListNamespaces operation to query a list of namespaces. | ddstar_vector |
| NamespacePassword | string | Yes | The password for the namespace. Note: The value of this parameter is specified by the CreateNamespace operation. | namespacePassword |
| QueryParams | object | No | Parameters related to the knowledge base retrieval. | |
| Filter | string | No | The condition that is used to filter the data to be retrieved. Specify this parameter in the same format as a SQL WHERE clause. | method_id='e41695f0-2851-40ac-b21d-dd337b60d71c' |
| GraphEnhance | boolean | No | Specifies whether to enable knowledge graph enhancement. Default value: false. | true |
| GraphSearchArgs | object | No | The knowledge graph retrieval parameters. | |
| GraphTopK | long | No | The number of top entities and relationship edges. Default value: 60. | 60 |
| HybridSearch | string | No | The dual-path retrieval algorithm. This parameter is empty by default, which specifies that the scores of vector retrieval and full-text retrieval are directly compared and sorted together. | Cascaded |
| HybridSearchArgs | object | No | The parameters of the dual-path retrieval algorithm. RRF and Weight are currently supported. | |
| | any | No | | {"RRF":{"k":60}} |
| Metrics | string | No | The method that is used to create vector indexes. | cosine |
| RecallWindow | array | No | The retrieval window. If you specify this parameter, the context of the retrieved result is added to the output. Format: List<A, B>. Valid values: -10<=A<=0 and 0<=B<=10. | [-1,1] |
| | long | No | A boundary of the retrieval window (A or B). | |
| RerankFactor | double | No | The rerank factor. If you specify this parameter, the retrieved results are reranked once more. Valid values: 1<RerankFactor<=5. | 2.0 |
| TopK | long | No | The number of top results. | 101 |
| UseFullTextRetrieval | boolean | No | Specifies whether to use full-text retrieval (dual-path retrieval). The default value is false, which means only vector retrieval is used. | true |
| TopK | long | No | Specifies the number of top results to return after merging retrieved results from multiple vector collections. | 10 |
| PromptParams | string | No | The system prompt template, which should include the placeholders {{ text_chunks }}, {{ user_system_prompt }}, {{ graph_entities }}, and {{ graph_relations }}. If a placeholder is omitted, the corresponding section has no effect. | |
| ModelParams | object | Yes | The Large Language Model (LLM) invocation parameter object. | |
| MaxTokens | long | No | Maximum number of tokens to generate. | 8192 |
| Messages | array<object> | Yes | The message list. | |
| | object | No | A message. | |
| Content | string | No | The message content. | |
| Role | string | No | The message role. | user |
| Model | string | Yes | The model name. For the available models, see the Model Studio documentation. | qwen-plus |
| N | long | No | The number of candidate responses to generate. | 1 |
| PresencePenalty | double | No | Presence penalty coefficient (-2.0 to 2.0). | 1.0 |
| Seed | long | No | The random seed. | 42 |
| Stop | array | No | The stop words. | |
| | string | No | A stop word. | "\n" |
| Temperature | double | No | The sampling temperature. Valid values: 0 to 2. | 0.6 |
| Tools | array<object> | No | The list of tools. | |
| | object | No | The details of a tool. | |
| Function | object | No | The information about a function. | |
| Description | string | No | The description of the function. | |
| Name | string | No | The name of the function. | get_weather |
| Parameters | any | No | JSON Schema for function parameters. | {"type": "object", ...} |
| TopP | double | No | Top-p (nucleus) sampling threshold (0–1). | 0.9 |
| IncludeKnowledgeBaseResults | boolean | No | Whether to return the retrieved result. Default value: false. | false |
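The score formula 1/(k + rank_i) given for the K parameter can be illustrated with a short Python sketch of reciprocal rank fusion (RRF). This is the commonly used form of RRF, in which a document's score is the sum of 1/(k + rank) over every ranked list that contains it, with ranks starting at 1. Whether the service computes the merge in exactly this way is an assumption, and `rrf_merge` is a hypothetical helper.

```python
from collections import defaultdict


def rrf_merge(ranked_lists, k=60, top_k=10):
    """Merge several ranked lists of IDs with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Assumption: ranks are 1-based, so the best hit contributes 1/(k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]


# Two made-up result lists, one per source collection.
collection_a = ["doc1", "doc2", "doc3"]
collection_b = ["doc2", "doc4", "doc1"]

merged = rrf_merge([collection_a, collection_b], k=60, top_k=3)
for doc_id, score in merged:
    print(doc_id, round(score, 6))
```

A larger k flattens the difference between adjacent ranks, so documents that appear in several lists gain more relative to documents that rank high in only one list.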
Response parameters
Examples
Sample success responses
JSON format
{
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "MultiCollectionRecallResult": {
    "Entities": [
      "{'entities': []}"
    ],
    "Matches": [
      {
        "Content": "",
        "FileName": "a14b0221-e3f2-4cf2-96cd-b3c293510770.jpg",
        "FileURL": "http://dailyshort-sh.oss-cn-shanghai.aliyuncs.com/vod-8efba5/f06147795c6c71f080605420848d0302/0ca34d5743a84bf7c68f489a60715dac-ld.mp4",
        "Id": "273e3fc7-8f56-4167-a1bb-d35d2f3b9043",
        "LoaderMetadata": {
          "page": 1
        },
        "Metadata": {
          "Source": 1
        },
        "RerankScore": 0.12,
        "RetrievalSource": 0.12,
        "Score": 10,
        "Vector": [
          0
        ]
      }
    ],
    "Relations": [
      "{'relations': []}"
    ],
    "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
    "Status": "success",
    "Tokens": 42,
    "Usage": {
      "EmbeddingTokens": 158
    }
  },
  "ChatCompletion": {
    "Choices": [
      {
        "FinishReason": "finish",
        "Index": 0,
        "Message": {
          "Content": "",
          "Role": "user",
          "ToolCalls": [
            {
              "Id": "chatcmpl-c1bebafa-cc48-44e2-88c6-1a3572952f8e",
              "Function": {
                "Arguments": {
                  "city": "hangzhou"
                },
                "Name": "get_weather"
              },
              "Index": 1
            }
          ],
          "ReasoningContent": "Logical reasoning process\n"
        }
      }
    ],
    "Created": 1758529748,
    "Id": "273e3fc7-8f56-4167-a1bb-d35d2f3b9043\n",
    "Model": "qwen-plus\n",
    "Usage": {
      "CompletionTokens": 42,
      "PromptTokens": 42,
      "PromptTokensDetails": {
        "CachedTokens": 24
      },
      "TotalTokens": 42
    }
  },
  "Message": "Successful",
  "Status": "success"
}
Error codes
For a list of error codes, see Service error codes.
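Because this operation streams its output over SSE, a client has to reassemble the answer from individual events. The following Python sketch parses the `data:` lines of an SSE stream and concatenates the message content. It assumes that each event carries a JSON object shaped like the sample success response and that each `Content` value is an incremental fragment. Neither assumption is confirmed in this topic, and `iter_sse_data` and `collect_answer` are hypothetical helpers.

```python
import json


def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            # Lines that start with a colon are comments (often keep-alives).
            continue
        if line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
    if buffer:
        # Lenient choice: also emit a trailing event that lacks a final blank line.
        yield "\n".join(buffer)


def collect_answer(lines):
    """Concatenate Message.Content from every event (assumed response shape)."""
    parts = []
    for payload in iter_sse_data(lines):
        event = json.loads(payload)
        for choice in event.get("ChatCompletion", {}).get("Choices", []):
            parts.append(choice.get("Message", {}).get("Content", ""))
    return "".join(parts)


# A made-up stream with two events and one comment line.
stream = [
    'data: {"ChatCompletion": {"Choices": [{"Message": {"Content": "Hello"}}]}}\n',
    "\n",
    ": keep-alive\n",
    'data: {"ChatCompletion": {"Choices": [{"Message": {"Content": ", world"}}]}}\n',
    "\n",
]
answer = collect_answer(stream)
print(answer)
```

In a real client, `lines` would be the decoded lines of the HTTP response body. Events that contain no ChatCompletion object contribute nothing to the answer.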
