QueryKnowledgeBasesContent - AnalyticDB - Alibaba Cloud Documentation Center

Retrieves vectors and metadata from multiple specified knowledge bases using a natural-language query, then merges the results of multi-channel recall and returns them.

Try it now

Try this API in OpenAPI Explorer without manual signing. Successful calls automatically generate SDK code that matches your parameters. You can download the code, which has built-in credential security, for local use.

RAM authorization

The table below describes the authorization required to call this API. You can define it in a Resource Access Management (RAM) policy. The table's columns are detailed below:

Action: The actions that can be used in the Action element of RAM policy statements to grant permission to perform the operation.
API: The API that you can call to perform the action.
Access level: The predefined level of access granted for each API. Valid values: create, list, get, update, and delete.
Resource type: The type of the resource that supports authorization to perform the action. It indicates if the action supports resource-level permission. The specified resource must be compatible with the action. Otherwise, the policy will be ineffective.
- For APIs with resource-level permissions, required resource types are marked with an asterisk (*). Specify the corresponding Alibaba Cloud Resource Name (ARN) in the Resource element of the policy.
- For APIs without resource-level permissions, it is shown as All Resources. Use an asterisk (*) in the Resource element of the policy.
Condition key: The condition keys defined by the service. The key allows for granular control, applying to either actions alone or actions associated with specific resources. In addition to service-specific condition keys, Alibaba Cloud provides a set of common condition keys applicable across all RAM-supported services.
Dependent action: The dependent actions required to run the action. To complete the action, the RAM user or the RAM role must have the permissions to perform all dependent actions.

Action	Access level	Resource type	Condition key	Dependent action
gpdb:QueryKnowledgeBasesContent	create	*Document acs:gpdb:{#regionId}:{#accountId}:document/{#DBInstanceId}	None	
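
As an illustration of the table above, the following Python sketch assembles a RAM policy that grants this action on a single instance. The `build_policy` helper and the placeholder account ID are hypothetical. The `Version`, `Statement`, `Effect`, `Action`, and `Resource` elements follow the general RAM policy structure, and the ARN pattern is taken from the Resource type column. Treat it as a starting point and verify the result in the RAM console before use.

```python
import json


def build_policy(region_id: str, account_id: str, db_instance_id: str) -> dict:
    """Return a RAM policy document that allows QueryKnowledgeBasesContent on one instance."""
    # ARN pattern copied from the Resource type column of the authorization table.
    resource = f"acs:gpdb:{region_id}:{account_id}:document/{db_instance_id}"
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "gpdb:QueryKnowledgeBasesContent",
                "Resource": resource,
            }
        ],
    }


# Placeholder values only; substitute your own region, account ID, and instance ID.
policy = build_policy("cn-beijing", "1234567890123456", "gp-xxxxxxxxx")
print(json.dumps(policy, indent=2))
```

Scoping `Resource` to one instance ARN is a design choice for least privilege; the table marks Document as the resource type that supports resource-level permission.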

Request parameters

Parameter	Type	Required	Description	Example
DBInstanceId	string	Yes	The ID of the instance. Note Call the DescribeDBInstances operation to view details of all AnalyticDB for PostgreSQL instances in the destination region, including their instance IDs.	gp-xxxxxxxxx
RegionId	string	Yes	The ID of the region where the instance resides.	cn-beijing
Content	string	Yes	The text content to search for.	ADBPG是什么？
MergeMethod	string	No	The method used to merge results from multiple knowledge bases. Default value: RRF. Valid values: RRF and Weight.	RRF
MergeMethodArgs	object	No	The parameters for the merge method of each SourceCollection.
Rrf	object	No	The parameters you can configure when MergeMethod is set to RRF.
K	integer	No	The constant k in the scoring formula 1/(k+rank_i). It must be a positive integer greater than 1.	60
Weight	object	No	The parameters you can configure when MergeMethod is set to Weight.
Weights	array	No	The array of weights for each SourceCollection.
	number	No	The weight for each SourceCollection.	0.5
RerankFactor	number	No	The reranking factor. If this value is not empty, the vector retrieval results are reranked. Valid values: 1 < RerankFactor ≤ 5. Note Reranking is slower when documents are sparsely chunked. We recommend that the number of items reranked (TopK × Factor, rounded up) does not exceed 50.	2
SourceCollection	array<object>	Yes	The information about multiple collections to search.
	array<object>	No	Knowledge base
Collection	string	Yes	The name of the document collection. Note Create a collection by calling the CreateDocumentCollection operation. Call the ListDocumentCollections operation to view existing collections.	knowledge22
Namespace	string	No	The namespace. Note Create a namespace by calling the CreateNamespace operation. Call the ListNamespaces operation to view the list.	ns_cloud_index
NamespacePassword	string	Yes	The password for the namespace. Note This value is specified in the CreateNamespace operation.	ns_password
QueryParams	object	No	The query parameters for this collection, such as the filter condition, hybrid search settings, and the number of results to return.
Filter	string	No	The filter condition for the data to retrieve, formatted as a SQL WHERE clause. This is a Boolean expression that evaluates to true or false. Conditions can use simple comparison operators such as equals (=), not equals (<> or !=), greater than (>), less than (<), greater than or equal to (>=), or less than or equal to (<=). They can also use logical operators (AND, OR, NOT) to combine more complex expressions, or keywords such as IN, BETWEEN, and LIKE. Note For detailed syntax, see https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-where/	id = 'llm-52tvykqt6u67iw73_j6ovptwjk7_file_6ce3da1f7e69495d9f491f2180c86973_11967297'
GraphEnhance	boolean	No	Whether to enable knowledge graph enhancement. Default value: false.	true
GraphSearchArgs	object	No	The parameters for knowledge graph search.
GraphTopK	integer	No	The number of top entities and relationship edges to return. Default value: 60.	60
HybridSearch	string	No	The hybrid search algorithm. Default value: empty (the scores from dense vectors and full-text search are compared and sorted directly). Valid values: RRF: Reciprocal rank fusion. Uses a parameter k to control fusion effectiveness. See HybridSearchArgs for configuration details. Weight: Weighted ranking. Uses parameters to control the score weights of vectors and full-text search before sorting. See HybridSearchArgs for configuration details. Cascaded: Performs full-text search first, then performs vector search on those results.	Cascaded
HybridSearchArgs	object	No	The parameters for the hybrid search algorithm. Currently supports RRF and Weight. HybridPathsSetting specifies which retrieval paths to use: dense vectors (dense), sparse vectors (sparse), and full-text search (fulltext). If empty, the default paths are dense vectors and full-text search. RRF: The constant k in the scoring formula `1/(k+rank_i)`. It must be a positive integer greater than 1. Format: `{ "HybridPathsSetting": { "paths": "dense,fulltext" }, "RRF": { "k": 60 } }` Weight: Two-path retrieval (do not specify HybridPathsSetting, only specify alpha): Scoring formula: alpha × dense_score + (1 − alpha) × fulltext_score. The parameter alpha controls the relative weight of dense vectors and full-text search scores. Valid range: 0 to 1. A value of 0 means full-text search only. A value of 1 means dense vectors only. `{ "Weight": { "alpha": 0.5 } }` Three-path retrieval: Scoring formula: normalized_dense × dense_score + normalized_sparse × sparse_score + normalized_fulltext × fulltext_score. Here, dense, sparse, and fulltext represent the weights for dense vectors, sparse vectors, and full-text search respectively. Each weight must be ≥ 0. The system automatically normalizes these weights to the range 0–1 (i.e., normalized_x = x / (dense + sparse + fulltext)). `{ "HybridPathsSetting": { "paths": "dense,sparse,fulltext" }, "Weight": { "dense": 0.5, "sparse": 0.3, "fulltext": 0.2 } }`
	any	No	The parameter configuration value.	{ "RRF": { "k": 60 } }
Metrics	string	No	The metric used when building the vector index. Valid values: l2: Euclidean distance. ip: Inner product (dot product) distance. cosine: Cosine similarity.	cosine
RecallWindow	array	No	The recall window. If this value is not empty, it adds context to the returned search results. Format: a two-element array List<A, B>, where −10 ≤ A ≤ 0 and 0 ≤ B ≤ 10. Note Use this parameter when documents are over-chunked and context may be lost during retrieval. Reranking takes priority over windowing. That is, reranking happens first, then windowing.
	integer	No	The recall window range value.	[0,0]
RerankFactor	number	No	The reranking factor. If this value is not empty, the vector retrieval results are reranked. Valid values: 1 < RerankFactor ≤ 5. Note Reranking is slower when documents are sparsely chunked. We recommend that the number of items reranked (TopK × Factor, rounded up) does not exceed 50.	2.0
TopK	integer	No	The number of top results to return.	776
UseFullTextRetrieval	boolean	No	Whether to use full-text retrieval (two-path retrieval). Default value: false. Only vector retrieval is used if this value is false.	false
OrderBy	string	No	The field to sort by. Empty by default. The field must be a metadata field or a default table field, such as `id`. You can specify a single field, such as `chunk_id`, or multiple comma-separated fields, such as `block_id, chunk_id`. Descending order is also supported, for example `block_id DESC, chunk_id DESC`.	file_id,sort_num
Offset	integer	No	The offset for paged queries.	20
TopK	integer	No	The number of top results to return after merging multi-channel recall results.	10
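
The scoring formulas quoted in the table (RRF's `1/(k+rank_i)`, the two-path `alpha` blend, and the normalized three-path weights) can be illustrated with a small Python sketch. This only demonstrates the arithmetic as documented; it is not the service's internal implementation. The function names are hypothetical, and the assumption that ranks start at 1 is not stated in the source.

```python
from collections import defaultdict


def rrf_merge(ranked_lists, k=60, top_k=10):
    """Fuse ranked ID lists by summing 1 / (k + rank_i) over every list an ID appears in."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Assumption: the best hit in each list has rank 1.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]


def two_path_score(alpha, dense_score, fulltext_score):
    """alpha * dense_score + (1 - alpha) * fulltext_score, with alpha in [0, 1]."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * dense_score + (1 - alpha) * fulltext_score


def three_path_score(weights, scores):
    """Weighted sum over 'dense', 'sparse', and 'fulltext' after normalizing the weights."""
    total = sum(weights.values())
    if any(w < 0 for w in weights.values()) or total <= 0:
        raise ValueError("weights must be >= 0 and not all zero")
    # normalized_x = x / (dense + sparse + fulltext)
    return sum(weights[path] / total * scores[path] for path in weights)


# Two hypothetical per-collection rankings merged with k = 60.
merged = rrf_merge([["doc-a", "doc-b", "doc-c"], ["doc-b", "doc-d"]], k=60, top_k=3)
print([doc_id for doc_id, _ in merged])
```

Because `doc-b` appears in both lists, its two reciprocal-rank terms add up and it moves ahead of `doc-a`, which is ranked first in only one list. A larger `k` flattens the difference between adjacent ranks.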

Response elements

Element	Type	Description	Example
	object
RequestId	string	The request ID.	ABB39CC3-4488-4857-905D-2E4A051D0521
Message	string	The response message.	success
Status	string	The status of the API call. Valid values: success: The call succeeded. fail: The call failed.	success
Matches	object
MatchList	array<object>	The list of matched records.
	array<object>	A single record.
Id	string	The unique ID of the vector data.	doca-1234
Content	string	The text content.	云原生数据仓库AnalyticDB PostgreSQL版提供简单、快速、经济高效的PB级云端数据仓库解决方案。
Metadata	object	The metadata map.
	string	The value of the metadata map.	{\"pic_id\":\"text\",\"pic_name\":\"text\",\"pic_url\":\"text\"}
FileName	string	The file name.	my_doc.txt
Score	number	The similarity score for this item. The scoring algorithm matches the one specified when creating the index (l2, ip, or cosine).	0.12345
RetrievalSource	integer	The source of the retrieval result. Possible values are: 1 for vector retrieval, 2 for full-text index, and 3 for dual-path retrieval.	1
LoaderMetadata	string	The metadata added by the document loader.	{"page_pos": 1}
FileURL	string	The public URL of the image in the query result. Default validity period: 2 hours. You can specify a custom validity period using the UrlExpiration request parameter.	https://xxx-cn-beijing.aliyuncs.com/image/test.png
RerankScore	number	The reranking score.	6.2345
EmbeddingTokens	string	The number of tokens used during vectorization. Note A token is the smallest unit into which input text is divided. A token can be a word, phrase, punctuation mark, or character.	100
Usage	object	The resource usage for this query.
EmbeddingTokens	string	The number of tokens used during vectorization. Note A token is the smallest unit into which input text is divided. A token can be a word, phrase, punctuation mark, or character.	475
EmbeddingEntries	string	The number of entries processed during vectorization. Note An entry is a processing unit for text or images. For example, processing one piece of text counts as one entry. Processing one image counts as two entries.	10
Entities	object
entities	array<object>	The list of entities.
	object	Details about the entity.
Id	string	The ID of the entity.	1
Entity	string	The name of the entity.	Dr. Wang
Type	string	The type of the entity.	人物
Description	string	The description of the entity.	A former advisor at DeepMind.
FileName	string	The file name.	my_doc.txt
Relations	object
relations	array<object>	The list of relationship edges.
	object	Details about the relationship edge.
Id	string	The ID of the relationship edge.	1
SourceEntity	string	The source entity.	DeepMind前顾问
TargetEntity	string	The target entity.	Dr. Wang
Description	string	The description of the relationship edge.	Dr. Wang previously served as an advisor at DeepMind.
FileName	string	The file name.	my_doc.txt
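
Two details of the response table are easy to miss in client code: `RetrievalSource` is a numeric code, and the `Usage` counters are typed as strings. The following Python sketch shows one way to handle both. The enum and helper names are hypothetical, and the input is assumed to be a response body already parsed into Python dictionaries.

```python
from enum import IntEnum


class RetrievalSource(IntEnum):
    """Codes documented for the RetrievalSource response element."""
    VECTOR = 1
    FULL_TEXT = 2
    DUAL_PATH = 3


def usage_as_ints(usage):
    """Convert the string counters in Usage (for example "475") to integers."""
    return {name: int(value) for name, value in usage.items()}


def group_ids_by_source(match_list):
    """Group match IDs by the retrieval path that produced them."""
    groups = {source: [] for source in RetrievalSource}
    for match in match_list:
        groups[RetrievalSource(match["RetrievalSource"])].append(match["Id"])
    return groups


# Values taken from the Example column of the response table.
print(usage_as_ints({"EmbeddingTokens": "475", "EmbeddingEntries": "10"}))
print(group_ids_by_source([{"Id": "doca-1234", "RetrievalSource": 1}]))
```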

Examples
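
Request sketch

The following Python sketch assembles a request body as a plain dictionary and checks the ranges documented in the request parameters table. It does not call any SDK. The helper names are hypothetical, and the nesting (per-collection settings under `QueryParams`, `K` under `MergeMethodArgs.Rrf`) is inferred from the order of rows in the table, so confirm it in OpenAPI Explorer.

```python
import math


def build_source_collection(collection, namespace_password, namespace=None,
                            top_k=None, rerank_factor=None, recall_window=None):
    """Build one SourceCollection entry, validating the documented value ranges."""
    query_params = {}
    if top_k is not None:
        query_params["TopK"] = top_k
    if rerank_factor is not None:
        if not 1 < rerank_factor <= 5:
            raise ValueError("RerankFactor must satisfy 1 < RerankFactor <= 5")
        query_params["RerankFactor"] = rerank_factor
    if recall_window is not None:
        a, b = recall_window
        if not (-10 <= a <= 0 and 0 <= b <= 10):
            raise ValueError("RecallWindow must be [A, B] with -10 <= A <= 0 and 0 <= B <= 10")
        query_params["RecallWindow"] = [a, b]
    entry = {"Collection": collection, "NamespacePassword": namespace_password}
    if namespace is not None:
        entry["Namespace"] = namespace
    if query_params:
        # Assumption: per-collection search settings are nested under QueryParams.
        entry["QueryParams"] = query_params
    return entry


def reranked_item_count(top_k, rerank_factor):
    """TopK x Factor, rounded up. The table recommends keeping this at or below 50."""
    return math.ceil(top_k * rerank_factor)


# Example values are taken from the Example column of the request parameters table.
request = {
    "DBInstanceId": "gp-xxxxxxxxx",
    "RegionId": "cn-beijing",
    "Content": "What is ADBPG?",
    "MergeMethod": "RRF",
    "MergeMethodArgs": {"Rrf": {"K": 60}},
    "TopK": 10,
    "SourceCollection": [
        build_source_collection(
            "knowledge22",
            "ns_password",
            namespace="ns_cloud_index",
            top_k=10,
            rerank_factor=2.0,
            recall_window=[0, 0],
        )
    ],
}
print(reranked_item_count(10, 2.0))
```

Validating on the client side is optional. It surfaces range errors before the call is sent rather than in the API response.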

Success response

JSON format

{
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "Message": "success",
  "Status": "success",
  "Matches": {
    "MatchList": [
      {
        "Id": "doca-1234",
        "Content": "云原生数据仓库AnalyticDB PostgreSQL版提供简单、快速、经济高效的PB级云端数据仓库解决方案。",
        "Metadata": {
          "key": "{\\\"pic_id\\\":\\\"text\\\",\\\"pic_name\\\":\\\"text\\\",\\\"pic_url\\\":\\\"text\\\"}"
        },
        "FileName": "my_doc.txt",
        "Score": 0.12345,
        "RetrievalSource": 1,
        "LoaderMetadata": "{\"page_pos\": 1}",
        "FileURL": "https://xxx-cn-beijing.aliyuncs.com/image/test.png",
        "RerankScore": 6.2345
      }
    ]
  },
  "EmbeddingTokens": "100",
  "Usage": {
    "EmbeddingTokens": "475",
    "EmbeddingEntries": "10"
  },
  "Entities": {
    "entities": [
      {
        "Id": "1",
        "Entity": "Dr. Wang",
        "Type": "人物",
        "Description": "A former advisor at DeepMind.",
        "FileName": "my_doc.txt"
      }
    ]
  },
  "Relations": {
    "relations": [
      {
        "Id": "1",
        "SourceEntity": "DeepMind前顾问",
        "TargetEntity": "Dr. Wang",
        "Description": "Dr. Wang previously served as an advisor at DeepMind.",
        "FileName": "my_doc.txt\n"
      }
    ]
  }
}
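
As a sketch of how a client might consume a response shaped like the one above, the following Python code checks `Status` and decodes `LoaderMetadata`, which the example returns as a JSON-encoded string rather than an object. The `extract_matches` helper is hypothetical, and the sample below is a trimmed copy of the example response.

```python
import json


def extract_matches(response):
    """Return simplified match records, raising if the call did not succeed."""
    if response.get("Status") != "success":
        raise RuntimeError(f"Query failed: {response.get('Message')}")
    results = []
    for match in response.get("Matches", {}).get("MatchList", []):
        loader_raw = match.get("LoaderMetadata")
        results.append({
            "id": match["Id"],
            "content": match["Content"],
            "file_name": match.get("FileName"),
            # LoaderMetadata arrives as a JSON string, so decode it a second time.
            "loader_metadata": json.loads(loader_raw) if loader_raw else {},
            "score": match.get("Score"),
            "rerank_score": match.get("RerankScore"),
        })
    return results


# Trimmed copy of the example response, with shortened placeholder content.
sample = json.loads("""
{
  "Status": "success",
  "Message": "success",
  "Matches": {
    "MatchList": [
      {
        "Id": "doca-1234",
        "Content": "example content",
        "FileName": "my_doc.txt",
        "Score": 0.12345,
        "LoaderMetadata": "{\\"page_pos\\": 1}",
        "RerankScore": 6.2345
      }
    ]
  }
}
""")
matches = extract_matches(sample)
print(matches[0]["loader_metadata"]["page_pos"])
```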

Error codes

See Error Codes for a complete list.

Release notes

See Release Notes for a complete list.