This operation retrieves vectors and metadata from a specified document collection by using a natural-language query.
RAM authorization
| Action | Access level | Resource type | Condition key | Dependent action |
| gpdb:QueryContent | create | *Document | None | None |
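As a rough illustration of the authorization row above, a custom RAM policy that allows this action might be assembled as follows. This is a sketch under assumptions: the `"Resource": "*"` value is chosen only for brevity and is not taken from this page, so in practice you would likely narrow it to the relevant instance or document resource.

```python
import json

# Minimal custom RAM policy that allows the QueryContent action.
# ASSUMPTION: "Resource": "*" is used for illustration only; scope it down
# to the specific resource in a real policy.
policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["gpdb:QueryContent"],
            "Resource": "*",
        }
    ],
}

# Serialize the policy so it can be pasted into a policy editor.
policy_document = json.dumps(policy, indent=2)
print(policy_document)
```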
Request parameters
|
Parameter |
Type |
Required |
Description |
Example |
| DBInstanceId |
string |
Yes |
The instance ID. Note: Call the DescribeDBInstances operation to view the details of all AnalyticDB for PostgreSQL instances in the target region, including the instance IDs. |
gp-xxxxxxxxx |
| Namespace |
string |
No |
The namespace. The default value is public. Note: Create a namespace by calling the CreateNamespace operation, and list existing namespaces by calling the ListNamespaces operation. |
mynamespace |
| Collection |
string |
Yes |
The name of the document collection. Note: Create a document collection by calling the CreateDocumentCollection operation, and list existing collections by calling the ListDocumentCollections operation. |
document |
| RegionId |
string |
Yes |
The region ID of the instance. |
cn-hangzhou |
| NamespacePassword |
string |
Yes |
The password of the namespace. Note: This value is specified when you call the CreateNamespace operation. |
testpassword |
| Content |
string |
No |
The text content to use for retrieval. |
What is ADBPG? |
| Filter |
string |
No |
The filter condition for the data to query. The format is the same as a SQL WHERE clause: an expression that returns a Boolean value (true or false). The condition can be a simple comparison that uses an operator such as equal to (=), not equal to (<> or !=), greater than (>), less than (<), greater than or equal to (>=), or less than or equal to (<=). It can also be a more complex expression that combines conditions with logical operators (AND, OR, NOT) or uses keywords such as IN, BETWEEN, and LIKE.
|
title = 'test' AND name like 'test%' |
| RecallWindow |
array |
No |
The recall window. If this value is not empty, the context of the retrieval results is returned. The format is an array of two elements: List<A, B>, where -10 <= A <= 0 and 0 <= B <= 10.
|
|
|
integer |
No |
The size of the recall window. |
[-5, 5] |
|
| TopK |
integer |
No |
The number of top results to return. |
10 |
| RerankFactor |
number |
No |
The reranking factor. If this value is not empty, the vector retrieval results are reranked. The value must be in the range of 1 < RerankFactor <= 5.
|
2 |
| UseFullTextRetrieval |
boolean |
No |
Deprecated. Specifies whether to use the full-text index in addition to vector retrieval (dual-channel recall). The default value is false, which means only vector retrieval is used. |
true |
| Metrics |
string |
No |
The similarity algorithm for retrieval. If this parameter is empty, the algorithm specified when the knowledge base was created is used. Do not set this parameter unless you have special requirements. Valid values: l2, ip, and cosine.
|
cosine |
| FileName |
string |
No |
In a search-by-image scenario, the source file name of the image to search for. Note: The image file must have a file extension. Supported extensions are: bmp, jpg, jpeg, png, and tiff. |
test.jpg |
| FileUrl |
string |
No |
In a search-by-image scenario, the publicly accessible URL of the image file. Note: The image file must have a file extension. Supported extensions are: bmp, jpg, jpeg, png, and tiff. |
https://xx/myImage.jpg |
| IncludeVector |
boolean |
No |
Specifies whether to return vectors. The default value is false.
|
true |
| HybridSearch |
string |
No |
The multi-channel recall algorithm. The default value is empty, which means the scores of dense vectors and full-text search are directly compared and sorted. Valid values:
|
RRF |
| HybridSearchArgs |
object |
No |
The algorithm parameters for multi-channel recall. Currently, RRF and Weight are supported. HybridPathsSetting can specify the recall of dense vectors (dense), sparse vectors (sparse), and full-text index (fulltext). If the value is empty, dense vectors and full-text index are recalled by default.
|
|
|
object |
No |
The name of the multi-channel recall parameter. |
||
|
any |
No |
The parameter value. |
{ "HybridPathsSetting": { "paths": "dense,fulltext" }, "RRF": { "k": 60 } } |
|
| IncludeMetadataFields |
string |
No |
The metadata fields to return. The default value is empty. Separate multiple fields with commas. |
title,page |
| IncludeFileUrl |
boolean |
No |
Specifies whether to return the URL of the document together with the results. By default, the URL is not returned. |
false |
| UrlExpiration |
string |
No |
The validity period of the returned image URL. If this parameter is not set, the URL is valid for 2 hours.
|
7200s |
| GraphEnhance |
boolean |
No |
Specifies whether to enable knowledge graph enhancement. Default value: false. |
false |
| GraphSearchArgs |
object |
No |
The knowledge graph retrieval parameters. |
|
| GraphTopK |
integer |
No |
The number of top entities and relationship edges to return. Default value: 60. |
60 |
| OrderBy |
string |
No |
The field to sort by. The default value is empty. The field must be a metadata field or a default field in the table, such as id. The following formats are supported: a single field, such as chunk_id; multiple fields separated by commas (,), such as block_id,chunk_id; and descending order, such as block_id DESC,chunk_id DESC. |
created_at |
| Offset |
integer |
No |
The offset for paged queries. |
0 |
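The parameter constraints above can be checked on the client side before a request is sent. The following is a minimal sketch, not an SDK call: `build_query_content_params` is a hypothetical helper introduced here for illustration, and how the resulting dictionary is serialized on the wire (for example, whether `RecallWindow` and `HybridSearchArgs` are JSON-encoded) depends on the SDK you use and is not shown.

```python
import json
from typing import Any, Dict, List, Optional


def build_query_content_params(
    db_instance_id: str,
    region_id: str,
    collection: str,
    namespace_password: str,
    content: Optional[str] = None,
    namespace: str = "public",
    top_k: Optional[int] = None,
    filter_expr: Optional[str] = None,
    recall_window: Optional[List[int]] = None,
    rerank_factor: Optional[float] = None,
    hybrid_search: Optional[str] = None,
    hybrid_search_args: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble QueryContent parameters and check documented ranges."""
    params: Dict[str, Any] = {
        "DBInstanceId": db_instance_id,
        "RegionId": region_id,
        "Namespace": namespace,
        "Collection": collection,
        "NamespacePassword": namespace_password,
    }
    if recall_window is not None:
        # Documented as [A, B] with -10 <= A <= 0 and 0 <= B <= 10.
        if len(recall_window) != 2:
            raise ValueError("RecallWindow must contain exactly two integers")
        lower, upper = recall_window
        if not (-10 <= lower <= 0 and 0 <= upper <= 10):
            raise ValueError("RecallWindow must satisfy -10 <= A <= 0 and 0 <= B <= 10")
        params["RecallWindow"] = [lower, upper]
    if rerank_factor is not None:
        # Documented range: 1 < RerankFactor <= 5.
        if not (1 < rerank_factor <= 5):
            raise ValueError("RerankFactor must satisfy 1 < RerankFactor <= 5")
        params["RerankFactor"] = rerank_factor
    optional = {
        "Content": content,
        "TopK": top_k,
        "Filter": filter_expr,
        "HybridSearch": hybrid_search,
        "HybridSearchArgs": hybrid_search_args,
    }
    # Only send optional parameters that were actually provided.
    params.update({key: value for key, value in optional.items() if value is not None})
    return params


# Example values are taken from the Example column of the table above.
params = build_query_content_params(
    db_instance_id="gp-xxxxxxxxx",
    region_id="cn-hangzhou",
    collection="document",
    namespace_password="testpassword",
    content="What is ADBPG?",
    namespace="mynamespace",
    top_k=10,
    filter_expr="title = 'test' AND name like 'test%'",
    recall_window=[-5, 5],
    rerank_factor=2,
    hybrid_search="RRF",
    hybrid_search_args={"HybridPathsSetting": {"paths": "dense,fulltext"}, "RRF": {"k": 60}},
)
print(json.dumps(params, indent=2))
```

Validating locally is a design choice rather than a requirement: the service presumably enforces the same ranges, but failing early gives a clearer error than a rejected request.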
Response elements
|
Element |
Type |
Description |
Example |
|
object |
|||
| RequestId |
string |
The request ID. |
ABB39CC3-4488-4857-905D-2E4A051D0521 |
| Message |
string |
The returned message. |
success |
| Status |
string |
The status. Valid values:
|
success |
| Matches |
object |
||
| MatchList |
array<object> |
The list of matched items. |
|
|
array<object> |
A single record. |
||
| Id |
string |
The unique ID of the vector data. |
doca-1234 |
| Content |
string |
The text content. |
The cloud-native data warehouse AnalyticDB for PostgreSQL provides a simple, fast, and cost-effective petabyte-scale cloud data warehouse solution. |
| Metadata |
object |
The metadata map. |
|
|
string |
The metadata. |
{"title":"test"} |
|
| Vector |
object |
||
| VectorList |
array |
The list of vector data. |
|
|
number |
The vector data. |
[1.2123,-0.12314,...] |
|
| FileName |
string |
The file name. |
my_doc.txt |
| Score |
number |
The similarity score of this data entry. The score is computed with the similarity algorithm (l2, ip, or cosine) that was specified when the index was created. |
0.12345 |
| RetrievalSource |
integer |
The source of the retrieval result. 1 indicates vector retrieval, 2 indicates full-text index, and 3 indicates dual-channel recall. |
1 |
| LoaderMetadata |
string |
The metadata from when the document loader loaded the document. |
{"page_pos": 1} |
| FileURL |
string |
The public URL of the image in the query result. The default validity period is 2 hours. You can specify a custom validity period using the UrlExpiration input parameter. |
https://xxx-cn-beijing.aliyuncs.com/image/test.png |
| RerankScore |
number |
The reranking score. |
6.2345 |
| WindowMatches |
object |
||
| windowMatches |
array<object> |
The list of windowed matches. |
|
|
array<object> |
|||
| WindowMatch |
object |
||
| windowMatch |
array<object> |
The list of matches for a single top window. |
|
|
array<object> |
|||
| Id |
string |
The unique ID of the vector data. |
doca-2345 |
| Content |
string |
The text content. |
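The cloud-native data warehouse AnalyticDB for PostgreSQL is a massively parallel processing (MPP) data warehouse service that provides online analysis of massive amounts of data. |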
| Metadata |
object |
The metadata map. |
|
|
string |
The metadata. |
{"title":"test"} |
|
| FileName |
string |
The file name. |
my_doc.txt |
| LoaderMetadata |
string |
The metadata from when the document loader loaded the document. |
{"page_pos": 2} |
| EmbeddingTokens |
string |
The number of tokens used for vectorization. Note: A token is the smallest unit into which the input text is divided. A token can be a word, a phrase, a punctuation mark, a character, and so on. |
100 |
| Usage |
object |
The resource usage for this query. |
|
| EmbeddingTokens |
string |
The number of tokens used for vectorization. Note: A token is the smallest unit into which the input text is divided. A token can be a word, a phrase, a punctuation mark, a character, and so on. |
100 |
| EmbeddingEntries |
string |
The number of entries used for vectorization. Note: An entry is one processing operation during the vectorization of text or images. For example, processing text once counts as one entry, while processing an image once counts as two entries. |
10 |
| Entities |
object |
||
| entities |
array<object> |
The list of entities. |
|
|
object |
The entity details. |
||
| Id |
string |
The entity ID. |
1 |
| Entity |
string |
The entity name. |
Dr. Wang |
| Type |
string |
The entity type. |
Person |
| Description |
string |
The entity description. |
A former advisor at DeepMind. |
| FileName |
string |
The file name. |
my_doc.txt |
| Relations |
object |
||
| relations |
array<object> |
The list of relationship edges. |
|
|
object |
The relationship edge details. |
||
| Id |
string |
The relationship edge ID. |
1 |
| SourceEntity |
string |
The source entity. |
Former DeepMind advisor |
| TargetEntity |
string |
The target entity. |
Dr. Wang |
| Description |
string |
The relationship edge description. |
Dr. Wang previously served as an advisor at DeepMind. |
| FileName |
string |
The file name. |
my_doc.txt |
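To make the per-match fields above easier to work with, a client might map the numeric `RetrievalSource` code to a label and decode the JSON string in `LoaderMetadata`. The following is a minimal sketch that operates on a plain dictionary shaped like one `MatchList` item; `summarize_match` and `RETRIEVAL_SOURCES` are hypothetical names introduced here, and a real SDK may return typed objects instead.

```python
import json
from typing import Any, Dict

# Labels for the documented RetrievalSource codes.
RETRIEVAL_SOURCES = {
    1: "vector retrieval",
    2: "full-text index",
    3: "dual-channel recall",
}


def summarize_match(match: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: flatten one MatchList item into a compact summary."""
    loader_raw = match.get("LoaderMetadata")
    return {
        "id": match.get("Id"),
        "file": match.get("FileName"),
        "source": RETRIEVAL_SOURCES.get(match.get("RetrievalSource"), "unknown"),
        "score": match.get("Score"),
        # RerankScore is only meaningful when reranking was requested.
        "rerank_score": match.get("RerankScore"),
        # LoaderMetadata is documented as a string, so decode it when present.
        "loader_metadata": json.loads(loader_raw) if loader_raw else {},
    }


# Values taken from the Example column of the table above.
match = {
    "Id": "doca-1234",
    "FileName": "my_doc.txt",
    "Score": 0.12345,
    "RetrievalSource": 1,
    "LoaderMetadata": "{\"page_pos\": 1}",
    "RerankScore": 6.2345,
}
summary = summarize_match(match)
print(summary)
```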
Examples
Success response
JSON format
{
"RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
"Message": "success",
"Status": "success",
"Matches": {
"MatchList": [
{
"Id": "doca-1234",
"Content": "The cloud-native data warehouse AnalyticDB for PostgreSQL provides a simple, fast, and cost-effective petabyte-scale cloud data warehouse solution.",
"Metadata": {
"key": "{\"title\":\"test\"}"
},
"Vector": {
"VectorList": [
0
]
},
"FileName": "my_doc.txt",
"Score": 0.12345,
"RetrievalSource": 1,
"LoaderMetadata": "{\"page_pos\": 1}",
"FileURL": "https://xxx-cn-beijing.aliyuncs.com/image/test.png",
"RerankScore": 6.2345
}
]
},
"WindowMatches": {
"windowMatches": [
{
"WindowMatch": {
"windowMatch": [
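{
"Id": "doca-2345",
"Content": "The cloud-native data warehouse AnalyticDB for PostgreSQL is a massively parallel processing (MPP) data warehouse service that provides online analysis of massive amounts of data.",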
"Metadata": {
"key": "{\"title\":\"test\"}"
},
"FileName": "my_doc.txt",
"LoaderMetadata": "{\"page_pos\": 2}"
}
]
}
}
]
},
"EmbeddingTokens": "100",
"Usage": {
"EmbeddingTokens": "100",
"EmbeddingEntries": "10"
},
"Entities": {
"entities": [
{
"Id": "1",
"Entity": "Dr. Wang",
"Type": "Person",
"Description": "A former advisor at DeepMind.",
"FileName": "my_doc.txt"
}
]
},
"Relations": {
"relations": [
{
"Id": "1",
"SourceEntity": "Former DeepMind advisor",
"TargetEntity": "Dr. Wang",
"Description": "Dr. Wang previously served as an advisor at DeepMind.",
"FileName": "my_doc.txt"
}
]
}
}
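The nesting in the sample response is fairly deep, so a short sketch of how a client might walk it can help. The helpers below are hypothetical and assume the response has already been decoded into a Python dictionary with the same key names as the sample; the embedded response is a trimmed stand-in with placeholder text, not real service output.

```python
from typing import Any, Dict, List


def extract_contents(response: Dict[str, Any]) -> List[str]:
    """Return the Content of every item under Matches.MatchList."""
    matches = response.get("Matches", {}).get("MatchList", [])
    return [item["Content"] for item in matches if "Content" in item]


def extract_window_contents(response: Dict[str, Any]) -> List[List[str]]:
    """Return one list of context chunks per entry in WindowMatches.windowMatches."""
    windows = response.get("WindowMatches", {}).get("windowMatches", [])
    result: List[List[str]] = []
    for window in windows:
        items = window.get("WindowMatch", {}).get("windowMatch", [])
        result.append([item.get("Content", "") for item in items])
    return result


def usage_as_ints(response: Dict[str, Any]) -> Dict[str, int]:
    """Usage counters are documented as strings; convert them to integers."""
    return {key: int(value) for key, value in response.get("Usage", {}).items()}


# Trimmed stand-in for the sample response (placeholder Content values).
response = {
    "Status": "success",
    "Matches": {"MatchList": [{"Id": "doca-1234", "Content": "matched chunk"}]},
    "WindowMatches": {
        "windowMatches": [
            {"WindowMatch": {"windowMatch": [{"Id": "doca-2345", "Content": "neighbor chunk"}]}}
        ]
    },
    "Usage": {"EmbeddingTokens": "100", "EmbeddingEntries": "10"},
}

print(extract_contents(response))
print(extract_window_contents(response))
print(usage_as_ints(response))
```

Using `.get` with empty defaults keeps the helpers tolerant of responses where `WindowMatches` or `Usage` is absent, for example when `RecallWindow` was not set in the request.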
Error codes
See Error Codes for a complete list.
Release notes
See Release Notes for a complete list.