Use the Retrieve API to perform a semantic search in a knowledge base and return the most relevant chunks for a query. The API supports vector retrieval, full-text search, hybrid search, metadata filter, and various Rerank strategies.
Retrieval configuration precedence
The retrieval configuration (retrievalConfiguration) of the Retrieve API determines the parameters that take effect based on the following priority:
Priority | Source | Description |
1 (Highest) | Retrieve API parameters | The retrievalConfiguration provided in the request. |
2 | Knowledge base-level configuration | Set when you create the knowledge base. You can modify it at any time using the UpdateKnowledgeBase API. |
3 (Lowest) | System defaults | Hybrid search (vector + full-text search) using the WEIGHT strategy for weighted fusion (vector 0.7 : full-text search 0.3), returning 20 results. |
The simplest retrieval request requires only knowledgeBaseName and retrievalQuery. The system automatically uses the knowledge base configuration or default values.
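The precedence rule above can be sketched in a few lines of Python. This is only an illustration: the helper name `resolve_retrieval_configuration` and the layout of `SYSTEM_DEFAULTS` are hypothetical, and the sketch assumes the higher-priority configuration replaces the lower one as a whole. Whether the service merges configurations field by field is not stated here.

```python
from typing import Any, Dict, Optional

# Hypothetical representation of the system defaults described in the table:
# hybrid search, WEIGHT fusion, 20 results. The dict layout is an assumption.
SYSTEM_DEFAULTS: Dict[str, Any] = {
    "searchType": ["DENSE_VECTOR", "FULL_TEXT"],
    "rerankingConfiguration": {"type": "WEIGHT", "numberOfResults": 20},
}


def resolve_retrieval_configuration(
    request_config: Optional[Dict[str, Any]],
    kb_config: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Return the first configuration that is present, highest priority first."""
    for candidate in (request_config, kb_config):
        if candidate:
            return candidate
    return SYSTEM_DEFAULTS


# A request without retrievalConfiguration falls back to the knowledge base level.
print(resolve_retrieval_configuration(None, {"searchType": ["FULL_TEXT"]}))
```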
Search types
Type | Description | Use cases |
DENSE_VECTOR | Vector retrieval based on semantic similarity. | Ideal for queries expressed in natural language where understanding semantic intent is crucial. For example, a query for "how to install" can match content about "deployment steps." |
FULL_TEXT | Full-text search based on keyword matching. | Use for queries with exact keywords, proper nouns, or identifiers. |
Vector retrieval and full-text search are complementary. Vector retrieval excels at understanding semantics, while full-text search is effective for exact matches. We recommend enabling both search types and using Rerank to fuse the results.
Rerank strategies
Rerank fuses and reorders the results from vector retrieval and full-text search to produce the final result set.
WEIGHT (weighted fusion)
This strategy applies weighted scores to the results from both vector retrieval and full-text search. It is ideal for scenarios requiring fine-grained control over the contribution of each search method. This is the recommended default strategy.
Configuration parameters:
Parameter | Type | Default | Description |
| double | 0.7 | Weighting ratio for vector retrieval. |
| double | 0.3 | Weighting ratio for full-text search. |
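To make the idea of weighted fusion concrete, here is a minimal local sketch. It is not the service's implementation: the function name `weighted_fusion` is hypothetical, and it assumes that both searches return scores already normalized to a comparable range and that a chunk missing from one result list contributes 0 for that list.

```python
from typing import Dict, List, Tuple


def weighted_fusion(
    vector_scores: Dict[str, float],
    full_text_scores: Dict[str, float],
    vector_weight: float = 0.7,      # default ratio from the table above
    full_text_weight: float = 0.3,   # default ratio from the table above
) -> List[Tuple[str, float]]:
    """Combine per-chunk scores from both searches into one ranked list."""
    chunk_ids = set(vector_scores) | set(full_text_scores)
    fused = {
        cid: vector_weight * vector_scores.get(cid, 0.0)
        + full_text_weight * full_text_scores.get(cid, 0.0)
        for cid in chunk_ids
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


ranked = weighted_fusion({"a": 1.0, "b": 0.5}, {"b": 1.0, "c": 1.0})
print([cid for cid, _ in ranked])
```

Raising the vector weight favors semantically similar chunks, while raising the full-text weight favors chunks with exact keyword hits.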
RRF (Reciprocal Rank Fusion)
This strategy fuses results from multiple search methods by weighting them based on the reciprocal of their rank. It offers stable performance and low latency without an additional model call.
Configuration parameters:
Parameter | Type | Default | Description |
denseVectorSearchWeight | double | 1.0 | Weight for vector retrieval. |
fullTextSearchWeight | double | 1.0 | Weight for full-text search. |
k | int | 60 | RRF algorithm parameter. Must be greater than 0. |
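The sketch below illustrates a weighted form of Reciprocal Rank Fusion, where each chunk receives `weight / (k + rank)` from every result list it appears in, with ranks starting at 1. The function name `rrf_fuse` is hypothetical, and the way the weights enter the formula is an assumption; the service's exact formula is not given in this document.

```python
from typing import Dict, List


def rrf_fuse(
    vector_ranking: List[str],
    full_text_ranking: List[str],
    dense_weight: float = 1.0,
    full_text_weight: float = 1.0,
    k: int = 60,
) -> List[str]:
    """Fuse two ranked lists of chunk IDs using weighted reciprocal ranks."""
    if k <= 0:
        raise ValueError("k must be greater than 0")
    scores: Dict[str, float] = {}
    for weight, ranking in (
        (dense_weight, vector_ranking),
        (full_text_weight, full_text_ranking),
    ):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda chunk_id: scores[chunk_id], reverse=True)


print(rrf_fuse(["a", "b", "c"], ["b", "d"]))
```

Because only ranks are used, no score normalization between the two searches is needed. A smaller `k` gives top-ranked items a larger advantage.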
Configuration example:
{
  "rerankingConfiguration": {
    "type": "RRF",
    "numberOfResults": 5,
    "rrfConfiguration": {
      "denseVectorSearchWeight": 0.6,
      "fullTextSearchWeight": 0.4,
      "k": 20
    }
  }
}
MODEL (Model Rerank)
This strategy calls a Rerank model, such as gte-rerank-v2, to re-rank the candidate results. It provides the highest ranking quality but introduces additional latency and computational cost.
Configuration parameters:
Parameter | Type | Default | Description |
| string | | Rerank model provider. Currently, only Model Studio is supported. |
| string | | The name of the Rerank model. |
Choosing a strategy
Scenario | Recommended strategy |
General-purpose scenarios sensitive to latency. | RRF |
Scenarios that require fine-grained control over the ratio between vector and full-text search results. | WEIGHT |
Scenarios that demand the highest ranking quality and can tolerate additional latency. | MODEL |
Metadata filter
During retrieval, use the filter parameter to filter results by metadata conditions and narrow the candidate pool. The filtering fields must be defined in metadata when the knowledge base is created.
Comparison operators
Operator | Symbol | Description | Applicable types |
equals | = | Equals | All types |
| ≠ | Does not equal | All types |
greaterThan | > | Greater than | long, double, date |
greaterThanOrEquals | ≥ | Greater than or equal to | long, double, date |
| < | Less than | long, double, date |
| ≤ | Less than or equal to | long, double, date |
Set and matching operators
Operator | Description | Applicable types |
in | Value is in the specified set. | All types |
| Value is not in the specified set. | All types |
startsWith | Matches a string prefix. | string |
| Matches a substring. | string |
| Checks if a list field contains the specified element. | list |
Logical combination operators
Operator | Description |
andAll | All conditions must be met. |
orAll | At least one of the conditions must be met. |
notAll | Not all of the conditions are met. |
You can nest andAll, orAll, and notAll to build complex combined filtering logic.
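To show how nested conditions combine, the following sketch evaluates a filter expression against a metadata dictionary locally. It is an illustration only, not how the service evaluates filters. The `matches` helper is hypothetical, it covers only the operators that appear in the examples of this document, and it assumes that a condition on a missing field does not match.

```python
import operator
from typing import Any, Dict

# Leaf operators used in this document's filter examples.
LEAF_OPERATORS = {
    "equals": operator.eq,
    "greaterThan": operator.gt,
    "greaterThanOrEquals": operator.ge,
    "in": lambda field, allowed: field in allowed,
    "startsWith": lambda field, prefix: isinstance(field, str) and field.startswith(prefix),
}


def matches(filter_expr: Dict[str, Any], metadata: Dict[str, Any]) -> bool:
    """Evaluate a single-operator filter expression, recursing into nested logic."""
    op, operand = next(iter(filter_expr.items()))
    if op == "andAll":
        return all(matches(sub, metadata) for sub in operand)
    if op == "orAll":
        return any(matches(sub, metadata) for sub in operand)
    if op == "notAll":
        return not all(matches(sub, metadata) for sub in operand)
    if op not in LEAF_OPERATORS:
        raise ValueError(f"operator not covered by this sketch: {op}")
    if operand["key"] not in metadata:
        return False  # assumption: a missing field never matches
    return LEAF_OPERATORS[op](metadata[operand["key"]], operand["value"])


# category is technology or science, AND (status is active OR score > 90)
nested_filter = {
    "andAll": [
        {"in": {"key": "category", "value": ["technology", "science"]}},
        {"orAll": [
            {"equals": {"key": "status", "value": "active"}},
            {"greaterThan": {"key": "score", "value": 90}},
        ]},
    ]
}
print(matches(nested_filter, {"category": "technology", "status": "archived", "score": 95}))
```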
Filter examples
Query for documents where category is "technology" or "science", score is ≥ 60, and title starts with "Product":
{
  "filter": {
    "andAll": [
      {"in": {"key": "category", "value": ["technology", "science"]}},
      {"greaterThanOrEquals": {"key": "score", "value": 60}},
      {"startsWith": {"key": "title", "value": "Product"}}
    ]
  }
}
Use orAll to implement OR logic. The following filter matches documents where status is "active" or score is greater than 90:
{
  "filter": {
    "orAll": [
      {"equals": {"key": "status", "value": "active"}},
      {"greaterThan": {"key": "score", "value": 90}}
    ]
  }
}
Retrieve API
Request parameters
Parameter | Type | Description |
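knowledgeBaseName | string | The name of the knowledge base. Required. |
subspace | list<string> | A list of subspaces, up to 32. Required if subspaces are enabled. |
retrievalQuery | object | The retrieval query. Required. |
retrievalQuery.type | string | Query type (Required). Only |
retrievalQuery.text | string | The query text. Required. Maximum 128 characters. |
retrievalConfiguration | object | Retrieval configuration. If not provided, the knowledge base-level configuration or the system defaults are used. |
retrievalConfiguration.searchType | list<string> | The search type. |
retrievalConfiguration.denseVectorSearchConfiguration.numberOfResults | int | Number of results to return from vector retrieval. Maximum: 100. |
retrievalConfiguration.fullTextSearchConfiguration.numberOfResults | int | Number of results to return from full-text search. Maximum: 100. |
retrievalConfiguration.rerankingConfiguration | object | Rerank configuration. |
retrievalConfiguration.filter | object | Metadata filter conditions. |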
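The limits listed in the table above can be checked on the client side before a request is sent. The following is a minimal sketch, assuming the request is passed as a plain dictionary as in the code examples of this document. The `validate_retrieve_request` helper is hypothetical and is not part of any SDK.

```python
from typing import Any, Dict, List

MAX_QUERY_CHARS = 128      # maximum length of the query text
MAX_SUBSPACES = 32         # maximum number of subspaces
MAX_RECALL_RESULTS = 100   # maximum numberOfResults per search type


def validate_retrieve_request(request: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems: List[str] = []
    if not request.get("knowledgeBaseName"):
        problems.append("knowledgeBaseName is required")

    text = (request.get("retrievalQuery") or {}).get("text", "")
    if not text:
        problems.append("retrievalQuery.text is required")
    elif len(text) > MAX_QUERY_CHARS:
        problems.append(f"query text has {len(text)} characters, maximum is {MAX_QUERY_CHARS}")

    if len(request.get("subspace") or []) > MAX_SUBSPACES:
        problems.append(f"at most {MAX_SUBSPACES} subspaces are allowed")

    config = request.get("retrievalConfiguration") or {}
    for section in ("denseVectorSearchConfiguration", "fullTextSearchConfiguration"):
        count = (config.get(section) or {}).get("numberOfResults")
        if count is not None and count > MAX_RECALL_RESULTS:
            problems.append(f"{section}.numberOfResults exceeds {MAX_RECALL_RESULTS}")
    return problems


print(validate_retrieve_request({
    "knowledgeBaseName": "product_docs_kb",
    "retrievalQuery": {"type": "TEXT", "text": "x" * 200},
}))
```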
Response
Response fields
Field | Type | Description |
retrievalResults | list<object> | A list of retrieval results, sorted by relevance score in descending order. |
docId | string | The ID of the document to which the chunk belongs. |
chunkId | int | The ID of the chunk. |
ossKey | string | The OSS path of the source document. |
score | float | The relevance score. A higher score indicates a better match. |
content | string | The original content of the chunk. |
subspace | string | The subspace to which the chunk belongs. |
metadata | object | The document metadata. |
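Because several chunks of the same document can appear in one result list, a common post-processing step is to keep only the best chunk per document. The sketch below assumes the response is a dictionary shaped like the sample response in this document (`data.retrievalResults` with `docId` and `score`). The helper name `best_chunk_per_document` and the sample values are made up for illustration.

```python
from typing import Any, Dict, List


def best_chunk_per_document(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep the highest-scoring chunk for each docId, sorted by score descending."""
    best: Dict[str, Dict[str, Any]] = {}
    for result in response["data"]["retrievalResults"]:
        current = best.get(result["docId"])
        if current is None or result["score"] > current["score"]:
            best[result["docId"]] = result
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)


# Illustrative data only; the IDs and scores are invented for this example.
sample = {
    "data": {
        "retrievalResults": [
            {"docId": "doc-1", "chunkId": 3, "score": 0.85, "content": "Step 1..."},
            {"docId": "doc-1", "chunkId": 4, "score": 0.80, "content": "Step 2..."},
            {"docId": "doc-2", "chunkId": 1, "score": 0.70, "content": "Overview..."},
        ]
    }
}
for r in best_chunk_per_document(sample):
    print(r["docId"], r["chunkId"], r["score"])
```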
Code examples
Minimal example
This example does not set retrieval parameters in the request. The system uses the knowledge base-level configuration or the system defaults:
resp = client.retrieve({
    "knowledgeBaseName": "product_docs_kb",
    "retrievalQuery": {"type": "TEXT", "text": "What are the installation steps for the product?"}
})
# Print the score and the first 80 characters of each returned chunk.
for r in resp["data"]["retrievalResults"]:
    print(f"[{r['score']:.4f}] {r['content'][:80]}...")
Complete example
This example sets retrieval parameters in the request to override the knowledge base-level configuration:
resp = client.retrieve({
    "knowledgeBaseName": "product_docs_kb",
    "subspace": ["default"],
    "retrievalQuery": {"type": "TEXT", "text": "What are the installation steps for the product?"},
    "retrievalConfiguration": {
        "searchType": ["DENSE_VECTOR", "FULL_TEXT"],
        "denseVectorSearchConfiguration": {"numberOfResults": 10},
        "fullTextSearchConfiguration": {"numberOfResults": 10},
        "rerankingConfiguration": {
            "type": "RRF",
            "numberOfResults": 5,
            "rrfConfiguration": {
                "denseVectorSearchWeight": 0.6,
                "fullTextSearchWeight": 0.4,
                "k": 60
            }
        },
        "filter": {
            "andAll": [
                {"in": {"key": "category", "value": ["Product Documentation"]}}
            ]
        }
    }
})
# Print the score, a content preview, and the source of each returned chunk.
for result in resp["data"]["retrievalResults"]:
    print(f"[score={result['score']:.4f}] {result['content'][:100]}...")
    print(f" Source: {result['ossKey']}, chunkId: {result['chunkId']}")
Using model classes
Use the provided model classes to build the request for better code completion and type checking in your IDE:
from tablestore_agent_storage.models import (
    RetrieveRequest, RetrievalQuery, RetrievalQueryType
)
resp = client.retrieve(RetrieveRequest(
    knowledge_base_name="product_docs_kb",
    retrieval_query=RetrievalQuery(
        text="What are the installation steps for the product?",
        type=RetrievalQueryType.TEXT
    )
))
Sample response
{
  "code": "SUCCESS",
  "data": {
    "retrievalResults": [
      {
        "ossKey": "oss://example-bucket/docs/product_manual.pdf",
        "docId": "96fb386e-...",
        "chunkId": 3,
        "subspace": "default",
        "score": 0.85,
        "content": "Step 1: Download the installation package...",
        "metadata": {"author": "John Doe", "category": "Product Documentation"}
      }
    ]
  },
  "message": "succeed"
}
Usage notes
Issue | Description |
Query text is too long | The query text cannot exceed 128 characters. Note: For specific business needs, contact technical support by submitting a ticket or joining the Tablestore technology exchange group (ID: 36165029092). |
Filter field is not defined | Filter fields must be defined as metadata when the knowledge base is created. |
The |
|
Document indexing is not complete | Documents in |
| If subspaces are enabled for the knowledge base, this parameter is required. |
Performance tuning
numberOfResults parameter relationship
The retrieval process has three layers of numberOfResults. Understanding the relationship between them is the key to tuning:
denseVectorSearchConfiguration.numberOfResults = N1 // Number of candidates recalled by vector retrieval
fullTextSearchConfiguration.numberOfResults = N2 // Number of candidates recalled by full-text search
rerankingConfiguration.numberOfResults = N3 // Number of final results after Rerank
N1 and N2 determine the size of the candidate pool. Larger values improve recall but also increase computational cost.
N3 determines the final number of results to return. Typically, N3 should be less than both N1 and N2.
Recommended starting configuration: Set N1 and N2 to 20, and set N3 to a value between 5 and 10. Adjust these values gradually based on your results.
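The relationship between N1, N2, and N3 can be encoded in a small builder so that a misconfiguration is caught early. This is a sketch under stated assumptions: the `build_retrieval_configuration` helper is hypothetical, the lower bound of 1 is assumed, and the choice of RRF with its documented default weights is only for illustration.

```python
import warnings
from typing import Any, Dict

MAX_RECALL_RESULTS = 100  # per-search maximum from the request parameter table


def build_retrieval_configuration(n1: int = 20, n2: int = 20, n3: int = 10) -> Dict[str, Any]:
    """Build a retrievalConfiguration dict from the three numberOfResults values."""
    for name, value in (("N1", n1), ("N2", n2)):
        if not 1 <= value <= MAX_RECALL_RESULTS:
            raise ValueError(f"{name} must be between 1 and {MAX_RECALL_RESULTS}, got {value}")
    if n3 < 1:
        raise ValueError(f"N3 must be at least 1, got {n3}")
    if n3 >= min(n1, n2):
        # Not an error, but it goes against the tuning guidance above.
        warnings.warn("N3 is typically smaller than both N1 and N2")
    return {
        "searchType": ["DENSE_VECTOR", "FULL_TEXT"],
        "denseVectorSearchConfiguration": {"numberOfResults": n1},
        "fullTextSearchConfiguration": {"numberOfResults": n2},
        "rerankingConfiguration": {
            "type": "RRF",
            "numberOfResults": n3,
            "rrfConfiguration": {
                "denseVectorSearchWeight": 1.0,
                "fullTextSearchWeight": 1.0,
                "k": 60,
            },
        },
    }


config = build_retrieval_configuration(n1=20, n2=20, n3=5)
print(config["rerankingConfiguration"]["numberOfResults"])
```

The returned dictionary can be passed as the retrievalConfiguration value of a request like the complete example in this document.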
Choosing a search type
Scenario | Recommended search type | Description |
Asking questions in natural language | DENSE_VECTOR + FULL_TEXT | Vector retrieval captures semantic meaning, while full-text search ensures keyword hits. |
Searching with exact keywords or identifiers | Prioritize FULL_TEXT | Vector retrieval is not sensitive to exact matches. |
Purely semantic understanding scenarios | Prioritize DENSE_VECTOR | For example, matching "how to install" with "deployment steps." |
Using the metadata filter
A filter prunes the candidate pool before the search phase, improving both accuracy and performance.
Filter fields must be defined as metadata when the knowledge base is created and cannot be added at runtime.
For date range filtering, use the date type instead of the string type to support range comparison operators.