Configure hybrid search and Rerank strategies for a knowledge base (vector, full-text search, and metadata filter) - Tablestore

Use the Retrieve API to run a semantic search against a knowledge base and return the chunks most relevant to a query. The API supports vector retrieval, full-text search, hybrid search, metadata filtering, and several Rerank strategies.

Retrieval configuration precedence

The retrieval configuration (retrievalConfiguration) of the Retrieve API determines the parameters that take effect based on the following priority:

Priority	Source	Description
1 (Highest)	Retrieve API parameters	The provided `retrievalConfiguration` is valid for the current request only.
2	Knowledge base-level configuration	Set when you create the knowledge base. You can modify it at any time using the UpdateKnowledgeBase API.
3 (Lowest)	System defaults	Hybrid search (vector + full-text search) using the WEIGHT strategy for weighted fusion (vector 0.7 : full-text search 0.3), returning 20 results.

The simplest retrieval request requires only knowledgeBaseName and retrievalQuery. The system automatically uses the knowledge base configuration or default values.
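
The precedence rule can be sketched as a simple fallback chain. This is a minimal illustration, not the service's implementation: it assumes the whole configuration object is taken from the highest-priority source that is set, with no field-by-field merging (the source does not say which applies). The function name `resolve_retrieval_configuration` and the sample values are hypothetical.

```python
def resolve_retrieval_configuration(request_config, kb_config, system_defaults):
    """Pick the effective configuration: request first, then knowledge base, then defaults.

    Assumption: the first configuration that is set wins as a whole object.
    """
    for candidate in (request_config, kb_config):
        if candidate is not None:
            return candidate
    return system_defaults


# Illustrative values only; searchType is the documented request parameter.
system_defaults = {"searchType": ["DENSE_VECTOR", "FULL_TEXT"]}
kb_config = {"searchType": ["FULL_TEXT"]}
request_config = {"searchType": ["DENSE_VECTOR"]}

print(resolve_retrieval_configuration(request_config, kb_config, system_defaults))
print(resolve_retrieval_configuration(None, kb_config, system_defaults))
print(resolve_retrieval_configuration(None, None, system_defaults))
```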

Search types

Type	Description	Use cases
`DENSE_VECTOR`	Vector retrieval based on semantic similarity.	Ideal for queries expressed in natural language where understanding semantic intent is crucial. For example, a query for "how to install" can match content about "deployment steps."
`FULL_TEXT`	Full-text search based on keyword matching.	Use for queries with exact keywords, proper nouns, or identifiers.

Vector retrieval and full-text search are complementary. Vector retrieval excels at understanding semantics, while full-text search is effective for exact matches. We recommend enabling both search types and using Rerank to fuse the results.

Rerank strategies

Rerank fuses and reorders the results from vector retrieval and full-text search to produce the final result set.

WEIGHT (weighted fusion)

This strategy applies weighted scores to the results from both vector retrieval and full-text search. It is ideal for scenarios requiring fine-grained control over the contribution of each search method. This is the recommended default strategy.

Configuration parameters:

Parameter	Type	Default	Description
`denseVectorSearchWeight`	double	0.7	Weighting ratio for vector retrieval.
`fullTextSearchWeight`	double	0.3	Weighting ratio for full-text search.
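
To make the effect of the two weights concrete, the following sketch fuses per-chunk scores locally. It assumes that both score sets are already normalized to a comparable range and that a chunk missing from one result list contributes 0 from that list. The service's actual scoring formula is not stated in this document, and the function name `weighted_fusion` is hypothetical.

```python
def weighted_fusion(vector_scores, full_text_scores,
                    dense_weight=0.7, full_text_weight=0.3):
    """Combine two {chunk_id: score} maps into one ranked list of (chunk_id, score)."""
    chunk_ids = set(vector_scores) | set(full_text_scores)
    fused = {
        chunk_id: dense_weight * vector_scores.get(chunk_id, 0.0)
        + full_text_weight * full_text_scores.get(chunk_id, 0.0)
        for chunk_id in chunk_ids
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


ranked = weighted_fusion(
    vector_scores={"a": 0.9, "b": 0.4},
    full_text_scores={"b": 1.0, "c": 0.5},
)
for chunk_id, score in ranked:
    print(f"{chunk_id}: {score:.2f}")
```

With the default 0.7 : 0.3 ratio, a chunk that scores highly on vector retrieval alone can still outrank a chunk found by both methods, which is the kind of trade-off the weights let you tune.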

RRF (Reciprocal Rank Fusion)

This strategy fuses results from multiple search methods by weighting them based on the reciprocal of their rank. It offers stable performance and low latency without an additional model call.

Configuration parameters:

Parameter	Type	Default	Description
`denseVectorSearchWeight`	double	1.0	Weight for vector retrieval.
`fullTextSearchWeight`	double	1.0	Weight for full-text search.
`k`	int	60	RRF algorithm parameter. Must be greater than 0.

Configuration example:

{
  "rerankingConfiguration": {
    "type": "RRF",
    "numberOfResults": 5,
    "rrfConfiguration": {
      "denseVectorSearchWeight": 0.6,
      "fullTextSearchWeight": 0.4,
      "k": 20
    }
  }
}
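
For intuition about how `k` and the two weights interact, here is a minimal sketch of the commonly used weighted RRF formula, in which each result list contributes weight / (k + rank) for a chunk. Whether the service applies exactly this formula is an assumption, and the function name `rrf_fuse` is hypothetical.

```python
def rrf_fuse(vector_ranking, full_text_ranking,
             dense_weight=1.0, full_text_weight=1.0, k=60):
    """Fuse two ranked lists of chunk IDs with weighted Reciprocal Rank Fusion."""
    if k <= 0:
        raise ValueError("k must be greater than 0")
    scores = {}
    weighted_lists = (
        (dense_weight, vector_ranking),
        (full_text_weight, full_text_ranking),
    )
    for weight, ranking in weighted_lists:
        # Ranks start at 1, so the top hit contributes weight / (k + 1).
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fused = rrf_fuse(
    vector_ranking=["a", "b", "c"],
    full_text_ranking=["b", "d"],
    dense_weight=0.6,
    full_text_weight=0.4,
    k=20,
)
for chunk_id, score in fused:
    print(f"{chunk_id}: {score:.4f}")
```

A smaller `k` makes top-ranked hits dominate the fused score, while a larger `k` flattens the differences between ranks.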

MODEL (Model Rerank)

This strategy calls a Rerank model, such as gte-rerank-v2, to reorder the candidate results. It provides the highest ranking quality but adds latency and computational cost.

Configuration parameters:

Parameter	Type	Default	Description
`provider`	string	`Bailian`	The Rerank model provider. Currently, only Model Studio (`Bailian`) is supported.
`model`	string	`gte-rerank-v2`	The name of the Rerank model.

Choosing a strategy

Scenario	Recommended strategy
General-purpose scenarios sensitive to latency.	RRF
Scenarios that require fine-grained control over the ratio between vector and full-text search results.	WEIGHT
Scenarios that demand the highest ranking quality and can tolerate additional latency.	MODEL

Metadata filter

During retrieval, use the filter parameter to filter results by metadata conditions and narrow the candidate pool. The filtering fields must be defined in metadata when the knowledge base is created.

Comparison operators

Operator	Symbol	Description	Applicable types
`equals`	=	Equals	All types
`notEquals`	≠	Does not equal	All types
`greaterThan`	>	Greater than	long, double, date
`greaterThanOrEquals`	≥	Greater than or equal to	long, double, date
`lessThan`	<	Less than	long, double, date
`lessThanOrEquals`	≤	Less than or equal to	long, double, date

Set and matching operators

Operator	Description	Applicable types
`in`	Value is in the specified set.	All types
`notIn`	Value is not in the specified set.	All types
`startsWith`	Matches a string prefix.	string
`stringContains`	Matches a substring.	string
`listContains`	Checks if a list field contains the specified element.	list

Logical combination operators

Operator	Description
`andAll`	All conditions must be met.
`orAll`	At least one of the conditions must be met.
`notAll`	Not all of the conditions are met.

You can nest andAll, orAll, and notAll to build complex combined filtering logic.

Filter examples

Query for documents where category is "technology" or "science", score is ≥ 60, and title starts with "Product":

{
  "filter": {
    "andAll": [
      {"in": {"key": "category", "value": ["technology", "science"]}},
      {"greaterThanOrEquals": {"key": "score", "value": 60}},
      {"startsWith": {"key": "title", "value": "Product"}}
    ]
  }
}

Use orAll to implement OR logic: query for documents where the status is "active" or the score is greater than 90.

{
  "filter": {
    "orAll": [
      {"equals": {"key": "status", "value": "active"}},
      {"greaterThan": {"key": "score", "value": 90}}
    ]
  }
}
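
When filters are built in code, small helper functions keep nested conditions readable. The sketch below builds plain dictionaries in the shape shown in the examples above. The helper names (`condition`, `and_all`, `or_all`) are hypothetical, and it assumes a nested logical operator is written as an ordinary element of the parent list.

```python
import json

COMPARISON_AND_MATCH_OPERATORS = {
    "equals", "notEquals", "greaterThan", "greaterThanOrEquals",
    "lessThan", "lessThanOrEquals", "in", "notIn",
    "startsWith", "stringContains", "listContains",
}


def condition(operator, key, value):
    """Build a single condition such as {"equals": {"key": ..., "value": ...}}."""
    if operator not in COMPARISON_AND_MATCH_OPERATORS:
        raise ValueError(f"unknown operator: {operator}")
    return {operator: {"key": key, "value": value}}


def and_all(*conditions):
    return {"andAll": list(conditions)}


def or_all(*conditions):
    return {"orAll": list(conditions)}


# status is "active" AND (score > 90 OR title starts with "Product")
nested_filter = and_all(
    condition("equals", "status", "active"),
    or_all(
        condition("greaterThan", "score", 90),
        condition("startsWith", "title", "Product"),
    ),
)
print(json.dumps({"filter": nested_filter}, indent=2))
```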

Retrieve API

Request parameters

Parameter	Type	Description
`knowledgeBaseName`	string	The name of the knowledge base. Required.
`subspace`	list<string>	A list of subspaces, up to 32. Required if subspaces are enabled.
`retrievalQuery`	object	The retrieval query. Required.
`retrievalQuery.type`	string	The query type. Required. Only `TEXT` is supported.
`retrievalQuery.text`	string	The query text. Required. Maximum 128 characters.
`retrievalConfiguration`	object	Retrieval configuration. If not provided, the knowledge base-level configuration or the system defaults are used.
`retrievalConfiguration.searchType`	list<string>	The search types to use: `DENSE_VECTOR`, `FULL_TEXT`, or both.
`retrievalConfiguration.denseVectorSearchConfiguration.numberOfResults`	int	Number of results to return from vector retrieval. Maximum: 100.
`retrievalConfiguration.fullTextSearchConfiguration.numberOfResults`	int	Number of results to return from full-text search. Maximum: 100.
`retrievalConfiguration.rerankingConfiguration`	object	Rerank configuration.
`retrievalConfiguration.filter`	object	The metadata filter conditions.

Response

Response fields

Field	Type	Description
`retrievalResults`	list<object>	A list of retrieval results, sorted by relevance score in descending order.
`retrievalResults[].docId`	string	The ID of the document to which the chunk belongs.
`retrievalResults[].chunkId`	int	The ID of the chunk.
`retrievalResults[].ossKey`	string	The OSS path of the source document.
`retrievalResults[].score`	float	The relevance score. A higher score indicates a better match.
`retrievalResults[].content`	string	The original content of the chunk.
`retrievalResults[].subspace`	string	The subspace to which the chunk belongs.
`retrievalResults[].metadata`	object	The document metadata.

Code examples

Minimal example

This example does not set retrieval parameters in the request. The system uses the knowledge base-level configuration or the system defaults:

# `client` is assumed to be an already initialized knowledge base client.
resp = client.retrieve({
    "knowledgeBaseName": "product_docs_kb",
    "retrievalQuery": {"type": "TEXT", "text": "What are the installation steps for the product?"}
})

for r in resp["data"]["retrievalResults"]:
    print(f"[{r['score']:.4f}] {r['content'][:80]}...")

Complete example

This example sets retrieval parameters in the request to override the knowledge base-level configuration:

resp = client.retrieve({
    "knowledgeBaseName": "product_docs_kb",
    "subspace": ["default"],
    "retrievalQuery": {"type": "TEXT", "text": "What are the installation steps for the product?"},
    "retrievalConfiguration": {
        "searchType": ["DENSE_VECTOR", "FULL_TEXT"],
        "denseVectorSearchConfiguration": {"numberOfResults": 10},
        "fullTextSearchConfiguration": {"numberOfResults": 10},
        "rerankingConfiguration": {
            "type": "RRF",
            "numberOfResults": 5,
            "rrfConfiguration": {
                "denseVectorSearchWeight": 0.6,
                "fullTextSearchWeight": 0.4,
                "k": 60
            }
        },
        "filter": {
            "andAll": [
                {"in": {"key": "category", "value": ["Product Documentation"]}}
            ]
        }
    }
})

for result in resp["data"]["retrievalResults"]:
    print(f"[score={result['score']:.4f}] {result['content'][:100]}...")
    print(f"  Source: {result['ossKey']}, chunkId: {result['chunkId']}")

Using model classes

Use the provided model classes to build the request for better code completion and type checking in your IDE:

from tablestore_agent_storage.models import (
    RetrieveRequest, RetrievalQuery, RetrievalQueryType
)

resp = client.retrieve(RetrieveRequest(
    knowledge_base_name="product_docs_kb",
    retrieval_query=RetrievalQuery(
        text="What are the installation steps for the product?",
        type=RetrievalQueryType.TEXT
    )
))

Sample response

{
  "code": "SUCCESS",
  "data": {
    "retrievalResults": [
      {
        "ossKey": "oss://example-bucket/docs/product_manual.pdf",
        "docId": "96fb386e-...",
        "chunkId": 3,
        "subspace": "default",
        "score": 0.85,
        "content": "Step 1: Download the installation package...",
        "metadata": {"author": "John Doe", "category": "Product Documentation"}
      }
    ]
  },
  "message": "succeed"
}
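
A response in this shape can be post-processed with ordinary dictionary access. The following sketch groups the returned chunks by source document and drops low-scoring ones. The function name `group_chunks_by_source` and the `min_score` threshold are hypothetical, and the inline `sample_response` simply repeats the sample above so the example runs without a client.

```python
from collections import defaultdict


def group_chunks_by_source(response, min_score=0.0):
    """Group retrieval results by ossKey, keeping results with score >= min_score."""
    grouped = defaultdict(list)
    for result in response["data"]["retrievalResults"]:
        if result["score"] >= min_score:
            grouped[result["ossKey"]].append(result)
    return dict(grouped)


sample_response = {
    "code": "SUCCESS",
    "data": {
        "retrievalResults": [
            {
                "ossKey": "oss://example-bucket/docs/product_manual.pdf",
                "docId": "96fb386e-...",
                "chunkId": 3,
                "subspace": "default",
                "score": 0.85,
                "content": "Step 1: Download the installation package...",
                "metadata": {"author": "John Doe", "category": "Product Documentation"},
            }
        ]
    },
    "message": "succeed",
}

for oss_key, chunks in group_chunks_by_source(sample_response, min_score=0.5).items():
    print(oss_key, [chunk["chunkId"] for chunk in chunks])
```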

Usage notes

Issue	Description
Query text is too long	The `text` parameter has a maximum length of 128 characters. Longer inputs return an error. Note: For specific business needs, contact technical support by submitting a ticket or joining the Tablestore technology exchange group (ID: 36165029092).
Filter field is not defined	Filter fields must be defined as metadata when the knowledge base is created.
The `k` value for RRF is 0	`k` must be greater than 0. Otherwise, a `VALIDATION_ERROR` is returned.
Document indexing is not complete	Documents in `Pending` or `Indexing` status cannot be retrieved.
`subspace` parameter is missing	If subspaces are enabled for the knowledge base, this parameter is required.
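
Several of these issues can be caught before the request is sent. The sketch below checks a request dictionary against the limits documented on this page (128-character query text, up to 32 subspaces, at most 100 results per search type, `k` greater than 0). The function name `validate_retrieve_request` and the `subspaces_enabled` flag are hypothetical, and `len()` counts Unicode code points, which may differ from how the service counts characters.

```python
def validate_retrieve_request(request, subspaces_enabled=False):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    if not request.get("knowledgeBaseName"):
        problems.append("knowledgeBaseName is required")

    query = request.get("retrievalQuery") or {}
    if query.get("type") != "TEXT":
        problems.append("retrievalQuery.type must be TEXT")
    text = query.get("text") or ""
    if not text:
        problems.append("retrievalQuery.text is required")
    elif len(text) > 128:
        problems.append("retrievalQuery.text exceeds 128 characters")

    subspace = request.get("subspace") or []
    if subspaces_enabled and not subspace:
        problems.append("subspace is required when subspaces are enabled")
    if len(subspace) > 32:
        problems.append("subspace accepts at most 32 entries")

    config = request.get("retrievalConfiguration") or {}
    for name in ("denseVectorSearchConfiguration", "fullTextSearchConfiguration"):
        count = (config.get(name) or {}).get("numberOfResults")
        if count is not None and count > 100:
            problems.append(f"{name}.numberOfResults exceeds 100")

    rerank = config.get("rerankingConfiguration") or {}
    k = (rerank.get("rrfConfiguration") or {}).get("k")
    if k is not None and k <= 0:
        problems.append("rrfConfiguration.k must be greater than 0")
    return problems


request = {
    "knowledgeBaseName": "product_docs_kb",
    "retrievalQuery": {"type": "TEXT", "text": "x" * 200},
    "retrievalConfiguration": {
        "rerankingConfiguration": {"type": "RRF", "rrfConfiguration": {"k": 0}}
    },
}
for problem in validate_retrieve_request(request, subspaces_enabled=True):
    print(problem)
```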

Performance tuning

numberOfResults parameter relationship

The retrieval process uses three separate numberOfResults settings. Understanding how they relate is the key to tuning:

denseVectorSearchConfiguration.numberOfResults = N1  // Number of candidates recalled by vector retrieval
fullTextSearchConfiguration.numberOfResults    = N2  // Number of candidates recalled by full-text search
rerankingConfiguration.numberOfResults         = N3  // Number of final results after Rerank

N1 and N2 determine the size of the candidate pool. Larger values improve recall but also increase computational cost.
N3 determines the final number of results to return. Typically, N3 should be less than both N1 and N2.
Recommended starting configuration: Set N1 and N2 to 20, and set N3 to a value between 5 and 10. Adjust these values gradually based on your results.
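
The relationship between N1, N2, and N3 can be captured in a small helper that builds the `retrievalConfiguration` object and flags unusual combinations. This is a sketch under stated assumptions: the function name `build_retrieval_configuration` is hypothetical, the lower bound of 1 is an assumption (only the maximum of 100 is documented), and RRF with its documented default parameters is used purely as an example strategy.

```python
def build_retrieval_configuration(n1=20, n2=20, n3=10):
    """Return (configuration, notes) for hybrid search with the given result counts."""
    for name, value in (("N1", n1), ("N2", n2)):
        # Documented maximum is 100; the minimum of 1 is an assumption.
        if not 1 <= value <= 100:
            raise ValueError(f"{name} must be between 1 and 100, got {value}")

    notes = []
    if n3 >= min(n1, n2):
        notes.append("N3 is typically smaller than both N1 and N2")

    configuration = {
        "searchType": ["DENSE_VECTOR", "FULL_TEXT"],
        "denseVectorSearchConfiguration": {"numberOfResults": n1},
        "fullTextSearchConfiguration": {"numberOfResults": n2},
        "rerankingConfiguration": {
            "type": "RRF",
            "numberOfResults": n3,
            "rrfConfiguration": {
                "denseVectorSearchWeight": 1.0,
                "fullTextSearchWeight": 1.0,
                "k": 60,
            },
        },
    }
    return configuration, notes


configuration, notes = build_retrieval_configuration(n1=20, n2=20, n3=5)
print(configuration["rerankingConfiguration"]["numberOfResults"], notes)
```

The returned dictionary can be passed as the `retrievalConfiguration` value in a Retrieve request such as the complete example on this page.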

Choosing a search type

Scenario	Recommended search type	Description
Asking questions in natural language	`DENSE_VECTOR` + `FULL_TEXT`	Vector retrieval captures semantic meaning, while full-text search ensures keyword hits.
Searching with exact keywords or identifiers	Prioritize `FULL_TEXT`	Vector retrieval is less sensitive to exact term matches, so full-text search is more reliable here.
Purely semantic understanding scenarios	Prioritize `DENSE_VECTOR`	For example, matching "how to install" with "deployment steps."

Using the metadata filter

A filter prunes the candidate pool before the search phase, improving both accuracy and performance.
Filter fields must be defined as metadata when the knowledge base is created and cannot be added at runtime.
For date range filtering, define the field as the date type rather than the string type, because the range comparison operators apply only to long, double, and date fields.