The Retrieve API returns text segments from a knowledge base based on semantic similarity. When semantic search returns too many irrelevant results, use SearchFilters to apply metadata-based filtering on top of semantic results. This is effective for structured data such as data tables with well-defined fields.
How SearchFilters works
Without SearchFilters, the Retrieve API returns all semantically similar text segments, which may include irrelevant matches. With SearchFilters, the API first performs the semantic search and then filters the results by the field conditions you specify.
Without SearchFilters:
{
  "indexId": "o73yjlxxxx",
  "query": "Employees in the company named San Zhang"
}
Semantic retrieval alone may also return text segments that are only loosely related to "San Zhang".
With SearchFilters:
{
  "indexId": "o73yjlxxxx",
  "query": "Employees in the company named San Zhang",
  "searchFilters": [
    {
      "Name": "San Zhang"
    }
  ]
}
SearchFilters removes text segments that do not match Name: "San Zhang" from the semantic results.
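As a minimal sketch, the two request bodies above could be assembled in Python as follows. The helper name `build_retrieve_body` is hypothetical; it only builds the JSON payload and does not call the Retrieve API.

```python
import json
from typing import Any, Dict, List, Optional


def build_retrieve_body(
    index_id: str,
    query: str,
    search_filters: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Build a request body; searchFilters is added only when filters are given."""
    body: Dict[str, Any] = {"indexId": index_id, "query": query}
    if search_filters:
        body["searchFilters"] = search_filters
    return body


query = "Employees in the company named San Zhang"
unfiltered = build_retrieve_body("o73yjlxxxx", query)
filtered = build_retrieve_body("o73yjlxxxx", query, [{"Name": "San Zhang"}])

print(json.dumps(filtered, ensure_ascii=False, indent=2))
```

Keeping the filters optional lets the same helper produce both the plain semantic request and the filtered one.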
Syntax
SearchFilters is an array of subgroups. Each subgroup is a JSON object with one or more field: value pairs. Subgroups are always combined with AND logic, and conditions on different fields inside one subgroup are also combined with AND logic.
{
  "searchFilters": [
    {
      "Name": "San Zhang",
      "Sex": "Male"
    },
    {
      "Position": "Engineer"
    }
  ]
}
In this example, results must match Name = "San Zhang" AND Sex = "Male" (within subgroup 1) AND Position = "Engineer" (subgroup 2).
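The structure described above can be checked locally before sending a request. The following is a sketch only: `validate_search_filters` is a hypothetical helper that checks just the array-of-objects shape, and the service may enforce additional rules that are not covered here.

```python
from typing import Any, List


def validate_search_filters(search_filters: Any) -> List[str]:
    """Return a list of shape problems; an empty list means the shape looks valid."""
    if not isinstance(search_filters, list):
        return ["searchFilters must be an array of subgroups"]
    problems: List[str] = []
    for i, subgroup in enumerate(search_filters, start=1):
        if not isinstance(subgroup, dict):
            problems.append(f"subgroup {i} must be a JSON object")
        elif not subgroup:
            problems.append(f"subgroup {i} must contain at least one field")
        elif not all(isinstance(name, str) for name in subgroup):
            problems.append(f"subgroup {i} has a non-string field name")
    return problems


filters = [{"Name": "San Zhang", "Sex": "Male"}, {"Position": "Engineer"}]
print(validate_search_filters(filters))  # []
print(validate_search_filters([{}]))
```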
Supported query types
Operator reference
| Query type | Operator | Supported field types | Description |
|---|---|---|---|
| Single-value | *(direct value)* | string, numeric (long, double) | Match a single exact value. Example: {"Name": "San Zhang"} |
| Multi-value | *(JSON array serialized as a string)* | string array, numeric array | Match any value in the array (similar to SQL IN). Example: {"Name": "[\"San Zhang\",\"Si Li\"]"} |
| Equality | eq | string, numeric | Equal to. Example: {"Name": {"eq": "San Zhang"}} |
| Equality | neq | string, numeric | Not equal to. Example: {"Name": {"neq": "San Zhang"}} |
| Interval | gt | numeric | Greater than. Example: {"Age": {"gt": 20}} |
| Interval | gte | numeric | Greater than or equal to. Example: {"Age": {"gte": 20}} |
| Interval | lt | numeric | Less than. Example: {"Age": {"lt": 30}} |
| Interval | lte | numeric | Less than or equal to. Example: {"Age": {"lte": 30}} |
| Fuzzy | like | string | Wildcard pattern match. Example: {"Position": {"like": "E%r"}} |
| Tag | tags | string array | Filter by document tags (document search knowledge bases only). Example: {"tags": "[\"tag1\",\"tag2\"]"} |
Equality and interval queries are range query subtypes. Equality queries are case-insensitive and accept only single values per field.
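To make the operator semantics concrete, here is a small local evaluator. It is an illustration of the table above, not the service's implementation; the names `OPERATORS` and `satisfies` are hypothetical, and combining several operators in one condition with AND is inferred from the interval examples in this document.

```python
from typing import Any, Callable, Dict


def _same(a: Any, b: Any) -> bool:
    """Equality that ignores case for strings, as described for equality queries."""
    if isinstance(a, str) and isinstance(b, str):
        return a.casefold() == b.casefold()
    return a == b


OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": _same,
    "neq": lambda value, target: not _same(value, target),
    "gt": lambda value, target: value > target,
    "gte": lambda value, target: value >= target,
    "lt": lambda value, target: value < target,
    "lte": lambda value, target: value <= target,
}


def satisfies(value: Any, condition: Dict[str, Any]) -> bool:
    """True when the value passes every operator in the condition object."""
    return all(OPERATORS[op](value, target) for op, target in condition.items())


print(satisfies(24, {"gte": 20, "lte": 25}))         # True
print(satisfies("san zhang", {"eq": "San Zhang"}))   # True
print(satisfies("San Zhang", {"neq": "San Zhang"}))  # False
```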
Wildcard characters for fuzzy queries
| Character | Description | Example |
|---|---|---|
| % | Matches zero or more characters | E%r matches Engineer |
| _ | Matches exactly one character | E_gineer matches Engineer |
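The two wildcards can be emulated locally by translating a pattern into a regular expression. This sketch assumes case-sensitive matching against the whole field value and no escape character for literal % or _; neither assumption is confirmed by this document, and the helper names are hypothetical.

```python
import re


def like_to_regex(pattern: str) -> "re.Pattern":
    """Translate a pattern using % and _ into a regular expression."""
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")  # zero or more characters
        elif char == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.DOTALL)


def like(value: str, pattern: str) -> bool:
    """True when the whole value matches the wildcard pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


print(like("Engineer", "E%r"))       # True
print(like("Engineer", "E_gineer"))  # True
print(like("Engineering", "E%r"))    # False
```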
Tag query logic
- Multiple tags in one subgroup use OR logic: files matching any tag are returned.
- For AND logic between tags, place each tag in a separate subgroup.
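The two rules above map to two ways of building the filter array. The helpers `any_of_tags` and `all_of_tags` below are hypothetical names for a sketch that only constructs the `searchFilters` value, with each tag list serialized as a JSON string as in the operator table.

```python
import json
from typing import Dict, List


def any_of_tags(tags: List[str]) -> List[Dict[str, str]]:
    """One subgroup holding all tags: a file matches if it has any of them (OR)."""
    return [{"tags": json.dumps(tags, ensure_ascii=False, separators=(",", ":"))}]


def all_of_tags(tags: List[str]) -> List[Dict[str, str]]:
    """One subgroup per tag: a file matches only if it has every tag (AND)."""
    return [{"tags": json.dumps([tag], ensure_ascii=False)} for tag in tags]


print(any_of_tags(["tag1", "tag2"]))
print(all_of_tags(["tag1", "tag2"]))
```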
Prerequisites
You need:
- (RAM users only) The AliyunBailianDataFullAccess policy and workspace membership. Not required for an Alibaba Cloud account.
- The Model Studio SDK installed and configured
- AccessKey and AccessKey secret configured as environment variables
RAM users can manage knowledge bases only in workspaces they've joined. Alibaba Cloud accounts can manage all workspaces.
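A quick way to confirm the credential prerequisite is to check the environment before running any sample. The variable names below are an assumption (they follow a common Alibaba Cloud convention but are not stated in this document); substitute the names your SDK setup actually reads.

```python
import os
from typing import List, Mapping

# Assumed variable names; adjust them to match your own configuration.
REQUIRED_ENV_VARS = (
    "ALIBABA_CLOUD_ACCESS_KEY_ID",
    "ALIBABA_CLOUD_ACCESS_KEY_SECRET",
)


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Credentials found in the environment.")
```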
Sample data setup
Examples use EmployeeInformation.xlsx (3 records). Create a knowledge base with:
| Setting | Value |
|---|---|
| Knowledge base type | Data Query |
| Data source | Upload Data Table |
| Fields | Name (string), Sex (string), Position (string), Age (double) |
| Index settings | All fields used for retrieval and model responses |
Complete code examples
The following Python and Java examples cover all query types.
Before running the sample code, configure your AccessKey and AccessKey secret as environment variables.
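As a rough companion sketch, the snippet below builds one request body per query type using the filter shapes shown in this document. It does not call the SDK, because the client call and its signature are not shown here; `INDEX_ID`, `FILTERS_BY_QUERY_TYPE`, and `request_bodies` are hypothetical names.

```python
import json
from typing import Any, Dict, List

INDEX_ID = "<knowledge-base-id>"  # placeholder, replace with your index ID


def _as_json_string(values: List[str]) -> str:
    """Serialize a value list as a compact JSON string."""
    return json.dumps(values, ensure_ascii=False, separators=(",", ":"))


# One searchFilters value per query type, mirroring the JSON in this document.
FILTERS_BY_QUERY_TYPE: Dict[str, List[Dict[str, Any]]] = {
    "subgroup": [{"Name": "San Zhang"}, {"Sex": "Female"}],
    "single-value": [{"Name": "San Zhang"}],
    "multi-value": [{"Name": _as_json_string(["San Zhang", "Si Li"])}],
    "range": [{"Position": "Engineer"}, {"Age": {"gte": 20, "lte": 25}}],
    "fuzzy": [{"Name": "San Zhang"}, {"Position": {"like": "E%r"}}],
    "tag": [{"tags": _as_json_string(["University A", "Student Union President"])}],
}


def request_bodies(query: str) -> Dict[str, Dict[str, Any]]:
    """Build one request body per query type for the same query text."""
    return {
        name: {"indexId": INDEX_ID, "query": query, "searchFilters": filters}
        for name, filters in FILTERS_BY_QUERY_TYPE.items()
    }


for name, body in request_bodies("Employees in the company named San Zhang").items():
    print(f"--- {name} ---")
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

The tag entry applies only to document search knowledge bases, so in practice it would be sent to a different index than the data table examples.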
Query type examples
Each query type includes JSON syntax and sample request/response.
Replace these placeholders with your actual values:
| Placeholder | Description | Example |
|---|---|---|
| <knowledge-base-id> | Knowledge base index ID | 27ubwxxxxx |
| <workspace-id> | Workspace ID | llm-4u5xpd1xdjxxxxxx |
Subgroup query
Each element in the searchFilters array is a subgroup combining filter conditions. All subgroups must match (AND logic).
Filter for records where Name is "San Zhang" AND Sex is "Female":
{
  "searchFilters": [
    {
      "Name": "San Zhang"
    },
    {
      "Sex": "Female"
    }
  ]
}
No records in the sample data match both conditions, so nodes in the response is empty.
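The empty result can be reproduced with a local simulation of the AND logic. The records below are hypothetical stand-ins, since the contents of EmployeeInformation.xlsx are not listed in this document, and `matches` handles only exact single-value conditions.

```python
from typing import Any, Dict, List

# Hypothetical stand-ins for the three records in EmployeeInformation.xlsx.
RECORDS: List[Dict[str, Any]] = [
    {"Name": "San Zhang", "Sex": "Male", "Position": "Engineer", "Age": 24.0},
    {"Name": "Si Li", "Sex": "Female", "Position": "Designer", "Age": 28.0},
    {"Name": "Wu Wang", "Sex": "Male", "Position": "Manager", "Age": 35.0},
]


def matches(record: Dict[str, Any], search_filters: List[Dict[str, Any]]) -> bool:
    """A record passes only if every field in every subgroup equals its value."""
    return all(
        record.get(field) == value
        for subgroup in search_filters
        for field, value in subgroup.items()
    )


filters = [{"Name": "San Zhang"}, {"Sex": "Female"}]
nodes = [record for record in RECORDS if matches(record, filters)]
print(nodes)  # []
```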
Single-value query
Pass a single value for a field to match records with that exact value.
Filter for records where Name is "San Zhang":
{
  "searchFilters": [
    {
      "Name": "San Zhang"
    }
  ]
}
Multi-value query
Pass multiple values for a field to match records containing any of the specified values, similar to the SQL IN operator. Serialize the value array as a JSON string.
Filter for records where Name is "San Zhang" or "Si Li":
{
  "searchFilters": [
    {
      "Name": "[\"San Zhang\",\"Si Li\"]"
    }
  ]
}
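Writing the escaped string by hand is error-prone, so one option is to let `json.dumps` serialize the value list. This is a sketch; `multi_value` is a hypothetical helper name.

```python
import json
from typing import Dict, List, Union

Scalar = Union[str, int, float]


def multi_value(field: str, values: List[Scalar]) -> Dict[str, str]:
    """Build a subgroup whose value is the array serialized as a JSON string."""
    return {field: json.dumps(values, ensure_ascii=False, separators=(",", ":"))}


subgroup = multi_value("Name", ["San Zhang", "Si Li"])
print(subgroup["Name"])  # ["San Zhang","Si Li"]
print(json.dumps({"searchFilters": [subgroup]}))
```

When the whole request is serialized a second time, the inner quotes are escaped automatically, which yields the `\"` form shown in the JSON above.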
Range query
Use range queries to find records where a numeric field falls within a specified interval. Combine range queries with other query types by placing them in separate subgroups.
Filter for records where Position is "Engineer" (single-value) AND Age is between 20 and 25 (range):
{
  "searchFilters": [
    {
      "Position": "Engineer"
    },
    {
      "Age": {
        "gte": 20,
        "lte": 25
      }
    }
  ]
}
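A small builder can guard against accidentally inverted bounds. The sketch below produces the same structure as the JSON above; `between` is a hypothetical helper, and the bounds check is a local convenience rather than documented service behavior.

```python
from typing import Dict, Union

Number = Union[int, float]


def between(field: str, low: Number, high: Number) -> Dict[str, Dict[str, Number]]:
    """Build an inclusive range condition: low <= field <= high."""
    if low > high:
        raise ValueError(f"empty range: {low} > {high}")
    return {field: {"gte": low, "lte": high}}


search_filters = [{"Position": "Engineer"}, between("Age", 20, 25)]
print(search_filters)
```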
Fuzzy query
Use wildcard patterns to match string fields partially, similar to the SQL LIKE operator. Wrap the pattern in a {"like": "<pattern>"} object and serialize it as a JSON string.
Filter for records where Name is "San Zhang" AND Position starts with "E" and ends with "r":
{
  "searchFilters": [
    {
      "Name": "San Zhang"
    },
    {
      "Position": {
        "like": "E%r"
      }
    }
  ]
}
Tag query
Filter by document tags in a document search knowledge base. Tags narrow retrieval to text segments from files that match specific tag values.
Sample setup: A document search knowledge base contains three resume files with the following tags:
| File | Tags |
|---|---|
| San Zhang's Resume | University A, Sports Specialist Student |
| Si Li's Resume | University B |
| Wu Wang's Resume | University B, Student Union President |
Example 1: OR logic (multiple tags in one subgroup)
Return text segments from files tagged with University A OR Student Union President. Multiple tags within a single subgroup use OR logic; to apply AND logic, place each tag in a separate subgroup (see Example 2).
{
  "searchFilters": [
    {
      "tags": "[\"University A\",\"Student Union President\"]"
    }
  ]
}
This returns results from both San Zhang's Resume (tagged University A) and Wu Wang's Resume (tagged Student Union President).
Example 2: AND logic (tags in separate subgroups)
Return text segments from files tagged with both University A AND Sports Specialist Student:
{
  "searchFilters": [
    {
      "tags": "[\"University A\"]"
    },
    {
      "tags": "[\"Sports Specialist Student\"]"
    }
  ]
}
Only San Zhang's Resume is returned because it is the only file with both tags.
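Both examples can be checked against the sample tags with a short local simulation. It models only the OR-within-subgroup and AND-across-subgroups rules stated above, at the file level; `files_matching` is a hypothetical helper and not part of any SDK.

```python
import json
from typing import Dict, List, Set

# Tags per file, copied from the sample setup table above.
FILE_TAGS: Dict[str, Set[str]] = {
    "San Zhang's Resume": {"University A", "Sports Specialist Student"},
    "Si Li's Resume": {"University B"},
    "Wu Wang's Resume": {"University B", "Student Union President"},
}


def files_matching(search_filters: List[Dict[str, str]]) -> List[str]:
    """OR across the tags of one subgroup, AND across subgroups."""
    wanted_per_subgroup = [set(json.loads(sg["tags"])) for sg in search_filters]
    return [
        name
        for name, tags in FILE_TAGS.items()
        if all(tags & wanted for wanted in wanted_per_subgroup)
    ]


or_filters = [{"tags": "[\"University A\",\"Student Union President\"]"}]
and_filters = [
    {"tags": "[\"University A\"]"},
    {"tags": "[\"Sports Specialist Student\"]"},
]

print(files_matching(or_filters))   # ["San Zhang's Resume", "Wu Wang's Resume"]
print(files_matching(and_filters))  # ["San Zhang's Resume"]
```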
References
| Resource | Description |
|---|---|
| Create and use a knowledge base | Knowledge base user guide |
| Retrieve API | Retrieve text segments from a knowledge base |
| Grant API permissions to a RAM user | Set up RAM user access for the Retrieve API |
| Error codes | Troubleshoot API call failures |