Multi-Path Search Overview: Parallel Query & Result Merging - OpenSearch

The Multi-path Search API runs multiple independent queries, called paths, in parallel within a single request. A request can define up to five paths. The API then merges and sorts the results from all paths. For each path, you can set a priority and the number of documents to retrieve. You can also trace which path each document in the final results came from. This feature is useful for complex search scenarios that combine text search, vector search, different indexes, or different query strategies.

Core concepts

Multi-path query: In a single search request, multiple independent queries are executed in parallel. Each query is called a path. The system then deduplicates, merges, and sorts the documents retrieved from all paths to generate a single result list.
Query group: A query group is a set of paths that have the same priority. The paths within the group share a retrieval quota. For example, a group contains two paths, A and B, each with a quota of 100. If path A retrieves only 70 documents, its unused quota of 30 can be used by path B, which allows path B to retrieve up to 130 documents.
Derived vector search: When a text query path matches a text embedding rule configured in the query analysis of the console, the system automatically creates a new vector query path. This derived path has the same priority as its parent path. Its quota defaults to 0, which means it shares the quota of its parent path.
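The quota sharing within a query group can be illustrated with a short sketch. The function name `effective_limits` and the order in which spare quota is handed out are assumptions made for illustration; the service may redistribute unused quota differently.

```python
def effective_limits(quotas, available):
    """Illustrative sketch of quota sharing inside one query group.

    quotas:    mapping of path name -> configured quota
    available: mapping of path name -> number of documents the path can match
    Returns a mapping of path name -> number of documents taken from the path.
    """
    # First pass: each path takes at most its own quota.
    taken = {path: min(quotas[path], available[path]) for path in quotas}
    # Quota left unused by paths that matched fewer documents than allowed.
    spare = sum(quotas.values()) - sum(taken.values())
    # Second pass: hand the spare quota to paths that still have documents.
    for path in quotas:
        extra = min(spare, available[path] - taken[path])
        taken[path] += extra
        spare -= extra
    return taken


# Paths A and B each have a quota of 100; A matches only 70 documents.
limits = effective_limits({"A": 100, "B": 100}, {"A": 70, "B": 500})
print(limits)  # {'A': 70, 'B': 130}
```

With the numbers from the example above, path B ends up with a limit of 130 because it inherits the 30 documents that path A did not use.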

Request method

POST

URL

/v3/openapi/apps/$app_name/multi-path-search

Note

$app_name: The name of your application. Premium Edition and Standard Edition support multiple applications and require you to specify an application name for access. This parameter is mainly used for applications that are in service. You can also specify the ID of an offline application to access its search service.
The preceding URL omits request headers, encoding, and other details.
The preceding URL omits the host address for accessing the application.

Request parameters

The request body is a JSON object. It contains a queries array that defines each query path and other global request parameters.

Global request parameters

Parameter	Type	Required	Default	Description
raw_query	String	No	""	The original search query from the end user. Used for query analysis or logging.
start	Integer	No	0	The starting position for pagination. The operation returns documents starting from the document at the `start` index. Value range: `[0, 5000]`.
hit	Integer	No	10	The maximum number of documents to return for the request. Valid values: `[1, 500]`.
format	String	No	`fulljson`	The format of the returned result set.
fetch_fields	String	No	All fields	The document fields to return. Separate multiple fields with semicolons (;).
unified_rank_size	Integer	No	2000	The total number of documents for unified sorting. The system selects up to `unified_rank_size` documents from the retrieved results of all paths based on priority for final sorting. Valid values: `[1, 10000]`.
unified_rank_type	String	No	none	The type of unified sorting. Valid values: • `rrf`: Uses the Reciprocal Rank Fusion algorithm to merge and sort results from multiple paths. • `cava_script`: Uses a custom Cava script for sorting. Text relevance scoring cannot be performed in the Cava script for unified sorting. You can use models such as click-through rate (CTR) for scoring. • `none`: Skips the unified sorting stage.
unified_rank_name	String	No	""	The name of the sorting script. This parameter is required when `unified_rank_type` is set to `cava_script`.
vector_search	Object	No	{}	The vector retrieval configuration. For more information about the parameters, see Vector retrieval.
user_id	String	No	""	The unique identifier of the end user. It can be used for unique visitor (UV) statistics or algorithm training.
trace	String	No	""	The log level for the search. Used for troubleshooting.
rank_trace	String	No	""	The log level for sorting. Used for troubleshooting.

Parameters of the queries array member object

Each JSON object in the queries array represents an independent query path.

Parameter	Type	Required	Default	Description
path	String	Yes	-	The unique identifier for the current query path. You can use it to trace the source in sorting logs.
query	String	Yes	-	The query clause. It defines the retrieval conditions for this path. For more information about the syntax, see Index retrieval - query clause.
priority	Integer	Yes	-	The retrieval priority. Valid values are non-negative integers. `0` is the highest priority. The system prioritizes selecting documents from high-priority paths for final sorting.
quota	Integer	Yes	-	The maximum number of documents to retrieve for this path. Valid values are non-negative integers.
qp	String	No	Default rules in the console	The names of the query analysis configurations. Separate multiple names with commas (,).
first_rank_name	String	No	Default configurations in the console	The name of the rough sort configuration.
second_rank_name	String	No	Default configurations in the console	The name of the fine sort configuration.
total_rank_size	Integer	No	DPI engine default	The total number of documents for rough sorting. Valid values: `[1, 5000000]`.
total_rerank_size	Integer	No	DPI engine default	The total number of documents for fine sorting. Valid values: `[1, 10000]`.
sort	String	No	`-RANK`	The sort clause. For more information about the syntax, see Global sorting - sort clause.
filter	String	No	""	The filter clause. For more information about the syntax, see Result filtering - filter clause.
kvpairs	Map	No	{}	Custom key-value pairs. For more information about the usage, see Custom parameter passing - kvpairs clause.
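The documented constraints can be checked on the client before a request is sent. The following is a minimal sketch, not part of any SDK: the function name `validate_request` is hypothetical, the value ranges are copied from the tables above, and the five-path limit comes from the Limitations section. The uniqueness check on `path` is inferred from the description of `path` as a unique identifier.

```python
VALID_RANK_TYPES = {"rrf", "cava_script", "none"}
MAX_PATHS = 5  # derived vector search paths do not count toward this limit


def validate_request(body):
    """Return a list of problems found in a multi-path search request body."""
    problems = []
    queries = body.get("queries", [])
    if len(queries) > MAX_PATHS:
        problems.append(f"at most {MAX_PATHS} paths are allowed, got {len(queries)}")

    seen = set()
    for i, query in enumerate(queries):
        # path, query, priority, and quota are the required parameters.
        for key in ("path", "query", "priority", "quota"):
            if key not in query:
                problems.append(f"queries[{i}]: missing required parameter {key}")
        for key in ("priority", "quota"):
            value = query.get(key, 0)
            if not isinstance(value, int) or value < 0:
                problems.append(f"queries[{i}]: {key} must be a non-negative integer")
        path = query.get("path")
        if path is not None and path in seen:
            problems.append(f"queries[{i}]: duplicate path {path}")
        seen.add(path)

    # Global parameters: name -> (default, minimum, maximum).
    ranges = {
        "start": (0, 0, 5000),
        "hit": (10, 1, 500),
        "unified_rank_size": (2000, 1, 10000),
    }
    for name, (default, low, high) in ranges.items():
        value = body.get(name, default)
        if not low <= value <= high:
            problems.append(f"{name} must be in [{low}, {high}], got {value}")

    rank_type = body.get("unified_rank_type", "none")
    if rank_type not in VALID_RANK_TYPES:
        problems.append(f"unknown unified_rank_type {rank_type}")
    if rank_type == "cava_script" and not body.get("unified_rank_name"):
        problems.append("unified_rank_name is required for cava_script")
    return problems


request = {
    "queries": [
        {"query": "title:'OpenSearch'", "path": "main", "priority": 1, "quota": 100}
    ],
    "hit": 10,
    "unified_rank_type": "rrf",
}
print(validate_request(request))  # []

bad_request = dict(request, hit=1000, unified_rank_type="cava_script")
for problem in validate_request(bad_request):
    print(problem)
```

The second call reports two problems: `hit` is outside `[1, 500]`, and `unified_rank_name` is missing although `unified_rank_type` is `cava_script`.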

Code examples

Request example

The following example request results in three query paths. Two paths are defined explicitly in the queries array, and the third is derived automatically:

The main path: This path has the highest priority and uses the sys_title query analysis rule.
The sub path: This path has the second-highest priority and does not use a query analysis rule.
The main/vector path: This is a derived vector search path that is created because the query analysis rule in the main path performs text embedding.

{
    "queries": [
        {
            "query": "title:'AI'",
            "path": "sub",
            "priority": 2,
            "quota": 100,
            "qp": ""
        },
        {
            "query": "title:'OpenSearch'",
            "path": "main",
            "priority": 1,
            "quota": 100,
            "qp": "sys_title"
        }
    ],
    "raw_query": "OpenSearch",
    "start": 0,
    "hit": 10,
    "format": "fulljson",
    "trace": "debug",
    "rank_trace": "info",
    "unified_rank_type": "rrf",
    "unified_rank_size": 1000,
    "fetch_fields": "title",
    "user_id": "123",
    "vector_search": {
        "vector": {
            "top_n": 100,
            "namespaces": [],
            "threshold": 0.5,
            "search_params": {
                "qc_scan_ratio": 0.01
            }
        }
    }
}

Response example

{
    "errors": [
        {
            "code": 2112,
            "message": "Specified index not in query:text_relevance(title,default,true)"
        }
    ],
    "ops_request_misc": "%7B%22request%5Fid%22%3A%22176129335716862423300005%22%2C%22scm%22%3A%2220140713.120141928..%22%7D",
    "request_id": "176129335716862423300005",
    "result": {
        "compute_cost": [
            {
                "index_name": "test",
                "value": 13.73
            }
        ],
        "facet": [],
        "items": [
            {
                "attribute": {
                    "pk": [
                        "120142134_1"
                    ]
                },
                "fields": {
                    "title": "OpenSearch Industry Algorithm Edition"
                },
                "property": {},
                "sortExprValues": [
                    "0.0327869"
                ],
                "tracerInfo": "begin search path[main] rank trace:\nend search path[main] rank trace.\n\nbegin search path[main/vector] rank trace:\nend search path[main/vector] rank trace.\n",
                "variableValue": {}
            },
            {
                "attribute": {
                    "pk": [
                        "120142134_2"
                    ]
                },
                "fields": {
                    "title": "OpenSearch Retrieval Engine Edition"
                },
                "property": {},
                "sortExprValues": [
                    "0.0322581"
                ],
                "tracerInfo": "begin search path[main] rank trace:\nend search path[main] rank trace.\n\nbegin search path[main/vector] rank trace:\nend search path[main/vector] rank trace.\n",
                "variableValue": {}
            },
            {
                "attribute": {
                    "pk": [
                        "120142134_3"
                    ]
                },
                "fields": {
                    "title": "AI Search Open Platform"
                },
                "property": {},
                "sortExprValues": [
                    "0.0163934"
                ],
                "tracerInfo": "begin search path[sub] rank trace:\nend search path[sub] rank trace.\n",
                "variableValue": {}
            }
        ],
        "num": 3,
        "searchtime": 0.085292,
        "total": 3,
        "viewtotal": 3
    },
    "status": "OK",
    "tracer": ""
}
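To trace which paths contributed each document, a client can read the path names from the `tracerInfo` string of each item. The sketch below assumes that `tracerInfo` always uses the `begin search path[...] rank trace:` wording shown in the response example, and that `ops_request_misc` is percent-encoded JSON, as it appears to be in that example. The helper names `source_paths` and `summarize` are hypothetical.

```python
import json
import re
from urllib.parse import unquote

# Matches the path name in lines such as "begin search path[main] rank trace:".
PATH_PATTERN = re.compile(r"begin search path\[([^\]]+)\] rank trace:")


def source_paths(item):
    """Return the query paths named in an item's tracerInfo string."""
    return PATH_PATTERN.findall(item.get("tracerInfo", ""))


def summarize(response):
    """Map each primary key to (sort score, list of source paths)."""
    summary = {}
    for item in response["result"]["items"]:
        pk = item["attribute"]["pk"][0]
        score = float(item["sortExprValues"][0])
        summary[pk] = (score, source_paths(item))
    return summary


# Trimmed copy of the response example above (first and third items only).
response = {
    "ops_request_misc": "%7B%22request%5Fid%22%3A%22176129335716862423300005%22"
                        "%2C%22scm%22%3A%2220140713.120141928..%22%7D",
    "result": {
        "items": [
            {
                "attribute": {"pk": ["120142134_1"]},
                "sortExprValues": ["0.0327869"],
                "tracerInfo": "begin search path[main] rank trace:\n"
                              "end search path[main] rank trace.\n\n"
                              "begin search path[main/vector] rank trace:\n"
                              "end search path[main/vector] rank trace.\n",
            },
            {
                "attribute": {"pk": ["120142134_3"]},
                "sortExprValues": ["0.0163934"],
                "tracerInfo": "begin search path[sub] rank trace:\n"
                              "end search path[sub] rank trace.\n",
            },
        ]
    },
}

summary = summarize(response)
print(summary["120142134_1"])  # (0.0327869, ['main', 'main/vector'])
print(summary["120142134_3"])  # (0.0163934, ['sub'])

# Decode ops_request_misc (assumed to be percent-encoded JSON).
misc = json.loads(unquote(response["ops_request_misc"]))
print(misc["request_id"])  # 176129335716862423300005
```

In the response example, the first two documents were retrieved by both the main path and its derived main/vector path, and the third document came only from the sub path.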

Return parameters

Parameter	Type	Description
status	string	The execution result of the search. Valid values: OK and FAIL. A value of OK indicates that the search is successful. A value of FAIL indicates that the search failed. In this case, troubleshoot errors based on the error code.
request_id	string	The request ID.
result	object	The returned results, which include the searchtime, total, num, viewtotal, items, facet, compute_cost, and scroll_id parameters.
errors	array	The error information. In each element, the code parameter indicates the error code and the message parameter indicates the error message. For more information about error codes, see Error codes.

searchtime: the period of time that was taken by the engine to complete the search. Unit: seconds.
Difference between the total, viewtotal, and num parameters: The total parameter indicates the number of results in the engine that meet the search conditions for a single search, regardless of the start and hit settings. If the number of results is large, the value of the total parameter is optimized. The value of the total parameter is generally used for display. To ensure search performance and relevance, the number of results that the engine returns is less than or equal to the value of the viewtotal parameter. If you require paging, the sum of the values of the start and hit parameters must be less than the value of the viewtotal parameter. The num parameter indicates the number of entries returned for this search request. Its value is limited by the start and hit parameters and does not exceed the value of the hit parameter.
compute_cost: an array with only one map element. The index_name parameter indicates the ID of the application. The value parameter indicates the logical computing units (LCUs) that are consumed in this search request.
items: the search results. The fields parameter indicates the content of a search result.
variableValue: the value of a custom parameter, such as the value of the distance parameter. The variableValue parameter is displayed only when the format parameter in the config clause is xml or fulljson. By default, the variableValue parameter is not displayed when the format parameter is set to json.
sortExprValues: the sort score of a document.
facet: the statistics returned by the aggregate clause.
Field of the ARRAY type: If the response is in the JSON or fullJSON format, data is separated by tab characters (\t). If the response is in the XML format, data is separated by spaces.
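A client typically needs to split ARRAY field values back into their elements. The following minimal sketch applies the separators described above; the function name `split_array_field` and the sample value are hypothetical.

```python
def split_array_field(value, response_format="fulljson"):
    """Split an ARRAY-type field value into its elements.

    JSON and fullJSON responses separate elements with tab characters;
    XML responses separate elements with spaces.
    """
    if value == "":
        return []
    separator = " " if response_format == "xml" else "\t"
    return value.split(separator)


tags = split_array_field("red\tgreen\tblue")
print(tags)  # ['red', 'green', 'blue']
```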

Multi-path query processing flow

Parallel query processing: The system processes each query in the request in parallel. The system calculates the number of documents to retrieve for each path based on the quota of the path. The default number is the quota value, with a maximum of 5,000. Then, the system sends the queries to the DPI engine and obtains a candidate set of document IDs for each path.
Candidate set selection: The system selects documents from the paths in priority order, taking up to the quota of each path. The selection process stops when the total number of selected documents reaches the `unified_rank_size` value.
Candidate set sorting:
1. If `unified_rank_type` is set to `cava_script`, all document IDs are sent to the DPI engine for sorting.
2. If `unified_rank_type` is set to `rrf`, the system directly sorts the documents using the RRF algorithm.
3. If `unified_rank_type` is set to `none`, the sorting stage is skipped.
Result return:
1. If the documents were sorted, the system selects a range of documents based on the `start` and `hit` parameters to form the result set. If sorting was skipped because `unified_rank_type` is set to `none`, the entire candidate set is returned as the result set, and the `start` and `hit` parameters are ignored.
  Important
  When unified_rank_type is set to none, the system skips the unified sorting stage, ignores the start and hit paging parameters, and returns the complete candidate set. This mode can return a large amount of data. Use this mode only if you want to implement sorting logic outside of OpenSearch.
2. The document IDs from the result set are sent to the DPI engine to retrieve the complete documents. These documents are then returned to the user.
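The selection, RRF sorting, and paging stages described above can be sketched as follows. This is an illustration under stated assumptions, not the service implementation: the function names are hypothetical, deduplication by document ID is simplified, and the RRF constant `k = 60` with ranks starting at 1 is an assumption chosen because it reproduces the `sortExprValues` in the response example (for instance, 1/61 + 1/61 ≈ 0.0327869).

```python
def select_candidates(path_results, priorities, quotas, unified_rank_size):
    """Pick documents path by path in priority order (0 is the highest)."""
    selected = {}       # path -> ranked document IDs kept for that path
    unique_ids = set()  # all document IDs selected so far
    for path in sorted(path_results, key=lambda p: priorities[p]):
        kept = []
        for doc_id in path_results[path][: quotas[path]]:
            # Stop adding new documents once unified_rank_size is reached.
            if doc_id not in unique_ids and len(unique_ids) >= unified_rank_size:
                break
            unique_ids.add(doc_id)
            kept.append(doc_id)
        selected[path] = kept
    return selected


def rrf_merge(selected, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over paths of 1 / (k + rank)."""
    scores = {}
    for ranked_ids in selected.values():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


def page(ranked, start=0, hit=10):
    """Apply the start and hit paging parameters to the sorted list."""
    return ranked[start : start + hit]


# Hypothetical per-path results, already ranked within each path.
path_results = {
    "sub": ["doc3"],
    "main": ["doc1", "doc2"],
    "main/vector": ["doc1", "doc2"],
}
priorities = {"main": 1, "main/vector": 1, "sub": 2}
# Simplification: the derived path gets its own quota here instead of
# sharing the quota of its parent path.
quotas = {"main": 100, "main/vector": 100, "sub": 100}

selected = select_candidates(path_results, priorities, quotas, unified_rank_size=1000)
results = [(doc, round(score, 7)) for doc, score in page(rrf_merge(selected))]
print(results)
# [('doc1', 0.0327869), ('doc2', 0.0322581), ('doc3', 0.0163934)]
```

Documents that appear in several paths accumulate a score from each path, which is why the two documents found by both the main and main/vector paths rank above the document found only by the sub path.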

Limitations

Feature limitations: The aggregation, distinct, and highlighting features are not supported.
Path limit: A single request supports a maximum of five paths. Derived vector search paths do not count toward this limit.
Application type limit: This feature is available only for dedicated applications.
Retrieval limit: In a multi-path parallel query, a maximum of 5,000 candidate documents can be retrieved from the DPI engine for each path.