OpenSearch: Parallel queries

Last Updated: Aug 27, 2024

Overview

The parallel query feature extends the query feature. Parallel queries are performed based on graphics architectures. When the feature is enabled, the system splits a query into multiple query processes and runs them on multiple threads, which reduces the overall query latency. When you write a query statement, you can specify the number of threads to use for the query. The feature is suited to workloads in which seek timeout errors occur and incomplete search results are returned. Typical scenarios include the following:

  • Your business uses complex computing logic, such as complex filtering, statistical operations, and calculations.

  • You use a cluster in which computing and storage are decoupled, and you frequently perform index dictionary lookup operations and inverted seek operations that access the remote storage.

How to perform a parallel query

  • Make sure that your Searcher workers are deployed in a multi-core, multi-threaded runtime environment.

  • OpenSearch Vector Search Edition supports parallel queries on 2, 4, 8, or 16 threads. Select the number of threads based on your business requirements.

  • By default, the parallel query feature is enabled. When you configure Searcher workers, you can use the paraSearchWays parameter to specify the thread counts that the workers support. For example, if you specify --env paraSearchWays=2,4,8, the workers can use 2, 4, or 8 parallel threads to perform queries. If you do not specify the paraSearchWays parameter, the default value is used, and each worker supports 2 or 4 parallel threads.

  • In a query statement, you can specify the cluster on which to perform the query and, in the para_search parameter appended to the cluster name, the number of threads to use. The name of the default cluster is general. For example, in config=cluster:general.para_search_2, ..., the value general.para_search_2 specifies that the query is performed on the general cluster by two parallel threads.

  • You can also specify a custom cluster on which to perform the query. For example, in config=cluster:daogou.para_search_2, ..., the value daogou.para_search_2 specifies that the query is performed on the daogou cluster by two parallel threads.
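
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of OpenSearch or its SDKs: the function names parse_para_search_ways and build_cluster_setting are hypothetical, and the code only assembles strings in the format shown in the examples. It assumes that a query may use a thread count only if that count appears in the paraSearchWays value of the Searcher workers.

```python
# Thread counts that this document lists as supported.
SUPPORTED_WAYS = {2, 4, 8, 16}
# Thread counts available when paraSearchWays is not specified.
DEFAULT_WAYS = (2, 4)


def parse_para_search_ways(value=None):
    """Parse a paraSearchWays value such as "2,4,8" into a sorted tuple of ints.

    Hypothetical helper: returns the documented default (2, 4) when the
    value is missing or empty.
    """
    if value is None or not value.strip():
        return DEFAULT_WAYS
    ways = sorted({int(part) for part in value.split(",") if part.strip()})
    unsupported = [w for w in ways if w not in SUPPORTED_WAYS]
    if unsupported:
        raise ValueError(f"unsupported thread counts: {unsupported}")
    return tuple(ways)


def build_cluster_setting(cluster="general", threads=None, enabled_ways=DEFAULT_WAYS):
    """Build the cluster setting of a config clause.

    Hypothetical helper: returns e.g. "cluster:general.para_search_2".
    If threads is None, no para_search suffix is appended.
    """
    if threads is None:
        return f"cluster:{cluster}"
    # Assumption: only thread counts enabled on the workers can be requested.
    if threads not in enabled_ways:
        raise ValueError(
            f"{threads} threads not enabled; choose one of {list(enabled_ways)}"
        )
    return f"cluster:{cluster}.para_search_{threads}"


if __name__ == "__main__":
    ways = parse_para_search_ways("2,4,8")
    print(ways)                                       # (2, 4, 8)
    print(build_cluster_setting("general", 2, ways))  # cluster:general.para_search_2
    print(build_cluster_setting("daogou", 8, ways))   # cluster:daogou.para_search_8
```

Validating the thread count on the client side before sending the query makes a mismatch between the query statement and the worker configuration visible early. How the service itself handles a thread count that is not enabled is not described here, so the check is a precaution rather than documented behavior.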