PolarSearch 3.0 performance

This topic presents the OpenSearch Benchmark results for PolarSearch 3.0 using three standard workloads: HTTP logs, NYC taxis, and Geonames.

Note: For detailed test steps, see PolarSearch Benchmark Performance Test Method.

Write performance

The write test uses the append-no-conflicts-index-only test procedure to bulk write the entire corpus to the cluster, evaluating the write throughput (docs/s) across different numbers of concurrent write clients (bulk_indexing_clients). Each test starts with an empty index.
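The client sweep described above can be sketched as a small script that builds one OpenSearch Benchmark command line per client count. This is an illustration only: the `execute-test` subcommand and the flags shown follow the OpenSearch Benchmark 1.x CLI (newer releases may name the subcommand differently), and the host address and the `number_of_shards` and `number_of_replicas` workload parameters are assumptions, not values taken from the test method.

```python
import shlex

# Client counts used in the tables of this topic.
CLIENT_COUNTS = [1, 2, 4, 8, 16, 32]


def build_write_command(workload, clients, shards=6, replicas=1,
                        host="localhost:9200"):
    """Return the argument list for one write-only benchmark run.

    `host` is a placeholder; `number_of_shards` and `number_of_replicas`
    are assumed workload parameter names.
    """
    params = {
        "bulk_indexing_clients": clients,
        "number_of_shards": shards,
        "number_of_replicas": replicas,
    }
    param_str = ",".join(f"{key}:{value}" for key, value in params.items())
    return [
        "opensearch-benchmark", "execute-test",
        f"--workload={workload}",
        "--test-procedure=append-no-conflicts-index-only",
        f"--workload-params={param_str}",
        f"--target-hosts={host}",
        "--pipeline=benchmark-only",
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for clients in CLIENT_COUNTS:
        print(shlex.join(build_write_command("http_logs", clients)))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; each printed line could then be run against an empty index.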

HTTP logs write performance

This dataset, derived from real web server access logs from the 1998 FIFA World Cup, contains approximately 247 million log records.

Test scenario 1: 6 shards, 1 replica

Throughput (docs/s)	Client 1	Client 2	Client 4	Client 8 (default)	Client 16	Client 32
PolarSearch 3.0	129830	216656	345442	528321	552017	551569

Test scenario 2: 6 shards, 0 replicas

Throughput (docs/s)	Client 1	Client 2	Client 4	Client 8 (default)	Client 16	Client 32
PolarSearch 3.0	189685	326329	577366	880890	957399	951244
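To make the HTTP logs write figures easier to compare, the following sketch derives two ratios from the reported throughput values: the speedup relative to a single client, and the throughput with 1 replica as a fraction of the throughput with 0 replicas. The function names are illustrative; the numbers are copied from the two test scenarios above.

```python
CLIENTS = [1, 2, 4, 8, 16, 32]

# Reported HTTP logs write throughput in docs/s.
ONE_REPLICA = [129830, 216656, 345442, 528321, 552017, 551569]
ZERO_REPLICAS = [189685, 326329, 577366, 880890, 957399, 951244]


def speedups(throughputs):
    """Throughput of each run divided by the single-client throughput."""
    base = throughputs[0]
    return [value / base for value in throughputs]


def replica_ratio(with_replica, without_replica):
    """Fraction of the 0-replica throughput retained with 1 replica."""
    return [w / wo for w, wo in zip(with_replica, without_replica)]


if __name__ == "__main__":
    rows = zip(CLIENTS, speedups(ONE_REPLICA),
               replica_ratio(ONE_REPLICA, ZERO_REPLICAS))
    for clients, speedup, ratio in rows:
        print(f"clients={clients:>2}  speedup={speedup:.2f}x  "
              f"1-replica/0-replica={ratio:.2f}")
```

With these figures, throughput in scenario 1 grows about fourfold from 1 to 8 clients and by less than 5% from 8 to 16 clients, and the 1-replica runs reach roughly 60% of the 0-replica throughput at 8 clients.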

NYC taxis write performance

This dataset contains approximately 165 million records from New York City taxi trips in 2015.

Test scenario 1: 6 shards, 1 replica

Throughput (docs/s)	Client 1	Client 2	Client 4	Client 8 (default)	Client 16	Client 32
PolarSearch 3.0	92773	134968	222221	321173	335255	336760

Test scenario 2: 6 shards, 0 replicas

Throughput (docs/s)	Client 1	Client 2	Client 4	Client 8 (default)	Client 16	Client 32
PolarSearch 3.0	139770	222312	370978	555390	601674	598330

Search performance

Search tests run after write operations are complete, with replica=1 and target_throughput=0 (full-speed stress test mode). This evaluates the upper limit of search throughput (ops/s) at full CPU load.

HTTP logs search performance

term - Term match

This test performs a term match on logs-* indices. Each query matches a fixed set of 10,000 documents and returns the default 10 documents. The test measures the efficiency of inverted index lookups, the overhead of cross-shard fan-out and merge operations, and the processing capacity of the coordinating node in PolarSearch 3.0.

HTTP logs term (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	256	519	868	1361	1914	2592

asc_sort_timestamp - Doc value sorting

This test runs a match_all query on all documents, sorts them by timestamp in ascending order, and returns the top 10. The @timestamp field, a date type, is sorted using doc values, which rely on columnar storage. This test evaluates the efficiency of reading doc values, performing top-N heap sorts, and merging results on the coordinating node for PolarSearch 3.0.

HTTP logs asc_sort_timestamp (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	193	432	779	1020	1514	2073
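For reference, a request body of the kind exercised by asc_sort_timestamp might look like the following sketch. It is built from the description above (match_all, ascending sort on @timestamp, top 10) and is not copied from the workload definition; the function name is hypothetical.

```python
import json


def asc_sort_timestamp_body(size=10):
    """Build a match_all search sorted by @timestamp in ascending order."""
    return {
        "query": {"match_all": {}},
        "sort": [{"@timestamp": {"order": "asc"}}],
        "size": size,
    }


if __name__ == "__main__":
    print(json.dumps(asc_sort_timestamp_body(), indent=2))
```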

range - Time range query

This test performs a dynamic time-window range query on the @timestamp field. Because 90% of queries in log analytics products include a time filter, this task measures the efficiency of date range queries. It serves as a counterpart to the term test, comparing range query performance against point query performance.

HTTP logs range (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	219	494	893	1338	2005	2646
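A time-window range query of the kind described for the range test can be sketched as follows. The start time and window length are hypothetical values chosen for illustration, and the half-open interval (`gte`/`lt`) is an assumption; the workload may define its window differently.

```python
import json
from datetime import datetime, timedelta


def timestamp_range_body(start, window):
    """Build a range query on @timestamp covering [start, start + window)."""
    end = start + window
    return {
        "query": {
            "range": {
                "@timestamp": {
                    "gte": start.isoformat(),
                    "lt": end.isoformat(),
                }
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical one-hour window inside the 1998 World Cup period.
    body = timestamp_range_body(datetime(1998, 6, 10), timedelta(hours=1))
    print(json.dumps(body, indent=2))
```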

hourly_agg - Time-series aggregation

This test performs a date_histogram aggregation, bucketing by the hour across all documents. It measures the aggregation throughput for date types, a core operation for log monitoring and statistical analysis scenarios.

HTTP logs hourly_agg (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	48	145	251	315	339	346
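The hourly bucketing described for hourly_agg corresponds to a date_histogram aggregation. The sketch below shows one plausible request body; the aggregation name `by_hour` is hypothetical, and `size` is set to 0 on the assumption that only the buckets, not the matching documents, are of interest.

```python
import json


def hourly_agg_body(field="@timestamp"):
    """Build a date_histogram aggregation with one bucket per hour."""
    return {
        "size": 0,  # return aggregation buckets only, no hits
        "aggs": {
            "by_hour": {
                "date_histogram": {
                    "field": field,
                    "calendar_interval": "hour",
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(hourly_agg_body(), indent=2))
```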

NYC taxis search performance

match-all - Baseline dispatching

This test runs a match_all query on approximately 165 million records, returning only the default 10 results without any sorting or aggregation. It measures the baseline efficiency of the engine's request parsing, shard dispatching, and result serialization.

NYC taxis match-all (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	271	612	1292	2326	3668	5531

range - Numeric range query

This test applies a fixed numeric range filter to the total_amount field (a scaled_float type representing the fare amount). It finds trips with fares between 5 and 15 dollars, returning the default 10 results. The test evaluates the efficiency of numeric field range queries.

NYC taxis range (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	161	441	906	1335	1723	1958
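The numeric filter described for the NYC taxis range test can be sketched as a range query on total_amount. Whether the bounds are inclusive is not stated above, so the inclusive `gte`/`lte` pair used here is an assumption.

```python
import json


def fare_range_body(low=5, high=15, field="total_amount"):
    """Build a range query matching trips with low <= fare <= high."""
    return {
        "query": {
            "range": {
                field: {"gte": low, "lte": high}
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(fare_range_body(), indent=2))
```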

autohisto_agg - Adaptive time aggregation

This test performs an auto_date_histogram aggregation (automatic bucketing, buckets=20) on data within a 20-day time window to evaluate the engine's adaptive aggregation capability. It complements the range test: range covers numeric range queries, while auto_date_histogram covers time range aggregations.

NYC taxis autohisto_agg (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	170	444	733	966	1117	1142
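As an illustration of the autohisto_agg shape, the following sketch combines a 20-day range filter with an auto_date_histogram aggregation limited to 20 buckets. The field name `dropoff_datetime`, the date format, the start date, and the aggregation name are all assumptions made for this example and are not taken from the test method.

```python
import json
from datetime import datetime, timedelta

# Assumed to match the date format of the index mapping.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def autohisto_body(start, days=20, buckets=20, field="dropoff_datetime"):
    """Build an auto_date_histogram over [start, start + days)."""
    end = start + timedelta(days=days)
    return {
        "size": 0,  # aggregation buckets only
        "query": {
            "range": {
                field: {
                    "gte": start.strftime(DATE_FORMAT),
                    "lt": end.strftime(DATE_FORMAT),
                }
            }
        },
        "aggs": {
            "over_time": {
                "auto_date_histogram": {"field": field, "buckets": buckets}
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical window starting on 1 January 2015.
    print(json.dumps(autohisto_body(datetime(2015, 1, 1)), indent=2))
```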

Geonames search performance

The Geonames dataset contains approximately 11.4 million records of global points of interest (POIs), with fields for place names, country codes, population counts, and geographic coordinates. We use this dataset to evaluate performance across various query types. Because the Geonames dataset is much smaller than the HTTP logs and NYC taxis datasets, write times are short and consistent, which makes a write performance comparison impractical. Therefore, the Geonames tests focus on search scenarios. These tests provide a comprehensive evaluation of query processing capabilities by covering exact queries, full-text search, aggregations, geospatial queries, and script scoring.

term - Term match

Geonames term (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	594	1118	2004	3871	5851	8186

phrase - Phrase query

This test evaluates phrase query capabilities within full-text search. It requires verifying term position information in the inverted index, which makes it more computationally complex than a term query.

Geonames phrase (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	481	1153	1686	3014	4719	6803

country_agg_uncached - Terms and nested aggregation

This test performs a terms aggregation (bucketing by country code) with a nested sum aggregation on all documents. The aggregation cache is disabled to reflect the true computational overhead. This task represents a compute-intensive aggregation scenario and measures the engine's CPU efficiency under heavy aggregation loads.

Geonames country_agg_uncached (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	17	32.56	40.95	43.56	45	45
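The country_agg_uncached request can be approximated by a terms aggregation with a nested sum, sent with the shard request cache turned off. In this sketch the field names `country_code` and `population`, the aggregation names, and the use of the `request_cache=false` URL parameter to disable caching are assumptions; the workload may configure this differently.

```python
import json


def country_agg_request(country_field="country_code", sum_field="population"):
    """Return (url_params, body) for a terms aggregation with a nested sum."""
    url_params = {"request_cache": "false"}  # bypass the shard request cache
    body = {
        "size": 0,  # aggregation buckets only
        "aggs": {
            "by_country": {
                "terms": {"field": country_field},
                "aggs": {
                    "sum_population": {"sum": {"field": sum_field}}
                },
            }
        },
    }
    return url_params, body


if __name__ == "__main__":
    params, body = country_agg_request()
    print(params)
    print(json.dumps(body, indent=2))
```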

decay_geo_gauss_function_score - Geospatial distance decay scoring

This test runs a function score query using Gaussian distance decay on the geo_point field. It evaluates both geospatial index query efficiency and scoring performance.

Geonames decay_geo_gauss (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	5.78	11.32	19	21	21.38	21.34
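To show what Gaussian distance decay computes, the sketch below pairs a function score request body with a plain Python version of the Gaussian decay formula as documented for the function score query, where the score falls to `decay` at a distance of `scale` from the origin. The field name `location`, the origin coordinates, and the scale are hypothetical values for illustration.

```python
import json
import math


def gauss_decay(distance, scale, offset=0.0, decay=0.5):
    """Gaussian decay: 1.0 at the origin, `decay` at offset + scale."""
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    effective = max(0.0, abs(distance) - offset)
    return math.exp(-(effective ** 2) / (2.0 * sigma_sq))


def geo_gauss_body(lat, lon, scale="1000km", field="location"):
    """Build a function_score query with a gauss decay on a geo_point field."""
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": [
                    {"gauss": {field: {"origin": {"lat": lat, "lon": lon},
                                       "scale": scale}}}
                ],
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical origin; distances are in the same unit as the scale (km).
    print(json.dumps(geo_gauss_body(52.37, 4.89), indent=2))
    for km in (0, 500, 1000, 2000):
        print(km, round(gauss_decay(km, scale=1000.0), 4))
```

Because every matching document is scored individually, the cost of this query grows with the number of hits, which is consistent with the comparatively low ops/s figures in the table above.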

painless_static - Painless custom scoring

This test uses a Painless script to perform custom score calculations on each document. The script is pre-compiled and executed using static binding. This task measures the computational throughput of the script engine (JIT compilation) and represents a typical scenario for highly customized script scoring.

Geonames painless_static (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	12	14	21	23.51	23.71	23.74
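A script-scored query in the spirit of painless_static might look like the following sketch. The Painless source shown here is a hypothetical scoring expression on the `population` field, chosen only because it always yields a non-negative score; it is not the script used by the workload.

```python
import json

# Hypothetical Painless expression; the workload's actual script may differ.
SCRIPT_SOURCE = "Math.log(2 + doc['population'].value)"


def painless_score_body(source=SCRIPT_SOURCE):
    """Build a script_score query that scores every document with a script."""
    return {
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {"lang": "painless", "source": source},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(painless_score_body(), indent=2))
```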

desc_sort_population - Numeric doc value sorting

This test sorts all documents in descending order by the population field (an integer type) and returns the top 10 results. Sorting is performed using doc values, which rely on columnar storage. The test evaluates the efficiency of reading columnar numeric fields and the performance of top-N sorting.

Geonames desc_sort_population (ops/s)	Client 1	Client 2	Client 4	Client 8	Client 16	Client 32
PolarSearch 3.0	316	463	1047	1894	2577	3593
