This topic presents the OpenSearch Benchmark results for PolarSearch 3.0 using three standard workloads: HTTP logs, NYC taxis, and Geonames.
For detailed test steps, see PolarSearch Benchmark Performance Test Method.
Write performance
The write test uses the append-no-conflicts-index-only test procedure to bulk write the entire corpus to the cluster, evaluating the write throughput (docs/s) across different numbers of concurrent write clients (bulk_indexing_clients). Each test starts with an empty index.
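As a rough sketch of how one such write run could be launched, the snippet below assembles an OpenSearch Benchmark command line for each client count. The `execute-test` subcommand and the flag names follow the OpenSearch Benchmark 1.x CLI (later releases may name the subcommand differently). The host address, the `number_of_shards` and `number_of_replicas` workload parameters, and the helper name `build_write_command` are assumptions for illustration, not the exact commands behind these results.

```python
import shlex

def build_write_command(workload, clients, replicas, shards=6,
                        host="localhost:9200"):
    """Assemble an opensearch-benchmark invocation for one write run.

    Assumptions: OpenSearch Benchmark 1.x flag names, a placeholder host,
    and workload parameter names as published in the public workloads.
    """
    params = (
        f"bulk_indexing_clients:{clients},"
        f"number_of_shards:{shards},"
        f"number_of_replicas:{replicas}"
    )
    return [
        "opensearch-benchmark", "execute-test",
        f"--workload={workload}",
        "--test-procedure=append-no-conflicts-index-only",
        f"--workload-params={params}",
        f"--target-hosts={host}",
        "--pipeline=benchmark-only",
    ]

# One command per client count used in the tables of this topic.
for clients in (1, 2, 4, 8, 16, 32):
    print(shlex.join(build_write_command("http_logs", clients, replicas=1)))
```

Building the command as a list keeps each flag separate, so it can be passed to `subprocess.run` without shell quoting issues.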
HTTP logs write performance
This dataset, derived from real web server access logs from the 1998 FIFA World Cup, contains approximately 247 million log records.
Test scenario 1: 6 shards, 1 replica
Throughput (docs/s) | Client 1 | Client 2 | Client 4 | Client 8 (default) | Client 16 | Client 32 |
PolarSearch 3.0 | 129830 | 216656 | 345442 | 528321 | 552017 | 551569 |
Test scenario 2: 6 shards, 0 replicas
Throughput (docs/s) | Client 1 | Client 2 | Client 4 | Client 8 (default) | Client 16 | Client 32 |
PolarSearch 3.0 | 189685 | 326329 | 577366 | 880890 | 957399 | 951244 |
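To make the cost of replication easier to read off the two HTTP logs scenarios, the following sketch divides the 0-replica throughput by the 1-replica throughput at each client count. The figures are copied from the two tables above; the function name is illustrative only.

```python
CLIENTS = [1, 2, 4, 8, 16, 32]
ONE_REPLICA = [129830, 216656, 345442, 528321, 552017, 551569]
ZERO_REPLICAS = [189685, 326329, 577366, 880890, 957399, 951244]

def replica_speedup(zero_replica, one_replica):
    """Return 0-replica throughput divided by 1-replica throughput."""
    return [z / o for z, o in zip(zero_replica, one_replica)]

for clients, ratio in zip(CLIENTS, replica_speedup(ZERO_REPLICAS, ONE_REPLICA)):
    print(f"{clients:>2} clients: {ratio:.2f}x")
```

With these figures, removing the replica raises write throughput by roughly 1.5x to 1.7x, and the gap widens as the client count grows.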
NYC taxis write performance
This dataset contains approximately 165 million records from New York City taxi trips in 2015.
Test scenario 1: 6 shards, 1 replica
Throughput (docs/s) | Client 1 | Client 2 | Client 4 | Client 8 (default) | Client 16 | Client 32 |
PolarSearch 3.0 | 92773 | 134968 | 222221 | 321173 | 335255 | 336760 |
Test scenario 2: 6 shards, 0 replicas
Throughput (docs/s) | Client 1 | Client 2 | Client 4 | Client 8 (default) | Client 16 | Client 32 |
PolarSearch 3.0 | 139770 | 222312 | 370978 | 555390 | 601674 | 598330 |
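One way to locate where write throughput levels off is to find the first client count after which doubling the clients gains less than a chosen threshold. The sketch below applies this to the NYC taxis figures above; the 10% threshold and the function name are arbitrary assumptions, not part of the test method.

```python
CLIENTS = [1, 2, 4, 8, 16, 32]
NYC_TAXIS = {
    "6 shards, 1 replica": [92773, 134968, 222221, 321173, 335255, 336760],
    "6 shards, 0 replicas": [139770, 222312, 370978, 555390, 601674, 598330],
}

def saturation_point(clients, throughput, min_gain=0.10):
    """Return the first client count whose next step gains less than min_gain."""
    for i in range(len(clients) - 1):
        gain = throughput[i + 1] / throughput[i] - 1.0
        if gain < min_gain:
            return clients[i]
    return clients[-1]  # throughput never flattened within the measured range

for scenario, values in NYC_TAXIS.items():
    print(scenario, "->", saturation_point(CLIENTS, values), "clients")
```

Under this threshold, both scenarios level off at 8 clients, which matches the default client count.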
Search performance
Search tests run after write operations are complete, with replica=1 and target_throughput=0 (full-speed stress test mode). This evaluates the upper limit of search throughput (ops/s) at full CPU load.
HTTP logs search performance
term - Term match
This test performs a term match on logs-* indices. Each query hits a fixed 10,000 documents and returns the default 10 documents. The test measures the efficiency of inverted index lookups, the overhead of cross-shard fan-out and merge operations, and the processing capacity of the coordinating node in PolarSearch 3.0.
HTTP logs term (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 256 | 519 | 868 | 1361 | 1914 | 2592 |
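The table can also be read in terms of scaling efficiency, that is, measured throughput divided by what perfect linear scaling from the single-client result would give. The sketch below computes this for the term figures above; the function name is illustrative.

```python
CLIENTS = [1, 2, 4, 8, 16, 32]
TERM_OPS = [256, 519, 868, 1361, 1914, 2592]

def scaling_efficiency(clients, ops):
    """Throughput relative to perfect linear scaling from the first result."""
    base = ops[0] / clients[0]  # ops/s contributed by one client
    return [o / (c * base) for c, o in zip(clients, ops)]

for c, eff in zip(CLIENTS, scaling_efficiency(CLIENTS, TERM_OPS)):
    print(f"{c:>2} clients: {eff:.0%} of linear scaling")
```

Throughput still rises at every step, but each added client contributes less than the previous one.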
asc_sort_timestamp - Doc value sorting
This test runs a match_all query on all documents, sorts them by timestamp in ascending order, and returns the top 10. The @timestamp field, a date type, is sorted using doc values, which rely on columnar storage. This test evaluates the efficiency of reading doc values, performing top-N heap sorts, and merging results on the coordinating node for PolarSearch 3.0.
HTTP logs asc_sort_timestamp (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 193 | 432 | 779 | 1020 | 1514 | 2073 |
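For orientation, a request of the kind described above might look like the following search body, written here as a Python dictionary. This is a minimal sketch using standard OpenSearch query DSL; it is not necessarily the exact body that the workload sends, and the helper name is hypothetical.

```python
import json

def asc_sort_body(field="@timestamp", size=10):
    """Build a match_all search body sorted ascending on one field."""
    return {
        "query": {"match_all": {}},
        "sort": [{field: {"order": "asc"}}],
        "size": size,  # top-N hits returned to the client
    }

print(json.dumps(asc_sort_body(), indent=2))
```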
range - Time range query
This test performs a dynamic time-window range query on the @timestamp field. Because 90% of queries in log analytics products include a time filter, this task measures the efficiency of date range queries. It serves as a counterpart to the term test, comparing range query performance against point query performance.
HTTP logs range (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 219 | 494 | 893 | 1338 | 2005 | 2646 |
hourly_agg - Time-series aggregation
This test performs a date_histogram aggregation, bucketing by the hour across all documents. It measures the aggregation throughput for date types, a core operation for log monitoring and statistical analysis scenarios.
HTTP logs hourly_agg (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 48 | 145 | 251 | 315 | 339 | 346 |
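An hourly date_histogram request could be sketched as below. The aggregation name `by_hour` and the helper name are arbitrary, and the body is an illustration in standard OpenSearch query DSL rather than the workload's exact definition.

```python
import json

def hourly_agg_body(field="@timestamp"):
    """Build an aggregation-only request that buckets documents by hour."""
    return {
        "size": 0,  # return buckets only, no hits
        "aggs": {
            "by_hour": {  # aggregation name is arbitrary
                "date_histogram": {
                    "field": field,
                    "calendar_interval": "hour",
                }
            }
        },
    }

print(json.dumps(hourly_agg_body(), indent=2))
```

Setting `size` to 0 keeps the response limited to the buckets, so the measurement reflects aggregation work rather than hit fetching.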
NYC taxis search performance
match-all - Baseline dispatching
This test runs a match_all query on approximately 165 million records, returning only the default 10 results without any sorting or aggregation. It measures the baseline efficiency of the engine's request parsing, shard dispatching, and result serialization.
NYC taxis match-all (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 271 | 612 | 1292 | 2326 | 3668 | 5531 |
range - Numeric range query
This test applies a fixed numeric range filter to the total_amount field (a scaled_float type representing the fare amount). It finds trips with fares between 5 and 15 dollars, returning the default 10 results. The test evaluates the efficiency of numeric field range queries.
NYC taxis range (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 161 | 441 | 906 | 1335 | 1723 | 1958 |
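The numeric filter described above can be sketched as a range query on total_amount. Whether each bound is inclusive is not stated in this topic, so the inclusive lower bound and exclusive upper bound below are assumptions, as is the helper name.

```python
import json

def fare_range_body(low=5, high=15, field="total_amount"):
    """Build a range query for fares in [low, high).

    Assumption: lower bound inclusive (gte), upper bound exclusive (lt).
    """
    return {"query": {"range": {field: {"gte": low, "lt": high}}}}

print(json.dumps(fare_range_body(), indent=2))
```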
autohisto_agg - Adaptive time aggregation
This test performs an auto_date_histogram aggregation (automatic bucketing, buckets=20) on data within a 20-day time window to evaluate the engine's adaptive aggregation capability. It complements the range test: range covers numeric range queries, while autohisto_agg covers aggregations over a time range.
NYC taxis autohisto_agg (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 170 | 444 | 733 | 966 | 1117 | 1142 |
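A request of this shape could be sketched as follows. The date field name `dropoff_datetime`, the start date, the aggregation name, and the helper name are all hypothetical; only the 20-day window and buckets=20 come from the description above. The explicit `format` in the range clause is there so the sketch does not depend on the date format of the index mapping.

```python
import json
from datetime import date, timedelta

def autohisto_body(start, days=20, buckets=20, field="dropoff_datetime"):
    """Build an auto_date_histogram request over a fixed time window.

    Assumption: `field` is a date field; its name is illustrative only.
    """
    end = start + timedelta(days=days)
    return {
        "size": 0,  # buckets only, no hits
        "query": {
            "range": {
                field: {
                    "gte": start.isoformat(),
                    "lt": end.isoformat(),
                    "format": "yyyy-MM-dd",
                }
            }
        },
        "aggs": {
            "over_time": {
                "auto_date_histogram": {"field": field, "buckets": buckets}
            }
        },
    }

print(json.dumps(autohisto_body(date(2015, 1, 1)), indent=2))
```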
Geonames search performance
The Geonames dataset contains approximately 11.4 million records of global points of interest (POIs), with fields for place names, country codes, population counts, and geographic coordinates. We use this dataset to evaluate performance across various query types. Because the Geonames dataset is much smaller than the HTTP logs and NYC taxis datasets, write times are short and consistent, which makes a write performance comparison impractical. Therefore, the Geonames tests focus on search scenarios. These tests provide a comprehensive evaluation of query processing capabilities by covering exact queries, full-text search, aggregations, geospatial queries, and script scoring.
term - Term match
Geonames term (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 594 | 1118 | 2004 | 3871 | 5851 | 8186 |
phrase - Phrase query
This test evaluates phrase query capabilities within full-text search. It requires verifying term position information in the inverted index, which makes it more computationally complex than a term query.
Geonames phrase (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 481 | 1153 | 1686 | 3014 | 4719 | 6803 |
country_agg_uncached - Terms and nested aggregation
This test performs a terms aggregation (bucketing by country code) with a nested sum aggregation on all documents. The aggregation cache is disabled to reflect the true computational overhead. This task represents a compute-intensive aggregation scenario and measures the engine's CPU efficiency under heavy aggregation loads.
Geonames country_agg_uncached (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 17 | 32.56 | 40.95 | 43.56 | 45 | 45 |
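The aggregation described above can be sketched as a terms aggregation with a sum sub-aggregation. The field names, the choice of population as the summed field, and the aggregation names are assumptions for illustration. Disabling the cache is shown here with the `request_cache=false` search parameter, which is one standard way to do it in OpenSearch and may differ from what the workload uses.

```python
import json

def country_agg_body(country_field="country_code", metric_field="population"):
    """Build a terms aggregation with a nested sum (field names assumed)."""
    return {
        "size": 0,  # buckets only, no hits
        "aggs": {
            "by_country": {
                "terms": {"field": country_field},
                "aggs": {
                    "metric_sum": {"sum": {"field": metric_field}},
                },
            }
        },
    }

# Sent as a URL parameter on the search request, not inside the body.
SEARCH_PARAMS = {"request_cache": "false"}

print(json.dumps(country_agg_body(), indent=2))
```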
decay_geo_gauss_function_score - Geospatial distance decay scoring
This test runs a function score query using Gaussian distance decay on the geo_point field. It evaluates both geospatial index query efficiency and scoring performance.
Geonames decay_geo_gauss_function_score (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 5.78 | 11.32 | 19 | 21 | 21.38 | 21.34 |
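To illustrate why this query is comparatively expensive, the sketch below computes a Gaussian decay score per document, using the formula documented for the gauss decay function in OpenSearch function_score queries. It assumes PolarSearch follows the same definition; the scale, decay, and distance values are made up for the example.

```python
import math

def gauss_decay(distance, scale, decay=0.5, offset=0.0):
    """Gaussian decay score for a document at `distance` from the origin.

    The score is 1.0 within `offset` of the origin and equals `decay`
    at a distance of `offset + scale`.
    """
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    effective = max(0.0, abs(distance) - offset)
    return math.exp(-(effective ** 2) / (2.0 * sigma_sq))

# Example: scale of 100 km, distances in km.
for km in (0, 50, 100, 200):
    print(f"{km:>3} km -> {gauss_decay(km, scale=100):.3f}")
```

Each matching document needs a distance computation and an exponential evaluation, which is consistent with the much lower ops/s in this table than in the term and phrase tables.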
painless_static - Painless custom scoring
This test uses a Painless script to perform custom score calculations on each document. The script is pre-compiled and executed using static binding. This task measures the computational throughput of the script engine (JIT compilation) and represents a typical scenario for highly customized script scoring.
Geonames painless_static (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 12 | 14 | 21 | 23.51 | 23.71 | 23.74 |
desc_sort_population - Numeric doc value sorting
This test sorts all documents in descending order by the population field (an integer type) and returns the top 10 results. Sorting is performed using doc values, which rely on columnar storage. The test evaluates the efficiency of reading columnar numeric fields and the performance of top-N sorting.
Geonames desc_sort_population (ops/s) | Client 1 | Client 2 | Client 4 | Client 8 | Client 16 | Client 32 |
PolarSearch 3.0 | 316 | 463 | 1047 | 1894 | 2577 | 3593 |