The faster-bulk plug-in, developed by the Alibaba Cloud Elasticsearch team, aggregates bulk write requests in batches before writing them to shards. By combining small requests based on a configurable size threshold and time interval, the plug-in reduces write queue blocking and lowers request rejection rates. In the benchmark in this topic, write throughput improved by 23% with the default synchronous translog and by 10% with an asynchronous translog.
Because requests wait in the aggregation buffer before they are written to shards, the faster-bulk plug-in adds write latency. Do not use it in latency-sensitive scenarios.
Use cases
The faster-bulk plug-in is suited for:
High write throughput: Clusters receiving a continuous stream of bulk write requests
High shard count: Indexes with many shards where small requests cause write queue congestion
How it works
Incoming bulk write requests are held in an aggregation buffer on each data node.
When the buffer reaches the configured size threshold (flush_threshold_size) or the aggregation interval (combine.interval) expires, the node flushes the buffer to the target shards.
By batching small requests into larger writes, the plug-in reduces the per-request overhead on the write queue.
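The buffering behavior described above can be modeled with a small simulation. This is a minimal Python sketch, not the plug-in's implementation: the class AggregationBuffer, its method names, and the assumption that the interval timer starts when the first request enters an empty buffer are all illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AggregationBuffer:
    """Toy model of a per-node buffer with a size threshold and a time interval."""

    flush_threshold_bytes: int = 1024 * 1024  # stands in for flush_threshold_size
    interval_ms: int = 50                     # stands in for combine.interval
    pending: List[int] = field(default_factory=list)
    pending_bytes: int = 0
    first_arrival_ms: Optional[int] = None    # assumption: timer starts at first request
    flushes: List[Tuple[str, int, int]] = field(default_factory=list)

    def add(self, now_ms: int, size_bytes: int) -> None:
        """Buffer one bulk request; flush if either threshold is reached."""
        self.tick(now_ms)
        if self.first_arrival_ms is None:
            self.first_arrival_ms = now_ms
        self.pending.append(size_bytes)
        self.pending_bytes += size_bytes
        if self.pending_bytes >= self.flush_threshold_bytes:
            self._flush("size")

    def tick(self, now_ms: int) -> None:
        """Flush a non-empty buffer whose interval has expired."""
        if (self.first_arrival_ms is not None
                and now_ms - self.first_arrival_ms >= self.interval_ms):
            self._flush("interval")

    def _flush(self, reason: str) -> None:
        # Record (reason, number of combined requests, combined bytes).
        self.flushes.append((reason, len(self.pending), self.pending_bytes))
        self.pending = []
        self.pending_bytes = 0
        self.first_arrival_ms = None


buf = AggregationBuffer(flush_threshold_bytes=1000, interval_ms=50)
buf.add(0, 400)
buf.add(10, 400)
buf.add(20, 400)   # 1200 bytes buffered: the size threshold triggers a flush
buf.add(30, 100)
buf.tick(80)       # 50 ms after the request at t=30: the interval triggers a flush
print(buf.flushes)
```

In this model three small requests leave the buffer as one combined write, which is the effect the plug-in relies on to reduce write queue pressure.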
Performance benchmark
The following benchmark was run with the Rally nyc_taxis dataset (650 bytes per document) on a cluster with 3 data nodes and 2 independent client nodes (16 vCPUs and 64 GiB of memory each). The apack.fasterbulk.combine.interval was set to 200 ms.
| Translog status | Without faster-bulk (doc/s) | With faster-bulk (doc/s) | Improvement |
|---|---|---|---|
| Synchronous (default) | 182,314 | 226,242 | 23% |
| Asynchronous | 218,732 | 241,060 | 10% |
Prerequisites
Before you begin, make sure you have:
An Alibaba Cloud Elasticsearch V6.7.0 or V7.10.0 cluster (Standard or Advanced Edition). See Create an Alibaba Cloud Elasticsearch cluster.
The faster-bulk plug-in installed from the Built-in Plug-ins tab. See Install and remove a built-in plug-in.
After installation, the bulk request aggregation feature is disabled by default. Enable it before use.
Enable bulk request aggregation
Log on to the Kibana console of your Elasticsearch cluster. See Log on to the Kibana console.
In the left-side navigation pane, click Dev Tools.
On the Console tab, run the following command:
PUT _cluster/settings
{
  "transient": {
    "apack.fasterbulk.combine.enabled": "true"
  }
}
You can also run this command using the cURL tool or a third-party visualizer.
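As an alternative to Kibana or cURL, the same request can be built from a script. The following is a minimal sketch using only the Python standard library. The helper name build_enable_request and the endpoint http://localhost:9200 are placeholders, not values from this topic, and a real cluster also requires authentication, which is omitted here.

```python
import json
import urllib.request


def build_enable_request(base_url: str, enabled: bool = True) -> urllib.request.Request:
    """Build (but do not send) the PUT _cluster/settings request shown above."""
    body = {
        "transient": {
            "apack.fasterbulk.combine.enabled": "true" if enabled else "false"
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical endpoint; replace it with the address of your cluster.
req = build_enable_request("http://localhost:9200")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# urllib.request.urlopen(req)  # uncomment to send the request
```

Passing enabled=False produces the body used to disable the feature.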
Configure the request size and aggregation interval
Run the following command to set the maximum request size and aggregation interval. The plug-in flushes the buffer to shards when either threshold is reached on a data node.
PUT _cluster/settings
{
  "transient": {
    "apack.fasterbulk.combine.flush_threshold_size": "1mb",
    "apack.fasterbulk.combine.interval": "50"
  }
}
| Parameter | Description | Default |
|---|---|---|
| apack.fasterbulk.combine.flush_threshold_size | Maximum size of the aggregated bulk request buffer | 1mb |
| apack.fasterbulk.combine.interval | Maximum time to hold requests in the buffer before flushing | 50 (ms) |
To process highly concurrent bulk requests and prevent the requests from blocking the write queue, you can increase the maximum request size or aggregation interval based on your business requirements.
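When tuning the two values, it can help to estimate which threshold is reached first for a given write load. The following Python sketch is a back-of-the-envelope calculation, not plug-in behavior taken from this topic. It assumes a steady incoming byte rate on one data node, that 1mb means 1024 × 1024 bytes as in standard Elasticsearch byte-size units, and that the interval is measured from the moment the buffer starts to fill. The function names are illustrative.

```python
MIB = 1024 * 1024  # assumption: "1mb" is 1024 * 1024 bytes


def time_to_fill_ms(threshold_bytes: int, incoming_bytes_per_s: float) -> float:
    """Milliseconds needed to fill the buffer at a steady incoming rate."""
    return threshold_bytes * 1000.0 / incoming_bytes_per_s


def first_trigger(threshold_bytes: int, interval_ms: float,
                  incoming_bytes_per_s: float) -> str:
    """Return which condition flushes the buffer first: 'size' or 'interval'."""
    if time_to_fill_ms(threshold_bytes, incoming_bytes_per_s) <= interval_ms:
        return "size"
    return "interval"


# At 10 MiB/s a 1 MiB buffer needs 100 ms to fill, so a 50 ms interval expires first.
print(time_to_fill_ms(MIB, 10 * MIB), first_trigger(MIB, 50, 10 * MIB))
# Raising the interval to 200 ms lets the buffer fill before the timer expires.
print(first_trigger(MIB, 200, 10 * MIB))
# With 650-byte documents, a 1 MiB buffer holds roughly this many documents.
print(MIB // 650)
```

If the interval always expires first, flushed batches stay smaller than the size threshold, so increasing only flush_threshold_size would have no effect under these assumptions.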
Enable directed routing
If documents in a bulk write request have no custom routing value and no primary key (_id), enable directed routing to route documents directly to their target shards, improving write speed.
Cluster level (applies to all indexes):
PUT _cluster/settings
{
  "persistent": {
    "index.direct_routing.global.enable": "true"
  }
}
Index level (applies to a specific index; replace index with the name of your index):
PUT index/_settings
{
  "index.direct_routing.enable": "true"
}
If directed routing is enabled and documents already have both a custom routing value and a primary key configured, directed routing does not take effect and existing write operations are not affected.
Disable bulk request aggregation
Run the following command to disable the feature:
PUT _cluster/settings
{
  "transient": {
    "apack.fasterbulk.combine.enabled": "false"
  }
}