Elasticsearch: Evaluate specifications and storage capacity

Last Updated: Feb 27, 2026

Plan storage capacity, node specifications, and shard layout for an Alibaba Cloud Elasticsearch cluster before purchase or configuration changes.

Quick reference: With one replica per primary shard, provision approximately 3.4x your source data volume in total storage. For example, 100 GiB of source data requires about 340 GiB of cluster storage.

The evaluation methods in this document are based on real-world test results and operational experience. Actual requirements may differ depending on data structure, query complexity, data volume, data changes, and performance goals. Validate estimates with representative workloads before finalizing your configuration.

Storage capacity formula

With the default of one replica shard per primary shard, total cluster storage is roughly 3.4x the source data volume. This multiplier accounts for the following overhead factors:

Factor | Overhead | Description
Replica shards | 2x (with 1 replica) | Each primary shard has at least one replica shard
Indexing overhead | Typically 10% | Space consumed by index structures beyond source data
Internal overhead | 20% reserved | Segment merging, logging, and other internal operations
OS reserved space | 5% reserved | Critical processes, system recovery, and disk fragments
Security threshold | At least 15% reserved | Minimum free space maintained by Elasticsearch

Simplified formula:

Cluster storage = Source data x (1 + Number of replicas) x 1.7
                = Source data x 3.4  (when replicas = 1)

Full formula:

Cluster storage = Source data
                  x (1 + Number of replicas)
                  x Indexing overhead factor
                  / (1 - OS reserved)
                  / (1 - Internal overhead)
                  / (1 - Security threshold)

                = Source data x (1 + Number of replicas) x 1.1 / 0.95 / 0.80 / 0.85
                = Source data x (1 + Number of replicas) x 1.7

Important: The 3.4x multiplier assumes one replica. Adjust the formula with your actual replica count.

Worked example: 200 GiB source data

Scenario: 200 GiB of source data, one replica shard per primary shard.

Cluster storage = 200 GiB x (1 + 1) x 1.7
                = 200 x 2 x 1.7
                = 680 GiB
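
The two formulas above can be sketched in Python as follows. This is a minimal illustration, not an official sizing tool: the function and parameter names are hypothetical, and the default overhead values are the ones listed in the overhead table.

```python
def cluster_storage_full(source_gib, replicas=1,
                         indexing_overhead=0.10, os_reserved=0.05,
                         internal_overhead=0.20, security_threshold=0.15):
    """Estimate cluster storage (GiB) with the full formula.

    All names here are illustrative; defaults come from the overhead table.
    """
    factor = ((1 + indexing_overhead)
              / (1 - os_reserved)
              / (1 - internal_overhead)
              / (1 - security_threshold))
    return source_gib * (1 + replicas) * factor


def cluster_storage_simplified(source_gib, replicas=1):
    """Estimate cluster storage (GiB) with the rounded 1.7 factor."""
    return source_gib * (1 + replicas) * 1.7


if __name__ == "__main__":
    # Worked example from the text: 200 GiB of source data, one replica.
    print(f"simplified: {cluster_storage_simplified(200):.0f} GiB")
    print(f"full:       {cluster_storage_full(200):.0f} GiB")
```

The full formula gives a slightly larger figure than the simplified one because 1.1 / 0.95 / 0.80 / 0.85 is a little above 1.7. Either result is an estimate to validate against a representative workload.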

Storage consumed outside the formula

Beyond the factors in the formula, these items also consume storage:

  • X-Pack monitoring indexes -- Used for exception analysis:

    • .monitoring-es-6-*: Consumes significant storage. Retains the last 7 days by default.

    • .monitoring-kibana-6-*: Grows with the number of indexes. Retains the last 7 days by default.

    • .watcher-history-3-*: Consumes minimal storage. Delete manually when no longer needed.

  • Cluster logs -- Include run logs, access logs, and slow logs. Retained for the last 7 days by default. This retention period cannot be changed. Log volume increases with the number of queries and data pushes the cluster receives.

Node specifications and count

Data nodes

Two rules determine the maximum scale per data node:

  • Maximum nodes per cluster = vCPUs per node x 5

  • Maximum storage per node = Memory per node (GiB) x a scenario-specific multiplier

Scenario | Multiplier | Typical use
General | Memory x 30 | Mixed read/write workloads
Query | Memory x 10 | Acceleration, aggregation
Logging | Memory x 50 | Log import, offline analytics

The following table shows the maximum node count and maximum storage per node for each specification:

Specification | Max nodes | General | Query | Logging
2 vCPUs, 4 GiB | 10 | 120 GiB | 40 GiB | 200 GiB
2 vCPUs, 8 GiB | 10 | 240 GiB | 80 GiB | 400 GiB
4 vCPUs, 16 GiB | 20 | 480 GiB | 160 GiB | 800 GiB
8 vCPUs, 32 GiB | 40 | 960 GiB | 320 GiB | 1.5 TiB
16 vCPUs, 64 GiB | 80 | 1.9 TiB | 640 GiB | 3 TiB

Total cluster storage = Storage per node x Number of nodes
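
The two data-node rules can be sketched as small Python helpers. The function names and the lowercase scenario keys are hypothetical; the multipliers are taken from the scenario table.

```python
# Scenario multipliers from the table above (GiB of storage per GiB of memory).
STORAGE_MULTIPLIER = {"general": 30, "query": 10, "logging": 50}


def max_nodes(vcpus_per_node):
    """Maximum nodes per cluster = vCPUs per node x 5."""
    return vcpus_per_node * 5


def max_storage_per_node_gib(memory_gib, scenario):
    """Maximum storage per node = memory (GiB) x scenario multiplier."""
    return memory_gib * STORAGE_MULTIPLIER[scenario]


def max_cluster_storage_gib(vcpus_per_node, memory_gib, scenario):
    """Upper bound on total cluster storage for one specification."""
    return max_nodes(vcpus_per_node) * max_storage_per_node_gib(memory_gib, scenario)


if __name__ == "__main__":
    # 8 vCPUs, 32 GiB in the logging scenario.
    print(max_nodes(8), max_storage_per_node_gib(32, "logging"))
```

The helper returns exact GiB values, while the table shows rounded TiB figures for the larger specifications (for example, 1,600 GiB appears as 1.5 TiB).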

Select node specifications based on the maximum storage per node and the maximum number of nodes for your target specification.

  • The number of data nodes affects the total shard count. Complete the shard evaluation below before finalizing node specifications.

  • For aggregation-heavy queries, select specifications with a 1:2 vCPU-to-memory ratio and enable client nodes.

Worked example: 2 TiB log data

Scenario: 2 TiB of log data, logging use case, one replica.

  1. Calculate required storage: 2 TiB x 3.4 = 6.8 TiB

  2. Select node specification: 8 vCPUs, 32 GiB (logging max: 1.5 TiB per node)

  3. Calculate node count: 6.8 TiB / 1.5 TiB ≈ 4.5, rounded up to 5 nodes

  4. Verify node limit: 5 nodes < 40 max nodes. Valid.
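
The four steps above can be expressed as a short Python sketch. The helper name is hypothetical, and the per-node limit of 1.5 TiB is the rounded table value for an 8 vCPU, 32 GiB node in the logging scenario.

```python
import math


def plan_data_nodes(source_tib, per_node_limit_tib, vcpus_per_node, replicas=1):
    """Return (required storage, node count, within node limit) for a cluster.

    Uses the simplified storage formula (factor 1.7) and the
    'vCPUs x 5' node limit described in this document.
    """
    required_tib = source_tib * (1 + replicas) * 1.7
    nodes = math.ceil(required_tib / per_node_limit_tib)
    within_limit = nodes <= vcpus_per_node * 5
    return required_tib, nodes, within_limit


if __name__ == "__main__":
    required, nodes, ok = plan_data_nodes(2, 1.5, 8)
    print(f"required: {required:.1f} TiB, nodes: {nodes}, within limit: {ok}")
```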

Dedicated master nodes

Enable dedicated master nodes for clusters with many data nodes to maintain cluster stability. Select the specification based on your data node count:

Data node count | Dedicated master node specification
Default | 2 vCPUs, 8 GiB
More than 10 | 4 vCPUs, 16 GiB
More than 30 | 8 vCPUs, 32 GiB
More than 50 | 16 vCPUs, 64 GiB

If the cluster has many indexes and shards, or data changes frequently, select higher specifications for dedicated master nodes.
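
As a rough illustration, the table can be encoded as a lookup. The function name is hypothetical, and the result is only the baseline from the table; it does not account for the index count, shard count, or change frequency mentioned above.

```python
def master_node_spec(data_node_count):
    """Return the baseline (vCPUs, memory GiB) for dedicated master nodes."""
    if data_node_count > 50:
        return (16, 64)
    if data_node_count > 30:
        return (8, 32)
    if data_node_count > 10:
        return (4, 16)
    return (2, 8)  # default specification


if __name__ == "__main__":
    for count in (5, 12, 40, 60):
        vcpus, memory = master_node_spec(count)
        print(f"{count} data nodes -> {vcpus} vCPUs, {memory} GiB")
```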

Client nodes

Client nodes (called coordinating nodes in Elasticsearch) handle the reduce phase of distributed queries. Running them as dedicated nodes keeps the garbage collection (GC) impact of that phase away from data nodes.

Guideline | Value
Client-to-data node ratio | 1:5
Client node vCPU-to-memory ratio | 1:4 or 1:8
Minimum client nodes | 2

Example: For 10 data nodes at 8 vCPUs, 32 GiB each, configure 2 client nodes at 8 vCPUs, 32 GiB each.
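
A minimal sketch of the client node guideline follows. The function name is hypothetical, and rounding up when the data node count is not a multiple of 5 is an assumption; the guideline only states the 1:5 ratio and the minimum of 2.

```python
import math


def client_node_count(data_nodes, data_nodes_per_client=5, minimum=2):
    """Client nodes at a 1:5 ratio, never fewer than the minimum.

    Rounding up is an assumption, not stated in the guideline.
    """
    return max(minimum, math.ceil(data_nodes / data_nodes_per_client))


if __name__ == "__main__":
    # Example from the text: 10 data nodes.
    print(client_node_count(10))
```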

Shard evaluation

Shards are the basic storage units of Elasticsearch indexes, classified into primary shards and replica shards. For more information, see Shard and replica shard.

Proper shard planning prevents performance degradation, uneven disk usage, and imbalanced CPU loads across nodes. Plan shards based on data volume per index, expected data growth, node specifications, and whether temporary indexes need regular deletion or merging.

Shard size guidelines

Scenario | Maximum shard size
General workloads | 30 GiB (up to 50 GiB in special cases)
Log analytics or very large indexes | 100 GiB
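
One way to turn these limits into a starting primary shard count is to divide the expected index size by the maximum shard size and round up. This is an assumption for illustration rather than a rule stated in this document, and the names below are hypothetical.

```python
import math

# Maximum shard sizes (GiB) from the table above.
MAX_SHARD_GIB = {"general": 30, "logging": 100}


def min_primary_shards(index_gib, scenario="general"):
    """Smallest primary shard count that keeps each shard within the limit."""
    return max(1, math.ceil(index_gib / MAX_SHARD_GIB[scenario]))


if __name__ == "__main__":
    print(min_primary_shards(200))             # general workload
    print(min_primary_shards(500, "logging"))  # log analytics
```

Size the index for its expected growth, not only its current volume, because the primary shard count of an existing index cannot be changed without reindexing.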

Number of shards per index

Determine the shard count based on data volume:

  • Large data volume, high write throughput: Configure multiple primary shards per index with one replica per primary shard.

  • Small data volume, low write throughput: Configure one primary shard per index with one or more replica shards.

Default shard configuration varies by version:

  • V7.X and later: 1 primary shard, 1 replica shard per index

  • Earlier than V7.X: 5 primary shards, 1 replica shard per index

Load balancing with small indexes: If the data volume per index is less than 30 GiB, use one primary shard with multiple replicas to distribute load across nodes. For example, a 20 GiB index on a 5-node cluster can use 1 primary shard and 4 replica shards.

Shard distribution guidelines

  • Keep the total shard count equal to the data node count, or an integer multiple of it.

  • Place a maximum of 5 shards per index on a single node.
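
The two distribution guidelines can be combined in a small search: starting from a minimum primary shard count, increase it until the total shard count (primaries plus replicas) is a multiple of the data node count, without exceeding 5 shards of the index per node. The function below is a hypothetical sketch of that idea and assumes shards are spread evenly across nodes.

```python
def aligned_primary_shards(min_primaries, replicas, data_nodes, max_per_node=5):
    """Return the smallest primary count >= min_primaries whose total shard
    count is a multiple of data_nodes, or None if the per-node limit for a
    single index would be exceeded first.
    """
    copies = 1 + replicas                  # each primary plus its replicas
    limit = data_nodes * max_per_node      # most shards one index may have
    primaries = min_primaries
    while primaries * copies <= limit:
        if (primaries * copies) % data_nodes == 0:
            return primaries
        primaries += 1
    return None


if __name__ == "__main__":
    # 5 data nodes, 1 replica, at least 7 primary shards.
    print(aligned_primary_shards(7, 1, 5))
    # Small-index example from the text: 1 primary, 4 replicas, 5 nodes.
    print(aligned_primary_shards(1, 4, 5))
```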

Total shards per node

Calculate the maximum number of shards a single data node can hold:

Cluster size | Formula
Small specifications (or data volume < 1 TiB) | Shards per node = Memory (GiB) x 30
Large specifications | Shards per node = Memory (GiB) x 50

  • The default maximum shard count per node in V7.X clusters is 1,000. Do not change this limit. If more shards are needed, add more nodes instead.

  • Excessive shards increase performance overhead and may exhaust file handles, leading to cluster faults. Configure shards based on actual business requirements.
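
A hedged sketch of the per-node shard ceiling follows. Capping the formula result at 1,000 combines the table with the V7.X default limit noted above; that combination is an assumption for illustration, and the names are hypothetical.

```python
def max_shards_per_node(memory_gib, large_spec=False, hard_limit=1000):
    """Shards per node = memory x 30 (small) or x 50 (large), capped at the
    V7.X default per-node limit of 1,000.
    """
    multiplier = 50 if large_spec else 30
    return min(memory_gib * multiplier, hard_limit)


def data_nodes_for_shards(total_shards, memory_gib, large_spec=False):
    """Minimum data nodes needed to hold total_shards under that ceiling."""
    per_node = max_shards_per_node(memory_gib, large_spec)
    return -(-total_shards // per_node)  # ceiling division


if __name__ == "__main__":
    print(max_shards_per_node(4))                     # small specification
    print(max_shards_per_node(32, large_spec=True))   # capped by the limit
    print(data_nodes_for_shards(2500, 32, large_spec=True))
```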

For more guidance, see How to size your shards.

Sizing and maintenance best practices

  • Start with estimates, then iterate. The formulas in this document provide initial sizing estimates. Validate with representative workloads and adjust as needed.

  • Delete outdated indexes. If Auto Indexing is enabled, use index lifecycle management (ILM) or an Elasticsearch API script to remove outdated indexes. For details, see Use ILM to manage Heartbeat indexes.

  • Free heap memory. Delete small or unused indexes promptly to free heap memory.

  • Monitor shard health. If the data volume per shard on an existing index exceeds the recommended limit, reindex the data. For details, see Use the reindex API to migrate data. Data reindexing maintains service continuity but is time-consuming.
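
For the index deletion practice above, an ILM policy with a delete phase is one option. The sketch below only builds the request body for the Elasticsearch `PUT _ilm/policy/<policy_name>` API; the policy name and the 30-day retention are hypothetical values, and sending the request (and attaching the policy to indexes) is left out.

```python
import json


def delete_after_policy(min_age="30d"):
    """Build an ILM policy body that deletes an index once it reaches min_age.

    The retention value is an illustrative assumption; choose one that
    matches your own requirements.
    """
    return {
        "policy": {
            "phases": {
                "delete": {
                    "min_age": min_age,
                    "actions": {"delete": {}},
                }
            }
        }
    }


if __name__ == "__main__":
    # Body for: PUT _ilm/policy/my_cleanup_policy  (hypothetical policy name)
    print(json.dumps(delete_after_policy("30d"), indent=2))
```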
