Elasticsearch: Data disaster recovery

Last Updated: Mar 31, 2026

Solution options

Elasticsearch (ES) offers the following remote disaster recovery solutions:

  • OSS snapshot backup and restoration: Back up index data to Object Storage Service (OSS) for persistent storage. The first snapshot is a full backup, and subsequent snapshots are incremental backups. You can use a cross-cluster OSS repository to restore snapshot data to a target ES instance. For more information, see Back up and restore data by using a cross-cluster OSS repository.

  • Logstash: You can configure a pipeline to read data from a source cluster, process it, and then write it to a target cluster. This approach is ideal for migrating data between major versions or when data filtering and transformation are required. For more information, see Quick start.

  • Reindex: The built-in ES Reindex API lets you copy all or a subset of data from one index to another, including across clusters. This is ideal for one-time migrations of small datasets. For more information, see Migrate data by using the Reindex API.

  • Cross-Cluster Replication (CCR): CCR asynchronously and incrementally replicates indexes from a leader cluster, which accepts writes, to one or more follower clusters. It supports near-real-time synchronization, making it suitable for disaster recovery scenarios with strict recovery point objective (RPO) and recovery time objective (RTO) requirements. For more information, see Replicate data across clusters by using CCR.
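The three API-driven options above can be illustrated with a minimal Python sketch that builds, but does not send, the corresponding REST requests. The paths and body fields follow the open-source Elasticsearch REST API. The repository, snapshot, index, host, and cluster alias names are hypothetical placeholders, and the linked topics remain the authoritative procedure for an ES instance.

```python
import json


def snapshot_request(repository, snapshot, indices):
    """Build a create-snapshot request for an already registered repository."""
    body = {"indices": ",".join(indices), "include_global_state": False}
    return ("PUT", f"/_snapshot/{repository}/{snapshot}", body)


def reindex_from_remote_request(remote_host, source_index, dest_index):
    """Build a Reindex request that pulls one index from a remote cluster.

    Runs on the target cluster; the remote host must be allowed by the
    target's reindex.remote.whitelist setting.
    """
    body = {
        "source": {"remote": {"host": remote_host}, "index": source_index},
        "dest": {"index": dest_index},
    }
    return ("POST", "/_reindex", body)


def ccr_follow_request(follower_index, remote_cluster, leader_index):
    """Build a request that creates a follower index on the follower cluster.

    remote_cluster is the alias under which the leader cluster was
    registered as a remote cluster on the follower cluster.
    """
    body = {"remote_cluster": remote_cluster, "leader_index": leader_index}
    return ("PUT", f"/{follower_index}/_ccr/follow", body)


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    requests_to_show = [
        snapshot_request("oss_repo", "snapshot_1", ["orders", "users"]),
        reindex_from_remote_request("http://source-es:9200", "orders", "orders"),
        ccr_follow_request("orders_follower", "leader_cluster", "orders"),
    ]
    for method, path, body in requests_to_show:
        print(method, path)
        print(json.dumps(body, indent=2))
```

Each function returns a `(method, path, body)` tuple, so the requests can be reviewed or pasted into a console before being sent with an HTTP client of your choice.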

Solution comparison

| Solution | Use cases | RPO | RTO | Limitations |
| --- | --- | --- | --- | --- |
| OSS snapshot | Periodic backup and recovery of large-scale data (from gigabytes to petabytes). | Hours to days (depending on the snapshot interval). | Several hours (depending on data volume and shard recovery time). | Does not support continuous synchronization. Service may need to be stopped during recovery. |
| Logstash | Data migration with low real-time requirements, for data that needs filtering and transformation, or for migration between major versions. | Seconds to minutes (depending on synchronization frequency). | Several hours (depending on data volume and instance performance). | Batch synchronization only; not real-time. Does not support synchronizing delete operations. |
| Reindex | One-time index migration for small datasets. | Not applicable (one-time operation). | Minutes to hours (depending on data volume). | Does not support continuous synchronization. Inefficient for large-scale data migrations. |
| CCR | Remote disaster recovery, read/write splitting, and geo-proximity access. | Near-zero (seconds). | Seconds to minutes. | Follower indexes are read-only. Requires identical mapping and shard counts. |
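To make the RPO column easier to reason about, the following Python sketch encodes the ranges from the comparison as coarse time units and filters the solutions by a target RPO. It is an illustration only: the unit ordering and the choice to judge each solution by its worst-case RPO are assumptions of this sketch, not part of the product.

```python
# Coarse time units, ordered from tightest to loosest.
UNITS = ["seconds", "minutes", "hours", "days"]

# (best case, worst case) RPO taken from the comparison above.
# None means RPO is not applicable (Reindex is a one-time operation).
RPO = {
    "OSS snapshot": ("hours", "days"),
    "Logstash": ("seconds", "minutes"),
    "Reindex": None,
    "CCR": ("seconds", "seconds"),
}


def meets_rpo(solution, target_unit):
    """Return True if the solution's worst-case RPO is within target_unit."""
    rpo = RPO[solution]
    if rpo is None:
        return False  # a one-time copy gives no ongoing RPO guarantee
    worst_case = rpo[1]
    return UNITS.index(worst_case) <= UNITS.index(target_unit)


def candidates(target_unit):
    """List the solutions whose worst-case RPO fits the target."""
    return [name for name in RPO if meets_rpo(name, target_unit)]


if __name__ == "__main__":
    print(candidates("seconds"))  # ['CCR']
    print(candidates("minutes"))  # ['Logstash', 'CCR']
    print(candidates("days"))     # ['OSS snapshot', 'Logstash', 'CCR']
```

RPO is only one dimension. The RTO and Limitations columns, such as Logstash not synchronizing delete operations, still need to be weighed before choosing a solution.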

For remote disaster recovery scenarios with strict RPO and real-time requirements, CCR is the best choice for the following reasons:

  • CCR synchronizes data within seconds, minimizing data loss.

  • If the leader cluster fails, you can fail over to a follower cluster to restore service without the delay of a snapshot recovery.

  • Although the initial deployment cost is higher, CCR can be more cost-effective in the long run because it reduces the business losses caused by data loss.
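Because follower indexes are read-only, failing over usually means converting a follower index into a regular, writable index before redirecting writes to it. A minimal sketch of that sequence, assuming the open-source Elasticsearch CCR and index APIs (pause following, close, unfollow, reopen), is shown below. The index name is a hypothetical placeholder, the requests are only built and not sent, and redirecting client traffic to the follower cluster is outside the scope of this sketch.

```python
def promote_follower_requests(follower_index):
    """Return the ordered REST calls that turn a follower into a regular index.

    Run these against the follower cluster after the leader cluster fails.
    The order matters: unfollow only works on a paused, closed follower index.
    """
    return [
        ("POST", f"/{follower_index}/_ccr/pause_follow"),  # stop replication
        ("POST", f"/{follower_index}/_close"),             # close the index
        ("POST", f"/{follower_index}/_ccr/unfollow"),      # drop follower role
        ("POST", f"/{follower_index}/_open"),              # reopen as writable
    ]


if __name__ == "__main__":
    # "orders_follower" is a hypothetical index name.
    for method, path in promote_follower_requests("orders_follower"):
        print(method, path)
```

Unfollowing is a one-way conversion: to resume replication later, the follower index has to be created again with a new follow request.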