High availability

Alibaba Cloud Elasticsearch provides data backup and restoration, load balancing, and cross-zone deployment. It also applies kernel optimizations that improve cluster stability. Together, these features help ensure data reliability and service availability.

Data backup and restoration

Backup and restoration mode	Description
Automatic snapshot creation and data restoration from automatic snapshots	Alibaba Cloud Elasticsearch can create snapshots automatically. You specify the time of day at which the daily snapshot is created. You can then restore data to the original Elasticsearch cluster from any automatic snapshot that was created within the last three days. For more information, see Create automatic snapshots and restore data from automatic snapshots.
Manual snapshot creation and data restoration from manual snapshots	Alibaba Cloud Elasticsearch allows you to manually run a command to create a snapshot for a specific index. Then, you can save the snapshot in an Object Storage Service (OSS) bucket in the same region as your Elasticsearch cluster. After the snapshot is created, you can manually run a command to restore the data in the snapshot to the original Elasticsearch cluster or an Elasticsearch cluster that is in the same region as the original Elasticsearch cluster. For more information, see Create manual snapshots and restore data from manual snapshots.
Shared OSS repository	Alibaba Cloud Elasticsearch allows you to configure shared OSS repositories for your Elasticsearch cluster. You can then restore data to your cluster from the automatic snapshots of another Elasticsearch cluster that are stored in these repositories. For more information, see Configure a shared OSS repository.
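
As a rough illustration of the manual snapshot flow described in the table, the following Python sketch builds (but does not send) the two HTTP requests of the standard Elasticsearch snapshot API: one that creates a snapshot of a single index and one that restores it. The endpoint, the repository name my_backup, and the snapshot name snapshot_1 are hypothetical placeholders. The sketch assumes that an OSS repository with that name is already registered on the cluster, and it omits authentication.

```python
import json
from urllib.request import Request

# Hypothetical values for illustration only; replace them with your own.
ENDPOINT = "http://es-cn-example.elasticsearch.aliyuncs.com:9200"
REPOSITORY = "my_backup"   # assumed name of an already registered repository
SNAPSHOT = "snapshot_1"    # assumed snapshot name


def build_snapshot_request(index: str) -> Request:
    """Build a PUT request that creates a snapshot of one index."""
    body = {"indices": index, "include_global_state": False}
    return Request(
        url=f"{ENDPOINT}/_snapshot/{REPOSITORY}/{SNAPSHOT}?wait_for_completion=true",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_restore_request(index: str) -> Request:
    """Build a POST request that restores one index from the snapshot."""
    body = {"indices": index}
    return Request(
        url=f"{ENDPOINT}/_snapshot/{REPOSITORY}/{SNAPSHOT}/_restore",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    for req in (build_snapshot_request("my_index"), build_restore_request("my_index")):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

To send a request, you would pass it to urllib.request.urlopen after adding the credentials of your cluster. An index that is still open in the target cluster generally has to be closed or deleted before a restore can overwrite it.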

Load balancing

Alibaba Cloud Elasticsearch supports load balancing. You can specify the public or internal endpoint of your Elasticsearch cluster in your application. Requests that are sent to the endpoint are evenly distributed across all data nodes in the cluster.

Important: Load balancing among these data nodes depends on the number and size of index shards. When you create an index, you must set the number and size of index shards to appropriate values. For more information, see Shard evaluation.
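
To see why the shard count matters for load balancing, the following Python sketch computes an idealized spread of shard copies across data nodes. It is a simplified model and not the allocation algorithm that Elasticsearch uses: it assumes that all shards are the same size and that copies are spread as evenly as possible. The function name shards_per_node is hypothetical.

```python
from typing import List


def shards_per_node(primaries: int, replicas: int, data_nodes: int) -> List[int]:
    """Return an idealized count of shard copies on each data node.

    Total copies = primaries * (1 + replicas), spread as evenly as possible.
    """
    if data_nodes <= 0:
        raise ValueError("data_nodes must be positive")
    total = primaries * (1 + replicas)
    base, extra = divmod(total, data_nodes)
    # The first `extra` nodes each hold one more copy than the others.
    return [base + 1 if i < extra else base for i in range(data_nodes)]


if __name__ == "__main__":
    # 5 primaries with 1 replica each on 3 nodes: one node carries an extra copy.
    print(shards_per_node(5, 1, 3))
    # 3 primaries with 1 replica each on 3 nodes: every node carries the same load.
    print(shards_per_node(3, 1, 3))
```

In this model, a total number of shard copies that is a multiple of the number of data nodes gives each node the same share of the index.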

Cross-zone deployment

Alibaba Cloud Elasticsearch allows you to deploy an Elasticsearch cluster across zones. In cross-zone deployment, the system automatically selects the zones. If replica shards are configured and the nodes in one zone fail, the nodes in the remaining zones continue to provide services without interruption. This significantly enhances the availability of the cluster.

In addition, you can perform a switchover in the Elasticsearch console to isolate the faulty nodes. During the switchover, the system adds computing resources to the remaining zones to make up for the resources lost in the zone that contains the faulty nodes.

After the nodes recover, you can perform a recovery for the zone in the Elasticsearch console. During the recovery, the system adds the nodes that were removed during the switchover back to the zone and removes the computing resources that were added to the remaining zones during the switchover.

The switchover and recovery are transparent to your services and improve service stability. For more information, see Deploy and use a multi-zone Elasticsearch cluster.
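
The role of replica shards in surviving a zone failure can be illustrated with a toy model in Python. This is a sketch under stated assumptions and not how the service places shards: the zone names are hypothetical, and the copies of each shard are simply placed round-robin across zones so that they land in different zones when enough zones exist.

```python
from typing import Dict, List


def place_copies(num_shards: int, replicas: int, zones: List[str]) -> Dict[int, List[str]]:
    """Place the primary and replicas of each shard round-robin across zones."""
    placement = {}
    for shard in range(num_shards):
        copies = 1 + replicas
        placement[shard] = [zones[(shard + c) % len(zones)] for c in range(copies)]
    return placement


def survives_zone_failure(placement: Dict[int, List[str]], failed_zone: str) -> bool:
    """Return True if every shard still has a copy outside the failed zone."""
    return all(
        any(zone != failed_zone for zone in shard_zones)
        for shard_zones in placement.values()
    )


if __name__ == "__main__":
    zones = ["zone-a", "zone-b", "zone-c"]  # hypothetical zone names
    with_replica = place_copies(num_shards=3, replicas=1, zones=zones)
    without_replica = place_copies(num_shards=3, replicas=0, zones=zones)
    print(survives_zone_failure(with_replica, "zone-a"))
    print(survives_zone_failure(without_replica, "zone-a"))
```

In this model, an index with at least one replica per shard keeps a full copy of its data after any single zone fails, whereas an index without replicas loses the shards that were located in the failed zone.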

AliES enhancements

The Alibaba Cloud Elasticsearch team continuously develops and optimizes the Elasticsearch kernel to improve cluster stability and availability. The following table describes the features that are provided by the optimized kernel.

Feature	Description
Pruning for time series indexes	When you query data from a time series index, you can specify a time range to filter the data. This feature improves the query performance of time series indexes.
Slow query isolation	This feature tracks the resource overhead of a single query request and logically isolates the request. This reduces the impact of anomalous queries on cluster stability.
gig plug-in	When an exception occurs in a cluster, the gig plug-in can perform a switchover within seconds. This prevents query jitter caused by anomalous nodes.
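
The time range filter mentioned for time series indexes can be expressed with a standard Elasticsearch range query. The following Python sketch only builds the request body. The field name @timestamp and the helper name time_range_query are assumptions for illustration, and the source does not state which query forms trigger pruning in the optimized kernel.

```python
import json
from typing import Any, Dict


def time_range_query(field: str, gte: str, lt: str) -> Dict[str, Any]:
    """Build a search body that filters documents to the range [gte, lt)."""
    return {
        "query": {
            "bool": {
                # A filter clause restricts results without affecting scoring.
                "filter": [{"range": {field: {"gte": gte, "lt": lt}}}]
            }
        }
    }


if __name__ == "__main__":
    # Select documents from the last hour; "@timestamp" is an assumed field name.
    body = time_range_query("@timestamp", "now-1h", "now")
    print(json.dumps(body, indent=2))
```

The resulting body can be sent to the _search endpoint of a time series index.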

For information about other features provided by the optimized kernel, see AliES release notes.