You can deploy an Alibaba Cloud Elasticsearch cluster across zones to ensure the high availability of the cluster.

Overview

Alibaba Cloud Elasticsearch allows you to deploy an Elasticsearch cluster across zones. In a cross-zone deployment, the system automatically selects zones that have sufficient Elastic Compute Service (ECS) instances. If replica shards are configured and nodes in one zone fail, the nodes in the remaining zones can still provide services without interruption, which improves the availability of the cluster. In addition, you can perform a switchover in the console to isolate the faulty nodes. The system then adds computing resources to the remaining zones to make up for the resources lost in the zone that contains the faulty nodes.
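
To illustrate why replica shards matter in a cross-zone deployment, the following minimal Python sketch models where the copies of each shard are located and checks which shards remain available after one zone fails. The zone names, the placement mappings, and the shards_still_available helper are hypothetical. The sketch assumes that the copies of each shard are located in different zones, and it does not call any Elasticsearch API.

```python
from typing import Dict, List, Set


def shards_still_available(placement: Dict[str, List[str]], failed_zone: str) -> Set[str]:
    """Return the shards that keep at least one copy outside the failed zone.

    `placement` maps a shard name to the zones that hold its copies
    (primary first, then replicas). This is an illustrative model only.
    """
    return {
        shard
        for shard, zones in placement.items()
        if any(zone != failed_zone for zone in zones)
    }


# Hypothetical two-zone layout: each shard has one primary and one replica,
# and the two copies are in different zones.
with_replicas = {
    "shard-0": ["zone-a", "zone-b"],
    "shard-1": ["zone-b", "zone-a"],
}

# Same layout without replicas: each shard exists in one zone only.
without_replicas = {
    "shard-0": ["zone-a"],
    "shard-1": ["zone-b"],
}

print(sorted(shards_still_available(with_replicas, "zone-a")))     # ['shard-0', 'shard-1']
print(sorted(shards_still_available(without_replicas, "zone-a")))  # ['shard-1']
```

In this model, the layout without replicas loses shard-0 when zone-a fails, whereas the layout with one replica per shard keeps every shard available.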

Scenarios

You can deploy an Alibaba Cloud Elasticsearch cluster by using one of the following methods:
  • In one zone: This deployment method is the default method. It is typically used to handle non-critical workloads.
  • Across two zones: This deployment method implements cross-zone disaster recovery. It is typically used to handle production workloads.
  • Across three zones: This deployment method implements high availability. We recommend that you use this deployment method to handle production workloads that have high requirements for service availability.

Implementation mechanism

Alibaba Cloud Elasticsearch provides high availability by using the following methods:

  • Zones

    If you want to deploy an Elasticsearch cluster across zones, you do not need to specify each zone. The system selects the zones.

  • Nodes
    • You must purchase three dedicated master nodes. If you deploy an Elasticsearch cluster across two zones, the system places the dedicated master nodes in one of the following ways:
      • If the current region has at least three zones and all these zones have sufficient ECS instances, the dedicated master nodes are deployed across these zones. This ensures that your Elasticsearch cluster can still elect a master node if nodes in one zone fail.
      • If the current region has only two zones, or only two zones in the region have sufficient ECS instances, the dedicated master nodes are deployed in the two zones. If nodes in the zone that contains only one dedicated master node fail, your Elasticsearch cluster can still elect a master node. If nodes in the zone that contains two dedicated master nodes fail, you must perform a switchover in the console.
    • The number of data nodes, the number of warm nodes, and the number of client nodes must each be a multiple of the number of zones. For more information about zones, see Regions and zones.
  • Replica shards of indexes
    • If your Elasticsearch cluster is deployed across two zones and nodes in one zone fail, the nodes in the remaining zone must continue to provide services. To make this possible, you must configure at least one replica shard for each index.
      Note: By default, each index is configured with five primary shards and one replica shard for each primary shard. If you do not have specific requirements on read performance, use the default settings.
    • If your Elasticsearch cluster is deployed across three zones and nodes in one or two of the zones fail, the nodes in the remaining zones must continue to provide services. To make this possible, you must configure at least two replica shards for each index.
      Note: By default, each index is configured with five primary shards and one replica shard for each primary shard. To meet the two-replica requirement, you must modify the index template to change the default number of replica shards. For more information, see Index templates.
  • Switchover and recovery
    • If nodes in a zone fail, you can perform a switchover for the zone in the Elasticsearch console. The state of the zone changes from Enabled to Disabled, and the nodes in the zone are removed from your Elasticsearch cluster. Client traffic is then routed only to the remaining zones that are in the Enabled state. To ensure that your Elasticsearch cluster has sufficient computing resources and that read and write operations on indexes are not affected, Elasticsearch adds nodes to the remaining zones that are in the Enabled state. These nodes include dedicated master nodes, client nodes, and data nodes. To ensure normal read and write operations after the switchover, we recommend that you configure replica shards for indexes. For more information, see Perform a switchover.
    • After the nodes recover, you can perform a recovery for the zone in the Elasticsearch console. After the recovery, the state of the zone changes from Disabled to Enabled, and client traffic is routed to all zones that are in the Enabled state. Elasticsearch adds the nodes that were removed during the switchover back to the zone. It then removes the nodes that were added to the remaining zones during the switchover. Before Elasticsearch removes a data node, it migrates the data on that node to other data nodes. For more information, see Perform a recovery.
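
The node rules described in this section can be sketched in Python. This is a minimal model under stated assumptions, not the implementation that Alibaba Cloud uses: the function names and zone labels are hypothetical, and the election check assumes the standard Elasticsearch majority rule, under which more than half of the dedicated master nodes must be reachable to elect a master node.

```python
from typing import Dict, List


def counts_not_matching_zones(node_counts: Dict[str, int], zone_count: int) -> List[str]:
    """Return the node types whose count is not a multiple of the zone count."""
    return [kind for kind, count in node_counts.items() if count % zone_count != 0]


def can_elect_master(masters_per_zone: Dict[str, int], failed_zone: str) -> bool:
    """Check whether the surviving dedicated master nodes still form a majority."""
    total = sum(masters_per_zone.values())
    surviving = total - masters_per_zone.get(failed_zone, 0)
    return surviving > total // 2


# Hypothetical layout: three dedicated master nodes spread across three zones.
three_zone_layout = {"zone-a": 1, "zone-b": 1, "zone-c": 1}
# Hypothetical layout: three dedicated master nodes placed in only two zones.
two_zone_layout = {"zone-a": 2, "zone-b": 1}

# Three client nodes cannot be split evenly across two zones.
print(counts_not_matching_zones({"data": 4, "warm": 2, "client": 3}, zone_count=2))  # ['client']

print(can_elect_master(three_zone_layout, "zone-a"))  # True
print(can_elect_master(two_zone_layout, "zone-b"))    # True
print(can_elect_master(two_zone_layout, "zone-a"))    # False
```

The last call models the loss of the zone that holds two dedicated master nodes. Only one dedicated master node survives, which corresponds to the case in which you must perform a switchover in the console.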