Elasticsearch: Use the CCR feature in disaster recovery scenarios

Last Updated: May 07, 2024

If a catastrophic event, such as a service interruption caused by a hardware failure, software error, data center failure, or natural disaster, occurs on your Elasticsearch cluster, you can use the cross-cluster replication (CCR) feature to implement cross-region or cross-resource disaster recovery. This topic describes how to implement CCR in the new and original network architectures.

Background information

CCR is a commercial feature that open source Elasticsearch provides in the Platinum edition. After you purchase an Alibaba Cloud Elasticsearch cluster, you can use this feature free of charge after a few simple configurations.

You can use CCR in disaster recovery scenarios to back up data among Elasticsearch clusters that reside in the same virtual private cloud (VPC) but different zones. If a cluster (remote cluster) fails, you can retrieve its data from another cluster (local cluster) and restore the data to the remote cluster. This feature helps prevent data loss.

To use CCR, you must prepare two types of clusters: local clusters and remote clusters. Remote clusters provide source data, which is stored in leader indexes. Local clusters replicate the data and store it in follower indexes. You can also use CCR to migrate large volumes of data in real time. For more information, see Cross-cluster replication.

Scenarios

CCR supports the following scenarios.

  • Environment: Two Alibaba Cloud Elasticsearch clusters are deployed in the new network architecture.

    Note

    Only Alibaba Cloud Elasticsearch clusters of V7.7 or later are supported.

    Solution: Use NLB and PrivateLink to implement CCR

  • Environment: Two Alibaba Cloud Elasticsearch clusters are deployed in the original network architecture and reside in the same VPC.

    Note

    Only single-zone Alibaba Cloud Elasticsearch clusters of V6.7.0 or later are supported.

    Solution: Connect Alibaba Cloud Elasticsearch clusters to enable CCR

Note
  • The preceding use scenarios are also applicable to the cross-cluster search (CCS) feature. For more information, see modules-cross-cluster-search.

  • The CCR feature cannot be used to back up data between an Alibaba Cloud Elasticsearch cluster and a self-managed Elasticsearch cluster.

  • Alibaba Cloud Elasticsearch clusters created before October 2020 are deployed in the original network architecture. Alibaba Cloud Elasticsearch clusters created in October 2020 or later are deployed in the new network architecture.

Use NLB and PrivateLink to implement CCR

Preparations

  • Create two Alibaba Cloud Elasticsearch clusters of V7.7 or later in the same region and zone.

    Note

    The two clusters are used in the following way:

    • One is used as a remote cluster and provides source data.

    • The other is used as a local cluster and replicates data from one or more indexes in the remote cluster.

  • Establish a private connection between the two clusters. For more information, see Use NLB and PrivateLink to establish a private connection between Alibaba Cloud Elasticsearch clusters.

    Note

    Add the private IP address of the remote cluster to a Network Load Balancer (NLB) server group to establish a private connection between the two clusters.

  • Create indexes (leader indexes) in the remote cluster.

    1. Log on to the Kibana console of the remote cluster. For more information, see Log on to the Kibana console.

    2. On the page that appears, click the menu icon in the upper-left corner and choose Management > Dev Tools.

    3. Run the following command to create a leader index in the remote cluster:

      PUT /leader-new
      {
        "settings": {
          "number_of_shards": 1,
          "number_of_replicas": 0
        },
        "mappings": {
          "properties": {
            "name": {
              "type": "text"
            },
            "age": {
              "type": "integer"
            }
          }
        }
      }

Scenario 1: Implement CCR for a specific index

Step 1: Connect the remote cluster to the local cluster

  1. Log on to the Kibana console of the local cluster. For more information, see Log on to the Kibana console.

  2. On the page that appears, click the menu icon in the upper-left corner and choose Management > Stack Management.

  3. In the left-side navigation pane of the Management page, click Remote Clusters.

  4. Click Add a remote cluster.

  5. On the page that appears, specify information about the remote cluster.

    • Name: the name of the remote cluster. The name must be unique.

    • Connection mode: Turn on Use proxy mode.

    • Proxy address: the address of the proxy server. The address must be in the Endpoint domain name:9300 format. The endpoint domain name is the domain name of the endpoint that corresponds to your PrivateLink endpoint service.

      Note

      During CCR, Kibana uses the IP addresses of data nodes to access clusters over TCP port 9300. HTTP port 9200 is not supported.

  6. Click Save.

    Then, the system connects the remote cluster to the local cluster. If the remote cluster is connected to the local cluster, Connected appears.

Sample code for calling the related API

PUT /_cluster/settings 
{
	"persistent": {
		"cluster": {
			"remote": {
				"<remote_cluster>": {
					"mode": "PROXY",
					"proxy_address": "Endpoint domain name:9300"
				}
			}
		}
	}
}

Parameters

  • persistent: Specifies that settings are permanently stored even if the clusters are restarted.

  • <remote_cluster>: Replace it with the name of the remote cluster.

  • mode: Only the proxy mode is supported. The local cluster uses the configured proxy address to access the remote cluster. All requests to the remote cluster are sent to this proxy address and forwarded by the proxy server to the appropriate node in the remote cluster.

  • proxy_address: The address of the proxy server. The address must be in the Domain name of the endpoint that corresponds to your PrivateLink endpoint service:9300 format.

Note

In this example, CCR or CCS uses the transport layer of Elasticsearch and needs to use port 9300 for communication.
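The settings body above can also be generated programmatically, which helps avoid mistakes in the nesting or the port. The following is a minimal Python sketch, not an official client: the function name, the alias es-leader, and the endpoint domain are hypothetical and used only for illustration. The port check reflects the requirement in this topic that CCR traffic uses transport port 9300.

```python
import json


def build_proxy_remote_settings(alias: str, endpoint_domain: str, port: int = 9300) -> dict:
    """Return the body for PUT /_cluster/settings that registers a remote cluster in proxy mode."""
    if not alias or not endpoint_domain:
        raise ValueError("alias and endpoint_domain are required")
    if port != 9300:
        # This topic states that CCR uses transport port 9300; HTTP port 9200 is not supported.
        raise ValueError("CCR uses transport port 9300")
    return {
        "persistent": {
            "cluster": {
                "remote": {
                    alias: {
                        "mode": "PROXY",
                        "proxy_address": f"{endpoint_domain}:{port}",
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical alias and endpoint domain, for illustration only.
    body = build_proxy_remote_settings("es-leader", "ep-example.privatelink.example.com")
    print("PUT /_cluster/settings")
    print(json.dumps(body, indent=2))
```

The printed request can be pasted into the Dev Tools console of the local cluster.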

Step 2: Configure CCR

  1. Go to the Management page in the Kibana console of the local cluster. In the left-side navigation pane, click Cross-Cluster Replication.

  2. On the page that appears, click Create a follower index.

  3. Configure CCR.

    • Remote cluster: the remote cluster that is connected to the local cluster.

    • Leader index: the index whose data you want to back up.

    • Follower index: the index to which you want to back up data. You must specify a unique index name.

  4. Click Create.

    After the follower index is created, the index is in the Active state.

Sample code for calling the related API

When you create a follower index, you must reference the remote cluster and the leader index created in the remote cluster.

PUT /leader-new-copy/_ccr/follow
{
  "remote_cluster": "es-leader",
  "leader_index": "leader-new"
}

Parameters

  • remote_cluster: The name of the remote cluster that is connected to the local cluster. The remote cluster you specify must be the same as the remote cluster you specify in Step 1.

  • leader_index: The name of the leader index.
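If several leader indexes need follower indexes, the create-follower request can be generated once per index. The following Python sketch only builds the requests and renders them in the Dev Tools console format. It assumes a "-copy" suffix for follower names, as in the verification step of this scenario. The helper names are hypothetical.

```python
import json


def build_follow_requests(remote_cluster, leader_indexes, suffix="-copy"):
    """Return (method, path, body) tuples, one create-follower request per leader index."""
    requests = []
    for leader in leader_indexes:
        body = {"remote_cluster": remote_cluster, "leader_index": leader}
        requests.append(("PUT", f"/{leader}{suffix}/_ccr/follow", body))
    return requests


def to_console(method, path, body):
    """Render one request in the Kibana Dev Tools console format."""
    return f"{method} {path}\n{json.dumps(body, indent=2)}"


if __name__ == "__main__":
    # "es-leader" is the remote cluster name used in the sample above.
    for request in build_follow_requests("es-leader", ["leader-new"]):
        print(to_console(*request))
```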

Step 3: Verify the data backup result

  1. In the Kibana console of the remote cluster, run the following command to insert data into the leader index:

    POST leader-new/_doc/
    {
      "name":"Jack",
      "age":40
    }
  2. In the Kibana console of the local cluster, run the following command to check whether the inserted data is backed up to the follower index:

    GET leader-new-copy/_search

    The returned result shows that data in the leader index leader-new of the remote cluster is backed up to the follower index leader-new-copy of the local cluster.

  3. Insert a document into the leader index.

    POST leader-new/_doc/
    {
      "name":"Pony",
      "age":50
    }
  4. Run the following command on the local cluster to check whether incremental data can be backed up to the follower index in real time:

    GET leader-new-copy/_search

    The command output shows that the incremental data is backed up to the follower index.
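Besides inspecting search results by eye, the document counts of the leader index and the follower index can be compared. The sketch below assumes that you have already retrieved the responses of GET leader-new/_count from the remote cluster and GET leader-new-copy/_count from the local cluster, and it reads only the count field of each response. The function name is hypothetical, and a small temporary difference is expected while replication catches up.

```python
def missing_documents(leader_count: dict, follower_count: dict) -> int:
    """Return how many documents counted on the leader are not yet counted on the follower."""
    difference = leader_count["count"] - follower_count["count"]
    # A follower should never hold more documents than its leader; clamp to zero to be safe.
    return max(difference, 0)


if __name__ == "__main__":
    # Sample _count responses, trimmed to the field that is used here.
    leader = {"count": 2}
    follower = {"count": 1}
    print("Documents not yet replicated:", missing_documents(leader, follower))
```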

Scenario 2: Specify an index pattern to implement CCR for multiple indexes

Step 1: Connect the remote cluster to the local cluster

  1. Log on to the Kibana console of the local cluster. For more information, see Log on to the Kibana console.

  2. On the page that appears, click the menu icon in the upper-left corner and choose Management > Stack Management.

  3. In the left-side navigation pane of the Management page, click Remote Clusters.

  4. Click Add a remote cluster.

  5. On the page that appears, specify information about the remote cluster.

    • Name: the name of the remote cluster. The name must be unique.

    • Connection mode: Turn on Use proxy mode.

    • Proxy address: the address of the proxy server. The address must be in the Endpoint domain name:9300 format. The endpoint domain name is the domain name of the endpoint that corresponds to your PrivateLink endpoint service.

      Note

      During CCR, Kibana uses the IP addresses of data nodes to access clusters over TCP port 9300. HTTP port 9200 is not supported.

  6. Click Save.

    Then, the system connects the remote cluster to the local cluster. If the remote cluster is connected to the local cluster, Connected appears.

Sample code for calling the related API

PUT /_cluster/settings 
{
	"persistent": {
		"cluster": {
			"remote": {
				"<remote_cluster>": {
					"mode": "PROXY",
					"proxy_address": "Endpoint domain name:9300"
				}
			}
		}
	}
}

Parameters

  • persistent: Specifies that settings are permanently stored even if the clusters are restarted.

  • <remote_cluster>: Replace it with the name of the remote cluster.

  • mode: Only the proxy mode is supported. The local cluster uses the configured proxy address to access the remote cluster. All requests to the remote cluster are sent to this proxy address and forwarded by the proxy server to the appropriate node in the remote cluster.

  • proxy_address: The address of the proxy server. The address must be in the Domain name of the endpoint that corresponds to your PrivateLink endpoint service:9300 format.

Note

In this example, CCR or CCS uses the transport layer of Elasticsearch and needs to use port 9300 for communication.
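After the remote cluster is registered, the connection can also be checked with the GET /_remote/info API on the local cluster, which reports a connected field for each configured remote cluster. The following Python sketch interprets such a response. The function name and the sample values are hypothetical.

```python
def is_remote_connected(remote_info: dict, alias: str) -> bool:
    """Return True if the GET /_remote/info response reports the given remote cluster as connected."""
    entry = remote_info.get(alias)
    if entry is None:
        # The alias is not registered on this cluster at all.
        return False
    return entry.get("connected") is True


if __name__ == "__main__":
    # Trimmed sample response; the alias and address are placeholders.
    sample = {
        "es-leader": {
            "connected": True,
            "mode": "proxy",
            "proxy_address": "ep-example.privatelink.example.com:9300",
        }
    }
    print(is_remote_connected(sample, "es-leader"))
```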

Step 2: Configure CCR

  1. Go to the Management page in the Kibana console of the local cluster. In the left-side navigation pane, click Cross-Cluster Replication.

  2. Click the Auto-follow patterns tab.

  3. Click Create auto-follow pattern.

    Configure CCR.

    • Remote cluster: the remote cluster that is connected to the local cluster.

    • Index patterns: the pattern of indexes whose data you want to back up.

Sample code for calling the related API

PUT /_ccr/auto_follow/beats 
{
	"remote_cluster": "es-leader",
	"leader_index_patterns": 
	[
		"leader-*"
	],
	"follow_index_pattern": "{{leader_index}}-copy"
}

Parameters

  • remote_cluster: The remote cluster that is connected to the local cluster. The remote cluster you specify must be the same as the remote cluster you specify in Step 1.

  • leader_index_patterns: The pattern of the indexes whose data you want to back up.

  • follow_index_pattern: The pattern of indexes to be created in the local cluster. The system creates indexes in the local cluster based on the pattern and backs up data to the indexes.
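To preview which follower index name an auto-follow pattern would produce, the pattern logic can be approximated locally. The following Python sketch treats * as the only wildcard and substitutes the {{leader_index}} placeholder. It is an approximation for illustration, not the matching code that Elasticsearch runs, and the function names are hypothetical.

```python
import re


def matches(pattern: str, index_name: str) -> bool:
    """Return True if index_name matches a pattern in which * stands for any characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, index_name) is not None


def follower_name(leader_index, leader_index_patterns, follow_index_pattern):
    """Return the follower index name for a leader index, or None if no pattern matches."""
    if any(matches(pattern, leader_index) for pattern in leader_index_patterns):
        return follow_index_pattern.replace("{{leader_index}}", leader_index)
    return None


if __name__ == "__main__":
    patterns = ["leader-*"]
    for name in ["leader-new", "myindex"]:
        print(name, "->", follower_name(name, patterns, "{{leader_index}}-copy"))
```

With the sample pattern above, leader-new maps to leader-new-copy, and myindex is not followed.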

Step 3: Verify the data backup result

  1. In the Kibana console of the remote cluster, run the following command to add an index:

    PUT /leader-new
    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
      },
      "mappings": {
        "properties": {
          "name": {
            "type": "text"
          },
          "age": {
            "type": "integer"
          }
        }
      }
    }
  2. In the Kibana console of the local cluster, run the following command to check whether data in the added index is backed up to the local cluster:

    GET _cat/indices?v

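The index list can also be checked programmatically. The sketch below assumes that the list was retrieved with GET _cat/indices?format=json, which returns one JSON object per index with an index field, and that follower indexes use the "-copy" suffix from the auto-follow pattern above. The function name and sample rows are hypothetical.

```python
def replicated_indices(cat_rows, suffix="-copy"):
    """Return the sorted names of indexes whose names end with the follower suffix."""
    return sorted(row["index"] for row in cat_rows if row["index"].endswith(suffix))


if __name__ == "__main__":
    # Trimmed sample rows in the shape returned by GET _cat/indices?format=json.
    rows = [
        {"health": "green", "status": "open", "index": "leader-new-copy", "docs.count": "0"},
        {"health": "green", "status": "open", "index": ".kibana_1", "docs.count": "12"},
    ]
    print(replicated_indices(rows))
```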

Connect Alibaba Cloud Elasticsearch clusters to enable CCR

Preparations

  1. Create two Alibaba Cloud Elasticsearch clusters of the same version (6.7 or later). Make sure that the two clusters are deployed in the same VPC and belong to the same vSwitch.

    Note
    • The two clusters are used in the following way:

      • One is used as a remote cluster and provides source data.

      • The other is used as a local cluster and replicates data from one or more indexes in the remote cluster.

    • If you have uploaded a synonym file to the remote cluster, you must upload the synonym file to the local cluster.

  2. Configure the remote cluster to connect it to the local cluster. For more information, see Connect Elasticsearch clusters to enable cross-cluster searches.

  3. Create indexes (leader indexes) in the remote cluster.

    1. Log on to the Kibana console of the remote cluster. For more information, see Log on to the Kibana console.

    2. On the page that appears, click the menu icon in the upper-left corner and choose Management > Dev Tools.

    3. Run the following command to create a leader index in the remote cluster:

      PUT myindex
      {
        "settings": {
          "index.soft_deletes.retention.operations": 1024,
          "index.soft_deletes.enabled": true
        }
      }
      Note
      • If you create an index in an Elasticsearch cluster of a version earlier than V7.0, you must enable the soft_deletes attribute for the index. Otherwise, an error is reported. You can run the GET /<yourIndexName>/_settings?pretty command to check whether the soft_deletes attribute is enabled. If the soft_deletes attribute is enabled, the configuration of the soft_deletes attribute is displayed in the command output.

      • If you want to back up data in an existing index, you can call the reindex API to enable the soft_deletes attribute.

  4. Disable the physical replication feature for the leader index.

    Note

    The physical replication feature is automatically enabled for indexes in Elasticsearch V6.7.0 clusters. Before you use CCR, you must disable the physical replication feature.

    1. Close the index.

      POST myindex/_close
    2. Update the index settings to disable the physical replication feature.

      PUT myindex/_settings
      {
        "index.replication.type" : null
      }
    3. Reopen the index.

      POST myindex/_open
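The soft_deletes notes in the preparations above can be sketched in Python as follows. The first function checks a GET /<yourIndexName>/_settings response for an explicit soft_deletes setting. The second function builds the two requests for copying an existing index into a new index that has soft_deletes enabled by using the reindex API. The function names and the destination index name myindex_v2 are hypothetical, and the sketch does not copy mappings, which you would need to define on the new index yourself.

```python
def soft_deletes_enabled(settings_response: dict, index: str) -> bool:
    """Return True if the settings response explicitly shows soft_deletes as enabled."""
    index_settings = settings_response.get(index, {}).get("settings", {}).get("index", {})
    # Setting values are returned as strings, for example "true".
    value = index_settings.get("soft_deletes", {}).get("enabled", "")
    return str(value).lower() == "true"


def reindex_with_soft_deletes(source_index: str, dest_index: str, retention_operations: int = 1024):
    """Return the requests that create a soft_deletes-enabled index and copy data into it."""
    create = (
        "PUT",
        f"/{dest_index}",
        {
            "settings": {
                "index.soft_deletes.enabled": True,
                "index.soft_deletes.retention.operations": retention_operations,
            }
        },
    )
    copy = ("POST", "/_reindex", {"source": {"index": source_index}, "dest": {"index": dest_index}})
    return [create, copy]


if __name__ == "__main__":
    for method, path, body in reindex_with_soft_deletes("myindex", "myindex_v2"):
        print(method, path, body)
```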

Step 1: Connect the remote cluster to the local cluster

  1. Log on to the Kibana console of the local cluster. For more information, see Log on to the Kibana console.

  2. On the page that appears, click the menu icon in the upper-left corner and choose Management > Stack Management.

  3. In the left-side navigation pane of the Management page, click Remote Clusters.

  4. Click Add a remote cluster.

  5. On the page that appears, specify information about the remote cluster.

    • Name: the name of the remote cluster. The name must be unique.

    • Proxy address: The address must be in the IP address of a node in the remote cluster:9300 format. To obtain the IP addresses of nodes, log on to the Kibana console of the remote cluster and run the GET /_cat/nodes?v command on the Console tab of the Dev Tools page. The nodes you specify must include a dedicated master node of the remote cluster. We recommend that you specify multiple nodes. This ensures that you can still use CCR if the specified dedicated master node fails.

      Note

      During CCR, Kibana uses the IP addresses of data nodes to access clusters over TCP port 9300. HTTP port 9200 is not supported.

  6. Click Save.

    Then, the system connects the remote cluster to the local cluster. If the remote cluster is connected to the local cluster, Connected appears.
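The node addresses for the remote cluster can be assembled from the node list. The sketch below assumes that the list was retrieved with GET /_cat/nodes?format=json, which returns one object per node with ip and node.role fields. It treats a node whose role string contains m but not d as a dedicated master node, which is a heuristic for illustration. The function name and sample IP addresses are hypothetical.

```python
def seed_addresses(node_rows, port=9300):
    """Return ip:port addresses with dedicated master nodes listed first."""

    def is_dedicated_master(row):
        roles = row.get("node.role", "")
        # Heuristic: master-eligible (m) and not a data node (d).
        return "m" in roles and "d" not in roles

    masters = [row for row in node_rows if is_dedicated_master(row)]
    others = [row for row in node_rows if not is_dedicated_master(row)]
    return [f"{row['ip']}:{port}" for row in masters + others]


if __name__ == "__main__":
    # Trimmed sample rows in the shape returned by GET /_cat/nodes?format=json.
    rows = [
        {"ip": "10.0.0.3", "node.role": "di", "name": "data-1"},
        {"ip": "10.0.0.1", "node.role": "m", "name": "master-1"},
    ]
    print(seed_addresses(rows))
```

Listing several addresses follows the recommendation above to specify multiple nodes so that CCR remains usable if one dedicated master node fails.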

Step 2: Configure CCR

  1. Go to the Management page in the Kibana console of the local cluster. In the left-side navigation pane, click Cross-Cluster Replication.

  2. On the page that appears, click Create a follower index.

  3. Configure CCR.

    • Remote cluster: the remote cluster that is connected to the local cluster.

    • Leader index: the index whose data you want to back up. In this example, the myindex index that is created in Preparations is used.

    • Follower index: the index to which you want to back up data. You must specify a unique index name.

  4. Click Create.

    After the follower index is created, the index is in the Active state.

Step 3: Verify the data backup result

  1. In the Kibana console of the remote cluster, run the following command to insert data into the remote cluster:

    POST myindex/_doc/
    {
      "name":"Jack",
      "age":40
    }
  2. In the Kibana console of the local cluster, run the following command to check whether the inserted data is backed up to the local cluster:

    GET myindex_follow/_search

    The result shown in the following figure is returned. The result shows that data in the leader index myindex of the remote cluster is backed up to the follower index myindex_follow of the local cluster.

    Figure: Data synchronization result

    Note

    The follower index myindex_follow is read-only. If you want to write data to the follower index, convert the follower index into a regular index first. For more information, see Use Elasticsearch CCR to migrate data across data centers.

  3. Insert a document into the remote cluster and check whether the document is backed up to the local cluster in real time.

    POST myindex/_doc/
    {
      "name":"Pony",
      "age":50
    }
  4. Query the inserted document in the local cluster. The following figure shows the document.

    Figure: Verification of real-time data synchronization

    The preceding figure shows that the CCR feature can implement real-time backup of incremental data.

    Note

    You can also call the APIs for the CCR feature to perform cross-cluster replication operations. For more information, see Cross-cluster replication APIs.
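As an illustration of the CCR APIs, the read-only follower index mentioned in Step 3 can be converted into a regular index by pausing replication, closing the index, calling the unfollow API, and reopening the index. The following Python sketch only lists these requests in order. The function name is hypothetical, and the conversion cannot be undone, so treat it as a sketch to adapt rather than a procedure to run as is.

```python
def promote_follower_requests(follower_index: str):
    """Return the ordered requests that turn a follower index into a regular, writable index."""
    return [
        ("POST", f"/{follower_index}/_ccr/pause_follow"),  # stop replication from the leader
        ("POST", f"/{follower_index}/_close"),             # the index must be closed before unfollow
        ("POST", f"/{follower_index}/_ccr/unfollow"),      # remove the follower metadata
        ("POST", f"/{follower_index}/_open"),              # reopen as a regular index
    ]


if __name__ == "__main__":
    for method, path in promote_follower_requests("myindex_follow"):
        print(method, path)
```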

FAQ

  • Q: I can use port 9300 to add a remote cluster. Why is only port 9200 accessible when I use a domain name to access an Elasticsearch cluster?

    A: Port 9300 is an open port. However, when you access a cluster over the Internet, Server Load Balancer (SLB) enables only port 9200 during port verification for security purposes.

  • Q: How do I view the status of CCR-based data synchronization?

    A: Run the GET /_ccr/stats command in the Kibana console of the cluster that stores the follower index and view the value of the number_of_failed_follow_indices parameter. This parameter indicates the number of indexes that failed to be automatically followed.

    • If the value of the parameter is 0, the synchronization is normal.

    • If the value of the parameter is not 0, run the following commands for the cluster that stores the follower index to pause and then resume the synchronization:

      POST /<follower_index>/_ccr/pause_follow
      POST /<follower_index>/_ccr/resume_follow
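The check described in this answer can be sketched in Python. The code below assumes that you pass in the parsed JSON response of GET /_ccr/stats. It reads number_of_failed_follow_indices from auto_follow_stats, looks for shards that report a fatal_exception in follow_stats, and builds the pause and resume requests for an affected follower index. The function names and the sample response are hypothetical.

```python
def failed_auto_follow_count(stats: dict) -> int:
    """Return the value of number_of_failed_follow_indices from a GET /_ccr/stats response."""
    return stats.get("auto_follow_stats", {}).get("number_of_failed_follow_indices", 0)


def indices_with_fatal_errors(stats: dict):
    """Return follower indexes that have at least one shard reporting a fatal_exception."""
    failed = []
    for index in stats.get("follow_stats", {}).get("indices", []):
        if any("fatal_exception" in shard for shard in index.get("shards", [])):
            failed.append(index["index"])
    return failed


def restart_requests(follower_index: str):
    """Return the pause and resume requests for one follower index, in order."""
    return [
        ("POST", f"/{follower_index}/_ccr/pause_follow"),
        ("POST", f"/{follower_index}/_ccr/resume_follow"),
    ]


if __name__ == "__main__":
    # Trimmed, made-up sample response for illustration.
    sample = {
        "auto_follow_stats": {"number_of_failed_follow_indices": 0},
        "follow_stats": {"indices": [{"index": "myindex_follow", "shards": [{"shard_id": 0}]}]},
    }
    print(failed_auto_follow_count(sample), indices_with_fatal_errors(sample))
```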