You can use an Alibaba Cloud Logstash pipeline to migrate data from a self-managed Elasticsearch cluster to an Alibaba Cloud Elasticsearch cluster. This topic describes the migration procedure in detail.
Prerequisites
- A self-managed Elasticsearch cluster is created.
We recommend that you create a self-managed Elasticsearch cluster on Alibaba Cloud Elastic Compute Service (ECS) instances. For more information, see Install and Run Elasticsearch.
Notice:
- The ECS instances that host the self-managed Elasticsearch cluster must be deployed in a virtual private cloud (VPC). You cannot use ECS instances that are connected to a VPC over ClassicLink.
- Alibaba Cloud Logstash clusters are deployed in VPCs. Before you configure a Logstash pipeline, you must check whether the ECS instances that host the self-managed Elasticsearch cluster reside in the same VPC as the Alibaba Cloud Logstash cluster that you want to use. If they reside in different VPCs, you must configure NAT gateways to connect the ECS instances and Logstash cluster to the Internet. For more information, see Configure a NAT gateway for data transmission over the Internet.
- In the security groups of the ECS instances that host the self-managed Elasticsearch cluster, you must add rules that allow access from the IP addresses of the nodes in the Logstash cluster, and you must open port 9200. You can obtain the IP addresses of the nodes in the Logstash cluster on the Basic Information page of the Logstash cluster.
- In this example, an Alibaba Cloud Logstash V6.7.0 cluster is used to migrate data from a self-managed Elasticsearch 5.6.16 cluster to an Alibaba Cloud Elasticsearch V6.7.0 cluster. The scripts provided in this topic apply only to this type of data migration. If you want to perform other types of data synchronization, you must check whether your Elasticsearch clusters and Logstash cluster are compatible with each other based on the instructions in Compatibility matrixes. If they are not compatible with each other, you can upgrade their versions or purchase new clusters.
- An Alibaba Cloud Logstash cluster is created.
For more information, see Create an Alibaba Cloud Logstash cluster.
- An Alibaba Cloud Elasticsearch cluster is created in the VPC where the Alibaba Cloud Logstash cluster resides. Make sure that the Alibaba Cloud Elasticsearch cluster is of the same version as the Logstash cluster. In this example, V6.7.0 is used.
For more information, see Create an Alibaba Cloud Elasticsearch cluster.
- The Auto Indexing feature is enabled for the Alibaba Cloud Elasticsearch cluster.
For more information, see Configure the YML file.
Note: Logstash migrates only the data. It does not migrate the index structure, that is, the mappings and settings of the source indexes. Therefore, if you rely on the Auto Indexing feature, the structure of the data may change after the data is migrated to the destination. If you want the structure to remain unchanged, we recommend that you create an empty index in the destination before the migration and migrate the data to that index. When you create the index, copy the mappings and settings configurations of the source index and set the numbers of shards and replicas to appropriate values.
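The recommendation to pre-create the destination index can be sketched in Python as follows. This is a minimal sketch, not a procedure from this topic: it assumes that you have already retrieved the source index definition with GET /<index> on the self-managed cluster, and the function name build_create_index_body, the sample index definition, and the shard and replica values are all hypothetical. The sketch only builds the request body. It removes the settings that Elasticsearch generates itself, because a create-index request that contains them is rejected.

```python
import copy
import json

# Settings that Elasticsearch generates itself. They appear in the response of
# GET /<index> but must not be sent back in a create-index request.
READ_ONLY_SETTINGS = {"creation_date", "uuid", "version", "provided_name"}


def build_create_index_body(source_index_info, shards, replicas):
    """Build a create-index request body from a source index definition.

    source_index_info is assumed to be the object stored under the index name
    in the response of GET /<index> on the source cluster.
    """
    source_settings = source_index_info.get("settings", {}).get("index", {})
    index_settings = copy.deepcopy(source_settings)
    for key in READ_ONLY_SETTINGS:
        index_settings.pop(key, None)
    # Override the shard and replica counts with values chosen for the destination.
    index_settings["number_of_shards"] = shards
    index_settings["number_of_replicas"] = replicas
    return {
        "settings": {"index": index_settings},
        "mappings": copy.deepcopy(source_index_info.get("mappings", {})),
    }


# Hypothetical source definition, shaped like one entry of a GET /<index> response.
sample_source = {
    "aliases": {},
    "mappings": {"doc": {"properties": {"title": {"type": "text"}}}},
    "settings": {
        "index": {
            "creation_date": "1580000000000",
            "number_of_shards": "5",
            "number_of_replicas": "1",
            "uuid": "hypothetical-uuid",
            "version": {"created": "5061699"},
            "provided_name": "my_index",
            "refresh_interval": "5s",
        }
    },
}

body = build_create_index_body(sample_source, shards=3, replicas=1)
print(json.dumps(body, indent=2, sort_keys=True))
```

The printed JSON could then be sent as the body of a PUT /<index> request to the destination cluster. Check the mappings before you send the request: an index created in Elasticsearch 6.x can contain only one mapping type, so a source index with multiple types needs to be restructured first.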
Configure and run a Logstash pipeline
View migration results
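One way to check the result is to compare the number of documents in each index on the source and destination clusters. The following Python sketch illustrates the comparison. It assumes that you have already fetched the output of GET /_cat/indices?format=json from both clusters. The function names and the sample rows are hypothetical, and the sketch does not connect to either cluster.

```python
def parse_doc_counts(cat_indices_rows):
    """Map each index name to its document count.

    cat_indices_rows is assumed to be the parsed JSON list returned by
    GET /_cat/indices?format=json.
    """
    counts = {}
    for row in cat_indices_rows:
        # The cat API reports numbers as strings. A missing count
        # (for example, for a closed index) is treated as 0 here.
        counts[row["index"]] = int(row.get("docs.count") or 0)
    return counts


def find_mismatches(source_counts, dest_counts):
    """Return {index: (source_count, dest_count)} for indexes whose counts differ.

    dest_count is None when the index does not exist in the destination.
    """
    mismatches = {}
    for index, src in source_counts.items():
        dst = dest_counts.get(index)
        if dst != src:
            mismatches[index] = (src, dst)
    return mismatches


# Hypothetical rows, shaped like _cat/indices?format=json output.
source_rows = [
    {"index": "my_index", "docs.count": "1000"},
    {"index": "logs", "docs.count": "250"},
    {"index": "orders", "docs.count": "42"},
]
dest_rows = [
    {"index": "my_index", "docs.count": "1000"},
    {"index": "logs", "docs.count": "200"},
]

mismatches = find_mismatches(parse_doc_counts(source_rows), parse_doc_counts(dest_rows))
for index, (src, dst) in sorted(mismatches.items()):
    print(f"{index}: source={src}, destination={dst}")
```

An empty result means that every source index exists in the destination with the same document count. Matching counts do not prove that the document contents or mappings are identical, so you may also want to spot-check a few documents.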
FAQ
- Q: How do I connect the ECS instances that host the self-managed Elasticsearch cluster
to the Alibaba Cloud Logstash cluster when the ECS instances and the Logstash cluster
belong to different accounts?
A: Because the ECS instances and the Logstash cluster belong to different accounts, they reside in different VPCs. In this case, you can use Cloud Enterprise Network (CEN) to connect the ECS instances to the Logstash cluster. For more information, see Step 3: Attach network instances.
- Q: An error occurs when Logstash writes data to the destination. What do I do?
A: Troubleshoot the error based on the instructions provided in FAQ about data transfer by using Logstash.