Alibaba Cloud Elasticsearch is compatible with open source Elasticsearch features such as Security, Machine Learning, Graph, and Application Performance Management (APM). Alibaba Cloud Elasticsearch provides capabilities such as enterprise-level access control, security monitoring and alerts, and automatic report generation. You can use Alibaba Cloud Elasticsearch to search and analyze data. This topic describes how to synchronize data from a self-managed MySQL database that is hosted on Elastic Compute Service (ECS) to an Elasticsearch cluster by using Data Transmission Service (DTS).
Prerequisites
- An Elasticsearch cluster of version 5.5, 5.6, 6.3, 6.7, or 7.4 is created. For more information, see Create an Elasticsearch cluster.
- The version of the self-managed MySQL database is 5.1, 5.5, 5.6, 5.7, or 8.0.
Precautions
- DTS uses read and write resources of the source and destination databases during initial full data synchronization. This may increase the loads of the database servers. If the database performance is unfavorable, the specification is low, or the data volume is large, database services may become unavailable. For example, DTS occupies a large amount of read and write resources in the following cases: a large number of slow SQL queries are performed on the source database, the tables have no primary keys, or a deadlock occurs in the destination database. Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. For example, you can synchronize data when the CPU utilization of the source and destination databases is less than 30%.
- DTS does not synchronize data definition language (DDL) operations. If a DDL operation is performed on a table in the source database during data synchronization, you must perform the following steps: Remove the table from the required objects, remove the index for the table from the Elasticsearch cluster, and then add the table to the required objects. For more information, see Remove an object from a data synchronization task and Add an object to a data synchronization task.
- To add columns to the table that you want to synchronize, perform the following steps: Modify the mapping of the table in the Elasticsearch cluster, perform DDL operations in the source MySQL database, and then pause and start the data synchronization task.
SQL operations that can be synchronized
INSERT, DELETE, and UPDATE
Data type conversion
The data types of MySQL databases and Elasticsearch clusters do not have one-to-one correspondence. During initial schema synchronization, DTS converts the data types of the source database into the data types of the destination database. For more information, see Data type mappings for schema synchronization.
Before you begin
Create an account for a user-created MySQL database and configure binary loggingProcedure
Check the index and data
If the data synchronization task is in the Synchronizing state, you can connect to the Elasticsearch cluster by using the Elasticsearch-Head plug-in. Then, you can check whether the index is created and data is synchronized as expected. For more information, see Use Cerebro to access an Elasticsearch cluster.
