If you want to search for and analyze the production data in your ApsaraDB RDS for MySQL database, you can use Data Transmission Service (DTS) to dynamically synchronize the data to an Alibaba Cloud Elasticsearch cluster. The synchronization is implemented based on a real-time synchronization task. This topic describes the procedure in detail.
Background information
DTS can be used to synchronize data in real time, migrate data, and subscribe to data. For more information, see DTS.
You can use DTS to synchronize the following SQL operations: INSERT, DELETE, and UPDATE.
This topic can be used to guide the real-time synchronization of data in relational databases.
Precautions
- DTS uses read and write resources of the source and destination databases during initial full data synchronization. This may increase the loads of the database servers. If the database performance is unfavorable, the specification is low, or the data volume is large, database services may become unavailable. For example, DTS occupies a large amount of read and write resources in the following cases: a large number of slow SQL queries are performed on the source database, the tables have no primary keys, or a deadlock occurs in the destination database. Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. For example, you can synchronize data when the CPU utilization of the source and destination databases is less than 30%.
- DTS cannot be used to synchronize data definition language (DDL) operations. If a DDL operation is performed on the source table during synchronization, you must remove the table from the synchronization task. Then, remove the index that corresponds to the table from your Elasticsearch cluster, and add the table to the synchronization task again. For more information, see Remove an object from a data synchronization task and Add an object to a data synchronization task.
- If you want to add columns to the source table, modify the mappings of the index that corresponds to the table. Then, perform DDL operations on the source table, pause the data synchronization task, and start the task again.
Preparations
- Create an Alibaba Cloud Elasticsearch cluster and enable the Auto Indexing feature
for the cluster.
For more information, see Create an Alibaba Cloud Elasticsearch cluster and Enable the Auto Indexing feature. In this topic, an Elasticsearch V6.7 cluster is used.
Note To ensure data security, Alibaba Cloud Elasticsearch disables Auto Indexing by default. When you use DTS to synchronize data to an Elasticsearch cluster, you must create indexes on the Elasticsearch cluster by submitting data instead of calling the create index operation. Therefore, before you use DTS to synchronize data, you must enable Auto Indexing for the cluster. - Create a database and a table. Then, insert data into the table.
In this topic, an ApsaraDB RDS for MySQL database is created. You can also create a database on your on-premises machine. The following figure shows the schema and data of the table.
For more information about how to create an ApsaraDB RDS for MySQL database and a table in the database, see Quick start.
Notice We recommend that you create an ApsaraDB RDS for MySQL instance in the same region as the Elasticsearch cluster. If the instance and cluster are created in different regions, you may fail to ensure connectivity between them.