PolarDB is a next-generation relational database service that is developed by Alibaba Cloud. PolarDB is compatible with the MySQL database engine and features high availability, ease of use, and reliability. This topic describes how to synchronize data from a user-created MySQL database that is connected over Express Connect, VPN Gateway, or Smart Access Gateway to a PolarDB for MySQL cluster by using Data Transmission Service (DTS).
Prerequisites
- A MySQL database of version 5.1, 5.5, 5.6, 5.7, or 8.0 is created.
- The user-created MySQL database is connected to Alibaba Cloud VPC over Express Connect,
VPN Gateway, or Smart Access Gateway. For more information, see Connect to local IDCs.
Note DTS is allowed to access the VPC to which the user-created MySQL database belongs. For more information, see Configure a route between DTS and Express Connect, VPN Gateway, or Smart Access Gateway.
- A PolarDB for MySQL cluster is created. For more information, see Create a PolarDB for MySQL cluster
Note The available storage space of the PolarDB for MySQL cluster is larger than the total size of the data in the user-created MySQL database.
Precautions
- DTS uses read and write resources of the source and destination databases during initial full data synchronization. This may increase the database load. If the database performance is unfavorable, the specification is low, or the data volume is large, database services may become unavailable. For example, DTS occupies a large amount of read and write resources in the following cases: a large number of slow SQL queries are performed on the source database, the tables have no primary keys, or a deadlock occurs in the destination database. Before synchronizing data, you must evaluate the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. For example, you can synchronize data when the CPU usage of the source and destination databases is less than 30%.
- If you have selected one or more tables (not a database) for synchronization, do not
use gh-ost or pt-online-schema-change to modify the tables during data synchronization.
Otherwise, data synchronization may fail.
Notice To avoid synchronization failure, you can use Data Management (DMS) to perform online DDL schema changes during data synchronization. For more information, see Change the table schema without locking.
- During initial full data synchronization, concurrent INSERT operations cause fragmentation in the tables of the destination cluster. After initial full data synchronization, the tablespace of the destination cluster is larger than that of the source database.
- The source database must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, duplicate data may exist in the destination cluster.
SQL operations that can be synchronized
Operation type | SQL statements |
---|---|
DML | INSERT, UPDATE, DELETE, and REPLACE |
DDL |
|
Supported synchronization topologies
- One-way one-to-one synchronization
- One-way one-to-many synchronization
- One-way cascade synchronization
- One-way many-to-one synchronization
For more information about synchronization topologies, see Synchronization topologies.
Limits
- Incompatibility with triggers
If the object you want to synchronize is a database and the database contains a trigger that updates the synchronized table, the synchronized data may be inconsistent. For example, the source database contains Table A and Table B. If a data record is inserted into Table A, a trigger inserts a data record into Table B. In this case, after an INSERT operation is performed on Table A in the source database, the data in Table B becomes inconsistent between the source and destination databases.
To avoid this situation, you must delete the trigger that is synchronized to the destination database and select Table B as the object to be synchronized. For more information, see Configure synchronization when triggers exist.
- Limits on RENAME TABLE operations
RENAME TABLE operations may cause data inconsistency between the source and destination databases. For example, if only Table A needs to be synchronized and it is renamed Table B, Table B cannot be synchronized to the destination database. To avoid this situation, you can select the database to which Table A and Table B belong as the object when configuring the data synchronization task.