If you perform online data definition language (DDL) operations on a source database when a table merging task is running, data loss occurs in the destination database. This topic describes how to avoid this situation.

Issue

If you perform online DDL operations on a source database when a table merging task is running, the schema of the source database is changed. Some data will be lost when it is synchronized to the destination database.

Note A table merging task merges tables from different source databases. Then, you can synchronize or migrate the data of these tables to the destination database.
For example, if you configure a task to merge tables from two ApsaraDB RDS for MySQL databases (Database 1 and Database 2) to the destination ApsaraDB RDS for MySQL database, DTS migrates data based on the following logic:
  • Migrates the data of Table A1 in Database 1 to Table A in the destination database.
  • Migrates the data of Table A2 in Database 2 to Table A in the destination database.

Assume that Table A1 in Database 1 contains 100 data records and Table A2 in Database 2 contains 200 data records. After Table A1 and Table A2 are merged, Table A in the destination database is supposed to contain 300 data records.

Example

This example demonstrates how online DDL operations performed on a source database causes data loss in the destination database. In this example, a task is performed to merge tables from two ApsaraDB RDS for MySQL databases to the destination ApsaraDB RDS for MySQL database.

Table merging scenario: Assume that you configure a task to merge tables from two ApsaraDB RDS for MySQL databases (Database 1 and Database 2) to the destination ApsaraDB RDS for MySQL database, DTS migrates data based on the following logic:
  • Migrates the data of Table A1 in Database 1 to Table A in the destination database.
  • Migrates the data of Table A2 in Database 2 to Table A in the destination database.

Assume that Table A1 in Database 1 contains 100 data records and Table A2 in Database 2 contains 200 data records. After Table A1 and Table A2 are merged, Table A in the destination database is supposed to contain 300 data records.

The following online DDL operations are performed on Table A1 in Database 1:
Note For more information about online DDL operations, see Change schemas without locking tables.
  • Create a temporary table:
    CREATE TABLE tmp_table_table LIKE table_name
  • Change the schema of the temporary table:
     ALTER TABLE tmp_table_table XXXX
  • Copy the historical data:
    INSERT IGNORE INTO tmp_table_table (SELECT %s FROM table_name FORCE INDEX (%s) WHERE xxx
  • Synchronize the binary logs of incremental data:
    UPDATA/INSRT/DELETE tmp_table_name
  • Switch workloads to the temporary table:
    RENAME TABLE table_name to old_tmp_table_table, tmp_table_name to table_name

In the preceding operations, a temporary table is created, the schema of the temporary table is changed, and the historical data of the original table is copied to the temporary table. Then, the RENAME TABLE command is used to switch workloads to the temporary table. After the operations are performed, the data of Table A1 in Database 1 overwrites the data of Table A in the destination database. In this case, the data of Table A2 in Database 2 is lost.

Cause

Due to the internal mechanism of online DDL operations, the data of the source database overwrites the data of the destination database.

Solution

  • When you configure the data synchronization task, do not to synchronize online DDL operations.
  • Do not perform online DDL operations on the source database.