This topic describes the precautions and limits when you synchronize data from a PolarDB for MySQL cluster. To ensure that your data synchronization task runs as expected, read the precautions and limits before you configure the task.

Scenarios of synchronizing data from a PolarDB for MySQL cluster

You can view the precautions and limits based on the following synchronization scenarios:

Synchronize data between PolarDB for MySQL clusters

The following table describes the precautions and limits.
Type Description
Limits on the source database
  • Each table to be synchronized from the source database must have a PRIMARY KEY or UNIQUE constraint, and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
  • The following requirements for binary logs must be met:
    • The binary logging feature must be enabled. The value of the binlog_format parameter must be set to row. The value of the binlog_row_image parameter must be set to full. Otherwise, error messages are returned during precheck and the data synchronization task cannot be started.
    • Binary logs must be retained for at least 7 days during initial full data synchronization. After initial full data synchronization is complete, you can clear the binary logs that were generated in the source database after the DTS task started.
      Note To ensure data security, DTS stores only 50 GB of binary logs or the binary logs for the last 24 hours. If the limit is exceeded, DTS automatically clears the cached logs.
      Warning If you clear the binary logs of the source database during initial full data synchronization, the data synchronization task may fail. For example, initial full data synchronization may take more than 24 hours because of a large data volume in the source database and abnormal writes to the destination database. In this case, DTS cannot obtain binary logs that were generated more than 24 hours earlier if they have been cleared from the source database, and the data synchronization task may fail.
Other limits
  • Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. During initial full data synchronization, DTS uses read and write resources of the source and destination databases. This may increase the loads of the database servers.
  • During initial full data synchronization, concurrent INSERT operations cause fragmentation in the tables of the destination database. After initial full data synchronization is complete, the tablespace of the destination database is larger than that of the source database.
  • We recommend that you do not use gh-ost or pt-online-schema-change to perform data definition language (DDL) operations on source tables during data synchronization. Otherwise, data synchronization may fail.
  • If you use only DTS to write data to the destination database, you can use Data Management (DMS) to perform online DDL operations on source tables during data synchronization. For more information, see Change schemas without locking tables.
    Warning If you use tools other than DTS to write data to the destination database, we recommend that you do not use DMS to perform online DDL operations. Otherwise, data loss may occur in the destination database.
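
The binary log requirements in the preceding table can be checked before you configure the task. The following Python sketch is illustrative only and is not part of DTS: it compares values that you have already collected (for example, from SHOW VARIABLES) with the required settings. The function name and the input dictionary are hypothetical, and log_bin is the standard MySQL variable name, which may be exposed under a different parameter name in the PolarDB console.

```python
# Sketch: compare collected binary log settings with the values required above.
# The function does not connect to a database; the caller supplies the values.

REQUIRED_SETTINGS = {
    "log_bin": "ON",            # binary logging enabled (standard MySQL variable name)
    "binlog_format": "ROW",     # required value: row
    "binlog_row_image": "FULL", # required value: full
}


def check_binlog_settings(variables):
    """Return a list of problems; an empty list means every requirement is met."""
    problems = []
    for name, expected in REQUIRED_SETTINGS.items():
        actual = variables.get(name)
        if actual is None:
            problems.append(f"{name} is missing (expected {expected})")
        elif str(actual).upper() != expected:
            problems.append(f"{name} is {actual} (expected {expected})")
    return problems


if __name__ == "__main__":
    # Hypothetical sample values for illustration.
    sample = {"log_bin": "ON", "binlog_format": "MIXED", "binlog_row_image": "full"}
    for problem in check_binlog_settings(sample):
        print(problem)
```

The comparison is case-insensitive because the same setting may be reported as row or ROW depending on where it is read.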

Synchronize data from a PolarDB for MySQL cluster to an ApsaraDB RDS for MySQL instance or a self-managed MySQL database

The following table describes the precautions and limits.
Type Description
Limits on the source database
  • Each table to be synchronized from the source database must have a PRIMARY KEY or UNIQUE constraint, and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
  • The following requirements for binary logs must be met:
    • The binary logging feature must be enabled. The value of the binlog_format parameter must be set to row. The value of the binlog_row_image parameter must be set to full. Otherwise, error messages are returned during precheck and the data synchronization task cannot be started.
    • Binary logs must be retained for at least 7 days during initial full data synchronization. After initial full data synchronization is complete, you can clear the binary logs that were generated in the source database after the DTS task started.
      Note To ensure data security, DTS stores only 50 GB of binary logs or the binary logs for the last 24 hours. If the limit is exceeded, DTS automatically clears the cached logs.
      Warning If you clear the binary logs of the source database during initial full data synchronization, the data synchronization task may fail. For example, initial full data synchronization may take more than 24 hours because of a large data volume in the source database and abnormal writes to the destination database. In this case, DTS cannot obtain binary logs that were generated more than 24 hours earlier if they have been cleared from the source database, and the data synchronization task may fail.
Other limits
  • Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. During initial full data synchronization, DTS uses read and write resources of the source and destination databases. This may increase the loads of the database servers.
  • During initial full data synchronization, concurrent INSERT operations cause fragmentation in the tables of the destination database. After initial full data synchronization is complete, the tablespace of the destination database is larger than that of the source database.
  • We recommend that you do not use gh-ost or pt-online-schema-change to perform data definition language (DDL) operations on source tables during data synchronization. Otherwise, data synchronization may fail.
  • If you use only DTS to write data to the destination database, you can use Data Management (DMS) to perform online DDL operations on source tables during data synchronization. For more information, see Change schemas without locking tables.
    Warning If you use tools other than DTS to write data to the destination database, we recommend that you do not use DMS to perform online DDL operations. Otherwise, data loss may occur in the destination database.
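
To find tables that do not meet the PRIMARY KEY or UNIQUE constraint requirement in the preceding table, you can compare the list of tables with their constraints. The following Python sketch is a hypothetical helper, not a DTS feature. It assumes that you have already read (table name, constraint type) pairs from a catalog such as information_schema.TABLE_CONSTRAINTS, and it checks only whether a constraint exists.

```python
# Sketch: list tables that have neither a PRIMARY KEY nor a UNIQUE constraint.
# Inputs are plain Python values that the caller collected beforehand.

KEY_CONSTRAINT_TYPES = ("PRIMARY KEY", "UNIQUE")


def tables_without_unique_key(all_tables, constraints):
    """all_tables: iterable of table names.
    constraints: iterable of (table_name, constraint_type) pairs.
    Returns the sorted names of tables that have no key constraint."""
    keyed = {
        table
        for table, constraint_type in constraints
        if constraint_type.upper() in KEY_CONSTRAINT_TYPES
    }
    return sorted(set(all_tables) - keyed)


if __name__ == "__main__":
    # Hypothetical table names for illustration.
    tables = ["orders", "order_items", "audit_log"]
    found = [("orders", "PRIMARY KEY"), ("order_items", "UNIQUE"), ("audit_log", "FOREIGN KEY")]
    print(tables_without_unique_key(tables, found))
```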

Synchronize data from a PolarDB for MySQL cluster to a PolarDB-X instance

The following table describes the precautions and limits.
Type Description
Limits on the source database
  • Each table to be synchronized from the source database must have a PRIMARY KEY or UNIQUE constraint, and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
  • The following requirements for binary logs must be met:
    • The binary logging feature must be enabled. The value of the binlog_format parameter must be set to row. The value of the binlog_row_image parameter must be set to full. Otherwise, error messages are returned during precheck and the data synchronization task cannot be started.
    • Binary logs must be retained for at least 7 days during initial full data synchronization. After initial full data synchronization is complete, you can clear the binary logs that were generated in the source database after the DTS task started.
      Note To ensure data security, DTS stores only 50 GB of binary logs or the binary logs for the last 24 hours. If the limit is exceeded, DTS automatically clears the cached logs.
      Warning If you clear the binary logs of the source database during initial full data synchronization, the data synchronization task may fail. For example, initial full data synchronization may take more than 24 hours because of a large data volume in the source database and abnormal writes to the destination database. In this case, DTS cannot obtain binary logs that were generated more than 24 hours earlier if they have been cleared from the source database, and the data synchronization task may fail.
Other limits
  • Requirements for the objects to be synchronized:
    • DTS does not synchronize the following types of data: BIT, VARBIT, GEOMETRY, ARRAY, UUID, TSQUERY, TSVECTOR, and TXID_SNAPSHOT.
    • Prefix indexes cannot be synchronized. If the source database contains prefix indexes, data may fail to be synchronized.
  • Initial schema synchronization is not supported. Before you configure a data synchronization task, you must create databases and tables in the destination instance.
  • Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. During initial full data synchronization, DTS uses read and write resources of the source and destination databases. This may increase the loads of the database servers.
  • During initial full data synchronization, concurrent INSERT operations cause fragmentation in the tables of the destination database. After initial full data synchronization is complete, the tablespace of the destination database is larger than that of the source database.
  • We recommend that you do not use gh-ost or pt-online-schema-change to perform data definition language (DDL) operations on source tables during data synchronization. Otherwise, data synchronization may fail.
  • If you use only DTS to write data to the destination database, you can use Data Management (DMS) to perform online DDL operations on source tables during data synchronization. For more information, see Change schemas without locking tables.
    Warning If you use tools other than DTS to write data to the destination database, we recommend that you do not use DMS to perform online DDL operations. Otherwise, data loss may occur in the destination database.
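
Because prefix indexes cannot be synchronized, it can help to list them before you configure the task. The following Python sketch is an illustration under stated assumptions, not a DTS tool. It assumes that you have read index metadata rows such as those in information_schema.STATISTICS, where the SUB_PART column is NULL (None in Python) when the whole column is indexed. The function name and the sample rows are hypothetical.

```python
# Sketch: pick out prefix indexes from index metadata rows.
# Each row is (table_name, index_name, column_name, sub_part).


def find_prefix_indexes(index_rows):
    """Return the rows whose sub_part is set, which means that only a
    prefix of the column is indexed."""
    return [row for row in index_rows if row[3] is not None]


if __name__ == "__main__":
    # Hypothetical metadata rows for illustration.
    rows = [
        ("customers", "PRIMARY", "id", None),
        ("customers", "idx_email", "email", 20),  # index on the first 20 characters
        ("orders", "idx_created", "created_at", None),
    ]
    for table, index, column, length in find_prefix_indexes(rows):
        print(f"{table}.{index} indexes only the first {length} characters of {column}")
```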

Synchronize data from a PolarDB for MySQL cluster to an AnalyticDB for MySQL cluster

The following table describes the precautions and limits.
Type Description
Limits on the source database
  • Each table to be synchronized from the source database must have a PRIMARY KEY or UNIQUE constraint, and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
  • The following requirements for binary logs must be met:
    • The binary logging feature must be enabled. The value of the binlog_format parameter must be set to row. The value of the binlog_row_image parameter must be set to full. Otherwise, error messages are returned during precheck and the data synchronization task cannot be started.
    • Binary logs must be retained for at least 7 days during initial full data synchronization. After initial full data synchronization is complete, you can clear the binary logs that were generated in the source database after the DTS task started.
      Note To ensure data security, DTS stores only 50 GB of binary logs or the binary logs for the last 24 hours. If the limit is exceeded, DTS automatically clears the cached logs.
      Warning If you clear the binary logs of the source database during initial full data synchronization, the data synchronization task may fail. For example, initial full data synchronization may take more than 24 hours because of a large data volume in the source database and abnormal writes to the destination database. In this case, DTS cannot obtain binary logs that were generated more than 24 hours earlier if they have been cleared from the source database, and the data synchronization task may fail.
Other limits
  • Prefix indexes cannot be synchronized. If the source database contains prefix indexes, data may fail to be synchronized.
  • Due to the limits of AnalyticDB for MySQL, if the disk space usage of the nodes in an AnalyticDB for MySQL cluster reaches 80%, the task is delayed and error messages are returned. We recommend that you estimate the required disk space based on the objects that you want to synchronize. You must ensure that the destination cluster has sufficient storage space.
  • Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. During initial full data synchronization, DTS uses read and write resources of the source and destination databases. This may increase the loads of the database servers.
  • During initial full data synchronization, concurrent INSERT operations cause fragmentation in the tables of the destination database. After initial full data synchronization is complete, the tablespace of the destination database is larger than that of the source database.
  • We recommend that you do not use gh-ost or pt-online-schema-change to perform data definition language (DDL) operations on source tables during data synchronization. Otherwise, data synchronization may fail.
  • If you use only DTS to write data to the destination database, you can use Data Management (DMS) to perform online DDL operations on source tables during data synchronization. For more information, see Change schemas without locking tables.
    Warning If you use tools other than DTS to write data to the destination database, we recommend that you do not use DMS to perform online DDL operations. Otherwise, data loss may occur in the destination database.
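
The 80% disk usage threshold in the preceding table can be turned into a simple capacity estimate. The following Python sketch is a rough illustration and not an official sizing method. All input values are hypothetical, and the calculation ignores compression, indexes, and fragmentation in the destination cluster, so treat the result as a lower bound on the space that you need.

```python
# Sketch: estimate whether a destination node stays below the 80% disk usage
# threshold after the data to be synchronized is added.

USAGE_LIMIT = 0.80  # threshold stated in the preceding table


def projected_usage(used_gb, capacity_gb, incoming_gb):
    """Return the projected disk usage ratio after incoming_gb is written."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return (used_gb + incoming_gb) / capacity_gb


def has_headroom(used_gb, capacity_gb, incoming_gb):
    """Return True if the projected usage stays below the threshold."""
    return projected_usage(used_gb, capacity_gb, incoming_gb) < USAGE_LIMIT


if __name__ == "__main__":
    # Hypothetical node: 1,000 GB capacity, 300 GB used, 400 GB to be synchronized.
    ratio = projected_usage(300, 1000, 400)
    print(f"projected usage: {ratio:.0%}, headroom: {has_headroom(300, 1000, 400)}")
```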

Synchronize data from a PolarDB for MySQL cluster to an AnalyticDB for PostgreSQL instance

The following table describes the precautions and limits.
Type Description
Limits on the source database
  • Each table to be synchronized from the source database must have a PRIMARY KEY or UNIQUE constraint, and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
  • The following requirements for binary logs must be met:
    • The binary logging feature must be enabled. The value of the binlog_format parameter must be set to row. The value of the binlog_row_image parameter must be set to full. Otherwise, error messages are returned during precheck and the data synchronization task cannot be started.
    • Binary logs must be retained for at least 7 days during initial full data synchronization. After initial full data synchronization is complete, you can clear the binary logs that were generated in the source database after the DTS task started.
      Note To ensure data security, DTS stores only 50 GB of binary logs or the binary logs for the last 24 hours. If the limit is exceeded, DTS automatically clears the cached logs.
      Warning If you clear the binary logs of the source database during initial full data synchronization, the data synchronization task may fail. For example, initial full data synchronization may take more than 24 hours because of a large data volume in the source database and abnormal writes to the destination database. In this case, DTS cannot obtain binary logs that were generated more than 24 hours earlier if they have been cleared from the source database, and the data synchronization task may fail.
Other limits
  • Requirements for the objects to be synchronized:
    • Only tables can be selected as the objects to be synchronized.
    • DTS does not synchronize the following types of data: BIT, VARBIT, GEOMETRY, ARRAY, UUID, TSQUERY, TSVECTOR, and TXID_SNAPSHOT.
    • Prefix indexes cannot be synchronized. If the source database contains prefix indexes, data may fail to be synchronized.
  • Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. During initial full data synchronization, DTS uses read and write resources of the source and destination databases. This may increase the loads of the database servers.
  • During initial full data synchronization, concurrent INSERT operations cause fragmentation in the tables of the destination database. After initial full data synchronization is complete, the tablespace of the destination database is larger than that of the source database.
  • We recommend that you do not use gh-ost or pt-online-schema-change to perform data definition language (DDL) operations on source tables during data synchronization. Otherwise, data synchronization may fail.
  • If you use only DTS to write data to the destination database, you can use Data Management (DMS) to perform online DDL operations on source tables during data synchronization. For more information, see Change schemas without locking tables.
    Warning If you use tools other than DTS to write data to the destination database, we recommend that you do not use DMS to perform online DDL operations. Otherwise, data loss may occur in the destination database.
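
To see which columns are affected by the unsupported data types in the preceding table, you can filter column metadata against that list. The following Python sketch is an illustration and not a DTS feature. It assumes that you have collected (table, column, data type) rows yourself, for example from information_schema.COLUMNS. The function name and the sample rows are hypothetical.

```python
# Sketch: flag columns whose data type appears in the unsupported list above.

UNSUPPORTED_TYPES = frozenset(
    {"BIT", "VARBIT", "GEOMETRY", "ARRAY", "UUID", "TSQUERY", "TSVECTOR", "TXID_SNAPSHOT"}
)


def unsupported_columns(columns):
    """columns: iterable of (table_name, column_name, data_type).
    Returns (table_name, column_name, base_type) for each unsupported column."""
    flagged = []
    for table, column, data_type in columns:
        # Drop a length suffix such as "(8)" and normalize the case.
        base_type = data_type.split("(")[0].strip().upper()
        if base_type in UNSUPPORTED_TYPES:
            flagged.append((table, column, base_type))
    return flagged


if __name__ == "__main__":
    # Hypothetical column metadata for illustration.
    sample = [
        ("devices", "flags", "bit(8)"),
        ("devices", "id", "bigint"),
        ("regions", "shape", "geometry"),
    ]
    for table, column, base_type in unsupported_columns(sample):
        print(f"{table}.{column} uses unsupported type {base_type}")
```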

Synchronize data from a PolarDB for MySQL cluster to a DataHub project

The following table describes the precautions and limits.
Type Description
Limits on the source database
  • Each table to be synchronized from the source database must have a PRIMARY KEY or UNIQUE constraint, and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
  • The following requirements for binary logs must be met:
    • The binary logging feature must be enabled. The value of the binlog_format parameter must be set to row. The value of the binlog_row_image parameter must be set to full. Otherwise, error messages are returned during precheck and the data synchronization task cannot be started.
    • Binary logs must be retained for at least 7 days during initial full data synchronization. After initial full data synchronization is complete, you can clear the binary logs that were generated in the source database after the DTS task started.
      Note To ensure data security, DTS stores only 50 GB of binary logs or the binary logs for the last 24 hours. If the limit is exceeded, DTS automatically clears the cached logs.
      Warning If you clear the binary logs of the source database during initial full data synchronization, the data synchronization task may fail. For example, initial full data synchronization may take more than 24 hours because of a large data volume in the source database and abnormal writes to the destination database. In this case, DTS cannot obtain binary logs that were generated more than 24 hours earlier if they have been cleared from the source database, and the data synchronization task may fail.
Other limits
  • Initial full data synchronization is not supported. DTS does not synchronize historical data of the required objects from the source PolarDB for MySQL cluster to the destination DataHub project.
  • Only tables can be selected as the objects to be synchronized.
  • We recommend that you do not use gh-ost or pt-online-schema-change to perform data definition language (DDL) operations on source tables during data synchronization. Otherwise, data synchronization may fail.
  • If you use only DTS to write data to the destination database, you can use Data Management (DMS) to perform online DDL operations on source tables during data synchronization. For more information, see Change schemas without locking tables.
    Warning If you use tools other than DTS to write data to the destination database, we recommend that you do not use DMS to perform online DDL operations. Otherwise, data loss may occur in the destination database.
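
One way to notice that gh-ost or pt-online-schema-change is being used on source tables is to look for the working tables that these tools create. The following Python sketch is a heuristic illustration, not a DTS check. It relies on the default naming that these tools use (_<table>_gho, _<table>_ghc, and _<table>_del for gh-ost, and _<table>_new and _<table>_old for pt-online-schema-change), so it misses custom names and can flag unrelated tables that follow the same pattern.

```python
import re

# Sketch: find tables that look like gh-ost or pt-online-schema-change working
# tables, based on the default naming conventions of the two tools.
SHADOW_TABLE = re.compile(r"^_(?P<base>.+)_(?P<suffix>gho|ghc|del|new|old)$")


def find_shadow_tables(table_names):
    """Return (shadow_table, base_table) pairs for which the base table
    also exists in table_names."""
    known = set(table_names)
    hits = []
    for name in table_names:
        match = SHADOW_TABLE.match(name)
        if match and match.group("base") in known:
            hits.append((name, match.group("base")))
    return hits


if __name__ == "__main__":
    # Hypothetical table names for illustration.
    names = ["orders", "_orders_gho", "_users_new", "customers"]
    print(find_shadow_tables(names))
```

Requiring the base table to exist reduces false positives from ordinary tables whose names happen to start with an underscore.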

Synchronize data from a PolarDB for MySQL cluster to an Elasticsearch cluster

The following table describes the precautions and limits.
Type Description
Limits on the source database
  • Each table to be synchronized from the source database must have a PRIMARY KEY or UNIQUE constraint, and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
  • The following requirements for binary logs must be met:
    • The binary logging feature must be enabled. The value of the binlog_format parameter must be set to row. The value of the binlog_row_image parameter must be set to full. Otherwise, error messages are returned during precheck and the data synchronization task cannot be started.
    • Binary logs must be retained for at least 7 days during initial full data synchronization. After initial full data synchronization is complete, you can clear the binary logs that were generated in the source database after the DTS task started.
      Note To ensure data security, DTS stores only 50 GB of binary logs or the binary logs for the last 24 hours. If the limit is exceeded, DTS automatically clears the cached logs.
      Warning If you clear the binary logs of the source database during initial full data synchronization, the data synchronization task may fail. For example, initial full data synchronization may take more than 24 hours because of a large data volume in the source database and abnormal writes to the destination database. In this case, DTS cannot obtain binary logs that were generated more than 24 hours earlier if they have been cleared from the source database, and the data synchronization task may fail.
Other limits
  • Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. During initial full data synchronization, DTS uses read and write resources of the source and destination databases. This may increase the loads of the database servers.
  • During initial full data synchronization, concurrent INSERT operations cause fragmentation in the tables of the destination database. After initial full data synchronization is complete, the tablespace of the destination database is larger than that of the source database.
  • We recommend that you do not use gh-ost or pt-online-schema-change to perform data definition language (DDL) operations on source tables during data synchronization. Otherwise, data synchronization may fail.
  • If you use only DTS to write data to the destination database, you can use Data Management (DMS) to perform online DDL operations on source tables during data synchronization. For more information, see Change schemas without locking tables.
    Warning If you use tools other than DTS to write data to the destination database, we recommend that you do not use DMS to perform online DDL operations. Otherwise, data loss may occur in the destination database.
  • To add columns to a table that you want to synchronize, you must perform the following steps in order: Modify the mappings of the table in the Elasticsearch cluster, perform the DDL operations in the source PolarDB for MySQL cluster, and then pause and restart the data synchronization task.
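
The first step in the column-adding procedure above, modifying the mappings in the Elasticsearch cluster, can be sketched as follows. This Python example is illustrative only: it builds the request body for the Elasticsearch update mapping API (PUT /<index>/_mapping), assuming Elasticsearch 7.x or later with typeless mappings. The field names and field types are hypothetical, and the example does not cover how DTS maps MySQL column types to Elasticsearch field types.

```python
import json

# Sketch: build the JSON body that adds new fields to an existing index mapping.
# Sending the request (for example, with an HTTP client) is left to the caller.


def build_mapping_update(new_fields):
    """new_fields: dict that maps a field name to an Elasticsearch field type.
    Returns the body for PUT /<index>/_mapping."""
    if not new_fields:
        raise ValueError("at least one new field is required")
    return {
        "properties": {
            name: {"type": field_type} for name, field_type in new_fields.items()
        }
    }


if __name__ == "__main__":
    # Hypothetical new columns: nickname (text key) and signup_ts (timestamp).
    body = build_mapping_update({"nickname": "keyword", "signup_ts": "date"})
    print(json.dumps(body, indent=2))
```

Updating the mapping before you run the DDL statement in the source cluster means that the new fields already exist in the index when the first changed rows arrive.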

Synchronize data from a PolarDB for MySQL cluster to a Message Queue for Apache Kafka instance or a self-managed Kafka cluster

The following table describes the precautions and limits.
Type Description
Limits on the source database
  • Each table to be synchronized from the source database must have a PRIMARY KEY or UNIQUE constraint, and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
  • The following requirements for binary logs must be met:
    • The binary logging feature must be enabled. The value of the binlog_format parameter must be set to row. The value of the binlog_row_image parameter must be set to full. Otherwise, error messages are returned during precheck and the data synchronization task cannot be started.
    • Binary logs must be retained for at least 7 days during initial full data synchronization. After initial full data synchronization is complete, you can clear the binary logs that were generated in the source database after the DTS task started.
      Note To ensure data security, DTS stores only 50 GB of binary logs or the binary logs for the last 24 hours. If the limit is exceeded, DTS automatically clears the cached logs.
      Warning If you clear the binary logs of the source database during initial full data synchronization, the data synchronization task may fail. For example, initial full data synchronization may take more than 24 hours because of a large data volume in the source database and abnormal writes to the destination database. In this case, DTS cannot obtain binary logs that were generated more than 24 hours earlier if they have been cleared from the source database, and the data synchronization task may fail.
Other limits
  • Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. During initial full data synchronization, DTS uses read and write resources of the source and destination databases. This may increase the loads of the database servers.
  • During initial full data synchronization, concurrent INSERT operations cause fragmentation in the tables of the destination database. After initial full data synchronization is complete, the tablespace of the destination database is larger than that of the source database.
  • We recommend that you do not use gh-ost or pt-online-schema-change to perform data definition language (DDL) operations on source tables during data synchronization. Otherwise, data synchronization may fail.
  • If you use only DTS to write data to the destination database, you can use Data Management (DMS) to perform online DDL operations on source tables during data synchronization. For more information, see Change schemas without locking tables.
    Warning If you use tools other than DTS to write data to the destination database, we recommend that you do not use DMS to perform online DDL operations. Otherwise, data loss may occur in the destination database.
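
The binary log retention requirement in the preceding table can be compared with how long you expect initial full data synchronization to take. The following Python sketch is a planning aid under stated assumptions, not a DTS feature. The retention period and the estimated duration are values that you supply, and the function name is hypothetical.

```python
# Sketch: check a binary log retention period against the 7-day minimum and
# against the estimated duration of initial full data synchronization.

MIN_RETENTION_HOURS = 7 * 24  # 7-day minimum stated in the preceding table


def retention_problems(retention_hours, estimated_full_sync_hours):
    """Return a list of problems; an empty list means the retention period
    satisfies both checks."""
    problems = []
    if retention_hours < MIN_RETENTION_HOURS:
        problems.append(
            f"retention is {retention_hours} h, below the {MIN_RETENTION_HOURS} h minimum"
        )
    if estimated_full_sync_hours >= retention_hours:
        problems.append(
            "binary logs may expire before initial full data synchronization is complete"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical values: 24 hours of retention, 30 hours of full synchronization.
    for problem in retention_problems(24, 30):
        print(problem)
```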