This topic describes the precautions and limits when you synchronize data from a PostgreSQL database, such as a self-managed PostgreSQL database and an ApsaraDB RDS for PostgreSQL database. To ensure that your data synchronization task runs as expected, read the precautions and limits before you configure the task.

Scenarios of synchronizing data from a PostgreSQL database

You can view the precautions and limits based on the following synchronization scenarios:

Synchronize data from an ApsaraDB RDS for PostgreSQL instance to an ApsaraDB RDS for MySQL instance

The following table describes the precautions and limits.
Type Description
Limits on the source database
  • The source database must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
  • The following requirements for data logs must be met:
    • The data logging feature must be enabled.
    • Data logs are retained for at least 7 days during initial full data synchronization. You can wait until initial full data synchronization is complete, and then clear the data logs generated in the source database after the DTS task is run.
      Note To ensure data security, the DTS server stores only 50 GB of data logs or the data logs for the last 24 hours. If the limit is exceeded, the DTS server automatically clears the cached logs.
      Warning If you clear the data logs of the source database during initial full data synchronization, the data synchronization task may fail. For example, initial full data synchronization takes more than 24 hours due to the large data volume in the source database and abnormal writing in the destination database. In this case, if the data logs of the source database are cleared during initial full data synchronization, DTS cannot obtain the data logs generated 24 hours ago. Therefore, the data synchronization task may fail.
  • Limits on operations:
    To ensure that the data synchronization task runs as expected, you can perform primary/secondary switchover only on an ApsaraDB RDS for PostgreSQL instance V11. In this case, you must set the rds_failover_slot_mode parameter to sync. For more information, see Use the failover slot feature for logical subscriptions.
    Warning If you perform primary/secondary switchover on a self-managed PostgreSQL database or an ApsaraDB RDS for PostgreSQL instance of other versions, the data synchronization task stops.
Other limits
  • You can configure a data synchronization task for this scenario only in the DTS console of the new version.
  • A single data synchronization task can synchronize data from only one database. To synchronize data from multiple databases, you must create a data synchronization task for each database.
  • During data synchronization, if you select a schema as the object to be synchronized, take note of the following limits: If you create a table in the schema or run the RENAME command to rename the table, you must run the ALTER TABLE schema.table REPLICA IDENTITY FULL; command before you write data to the table. This ensures data consistency.
    Note Replace the schema and table in the preceding sample command with the actual schema name and table name.
  • To ensure that the latency of data synchronization is accurate, DTS adds a heartbeat table to the source database. The name of the heartbeat table is dts_postgres_heartbeat.
  • During data synchronization, DTS creates a replication slot for the source database. The replication slot is prefixed with dts_sync_. DTS automatically clears historical replication slots every 90 minutes to reduce storage usage.
    Note If the data synchronization task is released or fails, DTS automatically clears the replication slot. If a primary/secondary switchover is performed on the source ApsaraDB RDS for PostgreSQL instance, you must log on to the secondary database to clear the replication slot.
    Replication slot information
  • Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. During initial full data synchronization, DTS uses read and write resources of the source and destination databases. This may increase the loads of the database servers.
  • During initial full data synchronization, concurrent INSERT operations cause fragmentation in the tables of the destination database. After initial full data synchronization is complete, the tablespace of the destination database is larger than that of the source database.
  • We recommend that you do not use gh-ost or pt-online-schema-change to perform data definition language (DDL) operations on source tables during data synchronization. Otherwise, data synchronization may fail.
  • If you use only DTS to write data to the destination database, you can use Data Management (DMS) to perform online DDL operations on source tables during data synchronization. For more information, see Change schemas without locking tables.
    Warning If you use tools other than DTS to write data to the destination database, we recommend that you do not use DMS to perform online DDL operations. Otherwise, data loss may occur in the destination database.
Special cases
  • If the source instance is an ApsaraDB RDS for PostgreSQL instance, take note of the following limits:

    During data synchronization, do not modify the endpoint and zone of the ApsaraDB RDS for PostgreSQL instance. Otherwise, the data synchronization task fails.

Synchronize data from a self-managed PostgreSQL database or an ApsaraDB RDS for PostgreSQL instance to an AnalyticDB for PostgreSQL instance

The following table describes the precautions and limits.
Type Description
Limits on the source database
  • The source database must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
  • The following requirements for data logs must be met:
    • The data logging feature must be enabled.
    • Data logs are retained for at least 7 days during initial full data synchronization. You can wait until initial full data synchronization is complete, and then clear the data logs generated in the source database after the DTS task is run.
      Note To ensure data security, the DTS server stores only 50 GB of data logs or the data logs for the last 24 hours. If the limit is exceeded, the DTS server automatically clears the cached logs.
      Warning If you clear the data logs of the source database during initial full data synchronization, the data synchronization task may fail. For example, initial full data synchronization takes more than 24 hours due to the large data volume in the source database and abnormal writing in the destination database. In this case, if the data logs of the source database are cleared during initial full data synchronization, DTS cannot obtain the data logs generated 24 hours ago. Therefore, the data synchronization task may fail.
  • Limits on operations:
    To ensure that the data synchronization task runs as expected, you can perform primary/secondary switchover only on an ApsaraDB RDS for PostgreSQL instance V11. In this case, you must set the rds_failover_slot_mode parameter to sync. For more information, see Use the failover slot feature for logical subscriptions.
    Warning If you perform primary/secondary switchover on a self-managed PostgreSQL database or an ApsaraDB RDS for PostgreSQL instance of other versions, the data synchronization task stops.
Other limits
  • Requirements for the objects to be synchronized:
    • Only tables can be selected as the objects to be synchronized.
    • DTS does not synchronize the following types of data: BIT, VARBIT, GEOMETRY, ARRAY, UUID, TSQUERY, TSVECTOR, and TXID_SNAPSHOT.
  • A single data synchronization task can synchronize data from only one database. To synchronize data from multiple databases, you must create a data synchronization task for each database.
  • During data synchronization, if you select a schema as the object to be synchronized, take note of the following limits: If you create a table in the schema or run the RENAME command to rename the table, you must run the ALTER TABLE schema.table REPLICA IDENTITY FULL; command before you write data to the table. This ensures data consistency.
    Note Replace the schema and table in the preceding sample command with the actual schema name and table name.
  • To ensure that the latency of data synchronization is accurate, DTS adds a heartbeat table to the source database. The name of the heartbeat table is dts_postgres_heartbeat.
  • During data synchronization, DTS creates a replication slot for the source database. The replication slot is prefixed with dts_sync_. DTS automatically clears historical replication slots every 90 minutes to reduce storage usage.
    Note If the data synchronization task is released or fails, DTS automatically clears the replication slot. If a primary/secondary switchover is performed on the source ApsaraDB RDS for PostgreSQL instance, you must log on to the secondary database to clear the replication slot.
    Replication slot information
  • Before you synchronize data, evaluate the impact of data synchronization on the performance of the source and destination databases. We recommend that you synchronize data during off-peak hours. During initial full data synchronization, DTS uses read and write resources of the source and destination databases. This may increase the loads of the database servers.
  • During initial full data synchronization, concurrent INSERT operations cause fragmentation in the tables of the destination database. After initial full data synchronization is complete, the tablespace of the destination database is larger than that of the source database.
  • We recommend that you do not use gh-ost or pt-online-schema-change to perform data definition language (DDL) operations on source tables during data synchronization. Otherwise, data synchronization may fail.
  • If you use only DTS to write data to the destination database, you can use Data Management (DMS) to perform online DDL operations on source tables during data synchronization. For more information, see Change schemas without locking tables.
    Warning If you use tools other than DTS to write data to the destination database, we recommend that you do not use DMS to perform online DDL operations. Otherwise, data loss may occur in the destination database.
Special cases
  • If the source instance is an ApsaraDB RDS for PostgreSQL instance, take note of the following limits:

    During data synchronization, do not modify the endpoint and zone of the ApsaraDB RDS for PostgreSQL instance. Otherwise, the data synchronization task fails.