Precautions and limits for migrating data from a PostgreSQL database - Data Transmission Service

This topic describes the precautions and limits that you must take note of when you migrate data from a PostgreSQL database, such as a self-managed PostgreSQL database or an ApsaraDB RDS for PostgreSQL instance. To ensure that your data migration task runs as expected, read the precautions and limits before you configure the task.

Scenarios of migrating data from a PostgreSQL database

The following list provides the scenarios of migrating data from a PostgreSQL database. The precautions and limits in the scenarios may vary. You can go to the related section to view the precautions and limits in a specific scenario.

Migrate data between PostgreSQL databases
Migrate data from a PostgreSQL database to a MySQL database
Migrate data from a self-managed PostgreSQL database to a PolarDB for PostgreSQL (Compatible with Oracle) cluster

Migrate data between PostgreSQL databases

Migrate data between ApsaraDB RDS for PostgreSQL instances

Category	Description
Limits on the source database	The tables to be migrated must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
	The name of the source database cannot contain hyphens (-). For example, dts-testdata is not a valid name.
	If you select tables as the objects to be migrated and you need to edit tables, such as renaming tables or columns in the destination database, up to 1,000 tables can be migrated in a single data migration task. If you run a task to migrate more than 1,000 tables, a request error occurs. In this case, we recommend that you configure multiple tasks to migrate the tables or configure a task to migrate the entire database.
	If you need to migrate incremental data, make sure that the following requirements are met: The `wal_level` parameter must be set to logical. If you perform only incremental data migration, the WAL logs of the source database must be stored for more than 24 hours. If you perform both full data migration and incremental data migration, the WAL logs of the source database must be stored for at least seven days. Otherwise, DTS may fail to obtain the WAL logs and the task may fail. In exceptional circumstances, data inconsistency or loss may occur. After full data migration is complete, you can set the retention period to more than 24 hours. Make sure that you set the retention period of WAL logs based on the preceding requirements. Otherwise, the service reliability or performance in the Service Level Agreement (SLA) of DTS may not be guaranteed.
	Limits on operations to be performed on the source database: During schema migration and full data migration, do not perform DDL operations to change the schemas of databases or tables. Otherwise, the data migration task fails. If you perform only full data migration, do not write data to the source database during data migration. Otherwise, data inconsistency between the source and destination databases occurs. To ensure data consistency, we recommend that you select Schema Migration, Full Data Migration, and Incremental Data Migration as the migration types.
	If the source database has one or more long-running transactions and incremental data is migrated in the data migration task, the WAL logs that are generated before the long-running transactions are committed may not be cleared and can accumulate, which may result in insufficient storage space in the source database.
Other limits	If you need to perform a primary/secondary switchover on the source ApsaraDB RDS for PostgreSQL instance, the Logical Replication Slot Failover feature must be enabled. This prevents logical subscriptions from being interrupted and ensures that your data migration task can run as expected. For more information, see Logical replication slot failover.
	A single data migration task can migrate data from only one database. To migrate data from multiple databases, you must create a data migration task for each database.
	If you select a schema as the object to be migrated and create a table in the schema or execute the RENAME statement to rename a table in the schema during incremental data migration, you must execute the `ALTER TABLE schema.table REPLICA IDENTITY FULL;` statement before you write data to the table. Note: Replace `schema` and `table` in the preceding sample statement with the actual schema name and table name.
	DTS does not check the validity of metadata such as sequences. You must manually check the validity of metadata.
	After your workloads are switched to the destination database, newly written sequences do not increment from the maximum value of the sequences in the source database. Therefore, before you switch your workloads to the destination database, you must query the maximum value of the sequences in the source database and specify the queried maximum value as the initial value of the sequences in the destination database. You can execute the following statements to query the maximum value of the sequences in the source database: `do language plpgsql $$ declare nsp name; rel name; val int8; begin for nsp,rel in select nspname,relname from pg_class t2 , pg_namespace t3 where t2.relnamespace=t3.oid and t2.relkind='S' loop execute format($_$select last_value from %I.%I$_$, nsp, rel) into val; raise notice '%', format($_$select setval('%I.%I'::regclass, %s);$_$, nsp, rel, val+1); end loop; end; $$;`
	DTS creates the following temporary tables in the source database to obtain the DDL statements of incremental data, the schemas of incremental tables, and the heartbeat information: `public.dts_pg_class`, `public.dts_pg_attribute`, `public.dts_pg_type`, `public.dts_pg_enum`, `public.dts_postgres_heartbeat`, `public.dts_ddl_command`, and `public.dts_args_session`. During data migration, do not delete these temporary tables from the source database. Otherwise, exceptions occur. After the DTS instance is released, the temporary tables are automatically deleted.
	If you run a full or incremental data migration task, the tables to be migrated from the source database contain foreign keys, triggers, or event triggers, and the account of the destination database is a privileged account or an account that has the permissions of the superuser role, DTS temporarily sets the `session_replication_role` parameter to replica at the session level during full or incremental data migration. If the account of the destination database does not have these permissions, you must manually set the `session_replication_role` parameter to replica in the destination database. While the `session_replication_role` parameter is set to replica during full or incremental data migration, data inconsistency may occur if a cascade update or delete operation is performed in the source database. After the DTS migration task is released, you can change the value of the `session_replication_role` parameter back to origin.
	To ensure that the latency of incremental data migration is accurate, DTS creates a heartbeat table named `dts_postgres_heartbeat` in the source database.
	During incremental data migration, DTS creates a replication slot for the source database. The replication slot is prefixed with `dts_sync_`. By using this replication slot, DTS can obtain the incremental logs of the source database within the last 15 minutes. Note: After the DTS instance is released, the replication slot is automatically deleted. If you modify the password of the source database or delete the IP address whitelist of DTS, the replication slot cannot be automatically deleted. In that case, you must manually delete the replication slot from the source database to prevent replication slots from accumulating. If the data migration task is released or fails, DTS automatically clears the replication slot. If a primary/secondary switchover is performed on the source ApsaraDB RDS for PostgreSQL instance, you must log on to the secondary instance to clear the replication slot.
	Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours. During full data migration, DTS uses the read and write resources of the source and destination databases. This may increase the loads of the database servers.
	During full data migration, concurrent INSERT operations cause fragmentation in the tables of the destination database. After full data migration is complete, the tablespace of the destination database is larger than that of the source database.
	Make sure that the precision settings for columns of the FLOAT or DOUBLE data type meet your business requirements. DTS uses the `ROUND(COLUMN,PRECISION)` function to retrieve values from columns of the FLOAT or DOUBLE data type. If you do not specify a precision, DTS sets the precision for the FLOAT data type to 38 digits and the precision for the DOUBLE data type to 308 digits.
	DTS attempts to resume data migration tasks that failed within the last seven days. Before you switch workloads to the destination database, you must stop or release the failed tasks. You can also execute the `REVOKE` statement to revoke the write permissions from the accounts that are used by DTS to access the destination database. Otherwise, the data in the source database will overwrite the data in the destination database after the failed task is resumed.
Special cases	During data migration, do not modify the endpoint or zone of the source ApsaraDB RDS for PostgreSQL instance. Otherwise, the data migration task fails.
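
The incremental migration requirements in the preceding table can be spot-checked on the source database before you configure the task. The following statements are a minimal sketch, not a DTS-provided procedure. They assume PostgreSQL 10 or later (for `pg_current_wal_lsn` and `pg_wal_lsn_diff`), a connection to the primary node, and an account that can read `pg_stat_activity` and `pg_replication_slots`. The five-minute threshold is an arbitrary illustration, not a DTS requirement.

```sql
-- Confirm that logical decoding is enabled. The expected value is "logical".
SHOW wal_level;

-- List transactions that have been open for more than five minutes.
-- Long-running transactions can prevent WAL logs from being cleared.
SELECT pid, usename, state, xact_start, now() - xact_start AS open_for
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '5 minutes'
ORDER BY xact_start;

-- Estimate how much WAL each replication slot is holding back.
SELECT slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```

If the last query shows a steadily growing `retained_wal` value for a slot, that is a hint that WAL logs are accumulating and storage usage on the source database is worth monitoring.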

Migrate data from a self-managed PostgreSQL database to an ApsaraDB RDS for PostgreSQL instance

Category	Description
Limits on the source database	The server on which the source database resides must have sufficient outbound bandwidth. Otherwise, the data migration speed is affected.
	The tables to be migrated must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
	The name of the source database cannot contain hyphens (-). For example, dts-testdata is not a valid name.
	If you select tables as the objects to be migrated and you need to edit tables, such as renaming tables or columns in the destination database, up to 1,000 tables can be migrated in a single data migration task. If you run a task to migrate more than 1,000 tables, a request error occurs. In this case, we recommend that you configure multiple tasks to migrate the tables or configure a task to migrate the entire database.
	DTS cannot migrate temporary tables in the source database, internal triggers, or some internal procedures and functions written in the C programming language. DTS can migrate custom parameters of the COMPOSITE, ENUM, and RANGE types. The tables to be migrated must have the PRIMARY KEY, FOREIGN KEY, UNIQUE, or CHECK constraints.
	If you need to migrate incremental data, make sure that the following requirements are met: The `wal_level` parameter must be set to logical. If you perform only incremental data migration, the WAL logs of the source database must be stored for more than 24 hours. If you perform both full data migration and incremental data migration, the WAL logs of the source database must be stored for at least seven days. Otherwise, DTS may fail to obtain the WAL logs and the task may fail. In exceptional circumstances, data inconsistency or loss may occur. After full data migration is complete, you can set the retention period to more than 24 hours. Make sure that you set the retention period of WAL logs based on the preceding requirements. Otherwise, the service reliability or performance in the Service Level Agreement (SLA) of DTS may not be guaranteed.
	Limits on operations to be performed on the source database: If you perform a primary/secondary switchover on the source self-managed PostgreSQL database, the data migration task fails. During schema migration and full data migration, do not perform DDL operations to change the schemas of databases or tables. Otherwise, the data migration task fails.
	If the source database has one or more long-running transactions and incremental data is migrated in the data migration task, the WAL logs that are generated before the long-running transactions are committed may not be cleared and can accumulate, which may result in insufficient storage space in the source database.
Other limits	Data may be inconsistent between the primary and secondary nodes of the source database due to migration latency. Therefore, you must use the primary node as the data source when you migrate data.
	A single data migration task can migrate data from only one database. To migrate data from multiple databases, you must create a data migration task for each database.
	If you select a schema as the object to be migrated and create a table in the schema or execute the RENAME statement to rename a table in the schema during incremental data migration, you must execute the `ALTER TABLE schema.table REPLICA IDENTITY FULL;` statement before you write data to the table. Note: Replace `schema` and `table` in the preceding sample statement with the actual schema name and table name.
	DTS does not check the validity of metadata such as sequences. You must manually check the validity of metadata.
	After your workloads are switched to the destination database, newly written sequences do not increment from the maximum value of the sequences in the source database. Therefore, before you switch your workloads to the destination database, you must query the maximum value of the sequences in the source database and specify the queried maximum value as the initial value of the sequences in the destination database. You can execute the following statements to query the maximum value of the sequences in the source database: `do language plpgsql $$ declare nsp name; rel name; val int8; begin for nsp,rel in select nspname,relname from pg_class t2 , pg_namespace t3 where t2.relnamespace=t3.oid and t2.relkind='S' loop execute format($_$select last_value from %I.%I$_$, nsp, rel) into val; raise notice '%', format($_$select setval('%I.%I'::regclass, %s);$_$, nsp, rel, val+1); end loop; end; $$;`
	DTS creates the following temporary tables in the source database to obtain the DDL statements of incremental data, the schemas of incremental tables, and the heartbeat information: `public.dts_pg_class`, `public.dts_pg_attribute`, `public.dts_pg_type`, `public.dts_pg_enum`, `public.dts_postgres_heartbeat`, `public.dts_ddl_command`, and `public.dts_args_session`. During data migration, do not delete these temporary tables from the source database. Otherwise, exceptions occur. After the DTS instance is released, the temporary tables are automatically deleted.
	To ensure that the latency of incremental data migration is accurate, DTS creates a heartbeat table named `dts_postgres_heartbeat` in the source database.
	During incremental data migration, DTS creates a replication slot for the source database. The replication slot is prefixed with `dts_sync_`. By using this replication slot, DTS can obtain the incremental logs of the source database within the last 15 minutes. Note: If the data migration task is released or fails, DTS automatically clears the replication slot. If a primary/secondary switchover is performed on the source ApsaraDB RDS for PostgreSQL instance, you must log on to the secondary instance to clear the replication slot.
	If you run a full or incremental data migration task, the tables to be migrated from the source database contain foreign keys, triggers, or event triggers, and the account of the destination database is a privileged account or an account that has the permissions of the superuser role, DTS temporarily sets the `session_replication_role` parameter to replica at the session level during full or incremental data migration. If the account of the destination database does not have these permissions, you must manually set the `session_replication_role` parameter to replica in the destination database. While the `session_replication_role` parameter is set to replica during full or incremental data migration, data inconsistency may occur if a cascade update or delete operation is performed in the source database. After the DTS migration task is released, you can change the value of the `session_replication_role` parameter back to origin.
	Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours. During full data migration, DTS uses the read and write resources of the source and destination databases. This may increase the loads of the database servers.
	During full data migration, concurrent INSERT operations cause fragmentation in the tables of the destination database. After full data migration is complete, the tablespace of the destination database is larger than that of the source database.
	Make sure that the precision settings for columns of the FLOAT or DOUBLE data type meet your business requirements. DTS uses the `ROUND(COLUMN,PRECISION)` function to retrieve values from columns of the FLOAT or DOUBLE data type. If you do not specify a precision, DTS sets the precision for the FLOAT data type to 38 digits and the precision for the DOUBLE data type to 308 digits.
	DTS attempts to resume data migration tasks that failed within the last seven days. Before you switch workloads to the destination database, you must stop or release the failed tasks. You can also execute the `REVOKE` statement to revoke the write permissions from the accounts that are used by DTS to access the destination database. Otherwise, the data in the source database will overwrite the data in the destination database after the failed task is resumed.
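
If a replication slot is left behind after a task is released, it can be located by the `dts_sync_` prefix described in the preceding table. The following statements are a sketch under stated assumptions: they use the standard `pg_replication_slots` view and `pg_drop_replication_slot` function of PostgreSQL, and the slot name `dts_sync_example` is hypothetical. Drop a slot only after you confirm that no running DTS task still uses it.

```sql
-- List replication slots whose names start with dts_sync_.
-- left() is used instead of LIKE because "_" is a wildcard in LIKE patterns.
SELECT slot_name, slot_type, active
FROM pg_replication_slots
WHERE left(slot_name, 9) = 'dts_sync_';

-- Drop a leftover slot. 'dts_sync_example' is a hypothetical name;
-- use a name returned by the query above.
-- The slot must be inactive (active = false), otherwise the call fails.
SELECT pg_drop_replication_slot('dts_sync_example');
```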

Migrate data from a PostgreSQL database to a MySQL database

The new version of the DTS console supports the following migration scenarios:

Migrate data from an ApsaraDB RDS for PostgreSQL instance to an ApsaraDB RDS for MySQL instance
Migrate data from a self-managed PostgreSQL database to a self-managed MySQL database

The following table describes the precautions and limits.

Category	Description
Limits on the source database	The server on which the source database resides must have sufficient outbound bandwidth. Otherwise, the data migration speed is affected.
	The tables to be migrated must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
	The name of the source database cannot contain hyphens (-). For example, dts-testdata is not a valid name.
	If you select tables as the objects to be migrated and you need to edit tables, such as renaming tables or columns in the destination database, up to 1,000 tables can be migrated in a single data migration task. If you run a task to migrate more than 1,000 tables, a request error occurs. In this case, we recommend that you configure multiple tasks to migrate the tables or configure a task to migrate the entire database.
	If you need to migrate incremental data, make sure that the following requirements are met: The `wal_level` parameter must be set to logical. If you perform only incremental data migration, the WAL logs of the source database must be stored for more than 24 hours. If you perform both full data migration and incremental data migration, the WAL logs of the source database must be stored for at least seven days. Otherwise, DTS may fail to obtain the WAL logs and the task may fail. In exceptional circumstances, data inconsistency or loss may occur. After full data migration is complete, you can set the retention period to more than 24 hours. Make sure that you set the retention period of WAL logs based on the preceding requirements. Otherwise, the service reliability or performance in the Service Level Agreement (SLA) of DTS may not be guaranteed.
	Limits on operations to be performed on the source database: If you perform a primary/secondary switchover on the source self-managed PostgreSQL database, the data migration task fails. If you need to perform a primary/secondary switchover on the source ApsaraDB RDS for PostgreSQL instance, the Logical Replication Slot Failover feature must be enabled. This prevents logical subscriptions from being interrupted and ensures that your data migration task can run as expected. For more information, see Logical replication slot failover. During full data migration, do not perform DDL operations to change the schemas of databases or tables. Otherwise, the data migration task fails. If you perform only full data migration, do not write data to the source database during data migration. Otherwise, data inconsistency between the source and destination databases occurs. To ensure data consistency, we recommend that you select Full Data Migration and Incremental Data Migration as the migration types.
	If the source database has one or more long-running transactions and incremental data is migrated in the data migration task, the WAL logs that are generated before the long-running transactions are committed may not be cleared and can accumulate, which may result in insufficient storage space in the source database.
Other limits	If DDL statements fail to be written to the destination database, the DTS task continues to run. You can view the DDL statements that fail to be executed in the task logs. For more information about how to view the task logs, see View task logs.
	A single data migration task can migrate data from only one database. To migrate data from multiple databases, you must create a data migration task for each database.
	If you select a schema as the object to be migrated and create a table in the schema or execute the RENAME statement to rename a table in the schema during incremental data migration, you must execute the `ALTER TABLE schema.table REPLICA IDENTITY FULL;` statement before you write data to the table. Note: Replace `schema` and `table` in the preceding sample statement with the actual schema name and table name.
	DTS creates the following temporary tables in the source database to obtain the DDL statements of incremental data, the schemas of incremental tables, and the heartbeat information: `public.dts_pg_class`, `public.dts_pg_attribute`, `public.dts_pg_type`, `public.dts_pg_enum`, `public.dts_postgres_heartbeat`, `public.dts_ddl_command`, and `public.dts_args_session`. The DDL statements are not written to the destination database. During data migration, do not delete these temporary tables from the source database. Otherwise, exceptions occur. After the DTS instance is released, the temporary tables are automatically deleted.
	To ensure that the latency of incremental data migration is accurate, DTS creates a heartbeat table named `dts_postgres_heartbeat` in the source database.
	During incremental data migration, DTS creates a replication slot for the source database. The replication slot is prefixed with `dts_sync_`. By using this replication slot, DTS can obtain the incremental logs of the source database within the last 15 minutes. Note: After the DTS instance is released, the replication slot is automatically deleted. If you modify the password of the source database or delete the IP address whitelist of DTS, the replication slot cannot be automatically deleted. In that case, you must manually delete the replication slot from the source database to prevent replication slots from accumulating. If the data migration task is released or fails, DTS automatically clears the replication slot. If a primary/secondary switchover is performed on the source ApsaraDB RDS for PostgreSQL instance, you must log on to the secondary instance to clear the replication slot.
	Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours. During full data migration, DTS uses the read and write resources of the source and destination databases. This may increase the loads of the database servers.
	During full data migration, concurrent INSERT operations cause fragmentation in the tables of the destination database. After full data migration is complete, the tablespace of the destination database is larger than that of the source database.
	Make sure that the precision settings for columns of the FLOAT or DOUBLE data type meet your business requirements. DTS uses the `ROUND(COLUMN,PRECISION)` function to retrieve values from columns of the FLOAT or DOUBLE data type. If you do not specify a precision, DTS sets the precision for the FLOAT data type to 38 digits and the precision for the DOUBLE data type to 308 digits.
	DTS attempts to resume data migration tasks that failed within the last seven days. Before you switch workloads to the destination database, you must stop or release the failed tasks. You can also execute the `REVOKE` statement to revoke the write permissions from the accounts that are used by DTS to access the destination database. Otherwise, the data in the source database will overwrite the data in the destination database after the failed task is resumed.
Special cases	During data migration, do not modify the endpoint or zone of the source ApsaraDB RDS for PostgreSQL instance. Otherwise, the data migration task fails.
	If you migrate data to an ApsaraDB RDS for MySQL instance, DTS automatically creates a destination database in the ApsaraDB RDS for MySQL instance. However, if the name of the source database is invalid, you must manually create a database in the ApsaraDB RDS for MySQL instance before you configure the data migration task. For more information, see Create a database.
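
The PRIMARY KEY or UNIQUE requirement and the `REPLICA IDENTITY FULL` step in the preceding table can be checked with the PostgreSQL system catalogs. The following is a minimal sketch, assuming an account that can read `pg_class`, `pg_namespace`, and `pg_constraint` in the source database. The query covers only ordinary tables and only declared constraints, so a table that has a unique index but no UNIQUE constraint is still listed. The name `app.orders` is hypothetical.

```sql
-- Ordinary tables in user schemas that have neither a PRIMARY KEY
-- nor a UNIQUE constraint.
SELECT n.nspname AS schema_name,
       c.relname AS table_name,
       c.relreplident AS replica_identity  -- d = default, n = nothing, f = full, i = index
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
        SELECT 1
        FROM pg_constraint con
        WHERE con.conrelid = c.oid
          AND con.contype IN ('p', 'u')
      )
ORDER BY schema_name, table_name;

-- Set the replica identity for a table that was created or renamed
-- during incremental data migration. app.orders is a hypothetical name.
ALTER TABLE app.orders REPLICA IDENTITY FULL;
```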

Migrate data from a self-managed PostgreSQL database to a PolarDB for PostgreSQL (Compatible with Oracle) cluster

The following table describes the precautions and limits.

Category	Description
Limits on the source database	The server on which the source database resides must have sufficient outbound bandwidth. Otherwise, the data migration speed is affected.
	The tables to be migrated must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
	The name of the source database cannot contain hyphens (-). For example, dts-testdata is not a valid name.
	If you select tables as the objects to be migrated and you need to edit tables, such as renaming tables or columns in the destination database, up to 1,000 tables can be migrated in a single data migration task. If you run a task to migrate more than 1,000 tables, a request error occurs. In this case, we recommend that you configure multiple tasks to migrate the tables or configure a task to migrate the entire database.
	If you need to migrate incremental data, make sure that the following requirements are met: The `wal_level` parameter must be set to logical. If you perform only incremental data migration, the WAL logs of the source database must be stored for more than 24 hours. If you perform both full data migration and incremental data migration, the WAL logs of the source database must be stored for at least seven days. Otherwise, DTS may fail to obtain the WAL logs and the task may fail. In exceptional circumstances, data inconsistency or loss may occur. After full data migration is complete, you can set the retention period to more than 24 hours. Make sure that you set the retention period of WAL logs based on the preceding requirements. Otherwise, the service reliability or performance in the Service Level Agreement (SLA) of DTS may not be guaranteed.
	Limits on operations to be performed on the source database: If you perform a primary/secondary switchover on the source self-managed PostgreSQL database, the data migration task fails. During full data migration, do not perform DDL operations to change the schemas of databases or tables. Otherwise, the data migration task fails. If you perform only full data migration, do not write data to the source database during data migration. Otherwise, data inconsistency between the source and destination databases occurs. To ensure data consistency, we recommend that you select Full Data Migration and Incremental Data Migration as the migration types.
	If the source database has one or more long-running transactions and incremental data is migrated in the data migration task, the WAL logs that are generated before the long-running transactions are committed may not be cleared and can accumulate, which may result in insufficient storage space in the source database.
Other limits	Before you configure a data migration task, you must create databases and tables in the destination cluster.
	If you select a schema as the object to be migrated and create a table in the schema or execute the RENAME statement to rename a table in the schema during incremental data migration, you must execute the `ALTER TABLE schema.table REPLICA IDENTITY FULL;` statement before you write data to the table. Note: Replace `schema` and `table` in the preceding sample statement with the actual schema name and table name.
	DTS creates the following temporary tables in the source database to obtain the DDL statements of incremental data, the schemas of incremental tables, and the heartbeat information: `public.dts_pg_class`, `public.dts_pg_attribute`, `public.dts_pg_type`, `public.dts_pg_enum`, `public.dts_postgres_heartbeat`, `public.dts_ddl_command`, and `public.dts_args_session`. During data migration, do not delete these temporary tables from the source database. Otherwise, exceptions occur. After the DTS instance is released, the temporary tables are automatically deleted.
	To ensure that the latency of incremental data migration is accurate, DTS creates a heartbeat table named `dts_postgres_heartbeat` in the source database.
	During incremental data migration, DTS creates a replication slot for the source database. The replication slot is prefixed with `dts_sync_`. By using this replication slot, DTS can obtain the incremental logs of the source database within the last 15 minutes. Note: If the data migration task is released or fails, DTS automatically clears the replication slot. If a primary/secondary switchover is performed on the source ApsaraDB RDS for PostgreSQL instance, you must log on to the secondary instance to clear the replication slot.
	A single data migration task can migrate data from only one database. To migrate data from multiple databases, you must create a data migration task for each database.
	Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours. During full data migration, DTS uses the read and write resources of the source and destination databases. This may increase the loads of the database servers.
	During full data migration, concurrent INSERT operations cause fragmentation in the tables of the destination database. After full data migration is complete, the tablespace of the destination database is larger than that of the source database.
	Make sure that the precision settings for columns of the FLOAT or DOUBLE data type meet your business requirements. DTS uses the `ROUND(COLUMN,PRECISION)` function to retrieve values from columns of the FLOAT or DOUBLE data type. If you do not specify a precision, DTS sets the precision for the FLOAT data type to 38 digits and the precision for the DOUBLE data type to 308 digits.
	DTS attempts to resume data migration tasks that failed within the last seven days. Before you switch workloads to the destination database, you must stop or release the failed tasks. You can also execute the `REVOKE` statement to revoke the write permissions from the accounts that are used by DTS to access the destination database. Otherwise, the data in the source database will overwrite the data in the destination database after the failed task is resumed.
	DTS does not check the validity of metadata such as sequences. You must manually check the validity of metadata.
	After your workloads are switched to the destination database, newly written sequences do not increment from the maximum value of the sequences in the source database. Therefore, before you switch your workloads to the destination database, you must query the maximum value of the sequences in the source database and specify the queried maximum value as the initial value of the sequences in the destination database. You can execute the following statements to query the maximum value of the sequences in the source database: `do language plpgsql $$ declare nsp name; rel name; val int8; begin for nsp,rel in select nspname,relname from pg_class t2 , pg_namespace t3 where t2.relnamespace=t3.oid and t2.relkind='S' loop execute format($_$select last_value from %I.%I$_$, nsp, rel) into val; raise notice '%', format($_$select setval('%I.%I'::regclass, %s);$_$, nsp, rel, val+1); end loop; end; $$;`
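
As a complement to the statement in the preceding table, the sequence values can also be read from the `pg_sequences` view and then applied on the destination with `setval`. This is a sketch under stated assumptions, not the procedure that DTS prescribes: `pg_sequences` is available in PostgreSQL 10 or later, and the sequence name `public.orders_id_seq` and the value 1001 are hypothetical. The value 1001 stands for a source `last_value` of 1000 plus one, which mirrors the `val+1` in the statement above.

```sql
-- On the source database: list every sequence and its last value.
-- last_value is NULL for a sequence that has not been used yet.
SELECT schemaname, sequencename, last_value
FROM pg_sequences
ORDER BY schemaname, sequencename;

-- On the destination database: continue from the source maximum.
-- public.orders_id_seq and 1001 are hypothetical.
SELECT setval('public.orders_id_seq'::regclass, 1001);
```

Run the `setval` statements on the destination before you switch workloads, so that newly generated values do not collide with migrated rows.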