This topic describes the precautions and limits that you must take note of when you migrate data from a PostgreSQL database, such as a self-managed PostgreSQL database or an ApsaraDB RDS for PostgreSQL instance. To ensure that your data migration task runs as expected, read the precautions and limits before you configure the task.
Scenarios of migrating data from a PostgreSQL database
The following list provides the scenarios of migrating data from a PostgreSQL database. The precautions and limits in the scenarios may vary. You can go to the related section to view the precautions and limits in a specific scenario.
Migrate data between PostgreSQL databases
Migrate data between ApsaraDB RDS for PostgreSQL instances
Category
Description
Limits on the source database
The tables to be migrated must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
The name of the source database cannot contain hyphens (-). For example, dts-testdata is not a valid name.
If you select tables as the objects to be migrated and you need to edit tables, such as renaming tables or columns in the destination database, up to 1,000 tables can be migrated in a single data migration task. If you run a task to migrate more than 1,000 tables, a request error occurs. In this case, we recommend that you configure multiple tasks to migrate the tables or configure a task to migrate the entire database.
If you need to migrate incremental data, you must make sure that the following requirements are met:
The value of the wal_level parameter must be set to logical.
For an incremental data migration, the WAL logs of the source database must be stored for more than 24 hours. For a full data and incremental data migration, the WAL logs of the source database must be stored for at least seven days. Otherwise, DTS may fail to obtain the WAL logs and the task may fail. In exceptional circumstances, data inconsistency or loss may occur. After full data migration is complete, you can set the retention period to more than 24 hours. Make sure that you set the retention period of WAL logs based on the preceding requirements. Otherwise, the service reliability or performance in the Service Level Agreement (SLA) of DTS may not be guaranteed.
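The WAL requirements above can be checked from a SQL session on the source database. The following is a minimal sketch, assuming PostgreSQL 10 or later (for pg_current_wal_lsn and pg_wal_lsn_diff) and an account that can read pg_replication_slots. How the WAL retention period itself is configured depends on the instance type and is not shown here.

```sql
-- Confirm that logical decoding is enabled; the value should be "logical".
SHOW wal_level;

-- Estimate how much WAL each replication slot is holding back.
-- A large or steadily growing value suggests that WAL is accumulating.
SELECT slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```

Run the second query on the primary node, because pg_current_wal_lsn cannot be called on a node that is in recovery.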
Limits on operations to be performed on the source database:
During schema migration and full data migration, do not perform DDL operations to change the schemas of databases or tables. Otherwise, the data migration task fails.
If you perform only full data migration, do not write data to the source database during data migration. Otherwise, data inconsistency between the source and destination databases occurs. To ensure data consistency, we recommend that you select Schema Migration, Full Data Migration, and Incremental Data Migration as the migration types.
If the source database has one or more long-running transactions and incremental data is migrated in the data migration task, the WAL logs that are generated before the long-running transactions are committed may not be cleared and therefore pile up, resulting in insufficient storage space in the source database.
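To spot the long-running transactions described above before they cause WAL logs to pile up, you can query pg_stat_activity on the source database. This is a sketch only: the 30-minute threshold is an assumed value, not one stated by DTS, and should be adjusted to your workload.

```sql
-- List transactions that have been open longer than the assumed threshold.
SELECT pid,
       usename,
       state,
       xact_start,
       now() - xact_start AS transaction_age,
       left(query, 80)    AS query_preview
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '30 minutes'   -- assumed threshold
ORDER BY xact_start;
```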
Other limits
If you need to perform a primary/secondary switchover on the source ApsaraDB RDS for PostgreSQL instance, the Logical Replication Slot Failover feature must be enabled. This prevents logical subscriptions from being interrupted and ensures that your data migration task can run as expected. For more information, see Logical replication slot failover.
A single data migration task can migrate data from only one database. To migrate data from multiple databases, you must create a data migration task for each database.
If you select a schema as the object to be migrated and create a table in the schema or execute the RENAME statement to rename a table in the schema during incremental data migration, you must execute the ALTER TABLE schema.table REPLICA IDENTITY FULL; statement before you write data to the table.
Note: Replace schema and table in the preceding sample statement with the actual schema name and table name.
DTS does not check the validity of metadata such as sequences. You must manually check the validity of metadata.
After your workloads are switched to the destination database, newly written sequences do not increment from the maximum value of the sequences in the source database. Therefore, you must query the maximum value of the sequences in the source database before you switch your workloads to the destination database. Then, you must specify the queried maximum value as the initial value of the sequences in the destination database. You can execute the following statements to query the maximum value of the sequences in the source database:
-- For every sequence in the source database, print a setval statement
-- that you can run on the destination database.
do language plpgsql $$
declare
  nsp name;
  rel name;
  val int8;
begin
  -- relkind = 'S' selects sequences
  for nsp, rel in
    select nspname, relname
    from pg_class t2, pg_namespace t3
    where t2.relnamespace = t3.oid and t2.relkind = 'S'
  loop
    execute format($_$select last_value from %I.%I$_$, nsp, rel) into val;
    raise notice '%', format($_$select setval('%I.%I'::regclass, %s);$_$, nsp, rel, val+1);
  end loop;
end;
$$;
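As an alternative sketch, on PostgreSQL 10 or later the same setval statements can be generated as query rows from the pg_sequences view instead of as notices. This is not the method DTS documents; it assumes that the querying account has privileges on the sequences, because pg_sequences reports last_value as NULL otherwise.

```sql
-- Generate one setval statement per sequence, to be run on the destination.
SELECT format('SELECT setval(%L::regclass, %s);',
              quote_ident(schemaname) || '.' || quote_ident(sequencename),
              last_value + 1) AS setval_statement
FROM pg_sequences
WHERE last_value IS NOT NULL;   -- skips sequences that have never been used
```

Returning rows rather than notices can make the generated statements easier to copy or export from a SQL client.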
DTS creates the following temporary tables in the source database to obtain the DDL statements of incremental data, the schemas of incremental tables, and the heartbeat information: public.dts_pg_class, public.dts_pg_attribute, public.dts_pg_type, public.dts_pg_enum, public.dts_postgres_heartbeat, public.dts_ddl_command, and public.dts_args_session. During data migration, do not delete these temporary tables from the source database. Otherwise, exceptions occur. After the DTS instance is released, the temporary tables are automatically deleted.
If the tables to be migrated from the source database contain foreign keys, triggers, or event triggers, and the account of the destination database is a privileged account or an account that has the permissions of the superuser role, DTS temporarily sets the session_replication_role parameter to replica at the session level during full or incremental data migration. If the account of the destination database does not have these permissions, you must manually set the session_replication_role parameter to replica in the destination database. While the session_replication_role parameter is set to replica during full or incremental data migration, a cascade update or delete operation performed in the source database may cause data inconsistency. After the DTS migration task is released, you can change the value of the session_replication_role parameter back to origin.
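The text above does not specify how to set session_replication_role manually. One possible approach on the destination database is sketched below, assuming a privileged account runs it and that dts_user is a hypothetical name for the account DTS uses. On a managed instance, the parameter may instead need to be changed through the console.

```sql
-- Check the current value; the PostgreSQL default is "origin".
SHOW session_replication_role;

-- Assumption: "dts_user" is a hypothetical name for the DTS account.
-- New sessions opened by this role then start in replica mode.
ALTER ROLE dts_user SET session_replication_role = 'replica';

-- After the DTS migration task is released, change the value back.
ALTER ROLE dts_user SET session_replication_role = 'origin';
```

A role-level setting is used in this sketch because a plain SET statement affects only the session that runs it, not the sessions opened by DTS.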
To ensure that the latency of incremental data migration is accurate, DTS creates a heartbeat table named dts_postgres_heartbeat in the source database.
During incremental data migration, DTS creates a replication slot for the source database. The name of the replication slot is prefixed with dts_sync_. By using this replication slot, DTS can obtain the incremental logs of the source database within the last 15 minutes.
Note:
After the DTS instance is released, the replication slot is automatically deleted. If you modify the password of the source database or delete the IP address whitelist of DTS, the replication slot cannot be automatically deleted. In that case, you must manually delete the replication slot from the source database to prevent replication slots from piling up.
If the data migration task is released or fails, DTS automatically clears the replication slot. If a primary/secondary switchover is performed on the source ApsaraDB RDS for PostgreSQL instance, you must log on to the secondary instance to clear the replication slot.
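When a replication slot has to be cleared manually as described above, the standard PostgreSQL catalog view and function can be used. The following is a sketch; dts_sync_example is a hypothetical slot name, so take the real name from the first query.

```sql
-- List replication slots whose names start with the DTS prefix (9 characters).
SELECT slot_name, slot_type, active, restart_lsn
FROM pg_replication_slots
WHERE left(slot_name, 9) = 'dts_sync_';

-- Drop a leftover slot. "dts_sync_example" is a hypothetical slot name.
-- The slot must be inactive (active = false) before it can be dropped.
SELECT pg_drop_replication_slot('dts_sync_example');
```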
Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours. During full data migration, DTS uses the read and write resources of the source and destination databases. This may increase the loads of the database servers.
During full data migration, concurrent INSERT operations cause fragmentation in the tables of the destination database. After full data migration is complete, the tablespace of the destination database is larger than that of the source database.
Make sure that the precision settings for columns of the FLOAT or DOUBLE data type meet your business requirements. DTS uses the ROUND(COLUMN,PRECISION) function to retrieve values from columns of the FLOAT or DOUBLE data type. If you do not specify a precision, DTS sets the precision for the FLOAT data type to 38 digits and the precision for the DOUBLE data type to 308 digits.
DTS attempts to resume data migration tasks that failed within the last seven days. Before you switch workloads to the destination database, you must stop or release the failed tasks. You can also execute the REVOKE statement to revoke the write permissions from the accounts that are used by DTS to access the destination database. Otherwise, the data in the source database will overwrite the data in the destination database after the failed task is resumed.
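The REVOKE step mentioned above might look like the following on a PostgreSQL destination. This is a sketch under stated assumptions: dts_user is a hypothetical account name and public is assumed to be the schema that holds the migrated tables.

```sql
-- Assumptions: "dts_user" is a hypothetical DTS account and "public" is the
-- schema that holds the migrated tables; adjust both to your environment.
REVOKE INSERT, UPDATE, DELETE, TRUNCATE
    ON ALL TABLES IN SCHEMA public
    FROM dts_user;
```

Revoking table privileges does not restrict an account that has superuser permissions, because superusers bypass permission checks. In that case, stop or release the failed tasks instead.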
Special cases
During data migration, do not modify the endpoint or zone of the source ApsaraDB RDS for PostgreSQL instance. Otherwise, the data migration task fails.
Migrate data from a self-managed PostgreSQL database to an ApsaraDB RDS for PostgreSQL instance
Category
Description
Limits on the source database
The server on which the source database resides must have sufficient outbound bandwidth. Otherwise, the data migration speed is affected.
The tables to be migrated must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
The name of the source database cannot contain hyphens (-). For example, dts-testdata is not a valid name.
If you select tables as the objects to be migrated and you need to edit tables, such as renaming tables or columns in the destination database, up to 1,000 tables can be migrated in a single data migration task. If you run a task to migrate more than 1,000 tables, a request error occurs. In this case, we recommend that you configure multiple tasks to migrate the tables or configure a task to migrate the entire database.
DTS cannot migrate temporary tables in the source database, internal triggers, or some internal procedures and functions written in the C programming language. DTS can migrate custom parameters of the COMPOSITE, ENUM, and RANGE types. The tables to be migrated must have the PRIMARY KEY, FOREIGN KEY, UNIQUE, or CHECK constraints.
If you need to migrate incremental data, you must make sure that the following requirements are met:
The value of the wal_level parameter must be set to logical.
For an incremental data migration, the WAL logs of the source database must be stored for more than 24 hours. For a full data and incremental data migration, the WAL logs of the source database must be stored for at least seven days. Otherwise, DTS may fail to obtain the WAL logs and the task may fail. In exceptional circumstances, data inconsistency or loss may occur. After full data migration is complete, you can set the retention period to more than 24 hours. Make sure that you set the retention period of WAL logs based on the preceding requirements. Otherwise, the service reliability or performance in the Service Level Agreement (SLA) of DTS may not be guaranteed.
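On a self-managed source database, the wal_level requirement above can be met with ALTER SYSTEM. The following sketch assumes PostgreSQL 9.4 or later and a superuser session. The checks on max_replication_slots and max_wal_senders are not listed by DTS; they are included because logical replication in PostgreSQL needs a free slot and a free WAL sender.

```sql
-- Requires superuser privileges on the self-managed source database.
ALTER SYSTEM SET wal_level = 'logical';

-- wal_level changes only at server start, so restart PostgreSQL, then verify:
SHOW wal_level;

-- Logical replication also needs free replication slots and WAL senders.
SHOW max_replication_slots;
SHOW max_wal_senders;
```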
Limits on operations to be performed on the source database:
If you perform a primary/secondary switchover on the source self-managed PostgreSQL database, the data migration task fails.
During schema migration and full data migration, do not perform DDL operations to change the schemas of databases or tables. Otherwise, the data migration task fails.
If the source database has one or more long-running transactions and incremental data is migrated in the data migration task, the WAL logs that are generated before the long-running transactions are committed may not be cleared and therefore pile up, resulting in insufficient storage space in the source database.
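To check the PRIMARY KEY or UNIQUE constraint requirement in the limits above before you configure a task, you can list the user tables that have neither kind of constraint. This is a sketch that relies only on the standard system catalogs. It looks at declared constraints, so a table that has only a unique index without a constraint is still reported.

```sql
-- User tables that have neither a PRIMARY KEY nor a UNIQUE constraint.
SELECT n.nspname AS schema_name,
       c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'                              -- ordinary tables only
  AND n.nspname <> 'information_schema'
  AND n.nspname !~ '^pg_'                          -- skip system and temp schemas
  AND NOT EXISTS (
        SELECT 1
        FROM pg_constraint con
        WHERE con.conrelid = c.oid
          AND con.contype IN ('p', 'u'))           -- p = primary key, u = unique
ORDER BY schema_name, table_name;
```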
Other limits
Because of latency between the primary and secondary nodes of the source database, data on the two nodes may be inconsistent. Therefore, you must use the primary node as the data source when you migrate data.
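To confirm that the node you plan to use as the data source is the primary node, you can run the following standard PostgreSQL function on it. This is a minimal check, not a DTS feature.

```sql
-- Returns false on a primary node and true on a standby (secondary) node.
SELECT pg_is_in_recovery();
```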
A single data migration task can migrate data from only one database. To migrate data from multiple databases, you must create a data migration task for each database.
If you select a schema as the object to be migrated and create a table in the schema or execute the RENAME statement to rename a table in the schema during incremental data migration, you must execute the ALTER TABLE schema.table REPLICA IDENTITY FULL; statement before you write data to the table.
Note: Replace schema and table in the preceding sample statement with the actual schema name and table name.
DTS does not check the validity of metadata such as sequences. You must manually check the validity of metadata.
After your workloads are switched to the destination database, newly written sequences do not increment from the maximum value of the sequences in the source database. Therefore, you must query the maximum value of the sequences in the source database before you switch your workloads to the destination database. Then, you must specify the queried maximum value as the initial value of the sequences in the destination database. You can execute the following statements to query the maximum value of the sequences in the source database:
-- For every sequence in the source database, print a setval statement
-- that you can run on the destination database.
do language plpgsql $$
declare
  nsp name;
  rel name;
  val int8;
begin
  -- relkind = 'S' selects sequences
  for nsp, rel in
    select nspname, relname
    from pg_class t2, pg_namespace t3
    where t2.relnamespace = t3.oid and t2.relkind = 'S'
  loop
    execute format($_$select last_value from %I.%I$_$, nsp, rel) into val;
    raise notice '%', format($_$select setval('%I.%I'::regclass, %s);$_$, nsp, rel, val+1);
  end loop;
end;
$$;
DTS creates the following temporary tables in the source database to obtain the DDL statements of incremental data, the schemas of incremental tables, and the heartbeat information: public.dts_pg_class, public.dts_pg_attribute, public.dts_pg_type, public.dts_pg_enum, public.dts_postgres_heartbeat, public.dts_ddl_command, and public.dts_args_session. During data migration, do not delete these temporary tables from the source database. Otherwise, exceptions occur. After the DTS instance is released, the temporary tables are automatically deleted.
To ensure that the latency of incremental data migration is accurate, DTS creates a heartbeat table named dts_postgres_heartbeat in the source database.
During incremental data migration, DTS creates a replication slot for the source database. The name of the replication slot is prefixed with dts_sync_. By using this replication slot, DTS can obtain the incremental logs of the source database within the last 15 minutes.
Note: If the data migration task is released or fails, DTS automatically clears the replication slot. If a primary/secondary switchover is performed on the source ApsaraDB RDS for PostgreSQL instance, you must log on to the secondary instance to clear the replication slot.
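To see which of the DTS-created tables named above currently exist in the source database, for example to confirm that they were removed after the DTS instance was released, you can query the pg_tables view. This sketch uses only the table names listed in this topic.

```sql
-- List the DTS helper tables that currently exist in the public schema.
SELECT schemaname, tablename
FROM pg_tables
WHERE schemaname = 'public'
  AND tablename IN ('dts_pg_class', 'dts_pg_attribute', 'dts_pg_type',
                    'dts_pg_enum', 'dts_postgres_heartbeat',
                    'dts_ddl_command', 'dts_args_session')
ORDER BY tablename;
```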
If the tables to be migrated from the source database contain foreign keys, triggers, or event triggers, and the account of the destination database is a privileged account or an account that has the permissions of the superuser role, DTS temporarily sets the session_replication_role parameter to replica at the session level during full or incremental data migration. If the account of the destination database does not have these permissions, you must manually set the session_replication_role parameter to replica in the destination database. While the session_replication_role parameter is set to replica during full or incremental data migration, a cascade update or delete operation performed in the source database may cause data inconsistency. After the DTS migration task is released, you can change the value of the session_replication_role parameter back to origin.
Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours. During full data migration, DTS uses the read and write resources of the source and destination databases. This may increase the loads of the database servers.
During full data migration, concurrent INSERT operations cause fragmentation in the tables of the destination database. After full data migration is complete, the tablespace of the destination database is larger than that of the source database.
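To gauge how much larger the destination is after full data migration, you can run the same size queries on both the source and destination databases and compare the results. This is a sketch that uses standard PostgreSQL size functions; the limit of ten tables is an arbitrary choice.

```sql
-- Total size of the current database.
SELECT pg_size_pretty(pg_database_size(current_database())) AS database_size;

-- Ten largest user tables, including their indexes and TOAST data.
SELECT n.nspname AS schema_name,
       c.relname AS table_name,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 10;
```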
Make sure that the precision settings for columns of the FLOAT or DOUBLE data type meet your business requirements. DTS uses the ROUND(COLUMN,PRECISION) function to retrieve values from columns of the FLOAT or DOUBLE data type. If you do not specify a precision, DTS sets the precision for the FLOAT data type to 38 digits and the precision for the DOUBLE data type to 308 digits.
DTS attempts to resume data migration tasks that failed within the last seven days. Before you switch workloads to the destination database, you must stop or release the failed tasks. You can also execute the REVOKE statement to revoke the write permissions from the accounts that are used by DTS to access the destination database. Otherwise, the data in the source database will overwrite the data in the destination database after the failed task is resumed.
Migrate data from a PostgreSQL database to a MySQL database
The DTS console of the new version supports the following migration scenarios:
Migrate data from an ApsaraDB RDS for PostgreSQL instance to an ApsaraDB RDS for MySQL instance
Migrate data from a self-managed PostgreSQL database to a self-managed MySQL database
The following table describes the precautions and limits.
Category | Description |
Limits on the source database |
|
Other limits |
|
Special cases |
|
Migrate data from a self-managed PostgreSQL database to a PolarDB for PostgreSQL (Compatible with Oracle) cluster
The following table describes the precautions and limits.
Category | Description |
Limits on the source database |
|
Other limits |
|