Use Data Transmission Service (DTS) to replicate data from a PolarDB for MySQL cluster into an AnalyticDB for PostgreSQL instance for analytics workloads. DTS handles schema creation, full data load, and ongoing incremental replication in a single task.
Prerequisites
Before you begin, ensure that you have:
A destination AnalyticDB for PostgreSQL instance with available storage space larger than the storage space used by the source PolarDB for MySQL cluster. See Create an instance.
A database created in the destination AnalyticDB for PostgreSQL instance to store the synchronized data. See CREATE DATABASE.
Binary logging enabled on the source PolarDB for MySQL cluster, with `loose_polar_log_bin` set to `ON`. See Enable binary logging and Set cluster and node parameters.
Binary log retention set to at least 3 days on the source cluster (7 days recommended). See Modify the retention period.
Database accounts with the required permissions. See Permissions required for database accounts.
Enabling binary logging on a PolarDB for MySQL cluster consumes storage space and incurs fees.
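As a quick sanity check, you can verify these prerequisites from a SQL client. This is a hedged sketch: `dts_dest_db` is an example database name, not one required by DTS.

```sql
-- On the source PolarDB for MySQL cluster: confirm that binary logging
-- is active (loose_polar_log_bin takes effect as log_bin).
SHOW VARIABLES LIKE 'log_bin';   -- should report ON

-- On the destination AnalyticDB for PostgreSQL instance: create the
-- database that will receive the synchronized data.
CREATE DATABASE dts_dest_db;
```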
Billing
| Synchronization type | Pricing |
|---|---|
| Schema synchronization and full data synchronization | Free of charge |
| Incremental data synchronization | Charged. See Billing overview. |
Permissions required for database accounts
| Database | Required permissions | How to create and authorize |
|---|---|---|
| Source PolarDB for MySQL cluster | Read and write permissions for the objects to sync | Create and manage database accounts and manage database account passwords |
| Destination AnalyticDB for PostgreSQL instance | Read and write permissions for the destination database | Create and manage users and manage user permissions. You can also use the initial account or an account with RDS_SUPERUSER permissions. |
Supported SQL operations for incremental synchronization
| Operation type | SQL statements |
|---|---|
| DML | INSERT, UPDATE, DELETE. DTS converts UPDATE statements to REPLACE INTO when writing to the destination. For UPDATE operations on primary keys, DTS converts them to DELETE and INSERT statements. |
| DDL | CREATE TABLE, ALTER TABLE, TRUNCATE TABLE, DROP TABLE |
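To illustrate the DML conversions described above, consider a hypothetical source table `orders` with primary key `id`. This is a sketch of the rewrite semantics, not the literal statements DTS emits.

```sql
-- Source statement: an UPDATE that does not touch the primary key.
UPDATE orders SET status = 'shipped' WHERE id = 42;
-- DTS applies it to the destination as a REPLACE-style upsert:
--   REPLACE INTO orders (id, status) VALUES (42, 'shipped');

-- Source statement: an UPDATE that changes the primary key.
UPDATE orders SET id = 43 WHERE id = 42;
-- DTS splits it into a DELETE followed by an INSERT:
--   DELETE FROM orders WHERE id = 42;
--   INSERT INTO orders (id, status) VALUES (43, 'shipped');
```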
Limitations
Source database
Tables to synchronize must have a primary key or a UNIQUE constraint with unique fields. Without this, duplicate data may appear in the destination database.
For table-level synchronization where you need to edit the tables (such as mapping column names), a single task supports a maximum of 1,000 tables. If you exceed this limit, split the tables across multiple tasks or configure a task to synchronize the entire database.
Binary log retention must be at least 3 days. If DTS cannot obtain the binary logs, the task may fail—in extreme cases causing data inconsistency or data loss. Issues from a retention period shorter than required are not covered by the DTS Service-Level Agreement (SLA).
Do not run DDL operations that modify primary keys or add table comments (for example, `ALTER TABLE table_name COMMENT='Table comment';`) during synchronization. These operations will fail.
Do not run DDL operations that change database or table schemas during schema synchronization or full synchronization. The task will fail. During full synchronization, DTS queries the source database, creating metadata locks that can block DDL operations on the source.
If the source database contains date values of `0000-00-00 00:00:00`, the task may fail. DTS converts these values to `null` in the destination. As a workaround, change the source values to `0001-01-01 00:00:00` or configure the destination field to allow null values.
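The two checks above can be run against the source before you configure the task. This is an illustrative sketch: `mydb`, `orders`, and `created_at` are example names, and depending on `sql_mode` you may need to relax `NO_ZERO_DATE` before comparing zero dates.

```sql
-- Find tables in database `mydb` that have neither a primary key nor a
-- unique constraint; such tables can produce duplicates in the destination.
SELECT t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
  ON c.table_schema = t.table_schema
 AND c.table_name  = t.table_name
 AND c.constraint_type IN ('PRIMARY KEY', 'UNIQUE')
WHERE t.table_schema = 'mydb'
  AND t.table_type   = 'BASE TABLE'
  AND c.constraint_name IS NULL;

-- Rewrite zero dates before starting the task (example table and column).
UPDATE mydb.orders
SET created_at = '0001-01-01 00:00:00'
WHERE created_at = '0000-00-00 00:00:00';
```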
Other limitations
Only table-level synchronization is supported.
The following data types are not supported: BIT, VARBIT, GEOMETRY, ARRAY, UUID, TSQUERY, TSVECTOR, TXID_SNAPSHOT.
Prefix indexes are not supported. If the source has prefix indexes, synchronization may fail.
Synchronization from a read-only node of the source PolarDB for MySQL cluster is not supported.
OSS foreign tables cannot be synchronized from the source.
The following object types are not supported: PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, FK, INDEX.
Append-Only (AO) tables are not supported as destination tables.
Primary/secondary failover is not supported during initial full data synchronization. If a failover occurs, reconfigure the task.
Online DDL tools: pt-online-schema-change is not supported and will cause the task to fail. For online DDL in temporary table mode or function-based indexes on unique key columns, data loss or task failure may occur.
If you use column mapping for non-full table synchronization, or if the source and destination table schemas are inconsistent, data in columns missing from the destination table will be lost.
If the table to synchronize has a primary key, the primary key column in the destination table must match the source table. If the table lacks a primary key, the primary key column in the destination table must match the distribution key.
The unique key in the destination table—including the primary key column—must include all columns in the distribution key.
Initial full data synchronization runs concurrent INSERT operations, which causes fragmentation. As a result, the tablespace of the destination instance may be larger than that of the source after full synchronization completes.
Schedule synchronization during off-peak hours. Full data synchronization consumes read and write resources on both databases and may increase the database load.
For table-level synchronization, do not write data other than the data synchronized by DTS to the destination database. Otherwise, the source and destination data may become inconsistent.
For table-level synchronization, do not use pt-online-schema-change for online DDL operations on synchronization objects in the source database.
For table-level synchronization, if no data other than data from DTS is written to the destination database, you can use Data Management (DMS) for online DDL operations. See Change schemas without locking tables.
If a task fails, DTS support staff will attempt to restore it within 8 hours. They may restart the task or adjust DTS task parameters (not database parameters). Parameters that may be adjusted are listed in Modify instance parameters.
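The primary key and distribution key rules above can be sketched with a hypothetical destination table in AnalyticDB for PostgreSQL: the distribution key column (`customer_id`) is part of the primary key, so the unique key includes all distribution key columns.

```sql
-- Hypothetical destination table; names are examples only.
CREATE TABLE orders (
    order_id    bigint,
    customer_id bigint,
    status      text,
    PRIMARY KEY (order_id, customer_id)
)
DISTRIBUTED BY (customer_id);
```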
Conflict resolution behavior
| Phase | When primary or unique key conflicts | Result |
|---|---|---|
| Full data synchronization | Schemas consistent: record exists in destination with same primary/unique key | DTS retains the destination record and skips the source record |
| Incremental synchronization | Schemas consistent: record exists in destination with same primary/unique key | DTS overwrites the destination record with the source record |
| Either phase | Schemas inconsistent | Data initialization may fail, resulting in partial or complete synchronization failure |
Usage notes
DTS periodically runs ``CREATE DATABASE IF NOT EXISTS `test` `` on the source database to advance the binary log offset.
When using DMS or gh-ost for online DDL changes, configure the Copy the temporary table of the Online DDL tool advanced setting accordingly.
Create a synchronization task
The task configuration has eight steps: open the task page, connect source and destination databases, select synchronization objects, configure advanced settings, optionally set up data verification, optionally configure destination table fields, run the precheck, and purchase the instance.
Step 1: Open the task configuration page
Log on to the Data Management (DMS) console.
In the top navigation bar, click Data + AI.
In the left-side navigation pane, choose DTS (DTS) > Data Synchronization.
The navigation path may vary based on the DMS console mode and layout. See Simple mode console and Customize the layout and style of the DMS console. You can also go directly to the Data Synchronization Tasks page of the DTS console.
Click Create Task.
Step 2: Configure source and destination databases
On the task configuration page, fill in the following fields:
| Category | Field | Description |
|---|---|---|
| N/A | Task Name | DTS generates a name automatically. Specify a descriptive name for easy identification. The name does not need to be unique. |
| Source Database | Select Existing Connection | Select a registered database instance from the drop-down list to auto-populate the fields below, or configure the fields manually. In the DMS console, this field is labeled Select a DMS database instance. |
| | Database Type | Select PolarDB for MySQL. |
| | Access Method | Select Alibaba Cloud Instance. |
| | Instance Region | Select the region where the source PolarDB for MySQL cluster resides. |
| | Replicate Data Across Alibaba Cloud Accounts | Select No if the source instance belongs to the current Alibaba Cloud account. |
| | PolarDB Cluster ID | Select the ID of the source PolarDB for MySQL cluster. |
| | Database Account | Enter the database account for the source cluster. See Permissions required for database accounts. |
| | Database Password | Enter the password for the database account. |
| Destination Database | Select Existing Connection | Select a registered database instance from the drop-down list to auto-populate the fields below, or configure the fields manually. In the DMS console, this field is labeled Select a DMS database instance. |
| | Database Type | Select AnalyticDB for PostgreSQL. |
| | Access Method | Select Alibaba Cloud Instance. |
| | Instance Region | Select the region where the destination AnalyticDB for PostgreSQL instance resides. |
| | Instance ID | Select the ID of the destination AnalyticDB for PostgreSQL instance. |
| | Database Name | Enter the name of the destination database to receive the synchronized data. |
| | Database Account | Enter the database account for the destination instance. See Permissions required for database accounts. |
| | Database Password | Enter the password for the database account. |
After filling in all fields, click Test Connectivity and Proceed.
Make sure that the DTS IP address blocks have been added to the security settings of both the source and destination databases—either automatically by DTS or manually. See Add the IP address whitelist of DTS servers.
Step 3: Configure synchronization objects
On the Configure Objects page, configure the following settings:
Synchronization types
DTS always selects Incremental Data Synchronization. Also select Schema Synchronization and Full Data Synchronization (selected by default). After the precheck, DTS initializes the destination instance with the full data of the selected source objects, which serves as the baseline for subsequent incremental synchronization. With all three types enabled, DTS:
Copies the schema to the destination.
Loads a full snapshot of the selected source objects as the baseline.
Applies ongoing incremental changes from the binary log.
Objects and operations
| Field | Description |
|---|---|
| DDL and DML Operations to Be Synchronized | Select the DDL or DML operations to synchronize at the instance level. To select operations for a specific database or table, right-click the object in the Selected Objects box. See Supported SQL operations for incremental synchronization. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors: Checks for tables with the same names in the destination. If any are found, the precheck reports an error and the task does not start. If you cannot delete or rename the conflicting table, use object name mapping to rename it in the destination. Ignore Errors and Proceed: Skips the duplicate-table check. Use with caution—this may cause data inconsistency. See Conflict resolution behavior for details. |
| Storage Engine Type | Select the storage engine for destination tables. The default is Beam. This field is available only if the destination AnalyticDB for PostgreSQL kernel version is v7.0.6.6 or later and Schema Synchronization is selected. |
| Capitalization of Object Names in Destination Instance | Configure the case policy for database, table, and column names in the destination. The default is DTS default policy. See Case policy for destination object names. |
| Source Objects | Click objects in the Source Objects box, then click the right arrow to move them to the Selected Objects box. Select databases or tables as synchronization objects. |
| Selected Objects | To rename a single object, right-click it in the Selected Objects box. See Map a single object name. To rename multiple objects in bulk, click Batch Edit in the upper-right corner. See Map multiple object names in bulk. To filter rows in a table, right-click the table and configure filter conditions. See Set filter conditions. If you use object name mapping, synchronization of dependent objects may fail. |
Step 4: Configure advanced settings
Click Next: Advanced Settings and configure the following options:
| Field | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS uses a shared cluster. To improve task stability, purchase a dedicated cluster. See What is a DTS dedicated cluster?. |
| Copy the temporary table of the Online DDL tool that is generated in the source table to the destination database. | If the source uses DMS or gh-ost for online DDL changes, select how DTS handles the generated temporary tables. Yes: Synchronizes the temporary tables. Large temporary tables may cause synchronization latency. No, Adapt to DMS Online DDL: Does not synchronize temporary tables. Synchronizes only the original DDL statements from DMS. This causes table locks on the destination. No, Adapt to gh-ost: Does not synchronize temporary tables. Synchronizes only the original DDL statements from gh-ost. This causes table locks on the destination. Note: DTS does not support pt-online-schema-change for online DDL changes; using it will cause the task to fail. |
| Retry Time for Failed Connections | If the connection to the source or destination database fails, DTS automatically retries within this time range. Default: 720 minutes. Range: 10–1,440 minutes. Set this to 30 minutes or more. If multiple DTS instances share a source or destination, DTS applies the shortest configured retry duration across all instances. DTS charges for task runtime during retries. |
| Retry Time for Other Issues | If a non-connection issue occurs (for example, a DDL or DML execution error), DTS automatically retries within this time range. Default: 10 minutes. Range: 1–1,440 minutes. Set this to 10 minutes or more. This value must be less than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Synchronization | Limit the full synchronization rate to reduce load on the databases. Set Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Synchronization is selected. You can also adjust the rate while the task is running. |
| Enable Throttling for Incremental Data Synchronization | Limit the incremental synchronization rate by setting RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s). |
| Environment Tag | Select a tag to identify the instance based on your business needs. |
| Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Yes: DTS does not write heartbeat SQL to the source database. The DTS instance may show latency. No: DTS writes heartbeat SQL to the source database. This may interfere with operations such as physical backups and cloning. |
| Configure ETL | Enable the extract, transform, and load (ETL) feature to transform data in flight. Yes: Enter data processing statements in the code editor. See Configure ETL in a data migration or data synchronization task. No: Disable ETL. See What is ETL?. |
| Monitoring and Alerting | Yes: Configure alerts for task failures or latency exceeding a threshold. Set the alert threshold and notification contacts. See Configure monitoring and alerting during task configuration. No: No alerts. |
Step 5: Configure data verification (optional)
Click Data Verification to set up a data verification task. See Configure data verification.
Step 6: Configure destination table fields (optional)
Click Next: Configure Database and Table Fields to set the Type, Primary Key Column, and Distribution Key for each destination table in AnalyticDB for PostgreSQL.
This step is available only when Schema Synchronization is selected. Set Definition Status to All to modify the fields. To form a composite primary key, specify multiple columns in Primary Key Column, then select one or more of those columns as the Distribution Key. See Manage tables and Define table distribution.
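The field settings in this step map to destination DDL along these lines. This is a hypothetical example: choosing `(sale_id, sale_date)` as the Primary Key Column and `sale_date` as the Distribution Key corresponds to a table definition such as:

```sql
-- Composite primary key; the distribution key is a subset of it.
CREATE TABLE sales (
    sale_id   bigint,
    sale_date date,
    amount    numeric(12,2),
    PRIMARY KEY (sale_id, sale_date)
)
DISTRIBUTED BY (sale_date);
```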
Step 7: Run the precheck
Click Next: Save Task Settings and Precheck.
To preview the API parameters for this task configuration, hover over the Next: Save Task Settings and Precheck button and click Preview OpenAPI parameters.
Before a synchronization task starts, DTS performs a precheck. The task starts only if the precheck passes.
If the precheck fails, click View Details next to the failed item, fix the issue, and rerun the precheck.
If the precheck returns warnings:
For non-ignorable warnings, click View Details, fix the issue, and rerun the precheck.
For ignorable warnings, click Confirm Alert Details, then Ignore, then OK, then Precheck Again to proceed. Ignoring warnings may lead to data inconsistencies. Proceed with caution.
Step 8: Purchase and start the instance
When the Success Rate reaches 100%, click Next: Purchase Instance.
On the Purchase page, select the billing method and instance class:
| Category | Parameter | Description |
|---|---|---|
| New Instance Class | Billing Method | Subscription: Pay upfront for a set duration. Cost-effective for long-term, continuous tasks. Monthly options: 1–9 months. Yearly options: 1, 2, 3, or 5 years. Pay-as-you-go: Billed hourly for actual usage. Suitable for short-term or test tasks. |
| Resource Group Settings | — | The resource group for the instance. Default: default resource group. See What is Resource Management?. |
| Instance Class | — | Select a specification based on your required synchronization rate. See Data synchronization link specifications. |
Read and select Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start and confirm in the dialog box.
Monitor the task progress on the Data Synchronization Tasks page of the new DTS console.
What's next
To verify that data is synchronized correctly, configure a data verification task. See Configure data verification.
To modify synchronization objects or task parameters after the task starts, see Modify instance parameters.