Data Transmission Service (DTS) keeps an AnalyticDB for PostgreSQL instance in sync with an ApsaraDB RDS for MySQL source, enabling real-time analytics on live business data. This topic shows you how to configure a data synchronization task for this pipeline.
Supported source databases
The procedure in this topic uses an ApsaraDB RDS for MySQL instance as the source. The same steps apply when the source is any of the following MySQL database types:
- ApsaraDB RDS for MySQL instance
- Self-managed MySQL database hosted on Elastic Compute Service (ECS)
- Self-managed MySQL database connected over Express Connect, VPN Gateway, or Smart Access Gateway
- Self-managed MySQL database connected over Database Gateway
- Self-managed MySQL database connected over Cloud Enterprise Network (CEN)
Prerequisites
Before you begin, ensure that you have:
- An ApsaraDB RDS for MySQL source instance
- A destination AnalyticDB for PostgreSQL instance. See Create an instance.
- A database account on the source instance with the REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions
- A database account on the destination instance with the initial account role or the RDS_SUPERUSER permission. See Manage users and permissions.
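The source account permissions listed above can be granted as follows. This is a minimal sketch: the account name `dts_sync`, the host wildcard, and the password are placeholders, and you may scope SELECT and SHOW VIEW to specific databases instead of `*.*`.

```sql
-- Hypothetical account; replace the name, host, and password with your own.
CREATE USER 'dts_sync'@'%' IDENTIFIED BY 'your_password';

-- Replication privileges are global and must be granted ON *.*.
GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'dts_sync'@'%';

-- Read privileges; narrow to the databases you plan to synchronize if possible.
GRANT SELECT, SHOW VIEW ON *.* TO 'dts_sync'@'%';
FLUSH PRIVILEGES;
```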
Binary log requirements
Verify the following binary log settings before you configure DTS.
For ApsaraDB RDS for MySQL (binary logging is enabled by default):
- Set `binlog_row_image` to `full`. See Modify instance parameters.
- Retain binary logs for at least 3 days (7 days recommended). See Delete binary log files.
For self-managed MySQL:
- Enable binary logging.
- Set `binlog_format` to `row`.
- Set `binlog_row_image` to `full`.
- Retain binary logs for at least 7 days.
- For dual-primary clusters, set `log_slave_updates` to `ON`. See Create an account for a self-managed MySQL database and configure binary logging.
If binary logs are purged before DTS reads them, the synchronization task fails and data inconsistency or loss may occur. DTS does not guarantee Service Level Agreement (SLA) performance if retention periods are shorter than the minimums above.
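You can verify these settings on the source before configuring DTS. A sketch of the checks, assuming MySQL 8.0 (on MySQL 5.7 and earlier, retention is controlled by `expire_logs_days` instead of `binlog_expire_logs_seconds`):

```sql
SHOW VARIABLES LIKE 'log_bin';           -- expect ON
SHOW VARIABLES LIKE 'binlog_format';     -- expect ROW
SHOW VARIABLES LIKE 'binlog_row_image';  -- expect FULL

-- Retention in seconds; 7 days = 604800 (MySQL 8.0 variable name).
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
```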
Billing
| Synchronization type | Fee |
|---|---|
| Schema synchronization and full data synchronization | Free of charge |
| Incremental data synchronization | Charged. See Billing overview |
Limitations
Source database
- Tables must have PRIMARY KEY or UNIQUE constraints, and all fields of those constraints must be unique. Tables without such constraints may produce duplicate records in the destination.
- If you synchronize individual tables (rather than entire databases) and need to edit them, such as renaming tables or columns, a single task supports up to 1,000 tables. If a task exceeds this limit, a request error occurs. For larger sets, configure multiple tasks or synchronize at the database level.
- DTS does not synchronize the following data types: BIT, VARBIT, GEOMETRY, ARRAY, UUID, TSQUERY, TSVECTOR, TXID_SNAPSHOT, and POINT.
- Prefix indexes are not supported and may cause the task to fail.
- Only tables can be selected as synchronization objects. Indexes, partitions, views, stored procedures, functions, triggers, and foreign keys are excluded.
- Do not execute DDL statements that modify primary keys or add comments during synchronization. For example, `ALTER TABLE table_name COMMENT='...'` has no effect.
- Do not execute DDL statements that change database or table schemas during the schema synchronization or full data synchronization phases.
- If the source is MySQL 8.0.23 or later and tables include invisible columns, DTS cannot read those columns and data loss occurs. Run `ALTER TABLE <table_name> ALTER COLUMN <column_name> SET VISIBLE;` to make columns visible before starting the task. Tables without explicit primary keys auto-generate invisible primary keys, which must also be made visible. See Invisible Columns and Generated Invisible Primary Keys.
- If the source is an ApsaraDB RDS for MySQL instance with EncDB enabled, full data synchronization is not supported. Instances with Transparent Data Encryption (TDE) enabled support all three synchronization types.
- DATETIME values of `0000-00-00 00:00:00` are converted to `null` in the destination. To avoid this, change the value to `0001-01-01 00:00:00` in the source or leave the destination field blank before starting the task.
- Data generated by binary log change operations, such as data restored from a physical backup or data from cascade operations, is not recorded or synchronized to the destination while the synchronization instance is running. If this occurs, you can remove the affected databases or tables from the synchronized objects and then re-add them, provided your business is not impacted. See Modify the objects to be synchronized.
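Two of the checks above can be scripted before you start the task. A sketch, assuming MySQL 8.0.23+ and a hypothetical `orders` table with a `created_at` column; adjust names to your schema:

```sql
-- Find invisible columns that DTS cannot read (MySQL 8.0.23+).
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE extra LIKE '%INVISIBLE%'
  AND table_schema NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema');

-- Make an affected column visible (placeholder names from the list above).
-- ALTER TABLE <table_name> ALTER COLUMN <column_name> SET VISIBLE;

-- Replace zero DATETIME values that would become NULL in the destination.
UPDATE orders
SET created_at = '0001-01-01 00:00:00'
WHERE created_at = '0000-00-00 00:00:00';
```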
Destination database
- The destination table cannot be an append-optimized (AO) table.
- The unique key (including the primary key) of the destination table must include all columns of the distribution key.
- If the source table has a primary key, the destination primary key column matches the source. If the source table has no primary key, the destination primary key column matches the distribution key.
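The distribution-key constraint above can be illustrated with a destination table definition. A sketch in AnalyticDB for PostgreSQL (Greenplum-style) syntax; the table and column names are hypothetical:

```sql
-- The primary key must contain every distribution key column:
-- here shop_id is the distribution key, so it appears in the primary key.
CREATE TABLE public.orders (
    order_id  bigint NOT NULL,
    shop_id   bigint NOT NULL,
    amount    numeric(12, 2),
    PRIMARY KEY (order_id, shop_id)
)
DISTRIBUTED BY (shop_id);
```

A primary key of `(order_id)` alone would violate the rule, because it omits the distribution key column `shop_id`.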
Other limitations
- Data written to the destination from other sources during synchronization may cause inconsistency. Avoid concurrent writes to destination tables while DTS is running.
- Online DDL tools such as `pt-online-schema-change` are not supported when individual tables are selected as objects. Use Data Management (DMS) for lock-free DDL instead. See Perform lock-free DDL operations.
- If column mapping is used or source and destination schemas differ, columns present in the source but absent in the destination are not synchronized.
- During schema synchronization, DTS synchronizes foreign keys from the source database to the destination database.
- DTS temporarily disables foreign key constraint checks and cascade operations at the session level during full and incremental data synchronization. Cascade updates or deletes on the source during this period may cause data inconsistency.
- During initial full data synchronization, concurrent INSERT operations cause table fragmentation in the destination. The destination tablespace will be larger than the source after full sync completes. Schedule initial synchronization during off-peak hours to reduce load.
- If a DTS task fails, DTS technical support attempts to restore it within 8 hours. The task may be restarted and its parameters adjusted during recovery.
Self-managed MySQL source
- A primary/secondary switchover while the task is running causes the task to fail.
- DTS calculates synchronization latency based on the latest synchronized timestamp versus the current source timestamp. If no DML operations run on the source for an extended period, the reported latency may be inaccurate. Run a DML operation on the source to refresh the latency metric. If you select an entire database as the synchronization object, you can create a heartbeat table that is updated every second to keep the latency metric accurate.
- DTS periodically executes `CREATE DATABASE IF NOT EXISTS 'test'` on the source to advance the binary log position.
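A heartbeat table like the one mentioned above can be as simple as a single-row table touched on a schedule. A sketch with a hypothetical `dts_heartbeat` table; drive the update from a cron job, a scheduler, or a MySQL event that fires every second:

```sql
-- One-row table on the source whose timestamp is refreshed periodically.
CREATE TABLE IF NOT EXISTS dts_heartbeat (
    id          INT PRIMARY KEY,
    update_time DATETIME NOT NULL
);

-- Run on a schedule: insert the row once, then keep bumping its timestamp.
INSERT INTO dts_heartbeat (id, update_time)
VALUES (1, NOW())
ON DUPLICATE KEY UPDATE update_time = NOW();
```

Each update produces a binary log event, so the latency metric always has recent DML to measure against.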
ApsaraDB RDS for MySQL source
- ApsaraDB RDS for MySQL instances that do not record transaction logs, such as read-only ApsaraDB RDS for MySQL 5.6 instances, cannot be used as the source.
- DTS periodically executes `CREATE DATABASE IF NOT EXISTS 'test'` on the source to advance the binary log position.
SQL operations that can be synchronized
DML: INSERT, UPDATE, DELETE
When writing to AnalyticDB for PostgreSQL, DTS converts UPDATE to REPLACE INTO. If UPDATE targets primary key columns, DTS converts it to DELETE followed by INSERT.
DDL: CREATE TABLE, ALTER TABLE, TRUNCATE TABLE, DROP TABLE
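The UPDATE handling described above can be sketched in SQL. This is a conceptual illustration only (DTS performs the conversion internally); the `orders` table, its columns, and the PostgreSQL-style upsert shown as an analogue of REPLACE INTO are all assumptions for the example:

```sql
-- Source (MySQL): UPDATE that does not touch the primary key.
UPDATE orders SET amount = 99.00 WHERE order_id = 1;
-- Applied to the destination roughly as a full-row replace
-- (PostgreSQL-style conceptual equivalent of REPLACE INTO):
INSERT INTO orders (order_id, amount) VALUES (1, 99.00)
ON CONFLICT (order_id) DO UPDATE SET amount = EXCLUDED.amount;

-- Source (MySQL): UPDATE that changes a primary key value.
UPDATE orders SET order_id = 2 WHERE order_id = 1;
-- Replayed on the destination as DELETE followed by INSERT:
DELETE FROM orders WHERE order_id = 1;
INSERT INTO orders (order_id, amount) VALUES (2, 99.00);
```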
Supported synchronization topologies
- One-way one-to-one
- One-way one-to-many
- One-way many-to-one
Term mappings
| MySQL | AnalyticDB for PostgreSQL |
|---|---|
| Database | Schema |
| Table | Table |
Configure a data synchronization task
This procedure has 9 steps: open the task page, select a region, configure source and destination databases, test connectivity, configure objects and settings, configure advanced settings, set key columns, run the precheck, and purchase and start the instance.
Step 1: Open the Data Synchronization Tasks page
1. Log on to the Data Management (DMS) console.
2. In the top navigation bar, click Data + AI.
3. In the left-side navigation pane, choose DTS (DTS) > Data Synchronization.
Console navigation may vary based on your DMS layout. See Simple mode and Customize the layout and style of the DMS console. Alternatively, go directly to the Data Synchronization Tasks page.
Step 2: Select a region
On the right side of Data Synchronization Tasks, select the region where the data synchronization instance resides.
In the new DTS console, select the region from the top navigation bar.
Step 3: Configure the source and destination databases
Click Create Task. In the wizard, configure the following parameters.
Source database
| Parameter | Description |
|---|---|
| Task Name | A name for the DTS task. DTS generates one automatically. Use a descriptive name for easy identification — uniqueness is not required. |
| Database Type | Select MySQL. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region of the source ApsaraDB RDS for MySQL instance. |
| Replicate Data Across Alibaba Cloud Accounts | Select No for same-account synchronization. |
| RDS Instance ID | The ID of the source instance. The source and destination can be the same instance (for within-instance synchronization) or different instances. |
| Database Account | The account with REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions. |
| Database Password | The password for the database account. |
| Encryption | Select Non-encrypted or SSL-encrypted. To use SSL encryption, enable it on the RDS instance first. See Use a cloud certificate to enable SSL encryption. |
Destination database
| Parameter | Description |
|---|---|
| Database Type | Select AnalyticDB for PostgreSQL. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region of the destination AnalyticDB for PostgreSQL instance. |
| Instance ID | The ID of the destination instance. |
| Database Name | The name of the target database in the destination instance. |
| Database Account | The initial account or an account with the RDS_SUPERUSER permission. |
| Database Password | The password for the database account. |
Step 4: Test connectivity
Click Test Connectivity and Proceed.
DTS automatically adds its server CIDR blocks to the whitelist of Alibaba Cloud database instances and to the security group rules of ECS-hosted databases. If the database is hosted on ECS, you must also ensure that the ECS instance can access the database. If the database is deployed across multiple ECS instances, you must manually add the DTS CIDR blocks to the security group rules of each ECS instance. For self-managed databases in data centers or third-party clouds, manually add the DTS CIDR blocks to the database whitelist. See CIDR blocks of DTS servers.
Adding DTS CIDR blocks to whitelists or security groups introduces security exposure. Before proceeding, take preventive measures such as enforcing strong credentials, restricting exposed ports, regularly auditing whitelist rules, and removing unauthorized CIDR blocks. Consider connecting through Express Connect, VPN Gateway, or Smart Access Gateway to reduce exposure.
Step 5: Configure synchronization objects and settings
| Parameter | Description |
|---|---|
| Synchronization Types | Select Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization. By default, Incremental Data Synchronization is selected; you must also select the other two. Schema and full synchronization establish the baseline; incremental synchronization keeps the destination current afterward. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (default): fails the precheck if identically named tables exist in the destination. Use object name mapping to rename conflicting tables if needed. See Map object names. Ignore Errors and Proceed: skips the precheck for name conflicts. During full sync, existing destination records with matching primary or unique key values are retained. During incremental sync, they are overwritten. If schemas differ, initialization may fail, only partial columns are synchronized, or the data synchronization instance fails — proceed with caution. |
| DDL and DML Operations to Be Synchronized | The SQL operations to synchronize. See SQL operations that can be synchronized. To configure per-object SQL filtering, right-click an object in Selected Objects and select the operations. |
| Source Objects | Select objects and click the arrow icon to add them to Selected Objects. Only tables are supported. |
| Selected Objects | To rename a single object, right-click it. For bulk renaming, click Batch Edit. See Map object names. To filter rows, right-click a table and specify WHERE conditions. See Specify filter conditions. Renaming an object may break synchronization of dependent objects. |
Step 6: Configure advanced settings
Click Next: Advanced Settings.
Data verification
See Configure data verification.
Advanced settings
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS schedules to the shared cluster. Purchase a dedicated cluster for improved stability. See What is a DTS dedicated cluster. |
| Set Alerts | No: alerting disabled. Yes: configure alert thresholds and contacts. See Configure monitoring and alerting. |
| Retry Time for Failed Connections | How long DTS retries on connection failure. Range: 10–1440 minutes. Default: 720 minutes. Set to 30 minutes or more. If multiple tasks share the same source or destination, the shortest retry window takes precedence. DTS charges for the instance during retries — release the instance promptly if the source and destination are decommissioned. |
| Retry Time for Other Issues | How long DTS retries on DDL/DML failures. Range: 1–1440 minutes. Default: 10 minutes. Must be less than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Synchronization | Limit QPS to the source, RPS for full sync, and data throughput (MB/s) to reduce load during initial full data synchronization. Available only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Limit RPS and throughput (MB/s) for incremental synchronization to reduce load on the destination. |
| Environment Tag | Tag the DTS instance by environment. Optional. |
| Enclose Object Names in Quotation Marks | Yes: DTS encloses schema, table, and column names in single or double quotation marks during schema and incremental synchronization when names contain mixed case, start with non-letters, include non-standard characters, or conflict with reserved words in the destination. No: names are not quoted. |
| Configure ETL | Yes: enter data transformation statements in the code editor. See Configure ETL. No: no transformation. |
| Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Yes: DTS does not write heartbeat table operations to the source. A latency metric may be displayed on the task. No: DTS writes heartbeat operations to the source, which may affect physical backup and cloning of the source database. |
Step 7: Set primary key and distribution key columns
Click Next: Configure Database and Table Fields. Set the primary key and distribution key columns for each table being synchronized to the destination AnalyticDB for PostgreSQL instance.
Step 8: Run the precheck
Click Next: Save Task Settings and Precheck.
To preview the equivalent API parameters before saving, hover over the button and click Preview OpenAPI parameters.
DTS runs a precheck before starting synchronization. If any item fails:
- Click View Details next to the failed item, resolve the issue, and then click Precheck Again.
- If an alert item can be safely ignored, click Confirm Alert Details, click Ignore in the dialog, click OK, and then click Precheck Again. Ignoring alerts may lead to data inconsistency.
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
Step 9: Purchase and start the instance
On the buy page, configure the following parameters.
| Parameter | Description |
|---|---|
| Billing Method | Pay-as-you-go: billed hourly. Release the instance when no longer needed to avoid ongoing charges. |
| Resource Group Settings | The resource group for the instance. Default: default resource group. See What is Resource Management? |
| Instance Class | The synchronization throughput class. See Instance classes of data synchronization instances. |
Read and select Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start. In the confirmation dialog, click OK.
The task appears in the task list. Monitor its progress there.