Data Transmission Service (DTS) keeps an AnalyticDB for PostgreSQL instance in sync with an ApsaraDB RDS for MySQL source, enabling real-time analytics on live business data. This topic shows you how to configure a data synchronization task for this pipeline.
Supported source databases
The procedure in this topic uses an ApsaraDB RDS for MySQL instance as the source. The same steps apply when the source is any of the following MySQL database types:
- ApsaraDB RDS for MySQL instance
- Self-managed MySQL database hosted on Elastic Compute Service (ECS)
- Self-managed MySQL database connected over Express Connect, VPN Gateway, or Smart Access Gateway
- Self-managed MySQL database connected over Database Gateway
- Self-managed MySQL database connected over Cloud Enterprise Network (CEN)
Prerequisites
Before you begin, ensure that you have:
- An ApsaraDB RDS for MySQL source instance
- A destination AnalyticDB for PostgreSQL instance. See Create an instance.
- A database account on the source instance with the REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions
- A database account on the destination instance with the initial account role or the RDS_SUPERUSER permission. See Manage users and permissions.
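The source account permissions listed above can be granted as follows. This is a minimal sketch: the account name `dts_sync`, the host wildcard, and the password are placeholders, and you may scope SELECT and SHOW VIEW to specific databases instead of `*.*`.

```sql
-- Hypothetical account; replace the name, host, and password with your own.
CREATE USER 'dts_sync'@'%' IDENTIFIED BY 'your_password';

-- Replication privileges are global and must be granted ON *.*.
GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'dts_sync'@'%';

-- Read privileges; narrow to the databases you plan to synchronize if possible.
GRANT SELECT, SHOW VIEW ON *.* TO 'dts_sync'@'%';
FLUSH PRIVILEGES;
```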
Binary log requirements
Verify the following binary log settings before you configure DTS.
For ApsaraDB RDS for MySQL (binary logging is enabled by default):
- Set `binlog_row_image` to `full`. See Modify instance parameters.
- Retain binary logs for at least 3 days (7 days recommended). See Delete binary log files.
For self-managed MySQL:
- Enable binary logging.
- Set `binlog_format` to `row`.
- Set `binlog_row_image` to `full`.
- Retain binary logs for at least 7 days.
- For dual-primary clusters, set `log_slave_updates` to `ON`. See Create an account for a self-managed MySQL database and configure binary logging.
If binary logs are purged before DTS reads them, the synchronization task fails and data inconsistency or loss may occur. DTS does not guarantee Service Level Agreement (SLA) performance if retention periods are shorter than the minimums above.
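You can verify these settings on the source before configuring DTS. A sketch of the checks, assuming MySQL 8.0 (on MySQL 5.7 and earlier, retention is controlled by `expire_logs_days` instead of `binlog_expire_logs_seconds`):

```sql
SHOW VARIABLES LIKE 'log_bin';           -- expect ON
SHOW VARIABLES LIKE 'binlog_format';     -- expect ROW
SHOW VARIABLES LIKE 'binlog_row_image';  -- expect FULL

-- Retention in seconds; 7 days = 604800 (MySQL 8.0 variable name).
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
```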
Billing
| Synchronization type | Fee |
|---|---|
| Schema synchronization and full data synchronization | Free of charge |
| Incremental data synchronization | Charged. See Billing overview |
Limitations
Source database
- Tables must have PRIMARY KEY or UNIQUE constraints, and all fields of those constraints must be unique. Tables without such constraints may produce duplicate records in the destination.
- If you synchronize individual tables (rather than entire databases) and need to edit them, such as renaming tables or columns, a single task supports up to 1,000 tables. If a task exceeds this limit, a request error occurs. For larger sets, configure multiple tasks or synchronize at the database level.
- DTS does not synchronize the following data types: BIT, VARBIT, GEOMETRY, ARRAY, UUID, TSQUERY, TSVECTOR, TXID_SNAPSHOT, and POINT.
- Prefix indexes are not supported and may cause the task to fail.
- Only tables can be selected as synchronization objects. Indexes, partitions, views, stored procedures, functions, triggers, and foreign keys are excluded.
- Do not execute DDL statements that modify primary keys or add comments during synchronization. For example, `ALTER TABLE table_name COMMENT='...'` has no effect.
- Do not execute DDL statements that change database or table schemas during the schema synchronization or full data synchronization phases.
- If the source is MySQL 8.0.23 or later and tables include invisible columns, DTS cannot read those columns and data loss occurs. Run `ALTER TABLE <table_name> ALTER COLUMN <column_name> SET VISIBLE;` to make columns visible before starting the task. Tables without explicit primary keys auto-generate invisible primary keys, which must also be made visible. See Invisible Columns and Generated Invisible Primary Keys.
- If the source is an ApsaraDB RDS for MySQL instance with EncDB enabled, full data synchronization is not supported. Instances with Transparent Data Encryption (TDE) enabled support all three synchronization types.
- DATETIME values of `0000-00-00 00:00:00` are converted to `null` in the destination. To avoid this, change the value to `0001-01-01 00:00:00` in the source or leave the destination field blank before starting the task.
- Data generated by binary log change operations, such as data restored from a physical backup or data from cascade operations, is not recorded or synchronized to the destination while the synchronization instance is running. If this occurs, you can remove the affected databases or tables from the synchronized objects and then re-add them, provided your business is not impacted. See Modify the objects to be synchronized.
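Two of the checks above can be scripted before you start the task. A sketch, assuming MySQL 8.0.23+ and a hypothetical `orders` table with a `created_at` column; adjust names to your schema:

```sql
-- Find invisible columns that DTS cannot read (MySQL 8.0.23+).
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE extra LIKE '%INVISIBLE%'
  AND table_schema NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema');

-- Make an affected column visible (placeholder names from the list above).
-- ALTER TABLE <table_name> ALTER COLUMN <column_name> SET VISIBLE;

-- Replace zero DATETIME values that would become NULL in the destination.
UPDATE orders
SET created_at = '0001-01-01 00:00:00'
WHERE created_at = '0000-00-00 00:00:00';
```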
Destination database
- The destination table cannot be an append-optimized (AO) table.
- The unique key (including the primary key) of the destination table must include all columns of the distribution key.
- If the source table has a primary key, the destination primary key column matches the source. If the source table has no primary key, the destination primary key column matches the distribution key.
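The distribution-key constraint above can be illustrated with a destination table definition. A sketch in AnalyticDB for PostgreSQL (Greenplum-style) syntax; the table and column names are hypothetical:

```sql
-- The primary key must contain every distribution key column:
-- here shop_id is the distribution key, so it appears in the primary key.
CREATE TABLE public.orders (
    order_id  bigint NOT NULL,
    shop_id   bigint NOT NULL,
    amount    numeric(12, 2),
    PRIMARY KEY (order_id, shop_id)
)
DISTRIBUTED BY (shop_id);
```

A primary key of `(order_id)` alone would violate the rule, because it omits the distribution key column `shop_id`.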
Other limitations
- Data written to the destination from other sources during synchronization may cause inconsistency. Avoid concurrent writes to destination tables while DTS is running.
- Online DDL tools such as `pt-online-schema-change` are not supported when individual tables are selected as objects. Use Data Management (DMS) for lock-free DDL instead. See Perform lock-free DDL operations.
- If column mapping is used or source and destination schemas differ, columns present in the source but absent in the destination are not synchronized.
- During schema synchronization, DTS synchronizes foreign keys from the source database to the destination database.
- DTS temporarily disables foreign key constraint checks and cascade operations at the session level during full and incremental data synchronization. Cascade updates or deletes on the source during this period may cause data inconsistency.
- During initial full data synchronization, concurrent INSERT operations cause table fragmentation in the destination. The destination tablespace will be larger than the source after full sync completes. Schedule initial synchronization during off-peak hours to reduce load.
- If a DTS task fails, DTS technical support attempts to restore it within 8 hours. The task may be restarted and its parameters adjusted during recovery.
Self-managed MySQL source
- A primary/secondary switchover while the task is running causes the task to fail.
- DTS calculates synchronization latency based on the latest synchronized timestamp versus the current source timestamp. If no DML operations run on the source for an extended period, the reported latency may be inaccurate. Run a DML operation on the source to refresh the latency metric. If you select an entire database as the synchronization object, you can create a heartbeat table that is updated every second to keep the latency metric accurate.
- DTS periodically executes `CREATE DATABASE IF NOT EXISTS 'test'` on the source to advance the binary log position.
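A heartbeat table like the one mentioned above can be as simple as a single-row table touched on a schedule. A sketch with a hypothetical `dts_heartbeat` table; drive the update from a cron job, a scheduler, or a MySQL event that fires every second:

```sql
-- One-row table on the source whose timestamp is refreshed periodically.
CREATE TABLE IF NOT EXISTS dts_heartbeat (
    id          INT PRIMARY KEY,
    update_time DATETIME NOT NULL
);

-- Run on a schedule: insert the row once, then keep bumping its timestamp.
INSERT INTO dts_heartbeat (id, update_time)
VALUES (1, NOW())
ON DUPLICATE KEY UPDATE update_time = NOW();
```

Each update produces a binary log event, so the latency metric always has recent DML to measure against.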
ApsaraDB RDS for MySQL source
- ApsaraDB RDS for MySQL instances that do not record transaction logs, such as read-only ApsaraDB RDS for MySQL 5.6 instances, cannot be used as the source.
- DTS periodically executes `CREATE DATABASE IF NOT EXISTS 'test'` on the source to advance the binary log position.
SQL operations that can be synchronized
DML: INSERT, UPDATE, DELETE
When writing to AnalyticDB for PostgreSQL, DTS converts UPDATE to REPLACE INTO. If UPDATE targets primary key columns, DTS converts it to DELETE followed by INSERT.
DDL: CREATE TABLE, ALTER TABLE, TRUNCATE TABLE, DROP TABLE
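The UPDATE handling described above can be sketched in SQL. This is a conceptual illustration only (DTS performs the conversion internally); the `orders` table, its columns, and the PostgreSQL-style upsert shown as an analogue of REPLACE INTO are all assumptions for the example:

```sql
-- Source (MySQL): UPDATE that does not touch the primary key.
UPDATE orders SET amount = 99.00 WHERE order_id = 1;
-- Applied to the destination roughly as a full-row replace
-- (PostgreSQL-style conceptual equivalent of REPLACE INTO):
INSERT INTO orders (order_id, amount) VALUES (1, 99.00)
ON CONFLICT (order_id) DO UPDATE SET amount = EXCLUDED.amount;

-- Source (MySQL): UPDATE that changes a primary key value.
UPDATE orders SET order_id = 2 WHERE order_id = 1;
-- Replayed on the destination as DELETE followed by INSERT:
DELETE FROM orders WHERE order_id = 1;
INSERT INTO orders (order_id, amount) VALUES (2, 99.00);
```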
Supported synchronization topologies
- One-way one-to-one
- One-way one-to-many
- One-way many-to-one
Term mappings
| MySQL | AnalyticDB for PostgreSQL |
|---|---|
| Database | Schema |
| Table | Table |
Configure a data synchronization task
This procedure has 9 steps: open the task page, select a region, configure source and destination databases, test connectivity, configure objects and settings, configure advanced settings, set key columns, run the precheck, and purchase and start the instance.
Step 1: Open the Data Synchronization Tasks page
1. Log on to the Data Management (DMS) console.
2. In the top navigation bar, click Data + AI.
3. In the left-side navigation pane, choose DTS (DTS) > Data Synchronization.
Console navigation may vary based on your DMS layout. See Simple mode and Customize the layout and style of the DMS console. Alternatively, go directly to the Data Synchronization Tasks page.
Step 2: Select a region
On the right side of Data Synchronization Tasks, select the region where the data synchronization instance resides.
In the new DTS console, select the region from the top navigation bar.
Step 3: Configure the source and destination databases
Click Create Task. In the wizard, configure the following parameters.
Source database
| Parameter | Description |
|---|---|
| Task Name | A name for the DTS task. DTS generates one automatically. Use a descriptive name for easy identification — uniqueness is not required. |
| Database Type | Select MySQL. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region of the source ApsaraDB RDS for MySQL instance. |
| Replicate Data Across Alibaba Cloud Accounts | Select No for same-account synchronization. |
| RDS Instance ID | The ID of the source instance. The source and destination can be the same instance (for within-instance synchronization) or different instances. |
| Database Account | The account with REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions. |
| Database Password | The password for the database account. |
| Encryption | Select Non-encrypted or SSL-encrypted. To use SSL encryption, enable it on the RDS instance first. See Use a cloud certificate to enable SSL encryption. |
Destination database
| Parameter | Description |
|---|---|
| Database Type | Select AnalyticDB for PostgreSQL. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region of the destination AnalyticDB for PostgreSQL instance. |
| Instance ID | The ID of the destination instance. |
| Database Name | The name of the target database in the destination instance. |
| Database Account | The initial account or an account with the RDS_SUPERUSER permission. |
| Database Password | The password for the database account. |
Step 4: Test connectivity
Click Test Connectivity and Proceed.
DTS automatically adds its server CIDR blocks to the whitelist of Alibaba Cloud database instances and to the security group rules of ECS-hosted databases. If the database is hosted on ECS, you must also ensure that the ECS instance can access the database. If the database is deployed across multiple ECS instances, you must manually add the DTS CIDR blocks to the security group rules of each ECS instance. For self-managed databases in data centers or third-party clouds, manually add the DTS CIDR blocks to the database whitelist. See CIDR blocks of DTS servers.
Adding DTS CIDR blocks to whitelists or security groups introduces security exposure. Before proceeding, take preventive measures such as enforcing strong credentials, restricting exposed ports, regularly auditing whitelist rules, and removing unauthorized CIDR blocks. Consider connecting through Express Connect, VPN Gateway, or Smart Access Gateway to reduce exposure.
Step 5: Configure synchronization objects and settings
| Parameter | Description |
|---|---|
| Synchronization Types | Select Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization. By default, Incremental Data Synchronization is selected; you must also select the other two. Schema and full synchronization establish the baseline; incremental synchronization keeps the destination current afterward. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (default): fails the precheck if identically named tables exist in the destination. Use object name mapping to rename conflicting tables if needed. See Map object names. Ignore Errors and Proceed: skips the precheck for name conflicts. During full sync, existing destination records with matching primary or unique key values are retained. During incremental sync, they are overwritten. If schemas differ, initialization may fail, only partial columns are synchronized, or the data synchronization instance fails — proceed with caution. |
| DDL and DML Operations to Be Synchronized | The SQL operations to synchronize. See SQL operations that can be synchronized. To configure per-object SQL filtering, right-click an object in Selected Objects and select the operations. |
| Source Objects | Select objects and click the arrow icon to add them to Selected Objects. Only tables are supported. |
| Selected Objects | To rename a single object, right-click it. For bulk renaming, click Batch Edit. See Map object names. To filter rows, right-click a table and specify WHERE conditions. See Specify filter conditions. Renaming an object may break synchronization of dependent objects. |
Step 6: Configure advanced settings
Click Next: Advanced Settings.
Data verification
See Configure data verification.
Advanced settings
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS schedules to the shared cluster. Purchase a dedicated cluster for improved stability. See What is a DTS dedicated cluster. |
| Set Alerts | No: alerting disabled. Yes: configure alert thresholds and contacts. See Configure monitoring and alerting. |
| Retry Time for Failed Connections | How long DTS retries on connection failure. Range: 10–1440 minutes. Default: 720 minutes. Set to 30 minutes or more. If multiple tasks share the same source or destination, the shortest retry window takes precedence. DTS charges for the instance during retries — release the instance promptly if the source and destination are decommissioned. |
| Retry Time for Other Issues | How long DTS retries on DDL/DML failures. Range: 1–1440 minutes. Default: 10 minutes. Must be less than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Synchronization | Limit QPS to the source, RPS for full sync, and data throughput (MB/s) to reduce load during initial full data synchronization. Available only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Limit RPS and throughput (MB/s) for incremental synchronization to reduce load on the destination. |
| Environment Tag | Tag the DTS instance by environment. Optional. |
| Enclose Object Names in Quotation Marks | Yes: DTS encloses schema, table, and column names in single or double quotation marks during schema and incremental synchronization when names contain mixed case, start with non-letters, include non-standard characters, or conflict with reserved words in the destination. No: names are not quoted. |
| Configure ETL | Yes: enter data transformation statements in the code editor. See Configure ETL. No: no transformation. |
| Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Yes: DTS does not write heartbeat table operations to the source. A latency metric may be displayed on the task. No: DTS writes heartbeat operations to the source, which may affect physical backup and cloning of the source database. |
Step 7: Set primary key and distribution key columns
Click Next: Configure Database and Table Fields. Set the primary key and distribution key columns for each table being synchronized to the destination AnalyticDB for PostgreSQL instance.
Step 8: Run the precheck
Click Next: Save Task Settings and Precheck.
To preview the equivalent API parameters before saving, hover over the button and click Preview OpenAPI parameters.
DTS runs a precheck before starting synchronization. If any item fails:
- Click View Details next to the failed item, resolve the issue, and then click Precheck Again.
- If an alert item can be safely ignored, click Confirm Alert Details, click Ignore in the dialog, click OK, and then click Precheck Again. Ignoring alerts may lead to data inconsistency.
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
Step 9: Purchase and start the instance
On the buy page, configure the following parameters.
| Parameter | Description |
|---|---|
| Billing Method | Pay-as-you-go: billed hourly. Release the instance when no longer needed to avoid ongoing charges. |
| Resource Group Settings | The resource group for the instance. Default: default resource group. See What is Resource Management? |
| Instance Class | The synchronization throughput class. See Instance classes of data synchronization instances. |
Read and select Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start. In the confirmation dialog, click OK.
The task appears in the task list. Monitor its progress there.