Use Data Transmission Service (DTS) to continuously replicate data from ApsaraDB RDS for MySQL to an AnalyticDB for PostgreSQL instance in Serverless mode. DTS handles schema synchronization, full data load, and ongoing incremental replication so your analytics database stays in sync with your transactional source.
Prerequisites
Before you begin, make sure you have:
- An ApsaraDB RDS for MySQL instance. See Create an ApsaraDB RDS for MySQL instance.
- An AnalyticDB for PostgreSQL instance running V1.0.3.1 or later in Serverless mode. See Create an instance. To check or update the minor version, see View the minor engine version and Update the minor engine version.
- A database account on the source RDS MySQL instance with the REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions.
- A database account on the destination AnalyticDB for PostgreSQL instance. Use the initial account or an account with the RDS_SUPERUSER permission. See Manage users and permissions.
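The source account can be prepared with standard MySQL statements. A minimal sketch follows; the account name `dts_sync`, the host pattern, and the password are hypothetical placeholders for your own values.

```sql
-- Hypothetical source account for DTS with the required permissions.
-- Replace the account name, host pattern, and password with your own.
CREATE USER 'dts_sync'@'%' IDENTIFIED BY 'your_password';
GRANT REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, SELECT
  ON *.* TO 'dts_sync'@'%';
FLUSH PRIVILEGES;
```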
Binary log requirements for the source database
DTS reads MySQL binary logs for incremental replication. Verify the following settings on your source instance:
| Parameter | Required value | Notes |
|---|---|---|
| Binary logging | Enabled | Enabled by default on ApsaraDB RDS for MySQL |
| binlog_row_image | full | Precheck fails if this parameter is not set to full. See Modify instance parameters. |
| Binary log retention | More than 7 days | DTS may fail to retrieve logs if the retention period is shorter. See Manage binary log files. |
The source instance cannot be a read-only ApsaraDB RDS for MySQL instance that does not record transaction logs, such as an ApsaraDB RDS for MySQL V5.6 read-only instance.
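You can verify these settings on the source instance with the following queries. The expected values reflect the table above; binary log retention on ApsaraDB RDS for MySQL is managed in the console rather than through a variable.

```sql
-- Run on the source MySQL instance to verify binary log settings.
SHOW VARIABLES LIKE 'log_bin';           -- expect ON
SHOW VARIABLES LIKE 'binlog_row_image';  -- expect FULL
SHOW VARIABLES LIKE 'binlog_format';     -- expect ROW (required for self-managed MySQL)
```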
Supported source database types
This guide uses ApsaraDB RDS for MySQL as the source. The same procedure applies to the following MySQL source types:
- ApsaraDB RDS for MySQL instance
- Self-managed MySQL database hosted on Elastic Compute Service (ECS)
- Self-managed MySQL database connected over Express Connect, VPN Gateway, or Smart Access Gateway (SAG)
- Self-managed MySQL database connected over Database Gateway
- Self-managed MySQL database connected over Cloud Enterprise Network (CEN)
DTS also supports PostgreSQL, SQL Server, and Db2 as source databases. For the full list, see Supported databases.
Limitations
Source database
- Tables must have a PRIMARY KEY or UNIQUE constraint, and the constrained fields must be unique. Otherwise, the destination database may contain duplicate records.
- If you select individual tables as sync objects and need to rename tables or columns, a single task supports up to 1,000 tables. For more than 1,000 tables, split the work across multiple tasks or sync the entire database instead.
- Do not perform DDL operations that modify primary keys or add comments during synchronization (for example, `ALTER TABLE table_name COMMENT='Table comments';`). These operations do not take effect during synchronization.
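Before configuring the task, you can list source tables that lack the required constraint. The following query is a sketch; `your_database` is a placeholder for the source database name.

```sql
-- List tables that have neither a PRIMARY KEY nor a UNIQUE constraint.
SELECT t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
  ON c.table_schema = t.table_schema
 AND c.table_name = t.table_name
 AND c.constraint_type IN ('PRIMARY KEY', 'UNIQUE')
WHERE t.table_schema = 'your_database'
  AND t.table_type = 'BASE TABLE'
  AND c.constraint_name IS NULL;
```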
Objects and data types
- Only tables can be selected as sync objects.
- The following data types are not synchronized: BIT, VARBIT, GEOMETRY, ARRAY, UUID, TSQUERY, TSVECTOR, TXID_SNAPSHOT, and POINT.
- Prefix indexes are not synchronized. Tables with prefix indexes may fail to sync.
- Append-optimized (AO) tables cannot be used as destination tables.
- If column mapping is used for non-full-table synchronization, or if the source and destination schemas differ, data in columns absent from the destination table is lost.
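Of the unsupported types listed above, only a few can appear on a MySQL source (the rest are PostgreSQL-side types). Assuming that, a query like the following can flag affected columns before you configure the task; `your_database` is a placeholder.

```sql
-- Find source columns whose types DTS will not synchronize.
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'your_database'
  AND data_type IN ('bit', 'geometry', 'point');
```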
Operations
- DTS disables FOREIGN KEY constraints on the destination database. Cascade and delete operations on the source are not replicated to the destination.
- Initial full data synchronization uses read and write resources on both the source and the destination. Run it during off-peak hours to limit the impact.
- During initial full data synchronization, concurrent INSERT operations cause table fragmentation in the destination. As a result, the destination tablespace is larger than the source tablespace after the full load completes.
- If you selected individual tables (not an entire database) as sync objects, do not use pt-online-schema-change for online DDL operations during synchronization. Use Data Management (DMS) instead. See Perform lock-free DDL operations.
- During synchronization, write to the destination database only through DTS. Writing through other tools risks data inconsistency and can cause data loss when DMS online DDL operations are used.
Self-managed MySQL sources
- A primary/secondary switchover on the source while the task is running causes the task to fail.
- Synchronization latency is calculated based on the timestamp of the latest replicated row versus the current source time. If the source has no DML activity for a long time, the reported latency may be inaccurate. Run a DML operation on the source to refresh the latency reading.
  If you sync an entire database, create a heartbeat table that receives a write every second to keep the latency measurement accurate.
- DTS periodically executes `CREATE DATABASE IF NOT EXISTS 'test'` on the source to advance the binary log file position.
- For self-managed MySQL, set `binlog_format` to `row` in addition to the requirements listed in Binary log requirements.
- For a self-managed MySQL database in a dual-primary cluster, set `log_slave_updates` to `ON` so that DTS can read all binary logs. See Create an account for a self-managed MySQL database and configure binary logging.
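The heartbeat table mentioned above can be sketched with the MySQL event scheduler. The table and event names (`dts_heartbeat`, `dts_heartbeat_tick`) are hypothetical; a cron job that runs the UPDATE works equally well if the event scheduler is unavailable.

```sql
-- Hypothetical heartbeat table: one row updated every second so that
-- DTS always sees recent DML and reports accurate latency.
CREATE TABLE dts_heartbeat (
  id INT PRIMARY KEY,
  ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
INSERT INTO dts_heartbeat (id) VALUES (1);

SET GLOBAL event_scheduler = ON;  -- requires sufficient privileges
CREATE EVENT dts_heartbeat_tick
  ON SCHEDULE EVERY 1 SECOND
  DO UPDATE dts_heartbeat SET ts = CURRENT_TIMESTAMP WHERE id = 1;
```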
Billing
| Synchronization stage | Cost |
|---|---|
| Schema synchronization | Free |
| Full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview. |
Supported synchronization topologies
- One-way one-to-one synchronization
- One-way one-to-many synchronization
- One-way many-to-one synchronization
Supported SQL operations
| Type | Operations |
|---|---|
| DML | INSERT, UPDATE, DELETE |
| DDL | ADD COLUMN |
CREATE TABLE is not supported. To synchronize a new table, add it to the selected objects manually. See Add an object to a data synchronization task.
Term mappings
| MySQL | AnalyticDB for PostgreSQL |
|---|---|
| Database | Schema |
| Table | Table |
Create a data synchronization task
- Go to the Data Synchronization Tasks page of the new DTS console.
  Alternatively, log on to the DMS console. In the top navigation bar, click DTS, then choose DTS (DTS) > Data Synchronization in the left-side navigation pane.
- In the upper-left corner, select the region where the data synchronization instance resides.
- Click Create Task and configure the source and destination databases.

  Source database

  | Parameter | Setting |
  |---|---|
  | Select Existing Connection | (Optional) Select an existing ApsaraDB RDS for MySQL connection |
  | Database Type | MySQL |
  | Access Method | Alibaba Cloud Instance |
  | Instance Region | Region of the source ApsaraDB RDS for MySQL instance |
  | Replicate Data Across Alibaba Cloud Accounts | No (for this example) |
  | RDS Instance ID | ID of the source ApsaraDB RDS for MySQL instance |
  | Database Account | Account with the REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions |
  | Database Password | Password of the database account |
  | Encryption | Non-encrypted or SSL-encrypted. If you choose SSL-encrypted, enable SSL encryption on the RDS instance first. See Configure the SSL encryption feature. |

  Destination database

  | Parameter | Setting |
  |---|---|
  | Select Existing Connection | (Optional) Select an existing AnalyticDB for PostgreSQL Serverless connection |
  | Database Type | AnalyticDB for PostgreSQL |
  | Access Method | Alibaba Cloud Instance |
  | Instance Region | Region of the destination AnalyticDB for PostgreSQL Serverless instance |
  | Instance ID | ID of the destination AnalyticDB for PostgreSQL Serverless instance |
  | Database Name | Name of the destination database |
  | Database Account | Initial account or an account with the RDS_SUPERUSER permission |
  | Database Password | Password of the database account |
- Click Test Connectivity and Proceed.
  DTS adds its server CIDR blocks to the whitelist of the ApsaraDB RDS for MySQL instance. See Add the CIDR blocks of DTS servers. After the task is completed or released, remove these CIDR blocks from the whitelist manually.
- Select objects and configure task settings.

  | Parameter | Description |
  |---|---|
  | Task Name | A descriptive name for the task. Names do not need to be unique. |
  | Task Stages | Select Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization. All three stages are required: schema and full data sync establish the baseline, and incremental sync keeps the destination up to date. |
  | Processing Mode of Conflicting Tables | Precheck and Report Errors (recommended): fails the precheck if identical table names exist in the destination. Use object name mapping to rename conflicting tables. See Map object names. Ignore Errors and Proceed: skips the check. If records share a primary key with existing destination rows, full sync skips them (existing rows are kept) and incremental sync overwrites them. If the source and destination schemas differ, initialization may fail; in that case, only some columns are synchronized, or the task fails. |
  | DDL and DML Operations to Be Synchronized | Select the operations to sync. See Supported SQL operations. To configure per-table operations, right-click an object in Selected Objects. |
  | Select Objects | In Source Objects, select the tables to sync and click the arrow icon to add them to Selected Objects. Only tables can be selected. |
  | Rename Databases and Tables | Right-click an object in Selected Objects to rename it. To rename multiple objects at once, click Batch Edit. See Map object names. |
  | Filter Data | Specify WHERE conditions to filter rows. See Set filter conditions. |
Click Next: Advanced Settings and configure the following options.
Parameter Description Monitoring and Alerting Select Yes to receive notifications when the task fails or synchronization latency exceeds the alert threshold. Copy the temporary table of the Online DDL tool that is generated in the source table to the destination database Yes: syncs temporary tables created by DMS online DDL operations (may increase latency for large DDL jobs). No: syncs only the original DDL; destination tables may be locked during the operation. Retry Time for Failed Connections Time range for automatic reconnection after a disconnection. Valid values: 10–1,440 minutes. Default: 120 minutes. We recommend that you set the retry time range to more than 30 minutes. If DTS reconnects within this window, the task resumes automatically; otherwise, it fails. Enclose Object Names in Quotation Marks Select Yes if object names are case-sensitive, contain non-standard characters, or are reserved keywords in the destination. After sync, query these objects using quoted names. Configure ETL Select Yes to transform data during sync using extract, transform, and load (ETL) statements. See Configure ETL in a data migration or synchronization task. -
- Click Next: Configure Database and Table Fields. Set the primary key columns and distribution columns for the destination AnalyticDB for PostgreSQL tables.
- Click Next: Save Task Settings and Precheck. DTS runs a precheck before the task can start. If the precheck fails:
  - Click View Details next to each failed item, fix the issue, then click Precheck Again.
  - If an item is flagged as an alert and can be ignored, click Confirm Alert Details, click Ignore in the dialog, and then click Precheck Again. Ignoring alerts may cause data inconsistency.

  To preview the API parameters for this task configuration, hover over the button and click Preview OpenAPI parameters before proceeding.
- Wait until the success rate reaches 100%, then click Next: Purchase Instance.
- On the Purchase Instance page, configure the billing settings.

  | Parameter | Description |
  |---|---|
  | Billing Method | Subscription: pay upfront for a set duration; more cost-effective for long-term use. Pay-as-you-go: billed hourly; suitable for short-term use. Release the instance when it is no longer needed to stop charges. |
  | Resource Group | Resource group for the instance. Default: default resource group. See What is Resource Management? |
  | Instance Class | Determines synchronization performance and speed. See Specifications of data synchronization instances. |
  | Duration | Subscription only. Choose 1 to 9 months, or 1, 2, 3, or 5 years. |
- Accept the Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start.
  The task appears in the task list. Monitor its progress there.
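The Configure Database and Table Fields step above asks for primary key columns and distribution columns. As a rough sketch of what a resulting destination table looks like (the table and column names are hypothetical), the primary key column typically doubles as the distribution column so that rows with the same key land on the same segment:

```sql
-- Hypothetical destination table in AnalyticDB for PostgreSQL.
CREATE TABLE orders (
  order_id    BIGINT NOT NULL,
  customer_id BIGINT,
  amount      NUMERIC(12,2),
  created_at  TIMESTAMP,
  PRIMARY KEY (order_id)
) DISTRIBUTED BY (order_id);
```

Choosing a high-cardinality distribution column helps spread data evenly across segments and avoids write hotspots.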
FAQ
Schema synchronization keeps failing even though my table schemas match. What should I do?
Submit a ticket and include the error details.
Should I run VACUUM on the destination database?
DTS does not run VACUUM automatically during synchronization because it can slow down writes. Run VACUUM on your AnalyticDB for PostgreSQL database periodically as part of routine maintenance.
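Routine maintenance can be sketched as follows; `your_table` is a placeholder for a frequently updated destination table. Run these during off-peak hours.

```sql
-- Reclaim space across the current database.
VACUUM;
-- Reclaim space and refresh planner statistics for one table.
VACUUM ANALYZE your_table;
```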
What do I do if an exception occurs during full data synchronization?
Clear all data from the destination table and restart the synchronization task to reload it.
Synchronization is slow for hot rows or when writing from many tables at once. How can I improve performance?
AnalyticDB for PostgreSQL in Serverless mode performs well for bulk writes from a single table but may be slower for frequent small writes to hot rows or concurrent writes from many tables. Submit a ticket for parameter tuning specific to your workload.