Data Transmission Service (DTS) lets you continuously synchronize data from a PolarDB for MySQL cluster to an AnalyticDB for MySQL V3.0 cluster. This enables you to use AnalyticDB for MySQL to power internal business intelligence (BI) systems, interactive query systems, and real-time reporting systems.
What this task supports:
- Full initial load followed by continuous incremental synchronization using binary logging (CDC)
- Three synchronization topologies: one-to-one, one-to-many, and many-to-one
- DML operations (INSERT, UPDATE, DELETE) and the most common DDL operations
- Optional multi-table merging for OLAP consolidation scenarios
Prerequisites
Before you begin, make sure you have:
- A PolarDB for MySQL cluster. See Purchase a subscription cluster or Purchase a pay-as-you-go cluster.
- A destination AnalyticDB for MySQL V3.0 cluster. See Create a cluster.
- Available storage space on the destination cluster that exceeds the total data size of the source cluster.
Billing
| Synchronization type | Fee |
|---|---|
| Schema synchronization and full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview |
Supported synchronization topologies
- One-way one-to-one synchronization
- One-way one-to-many synchronization
- One-way many-to-one synchronization
SQL operations that can be synchronized
| Type | Supported operations |
|---|---|
| DML | INSERT, UPDATE, DELETE |
| DDL | CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE, ADD COLUMN, MODIFY COLUMN, DROP COLUMN |
How DML operations behave on the destination:
- UPDATE statements are automatically converted to REPLACE INTO statements when they are written to the destination AnalyticDB for MySQL V3.0 cluster.
- An UPDATE statement that changes primary key values is converted to a DELETE statement followed by an INSERT statement.
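As an illustration of this conversion (the customer table and its columns are hypothetical; the exact statements that DTS generates may differ):

```sql
-- Source statement on PolarDB for MySQL, where id is the primary key:
UPDATE customer SET name = 'Alice' WHERE id = 1;
-- Written to AnalyticDB for MySQL V3.0 as:
REPLACE INTO customer (id, name) VALUES (1, 'Alice');

-- Source statement that changes the primary key value:
UPDATE customer SET id = 2 WHERE id = 1;
-- Written to AnalyticDB for MySQL V3.0 as:
DELETE FROM customer WHERE id = 1;
INSERT INTO customer (id, name) VALUES (2, 'Alice');
```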
If you change the data type of a field in the source table during synchronization, the data synchronization task fails. To recover: 1. Create a new table (for example, customer_new) in the destination cluster with the same schema as the updated source table. 2. Copy all data from the original destination table into the new table by using an INSERT INTO ... SELECT statement. 3. Rename or delete the original destination table, and then rename the new table to the original table name. 4. Restart the data synchronization task in the DTS console.
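The recovery procedure can be sketched in SQL against the destination cluster. The customer and customer_new names and the column definitions are hypothetical, the exact rename syntax may vary by engine version, and the final restart is performed in the DTS console rather than in SQL:

```sql
-- Create a new table with the same schema as the updated source table
-- (here, amount is the column whose data type changed on the source).
CREATE TABLE customer_new (
  id BIGINT NOT NULL,
  amount DECIMAL(20, 4),
  PRIMARY KEY (id)
);

-- Copy all data from the original destination table into the new table.
INSERT INTO customer_new (id, amount)
SELECT id, amount FROM customer;

-- Remove the original table, then rename the new table to the original name.
DROP TABLE customer;
RENAME TABLE customer_new TO customer;
```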
Limitations
Source database requirements
Tables to be synchronized must have a PRIMARY KEY or UNIQUE constraint, and the constrained fields must be unique. Otherwise, the destination database may contain duplicate records.
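For example, a source table that lacks a key can be made compliant before you start the task (the orders table and order_id column are hypothetical names):

```sql
-- Add a primary key to the source table:
ALTER TABLE orders ADD PRIMARY KEY (order_id);

-- Alternatively, add a unique constraint:
ALTER TABLE orders ADD UNIQUE KEY uk_order_id (order_id);
```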
If you synchronize tables individually (not at the database level) and need to rename tables or columns in the destination, a single task supports up to 1,000 tables. Exceeding this limit causes a request error. Split the workload across multiple tasks, or synchronize at the database level instead.
Binary logging must be enabled, and the loose_polar_log_bin parameter must be set to ON. Otherwise, the precheck fails and the task cannot start. See Enable binary logging and Modify parameters.
Note: Enabling binary logging on a PolarDB for MySQL cluster incurs storage charges for the binary log files.
Retain binary logs for at least 3 days; 7 days is recommended. If the retention period is too short, data inconsistency or data loss may occur, and DTS Service Level Agreement (SLA) guarantees may not apply. See the Modify the retention period section in Enable binary logging.
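Before you configure the task, you can check the binary log settings on the source cluster. The following is a sketch using standard MySQL statements; depending on the engine version, the loose_-prefixed parameter may be exposed at runtime without the loose_ prefix:

```sql
-- Confirm that binary logging is active:
SHOW VARIABLES LIKE 'log_bin';

-- Check the PolarDB-specific switch (may appear as polar_log_bin at runtime):
SHOW VARIABLES LIKE '%polar_log_bin%';

-- Check the binary log retention period (in seconds on MySQL 8.0):
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
```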
During synchronization, do not execute DDL statements that modify primary keys or add comments (for example, ALTER TABLE table_name COMMENT='Table comments';). Such statements take no effect on the destination.
During schema synchronization and full data synchronization, do not run DDL statements that change database or table schemas. Otherwise, the task fails.
Destination and general limits
Prefix indexes: Prefix indexes cannot be synchronized. If the source database contains prefix indexes, the task may fail.
Primary key requirement: Specify a custom primary key in the destination database, or configure the Primary Key Column setting during the Configurations for Databases, Tables, and Columns step. Without a primary key, synchronization may fail.
Read-only nodes: DTS does not synchronize from read-only nodes of the source PolarDB for MySQL cluster.
OSS external tables: DTS does not synchronize Object Storage Service (OSS) external tables from the source cluster.
Disk usage threshold: If disk usage on the destination AnalyticDB for MySQL cluster exceeds 80%, an exception occurs and the task is delayed. Estimate disk requirements based on the objects to be synchronized and make sure the destination cluster has sufficient space.
Backup conflicts: If the destination AnalyticDB for MySQL V3.0 cluster is being backed up while the DTS task is running, the task fails.
Off-peak synchronization: Run synchronization during off-peak hours. Initial full data synchronization uses read and write resources on both source and destination databases, increasing server load.
Table fragmentation: Concurrent INSERT operations during initial full data synchronization cause table fragmentation on the destination. After full synchronization completes, the destination tablespace size is larger than that of the source.
Online DDL tools: Do not use tools such as pt-online-schema-change to perform DDL operations on source tables during synchronization. The task will fail.
If no data from other sources is written to the destination during synchronization, use Data Management (DMS) to perform online DDL operations. See Perform lock-free DDL operations.
If data from other sources is being written to the destination at the same time, data inconsistency or data loss can occur.
DDL execution failures: If DDL statements fail to execute on the destination, the task continues to run. Check the task logs to view the failed statements. See View task logs.
Task recovery: If a DTS task fails, DTS support will attempt to restore it within 8 hours. During this period, the task may be restarted and task parameters (not database parameters) may be modified. Parameters that may be changed are listed in Modify instance parameters.
Foreign keys: During schema synchronization, DTS synchronizes foreign keys from source to destination. During full and incremental synchronization, DTS temporarily disables foreign key constraint checks and cascade operations at the session level. Cascade update and delete operations on the source during synchronization may cause data inconsistency.
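The session-level behavior described above is equivalent to wrapping the writes that DTS applies in the following (standard MySQL syntax, shown only to illustrate the effect):

```sql
-- Disable foreign key constraint checks for the current session only;
-- other sessions and their constraints are unaffected.
SET SESSION foreign_key_checks = 0;

-- ... full and incremental data is applied here ...

-- Re-enable checks for the session.
SET SESSION foreign_key_checks = 1;
```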
Special behavior
DTS periodically executes CREATE DATABASE IF NOT EXISTS `test` on the source database to advance the binary log file position.
Create a data synchronization task
Step 1: Go to the Data synchronization page
Use the DTS console or the DMS console.
DTS console
Log on to the DTS console.
In the left-side navigation pane, click Data Synchronization.
In the upper-left corner, select the region where the data synchronization instance resides.
DMS console
The exact navigation path depends on the DMS console mode. See Simple mode and Customize the layout and style of the DMS console.
Log on to the DMS console.
In the top navigation bar, move the pointer over Data + AI and choose DTS > Data Synchronization.
From the drop-down list to the right of Data Synchronization Tasks, select the region where the data synchronization instance resides.
Step 2: Configure the source and destination databases
Click Create Task.
Configure the source and destination databases using the parameters in the following table.
Warning: After you configure the source and destination databases, read the limits displayed on the page before you proceed. Otherwise, the task may fail or data inconsistency may occur.
Source database
| Parameter | Description |
|---|---|
| Task Name | The name of the DTS task. DTS generates a name automatically. Specify a descriptive name to help identify the task. The name does not need to be unique. |
| Select Existing Connection | If you have a database instance registered with DTS, select it from the drop-down list. DTS automatically fills in the database parameters. See Manage database connections. In the DMS console, select the instance from the Select a DMS database instance drop-down list. If the instance is not registered with DTS, configure the following parameters manually. |
| Database Type | Select PolarDB for MySQL. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region where the source PolarDB for MySQL cluster resides. |
| PolarDB Cluster ID | The ID of the source PolarDB for MySQL cluster. |
| Database Account | The database account of the source cluster. The account must have read permissions on the objects to be synchronized. |
| Database Password | The password of the database account. |
| Encryption | Specifies whether to encrypt the connection to the source cluster. Configure this parameter based on your security requirements. See Configure SSL encryption. |

Destination database
| Parameter | Description |
|---|---|
| Select Existing Connection | If you have a database instance registered with DTS, select it from the drop-down list. See Manage database connections. In the DMS console, select the instance from the Select a DMS database instance drop-down list. If the instance is not registered with DTS, configure the following parameters manually. |
| Database Type | Select AnalyticDB MySQL 3.0. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region where the destination AnalyticDB for MySQL V3.0 cluster resides. |
| Instance ID | The ID of the destination AnalyticDB for MySQL V3.0 cluster. |
| Database Account | The database account of the destination cluster. The account must have read and write permissions on the destination database. |
| Database Password | The password of the database account. |

Click Test Connectivity and Proceed.
DTS server CIDR blocks must be added to the security settings of both the source and destination databases before connectivity can be established. This can be done automatically or manually. See Add the CIDR blocks of DTS servers.
Step 3: Configure objects to synchronize
In the Configure Objects step, configure the following parameters.
| Parameter | Description |
|---|---|
| Synchronization Types | Select Schema Synchronization and Full Data Synchronization in addition to the default Incremental Data Synchronization. DTS first synchronizes historical data from the source (full load), and then continuously applies changes (incremental). Note: With Full Data Synchronization selected, DTS also synchronizes the schema and data of tables created by CREATE TABLE statements. |
| DDL and DML Operations to Be Synchronized | The operations to synchronize. See the SQL operations that can be synchronized section. To restrict operations for a specific database or table, right-click the object in the Selected Objects section and select the operations to include. |
| Merge Tables | Yes: Merges multiple source tables that have the same schema into a single destination table. Useful in OLAP scenarios where sharded OLTP data needs to be consolidated. See Enable the multi-table merging feature. When enabled: rename the selected source tables to the destination table name by using the object name mapping feature (see Map object names); DTS adds a __dts_data_source column (TEXT type) to the destination table to identify each source, in the format instance_id:db_name:schema_name:table_name (example: dts********:dtstestdata:testschema:customer1); all selected source tables are merged. If you need to exclude specific tables, create a separate task for them. Warning: Do not run DDL operations that change source schemas while table merging is active. Otherwise, data inconsistency or task failure occurs. No (default): Each source table maps to its own destination table. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (recommended): The precheck verifies whether the destination contains tables that have the same names as the source tables. If identical names exist, the precheck fails and the task cannot start. Use the object name mapping feature to resolve naming conflicts. See Map object names. Ignore Errors and Proceed: Skips the name conflict check. Warning: This option may cause data inconsistency. During full synchronization, if a record in the destination has the same primary key or unique key value as a source record, the destination record is retained. During incremental synchronization, the destination record is overwritten. If the source and destination schemas differ, initialization may fail or only some columns are synchronized. |
| Capitalization of Object Names in Destination Instance | Controls the capitalization of database names, table names, and column names in the destination. The default is DTS default policy. See Specify the capitalization of object names in the destination instance. |
| Source Objects | Select the databases, tables, or columns to synchronize, and then click the right arrow to move them to the Selected Objects section. Note: If you select tables or columns, DTS does not synchronize views, triggers, or stored procedures. If you select a database, DTS applies the following defaults: tables that have a primary key use the primary key columns as distribution keys; tables without a primary key get an auto-generated auto-increment primary key, which may cause inconsistency between the source and destination. |
| Selected Objects | To rename a single object in the destination, right-click the object. See Map the name of a single object. To rename multiple objects at a time, click Batch Edit. See Map multiple object names at a time. To filter the rows of a specific table, right-click the table and specify WHERE conditions. See Specify filter conditions. |

Click Next: Advanced Settings and configure the following parameters.
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS schedules the task to the shared cluster. For higher stability, you can purchase a dedicated cluster. See What is a DTS dedicated cluster. |
| Copy the temporary table of the Online DDL tool that is generated in the source table to the destination database | Applies when you use DMS or gh-ost to perform online DDL operations. Yes: Synchronizes the data of temporary tables generated during online DDL. Note: Large-scale online DDL operations can significantly extend the task duration. No, Adapt to DMS Online DDL: Skips temporary table data and synchronizes only the final DDL from DMS. Destination tables may be locked temporarily. No, Adapt to gh-ost: Skips temporary table data and synchronizes only the final DDL from gh-ost. You can use default or custom regular expressions to filter out shadow tables. Destination tables may be locked temporarily. Note: If DTS must not synchronize DDL operations, set this parameter to No, Adapt to DMS Online DDL or No, Adapt to gh-ost. |
| Retry Time for Failed Connections | How long DTS retries when it cannot connect to the source or destination after the task starts. Valid values: 10 to 1440 minutes. Default: 720 minutes. We recommend a value of at least 30 minutes. If DTS reconnects within the retry window, the task resumes. Otherwise, it fails. Note: When multiple tasks share the same source or destination, the shortest retry time among them applies. DTS instance fees continue to accrue during the retry period. |
| Retry Time for Other Issues | How long DTS retries failed DDL or DML operations. Valid values: 1 to 1440 minutes. Default: 10 minutes. We recommend a value of at least 10 minutes. This value must be less than the Retry Time for Failed Connections value. |
| Enable Throttling for Full Data Migration | Limits the read and write load during full data synchronization. You can configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Migration | Limits the load during incremental data synchronization. You can configure RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s). |
| Environment Tag | A tag that identifies the DTS instance. Optional. Select a tag based on your environment, such as development, staging, or production. |
| Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Controls whether DTS writes heartbeat SQL statements to the source database while the task runs. Yes: DTS does not write heartbeat SQL statements. Synchronization latency may be displayed in the DTS console. No: DTS writes heartbeat SQL statements. This may affect physical backup and cloning operations on the source database. |
| Configure ETL | Specifies whether to enable the extract, transform, and load (ETL) feature. Yes: Enter data processing statements in the code editor. See Configure ETL in a data migration or data synchronization task. No: Disables ETL. |
| Monitoring and Alerting | Specifies whether to configure alerting. Yes: Configure alert thresholds and notification contacts. Alerts are sent if the task fails or the synchronization latency exceeds the threshold. See Configure monitoring and alerting when you create a DTS task. No: No alerts are sent. |

Click Next Step: Data Verification to configure data verification. For details, see Configure a data verification task.
(Optional) Click Next: Configure Database and Table Fields. Configure the Type, Primary Key Column, Distribution Key, and partition settings (Partition Key, Partitioning Rules, Partition Lifecycle) for the tables to be synchronized in the destination.
- This step is only available when Schema Synchronization is selected under Synchronization Types. To modify parameters, set Definition Status to All. - Primary Key Column supports multiple columns to form a composite primary key. In this case, specify one or more columns as the Distribution Key and Partition Key. See CREATE TABLE.
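A sketch of a destination table definition that combines these settings (the table name, columns, and 12-partition lifecycle are illustrative; see the CREATE TABLE reference above for the authoritative AnalyticDB for MySQL syntax):

```sql
CREATE TABLE customer_orders (
  customer_id BIGINT NOT NULL,
  order_date DATE NOT NULL,
  amount DECIMAL(20, 4),
  PRIMARY KEY (customer_id, order_date)              -- composite primary key
)
DISTRIBUTED BY HASH (customer_id)                    -- distribution key
PARTITION BY VALUE (DATE_FORMAT(order_date, '%Y%m')) -- partition key and rule
LIFECYCLE 12;                                        -- retain 12 partitions
```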
Step 4: Save settings and run the precheck
To preview the API parameters for this task configuration, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
To proceed, click Next: Save Task Settings and Precheck.
DTS runs a precheck before starting the task. The task starts only after the precheck passes.
If the precheck fails, click View Details next to each failed item, fix the issue, then rerun the precheck.
If an alert is triggered:
If the alert cannot be ignored, click View Details, resolve the issue, and run the precheck again.
If the alert can be ignored, click Confirm Alert Details, click Ignore in the dialog box, click OK, then click Precheck Again.
Step 5: Purchase an instance
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
On the purchase page, configure the following parameters.
| Parameter | Description |
|---|---|
| Billing Method | Subscription: Pay upfront for a fixed term. More cost-effective for long-term use. Pay-as-you-go: Billed hourly. More flexible for short-term use. Release the instance when it is no longer needed to stop charges. |
| Resource Group Settings | The resource group of the data synchronization instance. Default: the default resource group. See What is Resource Management? |
| Instance Class | The synchronization speed tier. Select a class based on your data volume and latency requirements. See Instance classes of data synchronization instances. |
| Subscription Duration | Available only for the subscription billing method. Valid values: 1 to 9 months, or 1, 2, 3, or 5 years. |

Read and select Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start, then click OK in the dialog box.
The task appears in the task list. DTS first runs schema synchronization and full data synchronization, then switches automatically to incremental data synchronization.
What's next
Monitor synchronization latency in the DTS console to confirm the task is running as expected.
If you configured alerting, verify that the alert contacts are set up correctly.
After the task is stable, consider adjusting the instance class if the synchronization speed does not meet your requirements.