Synchronize data from PolarDB-X 2.0 to AnalyticDB for MySQL V3.0 - Data Transmission Service

Use Data Transmission Service (DTS) to replicate data from a PolarDB-X 2.0 instance to an AnalyticDB for MySQL V3.0 cluster in real time. After synchronization is set up, the destination cluster keeps pace with every write on your source, enabling you to build internal business intelligence (BI) systems, interactive query systems, and real-time report systems without impacting your production workload.

Prerequisites

Before you begin, make sure that:

A PolarDB-X 2.0 instance compatible with MySQL 5.7 exists and is running
A destination AnalyticDB for MySQL V3.0 cluster is created. For more information, see Create a cluster
The destination cluster has more storage space than the source instance currently occupies
Binary logging is enabled on the PolarDB-X 2.0 instance and the binlog_row_image parameter is set to full. For more information, see Parameter settings
The source database account has SELECT, REPLICATION CLIENT, and REPLICATION SLAVE permissions on all objects to be synchronized. For details on granting these permissions, see Data synchronization tools for PolarDB-X
The destination database account has read and write permissions on the destination cluster

Limitations

Source database requirements

Constraint	Detail	Action
Primary key or unique key	Every table to be synchronized must have a PRIMARY KEY or UNIQUE constraint, with all fields unique. Without this, the destination database may contain duplicate records.	Add a PRIMARY KEY or UNIQUE constraint to each table before starting the task.
Table limit per task	If you select individual tables as synchronization objects and need to rename them or columns in the destination, a single task supports up to 5,000 tables.	For more than 5,000 tables, create multiple tasks or synchronize at the database level instead.
Binary log retention — incremental only	Retain binary logs for at least 24 hours. If DTS cannot get the required logs, the task fails and data loss may occur.	Increase the binary log retention period before starting the task.
Binary log retention — full + incremental	Retain binary logs for at least 7 days during the full synchronization phase. After full synchronization completes, you can set the retention period to more than 24 hours. Make sure that you set the retention period in accordance with these requirements. Otherwise, the service reliability and performance stated in the Service Level Agreement (SLA) of DTS cannot be achieved.	Set the retention period to at least 7 days before starting, and adjust it only after full synchronization completes.
Uppercase table names	Tables with uppercase letters in their names support schema synchronization only — not full or incremental data synchronization.	Rename tables to use lowercase before starting the task if you need full or incremental synchronization.
TABLEGROUP and Locality	DTS does not support synchronizing TABLEGROUP objects or databases and schemas that have a Locality attribute.	Exclude such objects from the synchronization scope.
DDL during initial sync	Do not run DDL statements that change database or table schemas while schema synchronization or full data synchronization is in progress.	Schedule DDL changes before or after the initial synchronization phases complete.

Other limitations

Constraint	Detail	Action
Prefix indexes	Prefix indexes cannot be synchronized. If your source tables use prefix indexes, synchronization may fail for those tables.	Remove or convert prefix indexes before starting the task.
Primary key configuration	Without a primary key, data synchronization may fail.	Specify a custom primary key in the destination, or configure Primary Key Column in the Configurations for Databases, Tables, and Columns step.
Destination disk usage	AnalyticDB for MySQL delays writes and returns errors when node disk usage reaches 80%.	Estimate the required storage in advance and make sure the destination cluster has enough space.
Destination backup conflict	If the destination cluster is being backed up while the DTS task runs, the task fails.	Schedule synchronization tasks outside backup windows.
Off-peak hours	Initial full synchronization uses read and write resources from both source and destination, which increases load on both servers.	Run synchronization during off-peak hours.
Tablespace size after full sync	After full data synchronization, the used tablespace in the destination is larger than in the source because concurrent INSERT operations cause fragmentation. This is expected behavior.	No action required.
pt-online-schema-change	Using pt-online-schema-change for DDL operations while a DTS task is running may break synchronization.	Avoid pt-online-schema-change while the task is active.
Writing to the destination	Writing from other tools alongside DTS can cause data inconsistency or loss, especially when using Data Management Service (DMS) online DDL.	Write data to the destination exclusively through DTS during synchronization.
DDL execution failures	When DDL statements fail to execute in the destination, DTS continues retrying.	View failed DDL statements in the task logs. For more information, see View task logs.
Task restoration	If a DTS task fails, DTS technical support tries to restore it within 8 hours. The task may be restarted and task-level parameters may be modified during restoration. Database parameters are not modified.	No action required.

DTS writes to the dts_health_check.ha_health_check table in the source database on a fixed schedule to advance the binary log position.

Foreign keys are not synchronized during schema synchronization. During full and incremental synchronization, DTS disables foreign key constraint checks and cascade operations at the session level. If you run cascade UPDATE or DELETE operations on the source during synchronization, data inconsistency may occur.

Billing

Synchronization type	Fee
Schema synchronization and full data synchronization	Free
Incremental data synchronization	Charged. For more information, see Billing overview.

Supported synchronization topologies

One-way one-to-one synchronization
One-way one-to-many synchronization
One-way cascade synchronization
One-way many-to-one synchronization

For a full list of supported topologies, see Synchronization topologies.

SQL operations that can be synchronized

Type	Operations
DML	INSERT, UPDATE, DELETE
DDL	CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE, ADD COLUMN, MODIFY COLUMN, DROP COLUMN

When DTS writes to AnalyticDB for MySQL V3.0, UPDATE statements are automatically converted to REPLACE INTO. If the UPDATE targets a primary key column, it is converted to DELETE followed by INSERT.

Create a data synchronization task

The task setup consists of seven steps: navigate to the Data Synchronization page, configure source and destination databases, select synchronization objects, configure advanced settings, optionally set table field properties, run a precheck, and purchase the instance.

Step 1: Go to the Data Synchronization page

Use the DTS console or the DMS console.

DTS console

Log on to the DTS console.DTS console
In the left-side navigation pane, click Data Synchronization.
In the upper-left corner, select the region where the data synchronization instance resides.

DMS console

Note

The exact steps may differ depending on the DMS console layout. For more information, see Simple mode and Customize the layout and style of the DMS console.

Log on to the DMS console.DMS console
In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.
From the drop-down list next to Data Synchronization Tasks, select the region where the data synchronization instance resides.

Step 2: Configure source and destination databases

Click Create Task.

Configure the source database parameters.

Parameter	Description
Task Name	Specify a descriptive name for the task. The name does not need to be unique. DTS generates a default name if you leave this blank.
Select Existing Connection	Select a pre-registered database instance from the drop-down list if available. DTS populates the remaining parameters automatically. For more information, see Manage database connections. If not using a pre-registered instance, configure the parameters below manually.
Database Type	Select PolarDB-X 2.0.
Connection Type	Select Alibaba Cloud Instance.
Instance Region	Select the region where the source PolarDB-X 2.0 instance resides.
Replicate Data Across Alibaba Cloud Accounts	Select No for same-account synchronization.
Instance ID	Select the source PolarDB-X 2.0 instance.
Database Account	Enter the database account. The account must have SELECT, REPLICATION CLIENT, and REPLICATION SLAVE permissions on the objects to be synchronized.
Database Password	Enter the password for the database account.

Configure the destination database parameters.

Parameter	Description
Select Existing Connection	Select a pre-registered cluster from the drop-down list if available. DTS populates the remaining parameters automatically. If not using a pre-registered instance, configure the parameters below manually.
Database Type	Select AnalyticDB MySQL 3.0.
Connection Type	Select Alibaba Cloud Instance.
Instance Region	Select the region where the destination AnalyticDB for MySQL V3.0 cluster resides.
Replicate Data Across Alibaba Cloud Accounts	Select No for same-account synchronization.
Instance ID	Select the destination AnalyticDB for MySQL V3.0 cluster.
Database Account	Enter the database account. The account must have read and write permissions on the destination cluster.
Database Password	Enter the password for the database account.

Click Test Connectivity and Proceed.
DTS server CIDR blocks must be added to the security settings of both the source and destination to allow DTS to connect. DTS can add them automatically, or you can add them manually. For more information, see Add the CIDR blocks of DTS servers.

Warning

After configuring the source and destination databases, review the Limits displayed on the page carefully. Skipping this step may result in task failures or data inconsistency.

Step 3: Configure synchronization objects

In the Configure Objects step, set the following parameters.

Synchronization Type

Select all three types: Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization.

The three types work in sequence:

Schema Synchronization replicates the table structure from the source to the destination.
Full Data Synchronization copies all existing data to the destination. This completes before incremental synchronization begins and forms the baseline for ongoing synchronization.
Incremental Data Synchronization (selected by default) captures and applies ongoing changes from the source. This phase is billed — schema and full synchronization are free.

If Full Data Synchronization is selected, DTS also synchronizes the schema and data of tables created by CREATE TABLE statements to the destination.

Processing Mode for Existing Destination Tables

Choose how DTS handles tables in the destination that have the same name as tables in the source.

Precheck and Report Errors (default): DTS checks for naming conflicts before starting. If identical table names exist, the precheck fails and the task cannot start. To resolve this, rename tables in the destination or use object name mapping.
Ignore Errors and Proceed: DTS skips the naming conflict check. If the source and destination have the same schema and a record with a matching primary or unique key exists in the destination, DTS keeps the existing record during full synchronization and overwrites it during incremental synchronization. Schema mismatches may cause the task to fail or synchronize only partial columns.

Warning

Selecting Ignore Errors and Proceed may cause data inconsistency.

Table Merging

Set to Yesalert notification settings to merge multiple source tables with the same schema into a single destination table. This is useful in OLAP scenarios where sharded OLTP tables need to be consolidated for analysis.

When merging, use object name mapping to rename each source table to the target destination table name.
DTS adds a __dts_data_source column (TEXT type) to the destination table to track each row's origin. The format is <synchronization instance ID>:<source database name>:<source schema name>:<source table name>, for example: dts********:dtstestdata:testschema:customer1.
All selected source tables in the task are merged into the single destination table. To exclude specific source tables from merging, create a separate synchronization task for those tables.

Warning

Do not run DDL operations that change source database or table schemas when Table Merging is set to Yes. Doing so causes data inconsistency or task failure.

Set to No (default) if each source table maps to a corresponding destination table.

Source objects and selected objects

Select objects from the Source Objects section and click the arrow icon to move them to Selected Objects.

Select columns, tables, or databases. If you select tables or columns, DTS does not synchronize other object types such as views, triggers, or stored procedures.
If you select a full database, DTS sets distribution keys automatically: tables with a primary key use the primary key columns as distribution keys; tables without a primary key get an auto-increment primary key column in the destination, which may cause inconsistency between source and destination.

To rename objects in the destination, right-click an object in Selected Objects for individual renaming, or click Batch Edit for bulk renaming. For more information, see Database, table, and column name mapping.

To filter which rows are synchronized, right-click an object in Selected Objects and specify a WHERE condition. For more information, see Specify filter conditions.

To select specific SQL operations for a particular table, right-click the table in Selected Objects and choose which operations to include.

Step 4: Configure advanced settings

Click Next: Advanced Settings and configure the following parameters.

Parameter	Description
Dedicated Cluster for Task Scheduling	By default, DTS runs tasks on the shared cluster. For higher stability, purchase a dedicated cluster. For more information, see What is a DTS dedicated cluster.
Retry Time for Failed Connections	How long DTS retries a failed connection before marking the task as failed. Valid range: 10–1440 minutes. Default: 720 minutes. Set this to at least 30 minutes. If multiple tasks share the same source or destination, the shortest retry window applies to all of them. DTS charges for the instance during retry periods.
Retry Time for Other Issues	How long DTS retries failed DML or DDL operations. Valid range: 1–1440 minutes. Default: 10 minutes. Set this to a value greater than 10 minutes, and always less than Retry Time for Failed Connections.
Enable Throttling for Full Data Synchronization	Limit the read/write rate during full synchronization to reduce load on the source and destination. Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Synchronization is selected.
Enable Throttling for Incremental Data Synchronization	Limit the rate during incremental synchronization. Configure RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s).
Environment Tag	Tag the DTS instance with an environment label (for example, production or staging). Optional.
Whether to delete SQL operations on heartbeat tables of forward and reverse tasks	Controls whether DTS writes heartbeat records to the source database. Select Yes to skip writing heartbeat records — a latency indicator may show in the console. Select No to allow heartbeat writes — this may affect physical backup and cloning of the source database.
Configure ETL	Select Yes to enable extract, transform, and load (ETL) and enter processing statements in the code editor. For more information, see Configure ETL in a data migration or data synchronization task. Select No to skip ETL.
Monitoring and Alerting	Select Yes to receive alerts when the task fails or synchronization latency exceeds a threshold. Configure the alert threshold and notification contacts. For more information, see Configure monitoring and alerting.

Step 5: (Optional) Configure database and table fields

Click Next: Configure Database and Table Fields to set destination table properties. This step is available only when Schema Synchronization is selected.

Set Definition Status to All to see and edit all tables. For each table, you can configure:

Type: the table type in AnalyticDB for MySQL
Primary Key Column: specify one or more columns as the primary key. You can form a composite primary key.
Distribution Key: one or more columns used to distribute data across nodes.
Partition Key, Partitioning Rules, Partition Lifecycle: partition configuration for the destination table.

For details on AnalyticDB for MySQL table creation options, see CREATE TABLE.

Step 6: Save and run the precheck

Click Next: Save Task Settings and Precheck.

To preview the API parameters for this task configuration, hover over the button and click Preview OpenAPI parameters before saving.

DTS runs a precheck before starting the task. The task can only start after the precheck passes.

If the precheck fails, click View Details next to each failed item. Fix the issue and click Precheck Again.
If an alert is triggered but the item can be ignored, click Confirm Alert Details, then in the dialog that appears, click Ignore > OK > Precheck Again. Ignoring precheck alerts may result in data inconsistency.

Step 7: Purchase and start the instance

Wait until Success Rate shows 100%, then click Next: Purchase Instance.

On the purchase page, configure the instance parameters.

Parameter	Description
Billing Method	Subscription: pay upfront for a set term (1–9 months, or 1, 2, 3, or 5 years). More cost-effective for long-running tasks. Pay-as-you-go: billed hourly. Suitable for short-term use — release the instance when no longer needed to stop charges.
Resource Group Settings	Assign the instance to a resource group. Defaults to the default resource group. For more information, see What is Resource Management?
Instance Class	Select an instance class based on the required synchronization speed. For more information, see Instance classes of data synchronization instances.
Subscription Duration	Shown only for subscription billing. Set the term length and the number of instances to create.

Read and accept Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start, then click OK in the confirmation dialog.

The task appears in the task list. Monitor its progress from the list view.

What's next

To track synchronization latency and task health, see Configure monitoring and alerting.
To rename or remap synchronized objects, see Database, table, and column name mapping.
To view failed DDL statements, see View task logs.
To modify instance parameters, see Modify the parameters of a DTS instance.