Data Transmission Service (DTS) streams incremental data changes from a PolarDB for MySQL cluster to a DataHub project in real time. Once data lands in DataHub, you can feed it into downstream analytics services such as Realtime Compute for Apache Flink.
Prerequisites
Before you begin, make sure you have:
A PolarDB for MySQL cluster. See Purchase an Enterprise Edition cluster and Purchase a subscription cluster.
DataHub activated, with a project created to receive the synchronized data. See Get started with DataHub and Manage projects.
Binary logging enabled on the PolarDB for MySQL cluster. See Enable binary logging.
Limitations
DTS does not synchronize foreign keys from the source to the destination. As a result, cascade update and delete operations performed on the source are not replicated to the destination.
Source Database
| Limitation | Details |
|---|---|
| Primary key or UNIQUE constraint required | Tables without PRIMARY KEY or UNIQUE constraints may produce duplicate records in the destination. If your tables lack these constraints, enable the Exactly-Once write feature. See Synchronize tables without primary keys or UNIQUE constraints. |
| 1,000-table limit per task (with renaming) | If you select tables as sync objects and need to rename tables or columns in the destination, a single task supports up to 1,000 tables. Exceeding this limit causes a request error. Split the workload across multiple tasks, or sync at the database level instead. |
| Binary logging and loose_polar_log_bin required for incremental sync | Binary logging must be enabled and loose_polar_log_bin must be set to on. Otherwise, the precheck fails and the task cannot start. See Enable binary logging and Modify parameters. Enabling binary logging incurs storage charges. |
| Binary log retention period | Retain binary logs for at least 24 hours for incremental-only sync, or at least 7 days for full + incremental sync. Insufficient retention may cause task failure and, in exceptional cases, data loss. After full data synchronization completes, you can shorten the retention period, but keep it at more than 24 hours. |
| No DDL during schema or full data sync | Do not run DDL statements while schema synchronization or full data synchronization is in progress. Doing so causes task failure. |
Other limitations
| Limitation | Details |
|---|---|
| 2 MB string limit | A single string in the destination DataHub project cannot exceed 2 MB. |
| Sync object types | Only tables and databases can be selected as sync objects. |
| Read-only nodes excluded | DTS does not sync read-only nodes of the source PolarDB for MySQL cluster. |
| OSS external tables excluded | DTS does not sync Object Storage Service (OSS) external tables from the source cluster. |
| Avoid pt-online-schema-change | Using tools like pt-online-schema-change for DDL during sync causes task failure. |
| Online DDL during sync (single source only) | If no other sources write to the destination during sync, you can use Data Management (DMS) for lock-free DDL on source tables. See Perform lock-free DDL operations. |
| Data loss risk with concurrent writes | If other sources write to the destination while you run online DDL via DMS, data loss may occur in the destination. |
| Task restoration SLA | If a task fails, DTS support attempts restoration within 8 hours. The task may be restarted and task parameters may be modified during restoration. Database parameters are not modified. |
Special case: DTS periodically executes CREATE DATABASE IF NOT EXISTS `test` on the source database to advance the binary log file position.
Billing
| Synchronization type | Fee |
|---|---|
| Schema synchronization and full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview. |
Supported synchronization topologies
One-way one-to-one synchronization
One-way one-to-many synchronization
One-way many-to-one synchronization
One-way cascade synchronization
For the full topology reference, see Synchronization topologies.
SQL operations that can be synchronized
INSERT, UPDATE, and DELETE.
Permissions required
The database account of the source PolarDB for MySQL cluster needs at least read permissions on the objects to be synchronized.
Create a data synchronization task
The following steps are based on the new DTS console. If there are discrepancies between the DTS console and the DTS module in the Data Management (DMS) console, the DMS console takes precedence.
Go to the Data Synchronization Tasks page.
Log on to the Data Management (DMS) console.
In the top navigation bar, click Data + AI.
In the left-side navigation pane, choose DTS (DTS) > Data Synchronization.
Steps may vary based on the DMS console mode and layout. See Simple mode and Customize the layout and style of the DMS console. Alternatively, go directly to the Data Synchronization Tasks page.
Select the region where the data synchronization instance resides.
In the new DTS console, select the region from the top navigation bar.
Click Create Task. In the wizard, configure the source and destination databases.
Source Database
| Parameter | Description |
|---|---|
| Select DMS Database Instance | Select an existing database instance to auto-populate the fields, or leave blank to configure manually. |
| Database Type | Select PolarDB for MySQL. |
| Connection Type | Select Alibaba Cloud Instance. |
| Instance Region | The region where the source PolarDB for MySQL cluster resides. |
| Cross-account | Select No for same-account sync. |
| PolarDB Cluster ID | The ID of the source PolarDB for MySQL cluster. |
| Database Account | The database account for the source cluster. See Permissions required. |
| Database Password | The password for the database account. |
Destination Database
| Parameter | Description |
|---|---|
| Select DMS Database Instance | Select an existing database instance to auto-populate the fields, or leave blank to configure manually. |
| Database Type | Select DataHub. |
| Connection Type | Select Alibaba Cloud Instance. |
| Instance Region | The region where the destination DataHub project resides. |
| Project | The DataHub project to receive the synchronized data. |
Click Test Connectivity and Proceed. DTS automatically adds its server CIDR blocks to the whitelist of Alibaba Cloud database instances (such as ApsaraDB RDS for MySQL or ApsaraDB for MongoDB) and to the security group rules of Elastic Compute Service (ECS)-hosted databases. For databases deployed across multiple ECS instances, manually add the DTS CIDR blocks to each instance's security group rules. For self-managed databases in data centers or on third-party clouds, manually add the CIDR blocks to the database whitelist. See Add the CIDR blocks of DTS servers.
Warning: Adding DTS CIDR blocks to whitelists or security groups introduces security exposure. Before proceeding, take protective measures, such as using strong credentials, restricting exposed ports, authenticating API calls, regularly checking the whitelist or ECS security group rules and removing unauthorized CIDR blocks, and connecting via Express Connect, VPN Gateway, or Smart Access Gateway where possible.
Configure the objects to be synchronized and the synchronization settings.
| Parameter | Description |
|---|---|
| Synchronization Type | Incremental Data Synchronization is selected by default. You can also select Schema Synchronization. Full Data Synchronization is not available for this destination type. During schema synchronization, DTS copies the schemas of the selected tables from the source to the destination DataHub project. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (default): fails the precheck if identically named tables already exist in the destination. To resolve name conflicts without deleting or renaming destination tables, use the object name mapping feature. See Map object names. Ignore Errors and Proceed: skips the name conflict check. Warning: This option risks data inconsistency. During full data synchronization, existing destination records with matching primary or unique key values are retained (not overwritten). During incremental sync, they are overwritten. Schema mismatches may cause partial sync or task failure. |
| Naming Rules of Additional Columns | Select Yes or No to control whether DTS uses the new naming rules for additional columns added to the destination topic. Check for naming conflicts before setting this option; conflicts cause task failure or data loss. See Naming rules for additional columns. |
| Case Policy for Destination Object Names | Controls the case of database, table, and column names in the destination. Default: DTS default policy. See Specify the capitalization of object names. |
| Source Objects | Select the tables or databases to synchronize and click the arrow icon to move them to Selected Objects. |
| Selected Objects | Right-click an object to rename it (single object). Click Batch Edit to rename multiple objects at once. See Map object names. |
Click Next: Advanced Settings.
| Parameter | Description |
|---|---|
| Monitoring and Alerting | No: disables alerting. Yes: enables alerting. Configure the alert threshold and notification contacts. See Configure monitoring and alerting. |
| Retry Time for Failed Connections | How long DTS retries failed connections after a task starts. Range: 10–1,440 minutes. Default: 720 minutes. Set this to at least 30 minutes. If DTS reconnects within the retry window, the task resumes; otherwise, it fails. When multiple tasks share the same source or destination, the shortest retry window applies to all. DTS charges for the instance during retries, so release the instance promptly if the source or destination is decommissioned. |
| Configure ETL | No (default) or Yes. If enabled, enter extract, transform, and load (ETL) domain-specific language (DSL) statements in the code editor. See Configure ETL. |
(Optional) In the Selected Objects section, right-click a topic name to rename a table or database, or set a shard key for partitioning.
Click Next: Save Task Settings and Precheck. To preview the OpenAPI parameters for this task, hover over the button and click Preview OpenAPI parameters before proceeding.
DTS runs a precheck before starting the task. The task starts only after passing the precheck. If the precheck fails, click View Details next to each failed item, fix the underlying issue, then click Precheck Again. If an alert item can be ignored, click Confirm Alert Details > Ignore > OK > Precheck Again. Ignoring alerts may cause data inconsistency.
Wait for the Success Rate to reach 100%, then click Next: Purchase Instance.
On the purchase page, configure the instance.
| Parameter | Description |
|---|---|
| Billing Method | Subscription: pay upfront for a fixed term, which is more cost-effective for long-term use. Pay-as-you-go: billed hourly, suitable for short-term use. Release the instance when it is no longer needed to stop charges. |
| Resource Group Settings | The resource group for the instance. Default: default resource group. See What is Resource Management? |
| Instance Class | The synchronization throughput class. See Instance classes of data synchronization instances. |
| Subscription Duration | Available for subscription billing only. Options: 1–9 months, 1 year, 2 years, 3 years, or 5 years. |
Read and select Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start, then click OK in the confirmation dialog.
The task appears in the task list. You can track its progress there.
DataHub topic schema
When DTS writes incremental data to a DataHub topic, it adds system columns to store change metadata alongside the original data fields.
The following figure shows an example topic schema. In this example, id, name, and address are original data fields. With the previous naming rules, DTS adds a dts_ prefix to all fields including the originals. With the new naming rules, original data fields keep their names without prefixes.
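The difference between the two naming rules can be sketched as a small helper. This is an illustrative sketch only (the function name and list are our own, not a DTS API), based on the column names documented below:

```python
# Additional (system) columns that DTS appends, using their previous names.
ADDITIONAL_COLUMNS = [
    "dts_record_id", "dts_operation_flag", "dts_instance_id",
    "dts_db_name", "dts_table_name", "dts_utc_timestamp",
    "dts_before_flag", "dts_after_flag",
]

def destination_fields(original_fields, new_naming_rules):
    """Return the field names that would appear in the destination topic."""
    if new_naming_rules:
        # New rules: original fields keep their names; additional columns
        # carry the new_dts_sync_ prefix.
        return list(original_fields) + [
            f"new_dts_sync_{c}" for c in ADDITIONAL_COLUMNS
        ]
    # Previous rules: every field, including the originals, is prefixed
    # with dts_; additional columns already carry that prefix.
    return [f"dts_{f}" for f in original_fields] + ADDITIONAL_COLUMNS

print(destination_fields(["id", "name", "address"], new_naming_rules=True)[:3])
# -> ['id', 'name', 'address']
```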

The table below describes each additional column.
| Previous column name | New column name | Type | Description |
|---|---|---|---|
| dts_record_id | new_dts_sync_dts_record_id | String | Unique ID of the incremental log entry. Auto-increments under normal conditions; may not increment after a rollback in disaster recovery scenarios, so some IDs can be duplicated. For UPDATE operations, both log entries (pre-update and post-update) share the same dts_record_id. |
| dts_operation_flag | new_dts_sync_dts_operation_flag | String | Operation type. Values: I = INSERT, D = DELETE, U = UPDATE, F = full data synchronization. |
| dts_instance_id | new_dts_sync_dts_instance_id | String | The server ID of the database. |
| dts_db_name | new_dts_sync_dts_db_name | String | The database name. |
| dts_table_name | new_dts_sync_dts_table_name | String | The table name. |
| dts_utc_timestamp | new_dts_sync_dts_utc_timestamp | String | UTC timestamp of the operation (also the log file timestamp). |
| dts_before_flag | new_dts_sync_dts_before_flag | String | Whether the row values are pre-update values. Y = yes, N = no. INSERT: N. UPDATE pre-update entry: Y. UPDATE post-update entry: N. DELETE: Y. |
| dts_after_flag | new_dts_sync_dts_after_flag | String | Whether the row values are post-update values. Y = yes, N = no. INSERT: Y. UPDATE pre-update entry: N. UPDATE post-update entry: Y. DELETE: N. |
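A consumer reading the topic can classify each entry from these columns. The helper below is a hypothetical sketch (the function name and dict-style record layout are our own assumptions, using the previous column names), not part of any DTS or DataHub SDK:

```python
# Map the dts_operation_flag values to readable operation names.
OPERATIONS = {"I": "INSERT", "D": "DELETE", "U": "UPDATE", "F": "FULL_SYNC"}

def describe_entry(record):
    """Classify one log entry from its additional columns."""
    op = OPERATIONS[record["dts_operation_flag"]]
    if op == "UPDATE":
        # An UPDATE produces two entries; the before-flag tells them apart.
        phase = "pre-update" if record["dts_before_flag"] == "Y" else "post-update"
        return f"UPDATE ({phase})"
    return op

print(describe_entry({"dts_operation_flag": "U",
                      "dts_before_flag": "Y",
                      "dts_after_flag": "N"}))
# -> UPDATE (pre-update)
```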
Flag values by operation type
The dts_before_flag and dts_after_flag columns encode which version of a row a given log entry represents.
INSERT
An INSERT entry records the newly inserted values (post-update values).
| dts_before_flag | dts_after_flag |
|---|---|
| N | Y |

UPDATE
DTS generates two log entries per UPDATE: one for the pre-update state and one for the post-update state. Both entries share the same dts_record_id, dts_operation_flag, and dts_utc_timestamp values.
| Entry | dts_before_flag | dts_after_flag |
|---|---|---|
| Pre-update (entry 1) | Y | N |
| Post-update (entry 2) | N | Y |

DELETE
A DELETE entry records the deleted values (pre-update values).
| dts_before_flag | dts_after_flag |
|---|---|
| Y | N |

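Because the two UPDATE entries share a dts_record_id, a consumer can pair them to reconstruct old-to-new value transitions. The sketch below assumes a dict-style record layout with the previous column names; the function and field names are our own illustration, not a DTS API:

```python
def pair_updates(entries):
    """Return {record_id: (pre_image, post_image)} for UPDATE entries."""
    pairs = {}
    for e in entries:
        if e["dts_operation_flag"] != "U":
            continue  # only UPDATE entries come in before/after pairs
        rid = e["dts_record_id"]
        pre, post = pairs.setdefault(rid, (None, None))
        # dts_before_flag = Y marks the pre-update image, N the post-update one.
        if e["dts_before_flag"] == "Y":
            pairs[rid] = (e, post)
        else:
            pairs[rid] = (pre, e)
    return pairs

entries = [
    {"dts_record_id": "7", "dts_operation_flag": "U",
     "dts_before_flag": "Y", "dts_after_flag": "N", "name": "old"},
    {"dts_record_id": "7", "dts_operation_flag": "U",
     "dts_before_flag": "N", "dts_after_flag": "Y", "name": "new"},
]
pre, post = pair_updates(entries)["7"]
print(pre["name"], "->", post["name"])
# -> old -> new
```

Note that dts_record_id values can repeat after a disaster-recovery rollback, so a production consumer should also compare dts_utc_timestamp when deduplicating.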
What's next
After the synchronization task is running, use Realtime Compute for Apache Flink to analyze the data flowing into the DataHub project. See What is Alibaba Cloud Realtime Compute for Apache Flink?