Data Transmission Service (DTS) lets you migrate data from a PolarDB for MySQL cluster to a Kafka cluster — schema, full historical data, and ongoing incremental changes — without interrupting your applications.
Prerequisites
Before you begin, make sure you have:
A target Kafka cluster (self-managed) or a Message Queue for Apache Kafka instance
Sufficient available storage on the Kafka cluster — more than the storage currently used by the source PolarDB for MySQL cluster
Read permissions on the objects to migrate for the PolarDB for MySQL database account (see Create and manage a database account)
If the destination is a Message Queue for Apache Kafka instance, create a topic to receive the migrated data before you begin, then configure the instance as a self-managed Kafka cluster. See Step 1: Create a topic.
For supported source and destination database versions, see Migration solutions.
Limitations
Source database
The server hosting the source database must have sufficient outbound bandwidth. Insufficient bandwidth reduces migration speed.
Tables to migrate must have primary keys or UNIQUE constraints with unique field values. Without these, duplicate data may appear in the destination.
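The duplicate-data risk above comes down to idempotency: with a primary key, a retried or replayed write can be applied as an upsert; without one, each replay appends another copy. A minimal Python sketch of the difference (the row and key names are illustrative):

```python
# Why a primary key matters during replays: with a key, re-applying the same
# row is idempotent (upsert by key); without one, each replay appends a duplicate.
keyed = {}    # destination with a primary key: dict keyed by "id"
unkeyed = []  # destination without a key: plain append-only log

row = {"id": 1, "status": "shipped"}
for _ in range(2):            # simulate a retried write of the same row
    keyed[row["id"]] = row    # upsert: second write replaces the first
    unkeyed.append(row)       # no key: a duplicate row accumulates

print(len(keyed), len(unkeyed))  # 1 2
```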
When selecting tables as migration objects and editing them (such as mapping table or column names), a single task supports a maximum of 1,000 tables. To migrate more than 1,000 tables, split the tables across multiple tasks or configure a task to migrate the entire database instead.
Data cannot be migrated from read-only nodes of the source cluster.
For incremental data migration, the following additional requirements apply:
Binary logging must be enabled, and the `loose_polar_log_bin` parameter must be set to `on`. If this is not configured before the task starts, the precheck fails. See Enable binary logging and Modify parameters. Enabling binary logging on a PolarDB for MySQL cluster incurs storage charges for the space used by binary logs.
Binary logs must be retained for at least 3 days (7 days recommended). If the retention period is too short, DTS may fail to obtain the binary logs and the task may fail. In exceptional circumstances, data inconsistency or loss may occur. To set the retention period, see Modify the retention period. If the retention period does not meet these requirements, the service reliability and performance stated in the DTS Service Level Agreement (SLA) may not be guaranteed.
Do not perform DDL operations that change database or table schemas during schema migration or full data migration. Otherwise, the migration task fails.
During full data migration, DTS queries the source database, which creates metadata locks that may block DDL operations on the source database.
If you perform only full data migration (without incremental), do not write new data to the source database during migration. To maintain real-time data consistency, select Schema Migration, Full Data Migration, and Incremental Data Migration together.
Other limitations
DTS does not migrate foreign keys. Cascade and delete operations defined on the source database are not replicated to the destination.
DTS does not migrate read-only nodes or Object Storage Service (OSS) external tables from the source PolarDB for MySQL cluster.
The following object types cannot be migrated: INDEX, PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, and FK objects.
DTS does not support active/standby switchover for the database instance during full data migration. If a switchover occurs, reconfigure the migration task immediately.
Do not use tools such as `pt-online-schema-change` to perform online DDL operations on migration objects in the source database. Doing so causes migration failures.
For FLOAT or DOUBLE columns, DTS reads values using `ROUND(COLUMN, PRECISION)`. If precision is not explicitly defined in the schema, DTS uses 38 for FLOAT and 308 for DOUBLE. Confirm that this precision meets your business requirements before starting the task.
Run the migration during off-peak hours. Full data migration consumes read and write resources on both the source and destination databases, which increases database load.
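To see why the read precision matters, the effect of rounding at a given number of decimal places can be sketched in Python with the `decimal` module. This approximates MySQL's `ROUND(value, precision)` for exact decimal arguments (floating-point columns may round differently due to binary representation); the helper name is our own:

```python
from decimal import Decimal, ROUND_HALF_UP, getcontext

getcontext().prec = 50  # allow quantizing at high decimal precision

def mysql_style_round(value: str, precision: int) -> Decimal:
    # Round half away from zero at `precision` decimal places, approximating
    # MySQL's ROUND(value, precision) for exact (decimal-string) arguments.
    quantum = Decimal(1).scaleb(-precision)  # precision=2 -> Decimal('0.01')
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

print(mysql_style_round("2.675", 2))   # 2.68 (ties round away from zero)
print(mysql_style_round("2.675", 38))  # value unchanged, padded to 38 places
```

A high fallback precision (38 for FLOAT, 308 for DOUBLE) effectively leaves the stored value unchanged; a low declared precision can collapse distinct source values into the same destination value.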
DTS attempts to resume failed tasks for up to 7 days after failure. Before switching workloads to the destination, end or release the DTS instance, or use the `REVOKE` command to revoke write permissions from the DTS database account. This prevents the task from auto-resuming and overwriting data in the destination.
If a DTS instance fails, the DTS support team attempts to recover it within 8 hours. Recovery operations may include restarting the instance or adjusting DTS instance parameters (database parameters are not modified). For parameters that may be adjusted, see Modify instance parameters.
Usage notes
DTS periodically runs ``CREATE DATABASE IF NOT EXISTS `test` `` on the source database to advance the binary log offset.
Full data migration uses concurrent INSERT operations, which causes table fragmentation in the destination. After full migration completes, the destination table storage will be larger than the source.
Billing
| Migration type | Instance configuration fee | Internet traffic fee |
|---|---|---|
| Schema migration and full data migration | Free | Charged when Access Method of the destination database is set to Public IP Address. See Billing overview. |
| Incremental data migration | Charged. See Billing overview. | — |
Migration types
| Migration type | What it does |
|---|---|
| Schema migration | Migrates the schemas of selected objects from the source database to the destination Kafka cluster. |
| Full data migration | Migrates all historical data from the selected objects. |
| Incremental data migration | After full migration completes, continuously replicates ongoing data changes to Kafka without interrupting your applications. |
Incremental migration supports the following SQL operations:
| Operation type | SQL statements |
|---|---|
| DML | INSERT, UPDATE, DELETE |
| DDL | CREATE TABLE, ALTER TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE |
Create a migration task
Step 1: Open the Data Migration page
Use one of the following methods:
DTS console
Log on to the DTS console.
In the left-side navigation pane, click Data Migration.
In the upper-left corner, select the region where the migration instance resides.
DMS console
The actual operation may vary based on the mode and layout of the DMS console. For more information, see Simple mode and Customize the layout and style of the DMS console.
Log on to the DMS console.
In the top navigation bar, move the pointer over Data + AI > DTS (DTS) > Data Migration.
From the drop-down list to the right of Data Migration Tasks, select the region where the migration instance resides.
Step 2: Configure source and destination databases
Click Create Task, then configure the following parameters:
Task Name
| Parameter | Description |
|---|---|
| Task Name | DTS generates a name automatically. Specify a descriptive name to make the task easy to identify. The name does not need to be unique. |
Source database (PolarDB for MySQL)
| Parameter | Value |
|---|---|
| Select Existing Connection | If the instance is already registered with DTS, select it from the list. DTS populates the remaining parameters automatically. Otherwise, configure the parameters below. In the DMS console, select the instance from Select a DMS database instance. |
| Database Type | PolarDB for MySQL |
| Access Method | Cloud Instance |
| Instance Region | Region where the source PolarDB for MySQL instance resides |
| Cross-account | No (this example uses the same Alibaba Cloud account) |
| PolarDB Instance ID | ID of the source PolarDB for MySQL instance |
| Database Account | Database account for the source instance. For required permissions, see Prerequisites. |
| Database Password | Password for the database account |
| Encryption | Whether to encrypt the connection to the source database. For SSL encryption configuration, see Configure SSL encryption. |
Destination Database
| Parameter | Value |
|---|---|
| Select Existing Connection | If the instance is already registered with DTS, select it from the list. DTS populates the remaining parameters automatically. Otherwise, configure the parameters below. In the DMS console, select the instance from Select a DMS database instance. |
| Database Type | Kafka |
| Access Method | Select based on where the destination instance is deployed. This example uses Self-managed Database On ECS. If the destination is a self-managed database, complete the required preparations first. See Preparation overview. |
| Instance Region | Region where the destination Kafka cluster resides |
| ECS Instance ID | ID of the destination Kafka cluster |
| Port | Service port of the Kafka cluster. Default: 9092 |
| Database Account | Kafka username. Leave blank if authentication is not enabled. |
| Database Password | Kafka password. Leave blank if authentication is not enabled. |
| Kafka Version | Version of the Kafka cluster. If the self-managed Kafka cluster is version 1.0 or later, select 1.0 Or Later. |
| Encryption | Non-encrypted Connection or SCRAM-SHA-256, based on your security requirements |
| Topic | Topic that receives the migrated data |
| Use Kafka Schema Registry | Kafka Schema Registry is a serving layer for metadata that provides a RESTful interface for storing and retrieving Avro schemas. Select No to skip, or Yes to use it and enter the URL or IP address registered in Kafka Schema Registry for the Avro schema. |
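If you enable the registry, schemas are registered and fetched over its REST interface. As an illustrative sketch (not DTS internals), a Confluent-compatible Schema Registry accepts an Avro schema serialized as a JSON string inside a wrapper object, posted to `/subjects/<name>/versions`; the subject name and record fields below are hypothetical:

```python
import json

# Hypothetical Avro schema for a migrated table's records.
avro_schema = {
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "status", "type": "string"},
    ],
}

# A Confluent-compatible registry expects the Avro schema itself as a
# JSON *string* inside a wrapper object (double-encoded JSON).
payload = json.dumps({"schema": json.dumps(avro_schema)})
print(payload)
```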
Step 3: Test connectivity
Click Test Connectivity and Proceed at the bottom of the page.
Make sure that DTS server CIDR blocks are added to the security settings of the source and destination databases. See Add DTS server IP addresses to a whitelist.
If the source or destination database is self-managed and Access Method is not set to Alibaba Cloud Instance, click Test Connectivity in the CIDR Blocks of DTS Servers dialog box.
Step 4: Configure migration objects
On the Configure Objects page, set the following parameters:
Migration Types
| Scenario | Selection |
|---|---|
| Full migration only | Select Schema Migration and Full Data Migration |
| Migration without service interruption | Select Schema Migration, Full Data Migration, and Incremental Data Migration |
If the destination Kafka instance Access Method is Alibaba Cloud Instance, Schema Migration is not supported.
If you do not select Schema Migration, make sure the destination database already contains the databases and tables to receive the data.
If you do not select Incremental Data Migration, do not write new data to the source instance during migration.
Processing Mode for Existing Destination Tables
Precheck and Report Errors: DTS checks whether the destination contains tables with the same names as source tables. If identical table names exist, the precheck fails and the task cannot start.
If identical table names exist and the destination tables cannot be deleted or renamed, use the object name mapping feature to rename migrated tables. See Map object names.
Ignore Errors and Proceed: Skips the precheck for identical table names.
Warning: Selecting this option may cause data inconsistency:
- During full data migration: if a source record has the same primary key as an existing destination record, DTS skips the source record. The existing destination record is retained.
- During incremental data migration: if a source record has the same primary key as an existing destination record, DTS writes the source record, overwriting the destination record.
- If source and destination schemas differ, only specific columns are migrated, or the task may fail. Proceed with caution.
Data format in Kafka
Select the format for data written to Kafka:
DTS Avro: Data is parsed based on the DTS Avro schema definition. See GitHub.
Canal JSON: See Canal JSON for parameters and examples.
Shareplex JSON: See Shareplex JSON for parameters and examples.
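Whichever format you choose, your consumers must parse it. As a hedged sketch, the following parses an incremental UPDATE record in the common open-source Canal JSON layout — the field names follow open-source Canal and the record contents are invented; check the Canal JSON reference above for the exact fields DTS emits:

```python
import json

# A hypothetical UPDATE record in the common Canal JSON layout:
# "data" holds the after-image, "old" holds changed columns' before-image.
message = '''{
  "database": "orders_db",
  "table": "orders",
  "type": "UPDATE",
  "ts": 1700000000000,
  "isDdl": false,
  "pkNames": ["id"],
  "data": [{"id": "42", "status": "shipped"}],
  "old":  [{"status": "pending"}]
}'''

record = json.loads(message)
if not record["isDdl"] and record["type"] == "UPDATE":
    # Pair each changed column's old and new values.
    changed = {k: (record["old"][0][k], v)
               for k, v in record["data"][0].items()
               if k in record["old"][0]}
    print(changed)  # {'status': ('pending', 'shipped')}
```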
Kafka data compression format
| Format | Compression ratio | Compression speed | Best for |
|---|---|---|---|
| LZ4 (default) | Low | High | General use; prioritize speed |
| GZIP | High | Low | Storage-constrained environments. Note: GZIP consumes significantly more CPU. |
| Snappy | Medium | Medium | Balanced performance |
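The ratio-versus-speed trade-off in the table can be felt directly with the standard library. This sketch uses stdlib gzip only (LZ4 and Snappy require third-party libraries): a lower compression level favors speed, a higher level favors size, and repetitive change records compress well either way:

```python
import gzip

# Repetitive change records, typical of CDC payloads, compress well.
payload = b'{"id": 1, "status": "shipped"}\n' * 1000

fast = gzip.compress(payload, compresslevel=1)   # faster, larger output
small = gzip.compress(payload, compresslevel=9)  # slower, smaller output
print(len(payload), len(fast), len(small))
```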
Policy for shipping data to Kafka partitions
Select a partition policy based on your requirements.
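For the hash-by-primary-key policy, the idea is that each row's partition is derived from its key, so all changes to one row land on one partition and stay ordered. DTS's actual hash function is not documented here, so the md5-mod routing below is an illustrative stand-in, not the real algorithm:

```python
import hashlib

def route_to_partition(primary_key: str, num_partitions: int) -> int:
    # Illustrative stand-in: hash the key and take it modulo the
    # partition count. DTS's real hash function may differ.
    digest = hashlib.md5(primary_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# The same key always routes to the same partition, which preserves
# per-key ordering within that partition:
print(route_to_partition("order-42", 8) == route_to_partition("order-42", 8))  # True
```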
Message acknowledgement mechanism
Select a message acknowledgement mechanism based on your requirements.
Topic that stores DDL information
Select the topic used to store DDL information. If no topic is selected, DDL information is stored in the data topic by default.
Case Policy for Destination Object Names
Configure the case-sensitivity policy for migrated database, table, and column names. The default is DTS Default Policy. For details, see Case-sensitivity of object names in the destination database.
Source and selected objects
In the Source Objects section, select the tables to migrate and add them to Selected Objects. Objects can be selected at the table level.
In Selected Objects, you can configure the Kafka topic name, number of partitions, and partition key for each source table. See Configure topic mapping for details.
Using the object name mapping feature may cause other dependent objects to fail migration.
To select which SQL operations to include in incremental migration, right-click the migration object in Selected Objects and select the operations in the dialog box.
Step 5: Configure advanced settings
Click Next: Advanced Settings and configure the following:
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS schedules the task to the shared cluster. For higher migration stability, purchase a dedicated cluster. See What is a DTS dedicated cluster. |
| Retry Time for Failed Connections | How long DTS retries connection failures after the task starts. Valid values: 10–1,440 minutes. Default: 720 minutes. Set to more than 30 minutes. If reconnection succeeds within this window, the task resumes. If multiple tasks share the same source or destination, the value specified last takes precedence. During retries, DTS instance charges apply. |
| Retry Time for Other Issues | How long DTS retries failed DML or DDL operations. Valid values: 1–1,440 minutes. Default: 10 minutes. Set to more than 10 minutes. Must be less than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Migration | Limits the read load on the source and write load on the destination during full migration. Configure QPS (queries per second) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Migration is selected. |
| Enable Throttling for Incremental Data Migration | Limits the load during incremental migration. Configure RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s). Available only when Incremental Data Migration is selected. |
| Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Controls whether DTS writes SQL operations on heartbeat tables to the source database. Yes: DTS does not write to heartbeat tables; a latency may be displayed for the DTS instance. No: DTS writes to heartbeat tables; this may affect features such as physical backup and cloning of the source database. |
| Environment Tag | Optional. Select a tag to identify the environment (such as production or staging). |
| Configure ETL | Whether to enable extract, transform, and load (ETL). Yes: Enter data processing statements in the code editor. See Configure ETL in a data migration or data synchronization task. No: ETL is disabled. For more information, see What is ETL? |
| Monitoring and Alerting | Whether to configure alerting. Yes: Set alert thresholds and notification contacts; DTS sends alerts when the task fails or migration latency exceeds the threshold. See Configure monitoring and alerting when you create a DTS task. No: No alerting. |
Step 6: Run the precheck
Click Next: Save Task Settings and Precheck.
To view the API parameters for this task configuration, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
DTS runs a precheck before the migration starts. The task can only start after passing the precheck.
If the precheck fails, click View Details next to the failed item, fix the issue, and run the precheck again.
If a precheck item triggers an alert:
If the alert cannot be ignored, click View Details, fix the issue, and run the precheck again.
If the alert can be ignored, click Confirm Alert Details > Ignore > OK, then click Precheck Again. Ignoring an alert may cause data inconsistency.
Step 7: Purchase and start the instance
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
On the Purchase Instance page, configure the following:
| Parameter | Description |
|---|---|
| Resource Group | Resource group for the migration instance. Default: default resource group. See What is Resource Management? |
| Instance Class | Determines migration speed. Choose based on your data volume and required throughput. See Instance classes of data migration instances. |

Read and agree to Data Transmission Service (Pay-as-you-go) Service Terms by selecting the check box.
Click Buy and Start, then click OK in the confirmation message.
After the task starts, you can monitor its progress on the Data Migration page.
Full migration only (no incremental): The task stops automatically when complete. The status changes to Completed.
With incremental migration: The task runs continuously and does not stop automatically. The status shows Running.
Configure topic mapping
Use topic mapping to customize the destination topic, partition count, and partition key for each source table.
In the Selected Objects section, move the pointer over the destination topic name at the table level.
Click Edit next to the topic name.
In the Edit Table dialog box, configure the mapping:
At the database level, the Edit Schema dialog box appears, which supports fewer parameters. Name of target Topic and Number of Partitions cannot be modified in Edit Schema if the migration objects are not an entire database.
| Parameter | Description |
|---|---|
| Name of target Topic | The destination topic for this source table. Defaults to the topic selected in the Destination Database section. If the destination is a Message Queue for Apache Kafka instance, the topic must already exist — DTS does not create it automatically. If the destination is a self-managed Kafka cluster with a schema migration task, DTS attempts to create the topic. If you change this value, data is written to the specified topic. |
| Filter Conditions | SQL-based filter for the rows to migrate. See Set filter conditions. |
| Number of Partitions | Number of partitions for writing data to the destination topic. |
| Partition Key | Available when Policy for Shipping Data to Kafka Partitions is set to Ship Data to Separate Partitions Based on Hash Values of Primary Keys. Specify one or more columns as the partition key. DTS calculates a hash value for each row and routes it to the corresponding partition. This parameter is only available in the Edit Table dialog box (not at the database level). |

Click OK.