
Data Transmission Service: Migrate data from a PolarDB for MySQL cluster to a Kafka cluster

Last Updated: Mar 28, 2026

Data Transmission Service (DTS) lets you migrate data from a PolarDB for MySQL cluster to a Kafka cluster — schema, full historical data, and ongoing incremental changes — without interrupting your applications.

Prerequisites

Before you begin, make sure you have:

  • A target Kafka cluster (self-managed) or a Message Queue for Apache Kafka instance

  • Sufficient available storage on the Kafka cluster — more than the storage currently used by the source PolarDB for MySQL cluster

  • Read permissions on the objects to migrate for the PolarDB for MySQL database account (see Create and manage a database account)

If the destination is a Message Queue for Apache Kafka instance, create a topic to receive the migrated data before you begin, then configure the instance as a self-managed Kafka cluster. See Step 1: Create a topic.

For supported source and destination database versions, see Migration solutions.

Limitations

Source database

  • The server hosting the source database must have sufficient outbound bandwidth. Insufficient bandwidth reduces migration speed.

  • Tables to migrate must have primary keys or UNIQUE constraints with unique field values. Without these, duplicate data may appear in the destination.
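The risk can be sketched in a few lines of Python (illustrative only, not DTS internals): a destination with no unique constraint accepts a redelivered chunk twice, while a destination keyed by primary key stays idempotent.

```python
# Illustrative sketch: why tables without a primary key or unique
# constraint can end up with duplicates if a chunk is redelivered.
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# Destination without any unique constraint: an append-only list.
no_key_dest = []
for attempt in range(2):          # the same chunk is delivered twice
    no_key_dest.extend(rows)

# Destination keyed by primary key: redelivery is idempotent.
keyed_dest = {}
for attempt in range(2):
    for row in rows:
        keyed_dest[row["id"]] = row

print(len(no_key_dest))   # 4 -- duplicates
print(len(keyed_dest))    # 2 -- deduplicated by primary key
```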

  • When selecting tables as migration objects and editing them (such as mapping table or column names), a single task supports a maximum of 1,000 tables. To migrate more than 1,000 tables, split the tables across multiple tasks or configure a task to migrate the entire database instead.

  • Data cannot be migrated from read-only nodes of the source cluster.

For incremental data migration, the following additional requirements apply:

  • Binary logging must be enabled, and the loose_polar_log_bin parameter must be set to on. If this is not configured before the task starts, the precheck fails. See Enable binary logging and Modify parameters.

    Enabling binary logging on a PolarDB for MySQL cluster incurs storage charges for the space used by binary logs.
  • Binary logs must be retained for at least 3 days (7 days recommended). If the retention period is too short, DTS may fail to obtain the binary logs and the task may fail. In exceptional circumstances, data inconsistency or loss may occur. To set the retention period, see Modify the retention period. If the retention period does not meet these requirements, the service reliability and performance stated in the DTS Service Level Agreement (SLA) may not be guaranteed.

  • Do not perform DDL operations that change database or table schemas during schema migration or full data migration. Otherwise, the migration task fails.

    During full data migration, DTS queries the source database, which creates metadata locks that may block DDL operations on the source database.
  • If you perform only full data migration (without incremental), do not write new data to the source database during migration. To maintain real-time data consistency, select Schema Migration, Full Data Migration, and Incremental Data Migration together.

Other limitations

  • DTS does not migrate foreign keys. Cascade and delete operations defined on the source database are not replicated to the destination.

  • DTS does not migrate read-only nodes or Object Storage Service (OSS) external tables from the source PolarDB for MySQL cluster.

  • The following object types cannot be migrated: INDEX, PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, and FK objects.

  • DTS does not support active/standby switchover for the database instance during full data migration. If a switchover occurs, reconfigure the migration task immediately.

  • Do not use tools such as pt-online-schema-change to perform online DDL operations on migration objects in the source database. This causes migration failures.

  • For FLOAT or DOUBLE columns, DTS reads values using ROUND(COLUMN, PRECISION). If precision is not explicitly defined in the schema, DTS uses 38 for FLOAT and 308 for DOUBLE. Confirm that this precision meets your business requirements before starting the task.
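The effect of this read-time rounding can be approximated in Python (a sketch only; MySQL's ROUND() on floating-point columns may differ in edge cases):

```python
# Approximation of the read-time rounding described above, using
# Python's round() in place of MySQL's ROUND(). Only illustrates the
# precision trade-off, not DTS internals.
def dts_read(value, declared_precision=None, column_type="FLOAT"):
    # Defaults used when the schema does not declare a precision.
    default = 38 if column_type == "FLOAT" else 308
    digits = declared_precision if declared_precision is not None else default
    return round(value, digits)

print(dts_read(3.14159, declared_precision=2))  # column declared FLOAT(p,2): value truncated to 3.14
print(dts_read(3.14159))                        # no declared precision: default 38 digits keeps the value
```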

  • Run the migration during off-peak hours. Full data migration consumes read and write resources on both the source and destination databases, which increases database load.

  • DTS attempts to resume failed tasks for up to 7 days after failure. Before switching workloads to the destination, end or release the DTS instance, or use the REVOKE command to revoke write permissions from the DTS database account. This prevents the task from auto-resuming and overwriting data in the destination.

  • If a DTS instance fails, the DTS support team attempts to recover it within 8 hours. Recovery operations may include restarting the instance or adjusting DTS instance parameters (database parameters are not modified). For parameters that may be adjusted, see Modify instance parameters.

Usage notes

  • DTS periodically runs CREATE DATABASE IF NOT EXISTS `test` on the source database to advance the binary log offset.

  • Full data migration uses concurrent INSERT operations, which causes table fragmentation in the destination. After full migration completes, the destination table storage will be larger than the source.

Billing

Schema migration and full data migration:

  • Instance configuration fee: Free.
  • Internet traffic fee: Charged when Access Method of the destination database is set to Public IP Address. See Billing overview.

Incremental data migration:

  • Instance configuration fee: Charged. See Billing overview.
  • Internet traffic fee: Charged when Access Method of the destination database is set to Public IP Address. See Billing overview.

Migration types

  • Schema migration: Migrates the schemas of selected objects from the source database to the destination Kafka cluster.

  • Full data migration: Migrates all historical data from the selected objects.

  • Incremental data migration: After full migration completes, continuously replicates ongoing data changes to Kafka without interrupting your applications.

Incremental migration supports the following SQL operations:

  • DML: INSERT, UPDATE, DELETE

  • DDL: CREATE TABLE, ALTER TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE

Create a migration task

Step 1: Open the Data Migration page

Use one of the following methods:

DTS console

  1. Log on to the DTS console.

  2. In the left-side navigation pane, click Data Migration.

  3. In the upper-left corner, select the region where the migration instance resides.

DMS console

Note

The actual operation may vary based on the mode and layout of the DMS console. For more information, see Simple mode and Customize the layout and style of the DMS console.

  1. Log on to the DMS console.

  2. In the top navigation bar, move the pointer over Data + AI > DTS > Data Migration.

  3. From the drop-down list to the right of Data Migration Tasks, select the region where the migration instance resides.

Step 2: Configure source and destination databases

Click Create Task, then configure the following parameters:

Task Name

  • Task Name: DTS generates a name automatically. Specify a descriptive name to make the task easy to identify. The name does not need to be unique.

Source database (PolarDB for MySQL)

  • Select Existing Connection: If the instance is already registered with DTS, select it from the list. DTS populates the remaining parameters automatically. Otherwise, configure the parameters below. In the DMS console, select the instance from Select a DMS database instance.

  • Database Type: PolarDB for MySQL

  • Access Method: Cloud Instance

  • Instance Region: Region where the source PolarDB for MySQL instance resides

  • Cross-account: No (this example uses the same Alibaba Cloud account)

  • PolarDB Instance ID: ID of the source PolarDB for MySQL instance

  • Database Account: Database account for the source instance. For required permissions, see Prerequisites.

  • Database Password: Password for the database account

  • Encryption: Whether to encrypt the connection to the source database. For SSL encryption configuration, see Configure SSL encryption.

Destination Database

  • Select Existing Connection: If the instance is already registered with DTS, select it from the list. DTS populates the remaining parameters automatically. Otherwise, configure the parameters below. In the DMS console, select the instance from Select a DMS database instance.

  • Database Type: Kafka

  • Access Method: Select based on where the destination instance is deployed. This example uses Self-managed Database On ECS. If the destination is a self-managed database, complete the required preparations first. See Preparation overview.

  • Instance Region: Region where the destination Kafka cluster resides

  • ECS Instance ID: ID of the ECS instance that hosts the destination Kafka cluster

  • Port: Service port of the Kafka cluster. Default: 9092

  • Database Account: Kafka username. Leave blank if authentication is not enabled.

  • Database Password: Kafka password. Leave blank if authentication is not enabled.

  • Kafka Version: Version of the Kafka cluster. If the self-managed Kafka cluster is version 1.0 or later, select 1.0 Or Later.

  • Encryption: Non-encrypted Connection or SCRAM-SHA-256, based on your security requirements

  • Topic: Topic that receives the migrated data

  • Use Kafka Schema Registry: Kafka Schema Registry is a metadata service layer that provides a RESTful interface for storing and retrieving Avro schemas. Select No to skip it, or select Yes and enter the URL or IP address registered in Kafka Schema Registry for the Avro schema.

Step 3: Test connectivity

Click Test Connectivity and Proceed at the bottom of the page.

Make sure that DTS server CIDR blocks are added to the security settings of the source and destination databases. See Add DTS server IP addresses to a whitelist.
If the source or destination database is self-managed and Access Method is not set to Alibaba Cloud Instance, click Test Connectivity in the CIDR Blocks of DTS Servers dialog box.

Step 4: Configure migration objects

On the Configure Objects page, set the following parameters:

Migration Types

  • Full migration only: Select Schema Migration and Full Data Migration.

  • Migration without service interruption: Select Schema Migration, Full Data Migration, and Incremental Data Migration.
If the destination Kafka instance Access Method is Alibaba Cloud Instance, Schema Migration is not supported.
If you do not select Schema Migration, make sure the destination database already contains the databases and tables to receive the data.
If you do not select Incremental Data Migration, do not write new data to the source instance during migration.

Processing Mode for Existing Destination Tables

  • Precheck and Report Errors: DTS checks whether the destination contains tables with the same names as source tables. If identical table names exist, the precheck fails and the task cannot start.

    If identical table names exist and the destination tables cannot be deleted or renamed, use the object name mapping feature to rename migrated tables. See Map object names.
  • Ignore Errors and Proceed: Skips the precheck for identical table names.

    Warning

    Selecting this option may cause data inconsistency:

      • During full data migration: if a source record has the same primary key as an existing destination record, DTS skips the source record. The existing destination record is retained.

      • During incremental data migration: if a source record has the same primary key as an existing destination record, DTS writes the source record, overwriting the destination record.

      • If source and destination schemas differ, only specific columns are migrated, or the task may fail.

    Proceed with caution.
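The two conflict behaviors can be sketched as follows (semantics assumed from the warning above, not taken from DTS source code):

```python
# Sketch of primary-key conflict handling in each migration phase,
# modeled as writes into a dict keyed by primary key.
existing = {1: {"id": 1, "name": "dest-old"}}

def full_migration_write(dest, row):
    # Full data migration: on conflict, the source record is skipped
    # and the existing destination record is retained.
    dest.setdefault(row["id"], row)

def incremental_write(dest, row):
    # Incremental migration: the source record overwrites the
    # destination record with the same primary key.
    dest[row["id"]] = row

source_row = {"id": 1, "name": "src-new"}

full_dest = dict(existing)
full_migration_write(full_dest, source_row)
print(full_dest[1]["name"])   # dest-old

incr_dest = dict(existing)
incremental_write(incr_dest, source_row)
print(incr_dest[1]["name"])   # src-new
```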

Data format in Kafka

Select the format for data written to Kafka:

  • DTS Avro: Data is parsed based on the DTS Avro schema definition. See GitHub.

  • Canal JSON: See Canal JSON for parameters and examples.

  • Shareplex JSON: See Shareplex JSON for parameters and examples.
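As an illustration of consuming one of these formats, the following Python sketch parses a hand-written message in the typical open-source Canal JSON shape. The field names (type, database, table, data, old, isDdl, pkNames) follow the common Canal convention; verify the exact fields DTS emits against the Canal JSON reference above.

```python
import json

# A hand-written message in the typical Canal JSON shape (assumed
# layout; confirm against the linked Canal JSON reference).
message = json.dumps({
    "database": "orders_db",
    "table": "orders",
    "type": "UPDATE",
    "isDdl": False,
    "pkNames": ["id"],
    "data": [{"id": "42", "status": "shipped"}],
    "old":  [{"status": "pending"}],
})

record = json.loads(message)
if not record["isDdl"] and record["type"] == "UPDATE":
    # In Canal JSON, "old" holds the previous values of changed columns only.
    changed = set(record["old"][0])
    print(record["table"], changed)   # orders {'status'}
```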

Kafka data compression format

  • LZ4 (default): Low compression ratio, high compression speed. Best for general use when speed is the priority.

  • GZIP: High compression ratio, low compression speed. Best for storage-constrained environments. Note: GZIP consumes significantly more CPU.

  • Snappy: Medium compression ratio and speed. Best for balanced performance.

Policy for shipping data to Kafka partitions

Select a partition policy based on your requirements.
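One common policy ships each row to a partition derived from a hash of its primary key, so all changes to the same row preserve their order. The idea can be illustrated with a short Python sketch; DTS's actual hash function is not documented here, so CRC-32 stands in for it.

```python
from zlib import crc32

# Illustration only: route a row to a partition by hashing its
# primary-key values. CRC-32 is a stand-in for whatever hash DTS uses.
def partition_for(primary_key_values, num_partitions):
    key = "|".join(str(v) for v in primary_key_values).encode("utf-8")
    return crc32(key) % num_partitions

# All changes to the same row land in the same partition, which
# preserves per-key ordering for consumers.
p1 = partition_for(("42",), num_partitions=8)
p2 = partition_for(("42",), num_partitions=8)
assert p1 == p2
```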

Message acknowledgement mechanism

Select a message acknowledgement mechanism based on your requirements.

Topic that stores DDL information

Select the topic used to store DDL information. If no topic is selected, DDL information is stored in the data topic by default.

Case Policy for Destination Object Names

Configure the case-sensitivity policy for migrated database, table, and column names. The default is DTS Default Policy. For details, see Case-sensitivity of object names in the destination database.

Source and selected objects

In the Source Objects section, select the tables to migrate and click the rightwards arrow icon to add them to Selected Objects. Objects can be selected at the table level.

In Selected Objects, you can configure the Kafka topic name, number of partitions, and partition key for each source table. See Configure topic mapping for details.

Using the object name mapping feature may cause other dependent objects to fail migration.
To select which SQL operations to include in incremental migration, right-click the migration object in Selected Objects and select the operations in the dialog box.

Step 5: Configure advanced settings

Click Next: Advanced Settings and configure the following:

  • Dedicated Cluster for Task Scheduling: By default, DTS schedules the task to the shared cluster. For higher migration stability, purchase a dedicated cluster. See What is a DTS dedicated cluster.

  • Retry Time for Failed Connections: How long DTS retries connection failures after the task starts. Valid values: 10–1,440 minutes. Default: 720 minutes. Set to more than 30 minutes. If reconnection succeeds within this window, the task resumes. If multiple tasks share the same source or destination, the value specified last takes precedence. During retries, DTS instance charges apply.

  • Retry Time for Other Issues: How long DTS retries failed DML or DDL operations. Valid values: 1–1,440 minutes. Default: 10 minutes. Set to more than 10 minutes. Must be less than Retry Time for Failed Connections.

  • Enable Throttling for Full Data Migration: Limits the read load on the source and write load on the destination during full migration. Configure QPS (queries per second) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Migration is selected.

  • Enable Throttling for Incremental Data Migration: Limits the load during incremental migration. Configure RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s). Available only when Incremental Data Migration is selected.

  • Whether to delete SQL operations on heartbeat tables of forward and reverse tasks: Controls whether DTS writes SQL operations on heartbeat tables to the source database. Yes: DTS does not write to heartbeat tables; a latency may be displayed for the DTS instance. No: DTS writes to heartbeat tables; this may affect features such as physical backup and cloning of the source database.

  • Environment Tag: Optional. Select a tag to identify the environment (such as production or staging).

  • Configure ETL: Whether to enable extract, transform, and load (ETL). Yes: Enter data processing statements in the code editor. See Configure ETL in a data migration or data synchronization task. No: ETL is disabled. For more information, see What is ETL?

  • Monitoring and Alerting: Whether to configure alerting. Yes: Set alert thresholds and notification contacts; DTS sends alerts when the task fails or migration latency exceeds the threshold. See Configure monitoring and alerting when you create a DTS task. No: No alerting.

Step 6: Run the precheck

Click Next: Save Task Settings and Precheck.

To view the API parameters for this task configuration, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
DTS runs a precheck before the migration starts. The task can only start after passing the precheck.
If the precheck fails, click View Details next to the failed item, fix the issue, and run the precheck again.
If a precheck item triggers an alert:

  • If the alert cannot be ignored, click View Details, fix the issue, and run the precheck again.

  • If the alert can be ignored, click Confirm Alert Details > Ignore > OK, then click Precheck Again. Ignoring an alert may cause data inconsistency.

Step 7: Purchase and start the instance

  1. Wait until Success Rate reaches 100%, then click Next: Purchase Instance.

  2. On the Purchase Instance page, configure the following:

    • Resource Group: Resource group for the migration instance. Default: default resource group. See What is Resource Management?

    • Instance Class: Determines migration speed. Choose based on your data volume and required throughput. See Instance classes of data migration instances.
  3. Read and agree to Data Transmission Service (Pay-as-you-go) Service Terms by selecting the check box.

  4. Click Buy and Start, then click OK in the confirmation message.

After the task starts, you can monitor its progress on the Data Migration page.

  • Full migration only (no incremental): the task stops automatically when complete, and the status changes to Completed.

  • With incremental migration: the task runs continuously and does not stop automatically; the status shows Running.

Configure topic mapping

Use topic mapping to customize the destination topic, partition count, and partition key for each source table.

  1. In the Selected Objects section, move the pointer over the destination topic name at the table level.

  2. Click Edit next to the topic name.

  3. In the Edit Table dialog box, configure the mapping:

    At the database level, the Edit Schema dialog box appears, which supports fewer parameters. Name of target Topic and Number of Partitions cannot be modified in Edit Schema if the migration objects are not an entire database.
    • Name of target Topic: The destination topic for this source table. Defaults to the topic selected in the Destination Database section. If the destination is a Message Queue for Apache Kafka instance, the topic must already exist; DTS does not create it automatically. If the destination is a self-managed Kafka cluster and the task includes schema migration, DTS attempts to create the topic. If you change this value, data is written to the specified topic.

    • Filter Conditions: SQL-based filter for the rows to migrate. See Set filter conditions.

    • Number of Partitions: Number of partitions for writing data to the destination topic.

    • Partition Key: Available when Policy for Shipping Data to Kafka Partitions is set to Ship Data to Separate Partitions Based on Hash Values of Primary Keys. Specify one or more columns as the partition key. DTS calculates a hash value for each row and routes it to the corresponding partition. This parameter is only available in the Edit Table dialog box (not at the database level).
  4. Click OK.