Data Transmission Service:Migrate data from PolarDB-X 2.0 to Message Queue for Kafka

Last Updated: Mar 30, 2026

Data Transmission Service (DTS) lets you migrate data from a PolarDB-X 2.0 instance to a Message Queue for Apache Kafka instance. You can run a one-time full migration or keep Kafka in sync with ongoing incremental change capture during migration.

How it works

DTS reads the binary log (binlog) of the source PolarDB-X instance to capture row-level INSERT, UPDATE, and DELETE operations as they are committed. For combined full and incremental migration, DTS first takes a consistent snapshot of the source data, then switches to binlog-based incremental capture from the point where the snapshot completed.
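The snapshot-then-incremental handoff described above can be sketched as a small simulation. This is illustrative only, not DTS internals: the event shapes, the `checkpoint` value, and the dedup-by-position rule are hypothetical stand-ins for what the service does with real binlog positions.

```python
# Illustrative sketch of a snapshot-then-incremental handoff.
# Event shapes and checkpoint handling are hypothetical, not DTS internals.

def replay(snapshot_rows, binlog_events, checkpoint):
    """Apply the full snapshot, then only binlog events committed after it."""
    table = {row["id"]: row for row in snapshot_rows}      # full data phase
    for ev in binlog_events:                               # incremental phase
        if ev["pos"] <= checkpoint:                        # already in snapshot
            continue
        if ev["type"] in ("INSERT", "UPDATE"):
            table[ev["row"]["id"]] = ev["row"]
        elif ev["type"] == "DELETE":
            table.pop(ev["row"]["id"], None)
    return table

snapshot = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
events = [
    {"pos": 90,  "type": "INSERT", "row": {"id": 2, "v": "b"}},   # pre-checkpoint: skipped
    {"pos": 110, "type": "UPDATE", "row": {"id": 1, "v": "a2"}},
    {"pos": 120, "type": "DELETE", "row": {"id": 2}},
]
result = replay(snapshot, events, checkpoint=100)
print(result)  # {1: {'id': 1, 'v': 'a2'}}
```

The key property is that events at or before the checkpoint are discarded, because their effects are already contained in the snapshot.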

Because MySQL purges binlogs on a schedule, your source instance must retain binlogs long enough to cover the entire snapshot phase plus any downtime or interruptions. If binlogs are purged before DTS reads them, the task fails or data loss occurs.
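As a rough planning aid, you can check a configured retention period against the expected snapshot duration plus a safety buffer. The helper below is a hypothetical sketch, not a DTS API:

```python
def retention_is_safe(snapshot_hours, retention_hours, buffer_hours=24):
    """True if binlogs will survive the snapshot phase plus a safety buffer."""
    return retention_hours >= snapshot_hours + buffer_hours

# A 5-day snapshot with 7 days of retention leaves headroom; 3 days does not.
print(retention_is_safe(snapshot_hours=5 * 24, retention_hours=7 * 24))  # True
print(retention_is_safe(snapshot_hours=5 * 24, retention_hours=3 * 24))  # False
```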

What DTS supports

  • Migration types: schema migration, full data migration, and incremental data migration

  • Incremental DML operations: INSERT, UPDATE, and DELETE

  • Kafka data formats: DTS Avro and Canal Json

  • Kafka authentication: non-encrypted or SCRAM-SHA-256

  • Partition routing: configurable policy for routing data to Kafka partitions

  • Object name mapping: rename tables or columns at the destination

  • Data filtering: WHERE condition-based row filtering

  • ETL: extract, transform, and load (ETL) processing during migration

DTS does not migrate foreign keys. Cascade update and delete operations defined in the source database are not replicated to Kafka.
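For the Canal Json format listed above, a downstream consumer typically decodes each Kafka message as a JSON change event. The sample below follows the common Canal JSON layout (`database`, `table`, `type`, `pkNames`, `data`, `old`); the field values are made up for illustration:

```python
import json

# A made-up change event in the common Canal JSON layout.
message = json.dumps({
    "database": "testdb",
    "table": "orders",
    "type": "UPDATE",
    "pkNames": ["id"],
    "data": [{"id": "7", "status": "shipped"}],   # row image after the change
    "old": [{"status": "pending"}],               # changed columns before the change
    "es": 1711766400000,
})

event = json.loads(message)
if event["type"] in ("INSERT", "UPDATE", "DELETE"):
    for row in event["data"]:
        pk = {k: row[k] for k in event["pkNames"]}
        print(event["type"], event["database"] + "." + event["table"], pk)
```

See the Canal Json section of Data formats of a Kafka cluster for the authoritative field list.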

Prerequisites

Before you begin, make sure you have:

  • A PolarDB-X 2.0 instance that is compatible with MySQL 5.7

  • A topic created in the destination Message Queue for Apache Kafka instance to receive the migrated data. For more information, see Step 1: Create a topic

  • Compatible versions for both source and destination. For more information, see Overview of data migration scenarios

  • Sufficient storage space in the destination Kafka instance to hold the source data

Source database account permissions:

  • Schema migration: SELECT

  • Full data migration: SELECT

  • Incremental data migration: REPLICATION SLAVE, REPLICATION CLIENT, and SELECT on the objects to be migrated

For instructions on granting permissions, see Data synchronization tools for PolarDB-X.
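The permission sets above map onto standard MySQL GRANT statements. The helper below assembles them for a DTS account; the account name and database scope are hypothetical, and the exact syntax accepted by PolarDB-X may differ, so treat this as a sketch and follow the linked guide:

```python
# Illustrative GRANT builder; account/scope names are hypothetical examples.
PERMISSIONS = {
    "schema": ["SELECT"],
    "full": ["SELECT"],
    "incremental": ["REPLICATION SLAVE", "REPLICATION CLIENT", "SELECT"],
}

def grant_statements(migration_type, account, scope="mydb.*"):
    """Build GRANT statements for the chosen migration type."""
    stmts = []
    for perm in PERMISSIONS[migration_type]:
        # REPLICATION privileges are global in MySQL, so they need *.* scope.
        target = "*.*" if perm.startswith("REPLICATION") else scope
        stmts.append(f"GRANT {perm} ON {target} TO '{account}'@'%';")
    return stmts

for stmt in grant_statements("incremental", "dts_user"):
    print(stmt)
```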

Binary log configuration (required for incremental data migration):

Enable binary logging on the source PolarDB-X instance and set the following parameter:

binlog_row_image=full
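One quick way to confirm the source settings is to compare the output of SHOW VARIABLES against what incremental migration needs. The checker below is a hypothetical sketch that works on a dict of variable names to values:

```python
# Hypothetical config checker; feed it name/value pairs from SHOW VARIABLES.
REQUIRED = {"log_bin": "ON", "binlog_row_image": "FULL"}

def check_binlog_config(variables):
    """Return the list of variables that do not match the required values."""
    return [
        name for name, expected in REQUIRED.items()
        if variables.get(name, "").upper() != expected
    ]

# Values as they might come back from the source instance.
source_vars = {"log_bin": "ON", "binlog_row_image": "minimal"}
print(check_binlog_config(source_vars))  # ['binlog_row_image']
```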

Set the retention period based on your migration type:

  • Incremental data migration only: more than 24 hours

  • Full data migration + incremental data migration: at least 7 days
Warning

If binlogs are purged before DTS reads them, the migration task fails. For large source databases, the full data migration (snapshot) phase can take longer than the retention period. If binlogs are purged during the snapshot, DTS cannot resume incremental migration from the correct binlog position, which causes data inconsistency or loss. After full data migration completes, you can reduce the retention period to more than 24 hours.

Limitations

Source database limits:

  • The server hosting the source database must have sufficient outbound bandwidth. Insufficient bandwidth reduces migration speed.

  • Tables to be migrated must have a PRIMARY KEY or UNIQUE constraint, and all fields in the constraint must be unique. Without such a constraint, the destination may contain duplicate records.

  • If you select tables as migration objects and need to rename tables or columns at the destination, a single migration task supports a maximum of 1,000 tables. For more than 1,000 tables, configure multiple tasks or migrate the entire database instead.

  • During schema migration and full data migration, do not perform DDL operations on the source database. DDL changes during migration cause the task to fail.

  • If you change the network type of the PolarDB-X instance during migration, update the network connection settings of the migration task accordingly.

  • For full data migration without incremental migration: do not write to the source database during migration. Writes during this window cause data inconsistency between source and destination. To ensure consistency, select Schema Migration, Full Data Migration, and Incremental Data Migration together.
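The 1,000-table-per-task limit mentioned above can be worked around by splitting the object list across several migration tasks. A simple chunking sketch (the table names are made up):

```python
def split_into_tasks(tables, max_per_task=1000):
    """Split a table list into batches small enough for one migration task each."""
    return [tables[i:i + max_per_task] for i in range(0, len(tables), max_per_task)]

# 2,500 tables need three tasks under the 1,000-table limit.
tables = [f"db.t{i}" for i in range(2500)]
print([len(batch) for batch in split_into_tasks(tables)])  # [1000, 1000, 500]
```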

Other limits:

  • Evaluate the performance impact before starting migration. Migrate during off-peak hours when possible. Full data migration consumes read and write resources on both source and destination.

  • During full data migration, concurrent INSERT operations cause fragmentation in destination tables. After migration, the used tablespace in the destination is larger than in the source.

  • DTS attempts to resume failed tasks for up to 7 days. Before switching workloads to the destination, stop or release any failed tasks. Alternatively, run the REVOKE statement to revoke write permissions from the DTS account to prevent source data from overwriting destination data when a failed task resumes.

Precaution:

DTS updates the dts_health_check.ha_health_check table in the source database on a schedule to advance the binary log position.

Billing

  • Schema migration and full data migration: the instance configuration fee is free of charge. Internet traffic is charged only when data is migrated from Alibaba Cloud over the Internet. See Billing overview.

  • Incremental data migration: the instance configuration fee is charged. Internet traffic is charged only when data is migrated from Alibaba Cloud over the Internet. See Billing overview.

Set up the migration task

Step 1: Go to the Data Migration Tasks page

  1. Log on to the Data Management (DMS) console.

  2. In the top navigation bar, click DTS.

  3. In the left-side navigation pane, choose DTS (DTS) > Data Migration.

Console layout and available options may vary based on your DMS mode. For more information, see Simple mode and Customize the layout and style of the DMS console. You can also go directly to the Data Migration Tasks page of the new DTS console.

Step 2: Configure the source and destination databases

  1. From the drop-down list next to Data Migration Tasks, select the region where the migration instance resides.

    In the new DTS console, select the region in the upper-left corner.
  2. Click Create Task. In the Create Task wizard, configure the source and destination databases.

    Warning

    After you configure the source and destination databases, read the limits displayed at the top of the page before proceeding. Skipping this step may cause task failures or data inconsistency.

    Source database parameters:

    • Task Name: a descriptive name for the task. DTS assigns a default name. A unique name is not required.
    • Select an existing DMS database instance: select an existing instance to have DTS populate the parameters automatically, or configure them manually.
    • Database Type: select PolarDB-X 2.0.
    • Connection Type: select Alibaba Cloud Instance.
    • Instance Region: the region where the source PolarDB-X instance resides.
    • Instance ID: the ID of the source PolarDB-X instance.
    • Database Account: the account for the source instance. See the permissions table in Prerequisites.
    • Database Password: the password for the database account.

    Destination database parameters:

    • Select an existing DMS database instance: select an existing instance to have DTS populate the parameters automatically, or configure them manually.
    • Database Type: select Kafka.
    • Connection Type: select Express Connect, VPN Gateway, or Smart Access Gateway. DTS does not list Message Queue for Apache Kafka as a direct access method, so connect to it as a self-managed Kafka cluster.
    • Instance Region: the region where the destination Message Queue for Apache Kafka instance resides.
    • Connected VPC: the ID of the virtual private cloud (VPC) associated with the Kafka instance. To get the VPC ID, log on to the Message Queue for Apache Kafka console, open the Instance Details page, and find the VPC ID in the Configuration Information section of the Instance Information tab.
    • IP Address or Domain Name: an IP address of the Kafka instance. To get an IP address, open the Instance Details page and find one in the Default Endpoint field of the Endpoint Information section.
    • Port Number: the service port of the Kafka instance. Default: 9092.
    • Database Account: the account for the Kafka instance. Not required for VPC-connected instances.
    • Database Password: the password for the database account. Not required for VPC-connected instances.
    • Kafka Version: the version of the destination Kafka instance.
    • Encryption: select Non-encrypted or SCRAM-SHA-256 based on your security requirements.
    • Topic: the topic that receives the migrated data. Select from the drop-down list.
    • Topic That Stores DDL Information: the topic that stores DDL information. If left blank, DDL information is stored in the topic specified by Topic.
    • Use Kafka Schema Registry: whether to use Kafka Schema Registry, which provides a RESTful API for storing and retrieving Avro schemas. Select No to skip, or select Yes and provide the URL or IP address registered in Schema Registry for your Avro schemas.
  3. Click Test Connectivity and Proceed. DTS automatically adds its CIDR blocks to the IP address whitelist of Alibaba Cloud database instances (such as ApsaraDB RDS for MySQL or ApsaraDB for MongoDB) and to the security group rules of Elastic Compute Service (ECS) instances hosting self-managed databases. For self-managed databases on multiple ECS instances, manually add the DTS CIDR blocks to the security group rules of each instance. For on-premises or third-party cloud databases, manually add the DTS CIDR blocks to the database's IP address whitelist. For more information, see Add the CIDR blocks of DTS servers to the security settings of on-premises databases.

    Warning

    Adding DTS CIDR blocks to whitelists or security group rules introduces potential security risks. Before using DTS, take preventive measures: strengthen account and password security, restrict exposed ports, authenticate API calls, regularly audit your IP whitelist and ECS security group rules, and remove unauthorized CIDR blocks. For stronger isolation, connect databases to DTS using Express Connect, VPN Gateway, or Smart Access Gateway.

Step 3: Configure migration objects and settings

  • Migration Types: select Schema Migration and Full Data Migration for a one-time migration. Select Schema Migration, Full Data Migration, and Incremental Data Migration to keep the destination in sync during migration without interrupting your application. If you do not select Incremental Data Migration, do not write to the source database during migration, to prevent data inconsistency.

  • Processing Mode of Conflicting Tables: Precheck and Report Errors fails the precheck if identically named tables exist in both source and destination; to resolve the conflict, rename the tables that are migrated to the destination by using the object name mapping feature. Ignore Errors and Proceed skips the conflict check; if the schemas match, records with duplicate primary keys are not migrated, and if the schemas differ, only specific columns are migrated or the task fails. Proceed with caution.

  • Data Format in Kafka: DTS Avro means data is parsed using the DTS Avro schema definition (see the schema definition on GitHub). Canal Json means data is stored in Canal Json format (see the Canal Json section of Data formats of a Kafka cluster).

  • Policy for Shipping Data to Kafka Partitions: select a partition routing policy based on your requirements. See Specify the policy for migrating data to Kafka partitions. Not available when the source is a PolarDB-X 1.0 instance.

  • Source Objects: select one or more objects and click the Rightwards arrow icon to add them to Selected Objects. You can select columns, tables, or schemas. If you select tables or columns, DTS does not migrate views, triggers, or stored procedures.

  • Selected Objects: to rename a single object at the destination, right-click it (see Map the name of a single object). To rename multiple objects at once, click Batch Edit (see Map multiple object names at a time). Renaming an object may break other objects that depend on it. To filter rows, right-click an object and specify WHERE conditions (see Set filter conditions). To select specific SQL operations for a table, right-click the object and choose the operations to migrate.
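A hash-by-primary-key routing policy, one of the configurable strategies for shipping data to partitions, can be approximated as follows. The hashing scheme here is illustrative, not the exact algorithm DTS uses:

```python
import hashlib

def route_to_partition(primary_key, num_partitions):
    """Route a record to a Kafka partition by hashing its primary key."""
    digest = hashlib.md5(str(primary_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# Records with the same key always land in the same partition,
# which preserves per-key ordering in the destination topic.
print(route_to_partition("order-42", 12) == route_to_partition("order-42", 12))  # True
```

Keying by primary key trades even load distribution for per-row ordering guarantees, which is usually what change-data consumers need.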

Step 4: Configure advanced settings

Click Next: Advanced Settings and configure the following parameters.

  • Set Alerts: select Yes to receive notifications when the task fails or migration latency exceeds a threshold, then specify the alert threshold and contacts. See Configure monitoring and alerting. Select No to skip alerting.

  • Capitalization of Object Names in Destination Instance: controls the capitalization of database, table, and column names in the destination. Defaults to DTS default policy. See Specify the capitalization of object names in the destination instance.

  • Retry Time for Failed Connections: how long DTS retries failed connections after the task starts. Valid values: 10 to 1440 minutes. Default: 720 minutes. Set this to more than 30 minutes. If DTS reconnects within this window, the task resumes; otherwise, it fails. When multiple tasks share the same source or destination, the shortest retry time among them takes precedence. DTS charges for the instance during retry attempts, so set the retry window based on your needs and release the instance promptly once the source and destination are decommissioned.

  • Configure ETL: select Yes to enable extract, transform, and load (ETL) processing and enter data processing statements. See What is ETL? and Configure ETL in a data migration or data synchronization task. Select No to skip ETL.

  • Whether to delete SQL operations on heartbeat tables of forward and reverse tasks: select Yes to suppress writes to the heartbeat table; the DTS instance may then display migration latency. Select No to allow DTS to write to the heartbeat table; some source database features, such as physical backup and cloning, may be affected.

Step 5: Run the precheck and purchase an instance

  1. Click Next: Save Task Settings and Precheck. To preview the API parameters for this task configuration, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.

    DTS runs a precheck before starting the migration. The task starts only after the precheck passes. If the precheck fails, click View Details next to each failed item, resolve the reported issues, and click Precheck Again. If an alert is triggered and can be ignored, click Confirm Alert Details, then Ignore in the View Details dialog, click OK, and click Precheck Again. Ignoring alerts may cause data inconsistency.
  2. Wait until the success rate reaches 100%, then click Next: Purchase Instance.

  3. On the Purchase Instance page, configure the instance class.

    • Resource Group: the resource group for the migration instance. Default: default resource group. See What is Resource Management?
    • Instance Class: the instance class determines migration speed. Select a class based on your workload. See Specifications of data migration instances.
  4. Read and select the Data Transmission Service (Pay-as-you-go) Service Terms check box.

  5. Click Buy and Start. The task appears in the task list and begins running.

What's next

After the migration task is running, monitor its progress in the task list. Once the task is stable and the destination Kafka instance is consuming data correctly, switch your downstream applications to read from the destination topics. Before switching, stop or release any failed tasks to prevent DTS from overwriting destination data when resuming.