Use Data Transmission Service (DTS) to migrate data from a self-managed TiDB database to a PolarDB for MySQL cluster. DTS supports schema migration, full data migration, and incremental data migration, so you can minimize downtime by keeping the destination in sync with the source until cutover.
The overall process has four stages:
-
Create the destination PolarDB for MySQL cluster and verify account permissions.
-
(Incremental migration only) Set up a Kafka cluster and configure TiDB Binlog or TiCDC to stream changes to Kafka.
-
Create and configure the DTS migration task.
-
Run the precheck, purchase the instance, and monitor task progress.
Prerequisites
Before you begin, make sure you have:
-
A PolarDB for MySQL cluster with storage space larger than the space used by the TiDB source database. See Purchase a pay-as-you-go PolarDB for MySQL cluster and Purchase a subscription PolarDB for MySQL cluster
-
The required database accounts with the permissions listed in Required permissions
-
(Incremental migration only) A Kafka cluster and a configured incremental data collection pipeline. See (Optional) Set up incremental data collection
Required permissions
| Database | Required permissions | Reference |
|---|---|---|
| TiDB database | SHOW VIEW, and SELECT on the objects to be migrated | Privilege management |
| PolarDB for MySQL cluster | Read and write on the destination database | Create and manage database accounts |
Billing
| Migration type | Instance configuration fee | Internet traffic fee |
|---|---|---|
| Schema migration and full data migration | Free | Charged when Access Method is set to Public IP Address. See Billing overview. |
| Incremental data migration | Charged. See Billing overview. | See Billing overview. |
SQL operations supported for incremental migration
| Operation type | SQL statements |
|---|---|
| DML | INSERT, UPDATE, DELETE |
| DDL | CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE, ADD COLUMN, DROP COLUMN |
Usage notes
Source database limits
-
Egress bandwidth: The source database server must have sufficient egress bandwidth. Insufficient bandwidth reduces migration speed.
-
Primary key or UNIQUE constraint: Tables must have a PRIMARY KEY or UNIQUE constraint with all fields unique. Without these constraints, the destination database may contain duplicate records.
-
Table limit per task: When migrating individual tables with column-level edits (such as renaming columns), a single task supports up to 1,000 tables. For more than 1,000 tables, split the work across multiple tasks or migrate the entire database instead.
-
DDL restrictions during migration: During schema migration and full data migration, do not perform DDL operations that change the schema of the database or tables. The migration task fails if DDL operations are detected.
-
Prefix indexes: TiDB metadata does not store prefix index lengths. After migration, prefix index lengths are lost, which may prevent the instance from running. Manually fix prefix index lengths for any affected tables after migration.
-
Incremental data requirements: Incremental data migration requires a Kafka cluster and TiDB Binlog or TiCDC components. See (Optional) Set up incremental data collection.
Other limits
-
Incremental data partition: DTS reads incremental data only from partition ID 0 in the Kafka topic.
-
Position initialization: After creating a task with incremental migration, perform changes or insert test data on the source database promptly to update position information. Failing to do so may cause the task to fail due to excessive latency.
-
4-byte characters: If the data includes rare characters or emoji (4-byte characters), the destination database and tables must use the UTF8mb4 character set. If you use DTS schema migration, also set the
character_set_serverinstance parameter to UTF8mb4 in the destination database. -
Full migration resource usage: Full data migration consumes read and write resources on both the source and destination databases, increasing their load. Evaluate database performance before migrating and run full migration during off-peak hours (for example, when CPU load is below 30%).
-
Tablespace size after full migration: Concurrent INSERT operations during full data migration cause table fragmentation. The tablespace in the destination database will be larger than in the source database after migration completes.
-
Writes to the destination during migration: Writing to the destination database during migration may cause data inconsistency between the source and destination.
-
FLOAT and DOUBLE precision: DTS uses
ROUND(COLUMN,PRECISION)to read FLOAT and DOUBLE values. The default precision is 38 digits for FLOAT and 308 digits for DOUBLE. Verify that these precision settings meet your requirements. -
Auto-resume: DTS automatically attempts to resume tasks that failed within the last seven days. Before switching workloads to the destination instance, stop or release the DTS task, or run
REVOKEto remove write permissions from the DTS account on the destination instance. -
DDL failures: If a DDL statement fails in the destination database, the DTS task continues to run. View failed DDL statements in task logs. See View task logs.
-
Technical support intervention: If a DTS task fails, technical support attempts to restore it within 8 hours. During restoration, the task may be restarted and task parameters (not database parameters) may be modified. For the parameters that may be modified, see the Modify instance parameters section.
(Optional) Set up incremental data collection
Skip this section if you only need schema migration and full data migration.
To capture incremental changes from TiDB, you need a Kafka cluster and either TiDB Binlog or TiCDC. Both methods use Kafka as the transport layer. Complete the shared Kafka setup steps first, then follow the steps for your chosen method.
Step 1: Set up a Kafka cluster
Use one of the following options:
-
Self-managed Kafka cluster: Deploy Apache Kafka by following the Apache Kafka documentation.
WarningTo handle the volume of binary log data from TiDB, increase the following parameters in your Kafka configuration. See Kafka configuration.
Component Parameter Why Broker message.max.bytesSets the maximum size of a message the broker accepts. TiDB binary log events can be large, so the default limit causes write failures. Broker replica.fetch.max.bytesSets the maximum bytes a follower can fetch per request. Must be at least as large as message.max.bytesto replicate oversized messages.Consumer fetch.message.max.bytesSets the maximum bytes a consumer fetches per partition. Must be at least as large as message.max.bytesso DTS can read full events. -
Message Queue for Apache Kafka: Use the Alibaba Cloud managed Kafka service. See Getting started.
Deploy the Message Queue for Apache Kafka instance in the same virtual private cloud (VPC) as the source database server to reduce network latency and ensure stable connectivity.
Step 2: Create a Kafka topic
Create a topic in the Kafka cluster or Message Queue for Apache Kafka instance.
The topic must have exactly one partition. DTS reads incremental data only from the partition with ID 0.
Step 3: Configure TiDB Binlog or TiCDC
Choose one method based on your environment:
<details> <summary>Option A: TiDB Binlog</summary>
To minimize the impact of network latency, deploy the Pump component, Drainer component, and Kafka cluster on servers that share the same internal network as the source TiDB database server.
-
Deploy the Pump and Drainer components. See TiDB Binlog cluster deployment.
-
Edit the Drainer configuration file to set Kafka as the output. See Kafka custom development.Kafka Custom Development
-
Verify that the TiDB database server can connect to the Kafka cluster.
-
Add the CIDR blocks of DTS servers to the TiDB database whitelist. See Add the CIDR blocks of DTS servers.
</details>
<details> <summary>Option B: TiCDC</summary>
-
Install the TiCDC component. Use TiUP to add or scale out TiCDC on your existing TiDB cluster. See Deploy TiCDC.
-
Replicate incremental data to Kafka using
tiup cdc cli changefeed create \as the first command. See Replicate data to Kafka. -
Verify that the TiDB database server can connect to the Kafka cluster.
</details>
Create a migration task
Step 1: Go to the Data Migration page
Use one of the following methods:
DTS console
-
Log on to the DTS console.DTS console
-
In the left-side navigation pane, click Data Migration.
-
In the upper-left corner, select the region where the migration instance resides.
DMS console
The actual steps may vary based on the mode and layout of the DMS console. See Simple mode and Customize the layout and style of the DMS console.
-
Log on to the DMS console.DMS console
-
In the top navigation bar, move the pointer over Data + AI > DTS (DTS) > Data Migration.
-
In the drop-down list to the right of Data Migration Tasks, select the region where the migration instance resides.
Step 2: Configure source and destination databases
-
Click Create Task.
-
On the task configuration page, configure the source and destination databases.
Source database parameters
| Parameter | Description |
|---|---|
| Task Name | A name for the DTS task. DTS auto-generates a name. Specify a descriptive name for easy identification. The name does not need to be unique. |
| Select Existing Connection | Select a registered database instance from the drop-down list, or configure the connection manually. In the DMS console, select from Select a DMS database instance. |
| Database Type | Select TiDB. |
| Access Method | Select the connection type based on where the TiDB database is deployed. This example uses Self-managed Database on ECS. For other connection types, complete the corresponding preparations. |
| Instance Region | The region where the TiDB database is located. |
| ECS Instance ID | The ID of the Elastic Compute Service (ECS) instance running the TiDB database. |
| Port Number | The service port of the TiDB database. Default: 4000. |
| Database Account | The account for the TiDB database. |
| Database Password | The password for the TiDB database account. |
| Migrate Incremental Data | To migrate incremental data, select Yesalert notification settings and fill in the Kafka cluster information. See Kafka cluster parameters. |
Destination database parameters
| Parameter | Description |
|---|---|
| Select Existing Connection | Select a registered database instance from the drop-down list, or configure the connection manually. In the DMS console, select from Select a DMS database instance. |
| Database Type | Select PolarDB for MySQL. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region where the destination PolarDB for MySQL cluster is located. |
| Replicate Data Across Alibaba Cloud Accounts | Select No when using a database instance under the current Alibaba Cloud account. |
| PolarDB Cluster ID | Select the ID of the destination PolarDB for MySQL cluster. |
| Database Account | The account for the destination PolarDB for MySQL cluster. |
| Database Password | The password for the destination PolarDB for MySQL cluster account. |
| Encryption | Whether to encrypt the connection. For details on SSL encryption, see Configure SSL encryption. |
Step 3: Test connectivity
In the lower part of the page, click Test Connectivity and Proceed. In the CIDR Blocks of DTS Servers dialog box, click Test Connectivity.
Make sure the CIDR blocks of DTS servers are added to the security settings of the source and destination databases. See Add the CIDR blocks of DTS servers.
Step 4: Configure migration objects
-
On the Configure Objects page, set the following parameters.
Migration types and conflict handling
| Parameter | Description |
|---|---|
| Migration Types | Select the migration types based on your needs: <br>- Schema Migration + Full Data Migration: migrates the schema and existing data only. <br>- Schema Migration + Full Data Migration + Incremental Data Migration: minimizes downtime by also capturing changes during migration. <br> Note
If you skip Schema Migration, create the target database and tables manually and enable object name mapping in Selected Objects. If you skip Incremental Data Migration, avoid writing to the source database during migration to maintain data consistency. |
| Processing Mode of Conflicting Tables | - Precheck and Report Errors (default): fails the precheck if the destination contains tables with the same names as the source. Use object name mapping to rename conflicting tables. <br>- Ignore Errors and Proceed: skips the precheck for name conflicts. Warning
This option risks data inconsistency — during full migration, DTS skips conflicting records; during incremental migration, DTS overwrites them. If schemas differ, only specific columns are migrated or the task may fail. |
| Capitalization of Object Names in Destination Instance | Controls the capitalization of database names, table names, and column names in the destination. Default: DTS default policy. See Specify the capitalization of object names in the destination instance. |
Object selection
| Parameter | Description |
|---|---|
| Source Objects | Select the tables or databases to migrate. Click |
| Selected Objects | Right-click an object to rename it. See Map the name of a single object. Click Batch Edit to rename multiple objects at once. See Map multiple object names at a time. Right-click a table to configure a WHERE filter condition. Note
Renaming an object may prevent dependent objects from being migrated. Note
|
-
Click Next: Advanced Settings.
Advanced settings
| Parameter | Description |
|---|---|
| Select the engine type of the destination database | The storage engine for the destination database: InnoDB (default) or X-Engine (for online transaction processing (OLTP) workloads). |
| Retry Time for Failed Connections | How long DTS retries failed connections after the task starts. Range: 10–1,440 minutes. Default: 720. We recommend that you set the parameter to a value greater than 30 minutes. If the connection is restored within this period, the task resumes; otherwise, it fails. Note
When multiple tasks share the same source or destination database, the most recently set value takes effect. DTS charges for the instance during retry time. |
| Retry Time for Other Issues | How long DTS retries failed DDL or DML operations. Range: 1–1,440 minutes. Default: 10. We recommend that you set the parameter to a value greater than 10 minutes. Important
This value must be less than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Migration | Limits resource usage during full data migration. Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Migration is selected. |
| Enable Throttling for Incremental Data Migration | Limits resource usage during incremental data migration. Configure RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s). Available only when Incremental Data Migration is selected. |
| Environment Tag | An optional tag to identify the migration instance. |
| Configure ETL | Whether to enable extract, transform, and load (ETL). Select Yes to enter data processing statements. See What is ETL? and Configure ETL in a data migration or data synchronization task. |
| Monitoring and Alerting | Whether to configure alerts for the task. Select Yes to set an alert threshold and notification contacts. See Configure monitoring and alerting when you create a DTS task. |
-
Click Next Step: Data Verification to configure data verification. See Configure a data verification task.
Step 5: Run the precheck
-
To preview the API parameters for this task, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
-
Click Next: Save Task Settings and Precheck.
DTS runs a precheck before starting the migration. The task only starts after the precheck passes.
For failed precheck items, click View Details to diagnose the issue, resolve it, then run the precheck again.
For alert items: if the alert cannot be ignored, fix the issue and rerun the precheck. If the alert can be ignored, click Confirm Alert Details, then click Ignore > OK > Precheck Again.
Step 6: Purchase and start the instance
-
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
-
On the Purchase Instance page, configure the instance class.
| Parameter | Description |
|---|---|
| Resource Group | The resource group for the migration instance. Default: default resource group. See What is Resource Management? |
| Instance Class | The instance class determines migration speed. See Instance classes of data migration instances. |
-
Read and accept Data Transmission Service (Pay-as-you-go) Service Terms by selecting the check box.
-
Click Buy and Start, then click OK.
The task appears on the Data Migration page. Monitor its progress there.
Kafka cluster parameters
Configure the following parameters when Migrate Incremental Data is set to Yes.
| Parameter | Description |
|---|---|
| Kafka Cluster Type | The connection type based on where the Kafka cluster is deployed. This example uses Self-managed Database on ECS. If you select Express Connect, VPN Gateway, or Smart Access Gateway, also select Connected VPC and enter Domain Name or IP. |
| Kafka Data Source Component | Select based on the incremental collection method you configured: Use the default binlog format of the TiDB database. (TiDB Binlog) or Use the TiCDC Canal-JSON format. (TiCDC). |
| ECS Instance ID | The ID of the ECS instance running the Kafka cluster. |
| Port Number | The service port of the Kafka cluster. |
| Kafka Cluster Account | The Kafka cluster username. Leave blank if authentication is not enabled. |
| Kafka Cluster Password | The Kafka cluster password. Leave blank if authentication is not enabled. |
| Kafka Version | The Kafka cluster version. If the version is 1.0 or later, select 1.0. |
| Encryption | Select Non-encrypted or SCRAM-SHA-256 based on your security requirements. |
| Topic | The topic that contains the incremental data. |