Use Data Transmission Service (DTS) to migrate data from a self-managed TiDB database to a PolarDB-X 2.0 instance. DTS supports schema migration, full data migration, and incremental data migration, so you can migrate with minimal downtime.
Migration process overview
The end-to-end migration has the following stages:
- Set up incremental data collection (required only for incremental data migration): deploy Apache Kafka and connect TiDB Binlog or TiCDC to stream changes to Kafka.
- Grant the required permissions: on both the source TiDB database and the destination PolarDB-X 2.0 instance.
- Create a DTS migration task: configure source, destination, and migration type, then run a precheck.
- Purchase and start the instance: select an instance class and start the migration.
Prerequisites
Before you begin, ensure that you have:
- A PolarDB-X 2.0 instance with more storage space than the source TiDB database. To create one, see Create PolarDB-X instances.
- (Required for incremental data migration) A Kafka cluster and TiDB Binlog or TiCDC deployed and configured. See Set up incremental data collection.
Permissions required
Grant the following permissions before creating the DTS task.
| Database | Required permissions | Reference |
|---|---|---|
| TiDB (source) | SHOW VIEW and SELECT on the objects to be migrated | Permission Management |
| PolarDB-X 2.0 (destination) | Read and write on the destination database | Manage database accounts |
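As a sketch, the source-side grant can be written out for review before applying it. The account name `dts_user` and schema name `migrate_db` below are placeholders, not values from this guide:

```shell
# Write the GRANT statement for the DTS account to a file for review.
# 'dts_user' and 'migrate_db' are hypothetical names -- substitute your own.
cat > grant_dts.sql <<'EOF'
-- Source TiDB: read-only access for DTS on the objects to migrate
GRANT SHOW VIEW, SELECT ON migrate_db.* TO 'dts_user'@'%';
EOF
# Apply with any MySQL-compatible client, for example:
#   mysql -h <tidb-host> -P 4000 -u root -p < grant_dts.sql
cat grant_dts.sql
```

Grant the destination account read and write permissions through the PolarDB-X console as described in Manage database accounts.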
Supported SQL operations for incremental migration
| Type | Operations |
|---|---|
| DML | INSERT, UPDATE, DELETE |
| DDL | CREATE TABLE, DROP TABLE, ALTER TABLE, RENAME TABLE, TRUNCATE TABLE, CREATE VIEW, ALTER VIEW |
Billing
| Migration type | Instance configuration fee | Internet traffic fee |
|---|---|---|
| Schema migration and full data migration | Free of charge | Charged when Access Method is set to Public IP Address. See Billing overview. |
| Incremental data migration | Charged. See Billing overview. | — |
Limitations
Review the following limitations before starting the migration.
Source database
- The source database server must have enough outbound bandwidth. Insufficient bandwidth reduces migration speed.
- Each table to be migrated must have a PRIMARY KEY or UNIQUE constraint, and all of its fields must be unique. Otherwise, the destination database may contain duplicate records.
- If you select individual tables as migration objects and need to rename tables or columns in the destination database, a single task can migrate up to 1,000 tables. For more than 1,000 tables, configure multiple tasks or migrate the entire database.
- To migrate incremental data from the source TiDB database, you must deploy a Kafka cluster and the related TiDB components that collect the incremental data.
- TiDB does not store prefix index lengths in its metadata. Migrating tables with prefix indexes loses those lengths in the destination instance, which may cause the instance to fail. Manually fix prefix index lengths after migration.
- Do not run DDL operations on the source database during schema migration or full data migration. DDL changes during this phase cause the migration task to fail.
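For the prefix-index limitation above, the post-migration fix can be sketched as a pair of ALTER statements. The table `t`, index `idx_name`, and prefix length 10 are illustrative only:

```shell
# Recreate a prefix index whose length was lost during migration.
# Table, index, and prefix length below are examples -- use your own.
cat > fix_prefix_indexes.sql <<'EOF'
ALTER TABLE migrate_db.t DROP INDEX idx_name;
ALTER TABLE migrate_db.t ADD INDEX idx_name (name(10));
EOF
cat fix_prefix_indexes.sql
```

Run the statements on the destination PolarDB-X 2.0 instance after full data migration completes.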
Incremental data migration
- DTS reads data only from the partition with ID 0 in the Kafka topic.
- After the DTS instance is created, immediately run UPDATE or INSERT operations on test data in the source database to advance the DTS instance offset. Without such writes, high latency may cause the instance to fail.
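One way to generate such test writes is a small heartbeat script. The table `migrate_db.dts_heartbeat` is a made-up name for illustration; any writable table in the migrated schema works:

```shell
# Periodic INSERTs on any migrated table advance the DTS offset.
# 'migrate_db.dts_heartbeat' is a hypothetical table name.
cat > dts_heartbeat.sql <<'EOF'
CREATE TABLE IF NOT EXISTS migrate_db.dts_heartbeat (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  ts DATETIME
);
INSERT INTO migrate_db.dts_heartbeat (ts) VALUES (NOW());
EOF
cat dts_heartbeat.sql
```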
General
- Full data migration uses concurrent INSERT operations, which causes table fragmentation in the destination database. After full data migration, the storage used in the destination is larger than in the source.
- Full data migration consumes read and write resources on both databases. Run the migration during off-peak hours when CPU load is below 30%.
- Do not write data from sources other than DTS to the destination database while the DTS instance is running. Doing so can cause data inconsistency and may cause the instance to fail.
- For FLOAT and DOUBLE columns, DTS reads values using ROUND(COLUMN, PRECISION). The default precision is 38 for FLOAT and 308 for DOUBLE. Confirm that this precision meets your requirements.
- DTS attempts to resume failed instances for up to 7 days. Before switching traffic to the destination, end or release the DTS instance, or revoke write permissions for the DTS database account, to prevent a resumed instance from overwriting destination data.
- If a DTS task fails, DTS technical support will attempt to restore it within 8 hours. The task may be restarted, and task parameters (not database parameters) may be modified during restoration. For the list of parameters that may be modified, see Modify instance parameters.
Set up incremental data collection
Skip this section if you only need full data migration.
To migrate incremental data from TiDB, you must route changes through Apache Kafka. DTS then reads from Kafka. Choose one of two methods: TiDB Binlog or TiCDC.
Deploy the source database server, Pump, Drainer (for TiDB Binlog) or TiCDC (for TiCDC), and the Kafka cluster on the same internal network. This minimizes network latency during incremental data migration.
Step 1: Prepare a Kafka cluster
Both methods require a Kafka cluster. Set the following parameters to values large enough to accommodate the binary log data volume from TiDB. For reference values, see CONFIGURATION.
| Parameter | Where to set | Why |
|---|---|---|
| message.max.bytes | Kafka broker | Allows the broker to receive larger binary log payloads from TiDB |
| replica.fetch.max.bytes | Kafka broker | Allows replicas to fetch larger messages |
| fetch.message.max.bytes | Kafka consumer | Allows the consumer to fetch larger messages |
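For example, the broker-side settings might be raised like this. The 1 GB value is an arbitrary illustration, not a recommendation; size it against your binlog volume:

```shell
# Append larger message limits to the broker configuration.
# 1073741824 bytes (1 GB) is illustrative only.
cat >> server.properties <<'EOF'
message.max.bytes=1073741824
replica.fetch.max.bytes=1073741824
EOF
# fetch.message.max.bytes is set on the consuming client, not the broker.
cat server.properties
```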
Use one of the following options to create a Kafka cluster:
- Self-managed Apache Kafka cluster: deploy Kafka on your own infrastructure. See the Apache Kafka official website.
- ApsaraMQ for Kafka instance: create a managed Kafka instance. See Getting started overview. The ApsaraMQ for Kafka instance must be in the same virtual private cloud (VPC) as the source database server.
Step 2: Create a topic
Create a topic in the Kafka cluster.
The topic must contain exactly one partition. This ensures incremental data is replicated to partition ID 0, which is the only partition DTS reads from.
Step 3: Configure your change capture method
Choose Option A (TiDB Binlog) or Option B (TiCDC) based on your environment.
Use TiDB Binlog
- Deploy Pump and Drainer. See TiDB Binlog cluster deployment.
- Configure Drainer to forward data to your Kafka cluster. See Binlog Consumer Client User Guide.
- Verify that the TiDB database server can connect to the Kafka cluster.
- Add the CIDR blocks of DTS servers to the TiDB database whitelist. See Add the CIDR blocks of DTS servers.
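The Drainer-to-Kafka step above can be sketched as a minimal drainer.toml sink section. The address, version, and topic name are placeholders; check the Binlog Consumer Client User Guide for the full option list:

```shell
# Minimal Drainer sink that writes binlogs to Kafka (illustrative values).
cat > drainer.toml <<'EOF'
[syncer]
db-type = "kafka"

[syncer.to]
kafka-addrs = "<kafka-host>:9092"
kafka-version = "1.0.0"
topic-name = "tidb-incr"
EOF
cat drainer.toml
```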
Use TiCDC
- Install TiCDC. Use TiUP to add a new TiCDC node or scale out an existing TiCDC node in the TiDB cluster. See Deploy and Maintain TiCDC.
- Create a changefeed that replicates incremental data from TiDB to Kafka by running the tiup cdc cli changefeed create command. See Replicate Data to Kafka.
- Verify that the TiDB database server can connect to the Kafka cluster.
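A sketch of the changefeed command, using the Canal-JSON protocol that matches the Kafka Data Source Component option later in this guide. Host names, port, and topic are placeholders, and the flag spelling varies by TiCDC version (newer releases use --server, older ones --pd), so verify it against Replicate Data to Kafka:

```shell
# Create a changefeed that streams changes to Kafka in Canal-JSON format.
# All addresses and the topic name below are placeholders.
cat > create_changefeed.sh <<'EOF'
tiup cdc cli changefeed create \
  --server=http://<cdc-host>:8300 \
  --sink-uri="kafka://<kafka-host>:9092/tidb-incr?protocol=canal-json&partition-num=1"
EOF
cat create_changefeed.sh
```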
Create a migration task
Step 1: Go to the Data Migration page
Use one of the following consoles:
DTS console
- Log on to the DTS console.
- In the left-side navigation pane, click Data Migration.
- In the upper-left corner, select the region where the migration instance will reside.
DMS console
The actual operation may vary based on the mode and layout of the DMS console. See Simple mode and Customize the layout and style of the DMS console.
- Log on to the DMS console.
- In the top navigation bar, move the pointer over Data + AI > DTS (DTS) > Data Migration.
- From the drop-down list to the right of Data Migration Tasks, select the region.
Step 2: Configure source and destination databases
Click Create Task and configure the following parameters.
Source database (TiDB)
| Parameter | Description |
|---|---|
| Task Name | A descriptive name for the DTS task. DTS generates a name automatically. No uniqueness required. |
| Select Existing Connection | If the TiDB instance is registered with DTS, select it from the list. DTS pre-fills the following fields. Otherwise, configure them manually. |
| Database Type | Select TiDB. |
| Access Method | Select the connection type based on where the TiDB database is deployed. This example uses Self-managed Database on ECS. For other connection types, complete the relevant preparations. |
| Instance Region | The region of the ECS instance hosting the TiDB database. |
| ECS Instance ID | The ID of the ECS instance hosting the TiDB database. |
| Port Number | The service port of the TiDB database. Default: 4000. |
| Database Account | The database account for the TiDB database. |
| Database Password | The password for the database account. |
| Migrate Incremental Data | Select Yes to enable incremental data migration. You must then configure the Kafka cluster parameters below. |
Kafka cluster (required when Migrate Incremental Data is Yes)
| Parameter | Description |
|---|---|
| Kafka Cluster Type | The deployment location of the Kafka cluster. This example uses Self-managed Database on ECS. If you select Express Connect, VPN Gateway, or Smart Access Gateway, also select a VPC from Connected VPC and specify Domain Name or IP. |
| Kafka Data Source Component | Select Use the default binlog format of the TiDB database (TiDB Binlog) or Use the TiCDC Canal-JSON format (TiCDC), based on your setup. |
| ECS Instance ID | The ID of the ECS instance where the Kafka cluster is deployed. |
| Port Number | The service port of the Kafka cluster. |
| Kafka Cluster Account | The username for the Kafka cluster. Leave blank if authentication is not enabled. |
| Kafka Cluster Password | The password for the Kafka cluster. Leave blank if authentication is not enabled. |
| Kafka Version | The Kafka cluster version. Select 1.0 if the version is 1.0 or later. |
| Encryption | Select Non-encrypted or SCRAM-SHA-256 based on your security requirements. |
| Topic | The topic that receives incremental data. |
Destination database (PolarDB-X 2.0)
| Parameter | Description |
|---|---|
| Select Existing Connection | If the PolarDB-X 2.0 instance is registered with DTS, select it from the list. Otherwise, configure the fields below. |
| Database Type | Select PolarDB-X 2.0. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region of the destination PolarDB-X 2.0 instance. |
| Instance ID | The ID of the destination PolarDB-X 2.0 instance. |
| Database Account | The database account for the destination instance. |
| Database Password | The password for the database account. |
Step 3: Test connectivity
In the lower part of the page, click Test Connectivity and Proceed. In the CIDR Blocks of DTS Servers dialog box, click Test Connectivity.
Make sure DTS server CIDR blocks are added to the security settings of both source and destination databases. See Add the CIDR blocks of DTS servers.
Step 4: Configure migration objects
On the Configure Objects page, configure the following settings.
Migration types
| Goal | Selection |
|---|---|
| Full migration only | Schema Migration + Full Data Migration |
| Migration with minimal downtime | Schema Migration + Full Data Migration + Incremental Data Migration |
If you skip Schema Migration, create the target database and tables in the destination before starting. Enable object name mapping in Selected Objects.
If you skip Incremental Data Migration, avoid writing to the source database during migration to maintain data consistency.
Processing mode of conflicting tables
| Option | Behavior |
|---|---|
| Precheck and Report Errors | DTS checks for tables with identical names in source and destination. The precheck fails if conflicts exist, blocking the task. To resolve conflicts, use object name mapping to rename destination tables. |
| Ignore Errors and Proceed | DTS skips the conflict check. During full data migration, conflicting records are not overwritten; the destination record is retained. During incremental data migration, conflicting records overwrite the destination record. If schemas differ, only specific columns may be migrated or the task may fail. Use with caution. |
Capitalization of object names in destination instance: controls the capitalization of database names, table names, and column names in the destination. Default: DTS default policy. See Specify the capitalization of object names.
Source Objects: select objects to migrate at the database or table level, then click the arrow icon to move them to Selected Objects.
Selected Objects:
- To rename an object or specify the destination object, right-click it and use object name mapping.
- To filter rows using a WHERE clause, right-click the table and set a filter condition.
- To remove an object, click it and then click the remove icon.
Object name mapping may cause dependent objects to fail migration.
Step 5: Configure advanced settings
Click Next: Advanced Settings and configure the following.
| Parameter | Description |
|---|---|
| Retry Time for Failed Connections | How long DTS retries after a connection failure. Range: 10–1,440 minutes. Default: 720. Set to greater than 30 minutes. If DTS reconnects within this period, the task resumes; otherwise, it fails. Note that DTS charges for the instance during retries. If multiple tasks share a source or destination database, the most recently set retry time applies. |
| Retry Time for Other Issues | How long DTS retries after DDL or DML failures. Range: 1–1,440 minutes. Default: 10. Set to greater than 10 minutes. This value must be smaller than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Migration | Throttle full data migration to reduce load on source and destination. Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Migration is selected. |
| Enable Throttling for Incremental Data Migration | Throttle incremental data migration. Configure RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s). Available only when Incremental Data Migration is selected. |
| Environment Tag | An optional tag to identify the instance. |
| Configure ETL | Enable extract, transform, and load (ETL) to transform data during migration. Select Yes to enter data processing statements. See What is ETL? and Configure ETL in a data migration or data synchronization task. |
| Monitoring and Alerting | Configure alerts for task failures or latency exceeding a threshold. Select Yes to configure the alert threshold and notification settings. See Configure monitoring and alerting. |
Step 6: Run a precheck
Click Next: Save Task Settings and Precheck.
To preview API parameters, move the pointer over the button and click Preview OpenAPI parameters before clicking through.
DTS runs a precheck before starting the task. The task can only start after passing the precheck.
If the precheck fails, click View Details next to each failed item, resolve the issues, then click Precheck Again.
If an alert is triggered: for non-ignorable alerts, resolve the issue and rerun the precheck. For ignorable alerts, click Confirm Alert Details > Ignore > OK > Precheck Again. Ignoring alerts may cause data inconsistency.
Step 7: Purchase and start the instance
- Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
- On the Purchase Instance page, configure the following parameters:

  | Parameter | Description |
  |---|---|
  | Resource Group | The resource group for the instance. Default: default resource group. See What is Resource Management? |
  | Instance Class | The instance class, which determines migration speed. See Instance classes of data migration instances. |

- Read and accept Data Transmission Service (Pay-as-you-go) Service Terms.
- Click Buy and Start, then click OK in the confirmation dialog.
Monitor the migration task
After the task starts, view progress on the Data Migration page.
- Full migration only: the task stops automatically when complete. The status changes to Completed.
- With incremental migration: the task runs continuously and does not stop automatically. The status shows Running.
Before switching your business traffic to the destination, end or release the DTS instance, or revoke write permissions for the DTS database account. This prevents a resumed instance from overwriting destination data.
What's next
- After migration is complete, verify data integrity between source and destination databases.
- Switch your business connections to the PolarDB-X 2.0 instance.
- Release or end the DTS instance to avoid unnecessary charges.