This topic describes how to use Data Transmission Service (DTS), in combination with a Kafka cluster and TiDB's Pump and Drainer components, to perform an incremental data migration. This method allows for a smooth database migration to the cloud with minimal application downtime.
Prerequisites
Before you perform an incremental data migration, you can first migrate the existing data from your self-managed TiDB database to the destination ApsaraDB RDS for MySQL instance. For more information, see Migrate full data from a self-managed TiDB database to an ApsaraDB RDS for MySQL instance.
-
This feature is available only in specific regions. The destination ApsaraDB RDS for MySQL instance must reside in one of the following regions: China (Hangzhou), China (Shanghai), China (Qingdao), China (Beijing), China (Shenzhen), China (Zhangjiakou), China (Hong Kong), Singapore, US (Silicon Valley), or US (Virginia).
-
The destination ApsaraDB RDS for MySQL instance must have more storage space than the source TiDB database uses.
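The storage prerequisite above can be checked before you start. The following is a minimal sketch, not part of DTS: the helper name, the 20% headroom, and the size query are illustrative assumptions, and sizes reported by information_schema are estimates.

```python
# Hypothetical pre-migration check. All names here are illustrative assumptions.

# Query you could run on the source TiDB database to estimate the space used by
# one database. Sizes in information_schema are estimates, not exact values.
SOURCE_SIZE_SQL = """
SELECT COALESCE(SUM(data_length + index_length), 0) AS used_bytes
FROM information_schema.tables
WHERE table_schema = %s
"""


def has_enough_storage(source_used_bytes: int, dest_free_bytes: int,
                       headroom: float = 0.2) -> bool:
    """Return True if the destination has more free space than the source uses.

    headroom is an assumed safety margin (20% by default) on top of the
    source size; adjust it to your own requirements.
    """
    required = source_used_bytes * (1 + headroom)
    return dest_free_bytes > required
```

For example, has_enough_storage(500 * 1024**3, 800 * 1024**3) compares a 500 GiB source against 800 GiB of free space on the destination.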
Background information

The binary log format and implementation mechanism of TiDB are different from those of MySQL. To perform an incremental data migration while minimizing changes to the source database, you must deploy a Kafka cluster and the Pump and Drainer components of the TiDB database.
The Pump component captures the binlogs generated by TiDB in real time and provides them to the Drainer component. Drainer then writes these binlogs to a downstream Kafka cluster. During incremental data migration, DTS retrieves this data from the Kafka cluster and applies it in real time to the destination ApsaraDB RDS for MySQL instance.
Usage notes
-
During a full data migration, DTS consumes read and write resources on the source and destination databases, increasing their load. If your databases have poor performance, low specifications, or high workloads (for example, if the source database has many slow SQL queries or tables without primary keys, or if deadlocks occur in the destination database), the increased load can strain your databases or even cause service interruptions. Perform the data migration during off-peak hours, such as when the CPU utilization of both databases is below 30%.
-
If a source table lacks a primary key or a unique constraint and contains non-unique data, duplicate data may be created in the destination database.
-
For columns with the FLOAT or DOUBLE data type, DTS reads their values by using the ROUND(COLUMN,PRECISION) function. If the precision is not explicitly defined, DTS migrates FLOAT values with a precision of 38 digits and DOUBLE values with a precision of 308 digits. Verify that these precisions meet your business requirements.
-
By default, DTS automatically creates the destination database. However, if the source database name violates ApsaraDB RDS naming conventions, you must create a compliant database in the destination instance before you configure the task.
Note: For more information about naming conventions and how to create a database, see Create a database and accounts.
-
If a data migration task fails, DTS automatically attempts to resume it. Before you switch your business workloads to the destination instance, make sure to stop or release the task. This prevents an automatically resumed task from overwriting data in the destination instance with data from the source database.
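Two of the notes above lend themselves to a quick scripted check: whether both databases are below the 30% CPU utilization given as an off-peak example, and which source tables have neither a primary key nor a unique constraint. The following is a minimal sketch. TableInfo, tables_at_risk, and is_off_peak are hypothetical names, and how you collect the table inventory and CPU figures is left to your own tooling.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical inventory record for one source table."""
    schema: str
    name: str
    has_primary_key: bool
    has_unique_constraint: bool


def tables_at_risk(tables: Iterable[TableInfo]) -> List[str]:
    """List tables that have neither a primary key nor a unique constraint.

    These are the tables for which duplicate rows may appear in the
    destination database.
    """
    return [
        f"{t.schema}.{t.name}"
        for t in tables
        if not (t.has_primary_key or t.has_unique_constraint)
    ]


def is_off_peak(source_cpu_pct: float, dest_cpu_pct: float,
                threshold_pct: float = 30.0) -> bool:
    """Return True if both databases are below the CPU utilization threshold."""
    return source_cpu_pct < threshold_pct and dest_cpu_pct < threshold_pct
```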
Billing
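| Migration type | Task configuration fee | Internet traffic fee |
| --- | --- | --- |
| Schema migration and full data migration | Free of charge. | DTS charges an Internet traffic fee when the Access Method of the destination database is set to Public IP Address. For more information, see Billing overview. |
| Incremental data migration | Charged. For more information, see Billing overview. | DTS charges an Internet traffic fee when the Access Method of the destination database is set to Public IP Address. For more information, see Billing overview. |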
Migration types
| Migration type | Description |
| --- | --- |
| Schema migration | DTS migrates the schema definitions of the selected objects to the destination database. Currently, DTS supports schema migration for databases, tables, and views. Warning: This scenario involves data migration between heterogeneous databases. During schema migration, DTS cannot guarantee a perfect one-to-one mapping of data types. Carefully evaluate the impact of data type mappings on your business. For more information, see Data type mappings between heterogeneous databases. |
| Full data migration | DTS migrates all existing data of the selected objects from the source database to the destination database. Note: During a full data migration, concurrent INSERT operations can cause table fragmentation in the destination instance. After the full migration is complete, the tablespace in the destination database may be larger than that in the source database. |
| Incremental data migration | DTS retrieves binlog data generated by TiDB from the Kafka cluster and applies the incremental updates to the destination database in real time. The following SQL operations are supported during incremental data migration: |
Preparations
To reduce the impact of network latency on the incremental data migration, deploy the Pump component, Drainer component, and Kafka cluster in the same internal network as the source database server.
-
Deploy the Pump and Drainer components. For more information, see Deploy a TiDB Binlog Cluster.
-
Modify the Drainer component's configuration file to set the output to Kafka. For more information, see Develop a custom Kafka consumer.
-
Prepare a Kafka cluster using one of the following methods:
-
Deploy a self-managed Kafka cluster. For more information, visit the official Apache Kafka website.
Warning: To ensure that the Kafka cluster can receive large binlog data from TiDB, increase the values of the message.max.bytes and replica.fetch.max.bytes parameters for the broker component, and the fetch.message.max.bytes parameter for the consumer component. For more information, see Kafka Configuration.
-
Use ApsaraMQ for Kafka. For more information, see Quick start for ApsaraMQ for Kafka.
Note: To ensure proper communication and reduce the impact of network latency, deploy the ApsaraMQ for Kafka instance in the same virtual private cloud (VPC) as the source database server.
-
-
Create a topic in the self-managed Kafka cluster or the ApsaraMQ for Kafka instance.
-
Add the DTS server CIDR blocks to your TiDB database's whitelist. For the specific CIDR blocks, see Add the CIDR blocks of DTS servers to a whitelist.
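The Drainer and Kafka preparations above can be sketched as follows. This is an illustration under stated assumptions rather than a definitive configuration: the key names (db-type, kafka-addrs, kafka-version, topic-name) follow the layout commonly shown for a Drainer configuration file that targets Kafka, so verify them against the TiDB Binlog documentation for your version. The helper names, addresses, and the consistency rules in size_limit_problems are hypothetical.

```python
from typing import List


def render_drainer_syncer(kafka_addrs: List[str], topic: str,
                          kafka_version: str) -> str:
    """Render the [syncer] part of a Drainer config file that outputs to Kafka.

    Key names are assumed from the TiDB Binlog documentation; check them
    against the version you deploy.
    """
    addrs = ",".join(kafka_addrs)
    return "\n".join([
        "[syncer]",
        'db-type = "kafka"',
        "",
        "[syncer.to]",
        f'kafka-addrs = "{addrs}"',
        f'kafka-version = "{kafka_version}"',
        f'topic-name = "{topic}"',
    ])


def size_limit_problems(message_max_bytes: int, replica_fetch_max_bytes: int,
                        fetch_message_max_bytes: int,
                        largest_binlog_bytes: int) -> List[str]:
    """Return a description of each size limit that looks too small.

    Assumed rule of thumb: the broker must accept the largest binlog message,
    and replicas and consumers must be able to fetch whatever the broker accepts.
    """
    problems = []
    if message_max_bytes < largest_binlog_bytes:
        problems.append("message.max.bytes is below the largest expected binlog message")
    if replica_fetch_max_bytes < message_max_bytes:
        problems.append("replica.fetch.max.bytes is below message.max.bytes")
    if fetch_message_max_bytes < message_max_bytes:
        problems.append("fetch.message.max.bytes is below message.max.bytes")
    return problems
```

Calling render_drainer_syncer(["192.0.2.10:9092"], "tidb-binlog", "2.4.0") returns text that you could merge into the Drainer configuration file. The address, topic name, and version in this call are placeholders.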
Procedure
-
Log on to the DTS console.
Note: If you are automatically redirected to the Data Management (DMS) console, click the icon in the lower-right corner, and then click the icon that returns you to the classic DTS console.
-
In the left-side navigation pane, click Data Migration.
-
At the top of the Migration Tasks page, select the region of the destination instance.
-
In the upper-right corner of the page, click Create Data Migration Task.
-
Configure the source and destination databases.
-
Configure the task name and the source database.

Parameter
Description
Task Name
DTS automatically generates a task name. We recommend that you specify a descriptive name for easy identification. The name does not need to be unique.
Instance Type
Select the deployment location of the source database. This topic uses User-Created Database in ECS Instance as an example.
Note: If your self-managed database is of a different instance type, you may need to complete additional preparations. For more information, see Preparations for data migration.
Instance Region
Select the region where the ECS instance that hosts the TiDB database is deployed.
Database Type
Select TiDB.
Port
Enter the service port of the TiDB database. The default value is 4000.
Database Account
Enter the account for the TiDB database. This account must have the SHOW VIEW permission and the SELECT permission on the objects to be migrated.
Database Password
Enter the password for the database account.
Important: After you enter the source database information, you can click Test Connectivity next to Database Password to verify that the information is correct. If the information is correct, the message Passed is displayed. If the message Failed is displayed, click Diagnose next to the Failed message and adjust the source database information based on the prompts.
Perform Incremental Migration
Select whether to perform an incremental data migration based on your business requirements. In this topic, Yes is selected. To perform only a full data migration, see Migrate full data from a self-managed TiDB database to an ApsaraDB RDS for MySQL instance.
Kafka Cluster Type
Select the deployment location of your Kafka cluster. This topic uses User-Created Database in ECS Instance as an example. If your self-managed Kafka cluster uses a different instance type, you must also complete the related preparations. For more information, see Preparations for data migration.
Note: DTS does not currently support directly selecting an ApsaraMQ for Kafka instance. If you are using an ApsaraMQ for Kafka instance, you must configure it as a self-managed Kafka cluster. Select User-Created Database Connected over Express Connect, VPN Gateway, or Smart Access Gateway and then select the virtual private cloud (VPC) where your ApsaraMQ for Kafka instance resides.
Instance Region
This parameter is set to the same region as the source instance and cannot be changed.
ECS Instance ID
Select the ID of the ECS instance that hosts your self-managed Kafka cluster.
Kafka Port
The service port of the self-managed Kafka cluster. The default value is 9092.
Kafka Cluster Account
Enter the username for the self-managed Kafka cluster. You can leave this blank if authentication is not enabled.
Kafka Cluster Password
Enter the password for the username. You can leave this blank if authentication is not enabled.
Topic
Click Get Topic List on the right and select a topic from the drop-down list.
Kafka Version
Select the version of your self-managed Kafka cluster.
Kafka Cluster Connection Method
Select Non-encrypted or SCRAM-SHA-256 based on your business and security requirements.
-
Configure the destination database.

Parameter
Description
Instance Type
Select RDS Instance.
Instance Region
Select the region of the destination ApsaraDB RDS for MySQL instance.
Database Account
Enter the database account for the destination ApsaraDB RDS for MySQL instance. The account must have read and write permissions on the destination database. For information about how to create an account and grant permissions, see Create an account and Modify account permissions.
Database Password
Enter the password for the database account.
Important: After you enter the destination database information, you can click Test Connectivity next to Database Password to verify that the information is correct. If the information is correct, the message Passed is displayed. If the message Failed is displayed, click Diagnose next to the Failed message and adjust the destination database information based on the prompts.
Connection Method
Select Non-encrypted or SSL-encrypted based on your requirements. If you select SSL-encrypted, you must enable SSL encryption for the ApsaraDB RDS for MySQL instance before you configure the migration task. For more information, see Configure SSL encryption.
Important: The Connection Method parameter is available only for instances in the Chinese mainland and China (Hong Kong) regions.
-
-
After you complete the configuration, click Set Whitelist and Next in the lower-right corner of the page.
If the source or destination is an Alibaba Cloud database instance (such as ApsaraDB RDS for MySQL or ApsaraDB for MongoDB), DTS automatically adds the CIDR blocks of the DTS servers in the corresponding region to the instance's whitelist. If the source or destination is a self-managed database on an ECS instance, DTS automatically adds the CIDR blocks to the security group rules of the ECS instance. You must also ensure that the self-managed database does not restrict access from the ECS instance. If the database is deployed in a cluster across multiple ECS instances, you must manually add the CIDR blocks of the DTS servers to the security group rules of each remaining ECS instance. If the source or destination is a database in an on-premises data center or another cloud, you must manually add the CIDR blocks of the DTS servers to allow access. For a list of DTS server CIDR blocks, see CIDR blocks of DTS servers.
Warning: Adding the public CIDR blocks of DTS servers, whether automatically or manually, may introduce security risks. By using this product, you acknowledge and accept these potential risks. You are responsible for implementing basic security measures, including but not limited to using strong passwords, restricting open ports, using authentication for internal API calls, regularly reviewing and restricting unnecessary network segments, or connecting through private networks such as Express Connect, VPN Gateway, or Smart Access Gateway.
-
Select the migration types and objects.

Parameter
Description
Migration types
-
To perform only a full migration, select both Schema Migration and Full Data Migration.
-
To perform a migration with minimal downtime, select Schema Migration, Full Data Migration, and Incremental Data Migration. In this topic, all three migration types are selected.
Migration Objects
In the Available box, click the objects that you want to migrate, and then click the icon to move them to the Selected Objects box.
Note:
-
You can select objects at the database, table, or column level. If you select only tables or columns, other objects such as views, triggers, and stored procedures are not migrated.
-
By default, object names in the destination database are the same as in the source database. To rename an object in the destination database, use the object name mapping feature. For more information, see Object name mapping.
-
Using the object name mapping feature can cause the migration of dependent objects to fail.
Edit Mapped Object Name
To rename the migrated objects in the destination instance, use the object name mapping feature. For more information, see Map databases, tables, and columns.
Retry Duration for Connection Failure
By default, DTS retries the connection for 12 hours. You can also specify a custom duration. If DTS reconnects to the databases within the specified duration, the task automatically resumes. Otherwise, the task fails.
Note: You are charged for the DTS instance during the retry period. Specify a retry duration that meets your business needs, or release the DTS instance as soon as possible after the source and destination instances are released.
-
After you complete the configuration, click Precheck and Start in the lower-right corner of the page.
Note:
-
Before the migration task starts, DTS runs a precheck. The task can start only after it passes the precheck.
-
If the precheck fails, click the icon next to the failed item to view details.
-
Fix the issues as prompted and run the precheck again.
-
If you do not need to fix the warning items, you can select Ignore and then click Ignore Warnings and Rerun Precheck to run the precheck again.
-
-
-
After the task passes the precheck, click Next.
-
In the Confirm Settings dialog box that appears, select an Instance Class and select the Data Transmission Service (pay-as-you-go) Service Terms checkbox.
-
Click Buy and Start to begin the migration.
-
Schema migration + Full data migration
Allow the task to complete automatically. Stopping it manually may result in incomplete data.
-
Schema migration + Full data migration + Incremental data migration
The migration task does not stop automatically. You must stop it manually.
Important: Choose an appropriate time to stop the task manually, such as during off-peak hours or when you are ready to switch your business to the destination instance.
-
Wait until the migration task enters the Incremental Data Migration phase and the status shows Undelayed. Then, stop writing data to the source database for several minutes. During this time, the status of Incremental Data Migration may show a latency.
-
Wait for the Incremental Data Migration status to show Undelayed again. Then, manually stop the migration task.
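The manual stop sequence for a task that includes incremental data migration can be sketched as a small orchestration routine. This is a hypothetical outline, not a DTS API: get_status, stop_source_writes, and stop_task are callables that you would supply yourself (for example, by wrapping console checks or your own tooling), and the 300-second settle time and 10-second polling interval are assumed values standing in for "several minutes".

```python
import time
from typing import Callable


def cut_over(get_status: Callable[[], str],
             stop_source_writes: Callable[[], None],
             stop_task: Callable[[], None],
             settle_seconds: float = 300.0,
             poll_seconds: float = 10.0,
             sleep: Callable[[float], None] = time.sleep) -> None:
    """Stop an incremental migration task once the destination has caught up.

    All callables are hypothetical hooks provided by the caller.
    """
    # 1. Wait until incremental data migration reports no latency.
    while get_status() != "Undelayed":
        sleep(poll_seconds)

    # 2. Stop writes to the source and give the pipeline time to drain.
    stop_source_writes()
    sleep(settle_seconds)

    # 3. The status may show latency during the pause; wait for it to clear.
    while get_status() != "Undelayed":
        sleep(poll_seconds)

    # 4. Only now is it safe to stop the migration task manually.
    stop_task()
```

Passing sleep as a parameter keeps the routine easy to exercise with a fake clock before you rely on it during a real switchover.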
