Data Transmission Service: Migrate data from a self-managed TiDB database to PolarDB for MySQL

Last Updated: Apr 10, 2025

This topic describes how to use Data Transmission Service (DTS) to migrate data from a self-managed TiDB database to a PolarDB for MySQL cluster.

Prerequisites

(Optional) preparations

You can use one of the following methods to collect incremental data changes from the TiDB database based on your business requirements.

Use TiDB Binlog

Note

To reduce the impact of network latency on data migration, make sure that the servers on which the Pump component, Drainer component, and Kafka cluster are deployed are connected to the server of the source database over the same internal network.

  1. Prepare a Kafka cluster. You can use one of the following methods:

    • Deploy a self-managed Kafka cluster. For more information, see Apache Kafka.

      Warning

      To ensure that the Kafka cluster can receive large amounts of binary log data from the TiDB database, you must increase the values of the message.max.bytes and replica.fetch.max.bytes parameters in the Broker component and the fetch.message.max.bytes parameter in the Consumer component. For more information, see Kafka configuration.

    • Use Message Queue for Apache Kafka. For more information, see Quick Start.

      Note

      To ensure normal communication and reduce the impact of network latency on incremental data migration, you must deploy the Message Queue for Apache Kafka instance in the same virtual private cloud (VPC) as the source database server.

  2. Create a topic in the self-managed Kafka cluster or Message Queue for Apache Kafka instance.

    Important

    The topic must contain only one partition to ensure that incremental data can be replicated to the partition whose ID is 0.

  3. Deploy Pump and Drainer. For more information, see TiDB Binlog Cluster Deployment.

  4. Modify the configuration file of the Drainer component and set the output to Kafka. For more information, see Kafka Custom Development.

    Note

    Make sure that the server on which the TiDB database is deployed can connect to the Kafka cluster.

  5. Add the CIDR blocks of DTS servers to a whitelist of the TiDB database. For more information, see Add the CIDR blocks of DTS servers.
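The warning in step 1 names three Kafka size parameters. The following Python snippet is a minimal sketch of a sanity check that you can run against your configuration files before you start Drainer. It assumes, as general Kafka guidance rather than a DTS requirement, that replicas and consumers can fetch the largest message that the broker accepts only if replica.fetch.max.bytes and fetch.message.max.bytes are at least as large as message.max.bytes. The function names and the sample values are hypothetical.

```python
def parse_properties(text):
    """Parse Java-style .properties text into a dict of str -> str."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without a key=value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_kafka_size_settings(broker_props, consumer_props):
    """Return a list of problems. An empty list means that no problem was found."""
    problems = []
    max_msg = int(broker_props.get("message.max.bytes", 0))
    replica_fetch = int(broker_props.get("replica.fetch.max.bytes", 0))
    consumer_fetch = int(consumer_props.get("fetch.message.max.bytes", 0))

    if max_msg <= 0:
        problems.append("message.max.bytes is not set explicitly")
    if replica_fetch < max_msg:
        problems.append("replica.fetch.max.bytes is smaller than message.max.bytes")
    if consumer_fetch < max_msg:
        problems.append("fetch.message.max.bytes is smaller than message.max.bytes")
    return problems


if __name__ == "__main__":
    # Illustrative values only. Size them for your own workload.
    broker = parse_properties(
        "message.max.bytes=1073741824\nreplica.fetch.max.bytes=1073741824\n"
    )
    consumer = parse_properties("fetch.message.max.bytes=1073741824\n")
    print(check_kafka_size_settings(broker, consumer) or "No problem found")
```

The check only compares the values with each other. It does not tell you how large the values must be for your binary log volume.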

Use TiCDC

  1. Prepare a Kafka cluster. You can use one of the following methods:

    • Deploy a self-managed Kafka cluster. For more information, see Apache Kafka.

      Warning

      To ensure that the Kafka cluster can receive large amounts of binary log data from the TiDB database, you must increase the values of the message.max.bytes and replica.fetch.max.bytes parameters in the Broker component and the fetch.message.max.bytes parameter in the Consumer component. For more information, see Kafka configuration.

    • Use Message Queue for Apache Kafka. For more information, see Quick Start.

      Note

      To ensure normal communication and reduce the impact of network latency on incremental data migration, you must deploy the Message Queue for Apache Kafka instance in the same virtual private cloud (VPC) as the source database server.

  2. Create a topic in the self-managed Kafka cluster or Message Queue for Apache Kafka instance.

    Important

    The topic must contain only one partition to ensure that incremental data can be replicated to the partition whose ID is 0.

  3. Install the TiCDC component. For more information, see Deploy TiCDC.

    Note

    We recommend that you use TiUP to add or scale out the TiCDC component on the existing TiDB cluster.

  4. Replicate incremental data to Kafka. For more information, see Replicate Data to Kafka.

    Note
    • Make sure that the server on which the TiDB database is deployed can connect to the Kafka cluster.

    • We recommend that you run the command by using TiUP. To do so, start the first line of the command with tiup cdc cli changefeed create \.
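The following Python snippet is a minimal sketch of how the changefeed command in step 4 can be assembled. It builds a Kafka sink URI that requests the Canal-JSON protocol and a single partition, which matches the one-partition topic created in step 2. The sink URI parameters (protocol, kafka-version, partition-num) and the command flags (--server, --sink-uri, --changefeed-id) follow recent TiCDC releases as an assumption. Older releases use different flags, so verify them against the documentation of your TiCDC version. The function names, addresses, topic name, and changefeed ID are hypothetical.

```python
from urllib.parse import urlencode


def build_kafka_sink_uri(broker, topic, kafka_version, max_message_bytes=None):
    """Build a TiCDC Kafka sink URI for a single-partition Canal-JSON topic."""
    params = {
        "protocol": "canal-json",        # Matches the TiCDC Canal-JSON option in DTS.
        "kafka-version": kafka_version,
        "partition-num": 1,              # DTS reads only the partition whose ID is 0.
    }
    if max_message_bytes is not None:
        params["max-message-bytes"] = max_message_bytes
    return f"kafka://{broker}/{topic}?{urlencode(params)}"


def build_changefeed_command(server, sink_uri, changefeed_id):
    """Return a multi-line shell command that starts with tiup cdc cli changefeed create."""
    return " \\\n".join([
        "tiup cdc cli changefeed create",
        f'    --server="{server}"',
        f'    --sink-uri="{sink_uri}"',
        f'    --changefeed-id="{changefeed_id}"',
    ])


if __name__ == "__main__":
    uri = build_kafka_sink_uri("127.0.0.1:9092", "tidb-changes", "2.4.0")
    print(build_changefeed_command("http://127.0.0.1:8300", uri, "dts-feed"))
```

The sketch only prints the command. Review the output before you run it on a server that can reach both the TiDB cluster and the Kafka cluster.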

Considerations

Type

Description

Limits on the source database

  • Bandwidth requirements: The server to which the source database belongs must have sufficient egress bandwidth. Otherwise, the data migration speed is affected.

  • The tables to be migrated must have PRIMARY KEY or UNIQUE constraints, and all fields must be unique. Otherwise, the destination database may contain duplicate data records.

  • If you select tables as the objects to migrate and you need to edit the tables (such as renaming table columns), up to 1,000 tables can be migrated in a single data migration task. If you need to migrate more than 1,000 tables, we recommend that you split these tables into multiple data migration tasks or configure a task to migrate the entire database. Otherwise, an error may occur after you submit the task.

  • If you need to migrate incremental data, you must deploy a Kafka cluster and related components of the TiDB database to collect incremental data changes from the TiDB database.

  • During schema migration and full data migration, do not perform DDL operations to change the schemas of databases or tables. Otherwise, the data migration task fails.

  • The metadata of the TiDB database does not store the length of prefix indexes. After data is migrated to the destination database, the length of prefix indexes is lost. This may cause the instance to fail to run. If the tables to be migrated contain prefix indexes, you must manually fix the length of the prefix indexes.

Other limits

  • During incremental data migration, DTS supports obtaining data only from the partition whose ID is 0 in the destination topic.

  • If the task includes incremental data migration, you must promptly change data or insert test data in the source database after the task is created. This updates the position information of the instance. Otherwise, the instance may fail due to excessive latency.

  • If the data to be migrated contains four-byte characters, such as rare characters or emojis, the destination databases and tables that receive the data must use the utf8mb4 character set.

    Note

    If you use the schema migration feature of DTS, set the character_set_server instance parameter of the destination database to utf8mb4.

  • During full data migration, DTS consumes read and write resources of both the source and destination databases, which may increase the database load. Before you migrate data, evaluate the performance of the source and destination databases. We recommend that you migrate data during off-peak hours, for example, when the CPU load of the source and destination databases is below 30%.

  • During full data migration, concurrent INSERT operations cause fragmentation in the tables of the destination database. After full data migration is complete, the size of the tablespace used by the destination database is larger than that of the source database.

  • If data is written to the destination database during data migration, data inconsistency may occur between the source and destination databases.

  • For columns of the FLOAT or DOUBLE data type, DTS uses the ROUND(COLUMN,PRECISION) function to read the values. If no precision is specified, DTS sets the precision for the FLOAT data type to 38 digits and the precision for the DOUBLE data type to 308 digits. You must check whether the precision settings meet your business requirements.

  • DTS attempts to automatically resume data migration tasks that failed within the last seven days. Before you switch your workloads to the destination instance, stop or release the data migration task. You can also execute the REVOKE statement to revoke the write permissions from the account that is used by DTS to access the destination instance. This prevents the data migration task from being automatically resumed, which ensures that the data in the destination instance is consistent with the data in the source database.

  • If DDL statements fail to be executed in the destination database, the DTS task continues to run. You can view the DDL statements that fail to be executed in task logs. For more information about how to view task logs, see View task logs.

  • If a DTS task fails to run, DTS technical support will try to restore the task within 8 hours. During the restoration, the task may be restarted, and the parameters of the task may be modified.

    Note

    Only the parameters of the task may be modified. The parameters of databases are not modified. The parameters that may be modified include but are not limited to the parameters in the "Modify instance parameters" section of the Modify the parameters of a DTS instance topic.
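The limit of 1,000 tables per task, which applies when you select tables as the objects to migrate and edit them, can be handled by splitting the table list into batches. The following Python snippet is a minimal sketch of that split. The function name and the sample table names are hypothetical, and the sketch does not call any DTS API.

```python
MAX_TABLES_PER_TASK = 1000  # Limit stated in the "Limits on the source database" list.


def split_into_tasks(tables, limit=MAX_TABLES_PER_TASK):
    """Split a list of table names into batches of at most `limit` tables."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [tables[i:i + limit] for i in range(0, len(tables), limit)]


if __name__ == "__main__":
    tables = [f"orders_{i}" for i in range(2500)]
    for number, batch in enumerate(split_into_tasks(tables), start=1):
        print(f"Task {number}: {len(batch)} tables")
```

Each batch then corresponds to one data migration task. If the number of batches becomes large, configuring a task to migrate the entire database may be simpler.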

Billing

Migration type

Instance configuration fee

Internet traffic fee

Schema migration and full data migration

Free of charge.

When the Access Method parameter of the destination database is set to Public IP Address, you are charged for Internet traffic. For more information, see Billing overview.

Incremental data migration

Charged. For more information, see Billing overview.

SQL operations that can be incrementally migrated

Operation type | SQL statement
DML | INSERT, UPDATE, DELETE
DDL | CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE, ADD COLUMN, DROP COLUMN

Permissions required for database accounts

Database | Permission requirement | Method to create and authorize an account
TiDB database | The SHOW VIEW permission and the SELECT permission on the objects to be migrated | Privilege Management
PolarDB for MySQL cluster | Read and write permissions on the destination database | Create and manage database accounts
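The following Python snippet is a minimal sketch that generates the GRANT statements for the TiDB account, based on the SELECT and SHOW VIEW permissions listed above. It assumes that the account already exists and that permissions are granted at the database level, which is broader than "the objects to be migrated" if you migrate only some tables. The function name, account name, host, and database name are hypothetical.

```python
import re

# Accept only simple names so that the generated SQL cannot be broken by quotes.
_SIMPLE_NAME = re.compile(r"[A-Za-z0-9_]+")


def tidb_grant_statements(account, host, databases):
    """Return one GRANT statement per database for the DTS source account."""
    if not _SIMPLE_NAME.fullmatch(account):
        raise ValueError(f"unsupported account name: {account!r}")
    if "'" in host or "\\" in host:
        raise ValueError(f"unsupported host: {host!r}")
    statements = []
    for db in databases:
        if not _SIMPLE_NAME.fullmatch(db):
            raise ValueError(f"unsupported database name: {db!r}")
        statements.append(
            f"GRANT SELECT, SHOW VIEW ON `{db}`.* TO '{account}'@'{host}';"
        )
    return statements


if __name__ == "__main__":
    for statement in tidb_grant_statements("dts_reader", "%", ["shop"]):
        print(statement)
```

Run the generated statements in the TiDB database with an account that is allowed to grant permissions. For the destination cluster, follow Create and manage database accounts.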

Procedure

  1. Use one of the following methods to go to the Data Migration page and select the region in which the data migration instance resides.

    DTS console

    1. Log on to the DTS console.

    2. In the left-side navigation pane, click Data Migration.

    3. In the upper-left corner of the page, select the region in which the data migration instance resides.

    DMS console

    Note

    The actual operation may vary based on the mode and layout of the DMS console. For more information, see Simple mode and Customize the layout and style of the DMS console.

    1. Log on to the DMS console.

    2. In the top navigation bar, move the pointer over Data + AI > DTS (DTS) > Data Migration.

    3. From the drop-down list to the right of Data Migration Tasks, select the region in which the data migration instance resides.

  2. Click Create Task to go to the task configuration page.

  3. Configure the source and destination databases. The following table describes the parameters.

    Category

    Parameter

    Description

    N/A

    Task Name

    The name of the DTS task. DTS automatically generates a task name. We recommend that you specify a descriptive name that makes it easy to identify the task. You do not need to specify a unique task name.

    Source Database

    Select Existing Connection

    • If you use a database instance that is registered with DTS, select the instance from the drop-down list. DTS automatically populates the following database parameters for the instance. For more information, see Manage database connections.

      Note

      In the DMS console, you can select the database instance from the Select a DMS database instance drop-down list.

    • If the instance is not registered with DTS, or you do not want to use the registered instance, you must configure the following database information.

    Database Type

    Select TiDB.

    Access Method

    Select a connection type based on the deployment location of the TiDB database. In this example, Self-managed Database on ECS is selected.

    Note

    If the self-managed database uses other connection types, you must perform the corresponding preparations.

    Instance Region

    Select the region where the TiDB database is located.

    ECS Instance ID

    Select the ID of the Elastic Compute Service (ECS) instance to which the TiDB database belongs.

    Port Number

    Enter the service port number of the TiDB database. The default port number is 4000.

    Database Account

    Enter the database account of the TiDB database.

    Database Password

    The password that is used to access the database.

    Migrate Incremental Data

    Specify whether to migrate incremental data from the TiDB database.

    Note

    If you need to migrate incremental data from the TiDB database, select Yes and enter the Kafka cluster information. For more information, see Kafka cluster information.

    Destination Database

    Select Existing Connection

    • If you use a database instance that is registered with DTS, select the instance from the drop-down list. DTS automatically populates the following database parameters for the instance. For more information, see Manage database connections.

      Note

      In the DMS console, you can select the database instance from the Select a DMS database instance drop-down list.

    • If the instance is not registered with DTS, or you do not want to use the registered instance, you must configure the following database information.

    Database Type

    Select PolarDB for MySQL.

    Access Method

    Select Alibaba Cloud Instance.

    Instance Region

    Select the region where the destination PolarDB for MySQL cluster is located.

    Replicate Data Across Alibaba Cloud Accounts

    In this example, a database instance of the current Alibaba Cloud account is used. Select No.

    PolarDB Cluster ID

    Select the ID of the destination PolarDB for MySQL cluster.

    Database Account

    Enter the database account of the destination PolarDB for MySQL cluster.

    Database Password

    The password that is used to access the database instance.

    Encryption

    Specifies whether to encrypt the connection to the destination PolarDB for MySQL cluster. You can set this parameter based on your business requirements. For more information about the SSL encryption feature, see Configure SSL encryption.

  4. In the lower part of the page, click Test Connectivity and Proceed, and then click Test Connectivity in the CIDR Blocks of DTS Servers dialog box that appears.

    Note

    Make sure that the CIDR blocks of DTS servers can be automatically or manually added to the security settings of the source and destination databases to allow access from DTS servers. For more information, see Add the CIDR blocks of DTS servers.

  5. Configure the objects to be migrated.

    1. On the Configure Objects page, configure the objects that you want to migrate.

      Parameter

      Description

      Migration Types

      • To perform only full data migration, select Schema Migration and Full Data Migration.

      • To ensure service continuity during data migration, select Schema Migration, Full Data Migration, and Incremental Data Migration.

      Note
      • If you do not select Schema Migration, make sure a database and a table are created in the destination database to receive data and the object name mapping feature is enabled in Selected Objects.

      • If you do not select Incremental Data Migration, we recommend that you do not write data to the source database during data migration. This ensures data consistency between the source and destination databases.

      Processing Mode of Conflicting Tables

      • Precheck and Report Errors: checks whether the destination database contains tables that use the same names as tables in the source database. If the source and destination databases do not contain tables that have identical table names, the precheck is passed. Otherwise, an error is returned during the precheck and the data migration task cannot be started.

        Note

        If the source and destination databases contain tables with identical names and the tables in the destination database cannot be deleted or renamed, you can use the object name mapping feature to rename the tables that are migrated to the destination database. For more information, see Map object names.

      • Ignore Errors and Proceed: skips the precheck for identical table names in the source and destination databases.

        Warning

        If you select Ignore Errors and Proceed, data inconsistency may occur and your business may be exposed to the following potential risks:

        • If the source and destination databases have the same schema, and a data record has the same primary key as an existing data record in the destination database, the following scenarios may occur:

          • During full data migration, DTS does not migrate the data record to the destination database. The existing data record in the destination database is retained.

          • During incremental data migration, DTS migrates the data record to the destination database. The existing data record in the destination database is overwritten.

        • If the source and destination databases have different schemas, only specific columns are migrated or the data migration task fails. Proceed with caution.

      Capitalization of Object Names in Destination Instance

      The capitalization of database names, table names, and column names in the destination instance. By default, DTS default policy is selected. You can select other options to make sure that the capitalization of object names is consistent with that of the source or destination database. For more information, see Specify the capitalization of object names in the destination instance.

      Source Objects

      Select one or more objects from the Source Objects section. Click the right arrow icon to add the objects to the Selected Objects section.

      Note

      You can select tables or databases as the objects that you want to migrate.

      Selected Objects

      • To rename an object that you want to migrate to the destination instance, right-click the object in the Selected Objects section. For more information, see Map the name of a single object.

      • To rename multiple objects at a time, click Batch Edit in the upper-right corner of the Selected Objects section. For more information, see Map multiple object names at a time.

      Note
      • To configure a WHERE condition to filter data, right-click the table that you want to migrate in the Selected Objects section. In the dialog box that appears, configure a filter condition.

      • If you use the object name mapping feature to rename an object, other objects that are dependent on the object may fail to be migrated.

    2. Click Next: Advanced Settings to configure advanced settings.

      Parameter

      Description

      Select the engine type of the destination database

      The engine type of the destination database. Select an engine type based on your business requirements. Valid values:

      • InnoDB: the default storage engine.

      • X-Engine: a storage engine for online transaction processing (OLTP) databases.

      Retry Time for Failed Connections

      The retry time range for failed connections. If the source or destination database fails to be connected after the data migration task is started, DTS immediately retries a connection within the retry time range. Valid values: 10 to 1,440. Unit: minutes. Default value: 720. We recommend that you set the parameter to a value greater than 30. If DTS is reconnected to the source and destination databases within the specified retry time range, DTS resumes the data migration task. Otherwise, the data migration task fails.

      Note
      • If you specify different retry time ranges for multiple data migration tasks that share the same source or destination database, the value that is specified later takes precedence.

      • When DTS retries a connection, you are charged for the DTS instance. We recommend that you specify the retry time range based on your business requirements. You can also release the DTS instance at the earliest opportunity after the source database and destination instance are released.

      Retry Time for Other Issues

      The retry time range for other issues. For example, if DDL or DML operations fail to be performed after the data migration task is started, DTS immediately retries the operations within the retry time range. Valid values: 1 to 1,440. Unit: minutes. Default value: 10. We recommend that you set the parameter to a value greater than 10. If the failed operations are successfully performed within the specified retry time range, DTS resumes the data migration task. Otherwise, the data migration task fails.

      Important

      The value of the Retry Time for Other Issues parameter must be smaller than the value of the Retry Time for Failed Connections parameter.

      Enable Throttling for Full Data Migration

      Specifies whether to enable throttling for full data migration. During full data migration, DTS uses the read and write resources of the source and destination databases. This may increase the loads of the database servers. You can enable throttling for full data migration based on your business requirements. To configure throttling, you must configure the Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s) parameters. This reduces the loads of the destination database server.

      Note

      You can configure this parameter only if you select Full Data Migration for the Migration Types parameter.

      Enable Throttling for Incremental Data Migration

      Specifies whether to enable throttling for incremental data migration. To configure throttling, you must configure the RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s) parameters. This reduces the loads of the destination database server.

      Note

      You can configure this parameter only if you select Incremental Data Migration for the Migration Types parameter.

      Environment Tag

      The environment tag that identifies the data migration instance. You can select an environment tag based on your business requirements. In this example, you do not need to configure this parameter.

      Configure ETL

      Specifies whether to enable the extract, transform, and load (ETL) feature. For more information, see What is ETL? Valid values:

      Monitoring and Alerting

      Specifies whether to configure alerting for the data migration task. If the task fails or the migration latency exceeds the specified threshold, the alert contacts receive notifications. Valid values:

    3. Click Next Step: Data Verification to configure the data verification task.

      For more information about how to use the data verification feature, see Configure a data verification task.

  6. Save the task settings and run a precheck.

    • To view the parameters to be specified when you call the relevant API operation to configure the DTS task, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.

    • If you do not need to view or have viewed the parameters, click Next: Save Task Settings and Precheck in the lower part of the page.

    Note
    • Before you can start the data migration task, DTS performs a precheck. You can start the data migration task only after the task passes the precheck.

    • If the task fails to pass the precheck, click View Details next to each failed item. After you analyze the causes based on the check results, troubleshoot the issues. Then, run a precheck again.

    • If an alert is triggered for an item during the precheck:

      • If an alert item cannot be ignored, click View Details next to the failed item and troubleshoot the issues. Then, run a precheck again.

      • If the alert item can be ignored, click Confirm Alert Details. In the View Details dialog box, click Ignore. In the message that appears, click OK. Then, click Precheck Again to run a precheck again. If you ignore the alert item, data inconsistency may occur, and your business may be exposed to potential risks.

  7. Purchase an instance.

    1. Wait until Success Rate becomes 100%. Then, click Next: Purchase Instance.

    2. On the Purchase Instance page, configure the Instance Class parameter for the data migration instance. The following table describes the parameters.

      Section

      Parameter

      Description

      New Instance Class

      Resource Group

      The resource group to which the data migration instance belongs. Default value: default resource group. For more information, see What is Resource Management?

      Instance Class

      DTS provides instance classes that vary in the migration speed. You can select an instance class based on your business scenario. For more information, see Instance classes of data migration instances.

    3. Read and agree to Data Transmission Service (Pay-as-you-go) Service Terms by selecting the check box.

    4. Click Buy and Start. In the message that appears, click OK.

      You can view the progress of the task on the Data Migration page.
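The two retry parameters in the advanced settings are subject to three constraints: Retry Time for Failed Connections accepts 10 to 1,440 minutes, Retry Time for Other Issues accepts 1 to 1,440 minutes, and the second value must be smaller than the first. The following Python snippet is a minimal sketch of a check that you can run on planned values before you enter them in the console. The function and argument names are hypothetical and are not DTS API parameters.

```python
def validate_retry_settings(failed_connections_min, other_issues_min):
    """Return a list of violated constraints. An empty list means the values are valid."""
    errors = []
    if not 10 <= failed_connections_min <= 1440:
        errors.append("Retry Time for Failed Connections must be 10 to 1,440 minutes")
    if not 1 <= other_issues_min <= 1440:
        errors.append("Retry Time for Other Issues must be 1 to 1,440 minutes")
    if other_issues_min >= failed_connections_min:
        errors.append(
            "Retry Time for Other Issues must be smaller than "
            "Retry Time for Failed Connections"
        )
    return errors


if __name__ == "__main__":
    # 720 and 10 are the default values described in the advanced settings.
    print(validate_retry_settings(720, 10) or "Settings are valid")
    print(validate_retry_settings(30, 30))
```

The sketch checks only the documented ranges and the ordering rule. It does not check the recommendations to use values greater than 30 and greater than 10.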

Kafka cluster information

Parameter

Description

Kafka Cluster Type

Select a connection type based on the deployment location of the Kafka cluster. In this example, Self-managed Database on ECS is selected.

Note

If you select Express Connect, VPN Gateway, or Smart Access Gateway, you must also select Connected VPC and enter Domain Name or IP.

Kafka Data Source Component

Based on the preparations that you made, select Use the default binlog format of the TiDB database or Use the TiCDC Canal-JSON format.

ECS Instance ID

Select the ID of the ECS instance to which the Kafka cluster belongs.

Port Number

Enter the service port number of the Kafka cluster.

Kafka Cluster Account

Enter the username and password of the Kafka cluster. If the Kafka cluster does not have authentication enabled, you do not need to enter the username and password.

Kafka Cluster Password

Kafka Version

Select the version of the Kafka cluster.

Note

If the version of the Kafka cluster is 1.0 or later, select 1.0.

Encryption

Select Non-encrypted or SCRAM-SHA-256 based on your business and security requirements.

Topic

Select the topic that contains the incremental data.