Use Data Transmission Service (DTS) to migrate data from a PolarDB-X 1.0 instance to an ApsaraMQ for Kafka instance. DTS supports schema migration, full data migration, and incremental data migration, so you can migrate data with minimal service disruption.
Prerequisites
Before you begin, ensure that you have:
- A PolarDB-X 1.0 instance. For more information, see Create a PolarDB-X 1.0 instance.
- A topic created in the destination ApsaraMQ for Kafka instance to receive the migrated data. For more information, see Overview.
- Available storage space in the destination ApsaraMQ for Kafka instance that is larger than the total data size in the source PolarDB-X 1.0 instance.
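The storage prerequisite above is easy to verify before you create the task. The sketch below checks free space against the source data size with a safety margin; the 20% headroom is an illustrative assumption, not a DTS requirement:

```python
def has_sufficient_storage(source_data_bytes: int,
                           destination_free_bytes: int,
                           headroom: float = 1.2) -> bool:
    """Return True if the Kafka instance has enough free space for the
    source data plus a safety margin (20% by default, an assumption)."""
    return destination_free_bytes >= source_data_bytes * headroom

GB = 1024 ** 3
# Example: 80 GB of source data against 120 GB and 90 GB of free space.
print(has_sufficient_storage(80 * GB, 120 * GB))  # True: 120 GB >= 96 GB
print(has_sufficient_storage(80 * GB, 90 * GB))   # False: 90 GB < 96 GB
```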
Limitations
Source database requirements
Hard constraints (migration fails if they are not met):
- Tables to be migrated must have PRIMARY KEY or UNIQUE constraints, and all fields must be unique. Otherwise, the destination may contain duplicate records.
- Migration from a read-only PolarDB-X 1.0 instance is not supported.
- If you select tables as migration objects and need to rename tables or columns, a single task supports up to 1,000 tables. For more than 1,000 tables, split the migration into multiple tasks or migrate the entire database instead.
Required configuration for incremental data migration:
- Binary logging must be enabled, and `binlog_row_image` must be set to `full`. If this is not configured, the precheck fails and the task cannot start.
- Binary log retention periods: If the retention period is insufficient, DTS may fail to read the binary logs, which can cause task failure or data inconsistency. After full migration completes, you can reduce the retention period to more than 24 hours.
  - Incremental migration only: retain binary logs for more than 24 hours.
  - Full + incremental migration: retain binary logs for at least 7 days.
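The retention requirements above can be expressed as a simple pre-flight check. The thresholds come from this section; everything else in the sketch is illustrative:

```python
def retention_is_sufficient(configured_hours: int,
                            migration_types: set) -> bool:
    """Check binlog retention against the DTS requirements:
    full + incremental needs at least 7 days, incremental-only
    needs more than 24 hours, and other types do not read binlogs."""
    if {"full", "incremental"} <= migration_types:
        return configured_hours >= 7 * 24
    if "incremental" in migration_types:
        return configured_hours > 24
    return True

print(retention_is_sufficient(25, {"incremental"}))          # True
print(retention_is_sufficient(72, {"full", "incremental"}))  # False: needs 168
```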
Operational restrictions during migration:
- Do not upgrade or downgrade the source instance, modify frequently updated tables, change shard keys, or run DDL operations on source objects. These actions will cause the task to fail.
- During full and incremental migration, DTS temporarily disables foreign key constraint checks and cascade operations at the session level. Cascade updates or deletes during this period may cause data inconsistency.
- If the network type of the PolarDB-X 1.0 instance changes during migration, update the network connection settings in the DTS migration task accordingly.
- For full-only migration (without incremental): do not write data to the source during migration, as this will cause inconsistency. To keep data consistent, select Schema Migration, Full Data Migration, and Incremental Data Migration together.
Other limitations
- Evaluate the impact on source and destination database performance before migrating. Run migration tasks during off-peak hours. Full data migration uses read and write resources on both databases, which may increase server load.
- Full data migration uses concurrent INSERT operations, which causes table fragmentation in the destination. After full migration, the destination tablespace is larger than the source tablespace.
- DTS attempts to resume failed tasks for up to 7 days. Before switching workloads to the destination, stop or release any failed tasks. Alternatively, run the `REVOKE` statement to revoke write permissions from the DTS accounts that access the destination. If a failed task resumes after you switch workloads, source data may overwrite destination data.
Precautions
- DTS periodically updates the `dts_health_check.ha_health_check` table in the source database to advance the binary log position.
- If the destination ApsaraMQ for Kafka instance is upgraded or downgraded during migration, restart the instance to resume migration.
Billing
| Migration type | Instance configuration fee | Internet traffic fee |
|---|---|---|
| Schema migration and full data migration | Free of charge | Free of charge |
| Incremental data migration | Charged. For more information, see Billing overview. | — |
Migration types
| Migration type | Description |
|---|---|
| Schema migration | DTS migrates object schemas from the source to the destination. |
| Full data migration | DTS migrates all existing data from the source to the destination. |
| Incremental data migration | After full data migration completes, DTS continuously migrates incremental changes from the source. This keeps the destination in sync and allows you to cut over without interrupting your application. |
SQL operations for incremental migration
| Operation type | SQL statements |
|---|---|
| DML | INSERT, UPDATE, DELETE |
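Because only these DML operations are captured during incremental migration, a downstream consumer can safely discard anything else it encounters. A minimal sketch, assuming a hypothetical record shape with an `operation` field (for illustration only):

```python
SUPPORTED_DML = {"INSERT", "UPDATE", "DELETE"}

def keep_record(record: dict) -> bool:
    """Keep only the DML operations DTS migrates incrementally."""
    return record.get("operation", "").upper() in SUPPORTED_DML

stream = [
    {"operation": "INSERT", "table": "orders"},
    {"operation": "UPDATE", "table": "orders"},
    {"operation": "TRUNCATE", "table": "orders"},  # not captured by DTS
]
print([r["table"] for r in stream if keep_record(r)])  # ['orders', 'orders']
```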
Permissions required
| Database | Schema migration | Full data migration | Incremental data migration |
|---|---|---|---|
| PolarDB-X 1.0 | SELECT | SELECT | REPLICATION SLAVE, REPLICATION CLIENT, and SELECT on the objects to be migrated. For details, see the Permissions required for an account to synchronize data section. |
| ApsaraMQ for Kafka | Read and write | Read and write | Read and write |
Data type mappings
For more information, see Data type mappings for schema synchronization.
Create a migration task
Step 1: Go to the Data Migration Tasks page
- Log on to the Data Management (DMS) console.
- In the top navigation bar, click DTS.
- In the left-side navigation pane, choose DTS (DTS) > Data Migration.
Console operations may vary based on the DMS mode and layout. For more information, see Simple mode and Customize the layout and style of the DMS console. You can also go directly to the Data Migration Tasks page.
Step 2: Select a region
From the drop-down list next to Data Migration Tasks, select the region where the migration instance resides.
In the new DTS console, select the region in the upper-left corner.
Step 3: Configure source and destination databases
Click Create Task and configure the following parameters.
After selecting the source and destination instances, read the Limits section at the top of the page before proceeding.
Source database (PolarDB-X 1.0)
| Parameter | Description |
|---|---|
| Task Name | A name for the task. DTS assigns a default name. Specify a descriptive name to make the task easy to identify. The name does not need to be unique. |
| Select an existing DMS database instance | An existing database instance to use. If you select one, DTS populates the other parameters automatically. If not, configure the parameters manually. |
| Database Type | Select PolarDB-X 1.0. |
| Connection Type | Select Alibaba Cloud Instance. |
| Instance Region | The region where the source PolarDB-X 1.0 instance resides. |
| Replicate Data Across Alibaba Cloud Accounts | Whether to migrate data across Alibaba Cloud accounts. Select No for same-account migration. |
| Instance ID | The ID of the source PolarDB-X 1.0 instance. |
| Database Account | The database account for the source instance. Grant permissions based on the data format used in the destination ApsaraMQ for Kafka instance. |
| Database Password | The password for the database account. |
Destination database (ApsaraMQ for Kafka)
| Parameter | Description |
|---|---|
| Select an existing DMS database instance | An existing database instance to use. If you select one, DTS populates the other parameters automatically. If not, configure the parameters manually. |
| Database Type | Select Kafka. |
| Connection Type | Select Express Connect, VPN Gateway, or Smart Access Gateway. |
| Instance Region | The region where the destination ApsaraMQ for Kafka instance resides. |
| Connected VPC | The virtual private cloud (VPC) ID of the destination ApsaraMQ for Kafka instance. To find the VPC ID, go to the Instance Details page in the ApsaraMQ for Kafka console and check the Configuration Information section on the Instance Information tab. |
| IP Address or Domain Name | An IP address of the destination ApsaraMQ for Kafka instance. To find an IP address, go to the Instance Details page and check the Endpoint Information section on the Instance Information tab. Copy an IP address from the Default Endpoint field. |
| Port Number | The service port of the destination instance. Default value: 9092. |
| Database Account | The database account of the destination ApsaraMQ for Kafka instance. Required only if the access control list (ACL) feature is enabled. For more information, see Grant permissions to SASL users. |
| Database Password | The password for the database account. Required only if the ACL feature is enabled. |
| Kafka Version | The version of the destination ApsaraMQ for Kafka instance. |
| Encryption | Whether to encrypt the connection. Select Non-encrypted or SCRAM-SHA-256 based on your security requirements. |
| Topic | The topic to receive migrated data. Select from the drop-down list. |
| Topic That Stores DDL Information | The topic to store DDL information. If not specified, DDL information is stored in the topic set in the Topic parameter. |
| Use Kafka Schema Registry | Whether to use Kafka Schema Registry, which provides a RESTful interface to store and retrieve Avro schemas. Select No to skip, or Yes and enter the URL or IP address registered in Kafka Schema Registry. |
Step 4: Test connectivity
Click Test Connectivity and Proceed at the bottom of the page.
For Alibaba Cloud database instances (such as ApsaraDB RDS for MySQL or ApsaraDB for MongoDB), DTS automatically adds its server CIDR blocks to the instance IP address whitelist. For self-managed databases on Elastic Compute Service (ECS) instances, DTS automatically adds the CIDR blocks to the ECS security group rules. If the self-managed database spans multiple ECS instances, manually add the CIDR blocks to each instance's security group rules. For self-managed databases in a data center or hosted by a third-party provider, manually add the DTS server CIDR blocks to the database IP address whitelist. For details, see Add the CIDR blocks of DTS servers to the security settings of on-premises databases.
Adding DTS server CIDR blocks to IP whitelists or security group rules introduces security exposure. Before using DTS, take preventive measures, including: securing your account credentials, restricting exposed ports, authenticating API calls, auditing whitelist and security group rules regularly, and using Express Connect, VPN Gateway, or Smart Access Gateway for the database connection.
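When auditing whitelist and security group rules, the standard library can tell you whether a given entry falls inside the DTS server CIDR blocks. The blocks below are placeholders; substitute the actual DTS CIDR blocks published for your region:

```python
import ipaddress

def ip_in_cidr_blocks(ip: str, cidr_blocks: list) -> bool:
    """Check whether an IP address falls inside any of the given CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(block) for block in cidr_blocks)

# Placeholder CIDR blocks for illustration; use the real DTS blocks.
dts_blocks = ["100.104.0.0/16", "10.151.0.0/16"]
print(ip_in_cidr_blocks("100.104.12.34", dts_blocks))  # True
print(ip_in_cidr_blocks("192.168.1.1", dts_blocks))    # False
```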
Step 5: Select objects and configure migration settings
Configure the following parameters:
| Parameter | Description |
|---|---|
| Migration Types | Select the migration types based on your requirements: <br>- Full migration only: select Schema Migration and Full Data Migration.<br>- Full migration with live sync: select Schema Migration, Full Data Migration, and Incremental Data Migration.<br><br>Note: Without Incremental Data Migration, do not write to the source database during migration to avoid data inconsistency. |
| Processing Mode of Conflicting Tables | - Precheck and Report Errors: checks for tables in the destination with the same names as source tables. The precheck fails if matches exist. To handle name conflicts, use object name mapping to rename destination tables.<br>- Ignore Errors and Proceed: skips the precheck for duplicate table names.<br><br>Warning: If schemas match, DTS skips records with duplicate primary keys. If schemas differ, only specific columns are migrated or the task may fail. |
| Data Format in Kafka | The format for data written to ApsaraMQ for Kafka. PolarDB-X 1.0 supports only DTS Avro; Canal JSON is not supported. DTS Avro data is parsed according to the DTS Avro schema definition. For the schema reference, see GitHub. |
| Policy for Shipping Data to Kafka Partitions | Custom partition routing is not supported for this source-destination combination. |
| Capitalization of Object Names in Destination Instance | The capitalization policy for database, table, and column names in the destination. Default is DTS default policy. For more information, see Specify the capitalization of object names in the destination instance. |
| Source Objects | Select objects from the Source Objects section and click the arrow icon to move them to Selected Objects. <br><br>Note:<br>- Selecting tables migrates only tables, not views, triggers, or stored procedures.<br>- Selecting a database migrates data based on these rules:<br>&nbsp;&nbsp;- Tables with a primary key: primary key columns become distribution keys.<br>&nbsp;&nbsp;- Tables without a primary key: DTS auto-generates an auto-increment primary key column, which may cause data inconsistency. |
| Selected Objects | Right-click an object to rename it or set WHERE filter conditions. To rename multiple objects at once, click Batch Edit. <br><br>Note: Renaming an object may cause dependent objects to fail migration. For filter conditions, see Set filter conditions. |
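Once a consumer has decoded a DTS Avro payload from the topic, it typically inspects the operation type and the object it applies to. The record shape below is purely illustrative; consult the DTS Avro schema definition on GitHub for the authoritative field names:

```python
def summarize_change(record: dict) -> str:
    """Summarize a decoded change record. The field names used here
    ("operation", "objectName") are assumptions for illustration;
    verify them against the DTS Avro schema definition."""
    return f"{record['operation']} on {record['objectName']}"

decoded = {"operation": "UPDATE", "objectName": "mydb.orders"}
print(summarize_change(decoded))  # UPDATE on mydb.orders
```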
Step 6: Configure advanced settings
Click Next: Advanced Settings and configure the following:
| Parameter | Description |
|---|---|
| Select the dedicated cluster used to schedule the task | Optional. Leave blank to use the default shared cluster. For more information, see What is a DTS dedicated cluster? |
| Set Alerts | Whether to configure alerts for task failures or high migration latency. Select No to skip, or Yes to set an alert threshold and specify alert contacts. For more information, see Configure monitoring and alerting. |
| Retry Time for Failed Connections | How long DTS retries a failed connection after the task starts. Valid values: 10–1440 minutes. Default: 720 minutes. Set this to more than 30 minutes. If DTS reconnects within this period, the task resumes; otherwise, the task fails. <br><br>Note: If multiple tasks share a source or destination database and have different retry times, the most recently set value takes effect. DTS charges for instances during retry periods. Set the retry time based on your requirements and release instances promptly when no longer needed. |
| The wait time before a retry when other issues occur in the source and destination databases | How long DTS retries failed DDL or DML operations. Valid values: 1–1440 minutes. Default: 10 minutes. Set this to more than 10 minutes. This value must be less than the Retry Time for Failed Connections value. |
| Configure ETL | Whether to enable extract, transform, and load (ETL). Select Yes to enter data processing statements, or No to skip. For more information, see What is ETL? and Configure ETL in a data migration or data synchronization task. |
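The two retry parameters above have interlocking constraints that are easy to validate before saving the task. A sketch; the bounds come from this table, and the message strings are illustrative:

```python
def validate_retry_settings(connection_retry_min: int,
                            other_retry_min: int) -> list:
    """Validate the two DTS retry settings against the documented bounds."""
    problems = []
    if not 10 <= connection_retry_min <= 1440:
        problems.append("connection retry must be within 10-1440 minutes")
    elif connection_retry_min <= 30:
        problems.append("connection retry should be more than 30 minutes")
    if not 1 <= other_retry_min <= 1440:
        problems.append("other-issue retry must be within 1-1440 minutes")
    elif other_retry_min <= 10:
        problems.append("other-issue retry should be more than 10 minutes")
    if other_retry_min >= connection_retry_min:
        problems.append("other-issue retry must be less than the connection retry")
    return problems

print(validate_retry_settings(720, 30))  # [] -- a valid configuration
print(validate_retry_settings(20, 30))   # two problems reported
```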
Step 7: Run the precheck
Click Next: Save Task Settings and Precheck.
To review the API parameters for this task configuration, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
DTS runs a precheck before the task starts. The task can proceed only after the precheck passes.
- If the precheck fails, click View Details next to each failed item, fix the issues, and run the precheck again.
- If an alert appears during the precheck:
  - For alerts that cannot be ignored: click View Details, fix the issues, and rerun the precheck.
  - For alerts that can be ignored: click Confirm Alert Details, then click Ignore > OK > Precheck Again. Ignoring alerts may result in data inconsistency.
Step 8: Purchase the migration instance
Wait until the success rate reaches 100%, then click Next: Purchase Instance.
On the Purchase Instance page, configure the following:
| Parameter | Description |
|---|---|
| Resource Group | The resource group for the migration instance. Default: default resource group. For more information, see What is Resource Management? |
| Instance Class | The instance class determines migration speed. Select based on your data volume and time requirements. For more information, see Specifications of data migration instances. |
Step 9: Start migration
Read and accept the Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start.
The migration task starts. Monitor progress in the task list.
What's next
After migration completes, verify data consistency between the source and destination before switching your application to the destination ApsaraMQ for Kafka instance. Stop or release any failed DTS tasks to prevent source data from overwriting destination data after a task resumes.