Data Transmission Service (DTS) lets you migrate data from a PolarDB for MySQL cluster to an ApsaraDB for ClickHouse cluster, enabling centralized analytics on your operational data.
Minimum steps to complete this migration:
Meet the prerequisites (ClickHouse cluster version, binary logging if needed).
Grant the required permissions on both the source and destination databases.
Create and configure a DTS migration task.
Run the precheck, purchase an instance, and start the migration.
Prerequisites
Before you begin, make sure you have:
An ApsaraDB for ClickHouse cluster running version 20.8 or later. See Create a cluster.
A ClickHouse cluster with storage space larger than the used storage space of the source PolarDB for MySQL cluster.
For incremental data migration: binary logging enabled on the source cluster, with the loose_polar_log_bin parameter set to on. See Enable binary logging and Modify parameters.
Enabling binary logging on a PolarDB for MySQL cluster incurs storage charges for the binary log files. Set the binary log retention period to at least 3 days (7 days recommended) to prevent DTS from losing access to the logs. See Modify the retention period.
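The loose_polar_log_bin parameter itself is set in the console, but as a quick sanity check you can also confirm from a SQL session that binary logging is active on the source cluster. This is a hedged sketch using standard MySQL syntax; the exact variable exposed may differ by PolarDB version:

```sql
-- Run on the source PolarDB for MySQL cluster.
-- If binary logging is enabled, the Value column should read ON.
SHOW VARIABLES LIKE 'log_bin';
```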
Billing
| Migration type | Link setup fee | Data transfer cost |
|---|---|---|
| Schema migration + full data migration | Free | Free |
| Incremental data migration | Charged | See Billing overview |
Permissions required
Grant the following permissions before configuring the migration task.
| Database | Required permissions | How to grant |
|---|---|---|
| Source PolarDB for MySQL cluster | Read permissions on the migration objects | Create and manage a database account and Manage the password of a database account |
| Destination ApsaraDB for ClickHouse cluster (v22.8 or later) | Read and write permissions — a privileged account meets this requirement | Account management for Community-compatible Edition clusters |
| Destination ApsaraDB for ClickHouse cluster (v21.8) | Read, Write, and Set Permissions; and Enable DDL | Same as above |
Limitations
Source database limits
The source database server must have sufficient outbound bandwidth; low bandwidth reduces migration speed.
A single migration task supports a maximum of 1,000 tables when migrating at the table level with object name mapping. If you exceed this limit, split the tables across multiple tasks or migrate at the database level.
During schema migration and full data migration, do not run DDL operations on the source database — this causes the migration task to fail.
During full data migration, DTS queries the source database and may create a metadata lock that temporarily blocks DDL operations.
If you run only full data migration (no incremental), do not write new data to the source database during migration, or data inconsistency will occur.
Other limits
Read-only nodes of the source PolarDB for MySQL cluster cannot be migrated.
Object Storage Service (OSS) external tables are not migrated.
The following object types are not supported: INDEX, PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, and FOREIGN KEY (FK).
Primary/standby switchover is not supported during full data migration. If a switchover occurs, reconfigure the migration task.
The RENAME TABLE operation cannot be migrated.
The maximum number of databases to migrate is 256 (ApsaraDB for ClickHouse limit).
Database, table, and column names must comply with ApsaraDB for ClickHouse naming conventions.
The Partition Key cannot be a nullable field. Partition keys support only BIGINT, INT, TIMESTAMP, DATETIME, and DATE data types.
Time-type data in ApsaraDB for ClickHouse has specific range limits. See Time data type ranges for details.
When schema migration is enabled, DTS automatically adds the _sign, _is_deleted, and _version fields to destination tables. If you skip schema migration, create the destination tables manually with these fields. See Destination table requirements.
During DTS migration, do not write data from other sources to the destination database, because this causes data inconsistency.
If a DTS instance fails, the DTS team attempts to recover it within 8 hours. Recovery operations may include restarting the instance and adjusting DTS instance parameters (database parameters are never modified).
Before migration, evaluate the performance of both databases and run the migration during off-peak hours to reduce load.
If you are migrating specific tables (not an entire database), do not use pt-online-schema-change for online DDL changes on those tables — this causes migration failure. Use Data Management (DMS) instead. See Change schemas without locking tables.
Online DDL limits
| Tool or mode | Behavior |
|---|---|
| DMS or gh-ost (online DDL changes at the source) | DTS migrates the original DDL statement to the destination without migrating temporary table data, which may cause table locks at the destination. |
| pt-online-schema-change | Not supported. Data may be lost at the destination or the migration instance may fail. |
| Temporary table mode online DDL (including multi-table merge) | Data may be lost at the destination or the migration instance may fail. |
| Non-standard MySQL DDL syntax in source | Migration may fail or data may be lost. |
Supported SQL operations for incremental migration
| Operation type | Supported SQL statements | Unsupported SQL statements |
|---|---|---|
| DML | INSERT, UPDATE, DELETE | — |
| DDL | CREATE TABLE, TRUNCATE TABLE, DROP TABLE, ADD COLUMN, MODIFY COLUMN, DROP COLUMN | RENAME TABLE; all DDL related to INDEX, PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, and FOREIGN KEY |
Data type mappings
PolarDB for MySQL and ApsaraDB for ClickHouse support different data types, so a one-to-one mapping is not always possible. During schema migration, DTS maps source data types to the closest supported types in ClickHouse. See Data type mappings between heterogeneous databases for the full mapping table.
Create a migration task
Step 1: Go to the Data Migration page
Use one of the following consoles.
DTS console
Log on to the DTS console.
In the left-side navigation pane, click Data Migration.
In the upper-left corner, select the region where the migration instance will reside.
DMS console
The exact steps may vary based on your DMS console mode and layout. See Simple mode and Customize the layout and style of the DMS console.
Log on to the DMS console.
In the top navigation bar, move the pointer over Data + AI > DTS (DTS) > Data Migration.
From the drop-down list next to Data Migration Tasks, select the region where the migration instance will reside.
Step 2: Configure source and destination databases
Click Create Task, then fill in the following parameters.
| Category | Parameter | Description |
|---|---|---|
| — | Task Name | A name for the DTS task. DTS generates one automatically. Specify a descriptive name to make the task easy to identify. The name does not need to be unique. |
| Source Database | Select Existing Connection | Select a registered database instance from the drop-down list — DTS populates the remaining parameters automatically. If the instance is not registered, fill in the parameters manually. In the DMS console, use the Select a DMS database instance drop-down. |
| | Database Type | Select PolarDB for MySQL. |
| | Access Method | Select Alibaba Cloud Instance. |
| | Instance Region | Select the region where the source PolarDB for MySQL cluster resides. |
| | Replicate Data Across Alibaba Cloud Accounts | Select No for same-account migration. |
| | PolarDB Cluster ID | Select the ID of the source PolarDB for MySQL cluster. |
| | Database Account | Enter the database account. See Permissions required. |
| | Database Password | Enter the password for the database account. |
| | Encryption | Select an encryption method as needed. For Secure Sockets Layer (SSL) encryption, see Set SSL encryption. |
| Destination Database | Select Existing Connection | Same as source — select a registered instance or fill in parameters manually. |
| | Database Type | Select ClickHouse. |
| | Access Method | Select Alibaba Cloud Instance. |
| | Instance Region | Select the region where the destination ApsaraDB for ClickHouse cluster resides. |
| | Replicate Data Across Alibaba Cloud Accounts | Select No for same-account migration. |
| | Cluster Type | Select the ApsaraDB for ClickHouse cluster type. |
| | Cluster ID | Select the ID of the destination ApsaraDB for ClickHouse cluster. |
| | Database Account | Enter the database account. See Permissions required. |
| | Database Password | Enter the password for the database account. |
Click Test Connectivity and Proceed.
Make sure the CIDR blocks of DTS servers are added to the security settings (whitelist) of both databases. See Add DTS server IP addresses to a whitelist.
Step 3: Configure objects to migrate
On the Configure Objects page, set the following parameters.
| Parameter | Description |
|---|---|
| Migration Types | Select the migration types based on your goal: <br>• Schema Migration + Full Data Migration: one-time data copy, no ongoing replication. <br>• Schema Migration + Full Data Migration + Incremental Data Migration: continuous replication that keeps the destination in sync while your source stays live. <br><br> Note If you skip Schema Migration, create the destination tables manually before starting. If you skip Incremental Data Migration, avoid writing to the source database during migration. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (default): fails the precheck if destination tables share names with source tables. Use object name mapping to rename tables if needed. <br><br>Ignore Errors and Proceed: skips the check. During full data migration, conflicting records in the destination are retained; during incremental migration, they are overwritten. If schemas differ, only matching columns are migrated. Use with caution. |
| Capitalization of Object Names in Destination Instance | Controls the capitalization of database, table, and column names in the destination. Default is DTS default policy. See Specify the capitalization of object names in the destination instance. |
| Source Objects | Select databases or tables from the Source Objects section, then click the arrow icon to add them to Selected Objects. |
| Selected Objects | Right-click an object to rename it. Click Batch Edit to rename multiple objects at once. Right-click a table to set filter conditions for data. See Map object names and Set filter conditions. |
Step 4: Configure advanced settings
Click Next: Advanced Settings and configure the following.
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS schedules the task to the shared cluster. Purchase a dedicated cluster to improve stability. See What is a DTS dedicated cluster. |
| Time zone of destination database | The time zone for date and time data (such as DateTime) written to the destination ClickHouse cluster. |
| Retry Time for Failed Connections | How long DTS retries a failed connection before failing the task. Range: 10–1,440 minutes. Default: 720 minutes. Set to a value greater than 30 minutes. <br><br> Note If multiple tasks share the same source or destination database, the value set most recently takes effect. DTS charges for the instance during retry. |
| Retry Time for Other Issues | How long DTS retries failed DDL or DML operations before failing the task. Range: 1–1,440 minutes. Default: 10 minutes. Set to a value greater than 10 minutes. Must be less than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Migration | Limits the read/write load on the source and destination during full data migration. Configure Queries per second (QPS) to the source database, rows per second (RPS) of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Migration is selected. |
| Enable Throttling for Incremental Data Migration | Limits load during incremental data migration. Configure RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s). Available only when Incremental Data Migration is selected. |
| Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Controls whether DTS writes SQL operations on heartbeat tables to the source database while the instance runs. Yes: DTS does not write heartbeat SQL operations, but the task may then be displayed as having latency. No: DTS writes heartbeat SQL operations, which may affect features such as physical backup and cloning of the source database. |
| Environment Tag | An optional tag to identify the environment (for example, production or test). |
| Configure ETL | Enables the extract, transform, and load (ETL) feature. Yes: enter data processing statements in the code editor. See Configure ETL in a data migration or data synchronization task. No: skip ETL configuration. |
| Monitoring and Alerting | Configures alerts for task failures or latency exceeding a threshold. Yes: set the alert threshold and notification contacts. See Configure monitoring and alerting. No: no alerts. |
Step 5: Configure table fields for ClickHouse
Click Next: Configure Database and Table Fields and configure the Type, Primary Key Column, Sort Key, Distribution Key, and Partition Key for each migrated table.
DTS provides default configurations. To review or modify them, set Definition Status to All.
Key rules:
Primary Key Column and Sort Key support composite keys — select multiple fields from the drop-down lists.
Select one or more columns from the Primary Key Column to serve as the Partition Key.
Distribution Key supports only a single field.
Leave Partition Key blank if not needed, but do not select a nullable field — this causes the migration task to fail.
Partition keys support only BIGINT, INT, TIMESTAMP, DATETIME, and DATE. See Partition key calculation logic.
For more information about primary keys, sort keys, and partition keys, see CREATE TABLE.
Step 6: Run the precheck and start the migration
Click Next: Save Task Settings and Precheck. To preview the API parameters for this task before proceeding, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
DTS runs a precheck before starting the migration. If the precheck fails:
Click View Details next to each failed item, fix the issue, then click Precheck Again.
If an item shows a warning (not a failure) and you can accept the risk, click Confirm Alert Details > Ignore > OK, then click Precheck Again.
Important: Ignoring a precheck alert may result in data inconsistency.
Step 7: Purchase the instance and start migration
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
On the Purchase Instance page, configure the following parameters.
| Section | Parameter | Description |
|---|---|---|
| New Instance Class | Resource Group | The resource group for the migration instance. Default: default resource group. See What is Resource Management? |
| New Instance Class | Instance Class | The instance class determines migration speed. See Instance classes of data migration instances. |
Read and accept the Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start > OK.
View migration progress on the Data Migration page.
If the task does not include incremental data migration, it stops automatically when complete. The Status column shows Completed.
If the task includes incremental data migration, it runs continuously and never stops automatically. The Status column shows Running.
Query migrated data
After migration, run the following query to retrieve data from the destination ClickHouse cluster:
```sql
SELECT * FROM <table_name> FINAL WHERE _sign > 0;
```

The FINAL modifier deduplicates rows that share the same sort key, so you see the latest version of each row. The WHERE _sign > 0 clause filters out deleted rows.
Without FINAL, queries may return duplicate or deleted records.
Appendix
Destination table requirements
If you skip schema migration and create destination tables manually, the tables must meet the following requirements.
If a destination table includes the ENGINE parameter, set it to ENGINE = ReplicatedReplacingMergeTree(_version, _is_deleted). Any other engine configuration may cause data inconsistency.
| Cluster edition | Table creation requirement |
|---|---|
| Community-compatible Edition | Create both a local table and a distributed table. The distributed table name must match the source table name. The local table name must follow the format <distributed_table_name>_local. |
| Enterprise Edition | Create a table with the same name as the source table. |
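The requirements above can be sketched as follows for a Community-compatible Edition cluster. This is a hypothetical example only: the source table name (orders), its columns (id, amount, gmt_create), and the cluster name (default) are placeholders; the engine parameters and the extra _sign, _is_deleted, and _version fields follow the requirements stated in this appendix.

```sql
-- Local table: the name must follow the format <distributed_table_name>_local.
CREATE TABLE orders_local ON CLUSTER default
(
    `id` Int64,
    `amount` Decimal(18, 2),
    `gmt_create` DateTime,
    `_sign` Int8 DEFAULT 1,
    `_is_deleted` UInt8 DEFAULT 0,
    `_version` UInt64 DEFAULT 1
)
ENGINE = ReplicatedReplacingMergeTree(_version, _is_deleted)
ORDER BY id
PARTITION BY toYYYYMM(gmt_create);

-- Distributed table: the name must match the source table name.
CREATE TABLE orders ON CLUSTER default
AS orders_local
ENGINE = Distributed(default, currentDatabase(), orders_local, rand());
```

For an Enterprise Edition cluster, only the first statement is needed, created directly under the source table's name.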
Extra fields added by DTS
DTS automatically adds the following fields to each destination table during schema migration.
To query synchronized data, use:
```sql
SELECT * FROM <table_name> FINAL WHERE _sign > 0;
```

| Cluster version | Field | Data type | Default value | Description |
|---|---|---|---|---|
| Community-compatible Edition running ClickHouse V23.8 or earlier | _sign | Int8 | 1 | DML operation type: INSERT or UPDATE = 1; DELETE = -1. |
| | _version | UInt64 | 1 | Timestamp when the record was written to the ClickHouse cluster. |
| Enterprise Edition or Community-compatible Edition running ClickHouse V23.8 or later | _sign | Int8 | 1 | DML operation type: INSERT or UPDATE = 1; DELETE = -1. |
| | _is_deleted | UInt8 | 0 | Whether the record is deleted: INSERT or UPDATE = 0; DELETE = 1. |
| | _version | UInt64 | 1 | Timestamp when the record was written to the ClickHouse cluster. |
Time data type ranges
If time data in the source PolarDB for MySQL cluster falls outside the following ranges, the values written to ApsaraDB for ClickHouse will be incorrect.
| Data type | Minimum value | Maximum value |
|---|---|---|
| Date | 1970-01-01 00:00:00 | 2149-06-06 00:00:00 |
| Date32 | 1925-01-01 00:00:00 | 2283-11-11 00:00:00 |
| DateTime | 1970-01-01 08:00:00 | 2106-02-07 14:28:15 |
| DateTime64 | 1925-01-01 08:00:00 | 2283-11-12 07:59:59 |
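Before migrating, you can look for values that fall outside these ranges on the source side. This is a hedged sketch to run on the source PolarDB for MySQL cluster; the table name (orders) and column name (gmt_create, assumed to map to ClickHouse DateTime) are placeholders:

```sql
-- Count DATETIME values outside the destination ClickHouse DateTime range;
-- any rows found here would be written to ClickHouse with incorrect values.
SELECT COUNT(*) AS out_of_range
FROM orders
WHERE gmt_create < '1970-01-01 08:00:00'
   OR gmt_create > '2106-02-07 14:28:15';
```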
Partition key calculation logic
DTS applies the following logic when deriving partition key values from source fields.
| Source field type | Calculation logic |
|---|---|
| BIGINT | intDiv(<partition_key_field>, 18014398509481984) |
| INT | intDiv(<partition_key_field>, 4194304) |
| TIMESTAMP | toYYYYMM(<partition_key_field>) |
| DATETIME | toYYYYMM(<partition_key_field>) |
| DATE | toYYYYMM(<partition_key_field>) |
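As an illustration of the table above, these are the partition expressions DTS would derive for two hypothetical partition key columns, a BIGINT column id and a DATETIME column gmt_create. The table names, column names, and engine choice are placeholders, not DTS-generated output:

```sql
-- BIGINT partition key: values are bucketed with intDiv.
-- 18014398509481984 = 2^54, so the full Int64 range maps to about 1,024 partitions.
CREATE TABLE t_bigint_key
(
    `id` Int64,
    `_sign` Int8 DEFAULT 1,
    `_is_deleted` UInt8 DEFAULT 0,
    `_version` UInt64 DEFAULT 1
)
ENGINE = ReplacingMergeTree(_version)
ORDER BY id
PARTITION BY intDiv(id, 18014398509481984);

-- DATETIME partition key: one partition per calendar month,
-- for example 2024-05-17 10:00:00 falls into partition 202405.
CREATE TABLE t_datetime_key
(
    `id` Int64,
    `gmt_create` DateTime,
    `_sign` Int8 DEFAULT 1,
    `_is_deleted` UInt8 DEFAULT 0,
    `_version` UInt64 DEFAULT 1
)
ENGINE = ReplacingMergeTree(_version)
ORDER BY id
PARTITION BY toYYYYMM(gmt_create);
```

The INT rule works the same way as the BIGINT rule: 4194304 = 2^22, which likewise maps the full Int32 range to about 1,024 partitions.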