
Data Transmission Service:Synchronize data from an RDS MySQL instance to an ApsaraDB for ClickHouse cluster

Last Updated:Mar 28, 2026

Data Transmission Service (DTS) lets you synchronize data from a MySQL database, such as a self-managed MySQL database or an ApsaraDB RDS for MySQL instance, to an ApsaraDB for ClickHouse cluster. ApsaraDB for ClickHouse is a columnar database built for online analytical processing (OLAP) that delivers aggregate analysis and large-table queries at speeds an order of magnitude faster than other analytic databases. This guide walks you through configuring a synchronization task that performs an initial full load followed by continuous incremental CDC synchronization.

Before you begin

Before you begin, ensure that you have:

  • A destination ApsaraDB for ClickHouse cluster running version 20.8 or later. See Create a cluster.

  • Storage space in the ClickHouse cluster that exceeds the storage space used by the source RDS MySQL instance.

  • A database account on the source RDS MySQL instance with read permissions on the objects to be synchronized. If the account was not created through the RDS console, it must also have the REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions. See Create an account and Modify account permissions.

  • A database account on the destination ApsaraDB for ClickHouse cluster with read and write permissions on the destination database.

  • Binary logging configured correctly on the source database:

    • ApsaraDB RDS for MySQL: Binary logging is enabled by default. The binlog_row_image parameter must be set to full. See Modify instance parameters

    • Self-managed MySQL: You must manually enable the binary logging feature and set binlog_format to row and binlog_row_image to full. If the source is a self-managed MySQL database deployed in a dual-primary cluster, you must also set log_slave_updates to ON. See Create an account for a self-managed MySQL database and configure binary logging

    • Binary logs must be retained for at least 3 days for ApsaraDB RDS for MySQL (7 days recommended), and at least 7 days for self-managed MySQL databases. See Delete binary log files
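The source-account and binlog prerequisites above can be sketched in SQL. This is a minimal example for a self-managed MySQL source; the account name dts_user and the `<password>` placeholder are hypothetical, and the grant list mirrors the permissions named above:

```sql
-- Hypothetical DTS account with the permissions listed in the prerequisites
CREATE USER 'dts_user'@'%' IDENTIFIED BY '<password>';
GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW
  ON *.* TO 'dts_user'@'%';

-- Verify the binary logging configuration (required values in comments)
SHOW VARIABLES LIKE 'log_bin';            -- ON
SHOW VARIABLES LIKE 'binlog_format';      -- ROW (self-managed MySQL)
SHOW VARIABLES LIKE 'binlog_row_image';   -- FULL
SHOW VARIABLES LIKE 'log_slave_updates';  -- ON (dual-primary clusters only)
```

For an ApsaraDB RDS for MySQL source, only the binlog_row_image check is usually needed, because binary logging is enabled by default and the account is created in the RDS console.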

Billing

Synchronization type | Fee
Schema synchronization and full data synchronization | Free
Incremental data synchronization | Charged. See Billing overview.

Supported SQL operations

Operation type | SQL statement
DML | INSERT, UPDATE, DELETE
DDL | CREATE TABLE, DROP TABLE, TRUNCATE TABLE; ALTER TABLE (ADD COLUMN, MODIFY COLUMN, DROP COLUMN)

Data type mappings

MySQL and ApsaraDB for ClickHouse support different data types, so one-to-one type mapping is not always possible. During schema synchronization, DTS maps source types to the closest supported types in ClickHouse. For the full mapping table, see Data type mappings for initial schema synchronization.

Limitations

Source database limitations

Constraint | Details
Primary key required | Tables without a primary key cannot be synchronized.
Table-level sync limit | When synchronizing at the table level with object name mapping, a single task supports a maximum of 1,000 tables. If you exceed this limit, split the tables into multiple tasks or configure the task to synchronize the entire database.
Binary logging requirements | For ApsaraDB RDS for MySQL, binlog_row_image must be set to full. Otherwise, the precheck fails and the task cannot start. For self-managed MySQL databases, you must enable binary logging and set binlog_format to row and binlog_row_image to full. For self-managed MySQL databases deployed in a dual-primary cluster, log_slave_updates must also be set to ON.
Binary log retention | ApsaraDB RDS for MySQL: retain binary logs for at least 3 days (7 days recommended). Self-managed MySQL databases: retain binary logs for at least 7 days. If DTS cannot obtain the binary logs, the task may fail or data inconsistency may occur.
No DDL during initial sync | Do not run DDL statements that change database or table schemas during schema synchronization or full data synchronization.
MySQL 8.0.23 and later | If the source MySQL version is 8.0.23 or later and the data includes invisible columns, those columns cannot be read and data loss occurs. To make a column visible, run ALTER TABLE <table_name> ALTER COLUMN <column_name> SET VISIBLE;. On these versions, tables without an explicit primary key automatically generate an invisible primary key, so make it visible before synchronization.
EncDB | If EncDB is enabled on the RDS MySQL instance, full data synchronization cannot be performed. Instances with Transparent Data Encryption (TDE) enabled support schema synchronization, full data synchronization, and incremental data synchronization.
Read-only instances | A read-only ApsaraDB RDS for MySQL 5.6 instance (which does not record transaction logs) cannot be used as the source database.
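Two of the constraints above can be checked up front with queries against information_schema on the source. A sketch, assuming a source database named mydb (a placeholder):

```sql
-- Tables without a primary key (cannot be synchronized)
SELECT t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_schema = 'mydb'
  AND t.table_type = 'BASE TABLE'
  AND c.constraint_name IS NULL;

-- Invisible columns (MySQL 8.0.23 or later; DTS cannot read these)
SELECT table_name, column_name
FROM information_schema.columns
WHERE table_schema = 'mydb'
  AND extra LIKE '%INVISIBLE%';
```

Any table or column returned by these queries should be fixed (add a primary key, or run the ALTER ... SET VISIBLE statement shown above) before you start the task.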

Destination and other limitations

Constraint | Details
RENAME TABLE | The RENAME TABLE operation is not supported.
Unsupported object types | Synchronization of INDEX, PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, and FOREIGN KEY objects is not supported.
Non-standard DDL | If the source RDS MySQL instance uses DDL statements that do not follow standard MySQL syntax, the synchronization task may fail or data may be lost.
Online DDL with temporary tables | If you run online DDL operations that use temporary tables (such as table merges) on the source, data may be lost at the destination or the synchronization instance may fail.
Online DDL tools | DTS supports online DDL changes made with Data Management (DMS) or gh-ost and synchronizes the original DDL statements to the destination. DTS does not support synchronizing online DDL changes made with pt-online-schema-change; if such changes exist, data may be lost at the destination or the synchronization instance may fail.
ClickHouse database limit | The number of databases to be synchronized cannot exceed the ClickHouse limit of 256.
Object naming | Database, table, and column names must comply with ClickHouse naming conventions.
Extra fields added by DTS | During schema synchronization, DTS adds the _sign, _is_deleted, and _version fields to destination tables. If you skip schema synchronization, create the destination tables manually and add these fields. See Table and field information.
Partition Key type and nullability | The Partition Key cannot be a nullable field, or the task will fail. Supported types: BIGINT, INT, TIMESTAMP, DATETIME, and DATE.
ClickHouse time range | Time-type data in ApsaraDB for ClickHouse has range limits. If the source MySQL time data is outside the supported range, the synchronized values will be incorrect. See Time information.
Writes to destination during sync | Do not allow any data outside of DTS to be written to the destination database during synchronization. Doing so causes data inconsistency between the source and destination.
Primary/unique key conflicts | During full data synchronization, if a record in the destination has the same primary key or unique key value as a source record, the destination record is retained (not overwritten). During incremental data synchronization, the destination record is overwritten. If the source and destination schemas differ, data initialization may fail.
Synchronization performance | Evaluate source and destination database performance before starting. Run data synchronization during off-peak hours to avoid increased database load.
DTS task failure recovery | If a DTS task fails, DTS support attempts to restore it within 8 hours. The task may be restarted and task parameters may be modified during restoration.

Usage notes

  • DTS executes the CREATE DATABASE IF NOT EXISTS `test` statement in the source database on a schedule to advance the binary log file position.

  • DTS calculates synchronization latency based on the timestamp of the latest synchronized data at the destination and the current timestamp at the source. If no DML operations are performed on the source for a long time, latency readings may be inaccurate. Run a DML operation on the source to refresh the latency. If you synchronize an entire database, create a heartbeat table that is updated every second.

  • If the source database is a self-managed MySQL database and you perform a primary/secondary switchover on the source while the data synchronization task is running, the task fails.
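The heartbeat table recommended above for whole-database synchronization can be implemented with MySQL's event scheduler. A sketch, assuming event_scheduler is ON and using the hypothetical names dts_heartbeat and dts_heartbeat_tick:

```sql
CREATE TABLE IF NOT EXISTS dts_heartbeat (
  id INT NOT NULL PRIMARY KEY,
  ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

-- Touch the single heartbeat row every second so DTS always sees fresh DML
CREATE EVENT IF NOT EXISTS dts_heartbeat_tick
ON SCHEDULE EVERY 1 SECOND
DO
  INSERT INTO dts_heartbeat (id) VALUES (1)
  ON DUPLICATE KEY UPDATE ts = CURRENT_TIMESTAMP;
```

Because the table receives a DML operation every second, the latency reading stays accurate even when the application tables are idle.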

Configure a data synchronization task

Step 1: Open the Data Synchronization page

Use one of the following methods:

DTS console

  1. Log on to the DTS console.

  2. In the left-side navigation pane, click Data Synchronization.

  3. In the upper-left corner, select the region where the synchronization instance resides.

DMS console

Operations may vary based on the mode and layout of the DMS console. For more information, see Simple mode and Customize the layout and style of the DMS console.

  1. Log on to the DMS console.

  2. In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.

  3. From the drop-down list to the right of Data Synchronization Tasks, select the region where the instance resides.

Step 2: Configure source and destination databases

  1. Click Create Task.

  2. Configure the parameters described in the following table.

Task name

Parameter | Description
Task Name | Enter a descriptive name for the task. DTS generates a name automatically; you do not need to change it, and it does not need to be unique.

Source database

Parameter | Description
Select Existing Connection | If the source instance is already registered with DTS, select it from the drop-down list. DTS populates the remaining parameters automatically. Otherwise, configure the parameters below.
Database Type | Select MySQL.
Access Method | Select Alibaba Cloud Instance.
Instance Region | Select the region where the source RDS MySQL instance resides.
Replicate Data Across Alibaba Cloud Accounts | Select No for same-account synchronization.
RDS Instance ID | Select the ID of the source RDS MySQL instance.
Database Account | Enter the database account. See Before you begin for permission requirements.
Database Password | Enter the account password.
Encryption | Select Non-encrypted or SSL-encrypted. To use SSL encryption, enable SSL on the RDS MySQL instance first. See Use a cloud certificate to enable SSL encryption.

Destination database

Parameter | Description
Select Existing Connection | If the destination cluster is already registered with DTS, select it from the drop-down list. DTS populates the remaining parameters automatically. Otherwise, configure the parameters below.
Database Type | Select ClickHouse.
Access Method | Select Alibaba Cloud Instance.
Instance Region | Select the region where the destination ApsaraDB for ClickHouse cluster resides.
Replicate Data Across Alibaba Cloud Accounts | Select No for same-account synchronization.
Cluster Type | Select the cluster type.
Cluster ID | Select the ID of the destination cluster.
Database Account | Enter the database account. See Before you begin for permission requirements.
Database Password | Enter the account password.
  3. Click Test Connectivity and Proceed.

Make sure that the CIDR blocks of DTS servers are allowed by the security settings of the source and destination databases. DTS adds these CIDR blocks automatically for Alibaba Cloud instances. For self-managed databases, add them manually. See Add the CIDR blocks of DTS servers.

Step 3: Select synchronization objects

  1. In the Configure Objects step, configure the following settings.

Configuration | Description
Synchronization Types | Select Schema Synchronization, Full Data Synchronization, and Incremental Data Synchronization. DTS synchronizes historical data during the initial full load, which is required for subsequent incremental synchronization.
Processing Mode of Conflicting Tables | Precheck and Report Errors (default): the precheck fails if the destination contains tables with the same names as source tables. Ignore Errors and Proceed: skips this check.

Warning

Selecting Ignore Errors and Proceed may cause data inconsistency.

Capitalization of Object Names in Destination Instance | The DTS default policy is used by default. Select another option if you need to match the capitalization of the source or destination. See Specify the capitalization of object names.
Source Objects | Select objects from the Source Objects section and click the icon to add them to Selected Objects. Select at the database or table level.
Selected Objects | To rename a single object, right-click it. To rename multiple objects, click Batch Edit. See Map object names.
Note
  • To filter data, right-click the table to synchronize in the Selected Objects box and set the filter conditions in the dialog box. For more information, see Set filter conditions.

  • If you use the object name mapping feature, other objects that depend on the mapped object may fail to be synchronized.
  2. Click Next: Advanced Settings and configure the following parameters.

Configuration | Description
Dedicated Cluster for Task Scheduling | By default, DTS schedules the task to a shared cluster. For higher stability, purchase a dedicated cluster. See What is a DTS dedicated cluster.
Time zone of destination database | Select the time zone for DateTime data written to the ClickHouse cluster.
Retry Time for Failed Connections | The period (in minutes) during which DTS retries if the source or destination database fails to connect after the task starts. Valid values: 10–1440. Default: 720. A value greater than 30 minutes is recommended.
Retry Time for Other Issues | The period (in minutes) during which DTS retries failed DDL or DML operations. Valid values: 1–1440. Default: 10. A value greater than 10 minutes is recommended. This value must be less than the Retry Time for Failed Connections value.
Enable Throttling for Full Data Synchronization | Configure QPS (queries per second), RPS (rows per second), and data speed limits to reduce load on the source and destination databases during full data synchronization. Available only when Full Data Synchronization is selected.
Enable Throttling for Incremental Data Synchronization | Configure RPS and data speed limits for incremental data synchronization to reduce destination database load.
Whether to delete SQL operations on heartbeat tables of forward and reverse tasks | Yes: DTS does not write heartbeat SQL to the source, but latency readings may be displayed. No: DTS writes heartbeat SQL to the source, which may affect source database features such as physical backup and cloning.
Environment Tag | (Optional) Select a tag to identify the instance.
Configure ETL | Select Yes to configure extract, transform, and load (ETL) processing. See Configure ETL in a data migration or data synchronization task. Select No to skip.
Monitoring and Alerting | Select Yes to receive alerts if the task fails or latency exceeds a threshold. Configure the threshold and notification settings. See Configure monitoring and alerting.
  3. Click Next: Configure Database and Table Fields to configure ClickHouse table settings. For each table to synchronize, set the Type, Primary Key Column, Sort Key, Distribution Key, and Partition Key:

    • DTS provides default configurations. Set Definition Status to All to review and modify them.

    • Primary Key Column and Sort Key support composite keys — select multiple fields from the drop-down lists.

    • Select one or more columns from Primary Key Column as the Partition Key. Only one field can be selected as the Distribution Key.

    • The Partition Key is optional but cannot be a nullable field. Only BIGINT, INT, TIMESTAMP, DATETIME, and DATE types are supported as partition keys. See Calculation logic for partition keys.

    • For details on primary key columns, sort keys, and partition keys, see CREATE TABLE.

Step 4: Run the precheck

Click Next: Save Task Settings and Precheck.

To preview the API parameters for this configuration, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.

DTS runs a precheck before the task can start.

  • If the precheck fails, click View Details next to each failed item, resolve the issues, and rerun the precheck.

  • If an alert is triggered:

    • For alerts that cannot be ignored: click View Details, fix the issue, and rerun the precheck.

    • For alerts that can be ignored: click Confirm Alert Details > Ignore > OK, then click Precheck Again. Ignoring alerts may cause data inconsistency.

Step 5: Purchase the instance

  1. Wait until Success Rate reaches 100%, then click Next: Purchase Instance.

  2. On the buy page, configure the following parameters.

Parameter | Description
Billing Method | Subscription: pay upfront for a set period. More cost-effective for long-term use. Pay-as-you-go: billed hourly. Release the instance when no longer needed to stop charges.
Resource Group Settings | The resource group for the synchronization instance. Default: default resource group. See What is Resource Management?
Instance Class | Select a class based on your required synchronization speed. See Instance classes of data synchronization instances.
Subscription Duration | Available for the Subscription billing method. Options: 1–9 months, or 1, 2, 3, or 5 years.
  3. Read and select Data Transmission Service (Pay-as-you-go) Service Terms.

  4. Click Buy and Start, then click OK in the confirmation dialog.

The task appears in the task list. Monitor its progress from there.

Appendix

Time information

Data type | Minimum value | Maximum value
Date | 1970-01-01 00:00:00 | 2149-06-06 00:00:00
Date32 | 1925-01-01 00:00:00 | 2283-11-11 00:00:00
DateTime | 1970-01-01 08:00:00 | 2106-02-07 14:28:15
DateTime64 | 1925-01-01 08:00:00 | 2283-11-12 07:59:59

Table and field information

Table information

If you create destination tables manually (without using object name mapping), the tables must meet the following requirements.

Important

If the destination table includes an ENGINE clause, it must be ENGINE = ReplicatedReplacingMergeTree(_version, _is_deleted). Otherwise, data inconsistency may occur.

  • ClickHouse Community Edition instance: create one local table and one distributed table. The distributed table name must match the source table name. The local table name must be <distributed_table_name>_local.

  • ClickHouse Enterprise Edition instance: create a table with the same name as the source table.
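Following the requirements above, a manually created destination for a Community Edition cluster might look like the sketch below. The database name mydb, table name orders, columns, and cluster name default are placeholders; the ENGINE clause is the one the Important note requires, and the three DTS bookkeeping fields match the Field information table:

```sql
-- Local table: holds the data, including the DTS bookkeeping fields
CREATE TABLE mydb.orders_local ON CLUSTER default
(
    id          Int64,
    amount      Decimal(18, 2),
    _sign       Int8   DEFAULT 1,
    _is_deleted UInt8  DEFAULT 0,
    _version    UInt64 DEFAULT 1
)
ENGINE = ReplicatedReplacingMergeTree(_version, _is_deleted)
ORDER BY id;

-- Distributed table: same name as the source table, routes writes to the local table
CREATE TABLE mydb.orders ON CLUSTER default
AS mydb.orders_local
ENGINE = Distributed(default, mydb, orders_local, sipHash64(id));
```

For an Enterprise Edition cluster, only the first statement is needed, with the table named orders instead of orders_local.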

Field information

DTS adds the following fields to each destination table. To query data, run:

SELECT * FROM table_name FINAL WHERE _sign > 0;

The WHERE _sign > 0 condition filters out deleted records. The FINAL keyword deduplicates rows with the same sort key.

Version | Name | Data type | Default | Description
Community Edition earlier than 23.8 | _sign | Int8 | 1 | DML operation type. INSERT and UPDATE: 1. DELETE: -1.
Community Edition earlier than 23.8 | _version | UInt64 | 1 | Timestamp when the data was written to ClickHouse.
Enterprise Edition and Community Edition 23.8 and later | _sign | Int8 | 1 | DML operation type. INSERT and UPDATE: 1. DELETE: -1.
Enterprise Edition and Community Edition 23.8 and later | _is_deleted | UInt8 | 0 | Whether the record is deleted. INSERT and UPDATE: 0. DELETE: 1.
Enterprise Edition and Community Edition 23.8 and later | _version | UInt64 | 1 | Timestamp when the data was written to ClickHouse.

Calculation logic for partition keys

Source field type | Partition key calculation
BIGINT | intDiv(tablePartKey, 18014398509481984)
INT | intDiv(tablePartKey, 4194304)
TIMESTAMP, DATETIME, DATE | toYYYYMM(tablePartKey)
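The two divisors are 2^54 and 2^22, so integer keys are bucketed into coarse value ranges while time types are partitioned by month. You can preview the expressions directly in ClickHouse; a sketch with assumed sample input values:

```sql
SELECT
    intDiv(36028797018963968, 18014398509481984) AS bigint_bucket,   -- 2
    intDiv(10485760, 4194304)                    AS int_bucket,      -- 2
    toYYYYMM(toDate('2026-03-28'))               AS month_partition; -- 202603
```

Running the same expressions over your actual key columns shows how many partitions DTS would create, which is useful for spotting keys that would produce one partition per row.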