
ApsaraDB for ClickHouse:Synchronize data from an ApsaraDB RDS for MySQL instance to an ApsaraDB for ClickHouse cluster

Last Updated: Mar 28, 2026

Use Data Transmission Service (DTS) to synchronize data from an ApsaraDB RDS for MySQL instance to an ApsaraDB for ClickHouse cluster. DTS handles full data initialization and continuous incremental sync via Change Data Capture (CDC), so your ClickHouse cluster stays current without manual intervention.

Prerequisites

Before you begin, make sure that a source ApsaraDB RDS for MySQL instance and a destination ApsaraDB for ClickHouse cluster are available, and that database accounts with the permissions described in Database account permissions have been created.

Important

Give the destination database the same name as the source database in RDS MySQL. If the names differ, use the name mapping feature in the Selected Objects section during the Configure Objects and Advanced Settings step to map the source database name to the destination database name. For more information, see Map object names.

Billing

Synchronization type | Pricing
Schema synchronization and full data synchronization | Free of charge
Incremental data synchronization | Charged. For more information, see Billing overview.

Supported SQL operations

Operation type | SQL statements
DML | INSERT, UPDATE, DELETE
DDL | CREATE TABLE, DROP TABLE, TRUNCATE TABLE, ADD COLUMN, MODIFY COLUMN, DROP COLUMN

DDL synchronization behavior

Understanding how each DDL statement behaves during synchronization helps you avoid unexpected data loss or task failures.

DDL operation | Behavior
CREATE TABLE | Propagated automatically. The destination table is created with the DTS-added fields (_sign, _version, _is_deleted).
DROP TABLE | Propagated automatically. The destination table is dropped.
TRUNCATE TABLE | Propagated automatically. All rows in the destination table are deleted.
ADD COLUMN | Propagated automatically. Rows replicated after the change include the added column.
MODIFY COLUMN | Propagated automatically.
DROP COLUMN | Propagated automatically.
RENAME TABLE | Not supported. The task may fail if the source runs RENAME TABLE.
INDEX, PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, FK | Not supported.

Data type mappings

MySQL and ApsaraDB for ClickHouse support different data types, so a one-to-one mapping is not always possible. DTS maps MySQL types to the closest equivalent types supported by ApsaraDB for ClickHouse during initial schema synchronization. For more information, see Data type mappings for initial schema synchronization.

Database account permissions

Database | Required permissions | How to configure
Source RDS MySQL | Read permissions on all objects to be synchronized | Create an account and Modify account permissions
Destination ApsaraDB for ClickHouse cluster (v22.8 or later) | Read and write permissions on the destination database. A privileged account meets the requirements. | Community-compatible Edition account management
Destination ApsaraDB for ClickHouse cluster (v21.8) | The Read/Write And Settings and Allow DDL permissions | Community-compatible Edition account management

Note: If the source database account was not created through the RDS MySQL console, make sure the account has the REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions.
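For an account created outside the RDS MySQL console, the required grants can be applied with statements like the following sketch. The account name 'dts_user' and the password placeholder are hypothetical; adjust the host scope to your security policy.

```sql
-- Hypothetical source account for DTS. REPLICATION CLIENT and REPLICATION SLAVE
-- are global privileges, so they must be granted ON *.*.
CREATE USER 'dts_user'@'%' IDENTIFIED BY '<your_password>';
GRANT SELECT, SHOW VIEW, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'dts_user'@'%';
FLUSH PRIVILEGES;
```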

Limitations

Source database limitations

Limitation | Details
Primary key required | Tables without a primary key cannot be synchronized.
Table-level sync limit | When synchronizing at the table level with object edits (such as name mapping), a single task supports a maximum of 1,000 tables. If you exceed this limit, split the tables across multiple tasks or configure the task to synchronize the entire database.
Binary log retention | Retain local binary logs for at least 3 days (7 days recommended) for RDS MySQL, and at least 7 days for self-managed MySQL. If DTS cannot retrieve the required binary logs, the task fails. In extreme cases, this may cause data inconsistency or data loss. Issues caused by insufficient retention periods are not covered under the DTS SLA. To configure the retention period for an RDS MySQL instance, see Automatically delete local logs.
No DDL during synchronization | Do not run DDL operations that change database or table schemas during schema synchronization or full synchronization; the task will fail. During full synchronization, DTS queries the source database, which creates metadata locks that may block DDL operations.
Non-binlog data | Data generated by operations that do not write to binary logs, such as physical backup restores or cascade operations, is not synchronized. If this occurs, remove the affected objects from the sync task and add them back, if your business allows it. For more information, see Modify synchronization objects.
Invisible columns (MySQL 8.0.23+) | DTS may not read invisible columns in MySQL 8.0.23 or later, which may cause data loss. Run ALTER TABLE <table_name> ALTER COLUMN <column_name> SET VISIBLE; to make invisible columns visible before synchronizing. For more information, see Invisible columns.
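To find invisible columns before you start the task, you can query information_schema on the source instance (MySQL 8.0.23 or later); this is a sketch, not part of the DTS configuration:

```sql
-- The EXTRA column of information_schema.columns carries the INVISIBLE
-- attribute for invisible columns in MySQL 8.0.23 and later.
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE extra LIKE '%INVISIBLE%'
  AND table_schema NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema');
```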

Binary log configuration

Configure these parameters before starting the sync task:

For ApsaraDB RDS for MySQL:

ApsaraDB RDS for MySQL enables binary logging by default. Verify the following:

  1. Set binlog_row_image to full. If it is not, the precheck fails and the task cannot start. For instructions, see Configure instance parameters.

  2. Retain local binary logs for at least 3 days (7 days recommended).

For self-managed MySQL:

  1. Enable binary logging.

  2. Set binlog_format to row.

  3. Set binlog_row_image to full.

  4. For dual-primary clusters, also enable log_slave_updates so DTS can capture all binary log events.

  5. Retain local binary logs for at least 7 days.

For instructions, see Create an account and configure binary logging for a self-managed MySQL database.
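The settings above can be verified on the source instance with standard MySQL statements, for example:

```sql
-- Expected values for DTS: log_bin = ON, binlog_format = ROW,
-- binlog_row_image = FULL; log_slave_updates = ON for dual-primary clusters.
SHOW VARIABLES WHERE Variable_name IN
  ('log_bin', 'binlog_format', 'binlog_row_image', 'log_slave_updates');
```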

Other limitations

Limitation | Details
Non-standard DDL syntax | If the DDL statements of the source RDS MySQL instance do not follow standard MySQL syntax, the task may fail or data may be lost.
Online DDL in temporary table mode | If the source database uses online DDL in temporary table mode, including multi-table merge scenarios or function-based indexes on unique key columns, data loss or task failure may occur.
Online DDL tools | DMS and gh-ost: DTS synchronizes only the original DDL statements, which may cause destination tables to be locked temporarily. pt-online-schema-change is not supported when synchronizing at the table level; if used at the source, data may be lost or the task may fail. You can use Data Management (DMS) to perform online DDL operations instead. For more information, see Change schemas without locking tables.
Primary key or unique key conflicts | During full synchronization, DTS retains the destination record and skips the conflicting source record. During incremental synchronization, DTS overwrites the destination record with the source record. If table schemas are inconsistent, data initialization may fail, resulting in partial or complete synchronization failure.
Time-type data range | ApsaraDB for ClickHouse enforces range limits on time-type data. If the time data in RDS MySQL falls outside these ranges, the synchronized values will be incorrect. See Time information.
Partition key | The partition key cannot be a nullable field. Supported types: BIGINT, INT, TIMESTAMP, DATETIME, and DATE.
Database limit | The number of databases to synchronize cannot exceed 256, the limit of ApsaraDB for ClickHouse.
Naming conventions | Database, table, and column names must comply with ApsaraDB for ClickHouse naming conventions. For more information, see Object naming conventions.
DTS-added fields | DTS adds _sign, _is_deleted, and _version fields to destination tables during schema synchronization. If you do not select Schema Synchronization, manually create the destination tables and add these fields. See Table and field information.
Performance impact | Full data synchronization consumes read and write resources from both the source and destination databases. Run the synchronization during off-peak hours to minimize impact.
External writes to destination | Do not allow any data source other than DTS to write to the destination database during synchronization. Doing so will cause data inconsistency.
Always-Encrypted | Full data synchronization is not supported for RDS MySQL instances with Always-Encrypted enabled. Instances with Transparent Data Encryption (TDE) enabled support all synchronization types.
Task failure recovery | If a task fails, DTS support staff will attempt to restore it within 8 hours. They may restart the task or adjust DTS task parameters (not database parameters) during restoration.

Usage notes

For self-managed MySQL sources:

  • If a primary/secondary switchover occurs during synchronization, the task fails.

  • DTS calculates latency by comparing the timestamp of the last synchronized record against the current time. If no DML operations run for an extended period, latency reporting may become inaccurate. Run a DML operation in the source database to reset the reported latency. If you select a full database for synchronization, create a heartbeat table and update it every second.

  • DTS periodically runs the CREATE DATABASE IF NOT EXISTS `test` statement in the source database to advance the binary log offset.

  • If the source is Amazon Aurora MySQL or another clustered MySQL instance, make sure the domain name or IP address used in the task configuration always resolves to a read/write (RW) node.
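The heartbeat table mentioned above can be sketched as follows; the table and event names are hypothetical, and the event scheduler approach assumes event_scheduler is enabled on the source:

```sql
-- Hypothetical heartbeat table updated once per second via the MySQL event
-- scheduler (requires event_scheduler = ON).
CREATE TABLE IF NOT EXISTS dts_heartbeat (
  id INT PRIMARY KEY,
  ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

CREATE EVENT IF NOT EXISTS dts_heartbeat_tick
ON SCHEDULE EVERY 1 SECOND
DO
  INSERT INTO dts_heartbeat (id) VALUES (1)
  ON DUPLICATE KEY UPDATE ts = CURRENT_TIMESTAMP;
```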

For ApsaraDB RDS for MySQL sources:

  • Read-only instances—such as ApsaraDB RDS for MySQL 5.6 read-only instances—that do not record transaction logs cannot serve as source databases.

  • DTS periodically runs the CREATE DATABASE IF NOT EXISTS `test` statement in the source database to advance the binary log offset.

Create a synchronization task

The task configuration has seven steps. After the task starts, DTS first runs a full copy of the source data, then continuously applies incremental changes.

Step 1: Open the data synchronization page

Open the Data Synchronization task list in the destination region using one of the following methods:

DTS console

  1. Log on to the DTS console.

  2. In the left navigation pane, click Data Synchronization.

  3. In the upper-left corner, select the region where the synchronization instance will be located.

DMS console

The steps may vary depending on the mode and layout of the DMS console. For more information, see Simple mode console and Customize the layout and style of the DMS console.
  1. Log on to the DMS console.

  2. In the top menu bar, choose Data + AI > DTS (DTS) > Data Synchronization.

  3. To the right of Data Synchronization Tasks, select the region of the synchronization instance.

Step 2: Create a task

Click Create Task to open the task configuration page.

Step 3: Configure source and destination databases

Task Name: DTS generates a name automatically. Specify a descriptive name for easy identification. The name does not need to be unique.

Source Database:

  • Select Existing Connection: Select a registered database instance from the drop-down list to auto-fill the connection details. In the DMS console, this parameter is labeled Select a DMS database instance. If no registered instance is available, configure the connection manually.

  • Database Type: Select MySQL.

  • Access Method: Select Alibaba Cloud Instance.

  • Instance Region: Select the region where the source RDS MySQL instance resides.

  • Replicate Data Across Alibaba Cloud Accounts: Select No (same Alibaba Cloud account).

  • RDS Instance ID: Select the ID of the source RDS MySQL instance.

  • Database Account: Enter the database account. For permission requirements, see Database account permissions.

  • Database Password: Enter the password for the database account.

  • Encryption: Select Non-encrypted or SSL-encrypted. If you select SSL-encrypted, enable SSL encryption on the RDS MySQL instance first. For more information, see Use a cloud certificate to quickly enable SSL link encryption.

Destination Database:

  • Select Existing Connection: Select a registered database instance from the drop-down list. In the DMS console, this parameter is labeled Select a DMS database instance. If no registered instance is available, configure the connection manually.

  • Database Type: Select ClickHouse.

  • Access Method: Select Alibaba Cloud Instance.

  • Instance Region: Select the region where the destination ApsaraDB for ClickHouse cluster resides.

  • Replicate Data Across Alibaba Cloud Accounts: Select No (same Alibaba Cloud account).

  • Cluster Type: Select the type of the ApsaraDB for ClickHouse cluster.

  • Cluster ID: Select the ID of the destination ApsaraDB for ClickHouse cluster.

  • Database Account: Enter the database account. For permission requirements, see Database account permissions.

  • Database Password: Enter the password for the database account.

Step 4: Test connectivity

Click Test Connectivity and Proceed at the bottom of the page.

Add the CIDR blocks of DTS servers to the security settings of both the source and destination databases to allow access. This can be done automatically or manually. For more information, see Add the IP address whitelist of DTS servers.
If the source or destination is a self-managed database (that is, Access Method is not Alibaba Cloud Instance), also click Test Connectivity in the CIDR Blocks of DTS Servers dialog box.

Step 5: Configure objects and advanced settings

Configure objects

On the Configure Objects page, specify the synchronization scope.

  • Synchronization Types: DTS always selects Incremental Data Synchronization. By default, Schema Synchronization and Full Data Synchronization are also selected. After the precheck, DTS initializes the destination cluster with a full copy of the source data, which serves as the baseline for incremental synchronization.

  • Processing Mode of Conflicting Tables:

    • Precheck and Report Errors: Checks for tables with identical names in the destination database. If any are found, an error is reported during precheck and the task does not start. If you cannot delete or rename the conflicting table, use name mapping to assign a different name in the destination. For more information, see Database table column name mapping.

    • Ignore Errors and Proceed: Skips the duplicate table name check. During full synchronization, DTS retains the destination record and skips conflicting source records. During incremental synchronization, DTS overwrites destination records. If table schemas are inconsistent, data initialization may fail. Use with caution.

  • Capitalization of Object Names in Destination Instance: Sets the case-sensitivity policy for database, table, and column names in the destination. The default is DTS default policy. For more information, see Case policy for destination object names.

  • Source Objects: Click the objects to synchronize in the Source Objects box, then click the arrow to move them to the Selected Objects box. Select objects at the database or table level.

  • Selected Objects: To rename a single object in the destination, right-click it in the Selected Objects box. To rename multiple objects in bulk, click Batch Edit in the upper-right corner. To filter data, right-click a table in the Selected Objects box and set filter conditions. For more information, see Set filter conditions. Note that using name mapping may cause dependent objects to fail synchronization.

Click Next: Advanced Settings.

Advanced settings

  • Dedicated Cluster for Task Scheduling: DTS uses a shared cluster by default. For greater task stability, purchase a dedicated cluster. For more information, see What is a DTS dedicated cluster?.

  • Time Zone of Destination Database: Select the time zone for DateTime data written to the ApsaraDB for ClickHouse cluster.

  • Retry Time for Failed Connections: Sets how long DTS retries after a connection failure. Default: 720 minutes. Range: 10–1,440 minutes. Set to 30 minutes or more. If the connection is restored within this period, the task resumes automatically. If multiple DTS instances share a source or destination, DTS uses the shortest configured retry duration across all instances. DTS charges for task runtime during retries, so release the DTS instance promptly after you release the source or destination instances.

  • Retry Time for Other Issues: Sets how long DTS retries after non-connection errors (such as DDL or DML execution failures). Default: 10 minutes. Range: 1–1,440 minutes. Must be less than Retry Time for Failed Connections.

  • Enable Throttling for Full Data Synchronization: Limits the full synchronization rate to reduce load on the destination database. Set Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Only available when Full Data Synchronization is selected. You can also adjust the rate after the task starts.

  • Enable Throttling for Incremental Data Synchronization: Limits the incremental synchronization rate. Set RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s).

  • Whether to delete SQL operations on heartbeat tables of forward and reverse tasks: Controls whether DTS writes heartbeat SQL to the source database. Yes: Does not write heartbeat SQL; the DTS instance may show higher latency. No: Writes heartbeat SQL; this may interfere with physical backups and cloning on the source.

  • Environment Tag: Optionally tag the instance to identify its environment.

  • Configure ETL: Yes: Enables extract, transform, and load (ETL). Enter data processing statements in the code editor. For more information, see Configure ETL in a data migration or data synchronization task. No: Disables ETL.

  • Monitoring and Alerting: Yes: Configures alerts. Set the alert threshold and notification contacts. For more information, see Configure monitoring and alerting during task configuration. No: No alerts configured.

Configure database and table fields

Click Next: Configure Database and Table Fields to configure the Type, Primary Key Column, Sort Key, Distribution Key, and Partition Key for each table being synchronized to ClickHouse.

DTS provides a default configuration. Set Definition Status to All to review and modify all tables.

  • The Primary Key Column and Sort Key support composite keys—select multiple fields from the drop-down list.

  • Only one field can be selected as the Distribution Key.

  • Select one or more columns from the Primary Key Column as the Partition Key.

  • The Partition Key is optional, but if specified, it cannot be a nullable field. Supported types: BIGINT, INT, TIMESTAMP, DATETIME, and DATE. For the calculation logic, see Partition key calculation logic.

For more information about primary key columns, sort keys, and partition keys, see CREATE TABLE.

Step 6: Save the task and run the precheck

  • To preview the API parameters for this task, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters in the tooltip.

  • Click Next: Save Task Settings and Precheck.

DTS runs a precheck before the task starts. The task only starts if the precheck passes.

  • If the precheck fails, click View Details next to the failed item, fix the issue, and rerun the precheck.

  • If the precheck generates warnings:

    • For non-ignorable warnings, click View Details, fix the issue, and rerun the precheck.

    • For ignorable warnings, click Confirm Alert Details > Ignore > OK, then click Precheck Again. Ignoring warnings may lead to data inconsistencies. Proceed with caution.

Step 7: Purchase the synchronization instance

  1. When the Success Rate reaches 100%, click Next: Purchase Instance.

  2. On the Purchase page, select the billing method and instance class.

New Instance Class:

  • Billing Method: Subscription: Pay upfront for a fixed duration. Suitable for long-term, continuous tasks. Monthly options: 1–9 months. Yearly options: 1, 2, 3, or 5 years. Pay-as-you-go: Billed hourly for actual usage. Suitable for short-term or test tasks.

  • Resource Group Settings: The resource group for the instance. Default: default resource group. For more information, see What is Resource Management?.

  • Instance Class: Affects synchronization performance. Select based on your business requirements. For more information, see Data synchronization link specifications.

  3. Read and select the Data Transmission Service (Pay-as-you-go) Service Terms checkbox.

  4. Click Buy and Start, then click OK.

Monitor the task progress on the data synchronization page.

Appendix

Time information

Data type | Minimum value | Maximum value
Date | 1970-01-01 00:00:00 | 2149-06-06 00:00:00
Date32 | 1925-01-01 00:00:00 | 2283-11-11 00:00:00
DateTime | 1970-01-01 08:00:00 | 2106-02-07 14:28:15
DateTime64 | 1925-01-01 08:00:00 | 2283-11-12 07:59:59
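Before starting the task, you can check the source for values outside these ranges. The following sketch checks a DATETIME column against the ClickHouse DateTime range; the table name `orders` and column name `created_at` are placeholders:

```sql
-- Count DATETIME values that fall outside the ClickHouse DateTime range
-- (1970-01-01 08:00:00 to 2106-02-07 14:28:15).
SELECT COUNT(*) AS out_of_range
FROM orders
WHERE created_at < '1970-01-01 08:00:00'
   OR created_at > '2106-02-07 14:28:15';
```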

Table and field information

Table requirements

If you do not use the object name mapping feature, the tables you create must meet the following requirements.

Important

If the destination table specifies an ENGINE, it must be ENGINE = ReplicatedReplacingMergeTree(_version, _is_deleted). Otherwise, data inconsistency may occur.

  • Community Edition instance: Create one local table and one distributed table. The distributed table name must match the source table name. The local table name must follow the format <distributed_table_name>_local.

  • Enterprise Edition instance: Create a table with the same name as the source table.
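For a Community Edition cluster, a manually created table pair might look like the following sketch. The database name, table name, and business columns are hypothetical; only the DTS-added fields and the engine clause follow the requirements above:

```sql
-- Local table: stores the data on each shard and carries the DTS-added fields.
CREATE TABLE db1.orders_local ON CLUSTER default (
  id UInt64,
  amount Decimal(18, 2),
  _sign Int8 DEFAULT 1,
  _is_deleted UInt8 DEFAULT 0,
  _version UInt64 DEFAULT 1
)
ENGINE = ReplicatedReplacingMergeTree(_version, _is_deleted)
ORDER BY id;

-- Distributed table: same name as the source table; routes queries to the
-- local tables. <distributed_table_name>_local matches the required format.
CREATE TABLE db1.orders ON CLUSTER default AS db1.orders_local
ENGINE = Distributed(default, db1, orders_local, rand());
```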

Fields added by DTS

DTS adds the following fields to destination tables during schema synchronization. To query data and filter out deleted records, run:

SELECT * FROM table_name FINAL WHERE _sign > 0;

The FINAL keyword filters out rows with duplicate sort keys, and _sign > 0 excludes deleted records.

Version | Field | Data type | Default value | Description
Community Edition earlier than v23.8 | _sign | Int8 | 1 | DML operation type. INSERT and UPDATE = 1. DELETE = -1.
Community Edition earlier than v23.8 | _version | UInt64 | 1 | Timestamp when the record was written to ClickHouse.
Enterprise Edition and Community Edition v23.8 or later | _sign | Int8 | 1 | DML operation type. INSERT and UPDATE = 1. DELETE = -1.
Enterprise Edition and Community Edition v23.8 or later | _is_deleted | UInt8 | 0 | Whether the record is deleted. INSERT and UPDATE = 0. DELETE = 1.
Enterprise Edition and Community Edition v23.8 or later | _version | UInt64 | 1 | Timestamp when the record was written to ClickHouse.

Partition key calculation logic

Source field type | Calculation logic
BIGINT | intDiv(field, 18014398509481984)
INT | intDiv(field, 4194304)
TIMESTAMP | toYYYYMM(field)
DATETIME | toYYYYMM(field)
DATE | toYYYYMM(field)
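As an illustration of the logic above, the following ClickHouse expressions show the derived partition values for sample inputs (the input values are illustrative only):

```sql
-- INT key: integer division into fixed-width buckets.
SELECT intDiv(8388608, 4194304);                     -- 2
-- TIMESTAMP/DATETIME/DATE key: year-month bucket.
SELECT toYYYYMM(toDateTime('2026-03-28 10:00:00'));  -- 202603
```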