
ApsaraDB for ClickHouse: Migrate from self-managed to Community-Compatible ClickHouse

Last Updated: Mar 28, 2026

Self-managed ClickHouse clusters are hard to scale, costly to maintain, and lack enterprise-grade disaster recovery. This topic explains how to migrate your data to an ApsaraDB for ClickHouse Community-Compatible Edition cluster.

Choose a migration method

| Method | Pros | Cons | Use when |
| --- | --- | --- | --- |
| Console migration | Guided interface. No manual metadata migration. | Migrates only the entire cluster; individual databases, tables, and historical data cannot be selected. | Migrating an entire cluster within data volume limits. |
| Manual migration | Full control over which databases and tables to migrate. | Complex operations. Metadata migration is manual. | Migrating specific databases or tables; cold data exceeds 1 TB; hot data exceeds 10 TB; or cluster does not meet console migration requirements. |
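The decision rules in the table above can be sketched as a small helper. This is an illustration only; the thresholds come from the table, and the function name is made up:

```python
def choose_migration_method(whole_cluster: bool, cold_tb: float, hot_tb: float) -> str:
    """Pick a migration method from the rules in the table above.

    whole_cluster: True if the entire cluster is to be migrated as-is.
    cold_tb / hot_tb: cold and hot data volumes in TB.
    """
    # Console migration only handles whole-cluster moves within the volume limits.
    if whole_cluster and cold_tb <= 1 and hot_tb <= 10:
        return "console"
    # Per-database/per-table selection or oversized data requires manual migration.
    return "manual"

print(choose_migration_method(True, 0.5, 5))   # small whole-cluster move
print(choose_migration_method(False, 0.5, 5))  # specific tables only
print(choose_migration_method(True, 2.0, 5))   # cold data over 1 TB
```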

Prerequisites

Before you begin, make sure that you have:

  • A destination ApsaraDB for ClickHouse Community-Compatible Edition cluster with an account that has the highest permissions. See Account management for Community-Compatible Edition clusters and Modify permissions.

  • A self-managed ClickHouse cluster account with read permissions on databases and tables, and permissions to run SYSTEM commands

  • Network connectivity between the destination cluster and the self-managed cluster

Same Virtual Private Cloud (VPC): Add the IP addresses of all destination cluster nodes and the IPv4 CIDR block of the destination cluster's vSwitch to the self-managed cluster's whitelist.

Different VPCs, local data center, or another cloud: Resolve network connectivity before migrating. See How do I establish a network connection between a destination cluster and a data source?

Validate in a test environment first

Before migrating production data, run the full migration procedure in a test environment to verify business compatibility and performance. This identifies problems before they affect production.

  1. Create a migration task using the steps in the Console migration or Manual migration section.

  2. Analyze compatibility and performance. See Analysis and solutions for compatibility and performance bottlenecks in migrating self-managed ClickHouse to the cloud.

Console migration

Console migration provides a guided interface and handles metadata automatically, but has the following constraints.

Limitations

  • The destination cluster version must be 21.8 or later.

  • Supports only full and incremental migration of the entire cluster. Individual databases, tables, or historical data snapshots cannot be selected.

Data volume limits

| Data type | Limit | Behavior when exceeded |
| --- | --- | --- |
| Cold data | Recommended total ≤ 1 TB | Slow migration; high risk of task failure |
| Hot data | ≤ 10 TB | High probability of failure |

If your data volume exceeds these limits, use Manual migration instead.

What gets migrated

Migrated:

  • Databases, data dictionaries, and materialized views

  • Table schemas for all table engines except Kafka and RabbitMQ

  • Data from MergeTree family tables (incremental migration)

Not migrated:

  • Kafka and RabbitMQ table schemas and their data

  • Data in non-MergeTree tables (external tables, Log tables)

Important

Handle unsupported content manually during the migration process. Steps are included in the procedure below.

Potential impacts

Self-managed cluster:

  • Increased CPU and memory usage from read operations

  • DDL operations are blocked

Destination cluster:

  • Increased CPU and memory usage from write operations

  • DDL operations are blocked on migrated databases and tables (this restriction does not apply to other databases and tables)

  • Merges are paused on migrated databases and tables (merges on other databases and tables are not affected)

  • The cluster restarts at the start and end of the migration task

  • After migration completes, the cluster runs high-frequency merges, which increases I/O usage and request latency. Calculate the expected merge duration using Calculate post-migration merge time.

If migration takes too long, metadata accumulates in the destination cluster. Complete the migration within 5 days of creating the task.

Procedure

Step 1: Enable system tables in the self-managed cluster

Console migration requires system.part_log and system.query_log to be enabled and correctly configured in the self-managed cluster's config.xml.

Logs not enabled

If `system.part_log` and `system.query_log` are not enabled, add the following blocks to config.xml:

system.part_log

<part_log>
    <database>system</database>
    <table>part_log</table>
    <partition_by>event_date</partition_by>
    <order_by>event_time</order_by>
    <ttl>event_date + INTERVAL 15 DAY DELETE</ttl>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
</part_log>

system.query_log

<query_log>
    <database>system</database>
    <table>query_log</table>
    <partition_by>event_date</partition_by>
    <order_by>event_time</order_by>
    <ttl>event_date + INTERVAL 15 DAY DELETE</ttl>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
</query_log>

Logs enabled

If `system.part_log` and `system.query_log` are already enabled, compare the existing configuration with the blocks above. If they differ, update them to match, then drop and recreate the tables:


DROP TABLE system.part_log;
DROP TABLE system.query_log;

The tables are recreated automatically on the next data insert.
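The comparison in this step can also be scripted: parse config.xml and check whether the <part_log> and <query_log> blocks carry the required settings. A minimal sketch using only the standard library; the expected values mirror the blocks above, and the sample document is invented for illustration:

```python
import xml.etree.ElementTree as ET

# Settings each log block must carry, per the config blocks above.
REQUIRED = {
    "database": "system",
    "partition_by": "event_date",
    "order_by": "event_time",
    "ttl": "event_date + INTERVAL 15 DAY DELETE",
    "flush_interval_milliseconds": "7500",
}

def check_log_config(config_xml: str) -> list:
    """Return (section, field) pairs that are missing or differ from REQUIRED."""
    root = ET.fromstring(config_xml)
    problems = []
    for section in ("part_log", "query_log"):
        node = root.find(section)
        if node is None:
            problems.append((section, "<missing>"))
            continue
        for field, expected in REQUIRED.items():
            child = node.find(field)
            if child is None or (child.text or "").strip() != expected:
                problems.append((section, field))
    return problems

# Sample config with a correct part_log block but no query_log block at all.
sample = """<clickhouse>
  <part_log>
    <database>system</database><table>part_log</table>
    <partition_by>event_date</partition_by><order_by>event_time</order_by>
    <ttl>event_date + INTERVAL 15 DAY DELETE</ttl>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
  </part_log>
</clickhouse>"""
print(check_log_config(sample))
```

Any pair the function reports should be fixed in config.xml, after which the corresponding table must be dropped and recreated as described above.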

Step 2: Set compatibility on the destination cluster

Configure the destination cluster to use the same compatibility level as the self-managed cluster. This minimizes changes required after migration.

  1. Check the version of both clusters:

    SELECT version();
  2. If the versions differ, set the compatibility parameter on the destination cluster to match the self-managed cluster version. For example:

    SET GLOBAL compatibility = '22.8';

Step 3 (Optional): Enable the MaterializedMySQL engine

If the self-managed cluster has tables using the MaterializedMySQL engine, run the following on the destination cluster:

SET GLOBAL allow_experimental_database_materialized_mysql = 1;

The ClickHouse community no longer maintains the MaterializedMySQL engine. After migration, use Data Transmission Service (DTS) to synchronize MySQL data with ReplacingMergeTree tables instead. See Use seamless integration to synchronize data from ApsaraDB RDS for MySQL to an ApsaraDB for ClickHouse cluster and Synchronize data from ApsaraDB RDS for MySQL to an ApsaraDB for ClickHouse cluster.

Step 4: Create a migration task

  1. Log in to the ApsaraDB for ClickHouse console.

  2. On the Clusters page, select Clusters of Community-compatible Edition and click the destination cluster ID.

  3. In the left navigation pane, click Data Migration and Synchronization > Migrate from ClickHouse.

  4. Click Create Migration Task.

  5. Configure the source cluster connection settings.

    To get the IP addresses and ports of the source cluster, query system.clusters on the source. The condition replica_num = 1 selects the first replica set; you can select other replica sets as needed.

    For a cross-account or cross-region Alibaba Cloud ClickHouse instance:

    SELECT shard_num, replica_num, host_address AS ip, port
    FROM system.clusters
    WHERE cluster = 'default' AND replica_num = 1;

    For a non-Alibaba Cloud ClickHouse instance:

    SELECT shard_num, replica_num, host_address AS ip, port
    FROM system.clusters
    WHERE cluster = '<cluster_name>' AND replica_num = 1;

    Source cluster settings:

    | Field | Description | Example |
    | --- | --- | --- |
    | Source Access Method | Select Express Connect, VPN Gateway, Smart Access Gateway, or Self-managed ClickHouse Clusters on an ECS Instance. | |
    | Cluster Name | A name for the source cluster. Digits and lowercase letters only. | source |
    | Source Cluster Name | The internal cluster name from SELECT * FROM system.clusters;. | default |
    | VPC IP Address | TCP IP:PORT pairs for each shard, comma-separated. Do not use VPC domain names or SLB addresses. Format: IP:PORT,IP:PORT,... | 192.168.0.5:9000,192.168.0.6:9000 |
    | Database Account | Account for the source cluster. | test |
    | Database Password | Password for the source cluster account. | |

    Destination cluster settings:

    | Field | Description |
    | --- | --- |
    | Database Account | Account for the destination cluster. |
    | Database Password | Password for the destination cluster account. |

  6. Click Test Connection and Proceed. If the test fails, fix the connection settings and try again.

  7. Review the migration content and click Next: Pre-detect and Start Synchronization.

  8. The system runs three pre-checks. If a check fails, follow the error guidance in Pre-check error messages and solutions. If all checks pass:

    1. Read the impact prompts carefully.

    2. Click Completed to create and start the task.

    Pre-check items:

    | Check item | Requirement |
    | --- | --- |
    | Instance Status Detection | No running management tasks (scale-out, upgrade, or downgrade) on either cluster. |
    | Storage Space Detection | Destination cluster storage ≥ 1.2× the used space of the self-managed cluster. |
    | Local Table and Distributed Table Detection | Each local table in the self-managed cluster must have exactly one corresponding distributed table. |

Step 5: Assess whether the migration can complete

Skip this step if the source cluster write speed is less than 20 MB/s.

If the source cluster write speed exceeds 20 MB/s, check that the destination cluster can keep up:

  1. View the Disk throughput metric on the destination cluster. See View cluster monitoring information.

  2. Compare write speeds:

    • Destination write speed > source write speed: migration is likely to succeed. Continue to Step 6.

    • Destination write speed < source write speed: migration is likely to fail. Cancel the task and use Manual migration instead.
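The comparison in this step amounts to a simple rule, sketched here for illustration (the 20 MB/s threshold comes from the step above; the function name is made up):

```python
def console_migration_feasible(source_mb_s: float, dest_mb_s: float) -> bool:
    """Decide whether console migration is likely to keep up with source writes."""
    THRESHOLD = 20.0  # MB/s; below this, the assessment can be skipped
    if source_mb_s < THRESHOLD:
        return True  # slow writers: migration can catch up regardless
    # Otherwise the destination must write faster than the source.
    return dest_mb_s > source_mb_s

print(console_migration_feasible(10, 5))    # below threshold: feasible
print(console_migration_feasible(50, 80))   # destination keeps up
print(console_migration_feasible(50, 30))   # likely to fail: use manual migration
```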

Step 6: Monitor the migration task

  1. Log in to the ApsaraDB for ClickHouse console.

  2. On the Clusters page, select Clusters of Community-compatible Edition and click the destination cluster ID.

  3. In the left navigation pane, click Data Migration and Synchronization > Migrate from ClickHouse.

  4. Monitor the task status and Running Information column.

Important

Track the estimated remaining time shown in the Running Information column. When the task is nearing completion, follow Step 7 to stop writes and handle Kafka and RabbitMQ tables.

To view task details, click View Details in the Actions column. The details page shows migrated table schemas, database structures, and any error messages.

After the task ends (status: Completed or Canceled), the View Details page is cleared. To view migrated table schemas afterward, run:
SELECT `database`, `name`, `engine_full`
FROM `system`.`tables`
WHERE `database` NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema');

Task statuses:

| Status | Description |
| --- | --- |
| Running | Preparing the environment and resources. |
| Initialization | Initializing the migration task. |
| Configuration Migration | Migrating cluster configuration. |
| Schema Migration | Migrating databases, MergeTree family tables, and distributed tables. |
| Data Migration | Incrementally migrating data from MergeTree family tables. |
| Other Schema Migration | Migrating materialized views and non-MergeTree table schemas. |
| Data Check | Verifying that data volume in the destination matches the self-managed cluster. |
| Post-configuration | Cleaning up the migration environment and re-enabling writes to the source cluster. |
| Completed | Migration is complete. |
| Canceled | Migration was canceled. |

Step 7: Stop writes and handle Kafka and RabbitMQ tables

Before switching business traffic to the destination cluster, stop all writes to the self-managed cluster and manually create Kafka and RabbitMQ tables on the destination.

  1. On the self-managed cluster, identify the tables to process:

    SELECT * FROM system.tables WHERE engine IN ('RabbitMQ', 'Kafka');
  2. Get the DDL for each table:

    SHOW CREATE TABLE <target_table_name>;
  3. On the destination cluster, run the DDL statements to create the tables. See Connect to a ClickHouse cluster using DMS for connection instructions.

  4. On the self-managed cluster, delete the Kafka and RabbitMQ tables.

Important

When deleting a Kafka table, also delete all materialized views that reference it. Leaving them in place causes the migration task to fail.

Step 8: Complete the migration

After writes to the self-managed cluster are stopped, finalize the migration. This migrates remaining data, runs a data volume check, and migrates any outstanding database and table schemas.

Important

Complete this within 5 days of creating the migration task. Longer-running tasks accumulate excessive metadata in the destination cluster, which slows migration. If the data volume check fails, cancel the task and create a new one.

  1. Log in to the ApsaraDB for ClickHouse console.

  2. On the Clusters page, select Clusters of Community-compatible Edition and click the destination cluster ID.

  3. In the left navigation pane, click Data Migration and Synchronization > Migrate from ClickHouse.

  4. In the Actions column for the task, click Complete Migration.

  5. In the dialog box, click OK.

Step 9: Migrate data for non-MergeTree tables

The migration task copies table schemas for non-MergeTree tables (external tables, Log tables) but not their data. Migrate the data manually.

  1. On the self-managed cluster, identify tables that need manual data migration:

    SELECT
        `database` AS database_name,
        `name` AS table_name,
        `engine`
    FROM `system`.`tables`
    WHERE (`engine` NOT LIKE '%MergeTree%')
      AND (`engine` != 'Distributed')
      AND (`engine` != 'MaterializedView')
      AND (`engine` NOT IN ('Kafka', 'RabbitMQ'))
      AND (`database` NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema'))
      AND (`database` NOT IN (
          SELECT `name`
          FROM `system`.`databases`
          WHERE `engine` IN ('MySQL', 'MaterializedMySQL', 'MaterializeMySQL', 'Lazy', 'PostgreSQL', 'MaterializedPostgreSQL', 'SQLite')
      ))
  2. On the destination cluster, migrate the data using the remote function. See Migrate data using the remote function in the Manual migration section.

Other operations

When a migration task completes, its status changes to Completed, but the list may not refresh immediately. Refresh the page to see the latest status.

| Operation | What it does | Impact | Use when |
| --- | --- | --- | --- |
| Stop Migration | Stops data migration immediately. Skips the data volume check and migrates remaining schemas. | Destination cluster restarts. | Migrating a portion of data for testing without stopping writes to the self-managed cluster. |
| Cancel Migration | Forcibly cancels the task. Skips the data volume check and does not migrate remaining schemas. | Destination cluster restarts. Database and table schemas may be incomplete. Clear migrated data from the destination before retrying. | Migration is affecting the self-managed cluster and you need to stop immediately and re-enable writes. |

Stop migration

To stop a migration task:

  1. Log in to the ApsaraDB for ClickHouse console.

  2. On the Clusters page, select Clusters of Community-compatible Edition and click the destination cluster ID.

  3. In the left navigation pane, click Data Migration and Synchronization > Migrate from ClickHouse.

  4. In the Actions column, click Stop Migration and confirm.

Cancel migration

To cancel a migration task:

  1. Log in to the ApsaraDB for ClickHouse console.

  2. On the Clusters page, select Clusters of Community-compatible Edition and click the destination cluster ID.

  3. In the left navigation pane, click Data Migration and Synchronization > Migrate from ClickHouse.

  4. In the Actions column, click Cancel Migration and confirm.

Manual migration

Use manual migration when you need granular control: choosing specific databases or tables, migrating large cold or hot data volumes, or handling a cluster that cannot use the console migration tool.

Method 1: BACKUP and RESTORE commands

See Use the BACKUP and RESTORE commands to back up and restore data.

Method 2: INSERT INTO SELECT with the remote function

This two-step method migrates table schemas first, then data.

Step 1: Migrate table schemas (DDL)

Install clickhouse-client at the same version as the destination ApsaraDB for ClickHouse cluster. See clickhouse-client for download instructions.

  1. List databases in the self-managed cluster:

    clickhouse-client \
      --host="<old_host>" --port="<old_port>" \
      --user="<old_username>" --password="<old_password>" \
      --query="SHOW databases" > database.list

    The system database does not need to be migrated.

  2. List tables in the self-managed cluster. Use either approach:

    Tables with names starting with .inner. are internal representations of materialized views and do not need to be migrated.

    clickhouse-client \
      --host="<old_host>" --port="<old_port>" \
      --user="<old_username>" --password="<old_password>" \
      --query="SHOW tables from <database_name>" > table.list

    Or query all databases and tables at once:

    SELECT DISTINCT database, name FROM system.tables WHERE database != 'system';
  3. Export the DDL for all tables in a database:

    clickhouse-client \
      --host="<old_host>" --port="<old_port>" \
      --user="<old_username>" --password="<old_password>" \
      --query="SELECT concat(create_table_query, ';') FROM system.tables WHERE database='<database_name>' FORMAT TabSeparatedRaw" > tables.sql
  4. Create the target database on the destination cluster, then import the DDL:

    clickhouse-client \
      --host="<new_host>" --port="<new_port>" \
      --user="<new_username>" --password="<new_password>" \
      -d '<database_name>' --multiquery < tables.sql

    Parameters used in the commands above:

    | Parameter | Description |
    | --- | --- |
    | <old_host> | Address of the self-managed cluster. |
    | <old_port> | Port of the self-managed cluster. |
    | <old_username> | Account with DML read/write, settings, and DDL permissions. |
    | <old_password> | Password for the account. |
    | <new_host> | Address of the destination ApsaraDB for ClickHouse cluster. |
    | <new_port> | Port of the destination ApsaraDB for ClickHouse cluster. |
    | <new_username> | Account with DML read/write, settings, and DDL permissions. |
    | <new_password> | Password for the account. |

Step 2: Migrate data

Three options are available. Use the remote function (Option A) when possible, as it supports partition-level migration and reduces resource usage.

Option A: Migrate data using the remote function

Run the following on the destination cluster:

  1. (Optional) Reduce network traffic by enabling ZSTD compression:

    SET network_compression_method = 'ZSTD';

    To check the current setting:

    SELECT * FROM system.settings WHERE name = 'network_compression_method';
  2. Migrate data table by table, optionally filtering by partition:

    For version 20.8, try remoteRaw first. If migration fails, request a minor version upgrade:

    INSERT INTO <new_database>.<new_table>
    SELECT *
    FROM remoteRaw('<old_endpoint>', <old_database>.<old_table>, '<username>', '<password>')
    [WHERE _partition_id = '<partition_id>']
    SETTINGS max_execution_time = 0, max_bytes_to_read = 0, log_query_threads = 0, max_result_rows = 0;

    Important

    Always use the VPC internal endpoint, not the public endpoint.

    Parameters:

    | Parameter | Description |
    | --- | --- |
    | <new_database> | Target database name on the destination cluster. |
    | <new_table> | Target table name on the destination cluster. |
    | <old_endpoint> | Source endpoint (see formats below). |
    | <old_database> | Source database name. |
    | <old_table> | Source table name. |
    | <username> | Source cluster account. |
    | <password> | Source cluster password. |
    | max_execution_time | Maximum query run time. 0 = no limit. |
    | max_bytes_to_read | Maximum bytes read from source. 0 = no limit. |
    | log_query_threads | Log thread info for queries. 0 = disabled. |
    | max_result_rows | Maximum rows in result. 0 = no limit. |
    | _partition_id | Partition ID to filter. Filtering by partition reduces resource usage (recommended). |

    Endpoint formats for `<old_endpoint>`:

    | Source type | Format | Example |
    | --- | --- | --- |
    | Self-managed ClickHouse | IP:TCP_port | 192.168.0.5:9000 |
    | ApsaraDB for ClickHouse Community-Compatible Edition | VPC_internal_address:3306 | cc-2zeqhh5v7y6q*****.clickhouse.ads.aliyuncs.com:3306 |
    | ApsaraDB for ClickHouse Enterprise Edition | VPC_internal_address:9000 | cc-bp1anv7jo84ta*****clickhouse.clickhouseserver.rds.aliyuncs.com:9000 |

    Important

    For ApsaraDB for ClickHouse sources, use the VPC endpoint of the source instance, not the public endpoint. The ports 3306 and 9000 are fixed values.

    The full statement, using remote:
    INSERT INTO <new_database>.<new_table>
    SELECT *
    FROM remote('<old_endpoint>', <old_database>.<old_table>, '<username>', '<password>')
    [WHERE _partition_id = '<partition_id>']
    SETTINGS max_execution_time = 0, max_bytes_to_read = 0, log_query_threads = 0, max_result_rows = 0;

    To look up partition IDs and part counts:

    SELECT partition_id, count(*) AS part_count
    FROM clusterAllReplicas(default, system, parts)
    WHERE `database` = '<old_database>' AND `table` = '<old_table>'
    GROUP BY partition_id;
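Partition-by-partition migration can be scripted by generating one INSERT statement per partition ID returned by the query above. A minimal sketch; the function name and the sample partition list are illustrative, and executing the generated statements is left to whatever client you use:

```python
def partition_insert_statements(new_db, new_table, old_endpoint,
                                old_db, old_table, user, password, partition_ids):
    """Build one INSERT ... SELECT ... remote(...) statement per partition ID."""
    stmts = []
    for pid in partition_ids:
        stmts.append(
            f"INSERT INTO {new_db}.{new_table} "
            f"SELECT * FROM remote('{old_endpoint}', {old_db}.{old_table}, "
            f"'{user}', '{password}') "
            f"WHERE _partition_id = '{pid}' "
            # Same unlimited-read settings as the statement above.
            "SETTINGS max_execution_time = 0, max_bytes_to_read = 0, "
            "log_query_threads = 0, max_result_rows = 0;"
        )
    return stmts

# Example: partition IDs as returned by the system.parts query above.
for sql in partition_insert_statements(
        "db_new", "t", "192.168.0.5:9000", "db_old", "t",
        "user", "pass", ["202401", "202402"]):
    print(sql)
```

Running the statements one at a time keeps resource usage on both clusters bounded, which is why partition filtering is recommended.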

Option B: Export and import via CSV files

Export from the self-managed cluster:

clickhouse-client \
  --host="<old_host>" --port="<old_port>" \
  --user="<old_username>" --password="<old_password>" \
  --query="SELECT * FROM <database_name>.<table_name> FORMAT CSV" > table.csv

Import to the destination cluster:

clickhouse-client \
  --host="<new_host>" --port="<new_port>" \
  --user="<new_username>" --password="<new_password>" \
  --query="INSERT INTO <database_name>.<table_name> FORMAT CSV" < table.csv

Option C: Stream via Linux pipe

Pipe export and import in a single command without intermediate files:

clickhouse-client \
  --host="<old_host>" --port="<old_port>" \
  --user="<old_username>" --password="<old_password>" \
  --query="SELECT * FROM <database_name>.<table_name> FORMAT CSV" | \
clickhouse-client \
  --host="<new_host>" --port="<new_port>" \
  --user="<new_username>" --password="<new_password>" \
  --query="INSERT INTO <database_name>.<table_name> FORMAT CSV"

Pre-check error messages and solutions

| Error message | Cause | Solution |
| --- | --- | --- |
| Missing unique distributed table or sharding_key not set. | A local table in the self-managed cluster has no corresponding distributed table. | Create a distributed table for the local table before migrating. |
| The corresponding distributed table is not unique. | A local table has multiple distributed tables. | Delete the extra distributed tables, keeping only one. |
| MergeTree table on multiple replica cluster. | The self-managed cluster is a multi-replica cluster with non-replicated tables. | See Non-replicated tables in multi-replica clusters in the FAQ section. |
| Data reserved table on destination cluster. | The destination already has data in the corresponding table. | Delete the corresponding table from the destination cluster. |
| Columns of distributed table and local table conflict. | Column definitions in the distributed table and local table do not match. | Rebuild the distributed table to match the local table. |
| Storage is not enough. | Insufficient storage space on the destination cluster. | Upgrade the destination cluster disk. Destination total space must be ≥ 1.2× the self-managed cluster's used space. See Upgrade/Downgrade and scale-out/scale-in of community-compatible clusters. |
| Missing system table. | A required system table is missing in the self-managed cluster. | Update config.xml to enable the required system tables. See Step 1: Enable system tables. |
| The table is incomplete across different nodes. | A table is missing on some shards. | Create the table on all shards. For materialized view inner tables, rename the inner table and rebuild the materialized view. See The inner table of a materialized view is inconsistent across shards. |

Calculate post-migration merge time

After migration, the destination cluster runs high-frequency merges. This increases I/O usage and request latency. If your workload is latency-sensitive, consider upgrading the instance type or ESSD performance level to shorten this period. See Vertical scaling, scale-out, and scale-in of community-compatible clusters.

This formula applies to both single-replica and master-replica clusters:

Total merge time = MAX(hot data merge time, cold data merge time)

  • Hot data merge time = hot data per node × 2 ÷ MIN(instance type bandwidth, disk bandwidth × n)

  • Cold data merge time = (cold data ÷ nodes) ÷ MIN(instance type bandwidth, OSS read bandwidth) + (cold data ÷ nodes) ÷ MIN(instance type bandwidth, OSS write bandwidth)

How to get each value:

| Value | How to obtain |
| --- | --- |
| Hot data per node | Disk Usage - Single-Node Statistics in View cluster monitoring information. |
| Cold data | Cold storage usage in View cluster monitoring information. |
| Number of nodes | SELECT count() FROM system.clusters WHERE cluster = 'default' AND replica_num = 1; |
| n (disks per node) | SELECT count() FROM system.disks WHERE type = 'local'; |
| Disk bandwidth | Maximum throughput per disk (MB/s) in the ESSD performance level table. |
| OSS read bandwidth | Total Intranet and Internet Download Bandwidth in the OSS bandwidth table. |
| OSS write bandwidth | Total Intranet and Internet Upload Bandwidth in the OSS bandwidth table. |

Instance type bandwidth reference (minimum values):

| Specification | Bandwidth (MB/s) |
| --- | --- |
| Standard 8-core 32 GB | 250 |
| Standard 16-core 64 GB | 375 |
| Standard 24-core 96 GB | 500 |
| Standard 32-core 128 GB | 625 |
| Standard 64-core 256 GB | 1,250 |
| Standard 80-core 384 GB | 2,000 |
| Standard 104-core 384 GB | 2,000 |

Actual bandwidth varies by machine type. These values are minimum references.
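As a worked example of the formula above, consider a hypothetical 2-node cluster of Standard 16-core 64 GB instances (375 MB/s), one disk per node at 350 MB/s, 2 TB of hot data per node, and 1 TB of cold data with OSS read and write bandwidth at 300 MB/s. All numbers here are illustrative inputs, not measurements:

```python
def merge_time_seconds(hot_per_node_gb, cold_total_gb, nodes,
                       instance_bw, disk_bw, disks_per_node,
                       oss_read_bw, oss_write_bw):
    """Total post-migration merge time per the formula above (bandwidths in MB/s)."""
    MB_PER_GB = 1024
    # Hot data is rewritten once (read + write, hence the factor of 2),
    # limited by the slower of instance bandwidth and aggregate disk bandwidth.
    hot = hot_per_node_gb * MB_PER_GB * 2 / min(instance_bw, disk_bw * disks_per_node)
    # Cold data is read back from OSS and then written out again.
    cold_per_node = cold_total_gb / nodes * MB_PER_GB
    cold = (cold_per_node / min(instance_bw, oss_read_bw)
            + cold_per_node / min(instance_bw, oss_write_bw))
    # Hot and cold merges proceed in parallel, so the total is the maximum.
    return max(hot, cold)

t = merge_time_seconds(hot_per_node_gb=2048, cold_total_gb=1024, nodes=2,
                       instance_bw=375, disk_bw=350, disks_per_node=1,
                       oss_read_bw=300, oss_write_bw=300)
print(f"{t / 3600:.1f} hours")
```

In this example the hot data merge dominates, so upgrading the disk's performance level would shorten the window more than adding OSS bandwidth.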

FAQ

Too many partitions for single INSERT block

Error: Too many partitions for single INSERT block (more than 100)

Each INSERT creates data parts across partitions. When a single INSERT writes to more than max_partitions_per_insert_block partitions (default: 100), ClickHouse rejects the operation to prevent performance degradation during merges and queries.

Fix this by reducing the number of partitions per INSERT, or increase the limit if unavoidable:

SET GLOBAL ON CLUSTER DEFAULT max_partitions_per_insert_block = <value>;

Keep this value as low as possible. The default of 100 is the community recommendation. Restore the default after the batch import completes.
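If you re-batch the import instead of raising the limit, one approach is to group rows by partition key and cap the number of distinct partitions each INSERT touches. A hedged sketch of the pure batching logic (wiring the batches to a client is up to you; the sample data is invented):

```python
from itertools import groupby

def batch_by_partition(rows, partition_key, max_partitions=100):
    """Split rows into batches, each touching at most max_partitions partitions."""
    rows = sorted(rows, key=partition_key)          # group rows of a partition together
    partitions = [list(g) for _, g in groupby(rows, key=partition_key)]
    batches, current, count = [], [], 0
    for part in partitions:
        if count == max_partitions:                 # flush before exceeding the limit
            batches.append(current)
            current, count = [], 0
        current.extend(part)
        count += 1
    if current:
        batches.append(current)
    return batches

# 250 daily partitions with the default limit of 100 -> 3 INSERT batches
# instead of one oversized INSERT that would be rejected.
rows = [{"day": d % 250, "v": d} for d in range(1000)]
batches = batch_by_partition(rows, lambda r: r["day"])
print(len(batches))
```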

Connection from destination to self-managed cluster fails

The self-managed cluster likely has a firewall or whitelist configured. Add the IPv4 CIDR Block of the destination cluster's vSwitch ID to the self-managed cluster's whitelist. For instructions on finding the CIDR block, see the Prerequisites section.

Non-replicated tables in multi-replica clusters

Why non-replicated tables are blocked: In multi-replica clusters, the migration tool picks one replica as the data source. If non-replicated MergeTree tables exist, each replica may hold different data — migrating from a single replica causes data loss.

For example: if replica r0 holds rows 1, 2, and 3, and replica r1 holds rows 4 and 5, migrating from r0 loses rows 4 and 5.
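The loss in the example above can be shown with plain sets (a toy illustration only):

```python
# Rows held by each replica of a non-replicated MergeTree table.
r0 = {1, 2, 3}
r1 = {4, 5}

migrated = r0           # the migration tool reads from a single replica
actual = r0 | r1        # the data the cluster as a whole really holds
lost = actual - migrated
print(sorted(lost))     # rows that never reach the destination
```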
