Use the BACKUP and RESTORE commands to back up ClickHouse data to Object Storage Service (OSS) for long-term archiving, or to migrate a self-managed ClickHouse cluster to ApsaraDB for ClickHouse. Both are built-in ClickHouse SQL commands that operate on databases, tables, and other objects.
Limitations
You cannot back up or restore data between the Community-compatible Edition and the Enterprise Edition. Their database and table engines are incompatible.
CPU utilization increases during backup and restore operations. Adjust the backup_threads and restore_threads settings to limit resource usage. Memory usage is not significantly affected.
Prerequisites
Before you begin, ensure that you have:
ClickHouse access granted to the OSS bucket. For more information, see Common examples of bucket policies
The endpoint for your region
Syntax
-- Commands
BACKUP | RESTORE [ASYNC]
-- What to back up or restore
TABLE [db.]table_name [AS [db.]table_name_in_backup]
[PARTITION[S] partition_expr [,...]] |
DICTIONARY [db.]dictionary_name [AS [db.]name_in_backup] |
DATABASE database_name [AS database_name_in_backup]
[EXCEPT TABLES ...] |
TEMPORARY TABLE table_name [AS table_name_in_backup] |
VIEW view_name [AS view_name_in_backup] |
ALL [EXCEPT {TABLES|DATABASES}...] [,...]
-- Cluster scope (optional)
[ON CLUSTER 'cluster_name']
-- Destination (BACKUP) or source (RESTORE)
TO|FROM
File('<path>/<filename>') |
Disk('<disk_name>', '<path>/') |
S3('<S3 endpoint>/<path>', '<Access Key ID>', '<Access Key Secret>')
-- Incremental backup base (optional)
[SETTINGS base_backup =
File('<path>/<filename>') |
Disk(...) |
S3('<S3 endpoint>/<path>', '<Access Key ID>', '<Access Key Secret>')]
Parameters
| Parameter | Description |
|---|---|
| ASYNC | Runs the operation asynchronously. The command returns immediately and the operation continues in the background. |
| TABLE [db.]table_name [AS ...] | Backs up or restores a specific table. Use AS to rename it in the backup. |
| PARTITION[S] partition_expr | Limits the backup or restore to specific partitions of a table. |
| DICTIONARY [db.]dictionary_name [AS ...] | Backs up or restores a dictionary object. |
| DATABASE database_name [AS ...] | Backs up or restores an entire database. Use EXCEPT TABLES to exclude specific tables. |
| ALL | Backs up or restores all objects. Use EXCEPT to exclude specific tables or databases. |
| ON CLUSTER 'cluster_name' | Runs the operation across all nodes in the cluster. |
| TO \| FROM | Specifies direction: TO for a backup destination, FROM for a restore source. |
| File(...) | Backs up to or restores from a local file. |
| Disk(...) | Backs up to or restores from a named disk configured in ClickHouse. |
| S3(...) | Backs up to or restores from an S3-compatible endpoint (such as OSS). Use a folder path for cluster backups; ZIP files are not supported for multi-node deployments. |
| SETTINGS base_backup | Specifies the base backup for an incremental backup. Only data changed since the base backup is included. |
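The EXCEPT TABLES clause listed above has no worked example later in this topic. The following is a minimal sketch that follows the ordering shown in the syntax block; the database name, the excluded table default.staging, and the bucket path are placeholders:

```sql
-- Back up every table in the default database except default.staging.
BACKUP DATABASE default EXCEPT TABLES default.staging
ON CLUSTER default
TO S3('https://<yourBucketName>.<yourEndpoint>/db_backup/', '<yourAccessKeyID>', '<yourAccessKeySecret>')
```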
Back up data
Community-compatible Edition
Upload directly to OSS
BACKUP TABLE default.data ON CLUSTER default
TO S3('https://<yourBucketName>.<yourEndpoint>/data/', '<yourAccessKeyID>', '<yourAccessKeySecret>')
Back up to a local disk, then upload to OSS
ApsaraDB for ClickHouse does not support backing up data to a local disk. Use this approach only for self-managed ClickHouse clusters.
Create the file /etc/clickhouse-server/config.d/backup_disk.xml to define a backup disk. The <storage_configuration> block defines a disk named backups at the path /backups/. The <backups> block restricts backups to that disk and path.
<clickhouse>
    <storage_configuration>
        <disks>
            <backups>
                <type>local</type>
                <path>/backups/</path>
            </backups>
        </disks>
    </storage_configuration>
    <backups>
        <allowed_disk>backups</allowed_disk>
        <allowed_path>/backups/</allowed_path>
    </backups>
</clickhouse>
Back up the table to the local disk.
BACKUP TABLE test.table TO Disk('backups', 'data_1.zip')
Upload the backup file to OSS using ossutil.
ossutil cp data_1.zip oss://<yourBucketName>/data/data_1.zip \
  -i <yourAccessKeyID> \
  -k <yourAccessKeySecret> \
  -e <yourEndpoint>
Enterprise Edition
BACKUP TABLE default.data
TO S3('https://<yourBucketName>.<yourEndpoint>/data/data_1.zip', '<yourAccessKeyID>', '<yourAccessKeySecret>')
Restore data
Community-compatible Edition
-- Restore to a single node from a ZIP file
RESTORE TABLE default.data
FROM S3('https://<yourBucketName>.<yourEndpoint>/data/data_1.zip', '<yourAccessKeyID>', '<yourAccessKeySecret>')
-- Restore to all nodes from a directory
RESTORE TABLE default.data ON CLUSTER default
FROM S3('https://<yourBucketName>.<yourEndpoint>/data/', '<yourAccessKeyID>', '<yourAccessKeySecret>')
Enterprise Edition
RESTORE TABLE default.data
FROM S3('https://<yourBucketName>.<yourEndpoint>/data/data_1.zip', '<yourAccessKeyID>', '<yourAccessKeySecret>')
Incremental backups
Use SETTINGS base_backup to back up only the data changed since a previous backup. Incremental backups are useful for large databases or frequent backup schedules where a full backup each time would be too costly.
-- Step 1: Create a base (full) backup
BACKUP TABLE default.data ON CLUSTER default
TO S3('https://<yourBucketName>.<yourEndpoint>/base_backup/', '<yourAccessKeyID>', '<yourAccessKeySecret>')
-- Step 2: Create an incremental backup against the base
BACKUP TABLE default.data ON CLUSTER default
TO S3('https://<yourBucketName>.<yourEndpoint>/incremental_backup/', '<yourAccessKeyID>', '<yourAccessKeySecret>')
SETTINGS base_backup = S3('https://<yourBucketName>.<yourEndpoint>/base_backup/', '<yourAccessKeyID>', '<yourAccessKeySecret>')
-- Step 3: Restore from the incremental backup
RESTORE TABLE default.data ON CLUSTER default
FROM S3('https://<yourBucketName>.<yourEndpoint>/incremental_backup/', '<yourAccessKeyID>', '<yourAccessKeySecret>')
When to use full vs. incremental backups:
| Strategy | Use case |
|---|---|
| Full backup | Smaller databases or critical data where restore simplicity matters |
| Incremental backup | Larger databases or frequent backup schedules where cost matters |
| Both combined | For example, weekly full backups and daily incremental backups |
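The combined strategy in the last row can be sketched as a small scheduling helper. This is an illustrative Python sketch, not part of ClickHouse: the table name, bucket paths, the Sunday-full schedule, and the folder-naming convention are all assumptions, and the credentials are placeholders.

```python
from datetime import date, timedelta

def backup_statement(today: date, bucket_url: str) -> str:
    """Build a BACKUP statement: a full backup on Sundays, otherwise an
    incremental backup based on the most recent Sunday's full backup."""
    creds = "'<yourAccessKeyID>', '<yourAccessKeySecret>'"
    if today.weekday() == 6:  # Sunday -> weekly full backup
        return (
            f"BACKUP TABLE default.data ON CLUSTER default "
            f"TO S3('{bucket_url}/full_{today.isoformat()}/', {creds})"
        )
    # Any other day -> incremental backup against the last Sunday's full backup
    last_sunday = today - timedelta(days=today.weekday() + 1)
    return (
        f"BACKUP TABLE default.data ON CLUSTER default "
        f"TO S3('{bucket_url}/incr_{today.isoformat()}/', {creds}) "
        f"SETTINGS base_backup = S3('{bucket_url}/full_{last_sunday.isoformat()}/', {creds})"
    )
```

A scheduler (such as cron) would run the generated statement once a day; the point is that every incremental backup names the same base, so a restore needs only the base and the chosen increment.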
Monitor backup and restore progress
Run operations asynchronously
Add ASYNC to avoid holding a connection open during long-running operations. The command returns immediately and the operation continues in the background.
BACKUP TABLE default.data ON CLUSTER default
TO S3('https://<yourBucketName>.<yourEndpoint>/data/', '<yourAccessKeyID>', '<yourAccessKeySecret>')
ASYNC;
Check progress
Query system.backups to check status:
SELECT * FROM system.backups;
Performance tuning
View performance parameters
-- User-level parameters
SELECT * FROM system.settings WHERE name LIKE '%backup%' OR name LIKE '%restore%';
-- Server-level parameters
SELECT * FROM system.server_settings WHERE name LIKE '%backup%' OR name LIKE '%restore%';
Backup parameters
| Scope | Parameter | Description |
|---|---|---|
| Server | backup_threads | Maximum number of threads for a backup. Default: 16. Requires restart. |
| Server | max_backup_bandwidth_for_server | Total bandwidth cap for all concurrent backups on a single server. Requires restart. |
| Server | max_backups_io_thread_pool_size | Maximum number of threads for backup I/O operations. Requires restart. |
| Server | max_backups_io_thread_pool_free_size | Maximum number of idle threads in the backup I/O thread pool. Requires restart. |
| User | max_backup_bandwidth | Bandwidth cap for a single backup job. |
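Unlike the server-level parameters, max_backup_bandwidth can be changed per session without a restart. A minimal sketch; the value is illustrative:

```sql
-- Cap this session's backup jobs at 50 MiB/s (value in bytes per second)
SET max_backup_bandwidth = 52428800;
```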
Restore parameters
| Scope | Parameter | Description |
|---|---|---|
| Server | restore_threads | Maximum number of threads for a restore. Default: 16. Requires restart. |
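Because backup_threads and restore_threads require a restart, they are set in the server configuration rather than per session. The following is a sketch of a config.d override in the same style as the backup_disk.xml file shown earlier; the file name and thread counts are illustrative:

```xml
<!-- /etc/clickhouse-server/config.d/backup_tuning.xml (illustrative) -->
<clickhouse>
    <backup_threads>8</backup_threads>
    <restore_threads>8</restore_threads>
</clickhouse>
```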
FAQ
I get "Not found backup engine S3" — what do I do?
The instance is running a version that doesn't support S3 backups. Upgrade to version 23.8 or later. If you're on a self-managed cluster, back up to a local disk first and then upload to OSS.
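To check which version an instance is running before upgrading, query it with the built-in version() function:

```sql
SELECT version();
```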
I get "Using archives with backups on clusters is disabled" — what do I do?
Multi-node cluster backups don't support ZIP archives. Use a folder path instead:
BACKUP TABLE default.data ON CLUSTER default
TO S3('https://<yourBucketName>.<yourEndpoint>/data/', '<yourAccessKeyID>', '<yourAccessKeySecret>')
The connection dropped after I ran BACKUP — did my backup fail?
No. The backup runs in the background regardless of the connection state. Check progress with:
SELECT * FROM system.backups;
To avoid the issue entirely, add ASYNC to the command:
BACKUP TABLE default.data ON CLUSTER default
TO S3('https://<yourBucketName>.<yourEndpoint>/data/', '<yourAccessKeyID>', '<yourAccessKeySecret>')
ASYNC;How fast are BACKUP and RESTORE?
Speed depends on whichever resource is the bottleneck — CPU, disk throughput, network throughput, or OSS bandwidth.
For ApsaraDB for ClickHouse, upgrade the cluster specifications to increase CPU, disk throughput, and network throughput.
For OSS bandwidth limits, see Limits and performance metrics.
How do I back up or restore nodes one by one?
Query node IP addresses.
SELECT * FROM system.clusters;
Connect directly to the target node using clickhouse-client and disable the mandatory cluster DDL setting.
Note: This setting applies only to ApsaraDB for ClickHouse. Do not use it on self-managed ClickHouse clusters.
SET enforce_on_cluster_default_for_ddl = 0;
Run the BACKUP command.
BACKUP TABLE default.data TO S3('https://<yourBucketName>.<yourEndpoint>/data/data_1.zip', '<yourAccessKeyID>', '<yourAccessKeySecret>')
Run the RESTORE command.
RESTORE TABLE default.data FROM S3('https://<yourBucketName>.<yourEndpoint>/data/data_1.zip', '<yourAccessKeyID>', '<yourAccessKeySecret>')