
E-MapReduce:Use HDFS for hot and cold data separation

Last Updated: Mar 26, 2026

As data ages, queries shift from recent records to historical archives. Keeping all data on local disks is expensive, while moving everything to HDFS sacrifices query performance. E-MapReduce (EMR) ClickHouse clusters solve this by supporting tiered storage through Hadoop Distributed File System (HDFS): recent data stays on local disks for fast queries, and older data moves automatically to HDFS to reduce storage costs — without affecting read or write performance.

Prerequisites

Before you begin, ensure that you have:

  • A ClickHouse cluster of EMR V5.5.0 or later. See Create a ClickHouse cluster

  • An HDFS service deployed in the same virtual private cloud (VPC) as the ClickHouse cluster — for example, the HDFS service in an EMR Hadoop cluster

  • Read and write permissions on the HDFS service

Limitations

Hot/cold data separation using HDFS is only supported on ClickHouse clusters of EMR V5.5.0 or later.

Key concepts

Understanding the storage hierarchy helps you read the configuration code in the steps that follow.

Concept         Description
Disk            A storage device registered in ClickHouse — either a local disk or a remote store such as HDFS.
Volume          An ordered set of disks. Data moves across volumes according to storage policy rules.
Storage policy  A named set of volumes with rules that control how data moves between them, including time-to-live (TTL) expiration.

In this guide, you define an HDFS disk, group local disks into a local volume and the HDFS disk into a remote volume, then apply a TTL rule to move data from local to remote after a specified time.
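Put together, the three concepts map directly onto the storage_configuration XML edited in Step 1. The following sketch shows only the overall nesting — the disk, volume, and policy names match the examples used later, and the endpoint placeholder is yours to fill in:

```xml
<storage_configuration>
    <disks>
        <!-- Disk: a remote disk backed by HDFS. -->
        <disk_hdfs>
            <type>hdfs</type>
            <endpoint>hdfs://${your-hdfs-url}</endpoint>
        </disk_hdfs>
    </disks>
    <policies>
        <!-- Storage policy: a named set of volumes plus movement rules. -->
        <hdfs_ttl>
            <volumes>
                <!-- Volume of local disks: holds hot data. -->
                <local>
                    <disk>disk1</disk>
                </local>
                <!-- Volume with the HDFS disk: holds cold data. -->
                <remote>
                    <disk>disk_hdfs</disk>
                </remote>
            </volumes>
        </hdfs_ttl>
    </policies>
</storage_configuration>
```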

Step 1: Add an HDFS disk in the EMR console

  1. Go to the Configure tab of the ClickHouse service.

    1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

    2. In the top navigation bar, select the region where your cluster resides and select a resource group.

    3. On the EMR on ECS page, find your cluster and click Services in the Actions column.

    4. On the Services tab, click Configure in the ClickHouse section.

  2. Click the server-metrika tab.

  3. Update the storage_configuration parameter.

    1. In disks, add an HDFS disk entry:

      <disk_hdfs>
          <type>hdfs</type>
          <endpoint>hdfs://${your-hdfs-url}</endpoint>
          <min_bytes_for_seek>1048576</min_bytes_for_seek>
          <thread_pool_size>16</thread_pool_size>
          <objects_chunk_size_to_delete>1000</objects_chunk_size_to_delete>
      </disk_hdfs>
      Parameter                     Required  Description
      disk_hdfs                     Yes       Name of the disk. Customize as needed.
      type                          Yes       Disk type. Must be hdfs.
      endpoint                      Yes       Endpoint of the HDFS service, typically the NameNode address. If NameNode runs in high availability (HA) mode, the port is usually 8020; otherwise it is 9000.
      min_bytes_for_seek            No        Minimum positive seek offset. If the actual offset is smaller than this value, ClickHouse reads sequentially instead of seeking. Default: 1048576.
      thread_pool_size              No        Number of threads used to process restore requests on this disk. Default: 16.
      objects_chunk_size_to_delete  No        Maximum number of HDFS files deleted in a single batch. Default: 1000.
    2. In policies, add a storage policy that maps local disks to a local volume and the HDFS disk to a remote volume:

      <hdfs_ttl>
          <volumes>
            <local>
              <!-- Add all disks that currently use the default storage policy. -->
              <disk>disk1</disk>
              <disk>disk2</disk>
              <disk>disk3</disk>
              <disk>disk4</disk>
            </local>
            <remote>
              <disk>disk_hdfs</disk>
            </remote>
          </volumes>
          <move_factor>0.2</move_factor>
      </hdfs_ttl>
      Note

      You can also add this policy to the existing default storage policy instead of creating a new one.

  4. Save the configuration.

    1. Click Save.

    2. In the dialog box, enter an execution reason, turn on Save and Deliver Configuration, and click Save.

  5. Deploy the client configuration.

    1. On the Configure tab, click Deploy Client Configuration.

    2. In the dialog box, enter an execution reason and click OK.

    3. In the Confirm dialog box, click OK.

Step 2: Verify the configuration

Log on to the ClickHouse cluster over SSH. See Log on to a cluster.

  1. Start the ClickHouse client:

    clickhouse-client -h core-1-1 -m
    Note

    Replace core-1-1 with the name of the core node you logged on to. If the cluster has multiple core nodes, use any one.

  2. Check that the HDFS disk appears in system.disks:

    SELECT * FROM system.disks;

    The configuration is correct if the output includes a row where name is disk_hdfs and type is hdfs:

    ┌─name─────┬─path─────────────────────────────────┬───────────free_space─┬──────────total_space─┬─keep_free_space─┬─type──┐
    │ default  │ /var/lib/clickhouse/                 │          83868921856 │          84014424064 │               0 │ local │
    │ disk1    │ /mnt/disk1/clickhouse/               │          83858436096 │          84003938304 │        10485760 │ local │
    │ disk2    │ /mnt/disk2/clickhouse/               │          83928215552 │          84003938304 │        10485760 │ local │
    │ disk3    │ /mnt/disk3/clickhouse/               │          83928301568 │          84003938304 │        10485760 │ local │
    │ disk4    │ /mnt/disk4/clickhouse/               │          83928301568 │          84003938304 │        10485760 │ local │
    │ disk_hdfs│ /var/lib/clickhouse/disks/disk_hdfs/ │ 18446744073709551615 │ 18446744073709551615 │               0 │ hdfs  │
    └──────────┴──────────────────────────────────────┴──────────────────────┴──────────────────────┴─────────────────┴───────┘
  3. Check that the hdfs_ttl storage policy appears in system.storage_policies:

    SELECT * FROM system.storage_policies;

    The configuration is correct if the output includes two rows for hdfs_ttl — one for the local volume and one for the remote volume:

    ┌─policy_name──┬─volume_name─┬─volume_priority─┬─disks──────────────────────────────┬─volume_type─┬─max_data_part_size─┬─move_factor─┬─prefer_not_to_merge─┐
    │ default      │ single      │               1 │ ['disk1','disk2','disk3','disk4']   │ JBOD        │                  0 │           0 │                   0 │
    │ hdfs_ttl     │ local       │               1 │ ['disk1','disk2','disk3','disk4']   │ JBOD        │                  0 │         0.2 │                   0 │
    │ hdfs_ttl     │ remote      │               2 │ ['disk_hdfs']                       │ JBOD        │                  0 │         0.2 │                   0 │
    └──────────────┴─────────────┴─────────────────┴────────────────────────────────────┴─────────────┴────────────────────┴─────────────┴─────────────────────┘

Step 3: Apply hot/cold separation to tables

Choose one of the following approaches depending on whether you are working with an existing table or creating a new one.

Apply to an existing table

  1. Check the current storage policy of the table:

    SELECT storage_policy
    FROM system.tables
    WHERE database = '<database_name>' AND name = '<table_name>';

    If the output shows the default policy, the table stores all data on local disks. Continue to the next step to add a remote volume.

  2. In the EMR console, add a remote volume to the default storage policy under storage_configuration:

    <default>
      <volumes>
        <single>
          <disk>disk1</disk>
          <disk>disk2</disk>
          <disk>disk3</disk>
          <disk>disk4</disk>
        </single>
        <!-- Add this remote volume. -->
        <remote>
          <disk>disk_hdfs</disk>
        </remote>
      </volumes>
      <!-- When free space on a volume drops below this fraction, data starts moving to the next volume. Set it when the policy has more than one volume. -->
      <move_factor>0.2</move_factor>
    </default>
  3. Modify the TTL on the table to move data older than 5 minutes to the remote volume:

    ALTER TABLE <yourDataName>.<yourTableName>
      MODIFY TTL toStartOfMinute(addMinutes(t, 5)) TO VOLUME 'remote';
  4. Verify that data parts are distributed across volumes:

    SELECT partition, name, path
    FROM system.parts
    WHERE database = '<yourDataName>'
      AND table = '<yourTableName>'
      AND active = 1;

    The separation is working if you see paths on both local disks (/mnt/disk{1..4}/clickhouse/...) and on the HDFS metadata path (/var/lib/clickhouse/disks/disk_hdfs/...). Hot data (within the TTL window) stays on local disks; cold data (past the TTL) moves to HDFS. Example output:

    ┌─partition───────────┬─name─────────────────┬─path──────────────────────────────────────────────────────────────────────────────────────────────────┐
    │ 2022-01-12 11:30:00 │ 1641958200_1_96_3    │ /var/lib/clickhouse/disks/disk_hdfs/store/156/156008ff-41bf-460c-8848-e34fad88c25d/1641958200_1_96_3/ │
    │ 2022-01-12 11:35:00 │ 1641958500_97_124_2  │ /mnt/disk3/clickhouse/store/156/156008ff-41bf-460c-8848-e34fad88c25d/1641958500_97_124_2/             │
    │ 2022-01-12 11:35:00 │ 1641958500_125_152_2 │ /mnt/disk4/clickhouse/store/156/156008ff-41bf-460c-8848-e34fad88c25d/1641958500_125_152_2/            │
    │ 2022-01-12 11:35:00 │ 1641958500_153_180_2 │ /mnt/disk1/clickhouse/store/156/156008ff-41bf-460c-8848-e34fad88c25d/1641958500_153_180_2/            │
    │ 2022-01-12 11:35:00 │ 1641958500_181_186_1 │ /mnt/disk4/clickhouse/store/156/156008ff-41bf-460c-8848-e34fad88c25d/1641958500_181_186_1/            │
    │ 2022-01-12 11:35:00 │ 1641958500_187_192_1 │ /mnt/disk3/clickhouse/store/156/156008ff-41bf-460c-8848-e34fad88c25d/1641958500_187_192_1/            │
    └─────────────────────┴──────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────┘

Create a new table with hot/cold separation

Use the hdfs_ttl storage policy and a TTL ... TO VOLUME 'remote' clause when creating the table.

Syntax

CREATE TABLE <yourDataName>.<yourTableName> [ON CLUSTER cluster_emr]
(
  column1 Type1,
  column2 Type2,
  ...
)
ENGINE = MergeTree()  -- Replicated*MergeTree types are also supported.
PARTITION BY <yourPartitionKey>
ORDER BY <yourOrderByKey>
TTL <yourTtlKey> TO VOLUME 'remote'
SETTINGS storage_policy = 'hdfs_ttl';

Example

The following table keeps data from the last 5 minutes on local disks. After 5 minutes, data moves to the remote volume on HDFS.

CREATE TABLE test.test
(
    `id`  UInt32,
    `t`   DateTime
)
ENGINE = MergeTree()
PARTITION BY toStartOfFiveMinute(t)
ORDER BY id
TTL toStartOfMinute(addMinutes(t, 5)) TO VOLUME 'remote'
SETTINGS storage_policy = 'hdfs_ttl';
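
To watch the policy take effect, you can insert a few rows and check which disk each active part sits on. This is a sketch against the example table above; background TTL moves are asynchronous, so allow some time after the 5-minute window elapses:

```sql
-- Insert sample rows stamped with the current time (hot data).
INSERT INTO test.test VALUES (1, now()), (2, now());

-- After the TTL window passes, parts moved to HDFS report disk_name = 'disk_hdfs';
-- parts that are still hot report a local disk such as 'disk1'.
SELECT name, disk_name
FROM system.parts
WHERE database = 'test' AND table = 'test' AND active = 1;
```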

Additional parameters

The following optional parameters give you finer control over replication behavior.

server-config

merge_tree.allow_remote_fs_zero_copy_replication: Set to true to enable zero-copy replication for Replicated*MergeTree engines. With zero-copy replication, replicas of the same shard share the data files stored on HDFS, and ClickHouse replicates only the metadata that points to the HDFS disk.
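
In the server-config XML, this setting sits inside the merge_tree section. A minimal sketch, assuming the rest of your merge_tree configuration stays unchanged:

```xml
<merge_tree>
    <!-- Replicas of one shard share the HDFS data files; only the
         metadata pointing at the HDFS disk is replicated. -->
    <allow_remote_fs_zero_copy_replication>true</allow_remote_fs_zero_copy_replication>
</merge_tree>
```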

server-users

profile.${your-profile-name}.hdfs_replication: Controls the number of data replicas written to HDFS.
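
For example, to have each write produce two copies of the data on HDFS for users of the default profile (the profile name and replica count here are illustrative), add the setting to the server-users configuration:

```xml
<profiles>
    <default>
        <!-- Number of data replicas written to HDFS for this profile. -->
        <hdfs_replication>2</hdfs_replication>
    </default>
</profiles>
```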