E-MapReduce: Back up an HBase cluster

Last Updated: Mar 26, 2026

Use HBase snapshots to back up your E-MapReduce (EMR) HBase cluster data and restore it to another cluster via Object Storage Service (OSS).

How it works

HBase snapshots capture a point-in-time view of a table without copying data, so the operation completes almost instantly and has minimal impact on cluster performance. The snapshot references the underlying HFiles, and as long as the snapshot exists, those files are preserved even if the original data is later deleted.

To move a snapshot between clusters, you export it to an OSS bucket as an intermediate store. The destination cluster then imports the snapshot from OSS and restores the data.

Prerequisites

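Before you begin, ensure that you have a source EMR cluster and a destination EMR cluster that both run HBase, an OSS bucket that both clusters can reach through the internal endpoint, and an AccessKey pair with read and write access to that bucket.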

Back up and restore an HBase cluster

Step 1: Prepare test data

  1. Log on to the master node of the source cluster using SSH.

  2. Open HBase Shell.

    hbase shell
  3. Create a table.

    create 'test','cf'
  4. Add data to the table.

    put 'test','a','cf:c1',1
    put 'test','a','cf:c2',2
    put 'test','b','cf:c1',3
    put 'test','b','cf:c2',4
    put 'test','c','cf:c1',5
    put 'test','c','cf:c2',6
  5. Exit HBase Shell.

    exit
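
The interactive steps above can also be scripted. The following is a minimal sketch, assuming a POSIX-style shell on the master node. The function name gen_test_data is hypothetical; it only prints the same create and put statements, and the caller pipes them into HBase Shell in non-interactive mode (hbase shell -n).

```shell
# Hypothetical helper: print the HBase Shell statements from Step 1
# for the table name given as the first argument.
gen_test_data() {
  local table="$1" value=1 row col
  printf "create '%s','cf'\n" "$table"
  # Rows a, b, c each get columns cf:c1 and cf:c2 with values 1 through 6.
  for row in a b c; do
    for col in c1 c2; do
      printf "put '%s','%s','cf:%s',%d\n" "$table" "$row" "$col" "$value"
      value=$((value + 1))
    done
  done
}

# Usage on a cluster node (not executed here):
#   gen_test_data test | hbase shell -n
```

Generating the statements separately from running them makes it possible to review the output before anything is written to the cluster.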

Step 2: Create a snapshot

  1. Create a snapshot of the table.

    hbase snapshot create -n test_snapshot -t test
  2. Open HBase Shell to verify the snapshot was created.

    hbase shell
  3. List snapshots.

    list_snapshots

    The output is similar to:

    SNAPSHOT                                           TABLE + CREATION TIME
     test_snapshot                                     test (Tue Aug 18 14:35:28 +0800 2020)
    1 row(s) in 0.2450 seconds
    
    => ["test_snapshot"]
  4. Exit HBase Shell.

    exit
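
To check for the snapshot from a script instead of reading the list by eye, the output of list_snapshots can be filtered. This is a sketch only: has_snapshot is a hypothetical helper, and it assumes the output layout shown above, where the snapshot name is the first field of its line. Other HBase versions may format the listing differently.

```shell
# Hypothetical helper: read list_snapshots output on stdin and succeed
# only if a line has the given snapshot name as its first field.
has_snapshot() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage on a cluster node (not executed here):
#   echo "list_snapshots" | hbase shell -n | has_snapshot test_snapshot \
#     && echo "snapshot exists"
```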

Step 3: Export the snapshot to OSS

Export the snapshot to your OSS bucket using the internal endpoint.

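    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot test_snapshot -copy-to oss://$accessKeyId:$accessKeySecret@$bucket.oss-cn-hangzhou-internal.aliyuncs.com/hbase/snapshot/test

Note: Always use the internal endpoint to access OSS. In the command, replace $accessKeyId, $accessKeySecret, and $bucket with your AccessKey ID, AccessKey secret, and bucket name, and replace cn-hangzhou with the region of your bucket.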
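
The OSS URI in the export command is long and easy to mistype, so it can help to assemble it in one place. The sketch below assumes the credentials are held in the environment variables OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET. Those variable names and the function oss_snapshot_uri are illustrative and are not defined by EMR or HBase.

```shell
# Hypothetical helper: build the internal-endpoint OSS URI for a snapshot.
# Arguments: bucket name, region ID (for example cn-hangzhou), path in the bucket.
# Credentials come from OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET (assumed names).
oss_snapshot_uri() {
  local bucket="$1" region="$2" path="$3"
  # Stop with an error message if either credential variable is unset or empty.
  : "${OSS_ACCESS_KEY_ID:?set OSS_ACCESS_KEY_ID}"
  : "${OSS_ACCESS_KEY_SECRET:?set OSS_ACCESS_KEY_SECRET}"
  # A leading slash on the path is dropped so the URI has exactly one separator.
  printf 'oss://%s:%s@%s.oss-%s-internal.aliyuncs.com/%s\n' \
    "$OSS_ACCESS_KEY_ID" "$OSS_ACCESS_KEY_SECRET" "$bucket" "$region" "${path#/}"
}

# Usage on a cluster node (not executed here):
#   hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot test_snapshot \
#     -copy-to "$(oss_snapshot_uri "$bucket" cn-hangzhou hbase/snapshot/test)"
```

Reading the credentials from the environment keeps them out of the typed command line, although the full URI, including the secret, is still passed to the hbase process as an argument.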

Step 4: Import the snapshot to the destination cluster

  1. Log on to the master node of the destination cluster using SSH.

  2. Import the snapshot from OSS to the local HDFS.

    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot test_snapshot -copy-from oss://$accessKeyId:$accessKeySecret@$bucket.oss-cn-hangzhou-internal.aliyuncs.com/hbase/snapshot/test -copy-to /hbase/
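
After the import, it can be useful to confirm that the snapshot metadata arrived before restoring. ExportSnapshot writes snapshot metadata to a .hbase-snapshot directory under the -copy-to target. The sketch below only computes that path; the helper name snapshot_dir is hypothetical, and it assumes /hbase is the HBase root directory, as in the command above.

```shell
# Hypothetical helper: print the directory where the metadata of an
# imported snapshot is expected, given the -copy-to target and snapshot name.
snapshot_dir() {
  local root="$1" name="$2"
  # Drop a trailing slash so "/hbase/" and "/hbase" give the same result.
  printf '%s/.hbase-snapshot/%s\n' "${root%/}" "$name"
}

# Usage on the destination cluster (not executed here):
#   hdfs dfs -test -d "$(snapshot_dir /hbase/ test_snapshot)" \
#     && echo "snapshot metadata found"
```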

Step 5: Restore data from the snapshot

  1. Open HBase Shell on the destination cluster.

    hbase shell
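  2. Restore the table from the snapshot. If a table named test already exists on the destination cluster, disable it with disable 'test' before you restore, and enable it again afterward.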

    restore_snapshot 'test_snapshot'
  3. Verify the restored data.

    scan 'test'

    The output is similar to:

    ROW                     COLUMN+CELL
     a                      column=cf:c1, timestamp=1472992081375, value=1
     a                      column=cf:c2, timestamp=1472992090434, value=2
     b                      column=cf:c1, timestamp=1472992104339, value=3
     b                      column=cf:c2, timestamp=1472992099611, value=4
     c                      column=cf:c1, timestamp=1472992112657, value=5
     c                      column=cf:c2, timestamp=1472992118964, value=6
    3 row(s) in 0.0540 seconds
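
For a scripted check, the row count can be read from the summary line of the scan output. The following is a sketch that assumes the summary line starts with the count followed by "row(s)", as in the output above. The function name scan_row_count is hypothetical.

```shell
# Hypothetical helper: read HBase Shell scan output on stdin and print
# the number from the last "N row(s)" summary line.
scan_row_count() {
  sed -n 's/^ *\([0-9][0-9]*\) row(s).*/\1/p' | tail -n 1
}

# Usage on the destination cluster (not executed here):
#   rows=$(echo "scan 'test'" | hbase shell -n | scan_row_count)
#   [ "$rows" = "3" ] && echo "row count matches the source table"
```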

Step 6: Clone a new table from the snapshot

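Use clone_snapshot to create a new table from the snapshot. The new table initially references the snapshot's HFiles instead of copying them, and later writes to it do not affect the original table.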

  1. Clone the snapshot into a new table.

    clone_snapshot 'test_snapshot','test_2'
  2. Verify the data in the new table.

    scan 'test_2'

    The output is similar to:

    ROW                     COLUMN+CELL
     a                      column=cf:c1, timestamp=1472992081375, value=1
     a                      column=cf:c2, timestamp=1472992090434, value=2
     b                      column=cf:c1, timestamp=1472992104339, value=3
     b                      column=cf:c2, timestamp=1472992099611, value=4
     c                      column=cf:c1, timestamp=1472992112657, value=5
     c                      column=cf:c2, timestamp=1472992118964, value=6
    3 row(s) in 0.0540 seconds
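
To confirm that the cloned table holds the same cells as the restored one, the two scan outputs can be compared after stripping the header and timing lines. This is a sketch assuming the scan layout shown above, where every cell line contains "column=". The function name scan_cells and the temporary file paths are hypothetical.

```shell
# Hypothetical helper: keep only the cell lines of a scan and collapse
# runs of whitespace, so that two scans can be compared line by line.
scan_cells() {
  # Assigning $1 to itself makes awk rebuild the line with single spaces.
  awk '/column=/ { $1 = $1; print }'
}

# Usage on the destination cluster (not executed here):
#   echo "scan 'test'"   | hbase shell -n | scan_cells > /tmp/test.cells
#   echo "scan 'test_2'" | hbase shell -n | scan_cells > /tmp/test_2.cells
#   diff /tmp/test.cells /tmp/test_2.cells && echo "tables match"
```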