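This topic describes how to back up data in an E-MapReduce (EMR) HBase cluster by creating a table snapshot, exporting the snapshot to Object Storage Service (OSS), and restoring the snapshot in another cluster.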

Prerequisites

Two Hadoop clusters are created, and the HBase and ZooKeeper services are added to the clusters. For more information, see Create a cluster.

Procedure

  1. Log on to the master node of the cluster that you want to back up by using SSH.
  2. Create a table and add data to the table.
    1. Start HBase Shell.
      hbase shell
    2. Create a table.
      create 'test','cf'
    3. Add data to the table.
      put 'test','a','cf:c1',1
      put 'test','a','cf:c2',2
      put 'test','b','cf:c1',3
      put 'test','b','cf:c2',4
      put 'test','c','cf:c1',5
      put 'test','c','cf:c2',6
    4. Exit HBase Shell.
      exit
  3. Create a snapshot and query snapshot information.
    1. Create a snapshot.
      hbase snapshot create -n test_snapshot -t test
    2. Start HBase Shell.
      hbase shell
    3. Query snapshot information.
      list_snapshots
      The following information is returned:
      SNAPSHOT                                           TABLE + CREATION TIME
       test_snapshot                                     test (Tue Aug 18 14:35:28 +0800 2020)
      1 row(s) in 0.2450 seconds
      
      => ["test_snapshot"]
    4. Exit HBase Shell.
      exit
  4. Export the created snapshot to OSS.
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot test_snapshot -copy-to oss://$accessKeyId:$accessKeySecret@$bucket.oss-cn-hangzhou-internal.aliyuncs.com/hbase/snapshot/test
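    Note: Use the internal endpoint to access OSS. Replace $accessKeyId, $accessKeySecret, and $bucket with your AccessKey ID, AccessKey secret, and OSS bucket name.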
  5. Log on to the master node of the other cluster, in which you want to restore the data, by using SSH.
  6. Copy the snapshot from OSS to the /hbase/ directory of this cluster.
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot test_snapshot -copy-from oss://$accessKeyId:$accessKeySecret@$bucket.oss-cn-hangzhou-internal.aliyuncs.com/hbase/snapshot/test -copy-to /hbase/
  7. Restore data from the snapshot and view data in the restored table.
    1. Start HBase Shell.
      hbase shell
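    2. Restore data from the snapshot. If a table named test already exists in this cluster, disable the table before you run the command.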
      restore_snapshot 'test_snapshot'
    3. View data in the restored table.
      scan 'test'
      The following information is returned:
      ROW                     COLUMN+CELL
       a                      column=cf:c1, timestamp=1472992081375, value=1
       a                      column=cf:c2, timestamp=1472992090434, value=2
       b                      column=cf:c1, timestamp=1472992104339, value=3
       b                      column=cf:c2, timestamp=1472992099611, value=4
       c                      column=cf:c1, timestamp=1472992112657, value=5
       c                      column=cf:c2, timestamp=1472992118964, value=6
      3 row(s) in 0.0540 seconds
  8. Create a table based on the snapshot and view data in the table.
    1. Create a table based on the snapshot.
      clone_snapshot 'test_snapshot','test_2'
    2. View data in the table.
      scan 'test_2'
      The following information is returned:
      ROW                     COLUMN+CELL
       a                      column=cf:c1, timestamp=1472992081375, value=1
       a                      column=cf:c2, timestamp=1472992090434, value=2
       b                      column=cf:c1, timestamp=1472992104339, value=3
       b                      column=cf:c2, timestamp=1472992099611, value=4
       c                      column=cf:c1, timestamp=1472992112657, value=5
       c                      column=cf:c2, timestamp=1472992118964, value=6
      3 row(s) in 0.0540 seconds
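
The snapshot commands in steps 3, 4, and 6 can be wrapped in a small script so that the same OSS path is used on both clusters. The following is a minimal sketch, assuming Bash. The function names (oss_snapshot_uri, run, backup_table, fetch_snapshot) and the DRY_RUN variable are hypothetical helpers introduced for illustration and are not part of HBase or EMR. Only the hbase commands themselves come from the procedure above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and DRY_RUN are hypothetical, not part of HBase or EMR.

# Build the OSS URI in the form used in steps 4 and 6.
# Arguments: AccessKey ID, AccessKey secret, bucket name, internal endpoint, path in the bucket.
oss_snapshot_uri() {
  printf 'oss://%s:%s@%s.%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Run a command, or only print it when DRY_RUN=1.
# The printed command contains the AccessKey secret, so do not log it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# On the cluster to back up: create a snapshot and export it to OSS (steps 3 and 4).
# Arguments: table name, snapshot name, OSS URI.
backup_table() {
  local table="$1" snapshot="$2" uri="$3"
  run hbase snapshot create -n "$snapshot" -t "$table" &&
    run hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
      -snapshot "$snapshot" -copy-to "$uri"
}

# On the other cluster: copy the snapshot from OSS to /hbase/ (step 6).
# Arguments: snapshot name, OSS URI.
fetch_snapshot() {
  local snapshot="$1" uri="$2"
  run hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -snapshot "$snapshot" -copy-from "$uri" -copy-to /hbase/
}
```

After you source the script, set DRY_RUN=1 to review the generated commands before you run them. The restore_snapshot and clone_snapshot commands in steps 7 and 8 are still run in HBase Shell.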