
Backup HBase

Last Updated: Dec 05, 2017

E-MapReduce HBase clusters can use HBase's built-in snapshot feature to back up HBase tables and export the backup data to Alibaba Cloud Object Storage Service (OSS). The following example walks through the procedure:

1. Create an HBase cluster.

For details, see the Cluster Creation document.

2. Create a table.

  > create 'test','cf'

3. Add data.

  > put 'test','a','cf:c1',1
  > put 'test','a','cf:c2',2
  > put 'test','b','cf:c1',3
  > put 'test','b','cf:c2',4
  > put 'test','c','cf:c1',5
  > put 'test','c','cf:c2',6

4. Create a snapshot.

  hbase snapshot create -n test_snapshot -t test

Check the snapshot in HBase shell.

  > list_snapshots
  SNAPSHOT TABLE + CREATION TIME
  test_snapshot test (Sun Sep 04 20:31:00 +0800 2016)
  1 row(s) in 0.2080 seconds

5. Export the snapshot to OSS.

  hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot test_snapshot -copy-to oss://$accessKeyId:$accessKeySecret@$bucket.oss-cn-hangzhou-internal.aliyuncs.com/hbase/snapshot/test

Note: Use the OSS intranet (internal) endpoint, as in the oss-cn-hangzhou-internal.aliyuncs.com address above.
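
The export destination is assembled from the AccessKey pair, the bucket name, the endpoint, and a path. As a minimal sketch, the helper functions below (hypothetical names, not part of HBase or E-MapReduce) build that URI and print the ExportSnapshot command for review instead of running it. The argument values in the usage comment are placeholders.

```shell
# Hypothetical helper: build the OSS URI in the form used in step 5.
# Arguments: AccessKey ID, AccessKey secret, bucket, OSS endpoint, path.
oss_snapshot_uri() {
  local key_id="$1" key_secret="$2" bucket="$3" endpoint="$4" path="$5"
  # Strip one leading slash from the path so the URI has a single separator.
  printf 'oss://%s:%s@%s.%s/%s\n' "$key_id" "$key_secret" "$bucket" "$endpoint" "${path#/}"
}

# Hypothetical helper: print (do not execute) the export command.
# Arguments: snapshot name, destination URI.
export_snapshot_cmd() {
  local snapshot="$1" uri="$2"
  printf 'hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot %s -copy-to %s\n' "$snapshot" "$uri"
}

# Usage (placeholder values):
#   uri="$(oss_snapshot_uri "$accessKeyId" "$accessKeySecret" "$bucket" \
#          oss-cn-hangzhou-internal.aliyuncs.com hbase/snapshot/test)"
#   export_snapshot_cmd test_snapshot "$uri"
```

Printing the command first makes it easier to confirm that the internal endpoint is being used before any data is copied. Because the URI embeds the AccessKey secret, avoid writing the printed command to shared logs.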

6. Create another HBase cluster.

7. Export the snapshot from OSS to the new cluster.

  hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot test_snapshot -copy-from oss://$accessKeyId:$accessKeySecret@$bucket.oss-cn-hangzhou-internal.aliyuncs.com/hbase/snapshot/test -copy-to /hbase/

8. Restore data from the snapshot.

  > restore_snapshot 'test_snapshot'

  > scan 'test'
  ROW COLUMN+CELL
  a column=cf:c1, timestamp=1472992081375, value=1
  a column=cf:c2, timestamp=1472992090434, value=2
  b column=cf:c1, timestamp=1472992104339, value=3
  b column=cf:c2, timestamp=1472992099611, value=4
  c column=cf:c1, timestamp=1472992112657, value=5
  c column=cf:c2, timestamp=1472992118964, value=6
  3 row(s) in 0.0540 seconds
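
If the target table already exists on the cluster, HBase requires it to be disabled before a snapshot is restored over it. As a sketch for that case, the hypothetical helper below only prints the HBase shell commands in the required order. Piping its output to `hbase shell` is one assumed way to run them.

```shell
# Hypothetical helper: print the HBase shell commands that restore a
# snapshot over an existing table (disable, restore, re-enable).
# Arguments: table name, snapshot name.
restore_commands() {
  local table="$1" snapshot="$2"
  printf "disable '%s'\n" "$table"
  printf "restore_snapshot '%s'\n" "$snapshot"
  printf "enable '%s'\n" "$table"
}

# Usage (assumes the hbase client is on PATH and reads commands from stdin):
#   restore_commands test test_snapshot | hbase shell
```

Restoring replaces the table's current contents with the snapshot's, so take a fresh snapshot first if the existing data may still be needed.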

9. Create a new table from the snapshot.

  > clone_snapshot 'test_snapshot','test_2'

  > scan 'test_2'
  ROW COLUMN+CELL
  a column=cf:c1, timestamp=1472992081375, value=1
  a column=cf:c2, timestamp=1472992090434, value=2
  b column=cf:c1, timestamp=1472992104339, value=3
  b column=cf:c2, timestamp=1472992099611, value=4
  c column=cf:c1, timestamp=1472992112657, value=5
  c column=cf:c2, timestamp=1472992118964, value=6
  3 row(s) in 0.0540 seconds