HBase snapshots let you back up tables with minimal impact on cluster performance. On an E-MapReduce (EMR) cluster with HBase, you can create, clone, restore, import, export, and delete snapshots using HBase Shell or the command line.
Prerequisites
Before you begin, ensure that you have:
An EMR cluster with the HBase service deployed
Access to HBase Shell or the command line on the cluster
To connect to HBase Shell, see the Connect to HBase section of the "Use HBase Shell" topic.
Create a snapshot
The snapshot command captures a lightweight backup of a table with minimal impact on performance.
Run the following command to create a snapshot named table1-snapshot for table1:
snapshot 'table1', 'table1-snapshot'
To add constraints, include the SKIP_FLUSH, TTL, or MAX_FILESIZE parameter based on your requirements.
Snapshot names must be unique across the cluster. Run list_snapshots to view all existing snapshots.
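Because snapshot names must be unique, scripted backups often derive the name from a timestamp. The following is a minimal sketch of that idea, not a prescribed procedure: the helper functions snapshot_name and snapshot_statement and the table1-snapshot-<timestamp> naming pattern are illustrative assumptions. The sketch only prints the HBase Shell statement, here with the SKIP_FLUSH parameter.

```shell
# Build a timestamped snapshot name so repeated runs do not collide.
# The naming pattern is an illustrative convention, not an HBase requirement.
snapshot_name() {
  local table="$1"
  printf '%s-snapshot-%s' "$table" "$(date +%Y%m%d-%H%M%S)"
}

# Print the HBase Shell statement for the snapshot.
# SKIP_FLUSH => true skips the memstore flush, so edits that are still
# in memory may be missing from the snapshot.
snapshot_statement() {
  local table="$1" name="$2"
  printf "snapshot '%s', '%s', {SKIP_FLUSH => true}\n" "$table" "$name"
}

name="$(snapshot_name table1)"
snapshot_statement table1 "$name"

# To run the statement on the cluster, pipe it into a non-interactive shell:
#   snapshot_statement table1 "$name" | hbase shell -n
```

Printing the statement first makes it easy to review before it is piped into hbase shell -n on the cluster.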
Clone a snapshot
The clone_snapshot command creates a new table that contains the same data as the snapshot.
Run the following command to create table2 from table1-snapshot:
clone_snapshot 'table1-snapshot', 'table2'
Restore table data
The restore_snapshot command rolls back a table to the state captured in a snapshot. Before restoring, you must disable the table, then re-enable it after the restore completes.
Run the following commands to restore table1 using table1-snapshot:
disable 'table1'
restore_snapshot 'table1-snapshot'
enable 'table1'
Import a snapshot
The hbase snapshot export command copies a snapshot from a source path to a destination path. Use it to import snapshots from an Object Storage Service (OSS) bucket or Hadoop Distributed File System (HDFS) path into the current cluster.
If you omit -mappers, HBase derives the number of mappers from the table size. The resulting value can be large enough to affect running tasks. Set -mappers or -bandwidth to limit cluster resource usage.
The following command imports table1-snapshot from the OSS-HDFS service of one DataServing cluster to the OSS-HDFS service of another:
hbase snapshot export \
-snapshot 'table1-snapshot' \
-copy-from oss://<oss-hdfs-endpoint>/oss-dir \
-copy-to oss://<oss-hdfs-endpoint>/hbase \
-mappers 2
Replace the placeholder with the actual value:
Placeholder | Description |
<oss-hdfs-endpoint> | The OSS-HDFS service endpoint. Copy it from the Port section of the Overview page of your bucket in the OSS console. For details, see the Procedure section of the "Use OSS-HDFS as the storage backend of HBase" topic. |
The parameters are described as follows:
Parameter | Description |
-snapshot | The name of the snapshot to import. Example: table1-snapshot |
-copy-from | The source path of the snapshot. |
-copy-to | The destination path to save the snapshot. |
-mappers | The number of mappers for the import task. |
-bandwidth | An alternative to -mappers for limiting cluster resource usage. |
To confirm the import succeeded, run list_snapshots and verify that the snapshot appears in the output.
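The check can also be scripted. The sketch below assumes that each row of the list_snapshots output starts with the snapshot name as its first whitespace-separated field. The helper name snapshot_listed is hypothetical.

```shell
# Succeed only if a row of the list_snapshots output (read from stdin)
# has the given snapshot name as its first field.
# Matching the whole field avoids false hits on longer names such as
# table1-snapshot-old.
snapshot_listed() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage on the cluster:
#   echo "list_snapshots" | hbase shell -n | snapshot_listed table1-snapshot \
#     && echo "snapshot present"
```

The function returns a non-zero exit status when the name is absent, so it can gate later steps in a script.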
Export a snapshot
The hbase snapshot export command copies a snapshot from the current cluster to an external HDFS or OSS-HDFS path. This is the same command used for imports; the parameters you provide determine the direction.
Set -mappers or -bandwidth to limit cluster resource usage during the export.
The following command exports table1-snapshot to the OSS-HDFS service of another cluster:
hbase snapshot export \
-snapshot 'table1-snapshot' \
-copy-to oss://<oss-hdfs-endpoint>/oss-dir \
-mappers 2
To export to an HDFS path instead, set -copy-to to the target HDFS path.
The parameters are the same as those described in the Import a snapshot section.
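As an illustration of the HDFS variant, the sketch below assembles an export command that limits throughput with -bandwidth instead of -mappers. The helper name build_export_cmd, the target path hdfs://<namenode>/<target-dir>, and the limit of 50 are placeholders chosen for this example. The Apache HBase export tool documents the bandwidth value as MB per second.

```shell
# Assemble an export command that caps throughput with -bandwidth
# rather than fixing the number of mappers.
# Arguments: snapshot name, target path, bandwidth limit.
build_export_cmd() {
  local snapshot="$1" target="$2" bandwidth="$3"
  printf 'hbase snapshot export -snapshot %s -copy-to %s -bandwidth %s\n' \
    "$snapshot" "$target" "$bandwidth"
}

# Placeholder target path and an arbitrary illustrative limit of 50.
build_export_cmd table1-snapshot 'hdfs://<namenode>/<target-dir>' 50
```

Replace the placeholder path with the actual HDFS target before you run the printed command on the cluster.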
Delete a snapshot
Run the following command to delete table1-snapshot:
delete_snapshot 'table1-snapshot'
Run list_snapshots to confirm the snapshot has been removed.
What's next
For the full reference on HBase snapshot operations, see the Apache HBase Reference Guide.