This topic describes how to use the snapshot feature of OSS-HDFS (JindoFS).

Background information

OSS-HDFS (JindoFS) is a storage service built on Object Storage Service (OSS). OSS-HDFS is compatible with the Hadoop Distributed File System (HDFS) API and supports multi-level directories. You can use JindoSDK V4.x to access OSS-HDFS. The snapshot feature of OSS-HDFS works in the same manner as the snapshot feature of HDFS. This topic describes the common operations that you perform when you use the snapshot feature of OSS-HDFS.

Limits

Only JindoData V4.0.0 and later support this feature.

Enable the snapshot feature

You can run the following command to create a directory named TestSnapshot in the oss-dfs-test bucket:
hdfs dfs -mkdir oss://oss-dfs-test.<Endpoint>/TestSnapshot
By default, the snapshot feature is disabled for a directory. To enable the snapshot feature for a directory, run the following JindoSDK Shell command:
jindo admin -allowSnapshot -dlsUri <path>
For example, you can run the following command to enable the snapshot feature for the TestSnapshot directory:
jindo admin -allowSnapshot -dlsUri oss://oss-dfs-test.<Endpoint>/TestSnapshot

Create a snapshot

After you enable the snapshot feature for a directory, you can run the following HDFS Shell command to create a snapshot for the directory:
hdfs dfs -createSnapshot <path> [<snapshotName>]
For example, you can create a snapshot named S1 for the TestSnapshot directory to capture the state of the directory at the time the snapshot is created.
# Run the following commands to create subdirectories and files in the TestSnapshot directory for testing:
hdfs dfs -mkdir oss://oss-dfs-test.<Endpoint>/TestSnapshot/dir1
hdfs dfs -mkdir oss://oss-dfs-test.<Endpoint>/TestSnapshot/dir2
hdfs dfs -touchz oss://oss-dfs-test.<Endpoint>/TestSnapshot/dir1/file1
hdfs dfs -touchz oss://oss-dfs-test.<Endpoint>/TestSnapshot/dir2/file2

# Run the following command to create a snapshot named S1 for the TestSnapshot directory:
hdfs dfs -createSnapshot oss://oss-dfs-test.<Endpoint>/TestSnapshot S1

Access directories and files in a snapshot

Path format of a directory or file in a snapshot

To distinguish the directories and files in a snapshot from the original directories and files in the bucket, you must include the snapshot name in the path. If the snapshot feature is enabled for a directory and you want to access a directory or file in a snapshot of that directory, specify the path in the following format:
<snapshotRoot>/.snapshot/<snapshotName>/<actual subPath>
In the preceding path, the snapshotRoot parameter specifies the root directory of snapshots. The root directory is the path specified for the dlsUri parameter in the command that enables the snapshot feature. In this example, the root directory is TestSnapshot. The snapshotName parameter specifies the name of a snapshot. The actual subPath parameter specifies the path, relative to the root directory of snapshots, of the directory or file that you want to access.
If you want to query the files in the oss://oss-dfs-test.<Endpoint>/TestSnapshot/dir1 directory, you can run the following regular ls command:
hdfs dfs -ls oss://oss-dfs-test.<Endpoint>/TestSnapshot/dir1
You have enabled the snapshot feature for the TestSnapshot directory and created the S1 snapshot for the TestSnapshot directory. Therefore, you can also query the directories and files in the S1 snapshot by running the following command:
hdfs dfs -ls oss://oss-dfs-test.<Endpoint>/TestSnapshot/.snapshot/S1/dir1
In the preceding command, .snapshot/S1 specifies the snapshot that you want to access.
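The path format described in this section can be captured in a small shell helper. The following is a minimal sketch: the function name snapshot_path is hypothetical and is not part of JindoSDK or Hadoop. It only concatenates strings and does not contact OSS-HDFS.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of JindoSDK or Hadoop).
# Builds <snapshotRoot>/.snapshot/<snapshotName>/<actual subPath>.
snapshot_path() {
  local root="${1%/}"   # snapshot root, one trailing slash removed
  local name="$2"       # snapshot name, for example S1
  local sub="${3:-}"    # optional path below the snapshot root
  sub="${sub#/}"        # one leading slash removed
  if [ -n "$sub" ]; then
    printf '%s/.snapshot/%s/%s\n' "$root" "$name" "$sub"
  else
    printf '%s/.snapshot/%s\n' "$root" "$name"
  fi
}
```

For example, calling snapshot_path 'oss://oss-dfs-test.<Endpoint>/TestSnapshot' S1 dir1 prints the path that is passed to hdfs dfs -ls in the preceding command. The root is quoted so that the shell does not treat the <Endpoint> placeholder as a redirection.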

Use a snapshot to restore data

The snapshot feature can be used to back up and restore data. You can use a snapshot to restore important data if the data is accidentally deleted. You can access data in a snapshot by using the path format described in the previous section, and then restore the data based on your business requirements.

For example, suppose that you accidentally deleted the dir1 directory from the TestSnapshot directory by running the hdfs dfs -rm -r oss://oss-dfs-test.<Endpoint>/TestSnapshot/dir1 command. To restore the deleted directory, perform the following steps:

  1. Copy the dir1 directory from the S1 snapshot back to the TestSnapshot directory.
    hdfs dfs -cp oss://oss-dfs-test.<Endpoint>/TestSnapshot/.snapshot/S1/dir1 oss://oss-dfs-test.<Endpoint>/TestSnapshot
  2. Check whether the directory is restored.
    hdfs dfs -ls oss://oss-dfs-test.<Endpoint>/TestSnapshot/dir1
    If the command lists the /TestSnapshot/dir1 directory and the files in it, the data is restored.
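The restore steps above can be wrapped in a script that checks paths before it copies anything. The following is a minimal sketch under stated assumptions: restore_from_snapshot and HDFS_BIN are hypothetical names, and the sketch assumes that hdfs dfs -test -e behaves on OSS-HDFS paths as it does on HDFS paths.

```shell
#!/usr/bin/env bash
# Sketch only: restore_from_snapshot and HDFS_BIN are hypothetical names.
HDFS_BIN="${HDFS_BIN:-hdfs}"   # override to use a different hdfs executable

restore_from_snapshot() {
  local root="${1%/}" name="$2" sub="${3#/}"
  local src="$root/.snapshot/$name/$sub"   # copy stored in the snapshot
  local dst="$root/$sub"                   # original location
  # Stop if the path is not present in the snapshot.
  if ! "$HDFS_BIN" dfs -test -e "$src"; then
    echo "not found in snapshot: $src" >&2
    return 1
  fi
  # Stop if the original location still holds data.
  if "$HDFS_BIN" dfs -test -e "$dst"; then
    echo "target already exists: $dst" >&2
    return 2
  fi
  # The target does not exist, so -cp recreates it under its original name.
  "$HDFS_BIN" dfs -cp "$src" "$dst"
}
```

For the example in this section, the call would be restore_from_snapshot 'oss://oss-dfs-test.<Endpoint>/TestSnapshot' S1 dir1. The second check prevents the copy from being nested inside a directory that was not actually deleted.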

Rename a snapshot

You can run the following command to rename a created snapshot:
hdfs dfs -renameSnapshot <path> <oldName> <newName>
For example, you can run the following command to rename the S1 snapshot to S100:
hdfs dfs -renameSnapshot oss://oss-dfs-test.<Endpoint>/TestSnapshot S1 S100

Delete a snapshot

If a snapshot is no longer needed, you can run the following command to delete the snapshot:
hdfs dfs -deleteSnapshot <path> <snapshotName>
For example, you can run the following command to delete the S100 snapshot, which is the S1 snapshot that you created and then renamed:
hdfs dfs -deleteSnapshot oss://oss-dfs-test.<Endpoint>/TestSnapshot S100

Disable the snapshot feature

To disable the snapshot feature for a directory, run the following JindoSDK Shell command:
jindo admin -disallowSnapshot -dlsUri <path>
Important: Make sure that all snapshots of the directory are deleted before you disable the snapshot feature. If the directory still has snapshots, an error occurs when you disable the snapshot feature. For more information about how to delete snapshots, see the "Delete a snapshot" section of this topic.
For example, you can run the following command to disable the snapshot feature for the TestSnapshot directory after all snapshots of the directory are deleted:
jindo admin -disallowSnapshot -dlsUri oss://oss-dfs-test.<Endpoint>/TestSnapshot
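If a directory has several snapshots, the cleanup that is required before you disable the feature can be scripted. The following is a minimal sketch under stated assumptions: it assumes that listing <snapshotRoot>/.snapshot with hdfs dfs -ls returns one directory entry per snapshot, as it does on HDFS, and that snapshot names contain no spaces. The function name snapshot_names_from_ls is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `hdfs dfs -ls <snapshotRoot>/.snapshot` output
# on stdin and prints one snapshot name per line.
snapshot_names_from_ls() {
  # Keep directory entries only (permission string starts with "d"),
  # then print the last component of the path column.
  awk '/^d/ { n = split($NF, parts, "/"); print parts[n] }'
}

# Example use (commented out because it needs a live OSS-HDFS bucket):
# root='oss://oss-dfs-test.<Endpoint>/TestSnapshot'
# hdfs dfs -ls "$root/.snapshot" | snapshot_names_from_ls | while read -r name; do
#   hdfs dfs -deleteSnapshot "$root" "$name"
# done
# jindo admin -disallowSnapshot -dlsUri "$root"
```

Review the list of names before you run the delete loop, because deleted snapshots cannot be used to restore data.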

Compare snapshots

To view the differences between two snapshots, run the following command:
jindo dls -snapshotDiff -dlsUri <uri> -fromSnapshot <fromSnapshot> -toSnapshot <toSnapshot>