This topic describes how to use the trash bin feature of OSS-HDFS (JindoFS).

How the trash bin feature works

  1. When you delete a file from an OSS-HDFS bucket, the file is not immediately deleted. Instead, the file is moved to the /user/<username>/.Trash/Current directory.
  2. Thirty minutes later, the deleted file is moved from the Current directory to the /user/<username>/.Trash/<timestamp> directory.
    Note Files that are deleted within the same period of time are moved to the same timestamped directory. The timestamp indicates the period in which the files were deleted and serves as a checkpoint.
  3. Seven days later, the timestamped directory is permanently deleted.

Therefore, within seven days after you delete a file, you can find it in the .Trash directory based on the time when it was deleted, and restore it by moving it out of the .Trash directory.
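
The restore step can be sketched with Hadoop FileSystem Shell commands. This is a minimal sketch rather than an official procedure: the bucket name, the file path /a/b/c, and the user name alice are hypothetical, and it assumes that the .Trash directory is addressed under the same oss://bucket prefix as the data. The script only prints the commands so that you can review them before you run them.

```shell
#!/bin/sh
# Hypothetical values for illustration only.
BUCKET="oss://bucket"
TRASH_USER="alice"
FILE_PATH="/a/b/c"

# Assumption: the trash directory lives under the same bucket prefix.
TRASH_ROOT="$BUCKET/user/$TRASH_USER/.Trash"

# Step 1: list the trash recursively to find the file, which is either in
# Current or in a <timestamp> checkpoint directory.
LIST_CMD="hadoop fs -ls -R $TRASH_ROOT"

# Step 2: move the file from the trash back to its original location.
RESTORE_CMD="hadoop fs -mv $TRASH_ROOT/Current$FILE_PATH $BUCKET$FILE_PATH"

# Print the commands instead of running them.
echo "$LIST_CMD"
echo "$RESTORE_CMD"
```

If the file has already been moved to a checkpoint, replace Current in the second command with the timestamped directory name shown by the listing.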

Note The trash bin feature is implemented jointly by the client and the server. The client moves the files that you delete to the .Trash directory. By default, the server periodically deletes files from the /user/<username>/.Trash directory.

Use the trash bin feature in Hadoop FileSystem Shell

Sample command:
hadoop fs -rm oss://bucket/a/b/c
By default, the trash bin feature is not enabled for Hadoop FileSystem Shell on the client. To enable it, add the following configuration to the core-site.xml file:
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
Note The value must be greater than 0.
After that, the client automatically converts the preceding rm command to hadoop fs -mv oss://bucket/a/b/c /user/<username>/.Trash/Current/a/b/c. The trash bin feature therefore works transparently, and the server periodically clears files in the trash bin.
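
To make the path mapping concrete, the helper below derives the trash location from an oss:// URI by using plain string operations. The name trash_path and the user name alice are hypothetical, and the mapping only mirrors the example in this topic. It is not the client's actual implementation.

```shell
#!/bin/sh
# trash_path URI USER
# Prints the path under .Trash/Current that corresponds to an oss:// URI.
# Hypothetical helper; assumes the URI contains a path after the bucket name.
trash_path() {
  uri="$1"
  user="$2"
  rest="${uri#oss://}"    # strip the scheme:      bucket/a/b/c
  path="/${rest#*/}"      # strip the bucket name: /a/b/c
  printf '/user/%s/.Trash/Current%s\n' "$user" "$path"
}

trash_path "oss://bucket/a/b/c" "alice"
```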

To delete a file immediately and free up storage space, add the -skipTrash option to the rm command, for example, hadoop fs -rm -skipTrash oss://bucket/a/b/c. The file then bypasses the trash bin.

Use the trash bin feature in Hadoop ecosystem components

Components such as Hive, Spark, and Flink are not aware of the trash bin feature of OSS-HDFS. When such a component calls the delete operation of HDFS FileSystem to delete a file, the file is immediately deleted.

OSS-HDFS follows the same approach as open source Hadoop. To use the trash bin feature in Hadoop ecosystem components, you must explicitly call the rename operation of HDFS FileSystem to move the file that you want to delete to the /user/<username>/.Trash/Current directory. The OSS-HDFS server periodically clears files from the trash bin.
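
As a shell analog of this explicit move, the sketch below creates the parent directory in the trash and then renames the file into it. The bucket name, file path, and user name are hypothetical, and the sketch assumes that the trash directory is addressed under the same oss://bucket prefix. A component that uses the Java API would make the corresponding FileSystem.mkdirs and FileSystem.rename calls. The commands are printed, not run.

```shell
#!/bin/sh
# Hypothetical values for illustration only.
BUCKET="oss://bucket"
TRASH_USER="alice"
FILE_PATH="/a/b/c"

# Assumption: the trash directory lives under the same bucket prefix.
DEST="$BUCKET/user/$TRASH_USER/.Trash/Current$FILE_PATH"
DEST_PARENT="${DEST%/*}"   # drop the last path component

# Create the parent first, because a rename into a missing directory may fail.
MKDIR_CMD="hadoop fs -mkdir -p $DEST_PARENT"

# Rename the file into the trash instead of deleting it.
MOVE_CMD="hadoop fs -mv $BUCKET$FILE_PATH $DEST"

# Print the commands instead of running them.
echo "$MKDIR_CMD"
echo "$MOVE_CMD"
```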