In E-MapReduce (EMR) V3.30, JindoFS provides a tiered storage feature that lets you store cold data and hot data on different storage media. This reduces storage costs and accelerates data access.

Use jindo jfs

Run the following command to obtain the help information:
jindo jfs -help archive
-archive -i/a <path> ... :
  Archive commands.

Tiered storage commands in JindoFS are asynchronous: a command only starts the related task and returns without waiting for the task to finish.

Cache

Use this command to cache the data in a specific path on local disks. Subsequent reads are then served from the local disks, without the need to read the data from Object Storage Service (OSS).
jindo jfs -cache -p <path>

Specify -p to ensure that the locally cached data is not cleared based on disk usage.
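As a minimal sketch, the following script caches several paths with -p in one pass. The `run` helper, the `DRY_RUN` switch, and the `cache_paths` function are hypothetical and not part of JindoFS, and the `oss://xxxx/...` paths reuse the placeholder bucket from this document. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: cache a list of paths with -p so that the local copies are kept.
# DRY_RUN, run(), and cache_paths() are illustrative helpers, not JindoFS features.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"    # dry run: print the command instead of executing it
  else
    "$@"         # execute the command as given
  fi
}

cache_paths() {
  for path in "$@"; do
    run jindo jfs -cache -p "$path"
  done
}

# Placeholder paths; replace them with your own.
cache_paths oss://xxxx/warehouse oss://xxxx/wikipedia_data.csv
```

Set `DRY_RUN=0` to execute the commands on a cluster where `jindo` is available. Because the commands are asynchronous, the script returns before caching is complete.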

Uncache

Use this command to delete the cached data from local disks and keep the data only in OSS Standard storage.
jindo jfs -uncache <path>

Archive

Use this command to delete the cached data from local disks and store the data in OSS Infrequent Access (IA) or Archive storage. For information about the storage classes, see Overview.
jindo jfs -archive -i|-a <path>

-i stores the data in OSS IA storage. -a stores the data in OSS Archive storage.
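The choice between the two options can be sketched as a small function that maps a target storage class to the matching command line. The function name `archive_cmd` and the class labels `ia` and `archive` are hypothetical. The function only prints the command, so it can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch: print the archive command for a target storage class.
# archive_cmd and the labels "ia"/"archive" are illustrative, not JindoFS names.
archive_cmd() {
  case "$1" in
    ia)      echo "jindo jfs -archive -i $2" ;;   # OSS IA storage
    archive) echo "jindo jfs -archive -a $2" ;;   # OSS Archive storage
    *)       echo "unknown storage class: $1" >&2; return 1 ;;
  esac
}

# Placeholder paths taken from the sample listing in this document.
archive_cmd ia oss://xxxx/warehouse
archive_cmd archive oss://xxxx/wikipedia_data.csv
```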

Unarchive

Use this command to convert the storage class of data from Archive to IA or Standard. You can also use it to temporarily restore data in Archive storage so that the data is readable within one day.
jindo jfs -unarchive -i/-o <path>

If no option is specified, the command converts the data to OSS Standard storage. -i converts the data to OSS IA storage. -o temporarily restores data in Archive storage so that the data is readable.
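The three variants of the command can be sketched as follows. The function name `unarchive_cmd` and the mode labels `standard`, `ia`, and `restore` are hypothetical and used only for illustration. The function prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: print the unarchive command for each documented variant.
# unarchive_cmd and the mode labels are illustrative, not JindoFS names.
unarchive_cmd() {
  case "$1" in
    standard) echo "jindo jfs -unarchive $2" ;;      # no option: OSS Standard storage
    ia)       echo "jindo jfs -unarchive -i $2" ;;   # OSS IA storage
    restore)  echo "jindo jfs -unarchive -o $2" ;;   # temporary restore, class unchanged
    *)        echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Placeholder path taken from the sample listing in this document.
unarchive_cmd standard oss://xxxx/wikipedia_data.csv
unarchive_cmd restore oss://xxxx/wikipedia_data.csv
```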

Status

Use this command to view the details of tiered storage tasks. By default, the command reports the number of files in the specified directory for which tiered storage is requested and the amount of data to which tiered storage has already been applied.
jindo jfs -status -detail/-sync <path>

-detail shows the storage progress of each file. -sync makes the command wait and exit only after the tiered storage task is complete.
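Because tiered storage commands only start a task, a script that needs the task to finish before it continues can follow the command with a -status -sync call. The sketch below shows this pattern for an archive task. The `run` helper, the `DRY_RUN` switch, and the `archive_and_wait` function are hypothetical, and by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: start an archive task, then wait for it with -status -sync.
# DRY_RUN, run(), and archive_and_wait() are illustrative helpers, not JindoFS features.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"    # dry run: print the command instead of executing it
  else
    "$@"         # execute the command as given
  fi
}

archive_and_wait() {
  path=$1
  run jindo jfs -archive -i "$path" || return 1   # starts the task and returns
  run jindo jfs -status -sync "$path"             # exits after the task is complete
}

# Placeholder path; replace it with your own.
archive_and_wait oss://xxxx/warehouse
```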

ls2

JindoFS provides the ls2 command, which extends the Hadoop ls command to also show the storage status of each file.
hadoop fs -ls2 <path>
Example of command output, which includes the file storage class:
drwxrwxrwx  - -         0    2020-06-05 04:27 oss://xxxx/warehouse
-rw-rw-rw-  1 Archive   1484 2020-09-23 16:40 oss://xxxx/wikipedia_data.csv
-rw-rw-rw-  1 Standard  1676 2020-06-07 20:04 oss://xxxx/wikipedia_data.json
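The listing can be filtered by storage class with standard text tools. The sketch below assumes the column layout of the sample output above, with the storage class in the third column and the path in the last column. The function name `filter_by_class` is hypothetical, and paths that contain spaces are not handled.

```shell
#!/bin/sh
# Sketch: print the paths of entries whose storage class matches the argument.
# Assumes the ls2 layout shown in the sample output: class in column 3, path last.
filter_by_class() {
  awk -v cls="$1" '$3 == cls { print $NF }'
}

# The sample output from this document is used as input.
filter_by_class Archive <<'EOF'
drwxrwxrwx  - -         0    2020-06-05 04:27 oss://xxxx/warehouse
-rw-rw-rw-  1 Archive   1484 2020-09-23 16:40 oss://xxxx/wikipedia_data.csv
-rw-rw-rw-  1 Standard  1676 2020-06-07 20:04 oss://xxxx/wikipedia_data.json
EOF
```

On a cluster, the same function can read the live listing, for example `hadoop fs -ls2 <path> | filter_by_class Archive`.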