In E-MapReduce (EMR) V3.30, JindoFS provides a tiered storage feature that allows you to store hot and cold data on different storage media. This reduces data storage costs and accelerates data access.

Use jindo jfs

Run the following command to obtain help information:
jindo jfs -help archive

Tiered-storage commands in JindoFS are asynchronous: each command only starts the related task and does not wait for the task to complete.

Cache

You can use this command to cache a copy of the data stored in a specific path on local disks. You can then read the data from local disks instead of from Object Storage Service (OSS).
jindo jfs -cache -p <path>

If you specify the -p option, the local copy of the data is not cleared based on disk usage.
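
As a rough sketch of how the command might be applied to several paths at once, the following shell function reads paths from standard input and starts one cache task per path. The function name cache_paths and the JINDO_CMD variable are hypothetical and exist only for this illustration; only the jindo jfs -cache -p <path> form is taken from the syntax above.

```shell
# JINDO_CMD is a hypothetical override so the function can be dry-run
# (for example, JINDO_CMD=echo). It defaults to the jindo binary.
JINDO_CMD="${JINDO_CMD:-jindo}"

# cache_paths (hypothetical helper): read one path per line from stdin
# and start a cache task with -p for each non-empty line.
cache_paths() {
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        # </dev/null keeps the command from consuming the list of paths.
        "$JINDO_CMD" jfs -cache -p "$path" </dev/null || return 1
    done
}
```

With JINDO_CMD=echo, the function prints the arguments it would pass instead of running them, which is a simple way to review a batch before the tasks are started.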

Uncache

You can use this command to delete the cached copy of data from local disks so that the data is stored only in OSS Standard storage.
jindo jfs -uncache <path>

Archive

You can use this command to delete the cached copy of data from local disks and store the data in OSS Infrequent Access (IA), Archive, or Cold Archive storage. For information about the storage classes, see Overview.
jindo jfs -archive -i|-a|-c <path>
  • If you specify the -i option, data is stored to OSS IA storage.
  • If you specify the -a option, data is stored to OSS Archive storage.
  • If you specify the -c option, data is stored to OSS Cold Archive storage.
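
The option list above can be wrapped in a small helper so that scripts refer to storage classes by name. This is only a sketch: the function names archive_flag and archive_path, the class names ia, archive, and cold-archive, and the JINDO_CMD variable are assumptions made for this example. Only the -i, -a, and -c options come from the list above.

```shell
# archive_flag (hypothetical helper): map a storage class name to the
# option listed above. Prints the option, or returns 1 for an unknown name.
archive_flag() {
    case "$1" in
        ia)           printf '%s\n' '-i' ;;
        archive)      printf '%s\n' '-a' ;;
        cold-archive) printf '%s\n' '-c' ;;
        *)            echo "unknown storage class: $1" >&2; return 1 ;;
    esac
}

# archive_path (hypothetical helper): start an archive task for a path.
# JINDO_CMD is a hypothetical override (for example, echo for a dry run).
archive_path() {
    flag=$(archive_flag "$1") || return 1
    "${JINDO_CMD:-jindo}" jfs -archive "$flag" "$2"
}
```

For example, archive_path ia <path> would run jindo jfs -archive -i <path>.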

Unarchive

You can use this command to convert the storage class of data from Archive to IA or Standard. You can also temporarily unarchive data that is stored in Archive storage so that the data is readable for one day.
jindo jfs -unarchive [-i|-o] <path>
If you do not specify an option, data is stored to OSS Standard storage.
  • If you specify the -i option, data is stored to OSS IA storage.
  • If you specify the -o option, data stored in Archive storage is temporarily unarchived and becomes readable.

Status

You can use this command to view task details. By default, the command reports the number of files in the specified directory that are subject to tiered storage and the data to which tiered storage has already been applied.
jindo jfs -status [-detail|-sync] <path>
  • If you specify the -detail option, the command shows the storage progress of file data.
  • If you specify the -sync option, the command exits only after a tiered-storage task is completed.
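
Because tiered-storage commands only start a task, a script that depends on the result may need to block until the task is finished. The sketch below chains a tiered-storage command with -status -sync for that purpose. The function name start_and_wait and the JINDO_CMD variable are hypothetical. The sketch also assumes that the exit code of jindo reflects whether the command succeeded, which this topic does not state.

```shell
# start_and_wait (hypothetical helper): start a tiered-storage task for a
# path, then block until the task completes.
# Usage: start_and_wait <path> <jfs options...>
# JINDO_CMD is a hypothetical override (for example, echo for a dry run).
start_and_wait() {
    path="$1"
    shift
    # Start the asynchronous task, for example: jindo jfs -archive -i <path>
    "${JINDO_CMD:-jindo}" jfs "$@" "$path" || return 1
    # -sync makes the status command exit only after the task is completed.
    "${JINDO_CMD:-jindo}" jfs -status -sync "$path"
}
```

For example, start_and_wait <path> -archive -i would start an archive task to IA storage and return only after the status command exits.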

ls2

JindoFS provides the ls2 command, which extends the Hadoop ls command to also show the storage class of each file.
hadoop fs -ls2 <path>
Example of command output, which includes the file storage class:
drwxrwxrwx  - -         0    2020-06-05 04:27 oss://xxxx/warehouse
-rw-rw-rw-  1 Archive   1484 2020-09-23 16:40 oss://xxxx/wikipedia_data.csv
-rw-rw-rw-  1 Standard  1676 2020-06-07 20:04 oss://xxxx/wikipedia_data.json
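
The output can be filtered by storage class with standard text tools. The following sketch assumes the column layout of the sample output above (storage class in the third column, path in the seventh) and paths that contain no spaces. The function name filter_by_class is hypothetical.

```shell
# filter_by_class (hypothetical helper): read ls2 output on stdin and print
# the paths whose storage class column matches the given name.
# Assumes the seven-column layout shown in the sample output.
filter_by_class() {
    awk -v cls="$1" '$3 == cls { print $7 }'
}
```

For example, hadoop fs -ls2 <path> | filter_by_class Archive prints only the paths of files whose storage class is Archive.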