In E-MapReduce (EMR) V3.30, JindoFS provides a tiered storage feature. This feature allows you to store hot and cold data on different storage media, which helps reduce data storage costs or accelerate data access.
Use jindo jfs
[root@emr-header-1 ~]# jindo jfs -help archive
-archive -i/a <path> ... :
Archive commands.
Tiered storage commands in JindoFS are asynchronous: a command only starts the related task and does not wait for the task to complete.
Cache
jindo jfs -cache -p <path>
-p can be used to ensure that local data is not cleared based on disk usage.
Uncache
jindo jfs -uncache <path>
Archive
jindo jfs -archive -i|-a <path>
-i is used to store data to OSS in IA mode.
-a is used to store data to OSS in Archive mode.
Unarchive
jindo jfs -unarchive -i/-o <path>
-i is used to store data to OSS in IA mode.
-o is used to temporarily restore data stored in Archive mode to allow the data to be readable.
Status
jindo jfs -status -detail/-sync <path>
-detail is used to view the storage progress of file data.
-sync makes the command exit only after the tiered storage task is completed.
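Because a tiered storage command only starts a task, a script that needs the data to reach its target state can follow the command with a status check that uses -sync. The following is a minimal sketch, assuming that the jindo CLI is on the PATH of the node. The function name archive_and_wait and the JINDO_BIN variable are hypothetical and are not part of JindoFS.

```shell
#!/bin/sh
# Hypothetical helper: start an Archive task for a path, then wait for it.
# Assumption: the jindo CLI is on PATH. Set JINDO_BIN to use another binary.
archive_and_wait() {
    target="${1:-}"
    if [ -z "$target" ]; then
        echo "usage: archive_and_wait <path>" >&2
        return 2
    fi
    # -a stores the data to OSS in Archive mode. The command only starts the task.
    "${JINDO_BIN:-jindo}" jfs -archive -a "$target" || return 1
    # -sync keeps the status command running until the task is completed.
    "${JINDO_BIN:-jindo}" jfs -status -sync "$target"
}
```

After the function is defined, call it as archive_and_wait <path>. The same pattern should apply to -cache and -unarchive if you replace the first command.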
ls2 command
hadoop fs -ls2 <path>
drwxrwxrwx - - 0 2020-06-05 04:27 oss://xxxx/warehouse
-rw-rw-rw- 1 Archive 1484 2020-09-23 16:40 oss://xxxx/wikipedia_data.csv
-rw-rw-rw- 1 Standard 1676 2020-06-07 20:04 oss://xxxx/wikipedia_data.json
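In the sample output above, the third field appears to show the storage class of each file, and a directory shows - in that field. As a rough sketch under that assumption, the listing can be filtered by storage class with awk. The function name ls2_by_class is hypothetical. It reads the listing from standard input so that it can also be tried on saved output.

```shell
#!/bin/sh
# Hypothetical helper: print the paths whose storage class field matches $1.
# Assumption: field 3 is the storage class and the last field is the path,
# as in the sample output. Paths that contain spaces are not handled.
ls2_by_class() {
    awk -v want="$1" '$3 == want { print $NF }'
}

# Example: hadoop fs -ls2 <path> | ls2_by_class Archive
```

For the three sample lines above, filtering on Archive prints only oss://xxxx/wikipedia_data.csv.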