This topic describes how to migrate data from Hadoop Distributed File System (HDFS) to JindoFileSystem (JindoFS), which stores its data in Object Storage Service (OSS).
Use Hadoop FS shell commands
You can run File System (FS) shell commands to migrate a small amount of data:
hadoop fs -cp hdfs://emr-cluster/README.md jfs://emr-jfs/
hadoop fs -cp oss://oss_bucket/README.md jfs://emr-jfs/
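After a copy, it can be useful to confirm that the file exists at the destination. The following is a minimal sketch, not part of the product documentation: the function name copy_to_jfs and the HADOOP_CMD variable are hypothetical, and the sketch assumes that the standard `hadoop fs -test -e` existence check works against the jfs:// scheme in your cluster.

```shell
# Hypothetical helper. HADOOP_CMD can be pointed at a stub (for example, echo)
# to print the commands instead of running them.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

copy_to_jfs() {
  local src="$1" dst_dir="$2"
  local name
  name="$(basename "$src")"
  # Copy the file into the destination directory.
  $HADOOP_CMD fs -cp "$src" "$dst_dir" || return 1
  # -test -e exits with status 0 only if the path exists.
  $HADOOP_CMD fs -test -e "${dst_dir%/}/$name"
}
```

For example, `copy_to_jfs hdfs://emr-cluster/README.md jfs://emr-jfs/` copies the file and returns a nonzero status if the copy fails or jfs://emr-jfs/README.md is not found afterward.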
Use Hadoop DistCp
You can use DistCp, a built-in tool of Hadoop, to migrate a large amount of data:
Note: For more information about DistCp parameters, see the DistCp Version2 Guide.
hadoop distcp hdfs://emr-cluster/files jfs://emr-jfs/output/
hadoop distcp oss://oss_bucket/files jfs://emr-jfs/output/
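For large migrations, you may want to limit how many map tasks DistCp runs and how much bandwidth each one uses. The following is a minimal sketch using the standard DistCp options `-m` (maximum number of simultaneous copies) and `-bandwidth` (MB per second per map). The function name distcp_to_jfs, the HADOOP_CMD variable, and the values 20 and 100 are assumptions for illustration, not recommendations from this topic.

```shell
# Hypothetical helper. HADOOP_CMD can be pointed at a stub (for example, echo)
# to print the command instead of running it.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

distcp_to_jfs() {
  local src="$1" dst="$2"
  # Placeholder values; tune them for your cluster size and network.
  local maps="${3:-20}" bandwidth_mb="${4:-100}"
  # -m caps the number of concurrent map tasks;
  # -bandwidth caps the throughput of each map in MB/s.
  $HADOOP_CMD distcp -m "$maps" -bandwidth "$bandwidth_mb" "$src" "$dst"
}
```

For example, `distcp_to_jfs hdfs://emr-cluster/files jfs://emr-jfs/output/ 50` runs the copy with at most 50 map tasks and the placeholder bandwidth limit.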
Use the cache mode
In cache mode, JindoFS stores data files as objects in OSS and leaves their data and metadata unchanged. When you access these objects, JindoFS can cache their data and metadata in the local cluster so that subsequent access is faster. For more information, see Use the cache mode.