E-MapReduce: Use Hadoop Shell commands to access OSS or OSS-HDFS

Last Updated: Mar 26, 2026

When you need to manage files in Object Storage Service (OSS) or OSS-HDFS from an EMR cluster, JindoSDK lets you use the same standard Hadoop Shell commands for both. Only the endpoint in the path changes.

Prerequisites

Before you begin, ensure that you have:

  • An OSS bucket with the appropriate access permissions

  • JindoSDK available in your environment (see Environment setup)

Environment setup

| Environment | Setup required |
| --- | --- |
| EMR cluster | JindoSDK is pre-installed. No additional setup needed. |
| Non-EMR environment | Install JindoSDK first. See Deploy JindoSDK in an environment other than EMR. |

OSS-HDFS version requirements:

| Environment | Minimum version |
| --- | --- |
| EMR cluster | EMR V3.42.0 or later (minor version), or EMR V5.8.0 or later (minor version) |
| Non-EMR environment | JindoSDK V4.X or later |

URI format reference

The commands for OSS and OSS-HDFS are identical. The only difference is the endpoint embedded in the path.

| Storage | Endpoint pattern | Example path |
| --- | --- | --- |
| OSS-HDFS | <region>.oss-dls.aliyuncs.com | oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/ |

All examples in this topic use the OSS-HDFS endpoint format.
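Because only the endpoint varies, a path can be assembled from a bucket name and a region ID. The following is a minimal sketch, assuming a hypothetical helper named oss_hdfs_uri that is not part of JindoSDK or Hadoop. It only formats a string according to the OSS-HDFS endpoint pattern shown above.

```shell
# Hypothetical helper (not part of JindoSDK): builds an OSS-HDFS URI
# from a bucket name, a region ID, and an optional path.
oss_hdfs_uri() {
  # $1 = bucket, $2 = region ID (for example cn-shanghai), $3 = optional path
  _path="${3:-}"
  # Strip one leading slash so the endpoint is followed by exactly one "/".
  printf 'oss://%s.%s.oss-dls.aliyuncs.com/%s\n' "$1" "$2" "${_path#/}"
}

oss_hdfs_uri examplebucket cn-shanghai        # root of the bucket
oss_hdfs_uri examplebucket cn-shanghai /dir/  # a directory in the bucket
```

The first call prints the example path used throughout this topic, and the second prints the same path with dir/ appended.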

In object storage, the time that a rename operation takes is proportional to the amount of data involved. As a result, commands such as -put, -cp, and -mv can be significantly slower on large directories than they are on HDFS.

Commands

Upload a file

hadoop fs -put examplefile.txt oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/

This uploads examplefile.txt from the current local working directory to the root directory of examplebucket.

Useful options:

| Option | Description |
| --- | --- |
| -f | Overwrite the destination if it already exists. |
| -d | Skip the intermediate ._COPYING_ temp file. Use this flag when uploading to object storage to avoid unnecessary rename operations. |

Tip: Always use -d when uploading to OSS or OSS-HDFS. Without it, Hadoop creates a ._COPYING_ temp file and renames it on completion, which adds latency proportional to file size.
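
The two options can be combined in a single upload. The sketch below wraps the call in a hypothetical function named put_direct so that every upload passes -d and -f consistently. The function name is an assumption made for illustration; -put, -d, and -f are the options described above.

```shell
# Hypothetical wrapper: upload one or more local files with
#   -d  skip the ._COPYING_ temp file (no rename on completion)
#   -f  overwrite the destination if it already exists
# Usage: put_direct <destination-uri> <local-file>...
put_direct() {
  _dest="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "usage: put_direct <destination-uri> <local-file>..." >&2
    return 2
  fi
  # -put accepts several local sources followed by a single destination.
  hadoop fs -put -d -f "$@" "$_dest"
}
```

For example, put_direct oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/ examplefile.txt uploads examplefile.txt to the root directory of examplebucket. Remove -f from the wrapper if existing files must not be overwritten.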

Create a directory

hadoop fs -mkdir oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/dir/

This creates a directory named dir/ in examplebucket.
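
When parent directories might not exist yet, the standard Hadoop -mkdir option -p creates the missing parents and does not fail if the directory is already present. The sketch below, with a hypothetical function name and illustrative subdirectory names, creates several nested directories under one base path.

```shell
# Hypothetical helper: create several (possibly nested) directories
# under a base URI. -p creates missing parent directories and succeeds
# if a directory already exists.
# Usage: make_dirs <base-uri> <relative-dir>...
make_dirs() {
  _base="${1%/}"   # drop one trailing slash from the base URI
  shift
  for _sub in "$@"; do
    hadoop fs -mkdir -p "$_base/$_sub"
  done
}
```

For example, make_dirs oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/ dir/a dir/b creates dir/, dir/a, and dir/b.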

List files or directories

hadoop fs -ls oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/

This lists the files and directories in the root directory of examplebucket.

Useful options:

| Option | Description |
| --- | --- |
| -R | Recursively list all subdirectories. |
| -h | Display file sizes in human-readable format (KB, MB, GB). |
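
A recursive listing mixes files and directories. As a rough sketch, the output of -ls -R can be filtered to files only: the first column of each line is the permission string, which starts with d for a directory and with - for a file, and the last column is the path. The function name list_files is hypothetical, and the filter assumes that paths contain no whitespace.

```shell
# Hypothetical helper: print only the file paths under a prefix.
# Keeps lines whose permission column starts with "-" (regular files)
# and prints the last column, which is the path.
# Assumes that paths contain no whitespace.
list_files() {
  hadoop fs -ls -R "$1" | awk '$1 ~ /^-/ { print $NF }'
}
```

For example, list_files oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/ prints one file path per line and omits directory entries.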

Check disk usage

hadoop fs -du oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/

This reports the sizes of all files and directories in examplebucket.

-du on large buckets can be slow when used against object storage. Avoid running it frequently on buckets with many objects.

View file content

hadoop fs -cat oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/localfile.txt

This displays the content of localfile.txt in plain text.

Important

If the file content is encoded, use the HDFS API for Java to read and decode it. The -cat command does not decode encoded content.

Copy files or directories

hadoop fs -cp oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/subdir1 \
             oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/subdir2/subdir1

This copies subdir1 to subdir2/subdir1, preserving the directory structure, file locations, and content of all subdirectories.

-cp on large directories is slow against object storage because the cost of the underlying copy and rename operations is proportional to the amount of data involved.

Move files or directories

hadoop fs -mv oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/srcdir \
             oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/destdir

This moves srcdir and all its files and subdirectories to destdir.

-mv can be slow for large directories in object storage.

Download a file

hadoop fs -get oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampleobject.txt /tmp/

This downloads exampleobject.txt from examplebucket to /tmp/ on the local machine.

Delete files or directories

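hadoop fs -rm -r oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/destfolder/

This deletes destfolder/ and all files within it from examplebucket. The -r option is needed because destfolder/ is a directory; without it, -rm refuses to delete a directory.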

Useful options:

| Option | Description |
| --- | --- |
| -r | Recursively delete a non-empty directory and its contents. |
| -skipTrash | Permanently delete without moving to the trash directory. |
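
Because -skipTrash bypasses the trash directory, it can help to confirm that the target exists before deleting it. The sketch below assumes a hypothetical function named remove_dir and uses the standard Hadoop -test -e check, which exits with 0 when the path exists.

```shell
# Hypothetical helper: permanently delete a directory tree, but only
# if the path exists. With -skipTrash the data is not moved to the
# trash directory, so it cannot be restored from there afterwards.
# Usage: remove_dir <uri>
remove_dir() {
  # -test -e exits with 0 when the path exists.
  if hadoop fs -test -e "$1"; then
    hadoop fs -rm -r -skipTrash "$1"
  else
    echo "nothing to delete: $1" >&2
  fi
}
```

For example, remove_dir oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/destfolder/ deletes destfolder/ permanently, or prints a message to standard error if the path does not exist.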
