In an E-MapReduce (EMR) cluster, you can run hadoop fs commands to perform file operations on the Hadoop Distributed File System (HDFS).
Prerequisites
Before you begin, make sure that you have:
- Cluster access: Logged on to a node in the cluster (typically the master node) via Secure Shell (SSH).
- User permissions: An account with read and write permissions for the destination HDFS path, such as the default hadoopuser. For clusters with Kerberos authentication enabled, complete identity authentication first.
Command versions
Hadoop provides two equivalent command formats for file system operations:
- hdfs dfs <args>: Specific to HDFS.
- hadoop fs <args>: A generic file system command that works with HDFS and other Hadoop-compatible file systems, including the local file system (file:///).
All examples in this topic use hadoop fs.
Command cheat sheet
The following table lists the most common HDFS commands.
| Command | Description | Syntax |
|---|---|---|
| mkdir | Creates a directory in HDFS. | hadoop fs -mkdir [-p] <paths> |
| touchz | Creates an empty file (0 bytes) in HDFS. | hadoop fs -touchz URI [URI ...] |
| ls | Lists files and directories at a path with their metadata. | hadoop fs -ls [-h] [-R] [-t] <args> |
| put | Uploads files from the local file system to HDFS. | hadoop fs -put [-f] [-p] <localsrc> <dst> |
| get | Downloads files or directories from HDFS to the local file system. | hadoop fs -get [-f] [-p] <src> <localdst> |
| cp | Copies files or directories within HDFS. | hadoop fs -cp [-f] URI [URI ...] <dest> |
| mv | Moves or renames files or directories within HDFS. | hadoop fs -mv URI [URI ...] <dest> |
| rm | Deletes files or directories in HDFS. | hadoop fs -rm [-f] [-r] [-skipTrash] URI [URI ...] |
| cat | Prints the content of a file in HDFS to stdout. | hadoop fs -cat URI [URI ...] |
| du | Shows the size of a file or the total size of files in a directory. | hadoop fs -du [-s] [-h] URI [URI ...] |
For the full command reference, see the Apache Hadoop FileSystem Shell documentation.
Directory and file management
mkdir: Create a directory
Creates a directory in HDFS.
Syntax
hadoop fs -mkdir [-p] <paths>
Options
| Option | Description |
|---|---|
| -p | Creates parent directories in the path if they do not exist, similar to mkdir -p on Linux. Use this option in production to prevent errors when a parent directory is missing. |
Example
Create the /dir directory:
hadoop fs -mkdir /dir
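With -p, a whole directory tree can be created in one command. A minimal sketch; the paths below are examples:

```shell
# Create nested directories in one call; without -p this fails
# if /data or /data/logs does not already exist
hadoop fs -mkdir -p /data/logs/2024
```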
touchz: Create an empty file
Creates an empty file (0 bytes) in HDFS. Common uses:
- As a marker file to signal that a task is complete
- To create an empty output file before data processing
Syntax
hadoop fs -touchz URI [URI ...]
Example
Create emptyfile.txt in /dir/:
hadoop fs -touchz /dir/emptyfile.txt
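The marker-file pattern can be sketched with hadoop fs -test, which exits with status 0 if a path exists. Paths are examples:

```shell
# Create a marker file once the upstream task finishes
hadoop fs -touchz /dir/output/_DONE
# -test -e works in shell conditionals, so downstream jobs can poll for the marker
hadoop fs -test -e /dir/output/_DONE && echo "upstream task complete"
```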
ls: List files and directories
Lists files and directories at a path, along with permissions, replication factor, owner, group, size, and modification time.
Syntax
hadoop fs -ls [-h] [-R] [-t] <args>
Options
| Option | Description |
|---|---|
| -h | Displays file sizes in human-readable format (for example, 1K, 234M, 2G). |
| -R | Recursively lists the contents of all subdirectories. |
| -t | Sorts output by modification time, newest first. |
Example
List the contents of /dir:
hadoop fs -ls /dir
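The options can be combined. For example, to recursively list /dir with human-readable sizes, newest entries first:

```shell
# -R recurses into subdirectories, -h prints sizes such as 1.2M,
# -t sorts by modification time (newest first)
hadoop fs -ls -R -h -t /dir
```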
File transfer
put: Upload files to HDFS
Copies one or more files from the local file system (on the EMR node where the command runs) to HDFS.
Syntax
hadoop fs -put [-f] [-p] <localsrc> <dst>
Options
| Option | Description |
|---|---|
| -f | Overwrites the destination file if it already exists. |
| -p | Preserves file access time, modification time, ownership, and permissions. |
Example
Upload hello.txt to /dir/sub-dir in HDFS:
hadoop fs -put hello.txt /dir/sub-dir
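put also accepts multiple source files in one call. A sketch with -f; the file names are examples:

```shell
# Upload two local files at once; -f overwrites any existing copies in HDFS
hadoop fs -put -f hello.txt world.txt /dir/sub-dir/
```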
get: Download files from HDFS
Copies files or directories from HDFS to the local file system (on the EMR node where the command runs).
Syntax
hadoop fs -get [-f] [-p] <src> <localdst>
Options
| Option | Description |
|---|---|
| -f | Overwrites the destination file if it already exists. |
| -p | Preserves file access time, modification time, ownership, and permissions. |
Example
Download /dir/emptyfile.txt from HDFS to the local / path:
hadoop fs -get /dir/emptyfile.txt /
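get also works on whole directories. A sketch, assuming /dir/sub-dir exists in HDFS and /tmp/local-copy is the desired local destination:

```shell
# Download the entire directory; -f overwrites existing local files
hadoop fs -get -f /dir/sub-dir /tmp/local-copy
```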
File operations
cp: Copy files or directories
Copies files or directories within HDFS.
Syntax
hadoop fs -cp [-f] URI [URI ...] <dest>
Options
| Option | Description |
|---|---|
| -f | Overwrites the destination file if it already exists. |
Example
Copy hello_world.txt from /dir/sub-dir/ to /tmp:
hadoop fs -cp /dir/sub-dir/hello_world.txt /tmp
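cp copies directories recursively by default. A sketch, assuming a /backup directory exists:

```shell
# Copy an entire directory tree within HDFS; -f overwrites existing files
hadoop fs -cp -f /dir/sub-dir /backup
```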
mv: Move or rename files or directories
Moves or renames files or directories within HDFS. This operation is atomic. When moving within the same file system, only NameNode metadata is updated, not the underlying data blocks, so the operation completes quickly.
Syntax
hadoop fs -mv URI [URI ...] <dest>
Examples
Move hello_world2.txt from /tmp/ to /dir/sub-dir/:
hadoop fs -mv /tmp/hello_world2.txt /dir/sub-dir/
Move the test directory from /tmp/ to /dir/sub-dir/:
hadoop fs -mv /tmp/test /dir/sub-dir/
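Because only metadata changes, mv is also the way to rename a file in place. The file names below are examples:

```shell
# Rename a file within the same directory; data blocks are not rewritten
hadoop fs -mv /dir/sub-dir/hello_world.txt /dir/sub-dir/hello_world_v1.txt
```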
rm: Delete files or directories
Deletes files or directories in HDFS. By default, deleted items are moved to the current user's Trash at /user/<username>/.Trash/.
Syntax
hadoop fs -rm [-f] [-r] [-skipTrash] URI [URI ...]
Options
| Option | Description |
|---|---|
| -r | Recursively deletes a directory and all its contents. Required when deleting a directory. |
| -f | Suppresses error messages if the specified file or directory does not exist. |
| -skipTrash | Permanently deletes the file or directory, bypassing Trash. Use with caution; this action cannot be undone. |
The hadoop fs -rmr command is deprecated. Use hadoop fs -rm -r to recursively delete directories.
Example
Delete hello_world2.txt from /dir/sub-dir/:
hadoop fs -rm /dir/sub-dir/hello_world2.txt
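To delete a directory permanently, combine -r with -skipTrash. The path below is an example; this cannot be undone:

```shell
# Recursively delete a directory, bypassing Trash (unrecoverable)
hadoop fs -rm -r -skipTrash /tmp/scratch
```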
File viewing
cat: View file content
Prints the content of a file in HDFS to stdout. Pass multiple URIs to concatenate them.
Syntax
hadoop fs -cat URI [URI ...]
Examples
Print the content of /hello.txt:
hadoop fs -cat /hello.txt
Print the content of /dir/sub-dir/hello_world.txt:
hadoop fs -cat /dir/sub-dir/hello_world.txt
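For large files, avoid streaming the entire content to the terminal by piping through head:

```shell
# Print only the first 20 lines of a potentially large file
hadoop fs -cat /dir/sub-dir/hello_world.txt | head -n 20
```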
du: Display file or directory size
Shows the size of a file, or the size of each file inside a directory.
Syntax
hadoop fs -du [-s] [-h] URI [URI ...]
Options
| Option | Description |
|---|---|
| -s | Displays the aggregate total size instead of a per-file breakdown. |
| -h | Displays sizes in human-readable format (for example, 1K, 234M, 2G). |
Examples
Show the size of /hello.txt:
hadoop fs -du /hello.txt
Show the total size of all files in /dir:
hadoop fs -du /dir
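Combining -s and -h gives a single human-readable total, which is the most common form:

```shell
# One aggregate size for /dir, in units such as K, M, G
hadoop fs -du -s -h /dir
```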
Troubleshooting
Permission denied
The current user lacks read, write, or execute permission on the target path.
Check the permissions of the file or its parent directory:
hdfs dfs -ls <parent_dir>
Then have an administrator grant the necessary permissions using chmod or chown. If Kerberos is enabled, make sure you have a valid Kerberos ticket by running kinit before retrying.
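A sketch of the administrator-side fix; the user name, group, path, and Kerberos principal below are examples:

```shell
# Run as an HDFS superuser: give hadoopuser ownership and group write access
hdfs dfs -chown hadoopuser:hadoop /dir/sub-dir
hdfs dfs -chmod 775 /dir/sub-dir
# With Kerberos enabled, obtain a ticket first (principal is an example)
kinit hadoopuser@EXAMPLE.COM
```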
SafeModeException: NameNode is in safe mode
The NameNode enters safe mode at startup and rejects write operations during this period. Wait a few minutes for it to exit automatically. Check the current status with:
hdfs dfsadmin -safemode get
Do not force an exit from safe mode unless it is an emergency.
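Instead of polling manually, a script can block until the NameNode leaves safe mode:

```shell
# Returns once safe mode is OFF, so subsequent write commands are safe to run
hdfs dfsadmin -safemode wait
```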
No such file or directory
The specified path does not exist. Check for typos in the path. If you are writing to a file, make sure the parent directory exists first — use hadoop fs -mkdir -p <parent_dir> to create it.
StandbyException: Operation category READ is not supported in state standby
In a high availability (HA) setup, a request was routed to a NameNode in Standby state. Check core-site.xml and confirm that fs.defaultFS points to the HA NameService name (for example, hdfs://mycluster) rather than a specific NameNode hostname.
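To see which NameNode is currently active, query each configured NameNode ID; nn1 and nn2 below are example IDs, taken in practice from dfs.ha.namenodes.* in hdfs-site.xml:

```shell
# Prints "active" or "standby" for each NameNode
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```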
What's next
- For HA cluster administration, see HDFS high availability (HA) commands (HaAdmin).
- To migrate data between Hadoop clusters or between HDFS and Object Storage Service (OSS), see Hadoop DistCp.