In an E-MapReduce (EMR) cluster, you can run hadoop fs commands to perform operations on files in Hadoop Distributed File System (HDFS). This topic describes the common commands of HDFS.
Background information
Command | Description |
---|---|
mkdir | Creates a directory in HDFS. |
touchz | Creates an empty file in HDFS. |
ls | Displays information about a file or directory at the specified path. Specify an absolute path; HDFS resolves a relative path against your home directory (typically /user/<username>), not the root directory. |
put | Uploads a local file to the specified directory of HDFS. |
du | Displays the size of a file or the size of each file stored in a directory. |
cat | Views the content of a file in HDFS. |
cp | Copies files or directories from the source to the destination in HDFS. The files or directories remain unchanged in the source. You can run this command to copy files from multiple source paths to a destination path. The destination path must be a directory. |
mv | Moves files or directories from the source to the destination in HDFS. The files or directories are deleted from the source. You can run this command to move files from multiple source paths to a destination path. In that case, the destination path must be a directory. |
get | Downloads a file that is stored in the specified directory of HDFS to a local directory. |
rm | Deletes a file from HDFS. |
rmr | Recursively deletes a directory from HDFS. |
For more information about Apache Hadoop, visit the Apache Hadoop official website.
mkdir
Creates a directory in HDFS.
- Syntax:
hadoop fs -mkdir <path1> [path2] ... [pathn]
- Examples:
- Create a directory named dir in the root directory of HDFS.
hadoop fs -mkdir /dir
You can run the hadoop fs -ls / command to view the created directory.
- Create a subdirectory named sub-dir in the /dir directory.
hadoop fs -mkdir /dir/sub-dir
You can run the hadoop fs -ls /dir/ command to view the created subdirectory.
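When several levels of a path do not exist yet, the -p option of mkdir creates the missing parent directories in one call and does not report an error if the directory already exists. The following is a minimal sketch, assuming the hadoop client is on the PATH. The function name ensure_hdfs_dir and the path in the usage line are illustrative only.

```shell
# ensure_hdfs_dir is a hypothetical helper, not part of Hadoop.
# It creates each given directory together with any missing parents.
ensure_hdfs_dir() {
  hadoop fs -mkdir -p "$@"
}

# Usage (requires a reachable HDFS):
#   ensure_hdfs_dir /dir/sub-dir/logs
```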
touchz
Creates an empty file in HDFS.
- Syntax:
hadoop fs -touchz URI [URI ...]
- Example: Create a file named emptyfile.txt in the /dir/ directory of HDFS.
hadoop fs -touchz /dir/emptyfile.txt
You can run the hadoop fs -ls /dir/ command to view the created file.
ls
Displays information about a file or directory at the specified path. Specify an absolute path; HDFS resolves a relative path against your home directory (typically /user/<username>), not the root directory.
- Syntax:
hadoop fs -ls <path>
- Example: View the information about the /dir/sub-dir directory.
hadoop fs -ls /dir/sub-dir
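To inspect a whole directory tree rather than a single level, ls accepts the -R option (recurse into subdirectories) and the -h option (print sizes in human-readable units). A small sketch under the assumption that the hadoop client is on the PATH; list_hdfs_tree is a hypothetical name chosen for illustration.

```shell
# list_hdfs_tree is a hypothetical helper, not part of Hadoop.
# Argument: <HDFS path>
list_hdfs_tree() {
  # -R lists subdirectories recursively; -h formats file sizes for reading.
  hadoop fs -ls -R -h "$1"
}

# Usage (requires a reachable HDFS):
#   list_hdfs_tree /dir
```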
put
Uploads a local file to the specified directory of HDFS.
- Syntax:
hadoop fs -put <path1> <path2>
- Example: Upload a file named hello.txt to the /dir/sub-dir directory of HDFS.
hadoop fs -put hello.txt /dir/sub-dir
You can run the hadoop fs -ls /dir/sub-dir command to check whether the file is uploaded.
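In a script, it can help to check the local file before the upload and confirm the result afterwards. The sketch below assumes the hadoop client is on the PATH. It uses the -f option of put, which overwrites an existing file of the same name, and hadoop fs -test -e, which returns 0 when a path exists. The function name upload_file is hypothetical.

```shell
# upload_file is a hypothetical helper, not part of Hadoop.
# Arguments: <local file> <HDFS directory>
upload_file() {
  local src="$1" dest_dir="$2"
  if [ ! -f "$src" ]; then
    echo "local file not found: $src" >&2
    return 1
  fi
  # -f overwrites a file of the same name in the destination directory.
  hadoop fs -put -f "$src" "$dest_dir" || return 1
  # -test -e returns 0 when the uploaded path exists in HDFS.
  hadoop fs -test -e "${dest_dir%/}/$(basename "$src")"
}

# Usage (requires a reachable HDFS):
#   upload_file hello.txt /dir/sub-dir
```

Whether overwriting is desirable depends on the workflow; drop -f if an existing file should cause the upload to fail instead.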
du
Displays the size of a file or the size of each file stored in a directory.
- Syntax:
hadoop fs -du <path>
- Examples:
- View the size of the hello.txt file.
hadoop fs -du /hello.txt
- View the size of each file stored in the /dir directory.
hadoop fs -du /dir
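For a single total instead of one line per file, du accepts the -s option (summary) and the -h option (human-readable units). A brief sketch, assuming the hadoop client is on the PATH; hdfs_dir_size is a hypothetical name used for illustration.

```shell
# hdfs_dir_size is a hypothetical helper, not part of Hadoop.
# Argument: <HDFS path>
hdfs_dir_size() {
  # -s prints one aggregate line for the path; -h formats the size for reading.
  hadoop fs -du -s -h "$1"
}

# Usage (requires a reachable HDFS):
#   hdfs_dir_size /dir
```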
cat
Views the content of a file in HDFS.
- Syntax:
hadoop fs -cat <path>
- Examples:
- View the content of the hello.txt file.
hadoop fs -cat /hello.txt
- View the content of the hello_world.txt file that is stored in the /dir/sub-dir/ directory.
hadoop fs -cat /dir/sub-dir/hello_world.txt
cp
Copies files or directories from the source to the destination in HDFS. The files or directories remain unchanged in the source. You can run this command to copy files from multiple source paths to a destination path. The destination path must be a directory.
- Syntax:
hadoop fs -cp <path1> <path2>
- Example: Copy the hello_world.txt file from the /dir/sub-dir/ directory to the /tmp directory.
hadoop fs -cp /dir/sub-dir/hello_world.txt /tmp
You can run the hadoop fs -ls /tmp command to check whether the file is copied.
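The multiple-source form described above can be sketched as follows. The helper first checks, with hadoop fs -test -d, that the last argument is an existing directory, and then passes all arguments to cp. It assumes the hadoop client is on the PATH; the function name copy_into_dir is hypothetical.

```shell
# copy_into_dir is a hypothetical helper, not part of Hadoop.
# Arguments: <source path>... <destination directory>
copy_into_dir() {
  local dest
  if [ "$#" -lt 2 ]; then
    echo "usage: copy_into_dir <src>... <dest dir>" >&2
    return 2
  fi
  # Walk the argument list so that dest ends up holding the last argument.
  for dest in "$@"; do :; done
  # -test -d returns 0 only when the path is an existing directory.
  if ! hadoop fs -test -d "$dest"; then
    echo "destination is not a directory: $dest" >&2
    return 1
  fi
  hadoop fs -cp "$@"
}

# Usage (requires a reachable HDFS):
#   copy_into_dir /dir/sub-dir/hello_world.txt /dir/emptyfile.txt /tmp
```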
mv
Moves files or directories from the source to the destination in HDFS. The files or directories are deleted from the source. You can run this command to move files from multiple source paths to a destination path. In that case, the destination path must be a directory.
- Syntax:
hadoop fs -mv <path1> <path2>
- Examples:
- Move the hello_world2.txt file from the /tmp/ directory to the /dir/sub-dir/ directory.
hadoop fs -mv /tmp/hello_world2.txt /dir/sub-dir/
You can run the hadoop fs -ls /dir/sub-dir/ command to check whether the file is moved.
- Move the test directory in the /tmp/ directory to the /dir/sub-dir/ directory.
hadoop fs -mv /tmp/test /dir/sub-dir/
You can run the hadoop fs -ls /tmp/ command to check whether the directory is moved.
get
Downloads a file that is stored in the specified directory of HDFS to a local directory.
- Syntax:
hadoop fs -get <path1> <path2>
- Example: Download the hello_world2.txt file that is stored in the /dir/sub-dir/ directory of HDFS to the local /emr directory.
hadoop fs -get /dir/sub-dir/hello_world2.txt /emr
You can run the ls /emr command on the local machine to check whether the file is downloaded.
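A download step in a script might create the local target directory first and then confirm that the file arrived. The sketch below is one way to do that, assuming the hadoop client is on the PATH. The function name download_file is hypothetical.

```shell
# download_file is a hypothetical helper, not part of Hadoop.
# Arguments: <HDFS file> <local directory>
download_file() {
  local src="$1" local_dir="$2"
  # Create the local directory if it does not exist yet.
  mkdir -p "$local_dir" || return 1
  hadoop fs -get "$src" "$local_dir" || return 1
  # List the downloaded file; ls fails if the file is missing.
  ls -l "$local_dir/$(basename "$src")"
}

# Usage (requires a reachable HDFS):
#   download_file /dir/sub-dir/hello_world2.txt /emr
```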
rm
Deletes a file from HDFS.
- Syntax:
hadoop fs -rm <path>
- Example: Delete the hello_world2.txt file from the /dir/sub-dir/ directory of HDFS.
hadoop fs -rm /dir/sub-dir/hello_world2.txt
You can run the hadoop fs -ls /dir/sub-dir/ command to check whether the file is deleted.
rmr
Recursively deletes a directory from HDFS. In current Hadoop releases, rmr is deprecated; hadoop fs -rm -r is the equivalent command.
- Syntax:
hadoop fs -rmr <path>
- Example: Recursively delete the sub-dir directory in the /dir/ directory of HDFS.
hadoop fs -rmr /dir/sub-dir/
You can run the hadoop fs -ls /dir/ command to check whether the directory is deleted.
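Because a recursive delete removes everything under the path, a script may want a guard against an empty argument or the root directory. The sketch below uses hadoop fs -rm -r, which current Hadoop releases recommend in place of the deprecated rmr. It assumes the hadoop client is on the PATH; the function name remove_hdfs_dir and the guard itself are illustrative, not part of Hadoop.

```shell
# remove_hdfs_dir is a hypothetical helper, not part of Hadoop.
# Argument: <HDFS directory>
remove_hdfs_dir() {
  local target="$1"
  case "$target" in
    ""|"/")
      # Refuse an empty path or the HDFS root directory.
      echo "refusing to delete: '$target'" >&2
      return 1
      ;;
  esac
  hadoop fs -rm -r "$target"
}

# Usage (requires a reachable HDFS):
#   remove_hdfs_dir /dir/sub-dir/
```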