In an E-MapReduce (EMR) cluster, you can run hadoop fs commands to perform operations on files in Hadoop Distributed File System (HDFS). This topic describes the common commands of HDFS.

Background information

The following table describes the common commands of HDFS.
Command | Description
------- | -----------
mkdir | Creates a directory in HDFS.
touchz | Creates an empty file in HDFS.
ls | Lists information about a file or directory in the specified path. Specify an absolute path; a relative path is resolved against your HDFS home directory, such as /user/<username>.
put | Uploads a local file to the specified directory of HDFS.
du | Displays the size of a file or the size of each file stored in a directory.
cat | Views the content of a file in HDFS.
cp | Copies files or directories from the source to the destination in HDFS. The source files or directories remain unchanged. You can specify multiple source paths, in which case the destination must be an existing directory.
mv | Moves files or directories from the source to the destination in HDFS. The files or directories are removed from the source. You can specify multiple source paths, in which case the destination must be an existing directory.
get | Downloads a file that is stored in the specified directory of HDFS to a local directory.
rm | Deletes a file from HDFS.
rmr | Recursively deletes a directory from HDFS.

For more information about Apache Hadoop, visit the Apache Hadoop official website.

mkdir

Creates a directory in HDFS.

  • Syntax:
    hadoop fs -mkdir <path1> [path2] ... [pathn]
  • Examples:
    • Create a directory named dir in the root directory of HDFS.
      hadoop fs -mkdir /dir

      You can run the hadoop fs -ls / command to view the created directory.

    • Create a subdirectory named sub-dir in the dir directory.
      hadoop fs -mkdir /dir/sub-dir

      You can run the hadoop fs -ls /dir/ command to view the created subdirectory.
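
If the parent directory might not exist yet, you can add the -p option, which creates any missing parent directories. The following is a minimal sketch, assuming that the hadoop client is available on the node where you run it. The function name hdfs_ensure_dir is hypothetical and is used only for illustration.

```shell
# Hypothetical helper: create an HDFS directory only if it does not exist yet.
# -test -d returns 0 if the path exists and is a directory.
# -mkdir -p also creates any missing parent directories.
hdfs_ensure_dir() {
  if hadoop fs -test -d "$1"; then
    echo "exists: $1"
  else
    hadoop fs -mkdir -p "$1" && echo "created: $1"
  fi
}
```

For example, hdfs_ensure_dir /dir/sub-dir creates both /dir and /dir/sub-dir if neither exists.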

touchz

Creates an empty file in HDFS.

  • Syntax:
    hadoop fs -touchz URI [URI ...]
  • Example: Create a file named emptyfile.txt in the /dir/ directory of HDFS.
    hadoop fs -touchz /dir/emptyfile.txt

    You can run the hadoop fs -ls /dir/ command to view the created file.
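
Empty files are often used as markers, for example to signal that a directory is ready for processing. The following sketch creates such a file and confirms that it is zero bytes long. The function name hdfs_touch_marker and the file name _READY are hypothetical.

```shell
# Hypothetical helper: create an empty marker file and confirm that it is empty.
# -touchz creates a zero-length file.
# -test -z returns 0 if the file is zero bytes long.
hdfs_touch_marker() {
  hadoop fs -touchz "$1" &&
    hadoop fs -test -z "$1" &&
    echo "empty file ready: $1"
}
```

For example, hdfs_touch_marker /dir/_READY creates the marker in the /dir/ directory. Note that touchz returns an error if the file already exists and is not empty.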

ls

Lists information about a file or directory in the specified path. Specify an absolute path; a relative path is resolved against your HDFS home directory, such as /user/<username>.

Note: This command only displays information. You cannot use it to switch to the specified directory.
  • Syntax:
    hadoop fs -ls <path>
  • Example: View the information about the /dir/sub-dir directory.
    hadoop fs -ls /dir/sub-dir
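
To list the contents of all subdirectories as well, you can add the -R option. The following sketch prints only the paths of files, not directories. It assumes the usual eight-column output of hadoop fs -ls, in which the first column starts with d for directories. The function name hdfs_list_files is hypothetical, and paths that contain spaces are not handled.

```shell
# Hypothetical helper: print the paths of all files under an HDFS path.
# -R lists subdirectories recursively.
# Lines with fewer than 8 fields (such as "Found N items") are skipped,
# and entries whose permission string starts with "d" are directories.
hdfs_list_files() {
  hadoop fs -ls -R "$1" | awk 'NF >= 8 && $1 !~ /^d/ { print $NF }'
}
```

For example, hdfs_list_files /dir prints one file path per line for everything stored under /dir.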

put

Uploads a local file to the specified directory of HDFS.

  • Syntax:
    hadoop fs -put <path1> <path2>
  • Example: Upload a file named hello.txt to the /dir/sub-dir directory of HDFS.
    hadoop fs -put hello.txt /dir/sub-dir

    You can run the hadoop fs -ls /dir/sub-dir command to check whether the file is uploaded.
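
By default, put fails if the destination file already exists. The following sketch checks that the local file exists and then uploads it with the -f option, which overwrites an existing destination file. The function name hdfs_upload is hypothetical.

```shell
# Hypothetical helper: upload a local file and replace the HDFS copy if it exists.
# $1 = local file, $2 = HDFS destination directory
hdfs_upload() {
  if [ ! -f "$1" ]; then
    echo "local file not found: $1" >&2
    return 1
  fi
  # -f overwrites the destination file if it already exists.
  hadoop fs -put -f "$1" "$2"
}
```

For example, hdfs_upload hello.txt /dir/sub-dir uploads hello.txt from the current local directory.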

du

Displays the size of a file or the size of each file stored in a directory.

  • Syntax:
    hadoop fs -du <path>
  • Examples:
    • View the size of the hello.txt file.
      hadoop fs -du /hello.txt
    • View the size of each file stored in the /dir directory.
      hadoop fs -du /dir
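
To find out which entries take up the most space, you can sort the output of du by its first column, which contains the size in bytes. The following is a minimal sketch. The function name hdfs_largest is hypothetical.

```shell
# Hypothetical helper: show the largest entries in an HDFS directory.
# $1 = HDFS directory, $2 = number of entries to show (defaults to 10)
hdfs_largest() {
  # Sort numerically by the size column in descending order.
  hadoop fs -du "$1" | sort -k1,1nr | head -n "${2:-10}"
}
```

For example, hdfs_largest /dir 5 shows the five largest entries in /dir. If you need only a single total, you can run hadoop fs -du -s -h /dir, where -s prints a summary and -h prints sizes in a human-readable format.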

cat

Views the content of a file in HDFS.

  • Syntax:
    hadoop fs -cat <path>
  • Examples:
    • View the content of the hello.txt file.
      hadoop fs -cat /hello.txt
    • View the content of the hello_world.txt file that is stored in the /dir/sub-dir/ directory.
      hadoop fs -cat /dir/sub-dir/hello_world.txt
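
For large files, printing the entire content is rarely practical. The following sketch pipes the output of cat to the local head command to preview only the first lines. The function name hdfs_head is hypothetical.

```shell
# Hypothetical helper: preview the first lines of an HDFS file.
# $1 = HDFS file, $2 = number of lines (defaults to 10)
hdfs_head() {
  hadoop fs -cat "$1" | head -n "${2:-10}"
}
```

For example, hdfs_head /dir/sub-dir/hello_world.txt 20 prints the first 20 lines of the file.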

cp

Copies files or directories from the source to the destination in HDFS. The source files or directories remain unchanged. You can specify multiple source paths, in which case the destination must be an existing directory.

Note: You can also run this command to copy a file or directory under a new name. The original file or directory is retained.
  • Syntax:
    hadoop fs -cp <path1> <path2>
  • Example: Copy the hello_world.txt file from the /dir/sub-dir/ directory to the /tmp directory.
    hadoop fs -cp /dir/sub-dir/hello_world.txt /tmp

    You can run the hadoop fs -ls /tmp command to check whether the file is copied.
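
When you copy several sources in one command, the destination must be a directory. The following sketch checks this before it copies. The function name hdfs_copy_into is hypothetical, and the destination is passed first so that any number of sources can follow.

```shell
# Hypothetical helper: copy one or more HDFS paths into an existing HDFS directory.
# $1 = destination directory, remaining arguments = source paths
hdfs_copy_into() {
  if [ "$#" -lt 2 ]; then
    echo "usage: hdfs_copy_into <dest-dir> <src> [src ...]" >&2
    return 1
  fi
  dest=$1
  shift
  # -test -d returns 0 only if the destination exists and is a directory.
  if ! hadoop fs -test -d "$dest"; then
    echo "not a directory: $dest" >&2
    return 1
  fi
  hadoop fs -cp "$@" "$dest"
}
```

For example, hdfs_copy_into /tmp /dir/sub-dir/hello.txt /dir/sub-dir/hello_world.txt copies both files to /tmp.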

mv

Moves files or directories from the source to the destination in HDFS. The files or directories are removed from the source. You can specify multiple source paths, in which case the destination must be an existing directory.

  • Syntax:
    hadoop fs -mv <path1> <path2>
  • Examples:
    • Move the hello_world2.txt file from the /tmp/ directory to the /dir/sub-dir/ directory.
      hadoop fs -mv /tmp/hello_world2.txt /dir/sub-dir/

      You can run the hadoop fs -ls /dir/sub-dir/ command to check whether the file is moved.

    • Move the test directory in the /tmp/ directory to the /dir/sub-dir/ directory.
      hadoop fs -mv /tmp/test /dir/sub-dir/

      You can run the hadoop fs -ls /tmp/ command to check whether the directory is moved.
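
Instead of checking the result manually with ls, you can test for the existence of paths with the -test -e option. The following sketch moves a path into a directory and then confirms that it exists in the destination and no longer exists in the source. The function name hdfs_move_checked is hypothetical.

```shell
# Hypothetical helper: move an HDFS path into a directory and verify the result.
# $1 = source path, $2 = destination directory
# -test -e returns 0 if the path exists.
hdfs_move_checked() {
  name=$(basename "$1")
  hadoop fs -mv "$1" "$2" &&
    hadoop fs -test -e "${2%/}/$name" &&
    ! hadoop fs -test -e "$1"
}
```

For example, hdfs_move_checked /tmp/test /dir/sub-dir/ returns 0 only if /dir/sub-dir/test exists and /tmp/test is gone after the move.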

get

Downloads a file that is stored in the specified directory of HDFS to a local directory.

  • Syntax:
    hadoop fs -get <path1> <path2>
  • Example: Download the hello_world2.txt file that is stored in the /dir/sub-dir/ directory of HDFS to the local /emr directory.
    hadoop fs -get /dir/sub-dir/hello_world2.txt /emr

    You can run the ls /emr command on the local node to check whether the file is downloaded.
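
The download and the local check can be combined in one step. The following sketch creates the local directory if needed, downloads the file, and lists the local copy. The function name hdfs_download is hypothetical.

```shell
# Hypothetical helper: download an HDFS file and show the local copy.
# $1 = HDFS file, $2 = local directory
hdfs_download() {
  mkdir -p "$2" &&
    hadoop fs -get "$1" "$2" &&
    ls -l "$2/$(basename "$1")"
}
```

For example, hdfs_download /dir/sub-dir/hello_world2.txt /emr downloads the file and prints its local size and timestamp.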

rm

Deletes a file from HDFS.

  • Syntax:
    hadoop fs -rm <path>
  • Example: Delete the hello_world2.txt file from the /dir/sub-dir/ directory of HDFS.
    hadoop fs -rm /dir/sub-dir/hello_world2.txt

    You can run the hadoop fs -ls /dir/sub-dir/ command to check whether the file is deleted.

rmr

Recursively deletes a directory from HDFS.

Note: In Hadoop 2.x and later, the rmr command is deprecated. Use hadoop fs -rm -r <path> instead.

  • Syntax:
    hadoop fs -rmr <path>
  • Example: Recursively delete the sub-dir directory in the /dir/ directory of HDFS.
    hadoop fs -rmr /dir/sub-dir/

    You can run the hadoop fs -ls /dir/ command to check whether the directory is deleted.
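
Because a recursive delete removes everything under the specified path, a small guard against dangerous arguments can be useful. The following sketch refuses an empty path or the root directory and otherwise runs hadoop fs -rm -r, the current form of the deprecated -rmr option. The function name hdfs_rm_tree is hypothetical.

```shell
# Hypothetical helper: recursively delete an HDFS directory with basic safety checks.
hdfs_rm_tree() {
  case "$1" in
    "" | "/")
      echo "refusing to delete '$1'" >&2
      return 1
      ;;
  esac
  # -rm -r deletes the directory and all of its contents.
  hadoop fs -rm -r "$1"
}
```

For example, hdfs_rm_tree /dir/sub-dir/ deletes the sub-dir directory, whereas hdfs_rm_tree / returns an error without calling hadoop.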