This topic describes how to use Java APIs to perform operations on Hadoop Distributed File System (HDFS).

Background information

Initialize HDFS

Before you use an API provided by HDFS, you must initialize HDFS. When you initialize HDFS, you must load the core-site.xml and hdfs-site.xml configuration files of HDFS.

Sample code to initialize HDFS:
private void init() throws IOException {
    conf = new Configuration();
    // Load hdfs-site.xml and core-site.xml from their local paths.
    conf.addResource(new Path(PATH_TO_HDFS_SITE_XML));
    conf.addResource(new Path(PATH_TO_CORE_SITE_XML));
    // Obtain the FileSystem instance described by the loaded configuration.
    fSystem = FileSystem.get(conf);
}

After HDFS is initialized, you can call the APIs provided by HDFS to work with files and directories.
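
A mistyped configuration path can go unnoticed until FileSystem.get() connects to an unexpected file system, so it may help to check that each file exists before loading it. The following is a minimal sketch of that idea, assuming the Hadoop client libraries are on the classpath. The class HdfsConfSketch and the method buildConf() are hypothetical names used for illustration only.

```java
import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class; not part of the Hadoop API.
final class HdfsConfSketch {

    // Adds each given file (for example core-site.xml and hdfs-site.xml) to a new
    // Configuration and fails fast when a file does not exist on the local disk.
    static Configuration buildConf(String... xmlFiles) {
        Configuration conf = new Configuration();
        for (String xmlFile : xmlFiles) {
            if (!new File(xmlFile).isFile()) {
                throw new IllegalArgumentException("Configuration file not found: " + xmlFile);
            }
            conf.addResource(new Path(xmlFile));
        }
        return conf;
    }

    public static void main(String[] args) {
        // Pass the paths of the configuration files as program arguments.
        Configuration conf = buildConf(args);
        System.out.println("fs.defaultFS = " + conf.get("fs.defaultFS"));
    }
}
```

The resulting Configuration can then be passed to FileSystem.get() as in the sample above.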

Create a directory

To create a directory in HDFS, first use the exists() method in the FileSystem class to check whether the directory already exists.
  • If the directory already exists, exists() returns true and no further action is required.
  • If the directory does not exist, use the mkdirs() method in the FileSystem class to create it.

Sample code to create a directory:
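/**
 * Create a directory if it does not already exist.
 *
 * @param dirPath the directory to create
 * @return true if the directory exists when the method returns
 * @throws java.io.IOException if the file system cannot be accessed
 */
private boolean createPath(final Path dirPath) throws IOException {
    if (fSystem.exists(dirPath)) {
        return true;
    }
    // mkdirs() also creates any missing parent directories.
    return fSystem.mkdirs(dirPath);
}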

Write data to a file in HDFS

To write data to a file in HDFS, perform the following steps: Use the create() method in the FileSystem class to obtain an output stream. Then, use the output stream to write data to a specified file in HDFS. After you write data to the specified file, use the close() method to close the output stream.

Sample code to write data to a file:
/**
 * Create a file and write data to the file.
 *
 * @throws java.io.IOException
 */
private void createAndWrite() throws IOException {
   final String content = "Hello HDFS!";
   FSDataOutputStream out = null;
   try {
       out = fSystem.create(new Path(DEST_PATH + File.separator + FILE_NAME));
       out.write(content.getBytes());
       out.hsync();
       LOG.info("File written successfully.");
   } finally {
       // Close the stream only if create() succeeded; out is null otherwise.
       if (out != null) {
           out.close();
       }
   }
}
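
The close step can also be delegated to try-with-resources, because the stream returned by create() is a java.io.OutputStream and therefore closeable. The following is a minimal sketch of that pattern. To run without a cluster, it uses an in-memory stand-in stream (the hypothetical TrackingStream) instead of FSDataOutputStream, and it omits the hsync() call because the stand-in does not support it. It also encodes the text explicitly as UTF-8 instead of relying on the platform default charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the Hadoop API.
final class SafeWriteSketch {

    // Stand-in for the stream returned by FileSystem.create(); records whether close() ran.
    static final class TrackingStream extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Writes content as UTF-8. try-with-resources closes the stream even if write() throws.
    static void writeUtf8(OutputStream target, String content) {
        try (OutputStream out = target) {
            out.write(content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TrackingStream stream = new TrackingStream();
        writeUtf8(stream, "Hello HDFS!");
        System.out.println("closed=" + stream.closed + ", bytes=" + stream.size());
    }
}
```

With a real file system, the same try block would wrap the stream obtained from create(), and no finally clause would be needed.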

Append content to a file

You can append specified content to the end of an existing file in HDFS. To append content to a file, perform the following steps: Use the append() method in the FileSystem class to obtain an output stream. Then, use the output stream to append the specified content to the end of the file. After you append content to the file, use the close() method to close the output stream.
Notice Make sure that the file to which you want to append content already exists and that no other client is writing to it. Otherwise, the append operation fails and an exception is thrown.
Sample code to append specified content to a file:
/**
 * Append content.
 *
 * @throws java.io.IOException
 */
private void appendContents() throws IOException {
    final String content = "Hello Hello";
    FSDataOutputStream out = null;
    try {
        out = fSystem.append(new Path(DEST_PATH + File.separator + FILE_NAME));
        out.write(content.getBytes());
        out.hsync();
        LOG.info("Content appended successfully.");
    } finally {
        // Close the stream only if append() succeeded; out is null otherwise.
        if (out != null) {
            out.close();
        }
    }
}
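
Because append() requires an existing file, callers sometimes fall back to create() when the file is missing. The following is a minimal sketch of that fallback, assuming the Hadoop client libraries are on the classpath. The class AppendOrCreateSketch and its methods are hypothetical names. The demo runs against the local file system, which may not support append(), so it exercises only the create branch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class; not part of the Hadoop API.
final class AppendOrCreateSketch {

    // Opens the file for appending when it exists; otherwise creates it.
    // The exists() check and the open call are separate steps, so this is not atomic.
    static FSDataOutputStream openForAppend(FileSystem fs, Path file) throws IOException {
        return fs.exists(file) ? fs.append(file) : fs.create(file, false);
    }

    // Writes five bytes to a fresh file on the local file system and returns the file length.
    static long runLocalDemo() {
        try {
            FileSystem fs = FileSystem.getLocal(new Configuration());
            Path file = new Path(System.getProperty("java.io.tmpdir"), "hdfs-append-sketch.txt");
            fs.delete(file, false); // make sure the create branch is taken
            try (FSDataOutputStream out = openForAppend(fs, file)) {
                out.write("Hello".getBytes(StandardCharsets.UTF_8));
            }
            long length = fs.getFileStatus(file).getLen();
            fs.delete(file, false); // clean up the demo file
            return length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes written: " + runLocalDemo());
    }
}
```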

Read data from a file in HDFS

To read data from a specified file in HDFS, use the open() method in the FileSystem class to obtain an input stream. Then, use the input stream to read data from a specified file in HDFS. After you read data from the file, use the close() method to close the input stream.

Sample code to read data from a file:
private void read() throws IOException {
    String strPath = DEST_PATH + File.separator + FILE_NAME;
    Path path = new Path(strPath);
    FSDataInputStream in = null;
    BufferedReader reader = null;
    StringBuilder strBuffer = new StringBuilder();
    try {
        in = fSystem.open(path);
        reader = new BufferedReader(new InputStreamReader(in));
        String sTempOneLine;
        // Read the file line by line; readLine() strips the line terminators.
        while ((sTempOneLine = reader.readLine()) != null) {
            strBuffer.append(sTempOneLine);
        }
        LOG.info("result is : " + strBuffer.toString());
        LOG.info("File read successfully.");
    } finally {
        // make sure the streams are closed finally.
        IOUtils.closeStream(reader);
        IOUtils.closeStream(in);
    }
}
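
The sample above concatenates the lines without separators and decodes the bytes with the platform default charset. If the line breaks and a fixed charset matter, the read loop can be written as in the following minimal sketch. It reads from an in-memory stream so that it runs without a cluster. The stream returned by open() is a java.io.InputStream, so it could be passed to the same method. The class ReadLinesSketch and the method readAll() are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical helper class; not part of the Hadoop API.
final class ReadLinesSketch {

    // Reads the whole stream as UTF-8 text and joins the lines with '\n'.
    // try-with-resources closes the reader, which also closes the wrapped stream.
    static String readAll(InputStream source) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(source, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "Hello HDFS!\nHello Hello\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(data)));
    }
}
```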

Delete a directory

You can use the delete() method to delete a specified directory in HDFS. The first parameter, dirPath, specifies the directory to delete. The second parameter specifies whether to recursively delete all files and subdirectories in the specified directory. If the second parameter is set to false and some files or subdirectories still exist in the directory, the directory fails to be deleted.
Notice After you perform this operation, the specified directory is deleted and cannot be recovered. Proceed with caution.
Sample code to delete a directory:
private boolean deletePath(final Path dirPath) throws IOException {
    if (!fSystem.exists(dirPath)) {
        return false;
    }
    // true: delete the directory together with all of its contents.
    return fSystem.delete(dirPath, true);
}
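
The effect of the recursive flag can be seen in the following minimal sketch, which assumes the Hadoop client libraries are on the classpath and runs against the local file system so that no cluster is needed. A file system may report a non-empty directory either by returning false or by throwing an IOException, so the hypothetical helper deleteIfEmpty() treats both as failure. Note that it also hides other I/O errors, which production code would want to distinguish.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical demo class; not part of the Hadoop API.
final class DeleteDirSketch {

    // Non-recursive delete. Returns false whether the file system returns false
    // or throws, for example because the directory is not empty.
    static boolean deleteIfEmpty(FileSystem fs, Path dir) {
        try {
            return fs.delete(dir, false);
        } catch (IOException e) {
            return false;
        }
    }

    // Creates a non-empty directory, then tries both delete modes.
    static String runLocalDemo() {
        try {
            FileSystem fs = FileSystem.getLocal(new Configuration());
            Path dir = new Path(System.getProperty("java.io.tmpdir"), "hdfs-delete-sketch");
            fs.mkdirs(dir);
            fs.create(new Path(dir, "data.txt")).close(); // make the directory non-empty
            boolean nonRecursive = deleteIfEmpty(fs, dir);
            boolean recursive = fs.delete(dir, true);
            return "non-recursive=" + nonRecursive + ", recursive=" + recursive;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runLocalDemo());
    }
}
```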

Delete a file

You can use the delete() method to delete a specified file in HDFS.
Notice After you perform this operation, the specified file is deleted and cannot be recovered. Proceed with caution.
Sample code to delete a file:
private void deleteFile() throws IOException {
    Path beDeletedPath = new Path(DEST_PATH + File.separator + FILE_NAME);
    if (fSystem.delete(beDeletedPath, true)) {
        LOG.info("success to delete the file " + DEST_PATH + File.separator + FILE_NAME);
    } else {
        LOG.warn("failed to delete the file " + DEST_PATH + File.separator + FILE_NAME);
    }
}

Move or rename a file

In HDFS, renaming a file and moving a file are essentially the same operation. You can use the rename() method in the FileSystem class to rename a file.
Notice You cannot rename a file to a name that is already in use by another file. If the destination file already exists, the rename operation fails.
Sample code to move or rename a file:
private void renameFile() throws IOException {
    Path srcFilePath = new Path(SRC_PATH + File.separator + SRC_FILE_NAME);
    Path destFilePath = new Path(DEST_PATH + File.separator + DEST_FILE_NAME);
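    // rename() returns false if the file could not be moved.
    if (!fSystem.rename(srcFilePath, destFilePath)) {
        LOG.warn("failed to rename " + srcFilePath + " to " + destFilePath);
    }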
}

Move or rename a directory

In HDFS, renaming a directory and moving a directory are essentially the same operation. You can use the rename() method in the FileSystem class to rename a directory.
Notice You cannot rename a directory to an existing name. Otherwise, the rename operation fails.
Sample code to move or rename a directory:
private void renameDir() throws IOException {
    Path srcDirPath = new Path(SRC_PATH + File.separator + SRC_DIR_NAME);
    Path destDirPath = new Path(DEST_PATH + File.separator + DEST_DIR_NAME);
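    // rename() returns false if the directory could not be moved.
    if (!fSystem.rename(srcDirPath, destDirPath)) {
        LOG.warn("failed to rename " + srcDirPath + " to " + destDirPath);
    }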
}
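
For either rename case above, the result of rename() is worth checking, and the destination can be tested first so that a name conflict is reported explicitly. The following is a minimal sketch, assuming the Hadoop client libraries are on the classpath. It runs against the local file system so that no cluster is needed. The class RenameSketch and the method renameIfAbsent() are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical demo class; not part of the Hadoop API.
final class RenameSketch {

    // Renames src to dst only when src exists and dst is free.
    // The exists() checks and rename() are separate calls, so this is not atomic.
    static boolean renameIfAbsent(FileSystem fs, Path src, Path dst) throws IOException {
        if (!fs.exists(src) || fs.exists(dst)) {
            return false;
        }
        return fs.rename(src, dst);
    }

    // Moves one file to a free name, then tries to move a second file to the same name.
    static String runLocalDemo() {
        try {
            FileSystem fs = FileSystem.getLocal(new Configuration());
            Path dir = new Path(System.getProperty("java.io.tmpdir"), "hdfs-rename-sketch");
            fs.delete(dir, true); // start from a clean directory
            fs.mkdirs(dir);
            Path first = new Path(dir, "first.txt");
            Path second = new Path(dir, "second.txt");
            Path target = new Path(dir, "target.txt");
            fs.create(first).close();
            fs.create(second).close();
            boolean firstMove = renameIfAbsent(fs, first, target);   // target is free
            boolean secondMove = renameIfAbsent(fs, second, target); // target now exists
            fs.delete(dir, true); // clean up the demo directory
            return "first=" + firstMove + ", second=" + secondMove;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runLocalDemo());
    }
}
```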