This topic describes how to use Java APIs to perform operations on Hadoop Distributed File System (HDFS).
Background information
Initialize HDFS
Before you call an HDFS API, you must initialize HDFS. During initialization, load the HDFS configuration files core-site.xml and hdfs-site.xml.
private void init() throws IOException {
    conf = new Configuration();
    // Load the HDFS configuration files from the specified paths.
    conf.addResource(new Path(PATH_TO_HDFS_SITE_XML));
    conf.addResource(new Path(PATH_TO_CORE_SITE_XML));
    // Obtain the FileSystem instance that the other methods in this topic use.
    fSystem = FileSystem.get(conf);
}
After HDFS is initialized, you can call the HDFS APIs to manage files and directories.
Create a directory
- If the directory already exists, true is returned.
- If the directory does not exist, use the mkdirs() method in the FileSystem class to create a directory.
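/**
 * Create a directory.
 *
 * @param dirPath the directory to create
 * @return true if the directory already exists or is created
 * @throws java.io.IOException if the request to HDFS fails
 */
private boolean createPath(final Path dirPath) throws IOException {
    if (!fSystem.exists(dirPath)) {
        // Return the result of mkdirs() instead of assuming success.
        return fSystem.mkdirs(dirPath);
    }
    return true;
}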
Write data to a file in HDFS
To write data to a file in HDFS, perform the following steps: Use the create() method in the FileSystem class to obtain an output stream. Then, use the output stream to write data to a specified file in HDFS. After you write data to the specified file, use the close() method to close the output stream.
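/**
 * Create a file and write data to the file.
 *
 * @throws java.io.IOException if the write fails
 */
private void createAndWrite() throws IOException {
    final String content = "Hello HDFS!";
    FSDataOutputStream out = null;
    try {
        out = fSystem.create(new Path(DEST_PATH + File.separator + FILE_NAME));
        // Encode explicitly instead of relying on the platform default charset.
        out.write(content.getBytes(StandardCharsets.UTF_8));
        // Force the buffered data out before reporting success.
        out.hsync();
        LOG.info("Data written successfully.");
    } finally {
        // Close the stream. out is still null if create() failed.
        if (out != null) {
            out.close();
        }
    }
}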
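The close step described above can also be expressed with try-with-resources, which closes the stream even when write() throws. The following is a minimal sketch under stated assumptions, not part of the original sample: it writes to an in-memory ByteArrayOutputStream as a stand-in so that it runs without a cluster, and the class name WriteSketch and the methods writeAndClose() and demo() are hypothetical. Because FSDataOutputStream is a java.io.OutputStream, the stream returned by fSystem.create() could be passed in the same way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the HDFS API.
class WriteSketch {

    // Writes content as UTF-8 bytes. try-with-resources closes the stream
    // whether or not write() throws.
    public static void writeAndClose(OutputStream target, String content) throws IOException {
        try (OutputStream out = target) {
            out.write(content.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Runs the helper against an in-memory stream (a stand-in for an HDFS
    // output stream) and returns the text that was written.
    public static String demo() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            writeAndClose(sink, "Hello HDFS!");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With HDFS, the call would presumably look like writeAndClose(fSystem.create(path), content). The sketch omits hsync() because a plain OutputStream does not provide it.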
Append content to a file
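/**
 * Append content to an existing file.
 *
 * @throws java.io.IOException if the append fails
 */
private void appendContents() throws IOException {
    final String content = "Hello Hello";
    FSDataOutputStream out = null;
    try {
        out = fSystem.append(new Path(DEST_PATH + File.separator + FILE_NAME));
        // Encode explicitly instead of relying on the platform default charset.
        out.write(content.getBytes(StandardCharsets.UTF_8));
        // Force the buffered data out before reporting success.
        out.hsync();
        LOG.info("Data appended successfully.");
    } finally {
        // Close the stream. out is still null if append() failed.
        if (out != null) {
            out.close();
        }
    }
}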
Read data from a file in HDFS
To read data from a specified file in HDFS, use the open() method in the FileSystem class to obtain an input stream. Then, use the input stream to read data from a specified file in HDFS. After you read data from the file, use the close() method to close the input stream.
private void read() throws IOException {
    String strPath = DEST_PATH + File.separator + FILE_NAME;
    Path path = new Path(strPath);
    FSDataInputStream in = null;
    BufferedReader reader = null;
    StringBuilder strBuilder = new StringBuilder();
    try {
        in = fSystem.open(path);
        reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String sTempOneLine;
        // Read the file line by line. readLine() strips the line terminators.
        while ((sTempOneLine = reader.readLine()) != null) {
            strBuilder.append(sTempOneLine);
        }
        LOG.info("result is : " + strBuilder);
        LOG.info("File read successfully.");
    } finally {
        // Make sure that the streams are closed.
        IOUtils.closeStream(reader);
        IOUtils.closeStream(in);
    }
}
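Because readLine() strips line terminators, appending each line directly joins the lines without separators. The sketch below is an illustration under stated assumptions, not the original sample: it keeps the line breaks and decodes the bytes explicitly as UTF-8. It reads from an in-memory ByteArrayInputStream so that it runs without a cluster, and the class name ReadSketch and the methods readAll() and demo() are hypothetical. FSDataInputStream is a java.io.InputStream, so the stream returned by fSystem.open() could be passed to readAll().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration; not part of the HDFS API.
class ReadSketch {

    // Reads all lines as UTF-8 text and joins them with "\n".
    // try-with-resources closes the reader and the underlying stream.
    public static String readAll(InputStream source) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(source, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        }
    }

    // Runs the helper against in-memory data (a stand-in for an HDFS
    // input stream) and returns the decoded text.
    public static String demo() {
        byte[] data = "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8);
        try {
            return readAll(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```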
Delete a directory
private boolean deletePath(final Path dirPath) throws IOException {
    if (!fSystem.exists(dirPath)) {
        return false;
    }
    // The second argument enables recursive deletion of the directory contents.
    return fSystem.delete(dirPath, true);
}
Delete a file
private void deleteFile() throws IOException {
    Path beDeletedPath = new Path(DEST_PATH + File.separator + FILE_NAME);
    if (fSystem.delete(beDeletedPath, true)) {
        LOG.info("Deleted the file " + DEST_PATH + File.separator + FILE_NAME);
    } else {
        LOG.warn("Failed to delete the file " + DEST_PATH + File.separator + FILE_NAME);
    }
}
Move or rename a file
private void renameFile() throws IOException {
    Path srcFilePath = new Path(SRC_PATH + File.separator + SRC_FILE_NAME);
    Path destFilePath = new Path(DEST_PATH + File.separator + DEST_FILE_NAME);
    // rename() returns false if the file could not be renamed.
    if (!fSystem.rename(srcFilePath, destFilePath)) {
        LOG.warn("Failed to rename " + srcFilePath + " to " + destFilePath);
    }
}
Move or rename a directory
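private void renameDir() throws IOException {
    Path srcDirPath = new Path(SRC_PATH + File.separator + SRC_DIR_NAME);
    Path destDirPath = new Path(DEST_PATH + File.separator + DEST_DIR_NAME);
    // rename() returns false if the directory could not be renamed.
    if (!fSystem.rename(srcDirPath, destDirPath)) {
        LOG.warn("Failed to rename " + srcDirPath + " to " + destDirPath);
    }
}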
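The samples in this topic build full paths by concatenating a directory, a separator, and a name. HDFS paths use the forward slash as the separator regardless of the client operating system, so the join can also be written without File.separator. The sketch below is a plain-Java illustration only: the class HdfsPathSketch and its join() method are hypothetical names, and the paths in main() are made-up examples. In Hadoop code, the constructor Path(String parent, String child) serves a similar purpose.

```java
// Hypothetical helper class for illustration; not part of the HDFS API.
class HdfsPathSketch {

    // Joins parent and child with a single "/". Removes one trailing slash
    // from parent and one leading slash from child before joining.
    public static String join(String parent, String child) {
        String left = parent.endsWith("/")
                ? parent.substring(0, parent.length() - 1)
                : parent;
        String right = child.startsWith("/") ? child.substring(1) : child;
        return left + "/" + right;
    }

    public static void main(String[] args) {
        // Made-up example paths.
        System.out.println(join("/user/test", "data.txt"));
        System.out.println(join("/user/test/", "/data.txt"));
    }
}
```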