
HDFS

Last Updated: May 19, 2022

Before you can migrate data that is stored in Hadoop Distributed File System (HDFS) to a Data Transport Mini device, you must create a compute node that has at least 16 cores and 64 GB of memory. Then, you must mount the HDFS file system and the Data Transport Mini device on the compute node. This topic describes how to mount an HDFS file system and a Data Transport Mini device on a compute node.

Prerequisites

  • A compute node that has at least 16 cores and 64 GB of memory is created.

  • The compute node is connected to the network port or optical port of the Data Transport Mini device by using a network cable or a switch. Make sure that the network cable, optical-fiber cable, and optical module are connected and the port status LEDs are on.

Step 1: Configure a service IP address

  1. Log on to the device console. For more information, see Installation.

  2. Choose Console Panel > Network & Virtual Switch > Interfaces.

  3. Find the network adapter that is in the Connected state and click the edit icon.

  4. On the IPv4 tab, select Use static IP address. Configure the Fixed IP address, Subnet Mask, and Default Gateway parameters based on your actual network configurations. Then, click Apply.

    Note

    You must specify this IP address when you mount the Data Transport Mini device on the compute node. Therefore, we recommend that you record this IP address.
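    After you apply the settings, you can verify from the compute node that the service IP address is reachable. In the following example, 172.16.0.1 is a placeholder for the service IP address that you configured:

      ping -c 4 172.16.0.1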


Step 2: Mount the HDFS file system on the compute node

In this example, FUSE is used to mount the HDFS file system on the compute node.

Notice

This method is provided for reference only. If you cannot mount the HDFS file system by using this method, we recommend that you use another method that suits your environment.

  1. Deploy CDH and download the hadoop-hdfs-fuse package.

  2. Log on to the Linux-based compute node.

  3. Run one of the following system-specific commands to install hadoop-hdfs-fuse:

    • Red Hat-compatible systems

      sudo yum install hadoop-hdfs-fuse

    • Ubuntu systems

      sudo apt-get install hadoop-hdfs-fuse

    • SUSE Linux Enterprise (SLES) systems

      sudo zypper install hadoop-hdfs-fuse
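
    After the installation is complete, you can run the following command to confirm that the hadoop-fuse-dfs binary is available on the compute node:

      which hadoop-fuse-dfs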

  4. Run one of the following commands to configure and test a mount directory (a concrete example follows this list):

    • Non-HA installation

      mkdir -p <mount_point>

      hadoop-fuse-dfs dfs://<name_node_hostname>:<namenode_port> <mount_point>

      The namenode_port parameter specifies the RPC port of the NameNode.

    • HA installation

      mkdir -p <mount_point>

      hadoop-fuse-dfs dfs://<nameservice_id> <mount_point>

      The nameservice_id parameter specifies the value of fs.defaultFS.
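
    For example, assuming a NameNode host named namenode01 that listens on RPC port 8020 (both values are placeholders for your environment), the commands for a non-HA installation might look like the following:

      mkdir -p /mnt/hdfs
      hadoop-fuse-dfs dfs://namenode01:8020 /mnt/hdfs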

  5. Press Ctrl+C to terminate the fuse-dfs program.

  6. Run the following command to unmount the file system from the mount directory:

    umount <mount_point>

  7. Open the /etc/fstab file and add a line in the following syntax to the end of the file:

    hadoop-fuse-dfs#dfs://<name_node_hostname>:<namenode_port> <mount_point> fuse allow_other,usetrash,rw 2 0

    For example, you can add the following line:

      hadoop-fuse-dfs#dfs://localhost:8020 /mnt/hdfs fuse allow_other,usetrash,rw 2 0

    To mount the HDFS file system in HA mode, you must use the HDFS nameservice. In this case, specify the dfs.nameservices value that is configured in the hdfs-site.xml file instead of a NameNode URI.
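
    For example, assuming that the nameservice ID is nameservice1 (a placeholder for the dfs.nameservices value in your environment), the /etc/fstab entry for an HA installation might look like the following:

      hadoop-fuse-dfs#dfs://nameservice1 /mnt/hdfs fuse allow_other,usetrash,rw 2 0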

  8. Run the following command to verify that the mount directory is configured and works as expected:

    mount <mount_point>

    Make sure that your logon account has the permissions to run the ls command on the mount directory. You can then view the files in the mount directory in the same way that you view files on a normal system disk.
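
    For example, if the mount point is /mnt/hdfs, the following command lists the files in the root directory of the HDFS file system:

      ls /mnt/hdfs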

Step 3: Mount the Data Transport Mini device on the compute node

  1. Log on to the Linux-based compute node.

  2. Run the following command to create a local directory on which you want to mount the Data Transport Mini device:

    mkdir /mnt/cube

  3. Run the following command to view the shared directory of the Data Transport Mini device:

    showmount -e 172.16.0.1

    Replace 172.16.0.1 with the service IP address that you configured in Step 1.

    If the filedata directory is displayed, the shared directory of the Data Transport Mini device can be mounted.
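
    For example, the output might be similar to the following. The client list that is displayed depends on the configuration of the device:

      Export list for 172.16.0.1:
      /filedata *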

  4. Run the following command to mount the Data Transport Mini device on the compute node:

    mount 172.16.0.1:/filedata /mnt/cube
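
    If the mount command cannot determine the file system type automatically, you can explicitly specify the NFS type:

      mount -t nfs 172.16.0.1:/filedata /mnt/cube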

  5. Run the following command to view the mount result:

    df -h

    If the shared directory of the Data Transport Mini device is displayed in the output, the mount is successful.
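
    For example, the output might contain an entry similar to the following. The capacity values are illustrative only:

      Filesystem            Size  Used  Avail  Use%  Mounted on
      172.16.0.1:/filedata   40T  1.1G   40T     1%  /mnt/cube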

What to do next

After you mount the HDFS file system and the Data Transport Mini device on the compute node, you must use the migration method for Data Transport II or III to complete a migration task. For more information, see Create and run a migration task.