Lindorm: Use open source HDFS clients to connect to and use LindormDFS

Last Updated: Sep 26, 2023

This topic describes how to use open source HDFS clients to access LindormDFS.

Prerequisites

  • Java Development Kit (JDK) 1.7 or later is installed.

  • The IP address of your client is added to the whitelist of the Lindorm instance. For more information, see Configure whitelists.
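
The JDK prerequisite can be checked from a shell with a version comparison along the following lines. This is a minimal sketch: the helper name java_version_ok is hypothetical, and the usage line assumes that java -version prints the version string in double quotes, as common JDK builds do.

```shell
# Hypothetical helper: succeeds if a Java version string is 1.7 or later.
# Handles the legacy scheme ("1.8.0_292") and the newer one ("11.0.2").
java_version_ok() {
  version="$1"
  major="${version%%.*}"         # text before the first dot
  major="${major%%[!0-9]*}"      # drop suffixes such as "-ea"
  [ -n "$major" ] || return 1
  if [ "$major" -gt 1 ]; then
    return 0                     # Java 9 and later report 9, 10, 11, ...
  fi
  [ "$major" -eq 1 ] || return 1
  rest="${version#*.}"           # text after the first dot
  minor="${rest%%[!0-9]*}"       # leading digits only
  [ -n "$minor" ] && [ "$minor" -ge 7 ]
}

# Usage (assumes java is on PATH):
#   java_version_ok "$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')" && echo "JDK OK"
```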

Usage notes

If your client is deployed on an ECS instance, the ECS instance and the Lindorm instance must meet the following requirements to ensure network connectivity:

  • The ECS instance and the Lindorm instance are deployed in the same region. We recommend that you also deploy the two instances in the same zone to reduce network latency.

  • The ECS instance and the Lindorm instance belong to the same virtual private cloud (VPC).
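
Connectivity can be checked from the ECS instance before Hadoop is configured. The following is a rough sketch, not an official tool. The function names are hypothetical, and ld-example is a placeholder instance ID. The sketch assumes that the NameNode endpoints follow the ${Instance ID}-master1-001.lindorm.rds.aliyuncs.com:8020 pattern used in the sample hdfs-site.xml file in this topic, and that nc (netcat) with the -z and -w options is installed.

```shell
# Hypothetical helper: print the two NameNode endpoints for an instance ID,
# following the host name pattern of the sample hdfs-site.xml in this topic.
lindorm_namenodes() {
  instance_id="$1"
  for n in 1 2; do
    echo "${instance_id}-master${n}-001.lindorm.rds.aliyuncs.com 8020"
  done
}

# Hypothetical helper: try a TCP connection to each endpoint.
# Assumes nc supports -z (connect without sending data) and -w (timeout in seconds).
check_lindorm_connectivity() {
  lindorm_namenodes "$1" | while read -r host port; do
    if nc -z -w 3 "$host" "$port" </dev/null; then
      echo "OK   $host:$port"
    else
      echo "FAIL $host:$port"
    fi
  done
}

# Usage: check_lindorm_connectivity ld-example
```

A FAIL line usually points to a whitelist, region, or VPC mismatch rather than to the Hadoop configuration.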

Download the client

Download the Apache Hadoop SDK V2.7.3 package (hadoop-2.7.3.tar.gz) from the official Apache Hadoop website.

Configure Apache Hadoop

  1. Run the following command to decompress the downloaded SDK package:

    tar -zxvf hadoop-2.7.3.tar.gz
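  2. Run the following command to set the HADOOP_HOME environment variable. Replace ${Hadoop installation directory} with the directory in which you decompressed the package: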

    export HADOOP_HOME=/${Hadoop installation directory}/hadoop-2.7.3
  3. Run the following command to go to the hadoop directory:

    cd $HADOOP_HOME
  4. Add the JAVA_HOME variable to the hadoop-env.sh file in the etc/hadoop/ directory, as shown in the following lines. In this example, Java is installed in the /opt/install/java directory.

    # set to the root of your Java installation
    export JAVA_HOME=/opt/install/java
  5. Modify the etc/hadoop/hdfs-site.xml file as shown in the following sample. Replace ${Instance ID} in the file with the ID of your Lindorm instance.

    <configuration>
        <property>
            <name>dfs.nameservices</name>
            <value>${Instance ID}</value>
        </property>
        <property>
            <name>dfs.client.failover.proxy.provider.${Instance ID}</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <property>
            <name>dfs.ha.automatic-failover.enabled</name>
            <value>true</value>
        </property>
        <property>
            <name>dfs.ha.namenodes.${Instance ID}</name>
            <value>nn1,nn2</value>
        </property>
        <property>
            <name>dfs.namenode.rpc-address.${Instance ID}.nn1</name>
            <value>${Instance ID}-master1-001.lindorm.rds.aliyuncs.com:8020</value>
        </property>
        <property>
            <name>dfs.namenode.rpc-address.${Instance ID}.nn2</name>
            <value>${Instance ID}-master2-001.lindorm.rds.aliyuncs.com:8020</value>
        </property>
    </configuration>
Note
  • In the Lindorm console, click File Engine and then click Generate Configuration to have the system generate the configuration file automatically. For more information, see Activate the LindormDFS service.

  • The preceding example shows the configuration for a single Lindorm instance, where ${Instance ID} is the ID of that instance. To access multiple Lindorm instances, add one copy of all <property> elements in the example to the <configuration> element for each instance, and replace the instance ID in each copy with the ID of the corresponding instance.
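
As an alternative to editing the sample by hand, the substitution of ${Instance ID} can be scripted. The sketch below assumes the same property set as the sample file in this topic. render_hdfs_site is a hypothetical helper name and ld-example is a placeholder instance ID. The configuration file generated in the Lindorm console remains the authoritative source.

```shell
# Hypothetical helper: print an hdfs-site.xml for one Lindorm instance,
# using the same properties as the sample file in this topic.
render_hdfs_site() {
  id="$1"
  cat <<EOF
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>${id}</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.${id}</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.${id}</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.${id}.nn1</name>
        <value>${id}-master1-001.lindorm.rds.aliyuncs.com:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.${id}.nn2</name>
        <value>${id}-master2-001.lindorm.rds.aliyuncs.com:8020</value>
    </property>
</configuration>
EOF
}

# Usage: render_hdfs_site ld-example > "$HADOOP_HOME/etc/hadoop/hdfs-site.xml"
```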

Examples of common operations

  • Upload a local file.

    • Create a directory.

    $HADOOP_HOME/bin/hadoop fs -mkdir hdfs://${Instance ID}/test
    • Prepare a file and upload the file to the created directory in LindormDFS.

    echo "test" > test.log
    $HADOOP_HOME/bin/hadoop fs -put test.log hdfs://${Instance ID}/test
  • View the uploaded file.

    $HADOOP_HOME/bin/hadoop fs -ls hdfs://${Instance ID}/test
  • Download the file to your local computer.

    $HADOOP_HOME/bin/hadoop fs -get hdfs://${Instance ID}/test/test.log
    Note

    You must replace ${Instance ID} in the preceding commands with the ID of your Lindorm instance.
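
The operations above can be chained into one round-trip check. This is a minimal sketch under stated assumptions: lindorm_smoke_test and the HADOOP_BIN variable are hypothetical names, only the hadoop fs subcommands shown above (-mkdir, -put, -ls, -get) are used, and the function expects that the /test directory does not exist yet.

```shell
# Hypothetical helper: create a directory, upload a file, list it, download it,
# and compare the downloaded copy with the original.
# HADOOP_BIN is an assumed override; it defaults to the launcher under HADOOP_HOME.
lindorm_smoke_test() {
  instance_id="$1"
  hadoop_bin="${HADOOP_BIN:-$HADOOP_HOME/bin/hadoop}"
  remote_dir="hdfs://${instance_id}/test"
  work_dir="$(mktemp -d)" || return 1

  echo "test" > "$work_dir/test.log"
  "$hadoop_bin" fs -mkdir "$remote_dir" &&
    "$hadoop_bin" fs -put "$work_dir/test.log" "$remote_dir" &&
    "$hadoop_bin" fs -ls "$remote_dir" &&
    "$hadoop_bin" fs -get "$remote_dir/test.log" "$work_dir/downloaded.log" &&
    cmp -s "$work_dir/test.log" "$work_dir/downloaded.log"
  status=$?

  rm -rf "$work_dir"             # clean up local scratch files only
  return "$status"
}

# Usage: lindorm_smoke_test ld-example && echo "LindormDFS round trip OK"
```

The function stops at the first failing step and returns its exit status, so a non-zero result shows that one of the four operations or the final comparison failed. The remote test directory is left in place.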