
Use open source HDFS clients to connect to and use LindormDFS

Last Updated: Jul 09, 2021

This topic describes how to use open source HDFS clients to access LindormDFS provided by ApsaraDB for Lindorm (Lindorm).

Prepare the runtime environment

  1. Run the java -version command to check the Java Development Kit (JDK) version. JDK 1.7 or later is required.

  2. Configure the environment variables, as shown in the following sample code. In the following sample code, Java is installed in the /opt/install/java directory.

export JAVA_HOME=/opt/install/java
export PATH=$JAVA_HOME/bin:$PATH
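The version requirement in step 1 can be checked in a script. The following is a minimal sketch; the jdk_ok helper is illustrative and not part of Hadoop or Lindorm, and it assumes version strings such as 1.8.0_292 (legacy scheme) or 11.0.2 (JDK 9 and later):

```shell
# Hedged sketch: check whether a JDK version string satisfies the
# "JDK 1.7 or later" requirement. jdk_ok is an illustrative helper.
jdk_ok() {
  major=${1%%.*}                       # text before the first dot
  rest=${1#*.}                         # text after the first dot
  minor=${rest%%.*}; minor=${minor%%_*}  # second component, without _NN suffix
  if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 7 ]; }; then
    echo yes
  else
    echo no
  fi
}

jdk_ok "1.8.0_292"   # yes
jdk_ok "1.6.0_45"    # no
jdk_ok "11.0.2"      # yes
```

You can feed it the version component parsed from the java -version output on your machine.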

Download an HDFS client

You can download the Apache Hadoop 2.7.3 release package (hadoop-2.7.3.tar.gz) from the Apache website.

Configure Hadoop

  1. Download the hadoop-2.7.3 release package.

  2. Run the tar -zxvf hadoop-2.7.3.tar.gz command to extract the downloaded SDK package.

  3. Run the export HADOOP_HOME=/installDir/hadoop-2.7.3 command to configure the environment variable. Replace /installDir with the directory to which you extracted the package.

  4. Run the cd $HADOOP_HOME command to go to the Hadoop directory.

  5. Modify the etc/hadoop/hadoop-env.sh file and set JAVA_HOME to the value specified in the Prepare the runtime environment section. In the following sample code, Java is installed in the /opt/install/java directory.

    # set to the root of your Java installation
    export JAVA_HOME=/opt/install/java
  6. Modify the etc/hadoop/core-site.xml file and add the content shown in the following sample code. Replace ${Instance ID} with the actual ID of your Lindorm instance.

    <configuration>
      <property>
         <name>fs.defaultFS</name>
         <value>hdfs://${Instance ID}</value>
      </property>
    </configuration>
  7. Modify the etc/hadoop/hdfs-site.xml file and add the content shown in the following sample code. Replace ${Instance ID} with the actual ID of your Lindorm instance.

    <configuration>
      <property>
         <name>dfs.nameservices</name>
         <value>${Instance ID}</value>
      </property>
      <property>
         <name>dfs.client.failover.proxy.provider.${Instance ID}</name>
         <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
      <property>
         <name>dfs.ha.automatic-failover.enabled</name>
         <value>true</value>
      </property>
      <property>
         <name>dfs.ha.namenodes.${Instance ID}</name>
         <value>nn1,nn2</value>
      </property>
      <property>
         <name>dfs.namenode.rpc-address.${Instance ID}.nn1</name>
         <value>${Instance ID}-master1-001.lindorm.rds.aliyuncs.com:8020</value>
      </property>
      <property>
         <name>dfs.namenode.rpc-address.${Instance ID}.nn2</name>
         <value>${Instance ID}-master2-001.lindorm.rds.aliyuncs.com:8020</value>
      </property>
    </configuration>
Note

In the Lindorm console, you can click File Engine and then click Generate Configuration to have the system generate the configuration file automatically. For more information, see Activate the LindormDFS service.
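If you edit the files by hand, the literal ${Instance ID} placeholder in the sample configurations can also be filled in with a small script. The following is a minimal sketch; the instance ID ld-example123 and the substitute_instance_id helper are illustrative values, not part of Lindorm:

```shell
# Hedged sketch: replace every occurrence of the literal placeholder
# ${Instance ID} in a configuration file with a real instance ID.
# ld-example123 is a made-up example; use your actual Lindorm instance ID.
INSTANCE_ID="ld-example123"

substitute_instance_id() {
  # Print the file with the placeholder replaced; redirect the output to a
  # new file, or add -i to edit in place with GNU sed.
  sed "s/\${Instance ID}/$INSTANCE_ID/g" "$1"
}
```

For example, substitute_instance_id etc/hadoop/core-site.xml > core-site.xml.resolved writes a resolved copy without touching the original file.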

Examples of common operations

  1. Upload a local file.

    • Create a directory.

    $HADOOP_HOME/bin/hadoop fs -mkdir /test
    • Prepare a file and upload it to LindormDFS.

    echo "test" > test.log
    $HADOOP_HOME/bin/hadoop fs -put test.log /test
  2. View the uploaded file.

    $HADOOP_HOME/bin/hadoop fs -ls /test
  3. Download the file to your local device.

    $HADOOP_HOME/bin/hadoop fs -get /test/test.log
Notice

The nodes that access LindormDFS must be added to the whitelist of Lindorm. For more information, see Configure a whitelist.