Object Storage Service: Connect EMR clusters to OSS-HDFS

Last Updated: Dec 21, 2023

OSS-HDFS is integrated into specific versions of Alibaba Cloud E-MapReduce (EMR) clusters. This topic describes how to connect EMR clusters to OSS-HDFS and perform common operations.

Prerequisites

Procedure

  1. Log on to the EMR cluster.

    1. Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.

    2. Click the EMR cluster that you created.

    3. Click the Nodes tab, and then click + on the left side of the node group.

    4. Click the ID of the ECS instance. On the Instances page, click Connect next to the instance ID.

    For more information about how to log on to a cluster in Windows or Linux by using an SSH key pair or SSH password, see Log on to a cluster.

  2. Run HDFS Shell commands to perform the following common operations related to OSS-HDFS:
    • Upload objects

      Run the following command to upload a file named examplefile.txt from the current local directory to the root directory of a bucket named examplebucket:

      hdfs dfs -put examplefile.txt oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/
    • Create directories

      Run the following command to create a directory named dir/ in a bucket named examplebucket:

      hdfs dfs -mkdir oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/dir/
    • Query objects or directories

      Run the following command to query the objects or directories in a bucket named examplebucket:

      hdfs dfs -ls oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/
    • Query the sizes of objects or directories

      Run the following command to query the sizes of all objects or directories in a bucket named examplebucket:

      hdfs dfs -du oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/
    • Query the object content

      Run the following command to query the content of an object named localfile.txt in a bucket named examplebucket:

      hdfs dfs -cat oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/localfile.txt
      Important: The content of the queried object is displayed on the screen in plain text. If the content is encoded, use the HDFS API for Java to read and decode the content.
    • Copy objects or directories

      Run the following command to copy a directory named subdir1 from the root directory of a bucket named examplebucket to a directory named subdir2 in the same bucket. The source directory subdir1, the objects in it, and the structure and content of its subdirectories remain unchanged.

      hdfs dfs -cp oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/subdir1 oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/subdir2/subdir1
    • Move objects or directories

      Run the following command to move a directory named srcdir in the root directory of a bucket named examplebucket, together with its objects and subdirectories, to a directory named destdir in the root directory of the same bucket:

      hdfs dfs -mv oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/srcdir oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/destdir
    • Download objects

      Run the following command to download an object named exampleobject.txt from a bucket named examplebucket to the local /tmp directory of the node that you are logged on to:

      hdfs dfs -get oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/exampleobject.txt /tmp/
    • Delete directories or objects

      Run the following command to delete a directory named destfolder/ and all objects in the directory from a bucket named examplebucket. The -r option is required to delete a directory recursively:

      hdfs dfs -rm -r oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/destfolder/
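
The operations above can be combined into a short script. The following is a minimal sketch, not an official tool. The function names oss_hdfs_uri and oss_hdfs_smoke_test and the smoke-test/ directory are hypothetical. The <bucket>.<region>.oss-dls.aliyuncs.com endpoint pattern is inferred from the examples in this topic, so confirm the endpoint of your own bucket before you rely on it. Defining the functions does not run any HDFS Shell command.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build an OSS-HDFS URI from a bucket, a region, and a path.
# Assumption: the endpoint follows <bucket>.<region>.oss-dls.aliyuncs.com,
# as in the examples in this topic.
oss_hdfs_uri() {
  local bucket="$1"
  local region="$2"
  local path="${3:-}"
  # Drop one leading slash so the URI has exactly one slash after the host.
  path="${path#/}"
  printf 'oss://%s.%s.oss-dls.aliyuncs.com/%s\n' "$bucket" "$region" "$path"
}

# Hypothetical round trip: create a directory, upload a file, list it,
# read it, download it, and delete the directory again.
# The && chain stops at the first command that fails.
oss_hdfs_smoke_test() {
  local bucket="$1"
  local region="$2"
  local local_file="$3"
  local name
  local remote_dir
  name="$(basename "$local_file")"
  remote_dir="$(oss_hdfs_uri "$bucket" "$region" "smoke-test/")"

  hdfs dfs -mkdir -p "$remote_dir" &&
  hdfs dfs -put "$local_file" "$remote_dir" &&
  hdfs dfs -ls "$remote_dir" &&
  hdfs dfs -cat "${remote_dir}${name}" > /dev/null &&
  hdfs dfs -get "${remote_dir}${name}" "/tmp/${name}.copy" &&
  hdfs dfs -rm -r "$remote_dir"
}
```

For example, on a cluster node you might run oss_hdfs_smoke_test examplebucket cn-shanghai examplefile.txt. The download step fails if /tmp/examplefile.txt.copy already exists, because hdfs dfs -get does not overwrite local files by default.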