OSS-HDFS is built into specific versions of Alibaba Cloud E-MapReduce (EMR) clusters, letting you read from and write to OSS-HDFS using standard Hadoop Distributed File System (HDFS) Shell commands.
Note: If you use a self-managed Hadoop cluster, follow the non-EMR connection method instead. For more information, see Connect non-EMR clusters to OSS-HDFS.
Prerequisites
Before you begin, ensure that you have:
An OSS bucket with OSS-HDFS enabled, and a Resource Access Management (RAM) role granted access to OSS-HDFS. For more information, see Enable OSS-HDFS and grant access permissions.
The required permissions to connect EMR clusters to OSS-HDFS. By default, an Alibaba Cloud account has these permissions. If you use a RAM user, grant the RAM user the required permissions first. For more information, see Grant a RAM user permissions to connect EMR clusters to OSS-HDFS.
Connect an EMR cluster to OSS-HDFS
Step 1: Create an EMR cluster
Log on to the E-MapReduce console. In the left-side navigation pane, click EMR on ECS.
Create an EMR cluster with the settings listed below. Use the default values for all other parameters. For more information, see Create a cluster.
Setting | Required value
Product Version | EMR-3.46.2 or later, or EMR-5.12.2 or later
Root Storage Directory of Cluster | The OSS-HDFS-enabled bucket
Step 2: Log on to the EMR cluster
Click the cluster you created.
Click the Nodes tab, then click the icon to the left of the node group to expand it.
Click the Elastic Compute Service (ECS) instance ID. On the Instances page, click Connect next to the instance ID to log on via Workbench.
To log on using an SSH key pair or SSH password on Windows or Linux, see Log on to a cluster.
Step 3: Run HDFS Shell commands
Use HDFS Shell commands to read from and write to OSS-HDFS. The OSS-HDFS endpoint format is:
oss://<bucket-name>.<region-id>.oss-dls.aliyuncs.com/
Upload a local file
Run the following command to upload examplefile.txt from the local root directory to examplebucket:
hdfs dfs -put examplefile.txt oss://examplebucket.cn-hangzhou.oss-dls.aliyuncs.com/
Download an object
Run the following command to download exampleobject.txt from examplebucket to the local /tmp/ directory:
hdfs dfs -get oss://examplebucket.cn-hangzhou.oss-dls.aliyuncs.com/exampleobject.txt /tmp/
For a full list of supported HDFS Shell commands, see Use Hadoop Shell commands to access OSS-HDFS.
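Because every command in Step 3 repeats the same oss://<bucket-name>.<region-id>.oss-dls.aliyuncs.com/ prefix, it may help to build that prefix in one place. The following is a minimal sketch, not part of the product: the function names oss_hdfs_uri and oss_hdfs_round_trip are hypothetical helpers introduced here for illustration, and the second one assumes the hdfs client is on the PATH, as it would be on an EMR cluster node.

```shell
# Hypothetical helper (not provided by EMR or OSS-HDFS): print the
# OSS-HDFS URI for a bucket, a region ID, and an optional object path.
#   $1 = bucket name, $2 = region ID, $3 = object path (optional)
oss_hdfs_uri() {
  # Normalize the arguments so that $3 is always set, even if empty.
  set -- "$1" "$2" "${3:-}"
  # Drop one leading slash from the path so exactly one slash follows the host.
  printf 'oss://%s.%s.oss-dls.aliyuncs.com/%s\n' "$1" "$2" "${3#/}"
}

# Hypothetical helper: upload a local file to the bucket root, list the
# root, then download the object to a local directory (default /tmp/).
#   $1 = bucket name, $2 = region ID, $3 = local file, $4 = download directory
# Assumes the hdfs client is installed; each step runs only if the
# previous one succeeded.
oss_hdfs_round_trip() {
  set -- "$1" "$2" "$3" "${4:-/tmp/}"
  hdfs dfs -put "$3" "$(oss_hdfs_uri "$1" "$2")" &&
    hdfs dfs -ls "$(oss_hdfs_uri "$1" "$2")" &&
    hdfs dfs -get "$(oss_hdfs_uri "$1" "$2" "$(basename "$3")")" "$4"
}
```

For example, oss_hdfs_uri examplebucket cn-hangzhou exampleobject.txt prints the same URI used in the download command above, and oss_hdfs_round_trip examplebucket cn-hangzhou examplefile.txt would run the upload, a listing, and the download in sequence on a cluster node.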