This topic describes how to get started with Alibaba Cloud Object Storage Service (OSS) or OSS-HDFS.
Prerequisites
- OSS is activated. For more information, see Activate OSS.
- An OSS bucket is created. For more information, see Create buckets.
- Your account is granted the permissions to access OSS.
  - By default, your account is granted the required permissions if you use an E-MapReduce (EMR) cluster in the new EMR console. For more information, see Assign roles.
  - By default, your account is granted the required permissions if you use an EMR cluster in the old EMR console. For more information, see Assign roles.
  - For information about how to grant the required permissions to a user who does not use an EMR cluster, see Grant access to OSS or OSS-HDFS.
- (Optional) OSS-HDFS is activated and the permissions to access OSS-HDFS are granted. We recommend that you activate OSS-HDFS.
- JindoSDK is deployed.
  - In EMR clusters, JindoSDK is automatically deployed. Note: To access OSS-HDFS, you must create a cluster of EMR V3.42.0 or a later minor version, or EMR V5.8.0 or a later minor version.
  - If you do not use an EMR cluster, you must manually deploy JindoSDK. For more information, see Deploy JindoSDK in an environment other than EMR. Note: To access OSS-HDFS, you must deploy JindoSDK 4.X or later.
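
Before you continue, you may want a quick sanity check of the client tools. The following is a minimal sketch, not an official verification procedure. It only checks whether the command names that appear in the examples of this topic (hadoop, jindo, and jindo-fuse) resolve on PATH. The helper name have_cmd is hypothetical, and a command being on PATH does not prove that JindoSDK is deployed or configured correctly.

```shell
# have_cmd NAME: succeeds if NAME resolves to a command on PATH.
# Hypothetical helper for illustration; it is not part of JindoSDK.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report which of the commands used in this topic are available.
for cmd in hadoop jindo jindo-fuse; do
  if have_cmd "$cmd"; then
    echo "found: $cmd"
  else
    echo "missing: $cmd"
  fi
done
```

If a command is reported as missing on a node outside EMR, the manual JindoSDK deployment described above is the likely next step.
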
Path description
You access OSS and OSS-HDFS by using the same methods. Only the endpoint in the access path differs. The following table describes sample access paths.
| Storage system | Sample root path | Description |
|---|---|---|
| OSS | oss://examplebucket.oss-cn-shanghai-internal.aliyuncs.com/ | An OSS bucket named examplebucket is created in the China (Shanghai) region. You can access the OSS bucket by using an internal endpoint. Note: If you do not assign public IP addresses to the nodes in an EMR cluster, you cannot access OSS by using a public endpoint. As a result, you cannot access OSS across regions. |
| OSS-HDFS | oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/ | An OSS-HDFS bucket named examplebucket is created in the China (Shanghai) region. Note: You can access OSS-HDFS only by using a private IP address. As a result, you cannot access OSS-HDFS across regions. |
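
The two sample paths differ only in how the endpoint is composed from the bucket name and the region ID. The following sketch rebuilds both sample root paths from those two values. The function names oss_root_path and oss_hdfs_root_path are hypothetical, and the patterns are inferred only from the samples in the table, so check the endpoint for your own region before you rely on them.

```shell
# oss_root_path BUCKET REGION_ID
# Prints an OSS root path that uses the internal endpoint pattern from the table.
oss_root_path() {
  printf 'oss://%s.oss-%s-internal.aliyuncs.com/\n' "$1" "$2"
}

# oss_hdfs_root_path BUCKET REGION_ID
# Prints an OSS-HDFS root path that uses the oss-dls endpoint pattern from the table.
oss_hdfs_root_path() {
  printf 'oss://%s.%s.oss-dls.aliyuncs.com/\n' "$1" "$2"
}

# Reproduce the two sample paths for examplebucket in China (Shanghai).
oss_root_path examplebucket cn-shanghai
oss_hdfs_root_path examplebucket cn-shanghai
```

Keeping the bucket name and region ID as separate arguments makes it easy to switch between the two storage systems without retyping the full path.
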
Access methods
You can access OSS or OSS-HDFS in the OSS console or by running Hadoop Shell commands, Jindo CLI commands, or Portable Operating System Interface (POSIX) commands. The following table describes the access methods.
| Access method | Example | Description |
|---|---|---|
| Hadoop Shell command | hadoop fs -ls oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/ | JindoOssFileSystem in JindoSDK is an implementation of Hadoop FileSystem. When you run a Hadoop Shell command, the endpoint in the path is used to access OSS or OSS-HDFS. For more information, see Use Hadoop Shell commands to access OSS or OSS-HDFS. |
| Jindo CLI command | jindo fs -ls oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/ | You can run Jindo CLI commands to access OSS or OSS-HDFS in the same way that you run Hadoop Shell commands. You can also run Jindo CLI commands to perform other operations, such as archiving, caching, and analyzing errors. For more information, see Use Jindo CLI commands to access OSS or OSS-HDFS. |
| POSIX command | mkdir -p /mnt/oss<br>jindo-fuse /mnt/oss -ouri=oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/<br>ls /mnt/oss | JindoFuse mounts an OSS or OSS-HDFS path to a local path by calling the FUSE API. You can then access OSS or OSS-HDFS in the same way that you access local files. For more information, see Use POSIX commands to access OSS or OSS-HDFS. |
| OSS console | N/A | You can access OSS or OSS-HDFS in the OSS console. |
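
Because the endpoint in the path decides whether a command reaches OSS or OSS-HDFS, it can help to check a path before you pass it to a command. The following sketch classifies a path by its endpoint suffix and then lists the root directory only if the hadoop command is available. The function name storage_kind is hypothetical, and the suffix matching only mirrors the sample paths in this topic. It does not reproduce how JindoSDK resolves endpoints.

```shell
# storage_kind PATH
# Prints "oss-hdfs" for an oss-dls endpoint, "oss" for any other aliyuncs.com
# endpoint, and "unknown" otherwise. The path must contain a "/" after the endpoint.
storage_kind() {
  case "$1" in
    oss://*.oss-dls.aliyuncs.com/*) echo "oss-hdfs" ;;
    oss://*.aliyuncs.com/*)         echo "oss" ;;
    *)                              echo "unknown" ;;
  esac
}

root="oss://examplebucket.cn-shanghai.oss-dls.aliyuncs.com/"
echo "$root -> $(storage_kind "$root")"

# List the root directory only when the hadoop command is on PATH.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -ls "$root"
fi
```

The oss-dls pattern is tested first because an OSS-HDFS endpoint also ends with aliyuncs.com and would otherwise be reported as OSS.
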
