This topic describes how to access Object Storage Service (OSS) data from Spark and Hadoop in E-MapReduce (EMR).
Background information
In EMR, Spark and Hadoop work with OSS through the standard Hadoop FileSystem API, so
you can manage OSS files in the same way as you manage files in HDFS. You can use one
of the following methods to access OSS data:
- (Recommended) Access OSS data without an AccessKey pair
- Explicitly enter an AccessKey pair
Access OSS data without an AccessKey pair
[Scala]
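import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// OSS directory to list; "bucket" and "dir" are placeholders.
val dir = "oss://bucket/dir"
val path = new Path(dir)
val conf = new Configuration()
// Map the oss:// scheme to its file system class. No AccessKey pair is set.
conf.set("fs.oss.impl", "com.aliyun.emr.fs.oss.JindoOssFileSystem")
// Resolve the file system from the path's URI, then list the directory.
val fs = FileSystem.get(path.toUri, conf)
val fileList = fs.listStatus(path)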
...
[Java]
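import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// OSS directory to list; "bucket" and "dir" are placeholders.
String dir = "oss://bucket/dir";
Path path = new Path(dir);
Configuration conf = new Configuration();
// Map the oss:// scheme to its file system class. No AccessKey pair is set.
conf.set("fs.oss.impl", "com.aliyun.emr.fs.oss.JindoOssFileSystem");
// Resolve the file system from the path's URI, then list the directory.
FileSystem fs = FileSystem.get(path.toUri(), conf);
FileStatus[] fileList = fs.listStatus(path);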
...
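
Both samples pass the URI of the path to FileSystem.get, so it can help to see how a path such as oss://bucket/dir breaks down. The following is a minimal sketch that uses only java.net.URI from the JDK and assumes the simple oss://bucket/dir form shown above, where the authority is the bucket name and the rest is the object key or directory prefix. The class name OssPathDemo and its methods are hypothetical helpers for illustration and are not part of EMR or Hadoop.

```java
import java.net.URI;

// Hypothetical helper, for illustration only: splits an oss:// path
// into its bucket and key parts using the JDK's URI parser.
public class OssPathDemo {

    // Returns the bucket name, which is the authority part of the URI.
    public static String bucketOf(String ossPath) {
        URI uri = URI.create(ossPath);
        if (!"oss".equals(uri.getScheme())) {
            throw new IllegalArgumentException("Not an OSS path: " + ossPath);
        }
        return uri.getAuthority();
    }

    // Returns the object key or directory prefix without the leading slash.
    public static String keyOf(String ossPath) {
        String path = URI.create(ossPath).getPath();
        return path.startsWith("/") ? path.substring(1) : path;
    }

    public static void main(String[] args) {
        String dir = "oss://bucket/dir";
        System.out.println(bucketOf(dir)); // prints "bucket"
        System.out.println(keyOf(dir));    // prints "dir"
    }
}
```

The scheme part (oss) is what the fs.oss.impl setting in the samples above is keyed on: Hadoop looks up the file system class from the fs.<scheme>.impl property.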