
Simple operations on OSS files

Last Updated: Aug 06, 2018

Dependency conflict when using OSS SDK

If you cannot operate on OSS files directly through the OSS SDK in Spark or Hadoop jobs, the cause is a conflict between the http-client 4.4.x version that the OSS SDK depends on and the http-client version in your Spark or Hadoop runtime environment. If such operations are required, you must resolve the dependency conflict first. With E-MapReduce, Spark and Hadoop are already seamlessly compatible with OSS, and you can operate on OSS files just as you would on HDFS.
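One common way to resolve this kind of conflict in your own job is to relocate the SDK's httpclient classes at build time. The sketch below assumes a Maven build and uses the maven-shade-plugin; the `shaded.` relocation prefix is illustrative and can be any package name that does not collide with your runtime.

```xml
<!-- pom.xml fragment: relocate org.apache.http so the OSS SDK's bundled
     http-client no longer clashes with the Spark/Hadoop runtime version. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>org.apache.http</pattern>
            <shadedPattern>shaded.org.apache.http</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After shading, your job's classes call the relocated copy while the cluster's own http-client remains untouched.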

The Internet endpoints of OSS are needed only for local testing, so that you can access OSS data from your local machine.

All endpoints of OSS can be found here.
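For local testing you typically also point the FileSystem at an Internet endpoint and supply credentials. The fragment below is a hypothetical `core-site.xml` sketch: the property names follow common Aliyun OSS connector conventions but should be verified against the connector version you use, and the endpoint and key values are placeholders.

```xml
<!-- Hypothetical core-site.xml fragment for local testing.
     Property names are assumptions; verify against your OSS connector.
     Endpoint and credential values are placeholders. -->
<property>
  <name>fs.oss.endpoint</name>
  <value>oss-cn-hangzhou.aliyuncs.com</value>
</property>
<property>
  <name>fs.oss.accessKeyId</name>
  <value>YOUR_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.oss.accessKeySecret</name>
  <value>YOUR_ACCESS_KEY_SECRET</value>
</property>
```

On an E-MapReduce cluster itself these settings are not needed, since the cluster is already configured to access OSS.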

Follow these steps to query files under OSS directories:

  [Scala]

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{Path, FileSystem}

    val dir = "oss://bucket/dir"
    val path = new Path(dir)
    val conf = new Configuration()
    // Use the E-MapReduce native OSS FileSystem implementation
    conf.set("fs.oss.impl", "com.aliyun.fs.oss.nat.NativeOssFileSystem")
    val fs = FileSystem.get(path.toUri, conf)
    // List the files under the directory
    val fileList = fs.listStatus(path)
    ...

  [Java]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;

    String dir = "oss://bucket/dir";
    Path path = new Path(dir);
    Configuration conf = new Configuration();
    // Use the E-MapReduce native OSS FileSystem implementation
    conf.set("fs.oss.impl", "com.aliyun.fs.oss.nat.NativeOssFileSystem");
    FileSystem fs = FileSystem.get(path.toUri(), conf);
    // List the files under the directory
    FileStatus[] fileList = fs.listStatus(path);
    ...