
Spark + OSS

Last Updated: Aug 06, 2018

Spark access to OSS

Currently, E-MapReduce supports MetaService, which allows you to access OSS without an AccessKey. You can also specify the AccessKey and endpoint explicitly; in that case, use the internal domain name for the OSS endpoint. For more details, refer to OSS Endpoint.
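As a minimal sketch of the explicit AccessKey and endpoint route, the settings could be collected as Spark properties and applied to the SparkConf. The property names below (fs.oss.accessKeyId, fs.oss.accessKeySecret, fs.oss.endpoint) follow the Apache Hadoop hadoop-aliyun connector and are an assumption here, so confirm them against your E-MapReduce version. The helper name OssSettings and the endpoint value in the usage comment are hypothetical placeholders.

```scala
// Hypothetical helper: builds the Spark properties for explicit OSS credentials.
object OssSettings {
  // Assumed property names (hadoop-aliyun style). The "spark.hadoop." prefix
  // tells Spark to copy each entry into the underlying Hadoop configuration.
  def explicit(accessKeyId: String,
               accessKeySecret: String,
               endpoint: String): Map[String, String] =
    Map(
      "spark.hadoop.fs.oss.accessKeyId"     -> accessKeyId,
      "spark.hadoop.fs.oss.accessKeySecret" -> accessKeySecret,
      "spark.hadoop.fs.oss.endpoint"        -> endpoint
    )
}

// Usage (requires Spark on the classpath; the endpoint is a placeholder for an
// internal OSS domain name in your region):
//   val conf = new SparkConf()
//     .setAppName("Test OSS")
//     .setAll(OssSettings.explicit(id, secret, "oss-cn-hangzhou-internal.aliyuncs.com"))
```

When MetaService is available, none of these properties should be needed, and keeping the AccessKey out of the job configuration is usually the safer choice.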

This example demonstrates how to read data from OSS in Spark, and write the processed data back to OSS.

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf().setAppName("Test OSS")
  val sc = new SparkContext(conf)

  // Read the input from OSS and count its lines.
  val pathIn = "oss://bucket/path/to/read"
  val inputData = sc.textFile(pathIn)
  val cnt = inputData.count
  println(s"count: $cnt")

  // Transform each line and write the result back to OSS.
  val outputPath = "oss://bucket/path/to/write"
  val outputData = inputData.map(e => s"$e has been processed.")
  outputData.saveAsTextFile(outputPath)
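The per-line transformation in the example does not depend on Spark, so one option is to pull it into a plain function and check it locally before running the job against OSS. This is only a sketch: the names Processing, markProcessed, and ProcessingCheck are hypothetical and do not come from the E-MapReduce samples.

```scala
// Hypothetical helper: the same per-line transformation as in the example,
// written as a pure function so it can be checked without a cluster.
object Processing {
  def markProcessed(line: String): String = s"$line has been processed."
}

// Local check on an in-memory collection. In the Spark job the function would
// be applied as: inputData.map(Processing.markProcessed)
object ProcessingCheck extends App {
  val sample = Seq("a", "b")
  sample.map(Processing.markProcessed).foreach(println)
}
```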

Appendix

For the complete sample code, refer to:
