This topic describes how Spark reads data from and writes data to Object Storage Service (OSS).

Background

Currently, E-MapReduce provides the following features:

  • Supports MetaService.
  • Allows you to access OSS without an AccessKey.
  • Allows you to use an AccessKey and an endpoint to access OSS.
    Note: If you use an endpoint to access OSS, you must use an internal network endpoint. For more information, see Regions and endpoints.

Allow Spark to access OSS

The following sample code shows how Spark reads data from OSS and writes the processed data back to OSS without an AccessKey:
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("Test OSS")
    val sc = new SparkContext(conf)
    // Read the input from OSS and count its lines.
    val pathIn = "oss://bucket/path/to/read"
    val inputData = sc.textFile(pathIn)
    val cnt = inputData.count
    println(s"count: $cnt")
    // Process each line and write the result back to OSS.
    val outputPath = "oss://bucket/path/to/write"
    val outputData = inputData.map(e => s"$e has been processed.")
    outputData.saveAsTextFile(outputPath)
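
For the other option listed in Background, access with an AccessKey and an endpoint, the following is a minimal sketch and not the official sample. It assumes that the OSS connector on the cluster reads the Hadoop properties fs.oss.accessKeyId, fs.oss.accessKeySecret, and fs.oss.endpoint. The environment variable names, the requiredEnv helper, and the Hangzhou internal endpoint are illustrative placeholders, so check them against your cluster configuration before use.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object OssWithAccessKey {
  // Hypothetical helper: fail fast when a required environment variable is missing.
  def requiredEnv(name: String, env: Map[String, String] = sys.env): String =
    env.getOrElse(name, throw new IllegalArgumentException(s"Missing environment variable: $name"))

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("Test OSS with AccessKey"))

    // Assumption: the OSS connector picks up these Hadoop properties.
    val hadoopConf = sc.hadoopConfiguration
    hadoopConf.set("fs.oss.accessKeyId", requiredEnv("OSS_ACCESS_KEY_ID"))
    hadoopConf.set("fs.oss.accessKeySecret", requiredEnv("OSS_ACCESS_KEY_SECRET"))
    // Example internal network endpoint; replace it with the one for your region.
    hadoopConf.set("fs.oss.endpoint", "oss-cn-hangzhou-internal.aliyuncs.com")

    // Same read path as in the AccessKey-free sample.
    val inputData = sc.textFile("oss://bucket/path/to/read")
    println(s"count: ${inputData.count()}")

    sc.stop()
  }
}
```

Reading the credentials from the environment keeps the AccessKey pair out of the source code. The internal network endpoint follows the note in Background.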

Appendix

For more information, see the complete sample code on GitHub.