This topic describes how to develop a demo project on Spark on MaxCompute by using Java or Scala.
Download a demo project
Spark on MaxCompute provides a demo project template. We recommend that you download the template and use a copy of it as the starting point for your application.
Run the following commands to download the demo project template:
Download and compile the Spark-1.x template
git clone https://github.com/aliyun/MaxCompute-Spark.git
cd MaxCompute-Spark/spark-1.x
mvn clean package
Download and compile the Spark-2.x template
git clone https://github.com/aliyun/MaxCompute-Spark.git
cd MaxCompute-Spark/spark-2.x
mvn clean package
Notice: In the demo project, the scope of the Spark dependency is set to provided. Do not change this setting. Otherwise, the submitted job may fail to run.
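For orientation, a Spark dependency declared with the provided scope in a Maven pom.xml typically looks like the fragment below. This is a sketch only: the artifactId and the version property shown here are assumptions for illustration, so check the pom.xml in the downloaded template for the actual values.

```xml
<!-- Illustrative fragment: artifactId and version are assumed, not copied from the template. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>${spark.version}</version>
  <!-- provided: the dependency is available at compile time but is not bundled into the packaged JAR -->
  <scope>provided</scope>
</dependency>
```

With the provided scope, Maven compiles against the Spark classes but leaves them out of the packaged artifact, on the expectation that the runtime environment supplies them.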
Spark-1.x demo project
Examples of a Spark-1.x demo project:
Spark-2.x demo project
Examples of a Spark-2.x demo project:
- WordCount example
- GraphX PageRank example
- MLlib KMeans-on-OSS example
- OSS UnstructuredData example
- MaxCompute table I/O example
- PySpark MaxCompute table I/O example
- PySpark writing to OSS example
- Spark Streaming LogHub example
- Spark Streaming DataHub example
- Spark Streaming Kafka example