
MaxCompute:Spark 2.3.0 Usage

Last Updated: Mar 12, 2026

This topic describes the configuration required to use Spark 2.3.0.

Important

We recommend that you use Spark 3 or later whenever possible.

Submit Tasks

  • When you submit tasks by using the Spark client, download the client if you have not already done so, and then add the following parameters to specify the version.

    # Enable kube mode and event log
    spark.hadoop.odps.kube.mode=true
    spark.hadoop.odps.cupid.data.proxy.enable=true
    spark.hadoop.odps.cupid.fuxi.shuffle.enable=true
    spark.hadoop.odps.spark.version=spark-2.3.0-odps0.47.0
    spark.hadoop.odps.spark.libs.public.enable=true
    spark.eventLog.enabled=true
    spark.eventLog.dir=/workdir/eventlog/
    
    # Read and write MaxCompute
    spark.sql.catalogImplementation=odps
  • When you submit tasks by using a DataWorks node, select Spark 2.x and add the following parameter to specify the version.

    spark.hadoop.odps.spark.version=spark-2.3.0-odps0.47.0
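Taken together, a client-side submission might look like the following sketch. The master setting, main class, and JAR name are placeholders of ours, not from this topic; only the `--conf` values mirror the parameters listed above.

```shell
# Hypothetical spark-submit invocation from the MaxCompute Spark client.
# --master, --class, and the JAR name are placeholders; adjust for your job.
./bin/spark-submit \
  --master yarn-cluster \
  --class com.example.SparkApp \
  --conf spark.hadoop.odps.kube.mode=true \
  --conf spark.hadoop.odps.cupid.data.proxy.enable=true \
  --conf spark.hadoop.odps.cupid.fuxi.shuffle.enable=true \
  --conf spark.hadoop.odps.spark.version=spark-2.3.0-odps0.47.0 \
  --conf spark.hadoop.odps.spark.libs.public.enable=true \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=/workdir/eventlog/ \
  --conf spark.sql.catalogImplementation=odps \
  your-app.jar
```

The same key=value pairs can instead be placed in the client's `spark-defaults.conf` so that every submission picks them up automatically.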

Parameter Settings

  • spark.sql.catalogImplementation

    Value: odps

    Description: Enables Spark SQL to read and write MaxCompute tables.

  • spark.hadoop.odps.cupid.vectorization.enable

    Value: true

    Description: When set to true, batch read/write optimization is used.

  • spark.hadoop.odps.input.split.size

    Value: 256 (default)

    Description: Adjusts the concurrency for reading MaxCompute tables. Each input split (partition) is 256 MB by default.
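To illustrate how the split size drives read concurrency, the arithmetic below estimates the number of input splits (and hence read tasks) produced when scanning a table of a given size. The helper name and the 1 TB example figure are ours for illustration; only the 256 MB default comes from the parameter description above.

```python
import math

def estimated_read_splits(table_size_mb: int, split_size_mb: int = 256) -> int:
    """Estimate how many input splits (read tasks) a table scan produces,
    assuming one split per split_size_mb of data (256 MB default)."""
    return math.ceil(table_size_mb / split_size_mb)

# A 1 TB (1,048,576 MB) table with the default 256 MB split size:
print(estimated_read_splits(1_048_576))        # → 4096 splits
# Halving the split size doubles the read concurrency:
print(estimated_read_splits(1_048_576, 128))   # → 8192 splits
```

A smaller split size raises parallelism at the cost of more task-scheduling overhead; a larger one does the opposite.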