
MaxCompute:Spark 3.4.2 configuration

Last Updated:Mar 12, 2026

This topic describes the configurations required to use Spark 3.4.2 and Spark 3.5.2.

Submit tasks

Use cluster mode

  • Submit tasks using the Spark client.

    Add the following parameters to specify the version. Download the client: Download Spark 3.4.2 or Download Spark 3.5.2.

    # Enable kube mode and event log
    spark.hadoop.odps.kube.mode=true
    spark.hadoop.odps.cupid.data.proxy.enable=true
    spark.hadoop.odps.cupid.fuxi.shuffle.enable=true
    
    ## for spark 3.4.2
    spark.hadoop.odps.spark.version=spark-3.4.2-odps0.48.0
    
    ## for spark 3.5.2
    spark.hadoop.odps.spark.version=spark-3.5.2-odps0.49.0
    spark.hadoop.odps.spark.libs.public.enable=true
    spark.eventLog.enabled=true
    spark.eventLog.dir=/workdir/eventlog/
    
    # For reading and writing to MaxCompute
    spark.sql.defaultCatalog=odps
    spark.sql.catalog.odps=org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog
    spark.sql.sources.partitionOverwriteMode=dynamic
    spark.sql.extensions=org.apache.spark.sql.execution.datasources.v2.odps.extension.OdpsExtensions
  • Submit tasks using a DataWorks node. Add the following parameters to specify the version.

    ## for spark 3.4.2
    spark.hadoop.odps.spark.version=spark-3.4.2-odps0.48.0
    
    ## for spark 3.5.2
    spark.hadoop.odps.spark.version=spark-3.5.2-odps0.49.0
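To avoid mixing the flags for the two versions, the version-specific settings above can be assembled programmatically before submission. A minimal Python sketch (the property names and version strings come from this topic; the helper itself and the `--conf` argument formatting are illustrative, assuming submission via a spark-submit-style command line):

```python
# Common client-side flags, as listed above.
BASE_CONF = {
    "spark.hadoop.odps.kube.mode": "true",
    "spark.hadoop.odps.cupid.data.proxy.enable": "true",
    "spark.hadoop.odps.cupid.fuxi.shuffle.enable": "true",
}

# Version-specific flags. libs.public.enable appears only in the
# 3.5.2 configuration above.
VERSION_CONF = {
    "3.4.2": {"spark.hadoop.odps.spark.version": "spark-3.4.2-odps0.48.0"},
    "3.5.2": {
        "spark.hadoop.odps.spark.version": "spark-3.5.2-odps0.49.0",
        "spark.hadoop.odps.spark.libs.public.enable": "true",
    },
}

def submit_args(version: str) -> list[str]:
    """Return the combined settings as spark-submit --conf flags."""
    conf = {**BASE_CONF, **VERSION_CONF[version]}
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args

print(" ".join(submit_args("3.5.2")))
```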

Parameter settings

| Parameter Name | Value | Description |
| --- | --- | --- |
| spark.sql.defaultCatalog | Set the value to odps. | N/A |
| spark.sql.catalog.odps | Set the value to org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog. | N/A |
| spark.sql.sources.partitionOverwriteMode | Set the value to dynamic. | N/A |
| spark.sql.extensions | Set the value to org.apache.spark.sql.execution.datasources.v2.odps.extension.OdpsExtensions. | N/A |
| spark.sql.catalog.odps.enableNamespaceSchema | The default value is false. | If the schema-level syntax switch is enabled for the MaxCompute project, set this parameter to true. |
| spark.sql.catalog.odps.enableVectorizedReader | The default value is true. | Enables vectorized reading. |
| spark.sql.catalog.odps.enableVectorizedWriter | The default value is true. | Enables vectorized writing. |
| spark.sql.catalog.odps.splitSizeInMB | The default value is 256. | Adjusts the concurrency for reading MaxCompute tables. By default, each split reads 256 MB. |
| spark.sql.catalog.odps.tableReadProvider | The default value is v1. | Set this parameter to tunnel when using local mode. |
| spark.sql.catalog.odps.tableWriteProvider | The default value is v1. | Set this parameter to tunnel when using local mode. |
| spark.hadoop.odps.spark.alinux3.enabled | The default value is false. | If set to true, cluster mode uses the Alibaba Cloud Linux 3 (Alinux 3) base runtime image and Python 3.11. |
| spark.hadoop.odps.native.engine.enable | The default value is false. | If set to true, Native Engine is used to accelerate computation in cluster mode. Native Engine uses the Alinux 3 base image by default. |
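As a sketch of how the parameters above map to session configuration, the snippet below collects the read/write settings into key-value pairs and shows a local-mode variant that switches the providers to tunnel. The PySpark builder calls are shown only as comments since they require a pyspark installation; everything outside the documented property names and values is illustrative:

```python
# Catalog parameters from the table above. Defaults are spelled out
# for clarity; override them only when the description calls for it.
CATALOG_CONF = {
    "spark.sql.defaultCatalog": "odps",
    "spark.sql.catalog.odps": (
        "org.apache.spark.sql.execution.datasources.v2.odps.OdpsTableCatalog"
    ),
    "spark.sql.sources.partitionOverwriteMode": "dynamic",
    "spark.sql.extensions": (
        "org.apache.spark.sql.execution.datasources.v2.odps.extension.OdpsExtensions"
    ),
    "spark.sql.catalog.odps.enableVectorizedReader": "true",
    "spark.sql.catalog.odps.enableVectorizedWriter": "true",
    "spark.sql.catalog.odps.splitSizeInMB": "256",
}

def local_mode(conf: dict) -> dict:
    """Return a copy adjusted for local mode (tunnel providers)."""
    out = dict(conf)
    out["spark.sql.catalog.odps.tableReadProvider"] = "tunnel"
    out["spark.sql.catalog.odps.tableWriteProvider"] = "tunnel"
    return out

# Usage with PySpark (requires a pyspark installation):
# from pyspark.sql import SparkSession
# builder = SparkSession.builder.appName("odps-demo")
# for k, v in local_mode(CATALOG_CONF).items():
#     builder = builder.config(k, v)
# spark = builder.getOrCreate()
```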