Serverless Spark provides various built-in parameters. This topic describes these parameters and their use cases to help you configure the runtime environment and optimize task execution.
| Parameter name | Description | Scenario |
| --- | --- | --- |
| spark.emr.serverless.user.defined.jars | Adds uploaded JAR packages to the classpath of the Serverless Spark driver and executors. | Use this parameter to add custom JAR packages from OSS to the Spark driver and executors when you submit Spark tasks by using the Spark-Submit tool, batch jobs, or the Airflow Serverless Spark Operator, or when you create session resources. See the spark-submit sketch after this table. |
| spark.emr.serverless.fusion | Specifies whether to enable Fusion for sessions or batch processing tasks started by Kyuubi and Livy. Valid values: true and false. | You can use the Spark Configuration parameter in a task or session to enable Fusion. |
| spark.emr.serverless.environmentId | Specifies the ID of the runtime environment to use for computing resources. | Use this parameter to specify a runtime environment when you submit Serverless Spark tasks by using Airflow or the Spark-Submit tool. By default, third-party dependency libraries are installed in the runtime environment. |
| spark.emr.serverless.network.service.name | Specifies the name of the network connection that enables connectivity between computing resources and data sources in other virtual private clouds (VPCs). | Use this parameter to add a network connection when you submit a Serverless Spark task so that the task can access data sources in other VPCs. |
| spark.emr.serverless.excludedModules | Removes built-in libraries from Serverless Spark. | This parameter is typically used when you need to use custom JAR packages. It lets you remove built-in Serverless Spark libraries when you submit Spark tasks from the Serverless Spark console, the Spark-Submit tool, batch jobs, the Airflow Serverless Spark Operator, Kyuubi, or Livy, or when you create session resources. |
| spark.emr.serverless.kyuubi.engine.queue | Specifies the name of the workspace queue in which the Spark application started by Kyuubi runs. | Set this parameter in the Kyuubi configuration or specify it in the JDBC URL when you establish a connection. See the Kyuubi JDBC sketch after this table. |
| spark.emr.serverless.jr.timeout | Sets the maximum runtime of a task, in seconds. The task is automatically stopped when it times out. The value must be an integer from -1 to 2147483647. A value of -1 or 0, or an empty value (the default), means that no timeout is set. | Use this parameter to set a task timeout when you submit a task from the Serverless Spark console, with the Spark-Submit tool, as a batch job, or with the Airflow Serverless Spark Operator. |
| spark.emr.serverless.fusion.enabled | Specifies whether to enable Fusion when a Serverless Spark engine is launched. Valid values: true and false. | Use this parameter to specify whether to enable Fusion acceleration when you submit a task from the Serverless Spark console, with the Spark-Submit tool, as a batch job, or with the Airflow Serverless Spark Operator. |
| spark.emr.serverless.mount.nas.enabled | Specifies whether to mount a NAS directory to the Spark driver. If you enable this feature, you must also use spark.emr.serverless.mount.nas.volume to specify the directory to mount. | Use this parameter to mount a managed NAS directory to the Spark driver when you submit a task from the Serverless Spark console, with the Spark-Submit tool, as a batch job, or with the Airflow Serverless Spark Operator. After this feature is enabled, the driver can read and write files in the mounted NAS directory. |
| spark.emr.serverless.mount.nas.volume | Specifies the ID of the managed NAS directory to mount. Only specific engine versions support this parameter. | Use this parameter to mount a specific managed NAS directory when you submit a task from the Serverless Spark console, with the Spark-Submit tool, as a batch job, or with the Airflow Serverless Spark Operator. |
| spark.emr.serverless.mount.nas.executor | Specifies whether to mount a NAS directory to all Spark executors. | Use this parameter to mount a managed NAS directory to Spark executors when you submit a task from the Serverless Spark console, with the Spark-Submit tool, as a batch job, or with the Airflow Serverless Spark Operator. After this feature is enabled, the executors can read and write files in the mounted NAS directory. |
| spark.emr.serverless.mount.oss.enabled | Specifies whether to mount an OSS directory to the Spark driver. If you enable this feature, you must also use spark.emr.serverless.mount.oss.volume to specify the directory to mount. | Use this parameter to mount a managed OSS directory to the Spark driver when you submit a task from the Serverless Spark console, with the Spark-Submit tool, as a batch job, or with the Airflow Serverless Spark Operator. After this feature is enabled, the driver can read and write files in the mounted OSS directory. See the mount sketch after this table. |
| spark.emr.serverless.mount.oss.volume | Specifies the ID of the managed OSS directory to mount. | Use this parameter to mount a specific managed OSS directory when you submit a task from the Serverless Spark console, with the Spark-Submit tool, as a batch job, or with the Airflow Serverless Spark Operator. |
| spark.emr.serverless.mount.oss.executor | Specifies whether to mount an OSS directory to all Spark executors. Valid values: true and false. | Use this parameter to mount a managed OSS directory to Spark executors when you submit a task from the Serverless Spark console, with the Spark-Submit tool, as a batch job, or with the Airflow Serverless Spark Operator. After this feature is enabled, the executors can read and write files in the mounted OSS directory. |
| spark.emr.serverless.templateId | Specifies the ID of the default configuration template for the Spark application. By referencing a predefined workspace template, you can simplify parameter configuration when you submit a task. You can obtain the template ID in the workspace console. | This parameter can be used only with the Spark-Submit tool. |
| spark.emr.serverless.livy.config.mode | Controls whether to use the settings from the Livy gateway configuration. | Set this parameter when you submit batch jobs or create sessions through the Livy gateway. |
| spark.emr.serverless.tag.xxxx | Adds tags to a batch job submitted through Livy. The xxxx part of the parameter name specifies the tag key. | Use this parameter to add tags to Spark jobs submitted through the Livy Gateway. You can then filter jobs by these tags in the job history. See the Livy sketch after this table. |
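The sketches below show how these parameters are typically passed. All paths, IDs, endpoints, and names in them are hypothetical placeholders, not values from a real workspace. First, a minimal spark-submit invocation that sets several of the parameters above through the standard --conf option; the OSS paths, environment ID, network service name, and timeout value are assumptions for illustration:

```shell
# Hypothetical values throughout: replace the OSS paths, environment ID,
# and network service name with the ones from your own workspace.
spark-submit \
  --conf spark.emr.serverless.user.defined.jars=oss://my-bucket/jars/my-udf.jar \
  --conf spark.emr.serverless.environmentId=env-xxxxxxxx \
  --conf spark.emr.serverless.network.service.name=my-network-service \
  --conf spark.emr.serverless.fusion.enabled=true \
  --conf spark.emr.serverless.jr.timeout=3600 \
  oss://my-bucket/code/main.py
```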
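A similar sketch for the mount parameters: the enabled and executor switches turn the mount on for the driver and all executors, and the volume parameter points at the managed directory. The directory ID oss-xxxxxxxx is a placeholder:

```shell
# Hypothetical directory ID: mount a managed OSS directory on the driver
# and on all executors so that the task can read and write files in it.
spark-submit \
  --conf spark.emr.serverless.mount.oss.enabled=true \
  --conf spark.emr.serverless.mount.oss.executor=true \
  --conf spark.emr.serverless.mount.oss.volume=oss-xxxxxxxx \
  oss://my-bucket/code/main.py
```

The NAS counterparts (spark.emr.serverless.mount.nas.enabled, .executor, and .volume) follow the same pattern.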
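For Kyuubi, the queue can be specified in the JDBC URL at connection time. The sketch below uses Beeline and the standard Hive JDBC URL form, in which configuration properties follow the ? separator; the endpoint, database, and queue name are placeholders:

```shell
# Hypothetical endpoint and queue name: pass the queue as a configuration
# property in the JDBC URL when connecting through Kyuubi.
beeline -u "jdbc:hive2://<kyuubi-endpoint>:10009/default?spark.emr.serverless.kyuubi.engine.queue=dev_queue"
```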
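Finally, a sketch of tagging a Livy batch job. It assumes the tags are passed through the conf map of Livy's POST /batches request; the endpoint, file path, and tag keys (team, env) are placeholders:

```shell
# Hypothetical endpoint, file, and tag keys: the xxxx suffix of
# spark.emr.serverless.tag.xxxx becomes the tag key.
curl -X POST "http://<livy-gateway-endpoint>/batches" \
  -H "Content-Type: application/json" \
  -d '{
        "file": "oss://my-bucket/code/main.py",
        "conf": {
          "spark.emr.serverless.tag.team": "data-platform",
          "spark.emr.serverless.tag.env": "staging"
        }
      }'
```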