What are the Spark configuration parameters and how to configure them - AnalyticDB

AnalyticDB for MySQL Spark configuration parameters are similar to those of Apache Spark. This topic describes the configuration parameters of AnalyticDB for MySQL that differ from those of Apache Spark.

Usage notes

Spark application configuration parameters are used to configure and adjust the behavior and performance of Spark applications. The format of these parameters varies based on the Spark development tools you use.

Development tool	Configuration parameter format	Configuration Example
SQL editor	set key=value;	`set spark.sql.hive.metastore.version=adb;`
Spark Jar editor	"key": "value"	`"spark.sql.hive.metastore.version":"adb"`
Notebook editor	"key": "value"	`"spark.sql.hive.metastore.version":"adb"`
spark-submit command line interface	key=value	`spark.sql.hive.metastore.version=adb`
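
The format differences in the preceding table can be illustrated with a minimal Python sketch. The `render_config` function and its tool labels (`sql`, `jar`, `notebook`, `spark-submit`) are hypothetical names used only for illustration and are not part of AnalyticDB for MySQL.

```python
def render_config(key, value, tool):
    """Render one Spark configuration entry in the format that a tool expects.

    The tool labels are illustrative assumptions, not product identifiers.
    """
    if tool == "sql":
        # SQL editor: set key=value;
        return f"set {key}={value};"
    if tool in ("jar", "notebook"):
        # Spark Jar editor and Notebook editor: "key": "value"
        return f'"{key}": "{value}"'
    if tool == "spark-submit":
        # spark-submit command line interface: key=value
        return f"{key}={value}"
    raise ValueError(f"unknown tool: {tool}")


if __name__ == "__main__":
    for tool in ("sql", "jar", "notebook", "spark-submit"):
        print(tool, "->", render_config("spark.sql.hive.metastore.version", "adb", tool))
```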

Specify Driver and Executor resources

Parameter	Required	Default value	Description	Corresponding parameter in Apache Spark
spark.adb.acuPerApp	No	None	The number of ACUs used by a single Spark job. Valid values: [2, Maximum computing resources of a job resource group]. After you configure this parameter, the system automatically calculates and configures the Spark Driver specifications, Spark Executor specifications, and the number of Spark Executor nodes. Configuration strategy for spark.adb.acuPerApp: If you configure spark.adb.acuPerApp together with all other resource parameters (spark.driver.resourceSpec, spark.executor.resourceSpec, and spark.executor.instances), spark.adb.acuPerApp is ignored and the values of the other resource parameters remain unchanged. If you configure only spark.adb.acuPerApp, it takes effect and all other resource parameters are automatically calculated and configured based on it. In all other configuration combinations, spark.adb.acuPerApp takes effect: the system calculates the resource parameters that are not explicitly configured (spark.driver.resourceSpec, spark.executor.resourceSpec, and spark.executor.instances) based on this parameter, and the values of explicitly configured parameters remain unchanged.	N/A
spark.driver.resourceSpec	Yes	medium	The resource specifications of the Spark driver. Each type corresponds to distinct specifications. For more information, see the Type column in the Spark resource specifications table of this topic. Important If you submit Spark applications, you can use Apache Spark parameters and configure the parameters based on the values of cores and memory that are described in the Spark resource specifications table of this topic. Example: `CONF spark.driver.resourceSpec = c.small;`. In this example, the Spark driver provides 1 core and 2 GB memory.	spark.driver.cores and spark.driver.memory
spark.executor.resourceSpec	Yes	medium	The resource specifications of each Spark executor. Each type corresponds to distinct specifications. For more information, see the Type column in the Spark resource specifications table of this topic. Important If you submit Spark applications, you can use Apache Spark parameters and configure the parameters based on the values of cores and memory that are described in the Spark resource specifications table of this topic. Example: `CONF spark.executor.resourceSpec = c.small;`. In this example, each Spark executor provides 1 core and 2 GB memory.	spark.executor.cores and spark.executor.memory
spark.executor.instances	No	Maximum computing resources of a job resource group/5	The number of started Spark executors.	spark.executor.instances
spark.adb.driverDiskSize	No	None	The size of additional disk storage that is mounted on the Spark driver to meet large disk storage requirements. By default, the additional disk storage is mounted on the /user_data_dir directory. Unit: GiB. Valid values: (0,100]. Example: spark.adb.driverDiskSize=50Gi. In this example, the additional disk storage that is mounted on the Spark driver is set to 50 GiB.	N/A
spark.adb.executorDiskSize	No	None	The size of additional disk storage that is mounted on a Spark executor to meet the requirements of shuffle operations. By default, the additional disk storage is mounted on the /shuffle_volume directory. Unit: GiB. Valid values: (0,100]. Example: spark.adb.executorDiskSize=50Gi. In this example, the additional disk storage that is mounted on a Spark executor is set to 50 GiB.	N/A
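
The configuration strategy described for spark.adb.acuPerApp can be sketched as a small decision function. This is an illustrative reading of the three cases in the table, not the system's actual implementation. The function name `acu_per_app_effect` and its return shape are assumptions.

```python
RESOURCE_PARAMS = (
    "spark.driver.resourceSpec",
    "spark.executor.resourceSpec",
    "spark.executor.instances",
)


def acu_per_app_effect(conf):
    """Return whether spark.adb.acuPerApp takes effect and which resource
    parameters would be calculated automatically (hypothetical helper)."""
    if "spark.adb.acuPerApp" not in conf:
        return {"effective": False, "auto_calculated": []}
    explicit = [p for p in RESOURCE_PARAMS if p in conf]
    if len(explicit) == len(RESOURCE_PARAMS):
        # All resource parameters are explicit: acuPerApp is ignored.
        return {"effective": False, "auto_calculated": []}
    # Otherwise acuPerApp takes effect and fills in whatever is missing;
    # explicitly configured parameters keep their values.
    missing = [p for p in RESOURCE_PARAMS if p not in conf]
    return {"effective": True, "auto_calculated": missing}


if __name__ == "__main__":
    print(acu_per_app_effect({"spark.adb.acuPerApp": "16"}))
    print(acu_per_app_effect({"spark.adb.acuPerApp": "16",
                              "spark.executor.instances": "4"}))
```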

Spark resource specifications

Important

You can use reserved resources or elastic resources to execute Spark jobs. If you use the on-demand elastic resources of a job resource group to execute Spark jobs, the system calculates the number of used AnalyticDB compute units (ACUs) based on the Spark resource specifications and the CPU-to-memory ratio using the following formulas:

1:2 CPU-to-memory ratio: Number of used ACUs = Number of CPU cores × 0.8.
1:4 CPU-to-memory ratio: Number of used ACUs = Number of CPU cores × 1.
1:8 CPU-to-memory ratio: Number of used ACUs = Number of CPU cores × 1.5.

For information about the prices of on-demand elastic resources, see Pricing for Data Lakehouse Edition.
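
As a worked illustration of the three formulas, the following Python sketch computes the number of used ACUs from a CPU core count and a memory size. The function name `used_acus` is hypothetical, and the sketch covers only the three CPU-to-memory ratios listed above.

```python
# ACUs per CPU core, keyed by the memory (GB) per core of each ratio.
ACU_PER_CORE = {2: 0.8, 4: 1.0, 8: 1.5}


def used_acus(cpu_cores, memory_gb):
    """Number of used ACUs for one node with the given cores and memory."""
    if cpu_cores <= 0 or memory_gb % cpu_cores != 0:
        raise ValueError("unsupported CPU-to-memory ratio")
    gb_per_core = memory_gb // cpu_cores
    if gb_per_core not in ACU_PER_CORE:
        raise ValueError("unsupported CPU-to-memory ratio")
    # Round to one decimal place to avoid floating-point noise.
    return round(cpu_cores * ACU_PER_CORE[gb_per_core], 1)


if __name__ == "__main__":
    print(used_acus(1, 2))    # 1:2 ratio, 1 core
    print(used_acus(2, 8))    # 1:4 ratio, 2 cores
    print(used_acus(4, 32))   # 1:8 ratio, 4 cores
```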

Table 1. Spark resource specifications

Type	CPU cores	Memory (GB)	Disk storage¹ (GB)	Used ACUs
c.small	1	2	20	0.8
small	1	4	20	1
m.small	1	8	20	1.5
c.medium	2	4	20	1.6
medium	2	8	20	2
m.medium	2	16	20	3
c.large	4	8	20	3.2
large	4	16	20	4
m.large	4	32	20	6
c.xlarge	8	16	20	6.4
xlarge	8	32	20	8
m.xlarge	8	64	20	12
c.2xlarge	16	32	20	12.8
2xlarge	16	64	20	16
m.2xlarge	16	128	20	24
m.4xlarge	32	256	20	48
m.8xlarge	64	512	20	96

Note

¹Disk storage: The system is expected to occupy approximately 1% of the disk storage. The actual available disk storage may be less than 20 GB.

Example

Allocate 32 Executors to a Spark job. Each Executor uses the medium specification (2 cores, 8 GB memory) and mounts 100 GiB of additional disk storage, and the Driver uses the small specification (1 core, 4 GB memory). In this case, the job can be allocated a total of 65 ACUs of computing resources: 32 × 2 ACUs for the Executors plus 1 ACU for the Driver.

{
   "spark.driver.resourceSpec":"small",
   "spark.executor.resourceSpec":"medium",
   "spark.executor.instances":"32",
   "spark.adb.executorDiskSize":"100Gi"
}
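
The 65 ACU total in this example can be checked with a short Python sketch. The `USED_ACUS` lookup copies only the two rows of Table 1 that the example needs, and the function name `job_acus` is hypothetical.

```python
# Used ACUs per resource type, copied from Table 1 (subset only).
USED_ACUS = {"small": 1, "medium": 2}


def job_acus(conf):
    """Total ACUs of a job: one driver plus N executors."""
    driver = USED_ACUS[conf["spark.driver.resourceSpec"]]
    executor = USED_ACUS[conf["spark.executor.resourceSpec"]]
    instances = int(conf["spark.executor.instances"])
    return driver + instances * executor


example_conf = {
    "spark.driver.resourceSpec": "small",
    "spark.executor.resourceSpec": "medium",
    "spark.executor.instances": "32",
    "spark.adb.executorDiskSize": "100Gi",
}

if __name__ == "__main__":
    print(job_acus(example_conf))
```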

Specify priorities for Spark jobs

Parameter	Required	Default value	Description
spark.adb.priority		NORMAL	The priority of a Spark job. If resources are insufficient to execute all Spark jobs that are submitted, the queued jobs that have higher priorities are first executed. Valid values: HIGH: High priority. NORMAL: Normal priority. LOW: Low priority. LOWEST: Lowest priority.

Important

We recommend that you set this parameter to HIGH for streaming Spark jobs (long-running stream jobs).

Access the metadata

Parameter	Required	Default value	Description
spark.sql.catalogImplementation		Spark SQL jobs: hive. Non-Spark SQL jobs: in-memory.	The type of the metadata to be accessed. Valid values: hive: the metadata in the built-in Hive Metastore of Apache Spark. in-memory: the metadata in the temporary directory.
spark.sql.hive.metastore.version		Spark SQL jobs: adb. Non-Spark SQL jobs: <hive_version>.	The version of the metastore service. Valid values: adb: Connects to the metadata of AnalyticDB for MySQL. <hive_version>: the version of the Hive Metastore.

Note

For information about the Hive versions that are supported by Apache Spark, see Spark Configuration.
To access a self-managed Hive Metastore, you can replace the default configuration with the standard Apache Spark configuration. For more information, see Spark Configuration.

Examples

Configure the following setting to access the metadata in AnalyticDB for MySQL:
```
spark.sql.hive.metastore.version=adb;
```
Configure the following settings to access the metadata in the built-in Hive Metastore of Apache Spark:
```
spark.sql.catalogImplementation=hive;
spark.sql.hive.metastore.version=2.1.3;
```
Configure the following setting to access the metadata in the temporary directory:
```
spark.sql.catalogImplementation=in-memory;
```

Configure the Spark UI

Parameter	Required	Default value	Description
spark.app.log.rootPath	No	`oss://<aliyun-oa-adb-spark-Alibaba Cloud account ID-oss-Zone ID>/<Cluster ID>/<Spark application ID>`	The directory in which the AnalyticDB for MySQL Spark job logs and output data of the Linux operating system are stored. By default, the folder named Spark application ID contains the following content: The file named `Spark application ID-000X` that stores the Spark event logs used for Spark UI rendering. The folders named `driver` and numbers that store the logs of the corresponding nodes. The folders named `stdout` and `stderr` that store the output data of the Linux operating system.
spark.adb.event.logUploadDuration	No	false	Specifies whether to record the duration of an event log upload.
spark.adb.buffer.maxNumEvents	No	1000	The maximum number of events that are cached by the driver.
spark.adb.payload.maxNumEvents	No	10000	The maximum number of events that can be uploaded to Object Storage Service (OSS) at a time.
spark.adb.event.pollingIntervalSecs	No	0.5	The interval between two uploads of events to OSS. Unit: seconds. For example, a value of 0.5 indicates that events are uploaded every 0.5 seconds.
spark.adb.event.maxPollingIntervalSecs	No	60	The maximum retry interval when an event upload to OSS fails. Unit: seconds. The interval between a failed upload and an upload retry must be within the range of the `spark.adb.event.pollingIntervalSecs` to `spark.adb.event.maxPollingIntervalSecs` values.
spark.adb.event.maxWaitOnEndSecs	No	10	The maximum wait time for uploading events to OSS. Unit: seconds. The maximum wait time is the interval between the start and completion of an upload. If an upload is not complete within the maximum wait time, the upload is retried.
spark.adb.event.waitForPendingPayloadsSleepIntervalSecs	No	1	The required wait time for retrying an upload that fails to be complete within the `spark.adb.event.maxWaitOnEndSecs` value. Unit: seconds.
spark.adb.eventLog.rolling.maxFileSize	No	209715200	The maximum file size of event logs in OSS. Unit: bytes. Event logs are stored in OSS in the form of multiple files, such as Eventlog.0 and Eventlog.1. You can specify the file size.

Grant permissions to RAM users

Parameter	Required	Default value	Description
spark.adb.roleArn	No	N/A	The Alibaba Cloud Resource Name (ARN) of the Resource Access Management (RAM) role that you want to attach to the RAM user in the RAM console to grant the RAM user the permissions to submit Spark applications. For more information, see RAM role overview. If you submit Spark applications as a RAM user, you must specify this parameter. If you submit Spark applications with an Alibaba Cloud account, you do not need to specify this parameter. Note If you have granted permissions to a RAM user in the RAM console, you do not need to specify this parameter. For more information, see Account authorization.

Enable the built-in data source connectors

Parameter	Required	Default value	Description
spark.adb.connectors	No	N/A	The names of the built-in connectors of AnalyticDB for MySQL Spark that you want to enable. Separate multiple names with commas (,). Valid values: oss, hudi, delta, adb, odps, external_hive, jindo, and default.
spark.hadoop.io.compression.codec.snappy.native	No	false	Specifies whether a Snappy file is in the standard Snappy format. By default, Hadoop recognizes the Snappy files that are edited in Hadoop. If you set this parameter to true, the standard Snappy library is used for decompression. If you set this parameter to false, the default Snappy library of Hadoop is used for decompression.

Enable VPC access and data source access

Parameter	Required	Default value	Description
spark.adb.eni.enabled	No	false	Specifies whether to enable Elastic Network Interface (ENI). If you use external tables to access other external data sources, you must enable ENI. Valid values: true: Enable. false: Disable.
spark.adb.eni.vswitchId	No	N/A	The ID of the vSwitch that is associated with an ENI. If you connect to AnalyticDB for MySQL from an Elastic Compute Service (ECS) instance over a virtual private cloud (VPC), you must specify a vSwitch ID for the VPC. Note If you have enabled VPC access, you must set the spark.adb.eni.enabled parameter to true.
spark.adb.eni.securityGroupId	No	N/A	The ID of the security group that is associated with an ENI. If you connect to AnalyticDB for MySQL from an ECS instance over a VPC, you must specify a security group ID. Note If you have enabled VPC access, you must set the spark.adb.eni.enabled parameter to true.
spark.adb.eni.extraHosts	No	N/A	The mappings between IP addresses and hostnames. This parameter allows Spark to resolve the hostnames of data sources. If you want to access a self-managed Hive data source, you must specify this parameter. Note Separate IP addresses and hostnames with spaces. Separate multiple groups of IP addresses and hostnames with commas (,). Example: `ip0 master0,ip1 master1`. If you have enabled data source access, you must set the spark.adb.eni.enabled parameter to true.
spark.adb.eni.adbHostAlias.enabled	No	false	Specifies whether to automatically write the domain name resolution information that AnalyticDB for MySQL requires to a mapping table of domain names and IP addresses. Valid values: true: Enable. false: Disable. If you use an ENI to read data from or write data to EMR Hive, you must set this parameter to true.
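
The value format of spark.adb.eni.extraHosts (IP address and hostname separated by a space, groups separated by commas) can be produced with a small helper. The following Python sketch is illustrative only. The function name `format_extra_hosts` and the sample IP addresses are assumptions.

```python
def format_extra_hosts(mappings):
    """Build a spark.adb.eni.extraHosts value from (ip, hostname) pairs."""
    groups = []
    for ip, hostname in mappings:
        # Spaces and commas are separators, so they cannot appear in a field.
        for field in (ip, hostname):
            if not field or " " in field or "," in field:
                raise ValueError(f"invalid field: {field!r}")
        groups.append(f"{ip} {hostname}")
    return ",".join(groups)


if __name__ == "__main__":
    # Sample addresses are placeholders, not real endpoints.
    value = format_extra_hosts([("192.168.0.10", "master0"),
                                ("192.168.0.11", "master1")])
    print({"spark.adb.eni.enabled": "true",
           "spark.adb.eni.extraHosts": value})
```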

Configure application retries

Parameter	Required	Default value	Description
spark.adb.maxAttempts		1	The maximum number of attempts that are allowed to run an application. The default value is 1, which specifies that no retries are allowed. For example, if you set this parameter to 3 for a Spark application, the system attempts to run the application up to three times within a sliding window.
spark.adb.attemptFailuresValidityInterval		Integer.MAX	The duration of the sliding window within which the system attempts to rerun an application. Unit: seconds. For example, if you set this parameter to 6000 for a Spark application, the system counts the number of attempts within the last 6,000 seconds after a failed run. If the number of attempts is less than the value of the spark.adb.maxAttempts parameter, the system reruns the application.
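
The interaction between spark.adb.maxAttempts and spark.adb.attemptFailuresValidityInterval can be sketched as a sliding-window check. This is a simplified model of the behavior described above, not the system's scheduler code. The function name `should_retry` and the use of plain second timestamps are assumptions.

```python
def should_retry(attempt_times, now, max_attempts, validity_interval_secs):
    """Decide whether a failed application may be run again.

    attempt_times: start times (in seconds) of all attempts so far.
    Only attempts inside the sliding window ending at `now` are counted.
    """
    recent = [t for t in attempt_times if now - t <= validity_interval_secs]
    return len(recent) < max_attempts


if __name__ == "__main__":
    # maxAttempts=3, window of 6000 seconds.
    print(should_retry([0, 100], now=200, max_attempts=3,
                       validity_interval_secs=6000))
    print(should_retry([0, 100, 200], now=300, max_attempts=3,
                       validity_interval_secs=6000))
    # Older attempts fall out of the window, so a retry is allowed again.
    print(should_retry([0, 100, 7000], now=7100, max_attempts=3,
                       validity_interval_secs=6000))
```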

Automatic job termination on scheduling timeout (Scheduling Watchdog)

When the Executor pods of a job remain in the Pending state for an extended period due to insufficient underlying resources (such as ECI inventory exhaustion, resource quota overruns, or unsatisfiable node affinity), the job keeps waiting, which wastes Driver resources and blocks subsequent scheduling. After you enable this feature, when both the ratio of unschedulable Executors and the wait time reach the specified thresholds, the system automatically marks the job as failed and releases resources to achieve fail-fast behavior.

Scenarios

This feature is suitable for scenarios in which you run a large number of Spark jobs in a shared cluster and want jobs to fail quickly rather than wait indefinitely when resources are insufficient. Typical scenarios include:

Insufficient ECI resources: When you use Elastic Container Instance (ECI) to run Executors, insufficient inventory for the underlying instance types causes pods to remain in the Pending state for a long time.
Resource quota overruns: The CPU or memory quota of the namespace or resource group is exhausted, and new Executors cannot be scheduled.
Fast retries for batch jobs: You want jobs to fail quickly when resources are unavailable, and let an upper-layer scheduling system (such as Airflow) decide the retry strategy.

Configuration method

When you submit a Spark job, configure the following parameters to enable and adjust the behavior of the Watchdog.

Parameter	Required	Default value	Description
spark.adb.terminateOnPending.enabled	Required when the feature is enabled	false	Specifies whether to enable automatic job termination on scheduling timeout for the job. Valid values: true: enables the feature. false (default): disables the feature. The other Watchdog parameters take effect only after you set this parameter to true.
spark.adb.terminateOnPending.timeoutSec	No	610	The tolerance time during which an Executor pod remains in the Pending state. Unit: seconds. The timer starts when an Executor pod is first detected as unschedulable. When the wait time exceeds this value and the spark.adb.terminateOnPending.executorRatio condition is also met, the job is terminated. Recommended values: 610 to 900.
spark.adb.terminateOnPending.executorRatio	No	0.5	The threshold for the ratio of unschedulable Executors that triggers termination. Valid range: (0, 1.0]. The ratio is calculated as the number of unschedulable Executors divided by the expected total number of Executors. When the ratio reaches this value and the Pending time exceeds spark.adb.terminateOnPending.timeoutSec, the job is terminated. A smaller value makes triggering more sensitive, and a larger value makes triggering more lenient: 0.3: triggers termination when 30% of the Executors are unschedulable. Suitable for latency-sensitive jobs. 0.5 (default): triggers termination when half of the Executors are unschedulable. 0.8 to 1.0: triggers termination only when most or all of the Executors are unschedulable. Suitable for environments with dynamically fluctuating resources. The expected total number of Executors is determined based on the following priority: spark.executor.instances (static allocation). spark.dynamicAllocation.initialExecutors (the initial value for dynamic allocation). spark.dynamicAllocation.maxExecutors (the maximum value for dynamic allocation). If none of the preceding parameters are configured, the current number of unschedulable pods is used.

Important

Termination is triggered only when both spark.adb.terminateOnPending.executorRatio and spark.adb.terminateOnPending.timeoutSec conditions are met. Termination is not triggered if only the wait time is exceeded but the ratio is not reached, or if only the ratio is reached but the wait time is insufficient.

Configuration example: enable automatic job termination on scheduling timeout for a job, and terminate the job when half of the Executors remain unschedulable for more than 300 seconds.

{
   "spark.adb.terminateOnPending.enabled":"true",
   "spark.adb.terminateOnPending.timeoutSec":"300",
   "spark.adb.terminateOnPending.executorRatio":"0.5"
}
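
The two-condition rule and the priority order for the expected total number of Executors can be sketched in Python as follows. This is an illustrative model based on the parameter descriptions in this section, not the Watchdog's actual code. The function names are hypothetical, and the strict comparison for the wait time follows the wording "exceeds".

```python
TOTAL_EXECUTOR_KEYS = (
    "spark.executor.instances",
    "spark.dynamicAllocation.initialExecutors",
    "spark.dynamicAllocation.maxExecutors",
)


def expected_executors(conf, unschedulable):
    """Resolve the denominator of the ratio in the documented priority order."""
    for key in TOTAL_EXECUTOR_KEYS:
        if key in conf:
            return int(conf[key])
    # Fallback: the current number of unschedulable pods.
    return unschedulable


def should_terminate(conf, unschedulable, pending_secs):
    """Return True only when both the ratio and the wait time conditions hold."""
    if conf.get("spark.adb.terminateOnPending.enabled", "false") != "true":
        return False
    timeout = float(conf.get("spark.adb.terminateOnPending.timeoutSec", "610"))
    threshold = float(conf.get("spark.adb.terminateOnPending.executorRatio", "0.5"))
    total = expected_executors(conf, unschedulable)
    if total <= 0:
        return False
    return unschedulable / total >= threshold and pending_secs > timeout


if __name__ == "__main__":
    conf = {
        "spark.adb.terminateOnPending.enabled": "true",
        "spark.adb.terminateOnPending.timeoutSec": "300",
        "spark.adb.terminateOnPending.executorRatio": "0.5",
        "spark.executor.instances": "10",
    }
    print(should_terminate(conf, unschedulable=5, pending_secs=312))
    print(should_terminate(conf, unschedulable=5, pending_secs=200))
    print(should_terminate(conf, unschedulable=4, pending_secs=312))
```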

Job behavior after termination

After termination is triggered, the job behaves as follows:

Job state: The SparkApplication first enters FailingState and then changes to FailedState.
Error message: status.appState.errorMessage contains structured fault diagnostic information.
Driver pod: The Driver pod is proactively deleted to release resources.

The following example shows an error message:

[ResourceQuotaExceeded] executors unschedulable: 5/10 (50%) for 5m12s; threshold ratio≥50%, age≥5m0s
denom=10 (spec.executor.instances), distinct reasons=1, oldest pod=my-job-exec-3 (pending since 2026-06-01T10:16:34Z)
scheduler reason: Unschedulable
scheduler message: 0/1279 nodes are available: quota not enough, quotaName: amv-x

The content within the square brackets at the beginning of the error message indicates the fault cause that is automatically classified by the system. The following table describes the meaning of each category and the recommended handling.

Fault category	Meaning	Recommended handling
`ECIPendingTimeout`	The ECI pod exceeds the maximum Pending duration.	Check the ECI instance type inventory, or switch to another ECI instance type.
`ResourceQuotaExceeded`	The resource quota is insufficient.	Contact the administrator to increase the quota, or reduce the job concurrency.
`InsufficientClusterResources`	The cluster CPU or memory is insufficient.	Wait for resources to be released, or scale out the cluster.
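
If an upper-layer scheduler needs to react to these categories, the bracketed prefix can be extracted from the error message. The following Python sketch assumes that messages follow the shape of the single example shown above. The message format is not a documented contract, and the function names are hypothetical.

```python
import re

# Recommended handling per fault category, copied from the preceding table.
RECOMMENDED_HANDLING = {
    "ECIPendingTimeout": "Check the ECI instance type inventory, or switch to another ECI instance type.",
    "ResourceQuotaExceeded": "Contact the administrator to increase the quota, or reduce the job concurrency.",
    "InsufficientClusterResources": "Wait for resources to be released, or scale out the cluster.",
}


def classify_watchdog_error(error_message):
    """Return (category, recommended handling), or None if no prefix is found."""
    match = re.match(r"\s*\[([A-Za-z]+)\]", error_message)
    if match is None:
        return None
    category = match.group(1)
    return category, RECOMMENDED_HANDLING.get(category)


def unschedulable_counts(error_message):
    """Return (unschedulable, expected total) parsed from the message, if present."""
    match = re.search(r"executors unschedulable: (\d+)/(\d+)", error_message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    sample = ("[ResourceQuotaExceeded] executors unschedulable: 5/10 (50%) "
              "for 5m12s; threshold ratio≥50%, age≥5m0s")
    print(classify_watchdog_error(sample))
    print(unschedulable_counts(sample))
```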

Note

The Watchdog only marks the job as failed. Whether the job is retried afterward is controlled by the restartPolicy of the job, which is consistent with the behavior when a job fails normally. For more information about application retries, see the Configure application retries section in this topic.

Specify a runtime environment for Spark jobs

The following table describes the configuration parameters that are required when you use the virtual environments technology to package a Python environment and submit Spark jobs.

Parameter	Required	Default value	Description
spark.pyspark.python	No	N/A	The path of the Python interpreter on your on-premises device.

Specify the Spark version

Parameter	Required	Default value	Description
spark.adb.version		3.2	The Spark version. Valid values:

High-performance vectorized execution engine

Parameter	Required	Default value	Description
spark.adb.native.enabled	No	false	Specifies whether to enable the high-performance vectorized execution engine to run jobs. The engine is built into AnalyticDB for MySQL Spark and is fully compatible with open source Spark. You can enable it without modifying your existing code.

Lake storage acceleration

Parameter	Required	Default value	Description
spark.adb.lakecache.enabled	No	false	Specifies whether to enable LakeCache (lake storage acceleration).

Configuration parameters not supported by AnalyticDB for MySQL

AnalyticDB for MySQL Spark does not support the following configuration parameters of Apache Spark. These parameters do not take effect on AnalyticDB for MySQL Spark.

Useless options(these options will be ignored):
  --deploy-mode
  --master
  --packages, please use `--jars` instead
  --exclude-packages
  --proxy-user
  --repositories
  --keytab
  --principal
  --queue
  --total-executor-cores
  --driver-library-path
  --driver-class-path
  --supervise
  -S,--silent
  -i <filename>