
Lindorm:Configure parameters for jobs

Last Updated: Mar 28, 2026

Lindorm Distributed Processing System (LDPS) runs Spark jobs on Kubernetes-managed elastic resource pools. This page lists all configurable parameters for LDPS Spark jobs and explains how to pass them for each submission method.

Spark parameters

Restricted parameters

The following parameters are set by the system and cannot be customized.

| Parameter | Description |
| --- | --- |
| spark.master | Endpoint of the cluster management system. |
| spark.submit.deployMode | Deployment mode of the Spark driver. |

Resource parameters

LDPS runs on elastic resource pools billed on a pay-as-you-go basis. By default, there is no upper limit on the resources a job can request. To set a maximum, see Modify the configurations of LDPS.

Resource parameters apply to all JDBC, JAR, and Python jobs submitted to LDPS. They are divided into specification parameters and capacity parameters.

Specification parameters

Basic specification parameters

| Parameter | Description | Default |
| --- | --- | --- |
| spark.driver.memory | Heap memory of the driver. Unit: mebibytes. | 8192m |
| spark.driver.memoryOverhead | Off-heap memory of the driver. Unit: mebibytes. | 8192m |
| spark.kubernetes.driver.disk.size | Local disk size of the driver. Unit: GB. | 50 |
| spark.executor.cores | CPU cores per executor. | 4 |
| spark.executor.memory | Heap memory per executor. Unit: mebibytes. | 8192m |
| spark.executor.memoryOverhead | Off-heap memory per executor. Unit: mebibytes. | 8192m |
| spark.kubernetes.executor.disk.size | Local disk size per executor. Unit: GB. | 50 |

Advanced specification parameters

| Parameter | Description | Default |
| --- | --- | --- |
| spark.{driver/executor}.resourceTag | Predefined resource specification set. When set, LDPS automatically applies the corresponding CPU, memory, and disk values. Valid values: xlarge, 2xlarge, 4xlarge, 8xlarge, 16xlarge. | None |
| spark.kubernetes.{driver/executor}.ecsModelPreference | Preferred compute node models, listed in priority order. LDPS tries each model in sequence; if all are unavailable, it selects an available model that matches the resource specification. Specify up to four models, separated by commas. Example: hfg6,g6. | None |
| spark.kubernetes.{driver/executor}.annotation.k8s.aliyun.com/eci-use-specs | GPU specification of the Elastic Container Instance (ECI). For supported GPU instance types, see Specify ECS instance types to create pods. | ecs.gn7i-c8g1.2xlarge |
| spark.{driver/executor}.resource.gpu.vendor | GPU vendor. Must match the GPU specification set in eci-use-specs. | nvidia.com |
| spark.{driver/executor}.resource.gpu.amount | Number of GPUs. Set to 1. | 1 |
| spark.{driver/executor}.resource.gpu.discoveryScript | Path to the GPU discovery script used to identify and bind GPU resources at driver or executor startup. Set to /opt/spark/examples/src/main/scripts/getGpusResources.sh. | /opt/spark/examples/src/main/scripts/getGpusResources.sh |
| spark.kubernetes.executor.annotation.k8s.aliyun.com/eci-use-specs | Executor instance specification with expanded local disk capacity. See the supported values below. | None |
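The GPU-related parameters above are set as a group. A minimal executor-side sketch, using the documented default values:

```properties
# GPU instance type for the ECI-backed executor (documented default)
spark.kubernetes.executor.annotation.k8s.aliyun.com/eci-use-specs=ecs.gn7i-c8g1.2xlarge
# GPU vendor; must match the instance type set above
spark.executor.resource.gpu.vendor=nvidia.com
spark.executor.resource.gpu.amount=1
spark.executor.resource.gpu.discoveryScript=/opt/spark/examples/src/main/scripts/getGpusResources.sh
```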

The spark.kubernetes.executor.annotation.k8s.aliyun.com/eci-use-specs parameter supports the following executor specifications:

| Value | CPU cores | Memory |
| --- | --- | --- |
| ecs.d1ne.2xlarge | 8 | 32 GB |
| ecs.d1ne.4xlarge | 16 | 64 GB |
| ecs.d1ne.6xlarge | 24 | 96 GB |
| ecs.d1ne.8xlarge | 32 | 128 GB |
| ecs.d1ne.14xlarge | 56 | 224 GB |
Important

When using spark.kubernetes.executor.annotation.k8s.aliyun.com/eci-use-specs, also set the following two parameters:

  • spark.kubernetes.executor.volumes.emptyDir.spark-local-dir-1.mount.path=/var

  • spark.kubernetes.executor.volumes.emptyDir.spark-local-dir-1.options.medium=LocalRaid0

If the specified executor instance type is unavailable, contact Lindorm technical support (DingTalk ID: s0s3eg3).
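For example, to run executors on ecs.d1ne.2xlarge instances with expanded local disk capacity, the annotation and the two required volume parameters are set together. A sketch using the values documented above:

```properties
# Executor instance type with expanded local disk
spark.kubernetes.executor.annotation.k8s.aliyun.com/eci-use-specs=ecs.d1ne.2xlarge
# Required companion parameters for the expanded local disk
spark.kubernetes.executor.volumes.emptyDir.spark-local-dir-1.mount.path=/var
spark.kubernetes.executor.volumes.emptyDir.spark-local-dir-1.options.medium=LocalRaid0
```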

resourceTag specification mapping

| resourceTag value | spark.{driver/executor}.cores | spark.{driver/executor}.memory | spark.{driver/executor}.memoryOverhead | spark.kubernetes.{driver/executor}.disk.size |
| --- | --- | --- | --- | --- |
| xlarge | 4 | 8192m | 8192m | 50 GB |
| 2xlarge | 8 | 16384m | 16384m | 100 GB |
| 4xlarge | 16 | 32768m | 32768m | 200 GB |
| 8xlarge | 32 | 65536m | 65536m | 400 GB |
| 16xlarge | 64 | 131072m | 131072m | 400 GB |
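Per the mapping above, setting a single resourceTag is equivalent to setting each specification parameter individually. For example, the following two configurations request the same executor resources:

```properties
# Using the predefined specification set
spark.executor.resourceTag=2xlarge

# Equivalent explicit configuration (from the 2xlarge row)
spark.executor.cores=8
spark.executor.memory=16384m
spark.executor.memoryOverhead=16384m
spark.kubernetes.executor.disk.size=100
```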

Capacity parameters

| Parameter | Description | Default |
| --- | --- | --- |
| spark.executor.instances | Number of executors allocated for the job. | 2 |
| spark.dynamicAllocation.enabled | Enables dynamic resource allocation. When enabled, LDPS automatically requests and releases executors based on real-time job workload. Valid values: true, false. | true |
| spark.dynamicAllocation.minExecutors | Minimum number of executors when dynamic resource allocation is enabled. | 0 |
| spark.dynamicAllocation.maxExecutors | Maximum number of executors when dynamic resource allocation is enabled. This value caps the number of tasks that can run concurrently. | Infinity |
| spark.dynamicAllocation.executorIdleTimeout | How long an idle executor is kept before being released. Unit: seconds. | 600s |
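For example, to bound a job's elasticity instead of relying on the unlimited default, you might cap dynamic allocation like this (the numeric values are illustrative, not recommendations):

```properties
spark.dynamicAllocation.enabled=true
# Keep at least one executor warm; never scale beyond 16
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=16
# Release idle executors after 5 minutes instead of the default 600s
spark.dynamicAllocation.executorIdleTimeout=300s
```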

Execution parameters

| Parameter | Description | Default |
| --- | --- | --- |
| spark.speculation | Enables speculative execution. When enabled, the driver re-submits tasks that are running significantly slower than other tasks in the same stage (long tails), to avoid job delays. Valid values: true, false. | true |
| spark.task.maxFailures | Maximum number of task failures allowed before the job fails. | 4 |
| spark.dfsLog.executor.enabled | Stores executor logs to LindormDFS. Set to false for large-scale jobs to reduce DFS load from log streams. Valid values: true, false. | true |
| spark.jars | Path to the JAR package required for the job. Accepts OSS or Hadoop Distributed File System (HDFS) paths. If you use JDBC, set this to an HDFS path only. If you use an OSS path, also configure the OSS parameters below. | None |
| spark.hadoop.fs.oss.endpoint | OSS endpoint. For endpoint values by region, see Regions and endpoints. | None |
| spark.hadoop.fs.oss.accessKeyId | AccessKey ID of your Alibaba Cloud account or RAM user. See Obtain an AccessKey pair. | None |
| spark.hadoop.fs.oss.accessKeySecret | AccessKey secret of your Alibaba Cloud account or RAM user. See Obtain an AccessKey pair. | None |
| spark.hadoop.fs.oss.impl | File system implementation class for OSS. Set to org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem. | None |
| spark.default.parallelism | Default parallelism for non-SQL tasks, including data source reads and shuffle stages. | None |
| spark.sql.shuffle.partitions | Number of shuffle partitions for SQL tasks. | 200 |
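For example, to load a job JAR from OSS, set spark.jars together with the four OSS access parameters. The bucket name, endpoint, and AccessKey values below are placeholders:

```properties
spark.jars=oss://my-bucket/jars/my-job.jar
# Placeholder endpoint; look up the value for your region
spark.hadoop.fs.oss.endpoint=oss-cn-hangzhou-internal.aliyuncs.com
spark.hadoop.fs.oss.accessKeyId=<yourAccessKeyId>
spark.hadoop.fs.oss.accessKeySecret=<yourAccessKeySecret>
spark.hadoop.fs.oss.impl=org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem
```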

Monitoring parameters

| Parameter | Description | Default |
| --- | --- | --- |
| spark.monitor.cmd | Monitoring commands to run at regular intervals. Separate multiple commands with semicolons (;). Results are written to job logs. | None |
| spark.monitor.interval | Interval between monitoring command executions. Unit: seconds. | 60 |
| spark.monitor.timeout | Timeout for each monitoring command. If a command exceeds this limit, it is skipped and the next command runs. Unit: seconds. | 2 |

Note: spark.monitor.cmd cannot be configured when submitting jobs via Beeline or JDBC.

Example monitoring commands:

# Single command
"spark.monitor.cmd": "top -b -n 1"

# Multiple commands
"spark.monitor.cmd": "top -b -n 1; vmstat; free -m; iostat -d -x -c -k; df -h; sar -n DEV 1 1; netstat"

Common monitoring commands by category:

| Category | Commands |
| --- | --- |
| System status | top -b -n 1, vmstat |
| Memory | free -m |
| Disk I/O | iostat -d -x -c -k |
| Disk usage | df -h |
| Network | sar -n DEV 1 1, netstat |

Log parameters

| Parameter | Description | Default |
| --- | --- | --- |
| spark.log.level | Log output level for the job. | INFO |

Valid values for spark.log.level, from most to least verbose:

| Level | Description |
| --- | --- |
| ALL | All log output, including the most granular debug information. |
| TRACE | More detailed than DEBUG; records fine-grained execution steps. |
| DEBUG | Debug-level logs including detailed runtime status. |
| INFO | General informational logs for normal execution events. |
| WARN | Warnings about potential issues that do not stop execution. |
| ERROR | Errors that occur during execution. |
| FATAL | Critical errors that prevent the program from continuing. |
| OFF | Disables all log output. |

Open-source Spark parameters

For parameters inherited from open-source Spark, see Spark configuration.

Configure parameters by submission method

When you submit a job to LDPS, the method you use determines how parameters are passed.

Beeline

Edit conf/beeline.conf in the Spark package directory where the Beeline command line tool is located.

# LDPS endpoint
# Format: jdbc:hive2://<host>:<port>/;?token=<token>
endpoint=jdbc:hive2://ld-bp13ez23egd123****-proxy-ldps-pub.lindorm.aliyuncs.com:10009/;?token=jfjwi2453-fe39-cmkfe-afc9-01eek2j5****

# Connection credentials (default: root/root)
user=root
password=root

# Set to false to use a dedicated Spark session for this connection
shareResource=false

# Spark parameters
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.minExecutors=3

For the full Beeline setup, see Getting started.

JDBC

Append Spark parameters to the JDBC connection string as key-value pairs after the token.

jdbc:hive2://<host>:<port>/;?token=<token>;spark.executor.memory=8g;spark.sql.shuffle.partitions=2

For the full JDBC URL format, see Use JDBC in application development.

spark.jars cannot be set to an OSS path when submitting jobs via JDBC. Use an HDFS path instead.

JAR jobs

Configure parameters in the job template when submitting a Java job:
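The original page showed the console's job editor here. As a hedged sketch only (the field names below are illustrative, not the exact LDPS template schema; consult the console's job editor for the authoritative format), a JAR job template typically bundles the entry point with the Spark parameters described on this page:

```json
{
  "mainClass": "com.example.MySparkJob",
  "mainResource": "hdfs:///ldps/jars/my-spark-job.jar",
  "args": ["--date", "2026-03-28"],
  "conf": {
    "spark.executor.instances": "4",
    "spark.executor.memory": "8192m",
    "spark.dynamicAllocation.enabled": "true"
  }
}
```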

Python jobs

Configure parameters in the job template when submitting a Python job:
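The original page showed the console's job editor here. As a hedged sketch only (again, the field names are illustrative, not the exact LDPS template schema), a Python job template points mainResource at the script and passes Spark parameters, including the OSS access parameters when the script is stored in OSS; the bucket, endpoint, and AccessKey values are placeholders:

```json
{
  "mainResource": "oss://my-bucket/scripts/etl_job.py",
  "conf": {
    "spark.hadoop.fs.oss.endpoint": "oss-cn-hangzhou-internal.aliyuncs.com",
    "spark.hadoop.fs.oss.accessKeyId": "<yourAccessKeyId>",
    "spark.hadoop.fs.oss.accessKeySecret": "<yourAccessKeySecret>",
    "spark.hadoop.fs.oss.impl": "org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem",
    "spark.sql.shuffle.partitions": "200"
  }
}
```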