Instructions for setting spark-submit parameters

Last Updated: Jun 23, 2017

This section introduces how to set spark-submit parameters in E-MapReduce.

Cluster configuration

Software configuration

E-MapReduce version 1.1.0

  • Hadoop 2.6.0

  • Spark 1.6.0

Hardware configuration

Master node

  • 8-core 16G and 500G ultra cloud disk

  • 1 unit

Worker node x 10

  • 8-core 16G and 500G ultra cloud disk

  • 10 units

Total: 8-core 16G (Worker) x 10 + 8-core 16G (Master)

Note: As only CPU and memory resources are calculated when a job is submitted, the disk size is not counted in the total resources.

Total available resources in yarn: 12-core 12.8G (worker) x 10

Note: By default, available cores in yarn = machine cores x 1.5; available memory in yarn = machine memory x 0.8.
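As a quick check, applying those defaults to the 8-core 16G worker nodes of this cluster gives the numbers above:

    per worker node:   8 cores x 1.5 = 12 cores, 16G x 0.8 = 12.8G
    across 10 workers: 120 cores and 128G of memory available in yarn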

Submit job

After a cluster is created, you can submit a job. However, you first need to create a job in E-MapReduce. The configuration is as follows:

    --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 4g --num-executors 2 --executor-memory 2g --executor-cores 2 /opt/apps/spark-1.6.0-bin-hadoop2.6/lib/spark-examples*.jar 10

The preceding job configuration directly uses the Spark example package, so you do not need to upload your own JAR package.
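If you run the same example directly from a shell on the master node instead, the equivalent spark-submit command looks roughly as follows (a sketch using the same parameters and JAR path as above):

    spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn --deploy-mode client \
      --driver-memory 4g --num-executors 2 \
      --executor-memory 2g --executor-cores 2 \
      /opt/apps/spark-1.6.0-bin-hadoop2.6/lib/spark-examples*.jar 10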

Parameters are described as follows:

Parameter | Reference Value | Description
--class | org.apache.spark.examples.SparkPi | The main class of the job.
--master | yarn | E-MapReduce only supports the Yarn mode.
--master | yarn-client | Equivalent to --master yarn --deploy-mode client; you do not need to specify deploy-mode separately.
--master | yarn-cluster | Equivalent to --master yarn --deploy-mode cluster; you do not need to specify deploy-mode separately.
--deploy-mode | client | In client mode, the driver program of the job runs on the master node.
--deploy-mode | cluster | In cluster mode, the AM (which hosts the driver) is started and runs on a random worker node.
--driver-memory | 4g | The memory used by the driver. It cannot exceed the total memory of a single machine.
--num-executors | 2 | The number of executors to create.
--executor-memory | 2g | The memory of each executor. It cannot exceed the maximum memory available on a single machine.
--executor-cores | 2 | The number of concurrent threads per executor, that is, the number of tasks each executor can run at the same time.
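For example, according to the table above, the following two submissions are equivalent client-mode jobs; only the master/deploy-mode flags differ, and the remaining parameters are the ones from the example job:

    --master yarn --deploy-mode client --driver-memory 4g --num-executors 2 --executor-memory 2g --executor-cores 2 ...
    --master yarn-client --driver-memory 4g --num-executors 2 --executor-memory 2g --executor-cores 2 ...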

Resource calculation

The resources used by the jobs in different modes and settings are shown in the following table:

Calculation of resources in yarn-client mode

Node | Resource Type | Resources (calculated from the preceding example)
master | core | 1
master | mem | driver-memory = 4G
worker | core | num-executors * executor-cores = 4
worker | mem | num-executors * executor-memory = 4G
  • The primary program of the job (the driver program) runs on the master node. According to the job configuration, 4G of memory (specified by --driver-memory) is allocated to it (the actual usage may, of course, be less than 4G).

  • Two executors (specified by --num-executors) are started on the worker nodes, each allocated 2G of memory (specified by --executor-memory), and each executor runs at most 2 concurrent tasks (specified by --executor-cores).

Calculation of resources in yarn-cluster mode

Node | Resource Type | Resources (calculated from the preceding example)
master | - | Only a client program that synchronizes the job information runs here; it occupies very little memory.
worker | core | num-executors * executor-cores + spark.driver.cores = 5
worker | mem | num-executors * executor-memory + driver-memory = 8G

Note: The spark.driver.cores in this example is 1 by default. In actual scenarios, you can set it to a value greater than 1.
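For example, to give the driver more than one core in yarn-cluster mode, spark.driver.cores can be passed through --conf; the value 2 below is only illustrative:

    --class org.apache.spark.examples.SparkPi --master yarn-cluster --conf spark.driver.cores=2 --driver-memory 4g --num-executors 2 --executor-memory 2g --executor-cores 2 /opt/apps/spark-1.6.0-bin-hadoop2.6/lib/spark-examples*.jar 10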

Optimization of resource usage

Yarn-client mode

If you have a large job and want to use more resources of the cluster in the yarn-client mode, you can refer to the following configurations:

Note:

  • When Spark requests memory, it adds an overhead of 375M or 7% of the user-specified memory value, whichever is greater.
  • When Yarn allocates container memory, it rounds the value up to the next whole gigabyte; that is, container memory is allocated in multiples of 1G.

    --master yarn-client --driver-memory 5g --num-executors 20 --executor-memory 4g --executor-cores 4

According to the preceding resource calculation formula:

  • The resources for the master node are:

    • core: 1
    • mem: 6G (5G + 375M, and rounded up to 6G)
  • The resources for worker nodes are:

    • core: 20*4 = 80
    • mem: 20*5G (4G + 375M, and rounded up to 5G) = 100G

As shown, the total resources used do not exceed the total resources of the cluster. Following this principle, you can use many different configurations, such as:

    --master yarn-client --driver-memory 5g --num-executors 40 --executor-memory 1g --executor-cores 2
    --master yarn-client --driver-memory 5g --num-executors 15 --executor-memory 4g --executor-cores 4
    --master yarn-client --driver-memory 5g --num-executors 10 --executor-memory 9g --executor-cores 6
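Taking the last configuration as an example, a quick check against the overhead and rounding rules above shows that it still fits:

    driver (master):  5G + 375M, rounded up to 6G; 1 core
    each executor:    9G + 7% (about 0.63G, which is greater than 375M), rounded up to 10G
    workers in total: 10 x 10G = 100G of memory and 10 x 6 = 60 cores
    both stay within the 128G of memory and 120 cores available in yarn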

As long as the total resources calculated using the preceding formulas do not exceed the maximum resources of the cluster, the principle works. However, in real scenarios, the system, HDFS, and E-MapReduce services all occupy some core and memory resources. If jobs use up all the cores and memory, performance will suffer and the jobs may even fail.

Typically, the number of executor cores is set to match the core count of the cluster. If the value is set too large, the CPU switches frequently between tasks without bringing any significant performance benefit.

Yarn-cluster mode

In the yarn-cluster mode, the driver program runs on a worker node and consumes resources from the worker-node resource pool. If you want to use more resources of the cluster, refer to the following configuration:

    --master yarn-cluster --driver-memory 5g --num-executors 15 --executor-memory 4g --executor-cores 4
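Applying the yarn-cluster formulas and the same overhead and rounding rules to this configuration:

    worker cores:  15 x 4 + 1 (spark.driver.cores) = 61
    worker memory: 15 x (4G + 375M, rounded up to 5G) + (5G + 375M, rounded up to 6G) = 81G
    both fit within the 120 cores and 128G available in yarn; the master node only runs the lightweight client program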
  • If you set the executor memory to a very large value, pay attention to the GC overhead. We recommend that you keep each executor's memory at no more than 64G.

  • If you are executing an HDFS read/write job, we recommend that you set the concurrency in each executor to no more than 5 for reading/writing.

  • If you are executing an OSS read/write job, we recommend that you distribute the executors across different ECS instances so that the bandwidth of every ECS instance can be utilized. For example, if you have 10 ECS instances, you can set num-executors=10 with appropriate memory and concurrency (see the sketch after this list).

  • If your job uses non-thread-safe code, consider whether high concurrency will cause job exceptions when you set the number of executor cores. If so, we recommend that you set executor-cores=1.
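For the OSS scenario above, a possible starting point is one executor per ECS instance; the memory and core values below are illustrative only and should be tuned to the actual workload:

    --master yarn-cluster --driver-memory 4g --num-executors 10 --executor-memory 8g --executor-cores 4 ...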
