Elastic Compute Service:Deploy a Spark cluster on eRDMA-enhanced instances

Last Updated:Jul 04, 2024

Elastic Remote Direct Memory Access (eRDMA) allows you to process requests at ultra-low latency. This topic describes how to create a Spark cluster that contains eRDMA-enhanced Elastic Compute Service (ECS) instances as nodes and use Benchmark to test the load processing performance of the Spark cluster.

Background information

Benchmark is a performance benchmarking tool that is used to test load processing performance, including load execution time, transmission rate, throughput, and resource utilization.

Step 1: Make preparations

Before you test the load processing performance of a Spark cluster, make preparations to set up the environment that is required for the test. The preparations include preparing Hadoop and Spark machines, installing Hadoop, and installing and configuring eRDMA.

  1. Prepare a Hadoop environment. If Hadoop clusters already exist, skip this step.

    • Requirements on hardware and software environments

      Prepare the following Hadoop version, Spark version, and ECS instances:

      • Hadoop version: Hadoop 3.2.1.

      • Spark version: Spark 3.2.1.

      • ECS instances:

        • ECS instance type: See Overview.

        • Number of vCPUs per ECS instance: 16.

        • Number of ECS instances: Four. One ECS instance serves as the master node and the other three ECS instances serve as worker nodes in a Hadoop cluster.

    • Installation procedure

  2. Log on to the ECS instance that serves as the master node.

    For more information, see Connect to a Linux instance by using a password or key.

  3. Configure eRDMA.

    • Install the required drivers.

      For more information, see Configure eRDMA on an enterprise-level instance.
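As a quick sanity check after the driver installation, you can confirm that the eRDMA device is visible. This is a hedged sketch: it assumes the rdma-core utilities are installed, and eRDMA devices typically appear with names such as erdma_0.

```shell
# Hedged check: list RDMA devices after driver installation.
# Assumes the rdma-core utilities are available on the instance.
if command -v ibv_devices >/dev/null 2>&1; then
    devices=$(ibv_devices)
else
    devices="ibv_devices not found; install rdma-core to inspect RDMA devices"
fi
echo "$devices"
```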

    • Configure network settings.

      1. Run the following command to open the hosts file:

        vim /etc/hosts
      2. Press the I key to enter Insert mode and then add the following content to the file:

        192.168.201.83 poc-t5m0        master1
        192.168.201.84 poc-t5w0
        192.168.201.86 poc-t5w1
        192.168.201.85 poc-t5w2
        Note

        Replace the IP addresses with the IP addresses of actual eRDMA interfaces (ERIs).

      3. Press the Esc key to exit Insert mode. Enter :wq and press the Enter key to save and exit the file.
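After you save the file, you can confirm that the new entries resolve. The check below uses the example hostnames from the file above; substitute your actual hostnames.

```shell
# Verify that the hostnames added to /etc/hosts resolve to the ERI addresses.
# The hostnames below are the examples from the file above.
for host in poc-t5m0 poc-t5w0 poc-t5w1 poc-t5w2; do
    getent hosts "$host" || echo "$host: not resolvable yet"
done
```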

    • Configure Yet Another Resource Negotiator (YARN) settings.

      Note

      If the default network interface controller (NIC) of the ECS instance supports eRDMA, you do not need to configure YARN settings.

      1. Run the following commands in sequence to open the yarn-env.sh file:

        cd /opt/hadoop-3.2.1/etc/hadoop
        vim yarn-env.sh
      2. Press the I key to enter Insert mode and add the following content to the file:

        RDMA_IP=`ip addr show eth1 | grep "inet\b" | awk '{print $2}' | cut -d/ -f1`
        export YARN_NODEMANAGER_OPTS="-Dyarn.nodemanager.hostname=$RDMA_IP"
        Note

        Replace eth1 with an actual ERI name.

      3. Press the Esc key to exit Insert mode. Enter :wq and press the Enter key to save and exit the file.
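To see what the RDMA_IP pipeline above extracts, you can run it against a sample line of `ip addr show` output. The address below is illustrative; on a real instance the value comes from the ERI.

```shell
# Illustration of the extraction pipeline used for RDMA_IP above,
# applied to a sample `ip addr show` line (the address is an example).
sample='    inet 192.168.201.83/24 brd 192.168.201.255 scope global dynamic eth1'
RDMA_IP=$(echo "$sample" | grep "inet\b" | awk '{print $2}' | cut -d/ -f1)
echo "$RDMA_IP"
```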

    • Configure Spark.

      Note

      If the default NIC of the ECS instance supports eRDMA, you do not need to configure Spark.

      1. Run the following commands in sequence to open the spark-env.sh file:

        cd /opt/spark-3.2.1-bin-hadoop3.2/conf
        vim spark-env.sh
      2. Press the I key to enter Insert mode and add the following content to the file:

        export SPARK_LOCAL_IP=`/sbin/ip addr show eth1 | grep "inet\b" | awk '{print $2}' | cut -d/ -f1`
        Note

        Replace eth1 with an actual ERI name.

      3. Press the Esc key to exit Insert mode. Enter :wq and press the Enter key to save and exit the file.

  4. Run the following command to start Hadoop Distributed File System (HDFS) and YARN:

    $HADOOP_HOME/sbin/start-all.sh
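After the daemons start, a quick hedged check is to list the running Java processes. On the master node you would typically expect to see NameNode, SecondaryNameNode, and ResourceManager; on worker nodes, DataNode and NodeManager.

```shell
# List running Hadoop/YARN daemons with jps (assumes a JDK is on PATH).
if command -v jps >/dev/null 2>&1; then
    daemons=$(jps)
else
    daemons="jps not found; check that \$JAVA_HOME/bin is on PATH"
fi
echo "$daemons"
```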

Step 2: Download the Benchmark installation package

This section describes how to download the Benchmark installation package.

  1. Run the following command to download the Benchmark installation package:

    wget https://mracc-release.oss-cn-beijing.aliyuncs.com/erdma-spark/spark-erdma-jverbs.tar.gz
  2. Run the following command to decompress the spark-erdma-jverbs.tar.gz installation package:

    tar -zxvf spark-erdma-jverbs.tar.gz

    The following components are included in the installation package:

    • erdmalib: the native library that is required to run the spark-erdma plug-in. This library corresponds to the libdisni.so file.

    • plugin-sparkrdma: the plug-in and dependency library that support Spark RDMA, which correspond to the spark-eRDMA-1.0-for-spark-3.2.1.jar and disni-2.1-jar-with-dependencies.jar files.

Step 3: Run a Benchmark test

This section describes how to use Benchmark to test the load processing performance of the Spark cluster.

  1. Run the following commands to modify the IP routes:

    Note

    If the default NIC of your ECS instance supports eRDMA, skip this step.

    route del -net 192.168.201.0 netmask 255.255.255.0 metric 0 dev eth0 && \
    route add -net 192.168.201.0 netmask 255.255.255.0 metric 1000 dev eth0
    Note

    Replace the IP addresses with the gateway IP address of the actual ERI.
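You can confirm the change by inspecting the routing table. The snippet below assumes the example 192.168.201.0/24 subnet from the commands above; replace it with your actual subnet.

```shell
# Show the route entry for the example subnet; falls back to a message
# if no matching entry exists yet.
route_entry=$(ip route show 2>/dev/null | grep "192.168.201.0/24" || echo "no entry for 192.168.201.0/24 yet")
echo "$route_entry"
```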

  2. Configure Spark.

    1. Run the following command to open the spark-jverbs-erdma.conf configuration file:

      vim /opt/spark-3.2.1-bin-hadoop3.2/conf/spark-jverbs-erdma.conf
    2. Press the I key to enter Insert mode and modify the following content in the file:

      spark.master yarn
      spark.submit.deployMode client
      #driver
      spark.driver.cores 4
      spark.driver.memory 19g
      #executor
      spark.executor.instances 12
      spark.executor.memory 10g
      spark.executor.cores 4
      spark.executor.heartbeatInterval   60s
      #shuffle
      spark.task.maxFailures 4
      spark.default.parallelism 36
      spark.sql.shuffle.partitions 192
      spark.shuffle.compress            true
      spark.shuffle.spill.compress      true
      
      #other
      spark.network.timeout 3600
      spark.sql.broadcastTimeout 3600
      spark.eventLog.enabled             false
      spark.eventLog.dir                 hdfs://master1:9000/sparklogs
      spark.eventLog.compress            true
      spark.yarn.historyServer.address   master1:18080
      spark.serializer                  org.apache.spark.serializer.KryoSerializer
      
      #eRDMA
      spark.driver.extraLibraryPath   /path/erdmalib
      spark.executor.extraLibraryPath   /path/erdmalib
      spark.driver.extraClassPath       /path/spark-eRDMA-1.0-for-spark-3.2.1.jar:/path/disni-2.1-jar-with-dependencies.jar
      spark.executor.extraClassPath     /path/spark-eRDMA-1.0-for-spark-3.2.1.jar:/path/disni-2.1-jar-with-dependencies.jar
      spark.shuffle.manager org.apache.spark.shuffle.sort.RdmaShuffleManager
      spark.shuffle.sort.io.plugin.class org.apache.spark.shuffle.rdma.RdmaLocalDiskShuffleDataIO
      spark.shuffle.rdma.recvQueueDepth  128
      Note
      • To achieve a better acceleration ratio, you can set the spark.shuffle.compress parameter to false.

      • The preceding sample uses the Spark resource settings, such as the spark.executor.instances, spark.executor.memory, spark.executor.cores, and spark.sql.shuffle.partitions parameters, of an ECS instance that has 32 vCPUs and 128 GB of memory. Modify the Spark resource settings based on the actual cluster scale or instance specifications.

    3. Press the Esc key to exit Insert mode. Enter :wq and press the Enter key to save and exit the file.

  3. Run the following commands in sequence to generate data:

    cd /opt/spark-3.2.1-bin-hadoop3.2/conf
    spark-submit --properties-file /opt/spark-3.2.1-bin-hadoop3.2/conf/spark-normal.conf --class com.databricks.spark.sql.perf.tpcds.TPCDS_Bench_DataGen spark-sql-perf_2.12-0.5.1-SNAPSHOT.jar hdfs://master1:9000/tmp/tpcds_400 tpcds_400 400 parquet
    Note

    The value 400 specifies the amount of data to generate, in GB. Change the value based on the cluster scale.

  4. Run the following command to run a Benchmark test:

    spark-submit --properties-file /opt/spark-3.2.1-bin-hadoop3.2/conf/spark-jverbs-erdma.conf --class com.databricks.spark.sql.perf.tpcds.TPCDS_Bench_RunAllQuery spark-sql-perf_2.12-0.5.1-SNAPSHOT.jar all hdfs://master1:9000/tmp/tpcds_400 tpcds_400 /tmp/tpcds_400_result

    After the test is complete, you can view the load execution time of the Spark cluster in the test result.

    Note

    To compare performance, delete the spark-erdma plug-in configurations from the files in the Spark conf directory, or log on to another Spark cluster that does not support eRDMA, and run the same Benchmark test again. You can then compare the two test results to measure the performance difference between a Spark cluster that supports eRDMA and one that does not.
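One hypothetical way to produce the non-eRDMA baseline configuration is to copy the tuned file and filter out the eRDMA-specific keys. The sketch below runs on a small inline sample; in practice, point it at the real spark-jverbs-erdma.conf and write the result to a separate file.

```shell
# Sketch: strip the eRDMA-specific keys from a Spark conf so the same
# workload can be re-run without the plug-in. Demonstrated on an inline
# sample; replace the here-document with your real spark-jverbs-erdma.conf.
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
spark.master yarn
spark.shuffle.manager org.apache.spark.shuffle.sort.RdmaShuffleManager
spark.shuffle.rdma.recvQueueDepth 128
EOF
baseline=$(grep -vE 'Rdma|rdma|extraLibraryPath|extraClassPath' "$sample_conf")
echo "$baseline"
rm -f "$sample_conf"
```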