This topic describes how to use IMB and an MPI library to test the communication performance of an Elastic High Performance Computing (E-HPC) cluster.
Intel MPI Benchmarks (IMB) is a software application that is used to measure the performance of point-to-point and global communication operations in an HPC cluster for various message sizes.
Message Passing Interface (MPI) is a standardized and portable message-passing interface for parallel computing. MPI supports multiple programming languages and provides benefits such as high performance, concurrency, portability, and scalability.
Before you begin
Before the test, prepare an example file named IMB.dat on your computer. The file contains the runtime parameters of IMB. The following example shows the content of the file.
/opt/intel-mpi-benchmarks/2019/IMB-MPI1 -h    # View the parameter descriptions of IMB and the communication modes that IMB supports.
cd /home/<user>/<work_dir>    # Replace <user> with the actual username. The user must be a non-root user.

# Test the performance of ping-pong communication between two nodes and obtain the communication latency and bandwidth.
# -genv I_MPI_DEBUG specifies that the debug information of MPI is printed.
# -np specifies the total number of MPI processes.
# -ppn specifies the number of processes per node.
# -host specifies the task node list.
# -npmin specifies the minimum number of processes that can be run.
# -msglog specifies the range of message sizes.
/opt/intel/impi/2018.3.222/bin64/mpirun -genv I_MPI_DEBUG 5 -np 2 -ppn 1 -host <node0>,<node1> /opt/intel-mpi-benchmarks/2019/IMB-MPI1 pingpong

# Test the performance of all-reduce communication among N nodes. Two processes are started on each node to obtain the time consumed for various message sizes. Replace <node0> and similar parameters with the node names of your cluster.
/opt/intel/impi/2018.3.222/bin64/mpirun -genv I_MPI_DEBUG 5 -np <N*2> -ppn 2 -host <node0>,...,<nodeN> /opt/intel-mpi-benchmarks/2019/IMB-MPI1 -npmin 2 -msglog 19:21 allreduce

# Test the performance of all-to-all communication among N nodes. One process is started on each node to obtain the time consumed for various message sizes. Replace <node0> and similar parameters with the node names of your cluster.
/opt/intel/impi/2018.3.222/bin64/mpirun -genv I_MPI_DEBUG 5 -np <N> -ppn 1 -host <node0>,...,<nodeN> /opt/intel-mpi-benchmarks/2019/IMB-MPI1 -npmin 1 -msglog 15:17 alltoall
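The -msglog K1:K2 flag limits the benchmark to message sizes that are powers of two from 2^K1 through 2^K2 bytes (IMB typically also runs a 0-byte message). A minimal sketch of what a range such as -msglog 19:21 expands to:

```python
# Expand an IMB-style "-msglog K1:K2" range into the power-of-two
# message sizes (in bytes) that the benchmark will run.
def msglog_sizes(k1, k2):
    return [2 ** k for k in range(k1, k2 + 1)]

print(msglog_sizes(19, 21))  # message sizes for "-msglog 19:21"
print(msglog_sizes(15, 17))  # message sizes for "-msglog 15:17"
```

For example, -msglog 19:21 tests 512 KB, 1 MB, and 2 MB messages, which is why the all-reduce test above reports timings for only a few large message sizes instead of the full default sweep.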
Log on to the E-HPC console.
Create a cluster named MPI.
For more information, see Create a cluster. Set the following parameters:
Compute Node: Select an instance type with at least 8 vCPUs, such as ecs.c7.2xlarge.
Other Software: Install intel-mpi 2018 and intel-mpi-benchmarks 2019.
Note
You can also install the preceding software in an existing cluster. For more information, see Install software.
Create a sudo user named mpitest.
For more information, see Create a user.
Upload a job file.
In the left-side navigation pane, click Job.
Select a cluster from the Cluster drop-down list. Then, click Create Job.
On the Create Job page, choose Create File > Open Local File.
In the local directory of your computer, find the IMB.dat file, and click Open.
For more information, see Example file.
Create a job script and submit the job.
On the Create Job page, choose Create File > Template > pbs demo.
Configure the job, as shown in the following figure. Then, click OK to submit the job.
The following sample script shows how to configure the job file:
#!/bin/sh
#PBS -j oe
#PBS -l select=2:ncpus=8:mpiprocs=1    # The script runs on two compute nodes. Each node uses 8 vCPUs and one MPI process. In an actual test, configure the number of vCPUs based on your node configurations.
export MODULEPATH=/opt/ehpcmodulefiles/
module load intel-mpi/2018
module load intel-mpi-benchmarks/2019
echo "run at the beginning"
# In this example, the compute nodes compute000 and compute001 are selected for the test. Replace the names with the node names of your cluster.
/opt/intel/impi/2018.3.222/bin64/mpirun -genv I_MPI_DEBUG 5 -np 2 -ppn 1 -host compute000,compute001 /opt/intel-mpi-benchmarks/2019/IMB-MPI1 pingpong > IMB-pingpong
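The script above redirects the PingPong result into the IMB-pingpong file. IMB-MPI1 prints one table row per message size with columns #bytes, #repetitions, t[usec], and Mbytes/sec. The following sketch shows one way to pull the latency and bandwidth figures out of that output; the numeric values in the sample text are hypothetical placeholders, not measurements from a real cluster.

```python
# Minimal sketch: parse the result table that IMB-MPI1 PingPong prints.
# The sample values below are hypothetical, for illustration only.
sample_output = """\
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         1.75         0.00
            1         1000         1.80         0.53
      4194304           10       810.12      4933.95
"""

def parse_pingpong(text):
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row has exactly four fields and starts with an integer
        # message size; the header row starts with "#bytes" and is skipped.
        if len(parts) == 4 and parts[0].isdigit():
            rows.append({
                "bytes": int(parts[0]),
                "repetitions": int(parts[1]),
                "latency_usec": float(parts[2]),
                "bandwidth_mb_per_sec": float(parts[3]),
            })
    return rows

rows = parse_pingpong(sample_output)
print(rows[0]["latency_usec"])          # zero-byte (small-message) latency
print(rows[-1]["bandwidth_mb_per_sec"]) # bandwidth at the largest message size
```

The zero-byte row approximates the raw communication latency between the two nodes, while the largest-message row approximates the achievable point-to-point bandwidth.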
View the job result.
On the Cluster page, find the MPI cluster, and click Connect.
In the Connect panel, specify a username, password, and port number. Then, click Connect via SSH.
Run the following command to view the result of the IMB job:
The following figure shows the test result.