
Elastic High Performance Computing: Use WRF to perform meteorological simulation

Last Updated: Apr 29, 2025

This topic describes how to run Weather Research and Forecasting Model (WRF) in an Elastic High Performance Computing (E-HPC) cluster for meteorological simulation.

Background information

WRF is an open-source mesoscale numerical weather prediction system that is widely used for weather forecasting. It supports a wide range of atmospheric process studies and simulations, including historical data reconstruction and future weather forecasting, on a variety of computing platforms. For more information, visit the WRF official website.

To show how WRF works with E-HPC, this topic simulates meteorological activity at a 12-km resolution, using a 2019 winter storm in the United States as an example. The simulation analyzes key meteorological factors such as precipitation, temperature, and wind speed, and illustrates how they change and are distributed over a 12-hour period.

Preparations

  1. Use one of the following methods to create an E-HPC cluster:

    • Create a cluster by using a template. For more information, see Create a cluster by using a template.

      Important

      If you want to create a cluster by using a cluster template, you must modify the node configuration on the Compute Node and Queue page before you create the cluster.


    • Manually create a cluster. For more information, see Create a standard cluster.

    In this example, the following configurations are used for the cluster:

    • Series: Standard Edition

    • Deployment Mode: Public cloud cluster

    • Cluster Type: SLURM

    • Node configurations: One management node, one logon node, and four compute nodes with the following specifications:

      • Management node: Elastic Compute Service (ECS) instance of the ecs.c8ae.xlarge type with 4 vCPUs and 8 GiB of memory

      • Logon node: ECS instance of the ecs.c8ae.xlarge type with 4 vCPUs and 8 GiB of memory

      • Compute nodes: ECS instances of the ecs.c8ae.16xlarge type with 64 vCPUs and 128 GiB of memory

      • Compute nodes are interconnected over the eRDMA network.

    • Image: Alibaba Cloud Linux 2.1903 LTS 64-bit

  2. Create a cluster user. For more information, see Manage users.

    The user is used to log on to the cluster, compile software, and submit jobs. The following settings are used in this example:

    • Username: testuser

    • User group: sudo permissions group

  3. (Conditionally required) If you manually create the cluster, you must install the wrf-aocc software. If you create the cluster by using a template, skip this step. For more information, see Install and uninstall software in a cluster.

    Note

    When the wrf-aocc software is installed in the cluster, all of its dependencies are automatically installed as well.
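
    After the installation is complete, you can optionally check the shared software directory. This is only a quick check, and it assumes the installation path that the workload configuration in this topic links to later:

    ls /opt/ehpc_common_softwares/nwp/wrf-aocc/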

Step 1: Verify and configure the environment

  1. Log on to E-HPC Portal. For more information, see Log on to E-HPC Portal.

  2. Click the Workbench icon in the upper-right corner to connect to the cluster by using Workbench.

  3. Run the following command to check whether the software is successfully installed and whether the required environment modules are available:

    module avail

    The output lists the available environment modules. Confirm that wrf-aocc/4.4.2 and the dependency modules that are loaded in the following steps are included.

  4. Configure the conus12km workload.

    1. Run the following commands to download and decompress the workload file:

      cd ~
      wget https://ehpc-perf.oss-cn-hangzhou.aliyuncs.com/yt710/WRFV4/input-data/v4.4_bench_conus12km.tar.gz
      
      tar -zxvf v4.4_bench_conus12km.tar.gz
      cd v4.4_bench_conus12km
      ln -s /opt/ehpc_common_softwares/nwp/wrf-aocc/4.4.2/aliyun/2/x86_64/run/* .
    2. Run the following commands to download and decompress the NCL package:

      cd ~
      wget https://ehpc-perf.oss-cn-hangzhou.aliyuncs.com/AMD-Genoa/WRFV4/ncl_draw.tar.gz
      tar -zxvf ncl_draw.tar.gz
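
    After both archives are decompressed, you can optionally confirm that the workload directory and the NCL directory exist. This is only a quick check. It assumes that the symbolic links created above resolve correctly and that the NCL archive is decompressed to the ~/ncl directory that is referenced in the next step:

    ls ~/v4.4_bench_conus12km/wrf.exe
    ls ~/ncl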
  5. Run the following command to modify the ~/.bashrc file:

    vim ~/.bashrc

    Add the following code to the file:

    module load aocc/4.0.0   aocl/4.0.1   gcc/12.3.0   hdf5/1.10.5   libfabric/1.16.0   mpich-aocc/4.0.3   netcdf/4.8.0   szip/2.0.0   wrf-aocc/4.4.2   zlib/1.2.13
    
    export NCARG_ROOT=~/ncl
    
    export PATH=$NCARG_ROOT/bin:$PATH
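
    Save the file, and then run the following commands to apply the changes to your current shell and confirm that the environment is loaded. This is a minimal check that assumes the module names listed above:

    source ~/.bashrc
    module list         # The modules loaded in ~/.bashrc should be listed.
    echo $NCARG_ROOT    # The path of the NCL directory, for example, /home/testuser/ncl.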
  6. Run the following command on the four compute nodes to install NCL and ImageMagick:

    Note

    You can send a remote command to all four nodes in the console for quick execution. For more information, see Send commands to nodes.

    sudo yum install -y ncl ImageMagick
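
    To confirm that the packages are installed, you can check the tool versions on a compute node, for example by sending the following commands in the same way. This is only a quick check:

    ncl -V              # Prints the NCL version.
    convert -version    # Prints the ImageMagick version.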
  7. Run the following commands to create a job execution file named png2gif.sh:

    cd ~
    vim png2gif.sh

    Sample script:

    #!/bin/sh
    
    # Print the current time and list the WRF output files in the working directory.
    date
    echo "------------get wrfout_d01 file-------------"
    ls -al wrfout_d01* | awk '{print $9}'
    echo "----------start converting png to gif-----------"
    
    # Parse the optional -out argument, which sets the name of the output GIF file.
    out_gif=''
    for i in "$@"; do
      case $i in
        -out)
          shift
          out_gif=$1
          break
          ;;
        *)
          echo "Other parameter: $i"
          ;;
      esac
    done
    
    # Run the surface.ncl plotting script once for each wrfout_d01 file to generate PNG images.
    WRFPath=`pwd`
    for file in `ls -al wrfout_d01* | awk '{print $9}'`; do
      export FILE_PATH="$WRFPath/$file"
      echo "-----------$FILE_PATH-----------"
      ncl surface.ncl
    done
    echo "----------surface.ncl file-----------"
    ls -al wrfout_d01*.png | awk '{print $9}'
    
    # Combine the PNG images into an animated GIF. Use the -out value if it was provided.
    echo "----------convert delay start-----------"
    if [ -n "$out_gif" ]; then
      convert -delay 50 wrfout_d01_2019-11-26*.000001.png "$out_gif"
    else
      convert -delay 50 wrfout_d01_2019-11-26*.000001.png wrfout_d01_2019-11-26.000001.gif
    fi
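
    The script loops over the wrfout_d01 files that wrf.exe produces, runs the surface.ncl plotting script once for each file to generate PNG images, and then uses ImageMagick convert to combine the images into an animated GIF. The -out option sets the name of the GIF file. The job command in Step 2 sources this script after the simulation finishes. If you want to regenerate the GIF manually afterwards, a sketch of the equivalent call, assuming the job's working directory, is the following:

    cd <job working directory>   # Replace with the actual working directory of the job.
    source ~/png2gif.sh -out 'test.gif'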

Step 2: Submit a job

After you verify and configure the environment, you must close the Workbench dialog box to continue with the following steps:

  1. In the top navigation bar, click Task Management.

  2. In the upper part of the page, click Submitter.

  3. On the Create Job page, specify the parameters. The following tables describe the parameters:

    Note

    Retain the default settings for other parameters.

    • Basic parameters

      • Job Name: wrf-conus12
        The name of the job. If job files are automatically downloaded and decompressed, the decompression directory uses the name of the job.

      • Output File: variable name out, variable value test.gif
        The output file of the job.

      • Queue: comp
        The queue in which you want to run the job. If you added compute nodes to a queue when you created the cluster, submit the job to that queue. Otherwise, the job fails to run. If you did not add compute nodes to a queue, the job is submitted to the default queue of the scheduler.

      • Command: Edit Online
        The job execution command that you want to submit to the scheduler. You can enter a command or the relative path of a script file. E-HPC Portal supports the following methods: Edit Online, Use Local File, and Use Uploaded File.

      • Number of Nodes: 4
        The number of compute nodes that are used to run the job.

      • Number of Tasks: 4
        The number of tasks (processes) that each compute node runs for the job.

      • Number of Threads: 2
        The number of threads that are used by each task. If you do not specify this parameter, the number of threads is 1. For how the node, task, and thread values map to Slurm resource options, see the sketch at the end of this section.

      Sample command:

      #!/bin/bash
      
      ln -s ~/v4.4_bench_conus12km/* .
      ln -s ~/ncl/* .
      ln -s ~/png2gif.sh .
      
      rm -rf rsl.error.????
      rm -rf rsl.out.????
      
      rm -rf wrfout*
      
      ulimit -s unlimited
      ulimit -l unlimited
      
      export WRF_NUM_TILES=16
      export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
      export OMP_STACKSIZE="16M"
      
      
      # mpi + omp
      mpirun   -genv FI_PROVIDER="verbs;ofi_rxm" -bind-to core:2 ./wrf.exe
      
      source png2gif.sh -out 'test.gif' 
    • Advanced parameters

      • MPI Profile: Enable
        Specifies whether to enable MPI performance profiling.

  4. Click Submit.

    After you complete the operation, you can see that the job is in the RUNNING status.

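For reference, the Number of Nodes, Number of Tasks, and Number of Threads values that you specify in the portal map to Slurm resource options roughly as follows. This is only a sketch of an equivalent batch header under the assumption of a standard Slurm setup; E-HPC Portal generates the actual submission for you.

    #!/bin/bash
    #SBATCH --job-name=wrf-conus12
    #SBATCH --partition=comp        # The queue that is selected in this example.
    #SBATCH --nodes=4               # Number of Nodes.
    #SBATCH --ntasks-per-node=4     # Number of Tasks per compute node.
    #SBATCH --cpus-per-task=2       # Number of Threads. The job command reads this value from SLURM_CPUS_PER_TASK.
    # The job command from the Edit Online sample above would follow here.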

Step 3: View the job details

  1. On the Task Management page, find the submitted job and click View in the Actions column.

  2. On the job details page, view the details of the job.

  3. When the job is in the COMPLETED status, click the Input and Output Files tab and select Output File from the drop-down list in the upper-right corner to view the running result.


    The result is an animated GIF (test.gif in this example) that shows how the simulated fields evolve over the 12-hour period.
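
If you also want to check the run from the command line, WRF writes one log file per MPI rank (the rsl.out.* and rsl.error.* files that the job command deletes before each run). A minimal check, assuming the job's working directory, is the following; a successful run usually ends with a "SUCCESS COMPLETE WRF" message:

    cd <job working directory>   # Replace with the actual working directory of the job.
    tail -n 2 rsl.out.0000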