Elastic High Performance Computing: Use the E-HPC console to submit a job

Last Updated: Mar 26, 2024

A job is the basic computing unit of Elastic High Performance Computing (E-HPC). A job consists of a shell script and executable files. Jobs are run in a sequence that is determined by the specified queues and scheduler. In the E-HPC console, you can submit a job, stop a job, or view the status of a job. This topic describes how to use the E-HPC console to submit a job.

Prerequisites

  • The cluster and cluster nodes are in the Running state.

  • A user is created. For more information, see Create a user.

  • Job files are ready to be imported. E-HPC allows you to import job files by using one of the following methods:

    • Before you submit a job, log on to the cluster and import the job files by using a remote transmission tool, such as rsync or the Secure Copy Protocol (SCP), as shown in the sketch after this list.

    • When you submit a job, import the job files stored in an Object Storage Service (OSS) bucket.

    • When you submit a job, import the job files stored in your local directory or select newly created job files.
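
    For example, the following minimal sketch copies job files to the cluster logon node before job submission. The local paths, the username test, and the IP address 203.0.113.10 are hypothetical; replace them with your cluster username and the public IP address of your logon node.

      # Copy a single job script to the cluster user's home directory by using SCP.
      scp /local/path/job.pbs test@203.0.113.10:/home/test/

      # Synchronize an entire directory of job files by using rsync.
      rsync -avz /local/path/jobfiles/ test@203.0.113.10:/home/test/jobfiles/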

Procedure

  1. Log on to the E-HPC console.

  2. In the upper-left corner of the top navigation bar, select a region.

  3. In the left-side navigation pane, choose Job and Performance Management > Job.

  4. On the Job page, select a cluster from the Cluster drop-down list.

  5. Click the Submit Job tab.

  6. On the Submit Job tab, configure the required parameters. The following list describes the key parameters. A sample job script that illustrates several of these parameters follows the list.

    • Job Template: The configured template based on which a job is submitted. For more information, see Manage a job template.

    • Job Name: The name of the job. If you want job files to be automatically downloaded and decompressed, name the job files after the job.

    • Command Line: The job execution command that you want to submit to the scheduler. You can enter a command or the path of the script file. Set this parameter based on the following scenarios:

      • If the script file is executable, enter its relative path, for example, ./job.pbs.

      • If the script file is not executable, enter the execution command, for example, /opt/mpi/bin/mpirun /home/test/job.pbs. If your scheduler is PBS, add two hyphens (--) before the command, for example, --/opt/mpi/bin/mpirun /home/test/job.pbs.

    • Queue: The queue to which the job is submitted. You must select a queue in which compute nodes reside. Otherwise, the job fails. If you did not add compute nodes to a queue when you created the cluster, the job is submitted to the default queue of the scheduler.

    • Task Quantity: The number of compute nodes that are used to run the job.

    • Number of Tasks: The number of tasks that each compute node uses to run the job, that is, the number of processes per node.

    • Maximum Memory: The maximum memory that a compute node can use when it runs the job. If you do not specify this parameter, the memory is unlimited.

    • Maximum Run Time: The maximum running time of the job. If the actual running time exceeds this value, the job fails. If you do not specify this parameter, the running time is unlimited.

    • Thread Quantity: The number of threads that each task uses. If you do not specify this parameter, the number of threads is 1.

    • GPU Quantity: The number of GPUs that a compute node uses to run the job. If you specify this parameter, make sure that the compute node is a GPU-accelerated instance.

    • Priority: The priority of the job. Valid values: 0 to 9. A greater value indicates a higher priority. If you specify that jobs are scheduled by priority when you set the cluster scheduling policy, jobs with a higher priority are scheduled and run first. You can set a high priority for the jobs that you want to run first.

    • Enable Job Array: Specifies whether to enable the job array feature of the scheduler. A job array is a collection of similar, independent jobs that you can use to customize a job execution rule. Format: X-Y:Z, where X is the minimum index value, Y is the maximum index value, and Z is the step size (default: 1). For example, 2-7:2 indicates that three jobs are run and that their index values are 2, 4, and 6.

    • Post-Processing Command: The command that is used to perform follow-up operations on the job results, for example, to package the generated job data or upload it to an OSS bucket.

    • Stdout Redirect Path and Stderr Redirect Path: The file paths to which stdout (standard output) and stderr (standard error) are redirected by using a Linux shell. Each path includes the output file name. Cluster users must have write permissions on the paths. If you do not specify these parameters, output files are generated based on the scheduler settings.

    • Variables: The runtime variables that are passed to the job. They can be accessed as environment variables in the executable file.
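
    For reference, the following minimal sketch shows what a job script submitted through the Command Line parameter might look like on a PBS-based cluster. The file names, paths, and the INPUT_FILE variable are hypothetical, and the resource settings come from the console parameters above, so the script contains only the workload. If Enable Job Array is turned on, PBS Pro exposes each member's index as the PBS_ARRAY_INDEX environment variable (PBS_ARRAYID on Torque).

      #!/bin/sh
      # INPUT_FILE is a hypothetical runtime variable passed through the
      # Variables parameter and read as an environment variable.
      echo "Array index: ${PBS_ARRAY_INDEX:-none}, input: ${INPUT_FILE}"

      # Run a hypothetical MPI program stored in the cluster user's home directory.
      /opt/mpi/bin/mpirun /home/test/solver "${INPUT_FILE}"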

  7. Upload the job files to the cluster.

    • Use the job files that are stored in an OSS bucket

      E-HPC allows you to import job files from an OSS bucket before you submit a job. For more information, see Import job files from an OSS bucket to a cluster. Alternatively, you can specify job files that are stored in an OSS bucket when you submit a job in the E-HPC console. To do so, perform the following steps:

      1. On the Use OSS Job file tab, click Select File. In the Select File dialog box, select job files and click OK.

      2. If you want to specify ZIP, TAR, or GZIP job files, you must turn on Decompression and specify a command to decompress them. Sample decompression commands follow this list.

        Note

        After you select job files from an OSS bucket, a folder that has the same name as the job files (for example, JobName) is automatically created in the /home/user directory. Then, the job files are downloaded and decompressed (if necessary) to the /home/user/JobName directory.

    • Edit job files

      1. Click the Edit Job Files tab.

      2. On the Edit Job Files tab, click Cluster File Browser. In the dialog box that appears, enter the cluster username and password to log on to the cluster by using Workbench. You can create, edit, or delete job files based on your needs.
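
    As referenced in the OSS method above, the decompression command depends on the archive format. The following sketches use hypothetical archive file names.

      # Decompress a gzip-compressed tarball:
      tar -xzvf jobfiles.tar.gz

      # Decompress a ZIP archive:
      unzip jobfiles.zip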

  8. Click Submit Job in the upper-right corner of the Submit Job tab. In the dialog box that appears, enter the cluster username and password. The job is submitted to the cluster. Then, E-HPC runs the job.

Results

After you submit a job, you can view it on the Job page.

Find the job and click Details in the Actions column. In the Job Details panel, you can view the job details, including the job name, job ID, start time, the time at which the job was last updated, and the job running information.