Configure scheduler settings

Last Updated: Nov 05, 2021

Elastic High Performance Computing (E-HPC) allows you to configure scheduler settings, including queue resource limits and job scheduling cycles, to improve resource usage. This topic describes how to configure scheduler settings for a cluster.

Background information

The scheduler of the cluster must be PBS, PBS 19, Slurm, Slurm 19, or Slurm 20.

The scheduler of a cluster schedules jobs based on job priorities and allocates compute node resources, including vCPUs, memory, and nodes, as needed. You can estimate the node resources and time that a job requires based on its workload, and then configure scheduler settings for the cluster accordingly.

Procedure

  1. Log on to the E-HPC console.

  2. In the top navigation bar, select a region.

  3. In the left-side navigation pane, choose Resource Management > Scheduler.

  4. Select a cluster from the Cluster drop-down list. Select a scheduler from the Scheduler drop-down list.

  5. Configure scheduler settings.

    • If the scheduler is Slurm, Slurm 19, or Slurm 20, set the following parameters. We recommend that you use the default scheduling cycle.

      • Main Scheduling Cycle: the interval at which the scheduler schedules pending jobs based on job priorities.

        For example, your cluster has 1 vCPU and a main scheduling cycle of 20 seconds. If you submit Job 1 and Job 2, each of which requires 1 vCPU and runs for 30 seconds, the jobs have the following statuses at the following points in time:

        • 0s: The scheduling starts. Job 1 starts running. Job 2 is pending.

        • 20s: The scheduling is triggered. Job 1 is still running. Job 2 is pending. This is because no resources are available for Job 2.

        • 30s: Job 1 is completed. Job 2 is still pending. This is because the next scheduling cycle has not been triggered yet, even though resources are now available for Job 2.

        • 40s: The scheduling is triggered again. Job 2 starts running.

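      • Backfill Scheduling Period: the interval at which backfill scheduling is triggered. During backfill scheduling, jobs that have a small load are started ahead of higher-priority jobs if idle resources are available for them, which improves CPU utilization.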

        For example, your cluster has 8 vCPUs and a backfill scheduling period of 10 seconds. If you submit Job 1 and Job 2, each of which requires 6 vCPUs and runs for 60 minutes, and then submit Job 3, which requires 2 vCPUs and runs for 40 minutes, the jobs have the following statuses at the following points in time:

        • 0s: The scheduling starts. Job 1 starts running. Job 2 and Job 3 are pending. Job 2 is pending because the remaining vCPUs are insufficient for it. Job 3 is pending because backfill scheduling has not been triggered yet, even though resources are available for Job 3.

        • 10s: Backfill scheduling is triggered. Job 3 starts running alongside Job 1 to improve CPU utilization. Job 2 is still pending.

        • 40 min 10s: Job 1 is running. Job 2 is pending. Job 3 is completed.

        • 60 min: Job 1 and Job 3 are completed. Job 2 starts running.

    • If the scheduler is PBS or PBS 19, perform the following steps:

      1. In the Scheduler Global Configuration section, set the following parameters:

        • Time Reserved for Historical Assignments: the period during which job data is retained. After the retention period elapses, the job data is deleted.

        • Scheduling Cycle: the interval between two consecutive scheduling tasks. Generally, scheduling is triggered once per scheduling cycle. Scheduling is also triggered when more jobs are submitted or scheduling is restarted.

      2. In the Scheduler Queue Configuration section, select a queue from the Queue drop-down list.

      3. In the Queue Resource Limits section, click New Restrictions. In the New Restrictions dialog box, set the following parameters:

        • User: the name of the user that runs the job.

        • CPU: the maximum number of vCPUs that can be used on the nodes in the queue.

        • Memory: the maximum memory of the compute nodes in a queue, for example, 1 GB or 200 MB. The unit is case-insensitive.

        • Node: the maximum number of nodes.

      4. In the Queue User Mapping section, click Add New User. In the Add New User dialog box, select a user and click OK.

        Notice

        After a user is selected, only the selected user can use the queue. If no user is selected, all users in the cluster can use the queue.

  6. In the upper-right corner of the Scheduling Settings page, click Submit.
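
The two Slurm timelines in step 5 can be reproduced with a small simulation. The following Python sketch is illustrative only and is not how Slurm or E-HPC implements scheduling. The names Job, reservation_time, and simulate are hypothetical. The sketch makes these assumptions: all jobs are submitted at 0s in priority order, the main pass starts jobs strictly in priority order, the first backfill pass runs one backfill period after 0s, and a backfilled job must finish before the first blocked job could start. For the backfill example, it also assumes a main scheduling cycle of 20 seconds and a Job 3 that needs 2 vCPUs, so that Job 3 fits next to Job 1 on 8 vCPUs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Job:
    name: str
    vcpus: int
    runtime: int                 # seconds
    start: Optional[int] = None  # filled in when the job starts

    def running_at(self, now: int) -> bool:
        return self.start is not None and self.start + self.runtime > now


def reservation_time(job: Job, jobs: List[Job], free: int, now: int) -> int:
    """Earliest time at which enough vCPUs are free for a blocked job."""
    running = sorted((j for j in jobs if j.running_at(now)),
                     key=lambda j: j.start + j.runtime)
    for r in running:
        free += r.vcpus
        if free >= job.vcpus:
            return r.start + r.runtime
    raise ValueError(f"{job.name} needs more vCPUs than the cluster has")


def simulate(total_vcpus: int, jobs: List[Job], main_cycle: int,
             backfill_cycle: Optional[int] = None,
             horizon: int = 7200) -> List[Tuple[int, str]]:
    """Return (start_time, job_name) pairs in the order the jobs start."""
    started: List[Tuple[int, str]] = []
    ticks = set(range(0, horizon + 1, main_cycle))
    if backfill_cycle:
        # Assumption: the first backfill pass runs one period after 0s.
        ticks |= set(range(backfill_cycle, horizon + 1, backfill_cycle))

    for now in sorted(ticks):
        pending = [j for j in jobs if j.start is None]
        if not pending:
            break
        free = total_vcpus - sum(j.vcpus for j in jobs if j.running_at(now))

        # Main pass: strict priority order; a blocked job blocks the rest.
        if now % main_cycle == 0:
            for job in pending:
                if job.vcpus > free:
                    break
                job.start = now
                free -= job.vcpus
                started.append((now, job.name))

        # Backfill pass: start later jobs that fit into idle vCPUs and
        # finish before the first blocked job could start.
        if backfill_cycle and now > 0 and now % backfill_cycle == 0:
            reserve_at: Optional[int] = None
            for job in [j for j in jobs if j.start is None]:
                fits = job.vcpus <= free
                if not fits:
                    if reserve_at is None:
                        reserve_at = reservation_time(job, jobs, free, now)
                    continue
                if reserve_at is None or now + job.runtime <= reserve_at:
                    job.start = now
                    free -= job.vcpus
                    started.append((now, job.name))
    return started


if __name__ == "__main__":
    # Main scheduling cycle example: 1 vCPU, 20-second cycle.
    print(simulate(1, [Job("Job 1", 1, 30), Job("Job 2", 1, 30)], main_cycle=20))
    # [(0, 'Job 1'), (40, 'Job 2')]

    # Backfill example: 8 vCPUs, 10-second backfill period.
    jobs = [Job("Job 1", 6, 3600), Job("Job 2", 6, 3600), Job("Job 3", 2, 2400)]
    print(simulate(8, jobs, main_cycle=20, backfill_cycle=10))
    # [(0, 'Job 1'), (10, 'Job 3'), (3600, 'Job 2')]
```

In the first run, Job 2 starts at 40s rather than 30s because no scheduling pass occurs between 20s and 40s. In the second run, Job 3 starts at 10s in the first backfill pass without delaying Job 2, which starts when Job 1 completes at 60 min.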

References

SetSchedulerInfo