Choosing a Job Queue

Summary

Most MSI systems use job queues to manage, efficiently and fairly, when computations are executed. A job queue is an automated waiting list for the use of a particular set of computational hardware. Jobs submitted to a job queue wait in line until the appropriate resources become available. Different job queues have different resources and limitations, so when submitting a job it is very important to choose a queue whose resources and limitations suit the particular calculation.

This document outlines factors to consider when choosing a job queue. It applies to all MSI systems and is best used in conjunction with the Queues page, which lists the resource limitations for each queue.

Please note that Mesabi's "widest" queue requires special permission to use. To request access, please submit your code for review to help@msi.umn.edu.

Guidelines

There are several important factors to consider when choosing a job queue for a specific program or custom script.  In most cases, jobs are submitted via PBS scripts as described in Job Submission and Scheduling. 

Overall System

Each MSI system contains job queues that manage sets of hardware with different resource and policy limitations. MSI currently has three primary systems: the supercomputer Mesabi, the supercomputer Itasca, and the Lab compute cluster. Mesabi is MSI's newest supercomputer, with the highest-performance hardware and a wide variety of queues suitable for many different job types; it should be your first choice when doing any computation at MSI. Itasca is a supercomputer with queues most suitable for multi-node jobs that will complete within 1-2 days. The Lab cluster is primarily for testing and for interactive software that is graphical in nature. Which system to choose depends largely on which system has queues appropriate for your software or script. The variety of queues on Mesabi will suit most users, but consult the Queues page to confirm.

Job Walltime (walltime=)

The job walltime is the time from the start to the finish of a job (as you would measure it using a clock on a wall), not including time spent waiting to run. This is in contrast to cputime, which measures the cumulative time all cores spent working on a job. Different job queues have different walltime limits, and it is important to choose a queue whose walltime limit is high enough for your job to complete. Jobs that exceed the requested walltime are killed by the system to make room for other jobs. Walltime limits are maximums only; you can always request a shorter walltime, which can reduce the time your job waits in the queue before starting. If you are unsure how much walltime your job will need, start with the queues that have shorter walltime limits and move to others only if needed.
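
As a rough illustration of the walltime guidance above, the following Python sketch formats a walltime request in the HH:MM:SS form and checks it against a queue limit. The function names are illustrative, and the 24-hour limit in the usage lines is a hypothetical value, not a figure taken from the Queues page.

```python
def format_walltime(hours):
    """Convert a walltime in hours to the HH:MM:SS form used by walltime=."""
    total_seconds = int(round(hours * 3600))
    h, remainder = divmod(total_seconds, 3600)
    m, s = divmod(remainder, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def fits_walltime(requested_hours, queue_limit_hours):
    """Return True if the requested walltime is within the queue's limit."""
    return 0 < requested_hours <= queue_limit_hours


# Hypothetical example: a job expected to run about 6.5 hours, padded to
# 8 hours, checked against an assumed queue limit of 24 hours.
request_hours = 8
print("walltime=" + format_walltime(request_hours))
print(fits_walltime(request_hours, 24))
```

Requesting a modest margin above the expected run time, rather than the queue maximum, keeps the job from being killed early while still keeping the request short.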

Job Nodes and Cores (nodes=X:ppn=Y)

Many calculations can use multiple cores (ppn) or, less often, multiple nodes to improve calculation speed. Certain job queues have maximum or minimum values for the number of nodes and cores a job may use. If Node Sharing is enabled for a queue, you can request fewer cores (ppn) than exist on an entire node. If Node Sharing is not enabled, you must request resources equivalent to a whole number of entire nodes. Node Sharing is not allowed on any Itasca queue, or on Mesabi's widest and large queues.
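
The Node Sharing rule can be sketched in Python as follows. This is a minimal illustration, not an MSI tool: the function name is hypothetical, and the value of 24 cores per node used in the example calls is an assumption that should be checked against the Queues page for the queue in question.

```python
import math


def node_request(cores_needed, cores_per_node, node_sharing):
    """Return a nodes=X:ppn=Y string covering the requested core count."""
    if cores_needed < 1:
        raise ValueError("cores_needed must be at least 1")
    if node_sharing and cores_needed <= cores_per_node:
        # Node Sharing enabled: the job may occupy only part of one node.
        return f"nodes=1:ppn={cores_needed}"
    # Otherwise round up to whole nodes and request every core on each.
    nodes = math.ceil(cores_needed / cores_per_node)
    return f"nodes={nodes}:ppn={cores_per_node}"


# Assumed 24 cores per node (check the Queues page for the real value).
print(node_request(4, 24, node_sharing=True))
print(node_request(4, 24, node_sharing=False))
print(node_request(30, 24, node_sharing=True))
```

Note that on a queue without Node Sharing, a 4-core job still occupies (and is charged for) an entire node, which is one reason to prefer a shared queue for small jobs.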

Job Memory (mem=)

The memory a job requires is an important factor when choosing a queue. The largest amount of memory (RAM) that can be requested for a job is limited by the memory on the hardware associated with that queue. Mesabi has two queues with high-memory hardware (ram256g and ram1t); the largest-memory hardware is available through the ram1t queue. Itasca also has two queues with high-memory hardware (sb128 and sb256).
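
One way to apply this is to pick the queue with the smallest memory limit that still fits the job, leaving the larger-memory hardware free for jobs that need it. The Python sketch below assumes this strategy; the function name is hypothetical, and the limits in the example dictionary are placeholders guessed from the queue names, not the actual requestable amounts listed on the Queues page.

```python
def smallest_queue_for_memory(mem_gb, queue_mem_limits_gb):
    """Return the queue with the smallest memory limit that fits mem_gb, or None."""
    candidates = [
        (limit, name)
        for name, limit in queue_mem_limits_gb.items()
        if limit >= mem_gb
    ]
    if not candidates:
        return None
    # min() compares the limit first, so the tightest fitting queue wins.
    return min(candidates)[1]


# Placeholder limits in GB, guessed from the queue names (assumption only).
mesabi_high_mem = {"ram256g": 256, "ram1t": 1024}

needed_gb = 200
print(smallest_queue_for_memory(needed_gb, mesabi_high_mem))
print(f"mem={needed_gb}gb")
```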

User and Group Limitations

To efficiently share resources, many queues have limits on the number of jobs or cores a particular user or group may simultaneously use.  If a workflow requires many jobs to complete, it can be helpful to choose queues which will allow many jobs to run simultaneously.  Mesabi allows more simultaneous jobs to run than Itasca.
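
To see why these limits matter for a large workflow, the following Python sketch estimates a lower bound on total run time. It assumes all jobs take the same time and ignores queue wait entirely; the function name and the job counts and limits in the example are hypothetical, not MSI policy values.

```python
import math


def minimum_workflow_hours(total_jobs, max_simultaneous_jobs, hours_per_job):
    """Lower bound on workflow time when only a fixed number of jobs run at once."""
    # Jobs run in "waves", each wave as large as the simultaneous-job limit.
    waves = math.ceil(total_jobs / max_simultaneous_jobs)
    return waves * hours_per_job


# Hypothetical workflow: 100 two-hour jobs under two different limits.
print(minimum_workflow_hours(100, 10, 2))
print(minimum_workflow_hours(100, 50, 2))
```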

Special Hardware

Some queues contain nodes with special hardware, GPU accelerators and solid-state scratch drives being the most common. If a calculation needs to use special hardware, then it is important to choose a queue with the correct hardware available. Furthermore, those queues may require additional resources to be specified (e.g., GPU nodes require ":gpus=X").
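
The resource options named in the headings above (walltime=, nodes=X:ppn=Y, mem=, and ":gpus=X") can be combined into a single request line. The Python sketch below builds such a line, assuming the Torque-style comma-separated "#PBS -l" form; the helper name and the values in the example call are illustrative, and the queue to submit to must still be chosen from the Queues page.

```python
def pbs_resource_line(walltime, nodes, ppn, gpus=0, mem_gb=None):
    """Build a '#PBS -l' line from walltime, node, GPU, and memory values."""
    node_spec = f"nodes={nodes}:ppn={ppn}"
    if gpus > 0:
        # GPU queues need the extra ":gpus=X" suffix on the node request.
        node_spec += f":gpus={gpus}"
    parts = [f"walltime={walltime}", node_spec]
    if mem_gb is not None:
        parts.append(f"mem={mem_gb}gb")
    return "#PBS -l " + ",".join(parts)


# Illustrative values only: one node, 24 cores, 2 GPUs, for one hour.
print(pbs_resource_line("01:00:00", nodes=1, ppn=24, gpus=2))
```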

Queue Congestion

At certain times, particular queues may become overloaded with submitted jobs. In such cases, it can be helpful to send jobs to queues with lower utilization (see node status), which can decrease wait time and improve throughput. Take care that the calculation still fits within the limitations of the queue you choose.
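
The selection logic described in this document can be summarized in a short Python sketch: first discard queues whose limits the job would exceed, then prefer the least-utilized queue among those that remain. Everything in the example data is hypothetical, including the queue names, limits, and utilization figures; real values come from the Queues page and the node status display.

```python
def pick_queue(job, queues):
    """Return the name of the least-utilized queue whose limits fit the job, or None."""
    def fits(q):
        return (job["walltime_hours"] <= q["max_walltime_hours"]
                and job["nodes"] <= q["max_nodes"]
                and job["mem_gb_per_node"] <= q["mem_gb_per_node"])

    eligible = [q for q in queues if fits(q)]
    if not eligible:
        return None
    return min(eligible, key=lambda q: q["utilization"])["name"]


# Hypothetical queues and utilization (fraction of nodes busy).
queues = [
    {"name": "queue_a", "max_walltime_hours": 24, "max_nodes": 10,
     "mem_gb_per_node": 60, "utilization": 0.95},
    {"name": "queue_b", "max_walltime_hours": 96, "max_nodes": 1,
     "mem_gb_per_node": 250, "utilization": 0.40},
    {"name": "queue_c", "max_walltime_hours": 24, "max_nodes": 50,
     "mem_gb_per_node": 60, "utilization": 0.70},
]
job = {"walltime_hours": 12, "nodes": 4, "mem_gb_per_node": 32}

print(pick_queue(job, queues))
```

In this example the least-utilized queue overall (queue_b) is skipped because it allows only one node, so the job goes to queue_c rather than the busier queue_a.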