Queues

MSI uses job scheduling queues to share MSI resources efficiently and fairly. The job queues on our systems manage different sets of hardware and have different limits for quantities such as walltime, available processors, and available memory. When submitting a calculation it is important to choose a queue whose hardware and resource limits suit the job.

Selecting a Queue

Each MSI system contains job queues managing sets of hardware with different resource and policy limitations. MSI currently has three primary systems: the supercomputer Mesabi, the Mesabi expansion Mangi, and the Lab compute cluster. Mesabi has high-performance hardware and a wide variety of queues suitable for many different job types. Mangi expands Mesabi and should be your first choice for submitting jobs. The Lab cluster is primarily for small single-node jobs, testing, and graphical interactive software. Which system to choose depends largely on which system has queues appropriate for your software or script. More information about selecting a queue and the different queue parameters can be found in the Selecting A Queue quick start guide.
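
On Mesabi, where jobs are submitted with PBS as in the examples later on this page, a queue can be selected explicitly with the -q option, either inside the job script or on the qsub command line. The queue name, script name, and resource values here are illustrative only:

#PBS -q small
#PBS -l nodes=1:ppn=4,walltime=2:00:00

or, equivalently, on the command line:

qsub -q small -l nodes=1:ppn=4,walltime=2:00:00 myjob.pbs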

Below is a summary of the available queues organized by system, and the associated queue limitations. The quantities listed are totals or upper limits.

Mangi: Mesabi Expansion 

Mangi is a Linux cluster with most nodes using AMD EPYC 7702 processors.

Queue name | Node Sharing | Max Nodes Per Job | Processor Cores per Node | Wallclock Limit | Total Node Memory Limit | Advised Per-core Memory Allocation | Local Scratch (GB/node) | Per User Limits | Service Unit (SU) Rate
amdsmall | Yes | 1 | 128 | 96 hours | 250 GB | 1950 MB | 429 | 50 jobs | 1.50 CPU hours/SU
amdlarge | No | 32 | 128 | 24 hours | 250 GB | 1950 MB | 429 | 2 jobs | 1.50 CPU hours/SU
amd2tb | Yes | 1 | 128 | 96 hours | 2000 GB | 15500 MB | 429 | 1 job | 0.79 CPU hours/SU
v100 (GPU nodes) | No | 6 | 24 | 24 hours | 375 GB | 3000 MB | 875 | 6 jobs | 0.19 CPU hours/SU
v100-4 (GPU nodes) | No | 2 | 24 | 24 hours | 375 GB | 3000 MB | 875 | 2 jobs | 0.10 CPU hours/SU
v100-8 (GPU nodes) | No | 1 | 24 | 24 hours | 375 GB | 3000 MB | 875 | 1 job | 0.06 CPU hours/SU
amd_or_intel | Yes | 1 | *(1) | 24 hours | *(1) | *(1) | *(1) | 50 jobs | *(1)

mangi: The mangi queue is a meta-queue that automatically routes jobs to amdsmall or amdlarge, according to where each job best fits based on its resource request.

Service Unit (SU) rate: see above.

(1)Note: The amd_or_intel queue can schedule on nodes that belong to the Mangi amdsmall and Mesabi small queues. Your job will be placed only on nodes that meet your resource request, so the values marked *(1) depend on which node type the job is assigned.
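
As a sketch only: assuming Mangi accepts the same PBS-style submission shown in the Mesabi examples below (consult the Selecting A Queue quick start guide for the exact syntax on Mangi), a single-node amdsmall job might be requested as follows; the script name and resource values are placeholders:

qsub -q amdsmall -l nodes=1:ppn=32,walltime=4:00:00 myjob.pbs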

Mesabi

Mesabi is an HP Linux cluster with most nodes using Intel Haswell E5-2680v3 processors.

Queue name | Node Sharing | Max Nodes Per Job | Min Nodes Per Job | Processor Cores per Node | Wallclock Limit | Total Node Memory Limit | Advised Per-core Memory Allocation | Local Scratch (GB/node) | Per User Limits | Per Group Limits | Service Unit (SU) Rate
small(1) | Yes | 9 | None | 24 | 96 hours | 62 GB | 2580 MB | 390 | 500 jobs | 1800 total cores(4) | 1.50 CPU hours/SU
large | No | 48 | 10 | 24 | 24 hours | 62 GB | 2580 MB | 390 | 4 jobs | 16 jobs | 1.50 CPU hours/SU
widest(5) | No | 360 | 49 | 24 | 24 hours | 62 GB | 2580 MB | 390 | 4 jobs | 16 jobs | 1.50 CPU hours/SU
max | Yes | 1 | None | 24 | 696 hours | 62 GB | 62 GB | 390 | 4 jobs | 16 jobs | 1.50 CPU hours/SU
ram256g | Yes | 2 | None | 24 | 96 hours | 251 GB | 10580 MB | 390 | 2 nodes | 1800 total cores(4) | 1.50 CPU hours/SU
ram1t | Yes | 2 | None | 32(3) | 96 hours | 998 GB | 31180 MB | 228 | 2 nodes | 1800 total cores(4) | 1.50 CPU hours/SU
k40 (GPU nodes)(2) | No | 40 | None | 24 | 24 hours | 125 GB | 5290 MB | 390 | None | 1800 total cores(4) | 1.50 CPU hours/SU
interactive | Yes | 1/2/4(6) | None | 4 | 12 hours | 998 GB | 249 GB | Shared, 228 or 390(7) | 1 job | NA | 1.50 CPU hours/SU

mesabi (default): The mesabi queue is a meta-queue that automatically routes jobs to the small, large, widest, or max queues, according to where each job best fits based on its resource request.

Service Unit (SU) rate: see above. 

(1)Note: Within the small queue there are 32 nodes with ~440 GB of local SSD space, accessible at "/scratch.ssd". Please remember that data stored in /scratch.ssd is deleted at the end of each job. To submit a job to an SSD node, add the "ssd" keyword to your PBS script or qsub command:

#PBS -l nodes=1:ssd:ppn=1,walltime=1:00:00

A list of these SSD-capable nodes can be generated using the command "pbsnodes :ssd".
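
As a sketch, a complete job script using an SSD node might look like the following. Only the ssd keyword and the /scratch.ssd path come from the note above; the queue choice, program name, file names, and resource values are placeholders:

#!/bin/bash
#PBS -q small
#PBS -l nodes=1:ssd:ppn=4,walltime=1:00:00

cd /scratch.ssd                        # node-local SSD scratch; contents are deleted when the job ends
cp $PBS_O_WORKDIR/input.dat .          # hypothetical input file staged from the submission directory
$PBS_O_WORKDIR/my_program input.dat    # hypothetical executable
cp results.dat $PBS_O_WORKDIR/         # copy results back before the job ends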

(2)Note: The k40 queue is for calculations that perform GPU computations. Each k40 node contains two NVIDIA K40m GPUs and is not shared. Jobs without a GPU resource request will be rejected. Reserve GPUs in this queue with the "gpus" keyword:

#PBS -l nodes=1:ppn=24:gpus=2,walltime=1:00:00
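
The same request can also be made on the qsub command line, selecting the k40 queue explicitly; the script name is a placeholder:

qsub -q k40 -l nodes=1:ppn=24:gpus=2,walltime=1:00:00 gpu_job.pbs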

(3)Note: The ram1t nodes contain Intel Ivy Bridge processors, which do not support all of the optimized instructions of the Haswell processors. Programs compiled using the Haswell instructions will only run on the Haswell processors.

(4)Note: The 1800-core limit applies collectively to all of a group's jobs in the small, ram256g, ram1t, and k40 queues. For example, a group simultaneously using 1798 cores in the small queue and 2 cores in the ram1t queue could run no further simultaneous jobs in the small, ram256g, ram1t, or k40 queues.

(5)Note: The widest queue requires special permission to use. Please submit your code for review by emailing help@msi.umn.edu.

(6)Note: The interactive queue has a limit of 4 processor cores per job. This can be 1 node with 4 cores, 2 nodes with 2 cores, or 4 nodes with 1 core.

(7)Note: Scratch on these nodes is shared, and total scratch available (228GB or 390GB) depends on which node the job lands on.
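
Interactive PBS sessions are started with qsub -I. As a minimal sketch that fits the limits above (the walltime value is illustrative):

qsub -I -q interactive -l nodes=1:ppn=4,walltime=2:00:00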

Mesabi Small Queue Characteristics

In the Mesabi small queue, the "nodes" requested by a job do not always correspond to separate physical nodes; instead, they are treated as groups of cores.

For example, if a job requests nodes=2:ppn=4, the scheduler will attempt to find 2 groups of 4 idle cores (8 cores total). The scheduler may assign this job 8 cores on a single physical node. For most calculations it does not matter whether the requested cores are on the same physical node or spread across several, but for some calculations it does.

The Mesabi small queue has node sharing enabled, so jobs in the Mesabi small queue that use a fraction of a node will very likely be sharing that node with other jobs. This can affect performance in some cases.
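
For example, the nodes=2:ppn=4 request discussed above is written as in the first directive below. As a sketch for jobs that are sensitive to node sharing, requesting all 24 cores of a Mesabi node (the second, alternative directive) leaves no capacity for other jobs on that node; the walltime values are placeholders:

# 8 cores total; may land on one or two physical nodes:
#PBS -l nodes=2:ppn=4,walltime=1:00:00

# Alternative: a full 24-core node, so no other job shares it:
#PBS -l nodes=1:ppn=24,walltime=1:00:00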

Lab Servers

The Lab servers are being decommissioned on Wednesday, February 5, 2020.

Queue name | Number of Nodes | Processor Cores per Node | Wallclock Limit | Total Node Memory Limit | Per-core Memory Limit | Local Scratch (GB/node) | Per User Running Jobs | Per User Idle Jobs (gaining priority in queue)
lab (default) | 1 node (up to 8 cores) | up to 8 | 24 hours | 251 GB | 15650 MB | 90 | 6§ | 8
lab-long | 1 node (up to 8 cores) | up to 8 | 150 hours | 251 GB | 15650 MB | 90 | 6§ | 8
lab-600 | 1 node (up to 8 cores) | up to 8 | 600 hours | 251 GB | 15650 MB | 90 | 1§ | 8

The Lab servers are intended for smaller jobs that need an interactive experience. The lab queues run on a mix of older hardware, with most systems using Intel Nehalem or Sandy Bridge processors. Calculations on the Lab servers do not consume Service Units (SUs). In some cases more nodes are physically present than listed here, but each job may request only a single node, so the table shows the submission limits for individual jobs.

§The total number of processors used by a group's running jobs is limited to 64.
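
As a sketch, assuming the Lab queues accept the same PBS-style submission used for Mesabi above (the script name and all resource values are placeholders), a batch job and an interactive session could be requested as:

qsub -q lab-long -l nodes=1:ppn=4,walltime=100:00:00 myjob.pbs

qsub -I -q lab -l nodes=1:ppn=2,walltime=4:00:00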