Slurm

Slurm Workload Manager is MSI's new Job Scheduler

What is Slurm?

Slurm is a best-in-class, highly scalable job scheduler for HPC clusters. It allocates resources to jobs, provides a framework for executing tasks on those resources, and arbitrates contention for resources by managing queues of pending work.

Why is MSI transitioning to the Slurm scheduler?

Slurm has become an industry standard for scheduling among HPC centers. It is an open-source scheduler with a plugin framework that allows us to leverage tools developed at other centers. It can stably manage a larger number of jobs than our current scheduler. Finally, its architecture opens opportunities to leverage technologies that will be useful for many areas of scientific computation.

How does the transition to Slurm impact my work on MSI systems?

The most obvious adjustment everyone will need to make is to learn a new set of commands for submitting jobs and checking on job status. If you have written scripts that depend on the job scheduler, they will need to be modified to match the syntax used in Slurm. This is also true of some software that MSI maintains.
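
As a rough illustration of the Slurm syntax, a minimal batch script might look like the sketch below. The job name, the resource values, and the file name hello.sh are placeholders chosen for this example, not MSI recommendations; see Getting Started Using Slurm for the settings that apply on MSI systems.

```shell
#!/bin/bash
# Lines starting with #SBATCH are read by Slurm as job options.
# All values below are illustrative placeholders.
#SBATCH --job-name=hello
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --mem=1G

# Slurm sets SLURM_JOB_ID inside a running job; fall back to a
# marker string so the script also runs outside the scheduler.
job_id="${SLURM_JOB_ID:-not-in-slurm}"
message="Hello from job ${job_id}"
echo "${message}"
```

Assuming the script is saved as hello.sh, it would be submitted with sbatch hello.sh, its status checked with squeue -u $USER, and the job cancelled with scancel followed by the job ID that sbatch reports.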
 
When you run jobs using Slurm, no SUs will be deducted from your SU allocation. Group job limits will change over the next couple of months as we migrate nodes from the other cluster. ESO customers received an email on October 15 containing important information about the transition of paid SUs and SU accounting.

Resources

MSI has put together resources to help groups get started with Slurm, and a recorded tutorial session on using Slurm is now available. See the links below for more information on Slurm topics and on how to get started:

Getting Started Using Slurm

Tutorial Materials

Other Slurm Documentation