How do I use a job array?

A job array is a collection of similar, independent jobs submitted together to one of the Linux cluster job schedulers using a single job template script. The advantage of a job array is that many similar jobs can be submitted with one script, and each job runs independently as soon as it obtains resources on the compute cluster. Job arrays can improve calculation throughput, especially for small independent calculations that can run "in between" larger calculations.

Job arrays are most easily used when the input and output files for the independent calculations can be numbered sequentially.

To use a job array, first create a job template script.  An example is shown below:

        #!/bin/bash -l
        #SBATCH --nodes=1
        #SBATCH --ntasks-per-node=1
        #SBATCH --mem=2g
        #SBATCH -t 1:00:00
        #SBATCH --mail-type=END 
        #
        cd ~/program_directory
        ./program.exe < input$SLURM_ARRAY_TASK_ID > output$SLURM_ARRAY_TASK_ID
    

In this example, the job script requests one hour of walltime, one compute core on a single node, and 2 GB of memory. The script runs an executable named program.exe and redirects the program's input and output to numbered files. If this script is saved to a file named jobtemplate.sh, it can be used to submit an array of ten jobs with the command:


        sbatch --array=1-10 jobtemplate.sh
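The --array option also accepts comma-separated lists, step sizes, and a concurrency limit. A few standard variations (shown with the same hypothetical jobtemplate.sh):

```shell
sbatch --array=1,3,7 jobtemplate.sh      # submit only tasks 1, 3, and 7
sbatch --array=1-10:2 jobtemplate.sh     # step size of 2: tasks 1, 3, 5, 7, 9
sbatch --array=1-100%10 jobtemplate.sh   # 100 tasks, at most 10 running at once
```

The %N limit is useful for being a good citizen on a shared cluster when submitting very large arrays.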

Within a job array, the value of SLURM_ARRAY_TASK_ID is set to the array task ID of each job. In this example, ten independent jobs will be submitted, with SLURM_ARRAY_TASK_ID values running from 1 to 10. This means that in each job, program.exe will read a numbered input file (input1, input2, etc.) and write to a correspondingly numbered output file (output1, output2, etc.). Each of the ten jobs will have 1 hour of walltime, 1 core, and 2 GB of memory.
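The mapping from task ID to file names can be previewed outside of Slurm. The sketch below sets SLURM_ARRAY_TASK_ID by hand (inside a real array job, Slurm sets it for you) and prints the files each task would touch:

```shell
#!/bin/bash
# Illustration only: Slurm normally sets SLURM_ARRAY_TASK_ID in each job's
# environment. Here we loop over a few task IDs to show the file names used.
for SLURM_ARRAY_TASK_ID in 1 2 10; do
    echo "task ${SLURM_ARRAY_TASK_ID}: reads input${SLURM_ARRAY_TASK_ID}, writes output${SLURM_ARRAY_TASK_ID}"
done
```

This is also a quick way to verify a more complicated file-naming scheme before submitting hundreds of tasks.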

The jobs will run independently as they find available resources on the compute cluster. 
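Once submitted, the array can be monitored and managed with the usual Slurm commands; each task appears under the parent job ID as JOBID_TASKID (the job ID 12345 below is hypothetical):

```shell
squeue -u $USER      # array tasks appear as JOBID_TASKID, e.g. 12345_1, 12345_2
scancel 12345_4      # cancel only task 4 of the array
scancel 12345        # cancel the entire array
```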

Category: 
jobs