How do I use a job array?

A job array is a collection of similar, independent jobs that are submitted together to one of the Linux cluster job schedulers using a single job template script. Each job in the array runs independently as soon as it can obtain resources on the compute cluster. Job arrays can improve calculation throughput, especially for small independent calculations that can run in the gaps between larger calculations.

Job arrays are most easily used if the input and output files for the independent calculations can be numbered sequentially.
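As an illustration of sequential numbering, the following is a minimal sketch of how numbered input files might be prepared before submitting an array. The parameter values and the "parameter = ..." line format are hypothetical; only the input1, input2, ... naming pattern matters.

```shell
#!/bin/bash
# Sketch only: the parameter values and the "parameter = ..." line format are
# hypothetical; write whatever your program actually reads from its input.
workdir=$(mktemp -d)          # scratch directory so nothing is overwritten
params=(0.1 0.2 0.5 1.0 2.0)  # one value per array job (made-up values)

i=1
for p in "${params[@]}"; do
    # File names are input1, input2, ... so they can be selected by array ID
    printf 'parameter = %s\n' "$p" > "$workdir/input$i"
    i=$((i + 1))
done

echo "wrote $((i - 1)) input files to $workdir"
```

With five input files prepared this way, the array would cover the ID range 1 to 5.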


To use a job array, first create a job template script.  An example is shown below:

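        #!/bin/bash -l
        # Request 1 hour of walltime, 2 cores on one node, and 5 GB of memory
        #PBS -l walltime=1:00:00,nodes=1:ppn=2,mem=5gb
        # Send email when the job aborts (a), begins (b), or ends (e)
        #PBS -m abe
        #PBS -M sample_email@umn.edu
        cd ~/program_directory
        # PBS_ARRAYID holds this job's array ID number
        ./program.exe < input$PBS_ARRAYID > output$PBS_ARRAYID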
    

In this example, the job script requests one hour of walltime, two compute cores on a single node, and 5 GB of memory. The script runs an executable named program.exe, which reads its standard input from a file named input$PBS_ARRAYID and writes its standard output to a file named output$PBS_ARRAYID. If this job script is saved to a file named jobtemplate.pbs, it can be used to submit an array of ten jobs with the command:


        qsub -t 1-10 jobtemplate.pbs

Within each job of the array, the scheduler sets the variable PBS_ARRAYID to that job's array ID number. In this example, ten independent jobs will be submitted, with PBS_ARRAYID values running from 1 to 10. This means that in each job, program.exe reads a numbered input file named input1, input2, and so on up to input10. Output is similarly written to a numbered file named output1, output2, and so on. Each of the ten jobs in this example will have 1 hour of walltime, 2 cores, and 5 GB of memory.
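Because each job depends on its own numbered input file, it can help to check that every file exists before submitting. The following is a minimal sketch of such a check; count_missing is a hypothetical helper written for this example and is not part of the scheduler.

```shell
#!/bin/bash
# count_missing DIR FIRST LAST  (hypothetical helper, not part of PBS)
# Prints the name of each missing input file and sets MISSING to the count.
count_missing() {
    local dir=$1 first=$2 last=$3 id
    MISSING=0
    for id in $(seq "$first" "$last"); do
        if [ ! -f "$dir/input$id" ]; then
            echo "missing: input$id"
            MISSING=$((MISSING + 1))
        fi
    done
}

# Check the current directory for input1 ... input10 before running qsub.
count_missing . 1 10
echo "$MISSING input file(s) missing"
```

Run from the directory that holds the input files, this lists any array IDs whose input is absent, so the gaps can be filled before the array is submitted.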

The jobs will run independently as they find available resources on the compute cluster. The job ID numbers will consist of a primary ID number followed by the array ID number in brackets, for example 12345[2].
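If a script needs to work with these job IDs, for example when matching jobs to output files, the two parts can be separated with bash parameter expansion. This is a small sketch that assumes the ID has exactly the primary[index] form shown above.

```shell
#!/bin/bash
# Split a job ID of the form PRIMARY[INDEX] using bash parameter expansion.
jobid='12345[2]'          # example ID in the format described above

primary=${jobid%%\[*}     # drop everything from the first "[" onward
index=${jobid#*\[}        # drop everything up to and including "["
index=${index%\]}         # drop the trailing "]"

echo "primary=$primary index=$index"
```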