A job array is a collection of similar, independent jobs that are submitted together to one of the Linux cluster job schedulers using a single job template script. Each job in the array runs independently as soon as it can obtain resources on the compute cluster. Using a job array can improve calculation throughput, especially for small independent calculations that may be able to run "in between" larger calculations.
Job arrays are most easily used if the input and output files for the independent calculations can be numbered sequentially.
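If existing data files do not already have sequential names, one option is to create numbered links to them. The following is only a minimal sketch: the file names alpha.dat, beta.dat, and gamma.dat and the scratch directory are hypothetical stand-ins, and the input1, input2, ... naming is an assumption chosen to suit a numbered job array.

```shell
#!/bin/bash
# Sketch: give arbitrarily named data files sequential names (input1, input2, ...).
workdir=$(mktemp -d)                 # scratch directory, for illustration only
cd "$workdir" || exit 1
touch alpha.dat beta.dat gamma.dat   # hypothetical stand-in data files

n=0
for f in *.dat; do
    n=$((n + 1))
    ln -s "$f" "input$n"             # symlink leaves the original file untouched
done
echo "Created $n numbered inputs"
```

Symbolic links are used here so that the original files keep their names; renaming with mv would work as well if the original names are not needed.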
To use a job array, first create a job template script. An example is shown below:
#!/bin/bash -l
#PBS -l walltime=1:00:00,nodes=1:ppn=2,mem=5gb
#PBS -m abe
#PBS -M firstname.lastname@example.org
cd ~/program_directory
./program.exe < input$PBS_ARRAYID > output$PBS_ARRAYID
In this example, the job script requests one hour of walltime, two compute cores on a single node, and 5 GB of memory. The job script runs an executable named program.exe, reading its standard input from a file whose name is "input" followed by the array ID number, and writing its standard output to a file whose name is "output" followed by the array ID number. If this job script is saved to a file named jobtemplate.pbs, it can be used to submit an array of ten jobs with the command:
qsub -t 1-10 jobtemplate.pbs
Within a job array, the variable PBS_ARRAYID is set to the array ID number of each job. In this example, ten independent jobs will be submitted, with PBS_ARRAYID values running from 1 to 10. This means that in each of the jobs, program.exe will read a numbered input file named input1, input2, and so on. Output will similarly be written to a file named output1, output2, and so on. Each of the ten jobs in this example will have 1 hour of walltime, 2 cores, and 5 GB of memory.
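The way PBS_ARRAYID expands into file names can be checked locally, without a scheduler, before submitting anything. The following is a rough dry-run sketch under stated assumptions: the loop sets PBS_ARRAYID by hand (on the cluster the scheduler sets it), cat stands in for program.exe, and the input contents are placeholders.

```shell
#!/bin/bash
# Local dry run: mimic what each array job does, without a scheduler.
# 'cat' stands in for ./program.exe (an assumption for illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

for PBS_ARRAYID in 1 2 3; do
    # Placeholder input; real input files would be prepared in advance.
    echo "data for job $PBS_ARRAYID" > "input$PBS_ARRAYID"
    # Same redirection pattern as the template: read input<N>, write output<N>.
    cat < "input$PBS_ARRAYID" > "output$PBS_ARRAYID"
done
ls
```

In the real job array these iterations are separate jobs that may run at the same time or in any order, so each job should only read and write files tied to its own array ID.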
The jobs will run independently as they find available resources on the compute cluster. The job ID numbers will consist of a primary ID number followed by the array ID number in brackets, for example 12345[1].