K40 queue on Mesabi for GPU-enabled software applications
Since each K40m GPU has a peak performance of 1.43 double-precision TFLOPS (4.29 single-precision TFLOPS), the GPUs in the GPU subsystem provide a total of roughly 114 double-precision TFLOPS of peak performance.
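The aggregate figure follows from the per-GPU number and the subsystem size (40 nodes with two K40m GPUs each, i.e. 80 GPUs). A quick arithmetic check:

```shell
# 40 nodes x 2 K40m GPUs/node x 1.43 double-precision TFLOPS/GPU
awk 'BEGIN { printf "%.2f\n", 40 * 2 * 1.43 }'   # prints 114.40
```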
GPU-enabled software applications can use the K40 nodes by submitting PBS jobs that request the k40 queue on Mesabi.
Examples of software packages (modules) that can utilize GPUs include: amber/14, namd/2.9-libverbs-CUDA, nwchem/6.5_cuda_6.0, caffe/0.999_cuda_6.5, and fsl/5.0.6_cuda5.5.
K40 GPU nodes are accessible by submitting jobs to the k40 queue on the Mesabi computing cluster. All MSI users with active accounts and service units (SUs) can submit jobs to the k40 queue using the standard commands outlined in the Queue Quick Start Guide.
At MSI, CUDA is installed on Mesabi, our main cluster, which has 40 nodes with two K40 GPUs each. To request the GPU nodes, you need to use the k40 queue and include the number of GPUs needed for the job in the PBS resource list by adding gpus=2.

As of May 2017, cgroups enforce access control and resource management: if you do not request the GPU resource, the cgroups will not provide access to the GPUs, and MSI job filters will reject jobs submitted to the k40 queue that lack a GPU resource request. Note that the k40 nodes are not shared.

Below is an example of an interactive session requesting one node with two GPUs for 20 minutes.
% qsub -I -l nodes=1:ppn=24:gpus=2,walltime=20:00 -q k40
qsub: waiting for job 469592.mesabim3.msi.umn.edu to start
qsub: job 469592.mesabim3.msi.umn.edu ready
Load the CUDA modules:
[~] % module load cuda cuda-sdk
Here, the deviceQuery program shows that there are two GPUs available:
[~] % deviceQuery | grep NumDevs
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.0, CUDA Runtime Version = 7.0, NumDevs = 2, Device0 = Tesla K40m, Device1 = Tesla K40m
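For production runs, the same resource request can be placed in a PBS batch script instead of an interactive session. A minimal sketch (the script name and the final command line are placeholders; substitute your own application):

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=24:gpus=2,walltime=20:00
#PBS -q k40

cd $PBS_O_WORKDIR          # run from the directory the job was submitted from
module load cuda cuda-sdk  # same modules as the interactive session above
deviceQuery | grep NumDevs # confirm both GPUs are visible to the job
```

Submit the script (here saved as job.pbs, a hypothetical name) with:

% qsub job.pbs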