CUDA

Software Summary

Mesabi

Default Module: 11.8.0-gcc-7.2.0-xqzqlf2
Other Modules Available: 11.3.1-gcc-8.2.0-lxzwmyc, 11.8.0-gcc-7.2.0-xqzqlf2, 10.0, 10.1, 11.2, 12.0, 6.5, 7.0, 7.5, 8.0, 9.0, 9.1
Last Updated On: Tuesday, November 14, 2023

Mesabi K40

Default Module: 
Other Modules Available: 
Last Updated On: 

Mangi

Default Module: 11.8.0-gcc-7.2.0-xqzqlf2
Other Modules Available: 11.3.1-gcc-8.2.0-lxzwmyc, 11.8.0-gcc-7.2.0-xqzqlf2, 10.0, 10.1, 11.2, 12.0, 6.5, 7.0, 7.5, 8.0, 9.0, 9.1
Last Updated On: Tuesday, November 14, 2023

Mangi v100

Default Module: 
Other Modules Available: 
Last Updated On: 

NICE

Default Module: 
Other Versions Available: 
Last Updated On: 

Last Updated On: Tuesday, November 14, 2023

Support Level: Primary Support
Software Access Level: Open Access
Software Categories: 
Numerical Libraries
Optimization
Debugging and Performance
Software Description

CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that NVIDIA produces. CUDA gives program developers direct access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.

Software Documentation

Mesabi

At MSI, CUDA is installed on Mesabi, our main cluster. There are 40 nodes with 2 K40 GPUs each. To request the GPU nodes, submit to the k40 queue, and include the number of GPUs needed for the job in the PBS options. The interactive session below requests 1 node with 2 GPUs for 20 minutes.

NOTE: GPU nodes are not shared, so any job running in the k40 queue is charged for 24 cores of utilization.

(This assumes you are already on a Mesabi login node.)

[ln0003:~] % qsub -I -l nodes=1:gpus=2,walltime=20:00 -q k40
qsub: waiting for job 469592.mesabim3.msi.umn.edu to start
qsub: job 469592.mesabim3.msi.umn.edu ready

[cn3006:~] %
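
The interactive request above can also be expressed as a batch job. What follows is a minimal sketch, not an MSI-provided template. It assumes that the qsub options carry over unchanged as #PBS directives and that a login shell (the -l on the first line) is needed for the module command to be defined. It reuses the module and deviceQuery commands from the rest of this walkthrough. The file name gpu_check.pbs is hypothetical.

```bash
#!/bin/bash -l
# gpu_check.pbs (hypothetical file name)
# Same resources as the interactive example: 1 node, 2 GPUs, 20 minutes, k40 queue.
#PBS -l nodes=1:gpus=2,walltime=20:00
#PBS -q k40

# Same commands as the interactive walkthrough.
module load cuda cuda-sdk
deviceQuery | grep NumDevs
```

Such a script would be submitted from a login node with qsub gpu_check.pbs. The NOTE above about charging applies to batch jobs in the k40 queue as well.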

Load the CUDA modules:

[cn3006:~] % module load cuda cuda-sdk

Here, the deviceQuery program shows that 2 GPUs are available:

[cn3006:~] % deviceQuery | grep NumDevs
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.0, CUDA Runtime Version = 7.0, NumDevs = 2, Device0 = Tesla K40m, Device1 = Tesla K40m
[cn3006:~] %
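
When the same check runs inside a script, it can be handy to pull the GPU count out of that summary line as a number. The snippet below is a small sketch of one way to do so. The helper name count_gpus is hypothetical, and the sample line is copied from the session above so the snippet runs without a GPU. It assumes the summary line keeps the "NumDevs = N" form shown here.

```shell
#!/bin/bash
# count_gpus (hypothetical helper): read deviceQuery output on stdin and
# print the number that follows "NumDevs = ".
count_gpus() {
  sed -n 's/.*NumDevs = \([0-9][0-9]*\).*/\1/p'
}

# Sample summary line copied from the session above.
# On a GPU node, replace the printf with: deviceQuery | count_gpus
sample='deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.0, CUDA Runtime Version = 7.0, NumDevs = 2, Device0 = Tesla K40m, Device1 = Tesla K40m'

found=$(printf '%s\n' "$sample" | count_gpus)
echo "GPUs reported: $found"
```

A job could compare the value in found against the number of GPUs it requested and exit early on a mismatch.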