Itasca - Quickstart Guide

 

This guide provides the basic information needed to get up and running with jobs on Itasca (an HP Cluster Platform 3000 BL280c G6).

Login Procedure

Please connect through login.msi.umn.edu or nx.msi.umn.edu. See MSI's interactive connections FAQ, and use itasca.msi.umn.edu as the hostname.
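
For example, a typical command-line login might look like the following (your_username is a placeholder for your MSI username):

ssh your_username@login.msi.umn.edu
ssh itasca.msi.umn.edu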

Available Software:

A module system is used on Itasca to control the run-time environment for individual applications. Please type module avail to see the software available on Itasca.
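
For example, a typical module workflow looks like this (the intel module is used in the compile examples later in this guide):

module avail                # list all software available on Itasca
module load intel           # load the Intel compiler environment
module list                 # show the modules currently loaded
module unload intel         # remove a module from the environment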

Compilers and MPI libraries

The following tables summarize the available compilers and MPI implementations. The subsequent commands and scripts use the Intel compilers for Fortran, C, and C++.

Compiler      Commands               Module
GNU 4.4.1     gcc, g++, gfortran     gcc
Intel         icc, icpc, ifort       intel

MPI implementations and their corresponding modules

Compiler    Platform MPI        Open MPI            Intel MPI
Intel       intel pmpi/intel    intel ompi/intel    intel impi/intel
GNU         gcc pmpi/gnu        gcc ompi/gnu        gcc impi/gnu

Please note that mpif90, mpif77, mpicc, and mpicxx are generic wrapper scripts for compiling Fortran 90, Fortran 77, C, and C++ codes, respectively, whichever MPI implementation is used. Each wrapper sets the necessary paths to the include files and MPI libraries so that the MPI code can be compiled. Please do not hard-code any such path inside your build unless it has been well tested and gives better performance with the specified MPI implementation. Different versions of each MPI implementation are available for different compiler versions; again, type module avail to see the details. User manuals and documentation can be found in /opt/platform_mpi/doc for Platform MPI and in /soft/intel/ict/3.2/impi/3.2.2.006/doc for Intel MPI.
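
To verify which compiler and MPI library a wrapper will use, you can check where the wrapper comes from and ask it to print its underlying compile line. The exact option varies by implementation (-show for Platform MPI and Intel MPI, -showme for Open MPI); the Open MPI case is sketched below:

module load intel ompi/intel
which mpicc                 # confirm the wrapper comes from the loaded MPI module
mpicc -showme               # print the underlying compiler command and flags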

Compiling Code:

Serial codes

Single-core serial jobs are not allowed to run on Itasca; they should be run on other systems. Please feel free to contact user support for assistance in running these kinds of jobs (email help@msi.umn.edu or call 612-626-0802).

OpenMP codes

C

module load intel

icc -o test -O3 -openmp openmp1.c

Fortran

module load intel

ifort -o test -O3 -openmp openmp1.f

Users can select different compiler options to optimize performance. Please see the man pages (e.g., man ifort or man icc) for the available options. For example, for jobs that will run on Sandy Bridge nodes, users should add the -xAVX flag to the above compile commands, as shown below.
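
For example, the OpenMP compile commands above become the following when targeting the Sandy Bridge nodes:

module load intel
icc -o test -O3 -openmp -xAVX openmp1.c
ifort -o test -O3 -openmp -xAVX openmp1.f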

 

MPI codes

To compile MPI code with one of the MPI implementations, load the corresponding MPI modules.

To use Platform MPI and Intel Compilers:

C

module load intel pmpi/intel

mpicc -o test mpi_code.c

mpiCC -o test mpi_code.cpp

Fortran

module load intel pmpi/intel

mpif90 -o test mpi_code.f

To use Open MPI and Intel Compilers:

C

module load intel ompi/intel

mpicxx -o test mpi_code.cpp

mpicc -o test mpi1.c

Fortran

module load intel ompi/intel

mpif90 -o test mpi1.f

To use Intel MPI and Intel Compilers:

C

module load intel impi/intel

mpiicpc -o test mpi_code.cpp

mpiicc -o test mpi1.c

Fortran

module load intel impi/intel

mpiifort -o test mpi1.f

To use Intel MPI and GNU Compilers:

C

module load gcc impi/gnu

mpicxx -o test mpi_code.cpp

mpicc -o test mpi1.c

Fortran

module load gcc impi/gnu

mpif90 -o test mpi1.f

Run Jobs Interactively:

OpenMP jobs

export OMP_NUM_THREADS=4

./test < input.dat > output.dat

MPI jobs

mpirun -np 4 ./test > run.out

Submit Jobs to the Queue

We use PBS to ensure the machine is used to its full potential and fairly for every user. You need to create a script file and use the qsub command to submit jobs, e.g., qsub myscript.pbs. Use the -q option to select which queue you will submit to (examples below).

For detailed information about the queues, please see the queue information on the HPC policy page.

Serial jobs

On Itasca, no compute node is shared by two or more jobs, and single-core serial jobs are not allowed to run on Itasca; they should run on other systems. Please feel free to contact user support for assistance in running these kinds of jobs (email help@msi.umn.edu or call 612-626-0802). However, a simulation that runs multiple copies of a serial job with different inputs can be packed to run on one or more nodes.

The following is a PBS script for a job that will run on 1 node, using all 8 cores for 8 different tasks. Each core needs 2500 MB of memory. Save the script as 'myscript.pbs'.

#!/bin/bash -l
#PBS -l walltime=01:00:00,pmem=2500mb,nodes=1:ppn=8
#PBS -m abe
module load intel
pbsdsh -n 0 /lustre/username/task1 &
pbsdsh -n 1 /lustre/username/task2 &
pbsdsh -n 2 /lustre/username/task3 &
pbsdsh -n 3 /lustre/username/task4 &
pbsdsh -n 4 /lustre/username/task5 &
pbsdsh -n 5 /lustre/username/task6 &
pbsdsh -n 6 /lustre/username/task7 &
pbsdsh -n 7 /lustre/username/task8 &
wait

OpenMP jobs

The following is a PBS script for a 1-hour OpenMP job that will run on 1 node using all 8 cores. This job needs 10 GB of memory. Save the script as 'myscript.pbs'.

#!/bin/bash -l
#PBS -l walltime=01:00:00,mem=10gb,nodes=1:ppn=8
#PBS -m abe
cd /lustre/your_username
module load intel
export OMP_NUM_THREADS=8
./test < input.dat > output.dat

MPI jobs

The following is a PBS script for a 1024-process MPI job that will run for 1 hour on 128 nodes, using the Intel compilers with Platform MPI (pmpi/intel). The script uses pmem=1500mb to request 1500 MB of memory per core. Please note the difference between mem and pmem: pmem requests memory per process (core), while mem requests the total memory for the whole job. For the following example, the equivalent mem would be 128 x 8 x 1500 MB, i.e., about 1.5 TB.

#!/bin/bash -l
#PBS -l walltime=01:00:00,pmem=1500mb,nodes=128:ppn=8
#PBS -m abe

# Quick node health check: run 'date' on every allocated node; output goes to the file check_node
a1=$(cat $PBS_NODEFILE | sort | uniq)
time pdsh -w `echo $a1 | sed 's/ /,/g'` date >& check_node

cd /lustre/Your_username
module load intel
module load pmpi/intel
mpirun -np 1024 -hostfile $PBS_NODEFILE ./test > run.out

The following is a PBS script for a 32-process MPI job that will run for 1 hour on 4 nodes, using the Intel compilers with Open MPI (ompi/intel). The script uses pmem=500mb to request 500 MB of memory for each core. Save the script as 'myscript.pbs'.

#!/bin/bash -l
#PBS -l walltime=01:00:00,pmem=500mb,nodes=4:ppn=8
#PBS -m abe

a1=$(cat $PBS_NODEFILE | sort | uniq)
time pdsh -w `echo $a1 | sed 's/ /,/g'` date >& check_node
cd /lustre/Your_username
module load intel
module load ompi/intel
mpirun -np 32 ./test > run.out

The following is a PBS script for a 256-process MPI job that will run for 2 hours on 32 nodes, using the Intel compilers with Intel MPI (impi/intel). The script uses pmem=500mb to request 500 MB of memory for each core. Save the script as 'myscript.pbs'.

#!/bin/bash -l
#PBS -l walltime=02:00:00,pmem=500mb,nodes=32:ppn=8
#PBS -m abe

a1=$(cat $PBS_NODEFILE | sort | uniq)
time pdsh -w `echo $a1 | sed 's/ /,/g'` date >& check_node

cd /lustre/Your_username
module load intel
module load impi/intel

mpirun -r ssh -f $PBS_NODEFILE -np 256 ./test > run.out

Please note that the default impi module is for Intel MPI version 4. For code compiled with an earlier version of Intel MPI, you should use the following instead:

module load impi/intel3.2.1
mpdboot -r ssh -n 32 -f $PBS_NODEFILE
mpiexec -perhost 8 -n 256 -env I_MPI_DEVICE rdssm ./test > run.out

Useful Commands

Submitting a standard batch job:

qsub myscript.pbs

Submitting a job to the "long" queue (for extended walltimes):

qsub -q long myscript.pbs

Monitoring queue status:

showq

qstat -f

Monitoring a job's status:

checkjob <jobid>

Canceling a job:

qdel <jobid>

Recommendations

Where to run your jobs? If your application needs to read or write large volumes of data, you should run your jobs in a directory under the high-performance, high-capacity scratch partition /lustre, which is mounted on all of Itasca's login and compute nodes via the Lustre file system. To use /lustre, create a subdirectory under /lustre named after your username. For example, you could do the following:

cd /lustre

mkdir your_username

cd your_username

cp ~/your_job .

mpirun -np 4 ./your_job

Where to load the modules that are needed for running jobs? Load the needed modules in your .bashrc file rather than in the job script. This is required for jobs that link against shared objects that are not local on the compute nodes; the job will fail if it cannot find those shared objects in its working environment. A minimal example is sketched below.
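
For instance, for the Platform MPI builds shown above, your ~/.bashrc might contain the following (module names as used throughout this guide):

# Load the compiler and MPI modules at login so every compute node inherits them
module load intel
module load pmpi/intel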

I/O performance: jobs that write relatively small amounts of data at a time should use buffered I/O. For such Fortran jobs, please set the following

export FORT_BUFFERED=1

in your job script.

Debugging - submitting an interactive batch job for debugging:

qsub -I myscript.pbs

where myscript.pbs contains only the #PBS directives listing the requested resources, e.g.:

#PBS -l pmem=2150mb,nodes=20:ppn=8
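
Once the interactive job starts, you are given a shell on one of the allocated nodes and can run the steps by hand. A minimal sketch, assuming the Platform MPI build and /lustre working directory used in the examples above:

cd /lustre/your_username
module load intel pmpi/intel
mpirun -np 160 -hostfile $PBS_NODEFILE ./test > run.out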

Use of Platform MPI for special needs

Platform MPI has a useful feature that allows users to conveniently reorder the hosts contained in $PBS_NODEFILE to meet special needs. The recommended procedure is as follows:

Set the following environment variables in your .bashrc file:

module load intel pmpi/intel
export MPI_MAX_REMSH=16
export MPI_MAX_MPID_WAITING=64
export MPI_BUNDLE_MPIDS=N
export OMP_NUM_THREADS=1

Generate your own host list, e.g., named myhostlist
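
How you order the hosts depends on your needs; the commands below are purely illustrative ways to build myhostlist from $PBS_NODEFILE inside a job:

tac $PBS_NODEFILE > myhostlist                  # example: reverse the host order
awk '!seen[$0]++' $PBS_NODEFILE > myhostlist    # example: keep one entry per node, preserving order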

Run the job with your own hostfile:

mpirun -np 128 -hostfile myhostlist ./a.out