Supercomputing Institute Scientific Development & Visualization Lab



PBS Information for SDVL

The Portable Batch System (PBS) is installed on the Sun and Linux workstations.

Queue Structure

There are two queues, one for each platform type. The following table summarizes the queues, the nodes available to each, and the enforced limits.

Queue    Nodes Available              Max Wall Clock Time per Job  Max CPUs per Job  Max Memory per Job
sun      s1, s3, s6, s7, s8, s9, s10  48 hrs                       4                 16 gb
linux64  l3, l4, l5, l6, l7           48 hrs                       2                 4 gb
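
A request can be checked against these limits before it is submitted. The following is a minimal sketch, not an SDVL-provided tool: the function name check_request is hypothetical, memory is given in whole gigabytes and wall clock time in whole hours, and the limits are copied by hand from the table above, so they would need updating if the table changes.

```shell
#!/bin/sh
# check_request QUEUE NCPUS MEM_GB WALL_HOURS
# Hypothetical helper: prints "ok" if the request fits the queue limits
# from the table above, otherwise prints the reason and returns 1.
check_request() {
    queue=$1; ncpus=$2; mem_gb=$3; hours=$4
    case $queue in
        sun)     max_cpus=4; max_mem=16 ;;
        linux64) max_cpus=2; max_mem=4 ;;
        *)       echo "unknown queue: $queue"; return 1 ;;
    esac
    max_hours=48   # both queues share the 48 hr wall clock limit
    if [ "$ncpus" -gt "$max_cpus" ]; then
        echo "too many cpus: $ncpus > $max_cpus"; return 1
    fi
    if [ "$mem_gb" -gt "$max_mem" ]; then
        echo "too much memory: ${mem_gb}gb > ${max_mem}gb"; return 1
    fi
    if [ "$hours" -gt "$max_hours" ]; then
        echo "walltime too long: ${hours}h > ${max_hours}h"; return 1
    fi
    echo "ok"
}

check_request linux64 2 1 4            # fits the linux64 limits
check_request linux64 4 1 4 || true    # exceeds the 2 cpu limit
```

PBS itself enforces the limits at submission or run time; a check like this only catches mistakes earlier.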

NOTE: Users must specify a queue, either sun or linux64, in their submission scripts, for example with a "#PBS -q linux64" line as in the example scripts below.


Submitting a job: qsub

First load the PBS module:

        module add pbs

Then use the qsub command to submit a job to the queuing system. qsub takes a job submission script containing special directives that tell PBS what resources are needed, along with the commands necessary to run the job.


Creating a submission script

To submit jobs to the queue using PBS, create a job script and submit it with qsub. For example, if you named the submission script "myscript", you could submit it like this:

      qsub myscript

The following are two examples of job scripts:

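  • Example 1: Serial Job Script
            # Queue to run in
            #PBS -q linux64
            # Resources: 1 CPU, 1 GB of memory, 3 hours of wall clock time
            #PBS -l ncpus=1,mem=1gb,walltime=3:00:00
            # Send mail when the job aborts (a), begins (b), and ends (e)
            #PBS -m abe
            cd /home/smpb/username/TESTS
            ./a.out
            # end of example script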
    

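  • Example 2: OpenMP Job Script
            # Queue to run in
            #PBS -q linux64
            # Resources: 2 CPUs, 1 GB of memory, 4 hours of wall clock time
            #PBS -l ncpus=2,mem=1gb,walltime=4:00:00
            # Send mail when the job aborts (a), begins (b), and ends (e)
            #PBS -m abe
            cd /home/smpb/username/TESTS
            # csh/tcsh syntax; the thread count matches ncpus=2
            setenv OMP_NUM_THREADS 2
            ./a.out
            # end of example script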
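
Both examples above use the linux64 queue. As a sketch of the same pattern for the sun queue, the snippet below writes a serial job script to a file and submits it only if qsub is on the PATH. The resource values are illustrative, and the use of PBS_O_WORKDIR (the directory from which qsub was run, as set by PBS) in place of an explicit path is an assumption about how you organize your files.

```shell
#!/bin/sh
# Write a job script for the sun queue, then submit it if qsub is available.
# "myscript" matches the file name used earlier in the text; the resource
# values are illustrative and stay within the sun queue limits.
cat > myscript <<'EOF'
#PBS -q sun
#PBS -l ncpus=1,mem=1gb,walltime=1:00:00
#PBS -m abe
cd $PBS_O_WORKDIR
./a.out
EOF

if command -v qsub >/dev/null 2>&1; then
    qsub myscript
else
    echo "qsub not found; run 'module add pbs' first"
fi
```

The job script itself avoids shell-specific syntax, so it should behave the same whether the job runs under csh or sh.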
    


How to check job status

You can check job status with the qstat command. qstat has many options, so we suggest consulting its man page (man qstat). Below is an example session showing queued and running jobs. In our environment, the "TSK" field shows the number of processors requested by the user, and the "S" field gives the status of the job. The status can take a number of values, including "R" for running jobs and "Q" for queued jobs.

    username@11 [~/test] % qstat -u username
    
    goose.msi.umn.edu:
                                                                       Req'd  Req'd   Elap
    Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
    -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
    4361.goose.msi.umn.e username linux64  job350    32010   --    1    4gb   24:00 R 01:07
    4471.goose.msi.umn.e username linux64  job312      --    --    1    4gb   24:00 Q   --
    4474.goose.msi.umn.e username linux64  job315    19713   --    1    4gb   24:00 R 01:07
    4475.goose.msi.umn.e username linux64  job316    19718   --    1    4gb   24:00 R 01:07
    4477.goose.msi.umn.e username linux64  job318    19735   --    1    4gb   24:00 R 01:07
    4478.goose.msi.umn.e username linux64  job319    31737   --    1    4gb   24:00 R 01:07
    4479.goose.msi.umn.e username linux64  job320    31742   --    1    4gb   24:00 R 01:07
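
A listing like the one above can be summarized with a short filter. This is a minimal sketch under stated assumptions: the function name count_states is hypothetical, the column layout is assumed to match the listing above (state in the tenth whitespace-separated field), and a shortened copy of that listing stands in for live qstat output.

```shell
#!/bin/sh
# count_states: read "qstat -u" style output on standard input and count
# running and queued jobs. Hypothetical helper, not part of PBS.
count_states() {
    # Job lines start with a numeric job ID; the state is the 10th field.
    awk '$1 ~ /^[0-9]+\./ { n[$10]++ }
         END { printf "running=%d queued=%d\n", n["R"], n["Q"] }'
}

# Shortened copy of the listing above, used in place of live output.
sample_qstat() {
cat <<'EOF'
goose.msi.umn.edu:
                                                 Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
4361.goose.msi.umn.e username linux64  job350    32010   --    1    4gb   24:00 R 01:07
4471.goose.msi.umn.e username linux64  job312      --    --    1    4gb   24:00 Q   --
4474.goose.msi.umn.e username linux64  job315    19713   --    1    4gb   24:00 R 01:07
EOF
}

sample_qstat | count_states    # prints: running=2 queued=1
```

With PBS loaded, the same filter could be fed from the real command, for example qstat -u username | count_states.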
    
    

To find more information about a job, for example the running job 4361.goose.msi.umn.edu, use "qstat -f 4361", which shows the following:

    username@11 [~/test] % qstat -f 4361
    
    Job Id: 4361.goose.msi.umn.edu
        Job_Name = job350
        Job_Owner = username@quantum1.msi.umn.edu
        resources_used.cput = 01:09:31
        resources_used.mem = 89980kb
        resources_used.vmem = 269632kb
        resources_used.walltime = 01:09:44
        job_state = R
        queue = linux64
        server = goose.msi.umn.edu
        Checkpoint = u
        ctime = Sat Apr 12 18:53:04 2008
        Error_Path = quantum1.msi.umn.edu:/home/msi/username/SUB/job350.e4361
        exec_host = l2/2
        Hold_Types = n
        Join_Path = n
        Keep_Files = n
        Mail_Points = abe
        mtime = Fri May  9 15:08:16 2008
        Output_Path = quantum1.msi.umn.edu:/home/msi/username/SUB/job350.o4361
        Priority = 0
        qtime = Sat Apr 12 18:53:04 2008
        Rerunable = True
        Resource_List.arch = x86_64
        Resource_List.mem = 4gb
        Resource_List.ncpus = 1
        Resource_List.walltime = 24:00:00
        session_id = 32010
        comment = 1 nodes unavailable to start reserved job after 1 seconds (job 4
            459 has exceeded wallclock limit on node l2 - check job)
        etime = Sat Apr 12 18:53:04 2008
        exit_status = -4
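
Individual attributes can be pulled out of the full listing above with a small filter. This is a sketch, not a PBS feature: the function name get_attr is hypothetical, it assumes the "name = value" layout shown above, and it returns only the first line of values that wrap onto a continuation line (such as comment). A shortened copy of the listing stands in for live output.

```shell
#!/bin/sh
# get_attr NAME: print the value of one "name = value" attribute from
# "qstat -f" style output read on standard input. Hypothetical helper.
get_attr() {
    awk -v name="$1" '
        { line = $0; sub(/^[ \t]+/, "", line) }    # strip indentation
        !found && index(line, name " = ") == 1 {   # line starts with NAME =
            print substr(line, length(name) + 4)   # text after " = "
            found = 1
        }'
}

# Shortened copy of the listing above, used in place of live output.
sample_full() {
cat <<'EOF'
Job Id: 4361.goose.msi.umn.edu
    Job_Name = job350
    resources_used.walltime = 01:09:44
    job_state = R
    queue = linux64
    Resource_List.walltime = 24:00:00
EOF
}

sample_full | get_attr job_state                  # prints: R
sample_full | get_attr resources_used.walltime    # prints: 01:09:44
```

Because the match is anchored at the start of the attribute name, asking for resources_used.walltime does not pick up Resource_List.walltime.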
    
    


How to remove and kill jobs

Jobs are killed or removed from the queuing system with the qdel command; its man page lists the options you can use with it. To signal a running job rather than remove it, consult the PBS man pages. Here is a quick example of how to remove job 24.sfs.msi.umn.edu:

    qdel 24
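
To remove several jobs at once, the job IDs can be taken from a qstat listing. The following is a dry-run sketch under stated assumptions: the function name queued_ids is hypothetical, the input is assumed to have the same column layout as the qstat -u listing shown earlier, and the loop prints the qdel commands rather than running them.

```shell
#!/bin/sh
# queued_ids: read "qstat -u" style output on standard input and print the
# numeric ID of every job whose state is "Q". Hypothetical helper.
queued_ids() {
    awk '$1 ~ /^[0-9]+\./ && $10 == "Q" { split($1, part, "."); print part[1] }'
}

# Dry run on two lines copied from the earlier listing: print the qdel
# commands instead of executing them.
printf '%s\n' \
  '4361.goose.msi.umn.e username linux64  job350    32010   --    1    4gb   24:00 R 01:07' \
  '4471.goose.msi.umn.e username linux64  job312      --    --    1    4gb   24:00 Q   --' |
queued_ids | while read -r id; do
    echo "qdel $id"            # prints: qdel 4471
done
```

Replacing echo "qdel $id" with qdel "$id" would perform the removal, so it is worth reviewing the dry-run output first.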


    This information is available in alternative formats upon request by individuals with disabilities. Please send email to alt-format@msi.umn.edu or call 612-624-0528.



    URL: http://www.msi.umn.edu/sdvl/queue/pbs-intro.html
    This page last modified on Monday, 12-May-2008 10:00:13 CDT  
    Please direct questions or problems to help@msi.umn.edu  