
5/8/13: Virtual School Summer Computational Science Courses

The Virtual School of Computational Science and Engineering (VSCSE) is holding two courses this summer. These courses are open to graduate students, post-docs, and young professionals who want to expand their skills with advanced computational resources. The courses are offered at institutions...

How do I use Second Tier Storage from the command line?

To access the MSI second-tier storage, each user must have a set of s3 credentials. The s3 credentials act like a username and password to control access to the storage behind the S3 gateway server called s3.msi.umn.edu. These credentials are automatically generated and stored in a configuration...

Functional Genomics of Fusarium graminearum, the Wheat and Barley Scab Fungus

Abstract: 


Fusarium head blight or scab caused by Fusarium graminearum is a destructive disease of wheat and barley. Infested cereals are reduced in yield and contaminated with harmful mycotoxins. In the past decade, the disease has resulted in billions of dollars of economic loss to United States agriculture. Better understanding of F. graminearum pathogenesis and differentiation is critical because effective fungicides and highly resistant plant varieties are not available for controlling the disease. This group’s goals are to identify and characterize genes important for plant infection and colonization, secondary metabolism, sporulation, and sexual development of F. graminearum by using transcriptome analysis and targeted mutation of selected genes.

One objective of this research is to analyze gene expression profiles of F. graminearum in different infection and colonization stages, in mutants defective in plant infection or toxin production, and in different developmental stages. Genes differentially expressed during specific infection or development processes or in response to mutants will be identified by high throughput cDNA sequencing (RNAseq). The second objective is to experimentally determine the biological functions of selected candidate genes identified in gene expression experiments. Targeted deletion mutants will be generated for genes chosen on the basis of expression profiles and bioinformatics analyses. A third objective will be to assess the presence of Fusarium species and total fungal content of environmental samples using a metagenomic approach. MSI resources are used for storage and analysis of RNAseq data as well as DNA sequence storage and metagenomic analysis of fungi from environmental samples.


 

Group name: 
kistlerh

Conceptual Climate Models

Abstract: 


This researcher is continuing a project to develop undergraduate research modules on the mathematics of climate change that can be inserted into a standard course of calculus with differential equations. One such module, Global Warming: A Zonal Energy Balance Model, is available to the public through a web repository of educational resources sponsored by the Science Education Resource Center (SERC) at Carleton College.

Mathematics has an important role to play in understanding the earth's climate. While controlled physical experiments on climate change are rarely available, mathematical models, computational experiments, and data analysis are the fundamental tools to study the earth’s climate system. The project will focus on zonal energy balance conceptual models of the state of the climate system that are formulated in terms of a mean temperature that varies with latitude and a thermal energy exchange among latitudes by diffusion. Although conceptual models only retain some fundamental features of the climate system, they are capable of reproducing relevant complex phenomena.

This project is supported by the NSF-funded Engaging Mathematics Initiative, a partnership organized by faculty colleagues to develop curriculum content that improves mathematics learning by connecting the topics to issues of critical local, national, and global importance. This work will also benefit from collaboration with colleagues of the Mathematics of Climate Seminar, Department of Mathematics, University of Minnesota, and the “Mathematics and Climate Research Network” (MCRN), a network funded by the National Science Foundation linking researchers across the United States to develop the mathematics needed to better understand the Earth’s climate. One of the objectives of the MCRN is “to prepare and disseminate educational material for the undergraduate- and graduate-level curriculum.”

MSI computational resources, such as MATLAB and Python, are used to develop the numerical codes for the modules. The PI's students also use these resources in the implementation of their research projects.


Group name: 
padronv

Bioenergy and Food Safety

Abstract: 


Scum is a rich source of recoverable energy, as it contains greases, vegetable and mineral oil, animal fats, waxes, soaps, food wastes, and plastic materials discharged from households, restaurants, and other animal product industries. The lipid content of scum can be as high as 60%. The energy stored in scum, around 22.3 MJ/kg of dry scum, cannot be utilized if the scum is disposed of in landfills. Moreover, the disposal of scum increases operation costs in treatment plants. For instance, the Metro plant spends $200,000 a year just on transporting and landfilling scum. Therefore, there is an urgent need to develop a technology for energy recovery and beneficial reuse of scum.

These researchers are developing a novel technology that recovers energy and converts lipid, fatty acid, and soap in scum directly to biodiesel. The final product has a quality similar to ASTM-grade diesel. The researchers believe that by utilizing biodiesel derived from scum the wastewater treatment plant can: reduce the cost of scum disposal to landfills; reduce petroleum fuel use and the cost of fuel purchasing; and reduce GHG emissions by using biofuels. In addition, by diverting scum from landfills the technology could reduce emissions of methane, which has 25 times the global-warming potential of CO2 over a 100-year time horizon. All these benefits are likely to improve the economic and environmental sustainability of the wastewater treatment plant.

Uncertainties about the scum-to-diesel technology remain. In particular, its environmental performance has not been evaluated in the current literature. All of the conversion processes require heat, electricity, and chemicals, which add environmental impacts and thus offset some of the environmental benefits obtained in the production of biodiesel.


Group name: 
ruanr

Hadoop

Software Description: 

The Hadoop Map/Reduce framework harnesses a cluster of machines and executes user-defined Map/Reduce jobs across the nodes in the cluster.  On Itasca, a script exists to create an ephemeral Hadoop cluster on the set of nodes assigned by the scheduler.  The script setup_cluster will format an HDFS filesystem on the local scratch disks. 

This resource is best suited for application benchmarking and algorithm testing.  All data must be moved to HDFS after the cluster is brought up when the job starts.  Any data that you wish to save must be moved to your home directory before the job completes.  Many job scripts will follow the pattern:

  1. Set up cluster
  2. move data to hdfs with "hadoop fs -put"
  3. execute test program
  4. move data to home directory with "hadoop fs -get"
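
The four steps above can be sketched as a small shell function. This is only an illustration, not an MSI-provided script: the helper names run and hadoop_job_pattern, the DRY_RUN switch, the jar name my_test.jar, and the directory names are all hypothetical, and the sketch assumes the jar's manifest names its main class. With DRY_RUN=1 (the default here) the commands are printed rather than executed.

```shell
#!/bin/bash
# DRY_RUN=1 (the default here) prints each command instead of running it,
# so the sketch can be inspected on a machine without Hadoop.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Arguments (all hypothetical names chosen for illustration):
#   $1  local input directory
#   $2  jar file containing the test program
#   $3  local directory that receives the results
hadoop_job_pattern() {
    local local_in="$1" jar="$2" local_out="$3"

    # 1. Set up the ephemeral cluster and give the daemons time to start.
    run setup_cluster
    run start-all.sh
    run sleep 90

    # 2. Move the input data into HDFS.
    run hadoop fs -put "$local_in" input

    # 3. Execute the test program (assumes the jar's manifest names its main class).
    run hadoop jar "$jar" input output

    # 4. Copy the results out of HDFS before the job ends.
    run hadoop fs -get output "$local_out"
}

hadoop_job_pattern "$PBS_O_WORKDIR/data" my_test.jar "$HOME/results"
```

Setting DRY_RUN=0 inside a job script would execute the same sequence for real, after "module load hadoop".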

If you need a persistent cluster for your work, please see the information at: https://www.msi.umn.edu/content/hadoop-cluster

Software Support Level: 
Secondary Support
Software Access Level: 
Open Access
Itasca Documentation: 

To run this software in batch mode on Itasca, run the commands:

#!/bin/bash -l
#
#PBS -m n
#PBS -l nodes=4:ppn=8
#PBS -l walltime=24:00:00
#PBS -q batch
#

cd $PBS_O_WORKDIR
module load hadoop
setup_cluster
start-all.sh
sleep 90
time hadoop jar \
$HADOOP_HOME/hadoop-examples-1.0.3.jar \
randomwriter random_example \
$HADOOP_HOME/scripts/random.xml

To create an interactive Hadoop cluster on Itasca, first request interactive nodes through PBS using the command:

qsub -l nodes=4:ppn=8,walltime=1:00:00 -I 

This command will wait until 4 nodes are available.  After the nodes are available, you will get a shell on the first node, and you can type:

module load hadoop
setup_cluster
start-all.sh
sleep 90
time hadoop jar \
$HADOOP_HOME/hadoop-examples-1.0.3.jar \
randomwriter random_example \
$HADOOP_HOME/scripts/random.xml
Software Categories: 
Software Interactive/GUI: 
No

MOPAC

Software Support Level: 
Secondary Support
Software Description: 

MOPAC is a general-purpose semi-empirical molecular orbital package for the study of chemical structures and reactions. MOPAC calculates vibrational spectra, thermodynamic quantities, isotopic substitution effects, and force constants for molecules, radicals, ions and polymers. It can carry these out using methods such as AM1, PM3, and PM6.

Software Access Level: 
Open Access
Software Categories: 
Software Interactive/GUI: 
No
General Linux Documentation: 

To run this software interactively in a Linux environment run the commands:

module load mopac
mopac input_file.mop

    where input_file.mop is the MOPAC input file.

    Unfortunately, the version of MOPAC that is available at MSI as a software module prints out a message stating "Your MOPAC executable has expired" and directing the user to upgrade. However, newer versions of MOPAC depend on a version of glibc that is not yet supported on CentOS 6, which the bulk of our lab machines use, and so we are unable to upgrade MOPAC at this time.

    However, while the executable is expired, it still does work normally other than printing out the above message. As a workaround, it is possible to run MOPAC via a wrapper script that skips the message. Create a file with the following contents and a name something like "mopac_wrapper.sh":

    #!/usr/bin/env sh
    
    yes "" | mopac "${@}"

    Then set that file to be executable by executing "chmod u+x mopac_wrapper.sh". This script will then call mopac using the same arguments that you call the script with, and automatically skip past any prompts given by the real mopac executable.
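
The two manual steps above (creating the file and marking it executable) can also be done in one go. The following is a minimal sketch; the function name create_mopac_wrapper and its optional path argument are hypothetical, and the wrapper contents are exactly the three lines shown above.

```shell
# Create the wrapper script described above and mark it executable.
#   $1  path of the wrapper to create (defaults to ./mopac_wrapper.sh)
create_mopac_wrapper() {
    local target="${1:-mopac_wrapper.sh}"

    # The quoted 'EOF' keeps ${@} literal so it is expanded only when
    # the wrapper itself runs, not while the file is being written.
    cat > "$target" <<'EOF'
#!/usr/bin/env sh

yes "" | mopac "${@}"
EOF

    chmod u+x "$target"
}
```

After running "create_mopac_wrapper", the wrapper can be called in place of mopac, for example "./mopac_wrapper.sh input_file.mop".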


    mpiBLAST

    Software Support Level: 
    Primary Support
    Software Description: 

    mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. mpiBLAST takes advantage of distributed computational resources, i.e., a cluster, through explicit MPI communication and thereby utilizes all available resources, unlike standard NCBI BLAST, which can only take advantage of shared-memory multi-processor computers. The primary advantage of using mpiBLAST over traditional NCBI BLAST is performance. mpiBLAST can increase performance by several orders of magnitude while still producing results identical to those of NCBI BLAST.

     

    Additional Information
    Software Access Level: 
    Open Access
    PBS Example: 
    An example PBS script for submitting mpiBLAST jobs to the queue is shown below.
    #!/bin/bash -l
    #PBS -l nodes=30:ppn=8,walltime=24:00:00
    #PBS -m abe
    #PBS -M USERNAME@umn.edu
    
    module load mpiblast
    cd /lustre/USERNAME
    
    mpiexec -np 240 mpiblast -i test.fasta -p blastx -d uniref90.fasta \
            -m 9 -o blastx.out --use-virtual-frags &> blastx.error
    
    The --use-virtual-frags option should provide a small speedup and solves the problem of full local scratch spaces. The -m 9 option generates easily-parsed tab-delimited output.
     
    Software Categories: 
    Software Interactive/GUI: 
    No
    General Linux Documentation: 
    To initialize this software in a Linux environment run the command:
    module load mpiblast
    Before running mpiBLAST, a configuration file must be created.  Create the file ~/.ncbirc (in your home directory) with the contents
    [mpiBLAST]
    Shared=/lustre/USERNAME/blastdb
    Local=/scratch
    
    [NCBI]
    Data=/soft/mpiblast/VER/ncbi/data
    
    [BLAST]
    BLASTDB=/lustre/USERNAME/blastdb
    BLASTMAT=/soft/mpiblast/VER/ncbi/data
    where VER is the version of mpiBLAST you are using.  Also, create the folder /lustre/USERNAME/blastdb and copy an existing mpiBLAST database:
    mkdir -p /lustre/USERNAME/blastdb
    cp /project/db/mpiblast/uniref90.fasta.* /lustre/USERNAME/blastdb
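
The configuration file described above can be generated from the two values that vary, the user name and the mpiBLAST version. This is a minimal sketch under stated assumptions: the function name write_ncbirc and its third (output path) argument are hypothetical, and the version string passed in must be replaced with the version of the module actually loaded.

```shell
# Write an mpiBLAST configuration file with the layout shown above.
#   $1  user name (fills in USERNAME)
#   $2  mpiBLAST version (fills in VER)
#   $3  output path (defaults to ~/.ncbirc)
write_ncbirc() {
    local user="$1" ver="$2" out="${3:-$HOME/.ncbirc}"

    cat > "$out" <<EOF
[mpiBLAST]
Shared=/lustre/${user}/blastdb
Local=/scratch

[NCBI]
Data=/soft/mpiblast/${ver}/ncbi/data

[BLAST]
BLASTDB=/lustre/${user}/blastdb
BLASTMAT=/soft/mpiblast/${ver}/ncbi/data
EOF
}
```

For example, "write_ncbirc "$USER" VER" writes ~/.ncbirc for the current user once VER is replaced by the real version string.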
    mpiBLAST cannot use serial ncbi BLAST databases (such as those in /project/db/blast/current). You must create an mpiBLAST database using mpiformatdb (uniref90.fasta db takes ~10 minutes to build, nr takes ~1hr).  Below is an example of using mpiformatdb:
    cd /lustre/USERNAME/blastdb
    wget url.to.dataset/dataset.fasta
    module load mpiblast
    mpiformatdb --nfrags=16 -i dataset.fasta
    
    It is highly recommended that the number of fragments equal the number of cores performing the search or, when that is not possible, that the number of fragments be an exact multiple of the number of cores. This avoids load imbalance, which degrades performance.  On Itasca, 16 is the recommended value for the number of fragments.
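
The rule above is simple arithmetic and can be checked before submitting a job. The sketch below is illustrative only; the function names total_cores and frags_match_cores are hypothetical, and the check implements nothing more than the "equal to, or an exact multiple of, the number of cores" rule as stated.

```shell
# Number of cores requested by a PBS line such as nodes=N:ppn=P.
total_cores() {
    echo $(( $1 * $2 ))
}

# Succeeds when the fragment count equals the core count or is an
# exact multiple of it; fails otherwise.
frags_match_cores() {
    local nfrags="$1" ncores="$2"
    [ "$ncores" -gt 0 ] && [ "$nfrags" -ge "$ncores" ] && [ $(( nfrags % ncores )) -eq 0 ]
}

# Example: 2 nodes with 8 cores each, database built with 16 fragments.
cores="$(total_cores 2 8)"
if frags_match_cores 16 "$cores"; then
    echo "16 fragments fit $cores cores"
else
    echo "16 fragments do not divide evenly over $cores cores"
fi
```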
     

    Use of Vtune for performance optimization

    Outline

    Introduction

    Use of Vtune on different MSI systems

    Interactive profiling

    Command-line options

    Profiling MPI applications

    Introduction

    Intel® VTune™ Amplifier XE 2013 is the premier performance profiler for C, C++, C#, Fortran, Assembly and Java*. It is available on all MSI Linux systems so that users can evaluate the performance of their applications (identify and remove hotspots). The objective is to enable all applications to run efficiently on any MSI system. Experienced users can explore each of the performance metrics embedded in Vtune in depth. The performance evaluation process itself can also help users learn and understand the cutting-edge technologies available in the HPC world.

    Use of Vtune on different systems

    The vtune module has been set up on all systems. Applications can be profiled not only through the graphical interface amplxe-gui, but also through the command-line interface amplxe-cl. The former suits short interactive profiling, while the latter is useful for collecting information at run time. Users who need interactive profiling should see the Find_hotspots section for details.

    Table 1: Profiling metrics associated with the micro-architecture on different systems

    System name: sub-system specific features

    Itasca - Nehalem processor

    General Exploration; Read Bandwidth; Write Bandwidth; Memory Access; Cycles and Ops; Front End Investigation.

    Itasca - Sandy Bridge processor

    General Exploration; Memory Bandwidth; Access Contention; Branch Analysis; Client Analysis; Core Port Saturation; Cycles and Ops.

    Cascade - Knights Corner (Phi) processor

    Lightweight Hotspots; Memory Bandwidth; General Exploration.

    Cascade - Core i7 980x processor

    Lightweight Hotspots; Hotspots; Concurrency; Locks and Waits.

    Lab Linux workstations

    Lightweight Hotspots; Hotspots; Concurrency; Locks and Waits.

    Command-line options

    The command-line interface amplxe-cl gives users a convenient way to profile a real application. Users need to load the vtune module and specify the analysis type of interest. Here is the basic format:

           module load vtune

           amplxe-cl -collect $analysis_type -result-dir $yourprof_dir -- myApplication

            where $analysis_type is the analysis option chosen for the processor of the sub-system in use (see Table 2 for the analysis types supported on different platforms); $yourprof_dir is the directory in which the profiling information is saved; and myApplication is the program that you want to profile.  After the job finishes, you can view the profiling results with either the graphical interface:

           amplxe-gui $yourprof_dir

    or the command-line interface:

          amplxe-cl -report  $report_type  -result-dir  $yourprof_dir

            where the $report_type should match the selected $analysis_type 
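
The collect and report steps above can be kept consistent by generating both command lines from the same values. This is a sketch only: the function name vtune_commands and the result directory name prof_hotspots are hypothetical, and pairing the hotspots analysis with a hotspots report is an assumption that follows the "report type should match" guidance above. The function prints the commands; it does not run them.

```shell
# Print the collect and report commands for one profiling run.
#   $1  analysis type (see Table 2)
#   $2  report type
#   $3  result directory
#   $4+ application and its arguments
vtune_commands() {
    local analysis="$1" report="$2" result_dir="$3"
    shift 3
    echo "amplxe-cl -collect $analysis -result-dir $result_dir -- $*"
    echo "amplxe-cl -report $report -result-dir $result_dir"
}

vtune_commands hotspots hotspots prof_hotspots ./myApplication
```

The printed lines can be pasted into a job script after "module load vtune".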

     

    Table 2: Available analysis types for different micro-architectures

    System name: options available on different sub-systems

    General

        concurrency
        frequency
        hotspots
        locksandwaits
        sleep

    Sandy Bridge processor

        snb-access-contention
        snb-bandwidth
        snb-branch-analysis
        snb-client
        snb-core-port-saturation
        snb-cycles-uops
        snb-general-exploration
        snb-memory-access
        snb-port-saturation

    Phi processor

        knc-bandwidth
        knc-general-exploration
        knc-lightweight-hotspots

    Nehalem/Westmere processor

        nehalem-cycles-uops
        nehalem-frontend-investigation
        nehalem-general-exploration
        nehalem-memory-access


    Please note that the General analysis types in Table 2 apply to every platform on which you want to use Vtune. Details about a particular analysis type can be displayed with

             amplxe-cl --help $analysis_type

    For example,

             amplxe-cl --help concurrency

     

    Find_hotspots

    Fortran Applications

    C and C++ applications
     

    Profile MPI applications

    MPI jobs can be analyzed by using Vtune with the Intel MPI implementation. Here are the simplified commands for profiling MPI jobs:

    module load intel impi vtune

    mpirun -r ssh -f $PBS_NODEFILE -np 256 amplxe-cl -collect $analysis_type -result-dir $yourprof_dir ./test > run.out 
    

    After the job runs successfully, one can view the profiling results through either the graphical or the command-line interface.

    Comprehensive information can be found from the software document - Analyzing MPI applications.

     

    Chenomx NMR Suite

    Software Support Level: 
    Secondary Support
    Software Description: 

    Chenomx NMR suite is an integrated suite of tools that allow you to easily identify and quantify metabolites in NMR spectra. The following products make up Chenomx NMR Suite:

    Chenomx Profiler
    Chenomx Profiler offers you a broad range of tools to assist in identifying and quantifying compound concentrations based on data in an NMR spectrum.
    Using a compound library of over 260 unique spectral signatures together with sophisticated computer-assisted fitting routines, you can accomplish in minutes what would have taken hours or days using manual analysis methods.
     
    Chenomx Compound Builder
    Chenomx Compound Builder was developed to address a recurring customer request: the capability to add custom compounds to a Compound Library. Compound Builder allows a user to create a signature that models the compound of interest.
    This capability lets users create signatures for compounds that are not present in the standard Chenomx Compound Library, and subsequently use those proprietary compounds in an analysis within Profiler.
     
    Chenomx Spin Simulator
    Chenomx Spin Simulator is a simple yet powerful tool for creating simulations of NMR spectra, based on user-defined spin systems, coupling relationships and reference spectra. You can use these simulations as starting points for creating your own compound signatures based on fundamentals of NMR theory.
     
    Chenomx Library Manager
    Chenomx Library Manager allows you to create and manage Compound Sets for use in Profiler. Compound Sets can contain any compound signatures in your library, including those from the Chenomx library as well as those that you create with Compound Builder.
    You can tailor a Compound Set to a specific project. For example, in a research project, you may only be interested in examining the concentrations of five particular compounds for a drug study; you can use Library Manager to create a Compound Set including only those compounds.
     
    Chenomx Processor
    Chenomx Processor allows a variety of native spectrum formats to be converted into the Chenomx file format. These include Varian, Bruker, and JCAMP (Version 5.1 and above).
     
    In addition, Processor allows users to identify or manually override the automatically determined parameters for the Chemical Shift Indicator. Processor can also be used to determine and set a variety of properties for the spectrum, such as pH.
     
    The Chenomx license is for two instances of Chenomx. Chenomx can be run in the BSCL and through NICE (http://msi.umn.edu/content/nice). Chenomx currently runs in the BSCL on bl3-bl12.
    Software Access Level: 
    Open Access
    Software Interactive/GUI: 
    No
    General Linux Documentation: 

    To run Chenomx  interactively in the BSCL  Linux environment run the commands:

    module load chenomx 
    
    ChenomxNmrSuite.sh

    When the program asks for a license file click the license button and browse for

    /soft/chenomx/8.01.licensefiles

    and choose the license file there. There will only be one. You then need to load the compound libraries. To load the libraries, follow these instructions.

    1. Open the Library Manager module of Chenomx NMR Suite.
    2. On the Compounds menu, click Import...
    3. Click the Browse button (...), and navigate to the folder in which you saved your Chenomx compound packs (.pack). On the machines they are located at /soft/chenomx
    4. Select the compound packs that you would like to install, and click Choose Files.
    5. Click Next.
    6. If you would like Library Manager to automatically create compound sets for the imported compounds, select Create associated compound sets. Creating compound sets is strongly recommended. If you prefer to simply import the compounds, clear the Create associated compound sets check box. You will need to create compound sets later to use the imported compounds in Profiler.
    7. Click Finish.

    To run Chenomx remotely:

    Connect to MSI through NICE (http://msi.umn.edu/content/nice). Then run the commands

    module load chenomx/8.01
    ChenomxNmrSuite.sh

    If prompted to load a license, navigate to /soft/chenomx/8.01.licensefiles and select the license file whose name corresponds to your NICE session. To view the online manual, click the help icon on the Chenomx start-up menu.
