Gamma Ray Astrophysics; Zooniverse Crowdsourcing Science

Abstract: 

The Fortson research group is focused on two main research areas, each of which can require MSI resources.

  • Gamma Ray Astrophysics: VERITAS is an array of four imaging atmospheric Cherenkov telescopes (IACTs), located at the F. L. Whipple Observatory in southern Arizona. The array has been detecting extraterrestrial gamma rays since 2007. In order to properly calibrate the results, large amounts of simulation and data processing are required. In addition to VERITAS, the next-generation gamma-ray experiment CTA, with a factor of 10 improvement in sensitivity over existing arrays, is finalizing development of its low-level systems. One key system is the triggering and event building stage, which collects and associates information from telescopes spread over several square kilometers.  

    The Fortson group at UMN has responsibilities for both VERITAS and CTA development. For VERITAS, they produce a large fraction of the simulations necessary for calibrating the instrument and performing analysis on the data. More processing capability allows them to explore a larger parameter space of observational conditions. Differences in atmospheric humidity and aerosol content between summer and winter require these simulations to be repeated. Simulations are also important for tracking the array's performance as its hardware is upgraded.

    For CTA, the group is developing a novel use of self-assembly algorithms to generate a self-annealing event building architecture. These algorithms are meant to better cope with the high data rate and correspondingly high failure rates. These failures include network errors, timing errors, and other hardware errors. The ability of the CTA event builder to correctly identify the information associated with a particular gamma-ray atmospheric shower is vital to the success of this large-scale project.

    Supercomputing resources are also required for running NASA Fermi LAT gamma-ray analysis. This analysis is typically run in several stages, depending on the data products required, such as counts maps, test statistic maps, spectra, and light curves. For example, a standard binned analysis of a single gamma-ray source (using all the photons collected by the Fermi satellite to date) typically requires about 2 GB of disk storage, 2-4 GB of memory, and approximately 15 CPU hours. This example is for a log-likelihood analysis of an object situated away from the Galactic plane, where the relative number of nearby Fermi sources is smaller and the diffuse background emission is low. For an object on or close to the Galactic plane, the same analysis could easily take 30 CPU hours, depending on the number of sources included in the log-likelihood fit. Data products such as a test statistic map, which can only be generated once the standard analysis is complete, require significantly more CPU time (e.g., ~168 CPU hours), because a maximum likelihood computation is performed on every pixel in the requested map. Typically, computing jobs using the Fermi LAT analysis tools are submitted serially to a batch management system, as illustrated in the sketch below. The group expects to analyze several dozen Fermi LAT sources this year.
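
    The batch script below is a minimal sketch of how one such single-source job might be submitted; the resource requests simply mirror the figures quoted above, and the driver script run_binned_likelihood.sh is a hypothetical placeholder for the group's own workflow built on the Fermi Science Tools:

      #!/bin/bash
      #PBS -l nodes=1:ppn=1,mem=4gb,walltime=30:00:00
      cd $PBS_O_WORKDIR
      # Hypothetical driver around the standard binned-analysis chain
      # (gtselect/gtbin/gtexpcube2/gtlike); replace with the actual analysis script.
      ./run_binned_likelihood.sh source_config.yaml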

  • The Zooniverse is the world’s largest online citizen science platform, and several members of the Fortson group are involved in the development and analysis of Zooniverse project data. The Fortson group will likely need to use MSI resources two to three times during 2016 to batch process hundreds of thousands of images in preparation for their upload to the Zooniverse site.

This PI's work in translational informatics and the Zooniverse project was featured in an MSI Research Spotlight in November 2014.


Group name: 
fortson

MSI Users Bulletin – March 2016

The Users Bulletin provides a summary of new policies, procedures, and events of interest to MSI users. It is published quarterly. To request technical assistance with your MSI account, please contact help@msi.umn.edu. 1. User Accounts: MSI is making changes that will consolidate user accounts and...

Materials Studio

Software Description: 

MS Modeling is Materials Studio's modeling and simulation product suite, and is designed for structural and computational researchers in chemicals and materials R&D who need to perform expert-level modeling and simulation tasks in an easy-to-learn yet powerful environment. It provides flexible and validated tools for the study of materials at various length and time scales.

To use Materials Studio you must be on the Materials Studio user list. Contact MSI Help, help@msi.umn.edu, to be added to the Materials Studio user list. We have only one Materials Studio license. We ask that all users use the calendaring program to sign up for time on Materials Studio. Please specify your name on the calendar. You can use the following link to see the Materials Studio calendar and reserve the license:
Materials Studio Calendar

Follow the instructions at https://www.msi.umn.edu/content/registering-use-msi-software-through-umn-calendar to add a reservation to the calendar.

If you are found to be using Materials Studio during a time when you are not scheduled, your job is subject to being killed.

The following modules are available to MSI researchers; each supports one concurrent user, and the descriptions are from the Accelrys, Inc. web site:

  • MS Visualizer - the core MS Modeling product; provides all of the tools required to construct graphical models of molecules, crystalline materials, and polymers. Additionally, the Visualizer lets you view and analyze these models and provides the software infrastructure and analysis tools to support the full range of Materials Studio products.
  • Amorphous Cell - model construction and property prediction for amorphous materials, particularly polymers.
  • CASTEP - uses density functional theory to provide a good atomic-level description of all manner of materials and molecules.
  • COMPASS - a powerful molecular mechanics force field supporting simulations of solid materials.
  • Discover - molecular mechanics and dynamics methods for structure and property prediction.
  • DMol3 - a unique density functional theory quantum mechanical code for gas phase, solvent, and solid state simulations.
  • Forcite - an advanced classical molecular mechanics tool, which allows fast energy calculations and reliable geometry optimization of molecules and periodic systems.
  • Polymorph Predictor - prediction of potential polymorphs of a given compound directly from the molecular structure.
  • Reflex - powder diffraction simulation enhanced with indexing and refinement capabilities.
  • X-Cell - a novel and robust indexing program for medium- to high-quality powder diffraction data obtained from X-ray, neutron, and electron radiation sources.

For more information, see http://accelrys.com/products/materials-studio/.

Software Support Level: 
Secondary Support
Software Access Level: 
RESTRICTED - Contact Helpdesk
Citrix Documentation: 
To run this software under Windows, first connect to a Windows system following the instructions on the Windows System webpage. In Windows the program can be located at:
  Start > All Programs > Accelrys > Materials Studio 6.1

Additionally, Materials Studio provides some informational files with names ending in .Readme, located in the executable directories (e.g., RunCASTEP.Readme).

Software Categories: 
Software Interactive/GUI: 
No
General Linux Documentation: 

To view the available Materials Studio versions in a Linux environment run the command:

  module avail materialsstudio

To see the locations of the Materials Studio programs, view the paths using the command:

  module show materialsstudio

Most Materials Studio executables are launched using a Run script (for example, RunCASTEP.sh). The script syntax generally requires specifying the number of cores, for example:

  RunCASTEP.sh -n 4 inputfile

This example would start a 4-core CASTEP calculation.
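
A CASTEP run can also be submitted as a batch job rather than launched interactively. The following is a minimal PBS sketch, assuming the materialsstudio module listed by "module avail" above and a hypothetical input name my_job:

  #!/bin/bash
  #PBS -l nodes=1:ppn=4,walltime=24:00:00
  cd $PBS_O_WORKDIR
  # Load the Materials Studio environment, then launch a 4-core CASTEP run.
  module load materialsstudio
  RunCASTEP.sh -n 4 my_job

Remember that the single Materials Studio license still applies, so reserve your time slot on the calendar before the job runs.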

PacBio SMRT Analysis Portal

Software Support Level: 
Primary Support
Software Description: 

The PacBio Single Molecule Real Time (SMRT) analysis portal is an easy-to-use web-based platform for analyzing third-generation sequencing data generated from the PacBio SMRT platform. Currently, workflows for microbial whole genome assembly, resequencing analysis, transcriptome analysis, and various data processing steps are available through the portal. For more information on the analysis portal itself, see http://www.pacb.com/devnet/ and the tutorial materials. The software must be run from a browser on the MSI network. This can be achieved via connection through the NICE interface, or by working directly in one of the MSI laboratories. Due to RAM limits, the portal does not run reliably on the lab queue, so execution is supported on Mesabi only. Genomes up to 100 Mbp in size can be successfully run on Mesabi.

 
 

 

Software Access Level: 
Open Access
Software Categories: 
Software Interactive/GUI: 
No
General Linux Documentation: 

Instructions for SMRT Link version 3

Initial setup (only needed once unless re-installing)

  1. Get an MSI account (https://www.msi.umn.edu/content/eligibility-getting-access).
  2. Set up your NICE client (https://www.msi.umn.edu/support/faq/how-do-i-obtain-graphical-connection-using-nice-system). You will need to download the DCV client first. You must use NICE to access the PacBio analysis portal at MSI remotely, or you can use the computer labs in Walter 575 or Cargill 138.
  3. SSH to MSI, then in the Terminal type: "/home/support/public/smrtlink311/install.sh" and hit return. This will set up the PacBio portal files in your home directory under the folder name "smrtlink".

Running the PacBio portal (Mesabi queue for genomes < 100 Mbp)

Note: you must request a service unit (SU) allocation on Mesabi before proceeding with these instructions.

  1. Open a NICE session (choose a non-GPU session, with more RAM and time for larger genomes).
  2. Within the NICE session, open a terminal.
  3. In the Terminal:
    Type: "ssh -Y login" then enter your MSI account password and hit return to enter the gateway login node.
    Type: "ssh -Y mesabi" then enter your MSI account password and hit return to enter the HPC system.
    Type: "qsub -I -l nodes=1:ppn=8,walltime=24:00:00 -X" then hit return. [NOTE: the first -I is a capital I and the second -l is a lowercase L.]
    [NOTE 2: if you have a larger genome, e.g., > ~30 Mbp, see the advanced user hints below.]
    When prompted, enter your MSI account password then hit return.
    Wait for the job to start.
    Type: "/home/support/public/smrtlink311/start.sh" then hit return. This will start the portal server. Copy down the URL for use in the next step.
  4. In the same Terminal window and interactive job session:
    Type: "firefox &" then hit return.
    When Firefox opens, enter the URL you copied down in the previous step into the browser address bar. It will look something like "http://cn0575:9090/".
  5. Note: If you receive an error message about needing to use Chrome, please follow these steps:
    • In the address bar type "about:config" and hit return.
    • In the list that appears, right-click, and select New -> String from the pop-up menu.
    • For "Enter the preference name", enter "general.useragent.override" (without quotes) and click "OK".
    • For the string value, paste "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2895.0 Safari/537.36" (without quotes) and click "OK".
  6. Do not exit the browser until the job is complete. You can always close out the NICE server and save your session, reconnecting later as you please. The job will run in the background.
  7. When you are finished, be sure to clean up the session by running the stop script: Type: "/home/support/public/smrtlink311/stop.sh" then hit return. Then type "exit" to exit your interactive session.

 

Changing from local jobs to PBS

Edit `userdata/config/preset.xml` and change "False" to "True" for pbsmrtpipe.options.distributed_mode:
<!-- Enable Distributed Mode -->
<option id="pbsmrtpipe.options.distributed_mode">
    <value>True</value>
</option>
 
Edit `userdata/config/smrtlink.config` and change "NONE" to "PBS" for jmsselect__jmstype:
jmsselect__jmstype='PBS';
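
To double-check that both edits took effect, a quick grep from your smrtlink install directory (a sketch; the paths are the same relative paths used above) is:

  # Should show <value>True</value> under the distributed_mode option
  grep -A2 'pbsmrtpipe.options.distributed_mode' userdata/config/preset.xml
  # Should show jmsselect__jmstype='PBS'
  grep 'jmsselect__jmstype' userdata/config/smrtlink.config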

Instructions for SMRT Link version 2

Initial setup (only needed once unless re-installing)

  1. Get an MSI account (https://www.msi.umn.edu/content/eligibility-getting-access).
  2. Set up your NICE client (https://www.msi.umn.edu/support/faq/how-do-i-obtain-graphical-connection-using-nice-system). You will need to download the DCV client first. You must use NICE to access the PacBio analysis portal at MSI remotely, or you can use the computer labs in Walter 575 or Cargill 138.
  3. Open a NICE session. Choose one of the non-GPU options that meets your needs for time and RAM usage.
  4. Within the NICE session, open a terminal.
  5. In the Terminal:
    Type: "isub -m 8gb -w 24:00:00" then hit return.
    When prompted, enter your MSI account password then hit return.
    Wait for the job to start.
    Type: "/home/support/public/smrtanalysis230v2/pacbio_user_setup_230.sh" then hit return. This will set up the PacBio portal files in your home directory under the folder name "smrtanalysis".

Running the PacBio portal (Mesabi queue for genomes < 100 Mbp)

Note: you must request a service unit (SU) allocation on Mesabi before proceeding with these instructions.

  1. Open a NICE session (choose a non-GPU session, with more RAM and time for larger genomes).
  2. Within the NICE session, open a terminal.
  3. In the Terminal:
    Type: "ssh -Y login" then enter your MSI account password and hit return to enter the gateway login node.
    Type: "ssh -Y mesabi" then enter your MSI account password and hit return to enter the HPC system.
    Type: "qsub -I -l nodes=1:ppn=8,walltime=24:00:00 -X" then hit return. [NOTEthe first -I is a capitol i and the second -l is a lowercase L.] 
    [NOTE2: if you have a larger genome, e.g.,  > ~30 Mbp, see advanced user tips below]
    When prompted enter your MSI account password then hit return
    Wait for job to start
    Type: "/panfs/roc/pacbio/start_user_portal.sh" then hit return.  This will start the portal server going.  Copy down the admistrator username/password and URL for use in the next step.
  4. In the same Terminal window and interactive job session:
    Type: "firefox -no-remote &" then hit return
    When Firefox opens, enter the URL you copied down in the previous step into the browser address bar.  It will look something like this "http://cn0575:8080/smrtportal/".
    When prompted for your username and password, enter the administrator username/password you copied in the previous step.
  5. Do not exit the browser until the job is complete. You can always close out the NICE server and save your session, reconnecting later as you please. The job will run in the background.
  6. When you are finished, be sure to clean up the session by running the stop script: Type: "/panfs/roc/pacbio/stop_user_portal.sh" then hit return. Then type "exit" to exit your interactive session.

Advanced user hints

Queue speedups

If you installed your PacBio portal prior to September 25, 2015, your portal is probably set up to use the PBS system, which tends to experience serious delays when running on Mesabi. The portal works much better in multi-threaded mode than in cluster mode, so you'll need to change a few settings in a couple of config files. Edit the following two files:

$HOME/smrtanalysis/install/smrtanalysis_2.3.0.140936/analysis/etc/user.smrtpipe.rc

$HOME/smrtanalysis/install/smrtanalysis_2.3.0.140936/analysis/etc/smrtpipe.rc

Change CLUSTER_MANAGER = PBS to CLUSTER_MANAGER = BASH in both of those files.
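
If you prefer to make this change from the command line, a sed one-liner along these lines should work (a sketch only; the paths assume the default install location shown above):

  ETC=$HOME/smrtanalysis/install/smrtanalysis_2.3.0.140936/analysis/etc
  # Switch from cluster (PBS) mode to multi-threaded (BASH) mode in both files
  sed -i 's/CLUSTER_MANAGER *= *PBS/CLUSTER_MANAGER = BASH/' $ETC/user.smrtpipe.rc $ETC/smrtpipe.rc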

You can continue to follow the instructions above (Mesabi section). When you run 'top', you should now see many processes running on your local node, and qstat -u USERNAME should show only your single interactive batch job.

Adding more processor cores and memory for genomes > 30 Mbp

By default, we've set up the configuration files to use only 8 processor cores and 24 hours of walltime. If you have a large genome, you will greatly benefit from increasing these limits, and you may need more memory. On Mesabi, you may request up to 96 hours of walltime and 32 processor cores on the ram1t nodes (see the Mesabi queue specifications). To take advantage of these increases, edit the two files:

$HOME/smrtanalysis/install/smrtanalysis_2.3.0.140936/analysis/etc/user.smrtpipe.rc

$HOME/smrtanalysis/install/smrtanalysis_2.3.0.140936/analysis/etc/smrtpipe.rc

Change MAX_THREADS = 8 to MAX_THREADS = 32 in both of those files (assuming 32 is the number of cores you want to use).

Change TMP = /tmp to TMP = /scratch.global/<your-user-name> to avoid overflowing /tmp for large genomes.
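
The same edits can be made non-interactively, for example (a sketch only; $USER is assumed to match your MSI username, and the paths assume the default install location):

  ETC=$HOME/smrtanalysis/install/smrtanalysis_2.3.0.140936/analysis/etc
  # Raise the thread limit from 8 to 32 cores in both config files
  sed -i 's/MAX_THREADS *= *8/MAX_THREADS = 32/' $ETC/user.smrtpipe.rc $ETC/smrtpipe.rc
  # Point temporary files at global scratch instead of /tmp
  sed -i "s|TMP *= */tmp|TMP = /scratch.global/$USER|" $ETC/user.smrtpipe.rc $ETC/smrtpipe.rc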

Then log in to Mesabi or Itasca by SSH-ing to one of those machines and submit a request for an interactive job, for example: "qsub -I -l nodes=1:ppn=32,walltime=96:00:00 -q ram1t -X". Then follow the normal instructions.

Troubleshooting errors

If you get an error in the setup or running of the PacBio server, try the steps once more.  If it still fails, try the following:

  1. Open a NICE session.
  2. Within the NICE session, open a terminal.
  3. In the Terminal:
    Type: "isub -m 8gb -w 24:00:00" then hit return.
    When prompted, enter your MSI account password then hit return.
    Wait for the job to start.
    If you wish to save data from previous runs, move or make a copy of your current ~/smrtanalysis directory before proceeding to the next step.
    Type: "/panfs/roc/pacbio/delete_user_portal.sh" then hit return.  This will delete your existing portal data and pending jobs.  Exit the current isub session by typing: "exit" and retry the steps for running the portal above.
  4. If you continue to have problems, send an email to "help@msi.umn.edu", being careful to include "trouble running PacBio portal" in the subject line.
 

Translational Informatics

Application of Informatics to Transcription of Ancient Papyri: While computers can do many things, there are still a few areas in which humans excel, such as the discriminatory power of the eye and the natural human ability to quickly classify objects. The visual ability of recognizing patterns is at...
