High-Performance De Novo RNA-Transcript Reconstruction Leveraging Distributed Memory and Massive Parallelization

Abstract: 
<h4>High-Performance De Novo RNA-Transcript Reconstruction Leveraging Distributed Memory and Massive Parallelization</h4><p>These researchers are working to optimize the performance of the Trinity RNA-Seq de novo assembly software. This project exemplifies collaborative software development between industry and academia to tackle computational challenges in manipulating large volumes of next-gen sequence data and to leverage advances in algorithm development and compute hardware. Three versions of Inchworm, the computationally intensive part of Trinity (one based on the original OpenMP version, and two new versions based on MPI and on Fortran 2008), are integrated into the Galaxy web interface.</p><p>A bibliography of this group&#39;s publications acknowledging MSI is attached.</p>
Group name: 
sosac
Attachment: 

Three-dimensional computer reconstructions of histological brain preparations and neural network analysis

Abstract: 
<p><strong>Three-Dimensional Computer Reconstructions of Histological Brain Preparations and Neural Network Analysis</strong></p> <p>The growing success and widespread acceptance of deep brain stimulation (DBS) for Parkinson&rsquo;s and essential tremor has opened up the possibilities for applying brain stimulation for other neurological disorders and conditions. The Lim lab pushes to develop new neural prostheses for hearing restoration and tinnitus suppression. This research requires parallel experiments in animals and in humans to understand the various ways in which we can electrically stimulate different brain regions to restore normal auditory function in patients suffering from hearing loss and debilitating tinnitus. For the animal experiments, special electrode arrays are implanted into different brain regions and stimulated with various parameters to characterize the corresponding activation effects on neural coding and perception. After each experiment, histological slices are used to reconstruct three-dimensional computational brain models using Rhinoceros software to identify the locations of our electrode arrays. The researchers also perform extensive spiking pattern analysis and correlations across locations to identify how specific brain regions are related to different electrical stimulation brain activation patterns. These results not only help characterize different brain regions but will also help to identify optimal locations for implanting neural implants in future patients. The large data files used to create these brain reconstructions and neural analyses require the high-performance computers available at MSI to ensure effective and efficient creation and manipulations of the various brain and network models.</p>
Group name: 
limhh

PerM

Software Description: 

PerM is a software package designed to perform highly efficient genome-scale alignments for hundreds of millions of short reads produced by the ABI SOLiD and Illumina sequencing platforms. PerM currently provides full sensitivity for alignments within 4 mismatches for 50 bp SOLiD reads and within 9 mismatches for 100 bp Illumina reads.

 

Algorithm and Performance:

 

With its special periodic spaced seeds, PerM can be fully sensitive to four mismatches and highly sensitive to higher numbers of mismatches. This seed-matching method has speed advantages for longer reads (although currently limited to 64 bp), for non-mappable reads (because the number of shifts and checks is fixed), and in genome-scale mapping, owing to the high seed weight. PerM maps about 43 million reads per CPU hour while remaining fully sensitive to 3 mismatches and highly sensitive to more than 3 mismatches for 50 bp SOLiD reads. PerM can build the reference index in parallel; building the human genome index takes half an hour with 16 CPUs and 15 GB of memory.
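
As a rough illustration of the spaced-seed idea described above, the following shell sketch compares two sequences only at the positions that a seed mask marks with 1. The mask, the function names seed_key and seed_hit, and the example sequences are all hypothetical; PerM's actual seed patterns and index structures are not shown here.

```shell
# Build a lookup key from the characters of a sequence that sit under
# the "1" positions of a seed mask; "0" positions are ignored.
# $1 = sequence, $2 = mask made of 1s (compare) and 0s (ignore)
seed_key() {
    awk -v seq="$1" -v mask="$2" 'BEGIN {
        key = ""
        for (i = 1; i <= length(mask); i++)
            if (substr(mask, i, 1) == "1")
                key = key substr(seq, i, 1)
        print key
    }'
}

# Two windows "hit" under a seed when their keys are identical.
# $1 = read window, $2 = reference window, $3 = mask
seed_hit() {
    [ "$(seed_key "$1" "$3")" = "$(seed_key "$2" "$3")" ]
}
```

For instance, `seed_hit ACGTACGTA ACTTAGGTC 110110110` succeeds even though the two sequences differ at three positions, because all three differences fall on ignored positions of this periodic mask.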

 

Additional Information

Software Support Level: 
Secondary Support
Software Access Level: 
Open Access
Software Categories: 
Software Interactive/GUI: 
No
General Linux Documentation: 

To run this software interactively in a Linux environment run the commands:

module load perm
perm

Effector-Triggered Immunity and Effector Targets in the Bacterial Wilt Disease of Tomato

Abstract: 
<p><strong>Effector-Triggered Immunity and Effector Targets in the Bacterial Wilt Disease of Tomato</strong></p> <p>In the field, plants are exposed to numerous pathogenic microorganisms and employ a multi-layered defense arsenal to limit pathogen invasion and growth. The bacterial wilt pathogen, <em>Ralstonia solanacearum</em>, causes one of the most devastating bacterial diseases of plants worldwide, affecting hundreds of plant species including many major crops such as tomato and potato. This pathogen typically infects plants via the root systems and ultimately colonizes the plant vasculature resulting in plant wilting and death. <em>R. solanacearum</em> employs a type III secretion system to deliver effector proteins into the plant cell. The role of most of these effectors in virulence or effector-triggered immunity (ETI) has not been explored. This research examines the role of <em>R. solanacearum</em> effectors in immunity and disease of tomato. MSI resources are used to analyze DNA sequences and large-scale RNA-seq datasets. The findings will be broadly important, as they will further the understanding of the mechanism of bacterial wilt pathogenesis, which should help in crop protection.</p>
Group name: 
mitrar1

Large-Scale Interactive Scientific Visualization

Abstract: 
<h3 class="red">Large-Scale Interactive Scientific Visualization</h3><p>This project develops and studies visualization, computer graphics, and human-computer interaction techniques for analyzing scientific datasets. Datasets for the work come from a variety of collaborators, including researchers studying medical device design, fluid simulation, and bioinformatics. Using the scientific research questions of these collaborators as a driving force, this project investigates new approaches to data analysis based on interactive, exploratory visualization. The work utilizes emerging technologies, such as 3D trackers, custom multi-touch computer interfaces, and a 30-foot wide stereoscopic display, available in the LCSE-MSI Visualization Laboratory. Ultimately, this work can help us to better understand how to explore and analyze the complex large-scale datasets that result from high-performance simulations, helping scientists to both evaluate and generate new hypotheses while leveraging the power of the human visual system and interactive tools.</p><p>Return to this PI&#39;s <a href="https://www.msi.umn.edu/pi/8b9f50c4b0e69f30803a44c0bd447580/10127">main page</a>.</p>
Group name: 
keefedf

Auxin Regulation of Plant Growth and Development

Abstract: 
<h3 class="red">Auxin Regulation of Plant Growth and Development</h3><p>Reversible Ser/Thr phosphorylation of proteins is a major regulatory mechanism of numerous cellular functions. Type 2C protein phosphatases (PP2Cs) represent a major class of Ser/Thr phosphatases, and defects in several human PP2Cs have been implicated in cancer, diabetes, heart disease, neural disorders, and stress signaling. However, little is known about the mechanisms by which PP2C activity is regulated.</p><p>The plant hormone auxin regulates virtually every aspect of plant growth and development. Small Auxin Up-RNA (SAUR) genes represent the largest class of auxin-induced genes. The SAUR19-24 subset of highly related SAUR proteins specifically interact with and inhibit the enzymatic activity of PP2C.D family phosphatases to promote cell expansion. In part, this involves SAUR proteins preventing PP2C.D-mediated dephosphorylation of a key regulatory site of plasma membrane H+-ATPases. The long-term goal of this project is to thoroughly understand the molecular mechanisms underlying auxin-mediated control of plant growth and development. More specifically, this work will characterize and illuminate the mechanism by which SAUR proteins regulate PP2C.D phosphatases to control auxin-mediated cell expansion and other aspects of growth and development.</p><p>These studies include genetic, molecular, biochemical, and structural approaches to elucidate the regulatory mechanisms by which SAURs control PP2C activity in the model plant <em>Arabidopsis thaliana</em>. This plant provides a powerful genetic system for investigating conserved regulatory processes within multicellular eukaryotes. PP2C.D functions will be revealed through genetic analyses and phosphoproteomic profiling experiments that will define PP2C.D regulated pathways and identify potential phosphoprotein substrates important for auxin-mediated growth. 
Secondly, the regulation of SAUR protein stability will be investigated to gain insight into the mechanism and developmental regulation plants employ to control the abundance and activity of this important family of PP2C inhibitors. Lastly, the structure of a SAUR-PP2C.D complex will be determined and tested in biochemical and genetic assays to illuminate the molecular mechanism of SAUR inhibition of PP2C.D activity at atomic resolution. The findings from this research will likely have direct parallels to the mechanisms human cells employ to regulate PP2C activity, as PP2C structure and function are highly conserved. Such detailed understanding of PP2C regulatory mechanisms will facilitate the development of novel therapeutic strategies to alter PP2C activity and combat disease. Further, as humans depend on plants for sources of food, fiber, medicine, and fuel, this work will elucidate plant growth control by the SAUR-PP2C.D regulatory module and potentially lead to novel strategies for manipulating plant growth to benefit human health.</p><p>Return to this PI&#39;s <a href="https://www.msi.umn.edu/pi/4151823d1c44e9084a87b3c45832ebdd/10485">main page</a>.</p>
Group name: 
grayw0

Experimental Complex Trait Genetics

Abstract: 
<h3 class="red">Experimental Complex Trait Genetics</h3><p>This group works on understanding how genetic differences between individuals influence gene expression, cell biology, fitness, and other traits. Their work is about evenly split between experimental work (cell and molecular biology) and computational analyses of genomic datasets. They use MSI for several purposes. In analyses of high-throughput sequencing data (genomic DNA, RNA sequencing, and counting of molecular barcodes), they handle sequence files, align them to reference genomes, and perform various manipulations (e.g., trimming, counting). They use a set of relatively standard academic software tools (sometimes multi-threaded) for sequence analyses, along with UNIX, Python, and Perl scripts. The group also performs statistical analyses, primarily in R. Some workflows involve parallelization (simulations, permutations) in a cluster environment.</p><p>Return to this PI&#39;s <a href="https://www.msi.umn.edu/pi/5da4afc4536087f86aca624a9f33e380/10953">main page</a>.</p>
Group name: 
albertf

FFTW

Software Description: 

FFTW is a free collection of fast C routines for computing the Discrete Fourier Transform in one or more dimensions. It includes complex, real, and parallel transforms, and can handle arbitrary array sizes efficiently. FFTW is typically faster than other publicly available FFT implementations, and is even competitive with vendor-tuned libraries. To achieve this performance, FFTW uses novel code-generation and runtime self-optimization techniques. The FFTW package was developed at MIT by Matteo Frigo and Steven G. Johnson. More information can be found at the web site: www.fftw.org.

Software Support Level: 
Primary Support
Software Access Level: 
Open Access
Software Categories: 
Software Interactive/GUI: 
No
General Linux Documentation: 

To initialize this software in a Linux environment run the command:

module load fftw

Linking against FFTW at compile time is typically done with the -lfftw3 flag, for example:

g++ source_code.cpp -lfftw3

A variety of versions and builds of FFTW, for both single and double precision, are available on MSI systems. Builds are organized in directories by compiler, MPI version, FFTW version, and single or double precision.

Working examples of codes which use threaded and MPI parallel FFTW routines, along with build scripts, are available on high performance systems at:

/soft/fftw/examples

To retrieve, compile, and run all the examples the following commands may be executed:

cp /soft/fftw/examples/* .
./runme

Proteomics Studies of DNA-Protein Cross-Links and Proteins Binding to Epigenetics Marks in DNA

Abstract: 
<h3 class="red">Proteomics Studies of DNA-Protein Cross-Links and Proteins Binding to Epigenetics Marks in DNA</h3><p>This research consists of two projects:</p><ul><li>Cytosine methylation (5-methylcytosine (MeC)) regulates gene expression in a tissue-specific manner. These methylation marks are introduced by DNA methyltransferases (DNMTs), which catalyze de novo methylation of CpG sites and maintain DNA methylation patterns to allow for activation and inactivation of specific genes. Cytosine methylation patterns are stable in normal somatic tissues, but are significantly altered in many diseases including cancer, asthma, and autism. Ten-eleven translocation proteins 1-3 (Tet) catalyze α-ketoglutarate dependent oxidation of MeC to 5-hydroxymethyl-cytosine (hmC), 5-formylcytosine (fC), and 5-carboxylcytosine (caC). These oxidized forms of MeC are not recognized by DNMTs and are removed via a base excision mechanism, leading to gene reactivation. Furthermore, studies in the brain have shown that hmC, fC, and caC can be recognized by specific protein readers and may have their own function in human cells. Many key questions remain in regard to the biological roles of Tet proteins and the oxidized forms of MeC in normal cells and in human disease. This project employs quantitative proteomics to establish the biological functions of Tet proteins and oxidized forms of MeC in the lung. These studies will identify protein readers of MeC, hmC, fC, and caC in human bronchial epithelial cells. This will contribute to basic understanding of epigenetic regulation in human cells and will help identify novel targets for therapeutic interventions.</li><li>Exposure to common antitumor drugs, environmental toxins, transition metals, UV light, ionizing radiation, and free radical-generating systems can result in cellular proteins becoming covalently trapped on DNA. The resulting DNA-protein cross-links (DPCs) are hypothesized to be toxic and mutagenic. 
DPCs are known to progressively accumulate in heart and brain tissues with age. This project employs mass spectrometry based proteomics to detect and characterize DPCs in cells and tissues. These tools are now being employed to characterize DPC formation in the heart following a myocardial infarction. The researchers hypothesize that DPCs play an important role in the cardiotoxicity that follows myocardial infarctions. This information should aid in the development of novel protective agents which could be utilized to prevent cardiac tissue damage following a heart attack.</li></ul><p>Return to this PI&rsquo;s <a href="https://www.msi.umn.edu/pi/75d3ef571cb8a5c2966cd737d0c9fce5/47000">main page</a>.</p>
Group name: 
tretyako

mpiBLAST

Software Description: 

mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. mpiBLAST takes advantage of distributed computational resources (i.e., a cluster) through explicit MPI communication and can thereby use all available resources, unlike standard NCBI BLAST, which can only take advantage of shared-memory multiprocessor computers. The primary advantage of mpiBLAST over traditional NCBI BLAST is performance. mpiBLAST can increase performance by several orders of magnitude while still producing results identical to those of NCBI BLAST.

 

Additional Information
Software Support Level: 
Primary Support
Software Access Level: 
Open Access
PBS Example: 
An example PBS script for submitting mpiBLAST jobs to the queue is shown below.
#!/bin/bash -l
#PBS -l nodes=30:ppn=8,walltime=24:00:00
#PBS -m abe
#PBS -M USERNAME@umn.edu

module load mpiblast
cd /lustre/USERNAME

mpiexec -np 240 mpiblast -i test.fasta -p blastx -d uniref90.fasta \
        -m 9 -o blastx.out --use-virtual-frags &> blastx.error
The --use-virtual-frags option should provide a small speedup and avoids the problem of local scratch space filling up. The -m 9 option generates easily parsed tab-delimited output.
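
Because the -m 9 output is tab-delimited, it can be filtered with standard tools. The sketch below assumes the usual 12-column tabular layout of legacy NCBI BLAST (query, subject, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, bit score), with header lines starting with "#". The function name filter_hits and the choice of an e-value cutoff are illustrative and are not part of mpiBLAST.

```shell
# Keep only hits whose e-value is at or below a cutoff and print
# query, subject, e-value, and bit score as tab-separated columns.
# $1 = maximum e-value; remaining arguments = input files (stdin if none)
filter_hits() {
    max_evalue=$1
    shift
    awk -F '\t' -v max="$max_evalue" '
        /^#/ { next }                      # skip "#" header lines written by -m 9
        NF >= 12 && $11 + 0 <= max + 0 {   # column 11 is assumed to hold the e-value
            print $1 "\t" $2 "\t" $11 "\t" $12
        }
    ' "$@"
}
```

With the job script above, a call such as `filter_hits 1e-5 blastx.out > blastx.filtered.tsv` would keep only the stronger hits; the output file name is arbitrary.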
 
Software Categories: 
Software Interactive/GUI: 
No
General Linux Documentation: 
To initialize this software in a Linux environment run the command:
module load mpiblast
Before running mpiBLAST, a configuration file must be created. Create the file ~/.ncbirc (in your home directory) with the following contents:
[mpiBLAST]
Shared=/lustre/USERNAME/blastdb
Local=/scratch

[NCBI]
Data=/soft/mpiblast/VER/ncbi/data

[BLAST]
BLASTDB=/lustre/USERNAME/blastdb
BLASTMAT=/soft/mpiblast/VER/ncbi/data
where VER is the version of mpiBLAST you are using. Also, create the folder /lustre/USERNAME/blastdb and copy an existing mpiBLAST database into it:
mkdir -p /lustre/USERNAME/blastdb
cp /project/db/mpiblast/uniref90.fasta.* /lustre/USERNAME/blastdb
mpiBLAST cannot use serial NCBI BLAST databases (such as those in /project/db/blast/current). You must create an mpiBLAST database using mpiformatdb (the uniref90.fasta database takes ~10 minutes to build; nr takes ~1 hour). Below is an example of using mpiformatdb:
cd /lustre/USERNAME/blastdb
wget url.to.dataset/dataset.fasta
module load mpiblast
mpiformatdb --nfrags=16 -i dataset.fasta
It is highly recommended that the number of fragments equal the number of cores performing the search or, when that is not possible, that the number of fragments be an exact multiple of the number of cores. This avoids load imbalance, which degrades performance. On Itasca, 16 is the recommended number of fragments.
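
The recommendation above can be checked before a database is formatted. The following is a minimal sketch; the function name check_frags and its output labels are hypothetical and are not part of mpiBLAST.

```shell
# Classify a fragment count against the number of cores used for the search.
# $1 = number of cores, $2 = number of database fragments
check_frags() {
    cores=$1
    frags=$2
    if [ "$cores" -le 0 ] || [ "$frags" -le 0 ]; then
        echo "invalid"
        return 1
    fi
    if [ "$frags" -eq "$cores" ]; then
        echo "equal"          # preferred: one fragment per core
    elif [ $((frags % cores)) -eq 0 ]; then
        echo "multiple"       # acceptable: same number of fragments per core
    else
        echo "unbalanced"     # some cores would receive more fragments than others
        return 1
    fi
}
```

For example, `check_frags 16 32` reports "multiple", while `check_frags 16 24` reports "unbalanced" and returns a non-zero status that a job script could test before calling mpiformatdb.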
 
