Life Sciences


QIIME 2 is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results. QIIME 2 is a complete redesign and rewrite of the QIIME 1 microbiome analysis pipeline.


RAxML (Randomized Axelerated Maximum Likelihood) is a program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees. It can also be used for post-analyses of sets of phylogenetic trees, analyses of alignments, and evolutionary placement of short reads.




From the OrthoFinder website:

Accurate inference of orthologous gene groups made easy. "OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthologous gene group inference accuracy"


FALCON and FALCON-Unzip provide algorithms to assemble Single Molecule Real-Time (SMRT®) Sequencing data into highly accurate, contiguous, and correctly phased diploid genomes.


From the HISAT webpage:


TransDecoder is software for identifying candidate coding regions within transcript sequences, including those from de novo RNA-Seq transcript assembly via Trinity, or from RNA-Seq alignments to the genome using Tophat and Cufflinks.


GeneMark is the name of a set of gene prediction programs developed at the Georgia Institute of Technology. GeneMark-ET is a member of this set of software for semi-supervised analysis of novel genomes.

To use this program at MSI, please send a request asking to be added to the GeneMark group. The use of GeneMark at MSI is limited to academic researchers.



PBSuite aka PBJelly

PBJelly is a highly automated pipeline that aligns long sequencing reads (such as PacBio RS reads or long 454 reads in fasta format) to high-confidence draft assemblies. PBJelly fills or reduces as many captured gaps as possible to produce upgraded draft genomes. Each step in PBJelly’s workflow can be run on a cluster, thus parallelizing the gap-filling process for rapid turnaround, even for very large eukaryotic genomes.

PacBio SMRT Analysis Portal

The PacBio Single Molecule Real-Time (SMRT) analysis portal is an easy-to-use web-based platform for analyzing third-generation sequencing data generated from the PacBio SMRT platform. Currently, workflows for microbial whole genome assembly, resequencing analysis, transcriptome analysis, and various data processing steps are available through the portal. For more information on the analysis portal itself, see the tutorial.