PerM is a software package designed to perform highly efficient genome-scale alignments for hundreds of millions of short reads produced by the ABI SOLiD and Illumina sequencing platforms. PerM currently provides full sensitivity for alignments within 4 mismatches for 50bp SOLiD reads and within 9 mismatches for 100bp Illumina reads.
Algorithm and Performance:
With its special periodic spaced seeds, PerM is fully sensitive to four mismatches and highly sensitive to larger numbers of mismatches. This seed-matching method has speed advantages for longer reads (although currently limited to 64bp), for non-mappable reads (owing to the fixed number of shift-and-check operations), and in genome-scale mapping, due to the high seed weight. PerM maps about 43 million 50bp SOLiD reads per CPU hour while remaining fully sensitive to 3 mismatches and highly sensitive to more than 3 mismatches. PerM can build the reference index in parallel; building the human genome index takes half an hour with 16 CPUs and 15 GB of memory.
FFTW is a free collection of fast C routines for computing the Discrete Fourier Transform in one or more dimensions. It includes complex, real, and parallel transforms, and can handle arbitrary array sizes efficiently. FFTW is typically faster than other publicly available FFT implementations, and is even competitive with vendor-tuned libraries. To achieve this performance, FFTW uses novel code-generation and runtime self-optimization techniques. The FFTW package was developed at MIT by Matteo Frigo and Steven G. Johnson. More information can be found at the web site: www.fftw.org.
To initialize this software in a Linux environment run the command:
module load fftw
Linking with FFTW during compilation is often done with the -lfftw3 flag, for example:
g++ source_code.cpp -lfftw3
A variety of versions and builds of FFTW, for both single and double precision, are available on MSI systems. Builds are organized in directories by compiler, MPI version, FFTW version, and single or double precision.
Working examples of code that uses threaded and MPI-parallel FFTW routines, along with build scripts, are available on high-performance systems at:
To retrieve, compile, and run all the examples the following commands may be executed:
cp /soft/fftw/examples/* .
./runme
mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. mpiBLAST takes advantage of distributed computational resources, i.e., a cluster, through explicit MPI communication, and thereby utilizes all available resources, unlike standard NCBI BLAST, which can only take advantage of shared-memory multi-processor computers. The primary advantage of using mpiBLAST over traditional NCBI BLAST is performance. mpiBLAST can increase performance by several orders of magnitude while producing output identical to that of NCBI BLAST.
#!/bin/bash -l
#PBS -l nodes=30:ppn=8,walltime=24:00:00
#PBS -m abe
#PBS -M USERNAME@umn.edu

module load mpiblast
cd /lustre/USERNAME
mpiexec -np 240 mpiblast -i test.fasta -p blastx -d uniref90.fasta \
    -m 9 -o blastx.out --use-virtual-frags &> blastx.error
module load mpiblast
[mpiBLAST]
Shared=/lustre/USERNAME/blastdb
Local=/scratch

[NCBI]
Data=/soft/mpiblast/VER/ncbi/data

[BLAST]
BLASTDB=/lustre/USERNAME/blastdb
BLASTMAT=/soft/mpiblast/VER/ncbi/data
mkdir -p /lustre/USERNAME/blastdb
cp /project/db/mpiblast/uniref90.fasta.* /lustre/USERNAME/blastdb
cd /lustre/USERNAME/blastdb
wget url.to.dataset/dataset.fasta
module load mpiblast
mpiformatdb --nfrags=16 -i uniref90.fasta