mpiBLAST

mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. It uses explicit MPI communication to spread a search across distributed computational resources such as a cluster, whereas standard NCBI BLAST can only take advantage of shared-memory multi-processor computers. The primary advantage of mpiBLAST over traditional NCBI BLAST is performance: mpiBLAST can increase performance by several orders of magnitude while still producing results identical to those of NCBI BLAST.

SW Documentation: 
To initialize this software in a Linux environment run the command:
module load mpiblast
Before running mpiBLAST, a configuration file must be created. Create the file ~/.ncbirc (in your home directory) with the following contents:
[mpiBLAST]
Shared=/lustre/USERNAME/blastdb
Local=/scratch

[NCBI]
Data=/soft/mpiblast/VER/ncbi/data

[BLAST]
BLASTDB=/lustre/USERNAME/blastdb
BLASTMAT=/soft/mpiblast/VER/ncbi/data
where VER is the version of mpiBLAST you are using. Also, create the folder /lustre/USERNAME/blastdb and copy an existing mpiBLAST database into it:
mkdir -p /lustre/USERNAME/blastdb
cp /project/db/mpiblast/uniref90.fasta.* /lustre/USERNAME/blastdb
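The configuration file described above can also be generated rather than typed by hand. The following is a minimal sketch, not a site-provided tool: the function name write_ncbirc is hypothetical, and it only fills the USERNAME and VER placeholders into the same template shown earlier.

```shell
# Print an .ncbirc for mpiBLAST to standard output.
# write_ncbirc is a hypothetical helper name, not part of mpiBLAST.
#   $1 = username (fills USERNAME in /lustre/USERNAME/blastdb)
#   $2 = mpiBLAST version directory (fills VER in /soft/mpiblast/VER)
write_ncbirc() {
    cat <<EOF
[mpiBLAST]
Shared=/lustre/${1}/blastdb
Local=/scratch

[NCBI]
Data=/soft/mpiblast/${2}/ncbi/data

[BLAST]
BLASTDB=/lustre/${1}/blastdb
BLASTMAT=/soft/mpiblast/${2}/ncbi/data
EOF
}
```

Redirecting the output, for example write_ncbirc "$USER" VER > ~/.ncbirc with VER replaced by the version in use, would overwrite any existing ~/.ncbirc, so check the printed result first.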
mpiBLAST cannot use serial NCBI BLAST databases (such as those in /project/db/blast/current). You must create an mpiBLAST database using mpiformatdb (the uniref90.fasta database takes ~10 minutes to build; nr takes ~1 hour). Below is an example of using mpiformatdb:
cd /lustre/USERNAME/blastdb
wget url.to.dataset/uniref90.fasta
module load mpiblast
mpiformatdb --nfrags=16 -i uniref90.fasta
It is highly recommended that the number of fragments equal the number of cores performing the search or, when that is not possible, be an exact multiple of the number of cores. This avoids load imbalance, which degrades performance. On Itasca, 16 is the recommended number of fragments.
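The fragment-count rule above amounts to a divisibility check. As a small sketch (the function name frags_balanced is hypothetical and not part of mpiBLAST), it could be tested before formatting a database:

```shell
# Succeed (exit status 0) when the fragment count equals the core count
# or is an exact multiple of it; fail otherwise.
# frags_balanced is a hypothetical helper name.
#   $1 = number of database fragments (the --nfrags value)
#   $2 = number of cores performing the search
frags_balanced() {
    [ "$1" -gt 0 ] && [ "$2" -gt 0 ] && [ $(( $1 % $2 )) -eq 0 ]
}
```

For example, frags_balanced 32 16 succeeds because 32 is an exact multiple of 16, while frags_balanced 20 16 fails.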
 
An example PBS script for submitting mpiBLAST jobs to the queue is shown below.
#!/bin/bash -l
#PBS -l nodes=30:ppn=8,walltime=24:00:00
#PBS -m abe
#PBS -M USERNAME@umn.edu

module load mpiblast
cd /lustre/USERNAME

mpiexec -np 240 mpiblast -i test.fasta -p blastx -d uniref90.fasta \
        -m 9 -o blastx.out --use-virtual-frags &> blastx.error
The --use-virtual-frags option should provide a small speedup and avoids the problem of local scratch space filling up. The -m 9 option generates easily parsed, tab-delimited output.
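To illustrate why the tab-delimited format is convenient, here is a hedged sketch that keeps the highest-scoring hit for each query. It assumes the usual NCBI BLAST tabular layout of 12 columns (query id, subject id, % identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, bit score) with comment lines beginning with #. The function name best_hits is hypothetical.

```shell
# Read -m 9 output on standard input and print, for each query, the hit
# with the highest bit score as: query, subject, e-value, bit score
# (tab-separated, in order of first appearance of each query).
# best_hits is a hypothetical helper name; the column positions assume
# the standard 12-column tabular layout described above.
best_hits() {
    awk -F'\t' '
        /^#/ || NF < 12 { next }   # skip comment lines and short lines
        !($1 in best) || $12 + 0 > best[$1] + 0 {
            best[$1] = $12
            line[$1] = $1 "\t" $2 "\t" $11 "\t" $12
        }
        !($1 in seen) { seen[$1] = 1; order[++n] = $1 }
        END { for (i = 1; i <= n; i++) print line[order[i]] }
    '
}
```

With the job script above, this could be run as best_hits < blastx.out once the search has finished.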
 
Additional Information
Short Name: mpiBLAST
SW Module: mpiblast
Service Level: Primary
SW Category: