AbokiaBLAST is a parallel implementation of NCBI BLAST created by the inventors of the open-source mpiBLAST project. AbokiaBLAST inherits the super-scalable architecture from mpiBLAST but is re-factored and re-engineered to offer production quality. With intelligent task parallelization and I/O optimization, AbokiaBLAST enables users to massively accelerate large-scale BLAST search on clusters or supercomputers with a single command.
Because of licensing restrictions, AbokiaBLAST is currently installed only on Itasca.
To use AbokiaBLAST, users must have SUs on Itasca. To apply for SUs, please refer to the SU Accounting page. Users then need to log in to Itasca and load the program with the "module load" command:
module load abokia-blast
Before running AbokiaBLAST, create a configuration file named .abokia.ini in your home directory (~/.abokia.ini). If multiple copies of .abokia.ini exist, the program looks for the file in this order: working directory -> home directory -> root.
The .abokia.ini file must contain the following items:
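To see which copy of .abokia.ini would be found first, the search order can be approximated with a small shell helper. This is only a sketch: the function name find_abokia_ini is hypothetical and not part of AbokiaBLAST, it assumes the first file found in the search order is the one used, and it covers only the working directory and home directory because the page does not say which directory "root" refers to.

```shell
# Hypothetical helper (not part of AbokiaBLAST): print the first .abokia.ini
# found in the given directories, checked in the order they are listed.
find_abokia_ini() {
    for dir in "$@"; do
        if [ -f "$dir/.abokia.ini" ]; then
            printf '%s\n' "$dir/.abokia.ini"
            return 0
        fi
    done
    return 1
}

# Assumed order: working directory first, then home directory.
find_abokia_ini "$PWD" "$HOME" || echo "no .abokia.ini found" >&2
```

If the path printed is not the one you expected, a stray copy in the working directory is probably taking precedence over the one in your home directory.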
; Shared directory that stores formatted databases, must be accessible by the user
; Directory to cache database fragments and store temporary data.
; Memory size per compute core (in Megabytes).
We recommend using /lustre and /scratch space for temporary data. The fragmented database does not need to be stored under /lustre. Please also note that the memory setting and node configuration are valid only for Itasca.
AbokiaBLAST can use pre-fragmented NCBI databases, which means you can use the various .nin, .nnd, .nni, etc. files from the FTP tarball (e.g., ftp://ftp.ncbi.nlm.nih.gov/blast/db/nr.00.tar.gz). In that case, you can prepare the AbokiaBLAST database using the following command:
Other formatted databases can also be converted with this method: once you have run the standard NCBI formatdb to prepare a database, AbokiaBLAST can use it after the conversion.
However, if you would like to create the database from scratch, AbokiaBLAST provides a command (abokia_formatdb) to do so. First download the FASTA file, do whatever cleaning you prefer, then run abokia_formatdb:
mkdir -p /lustre/USERNAME/blastdb
module load abokia-blast
abokia_formatdb -i dataset.fasta
The command "abokia_formatdb" accepts the regular NCBI formatdb options. For more details, you can run "abokia_formatdb" directly.
When the database is ready, users can create a PBS script to submit the AbokiaBLAST job. Here is an example PBS script:
--- job submission script for 52 nodes ----
#PBS -N abokiaBLAST
#PBS -j oe
#PBS -l nodes=52:ppn=1,walltime=8:00:00
module load abokia-blast/2.0.2-130524/ompi_intel
mpiexec -n 52 abokia_blast -m 8 -i input.fa -p blastx -d nr -o output.fa.out \
--profile perf.n52.log --use-virtual-frags --num-threads 8
Please be aware that AbokiaBLAST needs at least 3 nodes to launch.
Note: You might get the "File Locking Failed" error while running AbokiaBLAST. To fix it, change the settings in the .abokia.ini file so that "Local" points to a different directory.
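The 3-node minimum can be checked at the top of a job script before mpiexec is called. The following is a minimal sketch, assuming a PBS/Torque environment where $PBS_NODEFILE lists the allocated hosts one per line; the helper name count_nodes and the error message are illustrative and not part of AbokiaBLAST.

```shell
# Hypothetical helper: count the distinct hosts listed in a node file
# (one hostname per line, repeated once per allocated core).
count_nodes() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Inside a PBS job, $PBS_NODEFILE points at the list of allocated hosts.
# Outside a job the variable is unset and the check is skipped.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "$PBS_NODEFILE" ]; then
    nodes=$(count_nodes "$PBS_NODEFILE")
    if [ "$nodes" -lt 3 ]; then
        echo "AbokiaBLAST needs at least 3 nodes, but this job has $nodes" >&2
        exit 1
    fi
fi
```

Placing this guard before the mpiexec line makes an undersized job fail immediately with a clear message.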