Minnesota Supercomputing Institute
The PacBio Single Molecule Real Time (SMRT) analysis portal is an easy-to-use, web-based platform for analyzing third-generation sequencing data generated on the PacBio SMRT platform. Currently, workflows for microbial whole genome assembly, resequencing analysis, transcriptome analysis and various data processing steps are available through the portal. For more information on the analysis portal itself, see http://www.pacb.com/devnet/ and the tutorial materials.
The software must be run from a browser inside the MSI network. You can do this by connecting through the NICE interface, or by working directly in one of the MSI laboratories. Because of RAM limits, the portal does not run reliably on the lab queue, so execution is supported on Mesabi only. Genomes up to 100 Mbp in size can be run successfully on Mesabi.
Note: you must request a service unit (SU) allocation on Mesabi before proceeding with these instructions.
<!-- Enable Distributed Mode -->
If you installed your PacBio portal before September 25, 2015, it is probably set up to use the PBS system, which tends to experience serious delays when running on Mesabi. The portal works much better in multi-threaded mode than in cluster mode, so you will need to change one setting in two config files. Edit the following 2 files:
Change CLUSTER_MANAGER = PBS to CLUSTER_MANAGER = BASH in both of those files.
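The change above can also be made from the command line. The following is only a sketch: the function name switch_to_bash_mode is made up for illustration, the paths of the two config files are not given here and must be supplied as arguments, and it assumes GNU sed (for the -i option). It is demonstrated on a throwaway file rather than on a real portal config.

```shell
# Hypothetical helper: switch CLUSTER_MANAGER from PBS to BASH in every
# file passed as an argument. Assumes GNU sed (in-place editing with -i).
switch_to_bash_mode() {
    for f in "$@"; do
        # Keep a backup copy (<file>.bak) before editing in place.
        cp -- "$f" "$f.bak" || return 1
        # Only rewrite lines of the form "CLUSTER_MANAGER = PBS".
        sed -i 's/^\([[:space:]]*CLUSTER_MANAGER[[:space:]]*=[[:space:]]*\)PBS[[:space:]]*$/\1BASH/' "$f" || return 1
    done
}

# Demonstration on a temporary file, not a real portal config.
demo=$(mktemp)
printf 'CLUSTER_MANAGER = PBS\nMAX_THREADS = 8\n' > "$demo"
switch_to_bash_mode "$demo"
grep '^CLUSTER_MANAGER' "$demo"
```

To apply it to your portal, call switch_to_bash_mode with the paths of the two config files; the .bak copies let you revert if the portal fails to start.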
You can then continue to follow the instructions above (Mesabi section). When you run 'top', you should now see many processes running on your local node, and qstat -u USERNAME should show only your single interactive batch job.
By default, we've set up the configuration files to use only 8 processor cores and 24 hours of walltime. If you have a large genome, you will benefit greatly from increasing these limits, and you may also need more memory. On Mesabi, you may request up to 96 hours of walltime and 32 processor cores on the ram1t nodes (see the queue table specs). To take advantage of these increases, edit the 2 files:
Change MAX_THREADS = 8 to MAX_THREADS = 32 in both of those files (assuming 32 is the number of cores you want to use).
Change TMP = /tmp to TMP = /scratch.global/<your-user-name> to avoid overflowing /tmp for large genomes.
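The two edits above can be sketched as a small shell function. This is an illustration under assumptions: set_portal_limits is a made-up name, the config file paths are not listed here and must be passed in, GNU sed is assumed, and the scratch directory used in the demonstration (exampleuser) is a placeholder for your own user name.

```shell
# Hypothetical helper: set MAX_THREADS and TMP in the given config files.
# Usage: set_portal_limits <threads> <tmp-dir> <file>...
# Assumes GNU sed; the tmp-dir must not contain '|' or '&'.
set_portal_limits() {
    threads=$1
    tmpdir=$2
    shift 2
    for f in "$@"; do
        # '|' is used as the sed delimiter because tmpdir contains slashes.
        sed -i \
            -e "s|^\([[:space:]]*MAX_THREADS[[:space:]]*=[[:space:]]*\).*|\1${threads}|" \
            -e "s|^\([[:space:]]*TMP[[:space:]]*=[[:space:]]*\).*|\1${tmpdir}|" \
            "$f" || return 1
    done
}

# Demonstration on a temporary file, not a real portal config.
limits_demo=$(mktemp)
printf 'MAX_THREADS = 8\nTMP = /tmp\n' > "$limits_demo"
set_portal_limits 32 /scratch.global/exampleuser "$limits_demo"
cat "$limits_demo"
```

Lines that do not start with MAX_THREADS or TMP are left untouched, so the function can be run on a file that contains only one of the two settings.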
Then log in to Mesabi or Itasca via ssh and request an interactive job, for example: "qsub -I -l nodes=1:ppn=32,walltime=96:00:00 -q ram1t -X". Then follow the normal instructions.
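If you expect to vary the core count or walltime between runs, the request can be assembled from variables. The sketch below only prints the command so you can check it before submitting; the variable names are illustrative, and the core count should match the MAX_THREADS value set in the config files.

```shell
# Illustrative variables; adjust to the resources you need.
cores=32               # should match MAX_THREADS in the config files
walltime="96:00:00"    # up to 96 hours on the ram1t nodes
queue="ram1t"

# Build the interactive job request and print it for inspection.
qsub_cmd="qsub -I -l nodes=1:ppn=${cores},walltime=${walltime} -q ${queue} -X"
echo "$qsub_cmd"
```

Once the printed command looks right, run it on the login node to start the interactive job.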
If you get an error in the setup or running of the PacBio server, try the steps once more. If it still fails, try the following: