College of Biological Sciences
These researchers study cancer, seeking to identify biomarkers of aggressiveness and/or progression. They use a relatively new mass spectrometry (MS) proteomics strategy, data-independent acquisition (DIA), with files generated on an AB SCIEX 5600+ instrument. Each experiment involves building a spectral library using data-dependent acquisition (DDA) proteomics and traditional database search algorithms, then collecting DIA files from many samples in multiple replicates. Each DDA file is approximately 500 MB, and typically at least 5-10 of those files go into a given spectral library. The DIA files are much larger, averaging about 2 GB each, and are usually collected from many samples in duplicate or triplicate, so a full biological experiment can easily produce 100-200 GB of data overall. The researchers also collect RNA-seq data to match with their proteomics data using proteogenomics strategies, in collaboration with the Center for Mass Spectrometry and Proteomics and the group of Professor Tim Griffin. Those files are also hundreds of megabytes each, and each sample needs to be run at least in triplicate. The group also uses Galaxy and Galaxy-P to process these data.
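The per-file sizes quoted above can be combined into a rough storage estimate for one experiment. The following is a minimal sketch in Python, assuming decimal units (500 MB = 0.5 GB) and the approximate file sizes given in the text. The function name and the sample counts in the example are hypothetical illustrations, not figures reported by the group, and RNA-seq files are not included.

```python
def estimate_proteomics_storage_gb(n_dda_files, n_samples, n_replicates,
                                   dda_file_gb=0.5, dia_file_gb=2.0):
    """Rough storage estimate (in GB) for one DIA proteomics experiment.

    Defaults follow the approximate sizes in the text: about 500 MB per
    DDA file and about 2 GB per DIA file. All other inputs are
    hypothetical and should be replaced with real counts.
    """
    # DDA files used to build the spectral library
    library_gb = n_dda_files * dda_file_gb
    # One DIA file per sample per replicate
    dia_gb = n_samples * n_replicates * dia_file_gb
    return library_gb + dia_gb


if __name__ == "__main__":
    # Hypothetical experiment: 5 DDA files, 25 samples in duplicate
    small = estimate_proteomics_storage_gb(5, 25, 2)
    # Hypothetical experiment: 10 DDA files, 30 samples in triplicate
    large = estimate_proteomics_storage_gb(10, 30, 3)
    print(f"Smaller experiment: {small:.1f} GB")
    print(f"Larger experiment: {large:.1f} GB")
```

With these assumed sample counts, both estimates land in the 100-200 GB range described above, and the DIA files account for nearly all of the total.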