Editor’s note: As we were preparing this article for publication, we learned of the death of Dr. John Ohlfest, who was a collaborator on the research discussed here. See the paragraph at the end of the article for more information about Dr. Ohlfest and his research.
Dr. David Largaespada received his Ph.D. in Cellular and Molecular Biology at the University of Wisconsin-Madison in 1992. He joined the University of Minnesota's Department of Laboratory Medicine and Pathology in 1996 and the Department of Genetics, Cell Biology, and Development in 1999, where he is currently a professor. He holds the Margaret Harvey Schering Chair in Cancer Genetics. In 2012, he was awarded the American Cancer Society’s Research Professor Award.
Dr. Largaespada and his research assistant Sue Rathe (a graduate student in the Microbiology, Immunology, and Cancer Biology program) sat down with an MSI staffer to talk about their research and close work with MSI.
MSI: How long have you been using MSI resources?
DL: I have been using MSI resources since I first started at the U of M as a researcher. So, about 16 years.
MSI: Could you describe the research you are doing on personal tumor vaccines?
DL: The work we are doing is a collaboration with John Ohlfest’s lab in the Department of Pediatrics. This project is funded by a grant from the Children’s Cancer Research Fund (CCRF).
The goal of this research is to examine childhood tumors for mutations that result in altered proteins, which we can use as a basis for making a vaccine. Each vaccine would be specifically designed for a particular patient’s tumor.
Published research shows that cancer cells accumulate genetic abnormalities. Some of these gene mutations actually drive cancer development (called drivers) and some are the result of random changes to the DNA that occur when cells divide many times (called passengers). In fact, many tumors are genetically unstable. What this means, though, is that the altered proteins, whether they come from driver mutations or passenger mutations, can be recognized by the immune system as foreign proteins. If we could present those foreign proteins to the immune system in the right way, at high levels and with drugs called adjuvants that boost the immune response to an antigen or foreign protein, then we might be able to get someone’s immune system to attack the tumors they have.
For this to work, because every patient’s tumor has a different suite of mutations, we would have to sequence each patient’s DNA and find all the alterations that are present. We would then make a vaccine that is specifically for that person’s tumor. So this is the challenge: Can we develop an automated pipeline that starts with sequence information and results in a list of candidate alterations that can be used for making a multivalent peptide vaccine cocktail for that patient, and can we produce the vaccine quickly enough for it to be useful for that patient?
We are currently working with patient material that we have obtained through the tissue repository here at the University and also mouse model tumors, because we want to test this whole concept in mouse models first before we try it on people. We realized if we want to make this possible we are going to need to use very powerful deep sequencing technology.
We also realized that there is no reason to sequence the whole genome of the tumor cell, because most of the genome is DNA sequence in between genes that is not expressed and made into proteins. So we decided to sequence the RNA, or so-called transcriptome, of the tumor cell. What we have been doing is isolating RNA from mouse or human tumors and converting that RNA to DNA using the enzyme reverse transcriptase, which gives a so-called cDNA library, a DNA version of all the transcripts that are expressed in the tumor cell. After this, we sequence the cDNA library from these tumors on the Illumina HiSeq machine and use MSI resources to analyze that enormous amount of sequence data.
We have to take all of that sequence data and basically look for variations from normal that would result in the production of altered protein sequences that could be used as a basis for making a vaccine. This part turned out to be way more complicated than I thought it would be initially, because the processing of that RNA-seq data is very complex and distilling it down to the meaningful, real alterations is not simple. As part of this project we went back and used other technologies to verify the mutations we thought were there based on our RNA-seq.
Sue, do you want to talk about how much Seq data you ended up getting and some of the challenges you faced?
MSI: Sue, was this the project you worked on with Kevin Silverstein from our RISS group and Jim Johnson from our Applications Development group, using the pipeline called Missense Mutation and Frameshift Finder (MMuFF)?
DL: Yes, Kevin and Jim are part of the team.
SR: Yes, that’s one part of it. We are looking for any type of mutation that will modify the proteins. The types of mutations that we are looking for are missense mutations; frameshift mutations, which result from small insertions or deletions; fusions between two genes that aren’t normally together; and any kind of abnormal splicing in a gene that has never been seen before. We have been tackling this in three separate areas: missense and frameshift mutations are one, fusions are the second, and alternative splicing is the third.
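Editor’s note: The distinction between missense and frameshift mutations described above can be illustrated with a minimal Python sketch. It is not the MMuFF code. The function name `classify_variant`, its arguments, and the simplified variant representation (an offset into an in-frame coding sequence) are assumptions made for illustration only.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_variant(cds, pos, ref, alt):
    """Classify a small variant in an in-frame coding sequence (hypothetical helper).

    cds: coding DNA, starting at the first base of the start codon
    pos: 0-based offset of the first changed base within cds
    ref, alt: reference and alternate bases ("" for a pure insertion or deletion)
    """
    if cds[pos:pos + len(ref)] != ref:
        raise ValueError("reference allele does not match the coding sequence")

    # Insertions and deletions shift the reading frame unless the
    # change in length is a multiple of three.
    if len(ref) != len(alt):
        return "frameshift" if (len(alt) - len(ref)) % 3 else "in-frame indel"

    if len(ref) != 1:
        raise ValueError("this sketch handles only single-base substitutions")

    # Rebuild the affected codon with the substituted base and compare
    # the amino acids encoded before and after the change.
    offset = pos % 3
    codon_start = pos - offset
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


# Toy coding sequence: ATG GAA GTT encodes Met-Glu-Val.
cds = "ATGGAAGTT"
print(classify_variant(cds, 4, "A", "T"))  # GAA -> GTA changes Glu to Val
print(classify_variant(cds, 3, "G", ""))   # one-base deletion shifts the frame
```

Only changes that alter the protein (missense, nonsense, frameshift, in-frame indel) would be of interest as vaccine candidates; silent changes leave the protein sequence unchanged.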
We have made a lot of progress on the MMuFF software. Jim has done an amazing job on this project. He has distilled all of the potential candidates down by looking for data anomalies, and he has added a lot of code to eliminate those from the data. He is also checking SNP databases. SNPs are single nucleotide polymorphisms that naturally occur either in mouse or human genomes. They’re known mutations that are considered normal, so we don’t need to look at those. We are only looking for the abnormal ones, and Jim has code that eliminates normal mutations from consideration. So, in addition to distilling the information into the real candidates, he also formats the information in a way that is meaningful to us. He is giving us information that we can then immediately use to go and create a protein that matches the novel protein. It’s a wonderful tool set.
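Editor’s note: The filtering idea described above, dropping known SNPs and likely data anomalies, might look something like the following Python sketch. It is not Jim Johnson’s implementation. The `Variant` record, the `filter_candidates` function, the `min_alt_reads` read-support threshold, and all of the example data are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """A candidate variant call (hypothetical record layout)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_reads: int  # number of reads supporting the alternate allele


def filter_candidates(variants, known_snps, min_alt_reads=5):
    """Keep variants that are not known SNPs and have enough read support.

    known_snps: set of (chrom, pos, ref, alt) tuples taken from a SNP database
    min_alt_reads: assumed threshold used here as a stand-in for the
        data-anomaly checks mentioned in the interview
    """
    kept = []
    for v in variants:
        if (v.chrom, v.pos, v.ref, v.alt) in known_snps:
            continue  # a normal, naturally occurring polymorphism
        if v.alt_reads < min_alt_reads:
            continue  # too little support; possibly a sequencing artifact
        kept.append(v)
    return kept


# Made-up example data.
known_snps = {("chr5", 1000, "A", "G")}
calls = [
    Variant("chr5", 1000, "A", "G", 40),  # known SNP, removed
    Variant("chr5", 2500, "C", "T", 2),   # weak support, removed
    Variant("chr7", 310, "G", "A", 25),   # kept as a candidate
]
for v in filter_candidates(calls, known_snps):
    print(v.chrom, v.pos, v.ref, v.alt)
```

Storing the known SNPs in a set keeps each lookup fast, which matters when the candidate list comes from a full RNA-seq run.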
MSI: What other resources do you utilize at MSI, and how have these resources enhanced your research capabilities?
DL: The Galaxy tool set is great.
SR: Yes, the Galaxy tool set is wonderful. MSI is providing tools that biologists can use to analyze their own data. This is helpful in the sense that you don’t have to wait on IT resources. For example, the MMuFF tool Jim is working on will be hugely valuable to everyone once we have perfected it and added it to the Galaxy suite of tools.
Right now there is an incredible amount of data being generated, and trying to process this amount of data without a supercomputer would be impossible. So just the physical resources that are provided, in terms of disk space and processor power, are beyond measure.
MSI: One of our missions at MSI is to contribute to the education of the U’s graduate and undergraduate students. Have your students been able to attend any of our tutorials or software seminars and then apply what they have learned and has MSI facilitated your students’ development in any other way?
DL: Sue is a graduate student and I know she has gone to a number of tutorials. Sue, what do you think?
SR: Yes, I have gone to many of them and I know a number of other graduate students have as well. With regard to the tutorials, I know that, initially, MSI was playing catch-up because the BMGC brought in a HiSeq machine and started to produce huge amounts of data that MSI wasn’t ready for. There were no tutorials set up for using Galaxy at that point. When we first started using Galaxy it was a painful process, but I have been very impressed with how well MSI has developed training material. I have gone to all of MSI’s new training sessions so I could experience them and give feedback on how well they are doing, and it’s quite impressive. They have caught up now and are providing some very useful training for the biologists, so that the biologists can do their own analysis. I will say it is still very daunting to the non-tech-savvy researcher, and I believe there is more that can be done to make these tools more user friendly for the biologists, but MSI is definitely doing a great deal to progress toward that end.
Description of image above: The normal (reference) transcript appears at the bottom of the first diagram and near the top of the second diagram. The thickest bars on the reference transcripts show the translated portions of the gene, the thinnest bars show the introns, and the medium size bars show the untranslated regions (UTRs). The arrows overlaying the introns show the direction of transcription. The top graph depicts the number of reads mapping to each area of the genome. It shows abnormal transcription occurring within the sixth intron of Dck and beyond the end of the 3’-UTR in the B117H sample, indicating an abnormally spliced gene. This same pattern appeared when the transcripts were assembled de novo (without the benefit of a transcript reference file), and then mapped back to the reference genome (bottom diagram). Sanger sequencing of the RNA verified the unusual Dck transcript found in B117H, and Sanger sequencing of the DNA identified an 878 bp deletion starting within the last intron and extending into the last exon.
MSI is sorry to note that Professor Largaespada’s collaborator, Dr. John Ohlfest, passed away on January 21, 2013. Dr. Ohlfest studied molecular biology at Iowa State University and received his B.S. in 2001. He received his Ph.D. at the University of Minnesota in 2004 working on gene therapy approaches to treating malignant gliomas. He joined the faculty of the Department of Neurosurgery in 2005 and was the Director of the Neurosurgery Gene Therapy Program and a researcher in the Masonic Cancer Center. MSI staff extend their deepest condolences to Dr. Ohlfest’s family, friends, and colleagues.