Wen Wang is a graduate student who is a member of the MSI research group of Professor Vipin Kumar (MSI Fellow and Head, Department of Computer Science and Engineering). Assistant Professor Chad Myers (Computer Science and Engineering) also advises her work. Professor Kumar specializes in the field of data mining, and Professor Myers is a computational biologist. Ms. Wang entered the graduate program in computer science at the University of Minnesota in 2009 and began using MSI for her research at about the same time. She was a finalist in the poster competition at the 2013 MSI Research Exhibition with her poster, “Leveraging Network Structure to Discover Genetic Interactions in Genome-Wide Association Studies.” Ms. Wang sat down with MSI recently to discuss her research and this poster.
MSI: What resources do you use at MSI?
Wen Wang: Some of my work is done on Professor Kumar’s proprietary machine, but most of the computational work for this project was done on Elmo. The software I use is MATLAB.
MSI: Let’s get into what your poster describes. This is related to genome-wide association studies, where you can look at a couple of different genes and find differences in them?
WW: The purpose of our research project is to study the genetic causes of complex human diseases. The traditional methods used to analyze genome-wide association study (GWAS) data test only one genetic variation at a time for differences between patients and healthy subjects. GWAS data contain hundreds of thousands or even millions of genetic variables, known as single nucleotide polymorphisms (SNPs), so this univariate approach involves testing hundreds of thousands of hypotheses. As a result, the statistical score (p-value) obtained needs to be corrected for the number of hypotheses tested. Thus, discovering a single genetic variation with statistical significance is a challenging task. In the past 10 years there have been about 1,350 published GWAS studies, and altogether these studies have discovered more than 2,000 loci that are significantly associated with one or more complex traits. However, these discovered genetic factors can explain only a very small amount of the heritability.
So maybe it’s not the single genetic variant that causes most of a disease. Instead, it could be the interaction of two genetic variants that brings more risk for a disease. However, studying pairwise genetic interactions is difficult because the test space is enormous and at least half a million samples are needed to achieve statistical significance. This is not practical, so it seems like a hopeless cause.
However, in the yeast research community, genetic interactions have been well studied. It has been proven that genetic interactions are more likely to occur between two pathways with redundant or complementary functions. So we were motivated to test for genetic interactions in the context of pathway-pathway interactions, since many well-defined human pathways exist.
We developed a method that explicitly searches for such larger structures, guided by established sets of genes belonging to characterized pathways or gene modules. We applied this approach to a Parkinson's disease GWAS data set and discovered tens of pathway-pathway interactions that are statistically significant. We also found biological evidence for many of these interactions. A significant fraction of them can also be validated in two independent cohorts.
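As a rough illustration of the multiple-testing burden Ms. Wang describes, the sketch below compares the number of single-SNP tests with the number of pairwise tests. It assumes a Bonferroni correction, which the interview does not name, and an illustrative count of 500,000 SNPs that is not taken from the study. The function names are hypothetical.

```python
from math import comb


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def n_pairwise_tests(n_snps: int) -> int:
    """Number of unordered SNP pairs (n choose 2)."""
    return comb(n_snps, 2)


n_snps = 500_000   # illustrative value, not from the study
alpha = 0.05       # conventional family-wise error rate (assumption)

single_threshold = bonferroni_threshold(alpha, n_snps)
n_pairs = n_pairwise_tests(n_snps)
pair_threshold = bonferroni_threshold(alpha, n_pairs)

print(f"single-SNP tests: {n_snps:,}  per-test threshold: {single_threshold:.1e}")
print(f"pairwise tests:   {n_pairs:,}  per-test threshold: {pair_threshold:.1e}")
```

Under these assumptions the pairwise search involves roughly 125 billion tests, which is why testing at the level of pathway pairs rather than SNP pairs makes the problem more tractable.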
MSI: How many subjects will you run the calculations for?
WW: The more the better. The data sets we tested range from around 500 to 4,000 subjects.
MSI: So, something will get your attention if you see a lot more interactions than you expect?
WW: Yes. And we also did permutation tests to make sure the result is significant.
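For readers unfamiliar with permutation tests, here is a minimal sketch of the general idea, not the group's actual procedure. It assumes a hypothetical per-subject score and uses the difference in mean score between cases and controls as the test statistic. Shuffling the case/control labels breaks any real association and yields a null distribution against which the observed statistic is compared.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """Empirical one-sided p-value for mean(cases) - mean(controls).

    values: one numeric score per subject (hypothetical)
    labels: 1 for case, 0 for control
    """
    rng = random.Random(seed)

    def statistic(lbls):
        cases = [v for v, l in zip(values, lbls) if l == 1]
        controls = [v for v, l in zip(values, lbls) if l == 0]
        return mean(cases) - mean(controls)

    observed = statistic(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real label/score association
        if statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (made up): cases score higher than controls.
values = [5.1, 4.8, 5.6, 5.0, 5.3, 3.9, 4.1, 3.7, 4.0, 4.2]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

p = permutation_p_value(values, labels)
print(f"empirical p-value: {p:.4f}")
```

A small empirical p-value means that few random relabelings produce a statistic as extreme as the observed one.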
MSI: You wrote your code in MATLAB and ran it on Elmo?
WW: Yes. We have different scenarios and different parameters to test, and we like to run our experiments in parallel. Also, as you can tell, we’re dealing with big data, and our approach needs a lot of memory. Elmo provided everything we needed and allowed us to conduct the experiments much more efficiently than we could on regular computers.
MSI: Yes, we sometimes have users who say they have programs that would take days on a desktop computer.
WW: Sometimes it’s even worse than that. It could be weeks or months. I try to make good use of MSI resources to get results as soon as possible.
MSI: Is this research basic science, or is there an immediate application?
WW: It’s kind of both. We study genetic interactions to help us understand how our biological system works and, more specifically, to understand the underlying causes of disease. However, the ultimate goal of this research is to develop disease models that can be used for disease risk screening and to support the development of individualized medicine.
MSI: This research seems to be very collaborative among different disciplines, with data mining and computational biology.
WW: Absolutely! We have [Assistant Professor] Nathan Pankratz in Lab Medicine and Pathology and [Professor] Brian Van Ness in Genetics, Cell Biology, and Development involved in this project. We’re from computer science, and we like to have experts from the biology side to help us understand and interpret our discoveries.
Posted on December 11, 2013.
Understanding the genetic makeup of food crops is critical if we want to develop sustainable ways of protecting those crops against disease. To this end, researchers are using genomics to study plants and their characteristics. Distinguished McKnight Professor Nevin Young (Plant Pathology), together with Associate Professor Peter Tiffin (Plant Biology), Professor Mike Sadowsky (Soil, Water and Climate), and Assistant Professor Bob Stupar (Agronomy and Plant Genetics) have been using MSI for many years as part of their investigations into legumes, the family of plants that includes soybeans, peas, and alfalfa. Legumes are especially interesting in that they form symbiotic relationships with rhizobial bacteria and arbuscular mycorrhizal fungi. These symbiotic relationships allow legumes to extract compounds such as nitrogen and phosphorus out of the soil and provide natural forms of fertilizer for agriculture.
Medicago truncatula (barrel medic) is used by many researchers as a model for legume genomics. The Young group collaborates with research groups worldwide in studying M. truncatula, using massive genomic sequencing methods to study the gene systems and genomic variations that impact these valuable plant-microbe interactions.
MSI’s RISS group is working with the Young lab on this NSF-funded research. Dr. Kevin Silverstein, Operations Manager and Scientific Lead of the RISS group, is the co-PI on the most recent NSF grant to fund this work. The researchers are using MSI’s Itasca system to map the immense collection of raw sequencing data generated by the project and they are also using the Institute’s high-performance storage capabilities. They have also developed graph analytic applications that offer the opportunity to extend MSI’s compute resources in the area of bioinformatics.
The Young group has published numerous papers about their research. A sampling of recent papers includes:
- “Phylogenetic signal variation in the genomes of the genus Medicago (Fabaceae),” JB Yoder, R Briskine, J Mudge, A Farmer, T Paape, K Steel, GD Weiblen, A Bharti, P Zhou, GD May, ND Young, P Tiffin, Systematic Biology, 62(3): 424-438, DOI:10.1093/sysbio/syt009 (2013)
- “Selection, Genome Wide Fitness Effects, and Evolutionary Rates in the Model Legume Medicago truncatula,” T Paape, T Bataillon, P Zhou, T Kono, R Briskine, ND Young, P Tiffin, Molecular Ecology, 22(13):3525-3538, DOI:10.1111/mec.12329 (2013)
- “Estimating Heritability With Whole-Genome Data,” J Stanton-Geddes, J Yoder, R Briskine, ND Young, P Tiffin, Methods in Ecology and Evolution, DOI:10.1111/2041-210X.12129 (2013)
- “Candidate Genes and Genetic Architecture of Symbiotic and Agronomic Traits Revealed by Whole-Genome, Sequence-Based Association Genetics in Medicago truncatula,” J Stanton-Geddes, T Paape, B Epstein, R Briskine, J Yoder, J Mudge, AK Bharti, AD Farmer, P Zhou, R Denny, GD May, S Erlandson, M Sugawara, MJ Sadowsky, ND Young, P Tiffin, PLoS ONE, 8(5): e65688, doi:10.1371/journal.pone.0065688 (2013)
- “Whole-Genome Nucleotide Diversity, Recombination, and Linkage-Disequilibrium in the Model Legume Medicago truncatula,” A Branca, T Paape, P Zhou, R Briskine, AD Farmer, J Mudge, AK Bharti, JE Woodward, GD May, L Gentzbittel, C Ben, R Denny, MJ Sadowsky, J Ronfort, T Bataillon, ND Young, P Tiffin, Proceedings of the National Academy of Sciences of the USA, 108: E864-870, DOI:10.1073/pnas.1104032108 (2011)
- “The Medicago Genome Provides Insight Into The Evolution of Rhizobial Symbioses,” ND Young, F Debellé, G Oldroyd, R Geurts, SB Cannon, MK Udvardi, VA Benedito, KFX Mayer, J Gouzy, H Schoof, et al., Nature, 480: 520-524, DOI:10.1038/nature10625 (2011)
Note: Authors in bold are MSI Principal Investigators.
Image Description: Left: A standard circular plot of Medicago truncatula’s eight chromosomes. Lines are drawn between regions of the genome that have evidence of ancient duplication events, which are plentiful in this genome. On the outer ring, dots represent individual members of two large gene families known to have association with plant-microbe interactions. Right: Medicago truncatula (barrel medic).
Posted on November 27, 2013.
Jin Woo Jung is a computer science graduate student who is doing research at the Minnesota Dental Research Center for Biomaterials and Biomechanics (MDRCBB), led by Professor Alex Fok (Restorative Sciences), in the School of Dentistry. He has been using MSI for about a year. Mr. Jung was a finalist in the poster competition at the 2013 MSI Research Exhibition with his poster, “Photo-realistic Rendering of Teeth and Restorative Bio-materials Using Monte-Carlo Photon Tracing.” He is in the research group of Associate Professor Gary Meyer (Computer Science and Engineering) and also works with Professor Ralph DeLong (MDRCBB), and Associate Professor Maria Pintado (MDRCBB). Mr. Jung sat down with MSI recently to discuss his research; Professors DeLong and Pintado also participated in the interview.
MSI: What kinds of research do you use MSI for?
Jin Woo Jung: I’m doing light transport simulations to reproduce the photo-realistic appearance of dental tissues and biomaterials on the computer screen. The light path within the volume can be modeled as random walks of photons, and Monte Carlo simulation is able to compute the outgoing radiance from the random walks. And then we synthesize the images that show the realistic translucent objects, using the information we calculate.
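The random-walk idea can be illustrated with a much-simplified sketch. The code below is not Mr. Jung's renderer: it assumes a flat slab of homogeneous material, isotropic scattering, no refraction at the boundaries, and made-up optical coefficients. Each photon takes exponentially distributed steps, is absorbed or scattered at every interaction, and is tallied as reflected, transmitted, or absorbed.

```python
import math
import random


def simulate_slab(n_photons, mu_a, mu_s, thickness, seed=0):
    """Fractions of photons reflected, transmitted, and absorbed by a slab.

    mu_a, mu_s: absorption and scattering coefficients (per unit length)
    thickness:  slab thickness in the same length unit
    All values passed in below are illustrative assumptions.
    """
    rng = random.Random(seed)
    mu_t = mu_a + mu_s      # total interaction coefficient
    albedo = mu_s / mu_t    # probability that an interaction is a scatter
    counts = {"reflected": 0, "transmitted": 0, "absorbed": 0}

    for _ in range(n_photons):
        z, cos_theta = 0.0, 1.0  # enter at the top surface, heading inward
        while True:
            # Sample the free path length from an exponential distribution.
            step = -math.log(1.0 - rng.random()) / mu_t
            z += step * cos_theta
            if z < 0.0:
                counts["reflected"] += 1
                break
            if z > thickness:
                counts["transmitted"] += 1
                break
            if rng.random() > albedo:
                counts["absorbed"] += 1
                break
            # Isotropic scattering: direction cosine is uniform in [-1, 1].
            cos_theta = rng.uniform(-1.0, 1.0)

    return {k: v / n_photons for k, v in counts.items()}


result = simulate_slab(n_photons=5000, mu_a=0.5, mu_s=5.0, thickness=1.0)
print(result)
```

A production renderer would additionally track wavelength, anisotropic scattering, surface geometry, and the radiance leaving each surface point, which is where the heavy computation and memory use come from.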
MSI: In your poster, you’re modeling materials that will be used in tooth restorations. Tell me what you did to get to this poster.
JJ: Obviously, restorative materials have to have a very similar appearance to the tooth. As any dentist will tell you, that is a very difficult problem because the color of the restoration changes when it becomes part of the tooth. Furthermore, it is not easy for dentists to choose the correct shades of restorations for their patients. Well, I’m sure Professor DeLong and Professor Pintado will correct me if I’m wrong about the properties of teeth. A tooth is a heterogeneous object; it consists of three major components that are very different. From the outside working inwards, they are: enamel, dentin, and pulp. Overall a tooth is translucent. We all agree that it looks kind of yellow. That’s not because of the color of the enamel; it’s because of the color of the underlying dentin, which is yellow. The yellow passes through the enamel, which may add some white to the final appearance.
But if we can predict the appearance of the restorations in various situations through a simulation, we can assist the materials engineer to develop restorations with the correct appearances, and dentists to choose the right shades for the restorations. So, that’s why we started this research. First of all, we scanned real teeth using the micro-CT scanner at the MDRCBB, and extracted their geometric information; not only the surface geometry but also the density information inside the volume.
Maria Pintado: This geometric information is also used in my study and teaching on the anatomy of teeth. We have produced a self-learning software package called Tooth Explorer using the images created by Jin Woo.
JJ: Using Avizo and Hypermesh that MSI provides, we segmented the micro-CT data and reconstructed the surface information. We also obtained the optical characteristics of teeth and restorative materials from published papers. Applying all of this information to my simulations, I can render the appearance of the teeth on the computer screen. The picture at the end of the poster shows a rendition of the real teeth, including their surfaces and volumes. The simulated light has all the characteristics that come from real enamel and dentin.
I also simulated restorative materials. The restorative material I used was from 3M ESPE, a division of 3M that manufactures dental products and has a long-term collaboration with MDRCBB. They have the spectrophotometers to measure the optical characteristics of the material, such as scattering, absorption, and anisotropy. I modeled the restorative material to show how shade, color, and translucency combine to give the overall appearance.
MSI: Now, you’re a computer science student, and you’re working with the Dental School. Can you talk a little bit about the interdisciplinary aspects of this work?
JJ: We are trying to solve a problem in the field of dentistry using computer graphics. The computer science aspects of this require appropriate mathematics and efficient algorithms that use the computational power of modern computers to produce a good approximation of the real world. We need data for the light scattering and the geometry of the teeth, and the field of dentistry and related industries provide that data. The simulations have some very practical ends in view; that is, assisting dentists in the real world with shade and color matching. The medical/dental industry is a good practical field in which computer science experts can apply their approaches. That’s how the two fields are working together.
Ralph DeLong: The way this research came about was that we have a problem. The problem is making realistic teeth out of, essentially, numbers. Jin Woo’s advisor in Computer Science [Associate Professor Gary Meyer] is interested in light transfer through materials, so the two requirements fit very well together. The third component of this is 3M. 3M manufactures restorative materials, and there’s always been a problem in dentistry of matching colors to the natural teeth, because if they don’t match, people aren’t happy. For years, dentistry has been trying to find a better way to do this. Right now, it’s still done by eye. You look at a tooth, compare it to a color guide, and decide, “This color is close, so I’ll use it.” Generally, you can get a good result. Now, you have things like CAD/CAM [computer-aided design/computer-aided manufacturing] coming into dentistry, where you’re milling restorations, and most of them don’t look very aesthetic. So, if you want to make an aesthetic restoration, you have to mill different materials and put them together. Also, for composite fillings, if you take one color and insert it into the tooth, it may or may not give you a good match. If you really want aesthetics, you have to start using multiple colors and put them together, and that’s an art form. What we’d like to do is capture an image of the tooth and then use a computer, with known optical properties of the restorative materials and the teeth, so that our software tells us what restorative materials to use. This is a three-way junction between computer science, dentistry, and industry. Much of this development was foreseen in our work on an NIDCR [National Institute of Dental and Craniofacial Research] project called the Virtual Dental Patient.
MSI: That’s very exciting. When you say composites, you mean fillings?
RD: Yes, filling materials.
MSI: So, you’re using computer modeling to simulate the way light hits these materials. I’m guessing this is something that could not be done on a standard desktop computer?
JJ: The difficulty with this simulation is that it requires a lot of computational power, as well as a large amount of memory. Unlike an analytical approach, Monte Carlo simulation needs to sample a large amount of data to compute the outgoing radiance on the surfaces. Parallel computing with multiple computing nodes is very useful for this kind of problem. In addition, we re-use a lot of intermediate results that we compute for the translucent objects. That’s why a large amount of main memory is also helpful in this simulation. If we were running this kind of simulation on my laptop, it would take weeks, if not months. Without a huge main memory, the simulation would waste time swapping data between main memory and secondary storage. Thanks to MSI, we can save a lot of time.
MSI: Did you use the supercomputers on this, or did you use the labs?
JJ: This is a large simulation that requires a lot of computing power. I used the Windows machines [through the labs] on Iron for the Monte Carlo simulation.
MSI: When you run your calculations, do you get a visualization, or do you get numbers that the manufacturer can use to create these false teeth?
JJ: The final results of this simulation are photo-realistic renderings of teeth and biomaterials. By changing the optical properties of the restorative materials, we can arrive at the appropriate values for the desired appearance as required by the materials scientists and dentists.
MSI: How close are we to the point where 3M will be able to use this?
JJ: It will take some time. One thing I have to overcome is the slow speed of the simulation. If we simulated the physically correct appearance of the teeth and the biomaterials fast enough, the material engineers and dentists would be happy to use this. Right now, we are modeling a real-time algorithm, with recent hardware technologies, based on the current mathematical model. We are expecting this research to be more helpful with better responsiveness. Once we have that, it’s going to be very useful, especially for the material designers, as well as dentists who can take advantage of these approaches.
MSI: So then, 3M will be able to make these materials so that a dentist will be able to build teeth based on the individual patient? Very individualized treatment?
RD: Once he’s got an equation, he needs the optical properties of the materials. 3M would measure those optical properties, and they would be plugged into some sort of a device, possibly a portable spectrometer. The device would capture an image of the tooth and tell you what material to use. It makes it very easy to identify the right material and how to use it. The device would consider the multiple structures within the tooth and the possible filling materials, and guide the dentist on how to use them.
This optical information could lead manufacturers to changing colors of the restorative materials. Right now, the colors of the restorative materials are a shot in the dark (no pun intended). They provide a range, but they may not be the best colors to match natural teeth.
One other thing I should mention: Jin Woo gets most of his data from the MDRCBB microCT scans, which come out as shades of gray; it’s literally just a bunch of numbers. When he gets through, and you look at the image, it looks just like a natural tooth. There have been NIH SBIR [Small Business Innovation Research] grants given to do something like this, to render a tooth in a dental atlas. The way they had to do it was to take pictures of teeth as they slowly rotated them through different viewing angles, then combine the pictures to get an apparent 3D image that could be rotated in space, which is a very tedious process. When you put those pictures up against Jin Woo’s, there’s no difference. And yet his technique is all mathematics with some physics.
Posted on November 13, 2013.
In the past few decades, researchers specializing in condensed matter physics have been intensely interested in studying the properties of strongly correlated quantum many-body systems, which are systems on the microscopic scale that include more than two particles that interact with each other. Researchers especially want to explain how the macroscopic behavior of materials arises from the fundamental interactions of the material’s microscopic constituents.
Particular interest has focused on quantum spin-lattice systems, where the interactions can be simply described, but where different types of interaction can compete with one another. (The “spin” of an elementary particle is an inherent property it possesses in quantum mechanics that has no classical counterpart, although it can loosely be thought of as a form of internal angular momentum. In a standard computational model in condensed matter physics, these spins are arranged on a lattice and coupled through magnetic interactions.) The system can then find itself in a frustrated state in which different forms of ordering are trying to emerge in competition with one another. The often subtle interplay between this frustration and quantum fluctuations can lead to quantum spin-lattice models exhibiting ground-state (i.e., at zero temperature) phase diagrams that are very different from their classical counterparts. Of greatest theoretical interest are the so-called quantum critical points where phase transitions occur.
Computer models for quantum-mechanical wave functions of strongly interacting many-spin systems are extremely complex. This is especially so near the quantum critical points, where the ground-state phase has special properties and involves a very large set of fluctuating configurations. Methods for modeling these systems have included techniques from quantum field theory or large-scale numerical simulations such as Monte Carlo methods. Since the effects of quantum fluctuations, and hence the complexity of the wave functions, increase the closer one approaches the quantum critical points, very accurate quantum many-body techniques are necessary.
Professor Charles Campbell (Physics and Astronomy) and his colleagues Professor Raymond Bishop and Dr. Peggy Li at the University of Manchester (UK), and their collaborators elsewhere, have developed and adapted one such many-body method, the so-called coupled cluster method (CCM), to study a large and diverse array of two-dimensional (2D) quantum spin systems of theoretical and experimental interest. The CCM is now widely accepted as being one of the most successful and most widely applicable of all modern methods of microscopic quantum many-body theory. The CCM techniques pioneered by Professor Bishop and his collaborators are probably now the best available for these strongly frustrated 2D quantum spin-lattice systems, and their results are now setting benchmarks in the field. The group runs their CCM codes on MSI’s supercomputers. The interesting magnetic phenomena displayed by such systems make them suitable candidates for a large number of technological applications, many of which are already in widespread use. This research is also providing insights into exciting new systems, such as exotic superconducting systems and non-superconducting systems that have unusual magnetic properties.
Professor Bishop’s contributions to the development and applications of the CCM resulted in his sharing the Eugene Feenberg Memorial Medal in 2005. (The Feenberg Medal is awarded for work that significantly advances the field of many-body physics.) His co-recipient, the late Hermann Kuemmel, is generally acknowledged as the inventor of the CCM. Among other contributions, Professor Bishop has adapted the CCM to several important quantum many-body systems, including the quantum magnetism problems described above. Professor Campbell, who is a long-time researcher at MSI in the field of quantum fluid research, has been working with Professor Bishop and his former student and now post-doctoral associate Dr. Li, adding his own area of expertise to the CCM. Bishop, Li, and Campbell have used MSI resources for several years in their work to advance this technique.
Approximately 10 papers describing this research using MSI have been published since the start of 2012. These have appeared in the journals Physical Review B, Journal of Physics: Condensed Matter, and the European Physical Journal B. Two of the articles were chosen by the editors for special highlighting. The group has especially concentrated on the spin-1/2 J1-J2-J3 model on the honeycomb lattice and the spin-1/2 J1-J2 model on the checkerboard lattice (otherwise known as the anisotropic planar pyrochlore), which have recently become very hot topics in the field.
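For orientation, models of this family are conventionally written as Heisenberg Hamiltonians with competing exchange couplings. The form below is the standard textbook expression for the J1-J2-J3 model, given here as an aid to the reader; the symbols and summation conventions are assumptions, and the group's papers should be consulted for the exact definitions they use.

```latex
H = J_1 \sum_{\langle i,j \rangle} \mathbf{s}_i \cdot \mathbf{s}_j
  + J_2 \sum_{\langle\langle i,k \rangle\rangle} \mathbf{s}_i \cdot \mathbf{s}_k
  + J_3 \sum_{\langle\langle\langle i,l \rangle\rangle\rangle} \mathbf{s}_i \cdot \mathbf{s}_l
```

Here the three sums run over nearest-neighbor, next-nearest-neighbor, and next-next-nearest-neighbor pairs of lattice sites, with each pair counted once, and each site carries a spin-1/2 operator. Setting J3 = 0 gives a J1-J2 model. Frustration arises when the couplings favor incompatible orderings, for example when both J1 and J2 are positive (antiferromagnetic).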
Image Description: Phase diagram of the spin-1/2 J1−J2 model on the honeycomb lattice (with J1 > 0 and x ≡ J2/J1 > 0), as obtained by a CCM analysis. The four phases shown are Néel, plaquette valence-bond crystalline (PVBC), staggered dimer valence-bond crystalline (SDVBC), and Néel-II. The quantum critical points (phase transitions) are at xc1 ≈ 0.207(3), xc2 ≈ 0.385(10), and xc3 ≈ 0.65(5), as shown in the diagram. From “Valence-bond crystalline order in the s = 1/2 J1−J2 model on the honeycomb lattice,” R.F. Bishop, P.H.Y. Li and C.E. Campbell, Journal of Physics: Condensed Matter 25:306002, DOI:10.1088/0953-8984/25/30/306002 (2013) ©2013 IOP Publishing Ltd.
Posted on October 30, 2013.
As the earth’s climate changes, scientists are concerned about the effects these changes will have on the earth’s ecosystems. Regents Professor Peter B. Reich (Forest Resources, Institute on the Environment) and post-doctoral researcher Emily Peters (Institute on the Environment) specialize in discovering the impacts of these changes. The Reich group’s main area of focus is the part of central North America that includes Minnesota. In this part of the continent, several types of ecosystems converge, including boreal forests (forest consisting mostly of coniferous trees), temperate hardwood forests, oak woodlands/savannas, and grasslands.
MSI has been working with Professor Reich and Dr. Peters to develop a distributed computing framework for the parallel photosynthesis and evapotranspiration model (PPnET). PPnET allows researchers to efficiently use PnET-CN (a widely used and well-tested ecosystem model) to simulate the effects of many simultaneously changing environmental factors on forests over large geographic areas. MSI is providing hardware, software, and consulting support to this project. Dr. Shuxia Zhang, in the HPC Operations group, developed an MPI-based program that allowed parallel jobs to start and restart flexibly. This made allowances for the availability of software licenses at any given time, as well as the availability of compute nodes on the supercomputers. Dr. Zhang also developed script tools that verified data integrity and the success of hundreds of thousands of inputs.
The Reich group used PPnET to simulate ecosystem responses to changes in climate and atmospheric CO2 concentrations in the Great Lakes region of North America. This simulation had 1 km spatial resolution, consisting of 200,000 forest grid cells. The computing time, which would have been an impractical 25 days for serial runs, was reduced to six hours using 96 cores on a Linux cluster. This research has been published in the Canadian Journal of Forest Research (“Potential Climate Change Impacts on Temperate Forest Ecosystem Processes,” EB Peters, K Wythers, S Zhang, JB Bradford, PB Reich, Canadian Journal of Forest Research, DOI:10.1139/cjfr-2013-0013, published online July 17, 2013).
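The timing figures quoted above imply a speedup and parallel efficiency that can be worked out directly. The sketch below uses only the numbers in the text (25 days serial, six hours on 96 cores); the function name is hypothetical.

```python
def speedup_and_efficiency(serial_hours, parallel_hours, n_cores):
    """Return (speedup, parallel efficiency) for a parallel run."""
    speedup = serial_hours / parallel_hours
    efficiency = speedup / n_cores
    return speedup, efficiency


serial_hours = 25 * 24   # 25 days of serial computing, from the text
parallel_hours = 6       # six hours on the cluster, from the text
n_cores = 96

speedup, efficiency = speedup_and_efficiency(serial_hours, parallel_hours, n_cores)
print(f"speedup: {speedup:.0f}x, efficiency: {efficiency:.0%}")
```

The quoted figures work out to a speedup of about 100 on 96 cores, an efficiency slightly above 100 percent. This may simply reflect rounding in the reported times or an estimated serial runtime. Near-ideal scaling is plausible for this kind of workload if each grid cell is simulated independently, though the text does not say so explicitly.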
(Left) Forest types with a 1-km grid resolution over the northern Great Lakes region of the United States, also known as the Laurentian Mixed Forest Province. This region includes six major forest types.
(Right) Map of changes in above-ground net primary production predicted from 1970 to 2100, under a high-emissions climate change scenario.
©Canadian Journal of Forest Research, NRC Research Press (2013)
Posted on October 9, 2013.
Update, October 28, 2013: The Minneapolis Star-Tribune published an article about climate change's effects on the north woods of Minnesota. Professor Reich and his project, B4WARMED, are discussed at the end of the article.