Minnesota Supercomputing Institute


Research Abstracts Online
January 2010 - March 2011

University of Minnesota Twin Cities
College of Science and Engineering
Department of Computer Science and Engineering

PI: Vipin Kumar, Fellow

Data Mining for Earth-Science, Clinical, and Biological Data

The primary objective of this research is to develop novel, high-performance data-mining algorithms and tools for large-scale datasets that arise in a variety of applications. Examples include gigabyte-scale datasets collected by earth-observing satellites, which must be processed to better understand global-scale changes in biosphere processes and patterns; data generated by scientific simulations, which can be used to gain insight into the underlying physical processes; network-traffic monitoring data used to detect illegal network activity; and large collections of text and hypertext analyzed to extract relevant information. The key technical challenges in mining these datasets include high volume, dimensionality, and heterogeneity; the spatio-temporal aspect of the data; possibly skewed class distributions; the distributed nature of the data; and the complexity of converting raw collected data into high-level features. High-performance data mining is essential for analyzing the growing volume of data and for providing analysts with automated tools that facilitate some of the steps needed for hypothesis generation and evaluation.

Data mining has also become a key tool for analyzing biomedical data. In collaboration with the Mayo Clinic in Rochester, Minnesota, these researchers are developing advanced data-mining techniques for several medical problems. They are also working to identify data mining’s impact on the automatic prediction of protein function from proteomics data and on genetic and genomic marker discovery from SNP and gene-expression data. The computational challenges imposed by the large size of the datasets are addressed by building on the group’s past research in highly parallel formulations of key data-mining kernels for anomaly/outlier detection, finding association patterns, clustering, and building rare-class predictive models that can take advantage of high-performance computers.

Group Members

Kshitij Agrawal, Visiting Researcher
Divya Alla, Graduate Student
Shyam Boriah, Graduate Student
Luchiana Brodeala, Visiting Researcher
Ivan C. Brugere, Graduate Student
Yashu Chamber, Graduate Student
Vijay Chaudhari, Graduate Student
Deepthi Cheboli, Graduate Student
Xi Chen, Graduate Student
Sanjoy Dey, Graduate Student
Marc Dunham, Graduate Student
James H. Faghmous, Graduate Student
Gang Fang, Graduate Student
Ashish Gard, Graduate Student
Dhruv Goel, Undergraduate Student
Atluri Gowtham, Graduate Student
Rohit Gupta, Graduate Student
Vibhor Gupta, Visiting Researcher
Tushar Gupta, Visiting Researcher
Ravi Janardan, Faculty Collaborator
Matt E. Kappel, Graduate Student
Anuj Karpatne, Visiting Researcher
Raghav Pavan Srivatsav Karumur, Graduate Student
Jaya Kawale, Research Associate
Vikrant Krishna, Graduate Student
Sairam Krishnamurthy, Graduate Student
Aditya Kulkarni, Graduate Student
Shashank Kumar, Visiting Researcher
Sean Landman, Graduate Student
Aleksander Lazarevic, Research Associate
Peter Li, Research Associate
Kelvin O. Lim, Faculty Collaborator
Xiaoye Liu, Undergraduate Student
Varun Mithal, Undergraduate Student
Benjamin W. Oatley, Undergraduate Student
Robert Olabode, Graduate Student
Gaurav Pandey, Graduate Student
Vanja Paunic, Graduate Student
Saket Saurabh, Visiting Researcher
Garima Sharma, Visiting Researcher
Michael S. Steinbach, Research Associate
Pang Tan, Research Associate
Sruthi Vangala, Graduate Student
Mark Wagy, Undergraduate Student
Libing Wang, Collaborator
Wen Wang, Graduate Student
Baylor Wetzel, Graduate Student