January 2008 - March 2009

University of Minnesota Twin Cities
Institute of Technology
Department of Computer Science and Engineering

PI: Vipin Kumar, Fellow

High-performance Data Mining

The primary objective of this research is to develop novel, high-performance data-mining algorithms and tools for large-scale datasets that arise in a variety of applications. Examples include data from high-throughput biological experiments, such as those involving genome sequencing, gene expression, or protein interaction; gigabyte datasets collected by earth-observing satellites, which must be processed to better understand global-scale changes in biosphere processes and patterns; data generated by scientific simulations, which can be used to gain insight into the underlying physical processes; data obtained by monitoring network traffic to detect illegal network activities; and collections of text and hypertext, which are analyzed to extract relevant information. The key technical challenges in mining these datasets include the high volume, dimensionality, and heterogeneity of the data; its spatio-temporal aspects; possibly skewed class distributions; its distributed nature; and the complexity of converting raw collected data into high-level features. High-performance data mining is essential for analyzing the growing amount of data and for providing analysts with automated tools that facilitate some of the steps needed for hypothesis generation and evaluation.

Group Members

Gowtham Atluri, Graduate Student
Shyam Boriah, Graduate Student
Varun Chandola, Graduate Student
Deepthi Cheboli, Graduate Student
Sanjoy Dey, Graduate Student
Eric Eilertson, Graduate Student
Levent Ertoz, Graduate Student
Gang Fang, Graduate Student
Tushar Garg, Graduate Student
Ananth Y. Grama, Department of Computer Science, Purdue University, West Lafayette, Indiana
Anshul Gupta, IBM, T.J. Watson Research Center, Yorktown Heights, New York
Rohit Gupta, Graduate Student
Ravi Janardan, Faculty Collaborator
Aleksander Lazarevic, Research Associate
Robert Olabode, Graduate Student
Aysel Ozgur, Graduate Student
Uygar Oztekin, Collaborator
Gaurav Pandey, Graduate Student
Vanja Paunic, Graduate Student
Sanjay Ranka, Department of Computer and Information Science and Engineering, University of Florida, Gainesville, Florida
Kirk A. Schloegel, Honeywell, Minneapolis, Minnesota
Gyorgy Simon, Graduate Student
Michael S. Steinbach, Graduate Student
Pang Tan, Department of Computer Science, Michigan State University, East Lansing, Michigan
Fernando Torre, Graduate Student
Baylor Wetzel, Graduate Student