UMSI 2000 Annual Report: George Karypis, Principal Investigator Previous Page  |  Table of Contents  |  Next Page

George Karypis, Principal Investigator


Scalable Algorithms for Graph/Mesh Partitioning and for Scientific Data Mining

Algorithms that find a good partitioning of highly unstructured and irregular graphs are critical for developing efficient solutions of a wide range of problems including parallel execution of scientific simulations. The traditional graph partitioning problem focuses on computing a k-way partition of a graph such that the edge-cut is minimized and each partition has an equal number of vertices (or in the case of weighted graphs, the sum of the vertex-weights in each partition are the same). The task of minimizing the edge-cut can be considered as the objective and the requirement that the partitions will be of the same size can be considered as the constraint. Unfortunately, this single-constraint single-objective graph-partitioning problem is not sufficient to model the underlying requirements of many current and emerging applications, especially in the area of high-performance scientific simulations. In particular, the effective parallel solution of multi-physics and multi-phase computations in areas such as automobile engine design, crash-worthiness testing, fluid dynamics, and structural mechanics, to name a few, requires that the partitioning algorithm simultaneously balance the computations performed during each one of the phases and minimize the various communication overheads-none of which can be accomplished by the current graph models and partitioning algorithms. The key characteristic of these applications is that they require the partitioning algorithm to handle an arbitrary number of balancing constraints as well as an arbitrary number of optimization objectives.

Due to advances in informational technology and high-performance computing, very large data sets are becoming available in many scientific disciplines. The rate of production of such data sets far outstrips the ability to analyze them manually. There is an increasing interest in scientific communities in exploring the use of emerging data mining techniques to analyze these large data sets. To date, the majority of the research in data mining has been focused on developing algorithms for data sets arising in various business, information retrieval, and financial applications. However, data mining techniques hold great promise for analyzing massive data sets resulting from scientific simulations. Data mining techniques can be used to develop a new set of analysis tools that will automatically analyze the data and allow engineers and scientists to gain insight into the underlying mechanisms of the dynamic physical processes.

In this approach, feature extraction tools are used to analyze the raw simulation data to extract higher order objects that correspond to interesting structures (e.g., vortices in flow simulation data). These objects are then analyzed using clustering and pattern discovery algorithms to provide high-level information and, thus, help unravel the casual relationships in the underlying physical mechanism.

Research Group

Michihiro Kuramochi, Graduate Student Researcher
Irene Moulitsas, Graduate Student Researcher
Shrikanth Shankar, Graduate Student Researcher

The research in this project is focusing on developing scalable algorithms for multi-constraint and multi-objective graph partitioning and for mining large scientific data sets. This work is based upon earlier work on developing highly effective and scalable graph partitioning algorithms available in the METIS and ParMETIS libraries and work on developing clustering algorithms for high-dimensional data sets and scalable association rule discovery algorithms.


Previous Page  |  Table of Contents  |  Next Page