## Research Abstracts Online

January - December 2011


### University of Minnesota Twin Cities

College of Science and Engineering

Department of Computer Science and Engineering

# PI: David H.C. Du, Fellow

### Exascale I/O and Parallel File System Workload Characterization; Storage Subsystem Modeling and Analysis

These researchers are using MSI for two projects. The goal of the first is to accurately predict parallel file system I/O workload characteristics at exascale and to generate synthetic exascale I/O workloads based on appropriate mathematical models. Accordingly, the target applications will be typical HPC applications and/or parallel scientific applications.

The first phase of this project involves learning about file system I/O workload characteristics at the current computing scale and becoming familiar with the workload capture, replay, visualization, and analysis techniques that assist in I/O workload characterization. Many dimensions can be used to describe a file system I/O workload, and identifying the relevant ones is the most important and critical task, especially for a parallel file system I/O workload.

The second phase of this project will focus on parallel file system I/O workload characterization and can be divided into three steps. First, the researchers will predict the exascale computing environment, including computing infrastructure, software stack, I/O library stack, and HPC application behaviors, among others. By doing this, they can theoretically sketch out how workload characteristics may change relative to petascale computing. Second, they will identify the important dimensions for describing the exascale I/O workload and design parameters for each of them. Finally, they will complete the mathematical modeling and generate the synthetic workload based on the final mathematical model. This project tries to define I/O workload such that, given a storage system, one can predict its performance under a specified workload.
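As an illustration of generating a synthetic workload from a parameterized model, the following is a minimal sketch and not the researchers' actual model. The dimensions used here (read fraction, exponential inter-arrival times, a weighted set of request sizes, and a sequentiality probability) and all names such as `WorkloadModel` and `generate` are assumptions chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class WorkloadModel:
    # Every field is a hypothetical workload dimension chosen for illustration.
    read_fraction: float        # probability that a request is a read
    mean_interarrival_s: float  # mean of the exponential inter-arrival time
    request_sizes: tuple        # candidate request sizes in bytes
    size_weights: tuple         # relative weight of each request size
    sequential_fraction: float  # probability a request continues at the previous end offset
    file_size: int              # size of the target file in bytes


def generate(model, n_requests, seed=0):
    """Return a list of (time_s, op, offset, size) tuples drawn from the model."""
    rng = random.Random(seed)
    t = 0.0
    next_offset = 0
    trace = []
    for _ in range(n_requests):
        # expovariate takes a rate, so pass 1 / mean.
        t += rng.expovariate(1.0 / model.mean_interarrival_s)
        op = "read" if rng.random() < model.read_fraction else "write"
        size = rng.choices(model.request_sizes, weights=model.size_weights)[0]
        fits = next_offset + size <= model.file_size
        if fits and rng.random() < model.sequential_fraction:
            offset = next_offset  # sequential access
        else:
            offset = rng.randrange(0, model.file_size - size + 1)  # random access
        next_offset = offset + size
        trace.append((t, op, offset, size))
    return trace


# Example parameter values are made up, not measured.
model = WorkloadModel(
    read_fraction=0.7,
    mean_interarrival_s=0.001,
    request_sizes=(4096, 65536, 1048576),
    size_weights=(0.5, 0.3, 0.2),
    sequential_fraction=0.8,
    file_size=1 << 30,
)
trace = generate(model, 1000)
```

A fixed seed makes the generated trace reproducible, which is convenient when the same synthetic workload has to be replayed against several storage configurations.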

The second project involves storage subsystem modeling. The goal of this project is to find an optimized data placement algorithm for different kinds of storage media. The researchers will model a storage subsystem using an in-house simulator that captures storage device characteristics and models a real storage subsystem under different data placement algorithms. Using this simulation engine, researchers can easily develop and test various data placement algorithms.
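The idea of comparing placement algorithms in a simulator can be sketched as follows. This is a toy model under stated assumptions, not the in-house simulator: each device is reduced to a fixed per-request latency plus a transfer rate, the device parameters are invented, and `Device`, `simulate`, `all_on_hdd`, and `make_hot_on_ssd` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    latency_s: float      # fixed per-request latency (assumed value)
    bandwidth_bps: float  # sustained transfer rate in bytes per second (assumed value)

    def service_time(self, size):
        return self.latency_s + size / self.bandwidth_bps


# Illustrative device characteristics, not measurements.
DEVICES = {
    "hdd": Device("hdd", latency_s=0.008, bandwidth_bps=100e6),
    "ssd": Device("ssd", latency_s=0.0001, bandwidth_bps=400e6),
}


def simulate(accesses, placement, devices):
    """Sum the service time of (block_id, size) accesses under a placement policy."""
    return sum(devices[placement(block)].service_time(size)
               for block, size in accesses)


def all_on_hdd(block):
    # Baseline policy: every block lives on the disk.
    return "hdd"


def make_hot_on_ssd(accesses, ssd_blocks):
    # Policy: the most frequently accessed blocks go to the SSD, the rest to disk.
    counts = Counter(block for block, _ in accesses)
    hot = {block for block, _ in counts.most_common(ssd_blocks)}
    return lambda block: "ssd" if block in hot else "hdd"


# A skewed access pattern: blocks 0-9 are hot, blocks 10-109 are touched once.
accesses = [(i % 10, 4096) for i in range(900)] + [(i, 4096) for i in range(10, 110)]

hdd_only_cost = simulate(accesses, all_on_hdd, DEVICES)
tiered_cost = simulate(accesses, make_hot_on_ssd(accesses, 10), DEVICES)
```

Because a placement policy is just a function from block to device name, new policies can be added and compared without changing the device model or the simulation loop.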

### Group Members

Weiping He, Graduate Student

Guanlin Lu, Graduate Student

Muthukumar Murugan, Research Associate