Research Abstracts Online
January - December 2011
University of Minnesota Twin Cities
College of Science and Engineering
Department of Computer Science and Engineering
PI: Shashi Shekhar, Fellow
Discovering Intervals of Abrupt Change in Spatiotemporal Datasets
In earth science data (e.g., climate data), a persistent, abrupt change in value is often observed within a certain time period or spatial interval. For example, abrupt climate change is defined as an unusually large shift in precipitation, temperature, etc., that occurs during a relatively short time period. A similar pattern can also be found in geographical space, where it represents a sharp transition of the environment (e.g., vegetation cover between different ecological zones). Identifying such intervals of change in earth science datasets is a crucial step toward understanding and attributing the underlying phenomenon.
In this work, the researchers analyze earth science data using a novel, automated data mining approach to identify spatial and temporal intervals of persistent, abrupt change. They developed a statistical model that quantifies the abruptness and persistence of change within an interval, and then designed an algorithm that exhaustively examines all intervals. They evaluated the method on Sahel rainfall index data and on vegetation cover data for Africa (measured as NDVI values). Results show that the method can find intervals of abrupt change at different scales, both in space (e.g., ecotones such as the Sahel region) and in time (e.g., precipitation shifts).
They have further optimized the algorithm using a top-down search strategy, which significantly reduced the computational cost of the original algorithm on the above test cases. More significantly, the optimized algorithm is also proven to scale well with data volume and with the number of change intervals.
The researchers are using MSI resources for experimental analysis and to identify intervals of abrupt change in precipitation datasets. They are starting with pattern discovery on a real dataset that has billions of data entries to validate the approach. They will next evaluate the algorithm's scalability using large synthetic datasets ranging from gigabytes to terabytes, a time-consuming process that requires intensive computation.
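As a rough illustration of the exhaustive interval search described above, the following Python sketch enumerates every interval of a one-dimensional series and keeps those that pass two simple tests. The abstract does not specify the researchers' statistical model, so the scoring here is an assumption made purely for illustration: "abruptness" is taken to be the interval's average per-step change divided by the average absolute step over the whole series, and "persistence" is the fraction of steps that move in the interval's overall direction. The function name `find_abrupt_intervals` and the parameters `min_ratio` and `min_persistence` are hypothetical. The sketch covers only the naive exhaustive search, not the top-down optimization.

```python
from typing import List, Sequence, Tuple


def find_abrupt_intervals(
    series: Sequence[float],
    min_ratio: float = 2.0,        # hypothetical abruptness threshold
    min_persistence: float = 1.0,  # hypothetical persistence threshold
) -> List[Tuple[int, int, float]]:
    """Return (start, end, abruptness) for every qualifying interval.

    Illustrative scoring only (an assumption, not the published model):
      abruptness  = |series[end] - series[start]| / steps / baseline
      persistence = share of steps moving in the interval's net direction
    where baseline is the mean absolute step over the whole series.
    """
    n = len(series)
    if n < 2:
        return []
    steps = [series[k + 1] - series[k] for k in range(n - 1)]
    baseline = sum(abs(s) for s in steps) / len(steps)
    if baseline == 0:
        return []  # constant series: no change anywhere

    results = []
    # Exhaustively examine all intervals [i, j] with i < j.
    for i in range(n - 1):
        for j in range(i + 1, n):
            total = series[j] - series[i]
            if total == 0:
                continue
            direction = 1 if total > 0 else -1
            window = steps[i:j]
            persistence = sum(1 for s in window if s * direction > 0) / len(window)
            abruptness = abs(total) / len(window) / baseline
            if persistence >= min_persistence and abruptness >= min_ratio:
                results.append((i, j, abruptness))
    return results


# Toy series (made-up values): a sharp drop between indices 3 and 6.
toy = [50, 51, 50, 51, 40, 30, 20, 21, 20, 21]
found = find_abrupt_intervals(toy)
longest = max(found, key=lambda r: r[1] - r[0])
print(longest[:2])
```

This version checks on the order of n² intervals and rescans each one, which is why a naive exhaustive search becomes expensive on large datasets and why a pruning strategy such as the top-down search mentioned above matters.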
Michael Robert Evans, Graduate Student
Visanath Gunturi, Graduate Student
Zhe Jiang, Graduate Student
Pradeep Mohan, Graduate Student
Dev Oliver, Graduate Student
KwangSoo Yang, Graduate Student
Zhou Xun, Graduate Student