ReaLSAT, A New Dataset of Reservoir and Lake Changes

map showing lake areas

The condition of lakes and reservoirs is of great interest to researchers because these bodies of water show the effects of climate change and human activities. The research group of MSI PI Vipin Kumar (Regents Professor, Computer Science and Engineering) has published a new dataset, ReaLSAT, containing information on over 680,000 lakes and reservoirs larger than 0.1 km² and south of 50 degrees N, using data from 1984 through 2015. The dataset makes it possible to study the effects of climate change and human activity on surface water bodies on a global scale. It improves on existing datasets, which are static: ReaLSAT captures change over time, with a time series of surface area and monthly shapes for each lake. ReaLSAT is also far more comprehensive than existing datasets of surface water bodies; for example, it contains nearly three times as many water bodies as HydroLAKES does for the region covered by ReaLSAT. The researchers used machine-learning techniques to create the dataset, with the computational work performed on MSI’s supercomputers.

The paper can be found on the Scientific Data website: ReaLSAT, a Global Dataset of Reservoir and Lake Surface Area Variations. The National Science Foundation, which provided funding for this project, published an article about the paper on its website: Scientists Use New Technique to Identify Changes in Lakes and Reservoirs Around the World. A Research Brief article can also be found on the University of Minnesota News site: Data Scientists Use New Techniques to Identify Lakes and Reservoirs Around the World.

Regents Professor Kumar is a long-time PI at MSI, and his group members are active users of MSI’s systems. The Kumar group participated in early user testing of MSI’s newest supercomputer, Agate, which went into full production on April 27, 2022. A story about another modeling project by this group appeared on the MSI website in May 2022: New Machine Learning Model Integrates Static and Time-Series Data.

Image description: An image of Minnesota lakes identified using the ReaLSAT dataset (red) is combined with a similar image of the area where lakes were identified in the previous HydroLAKES dataset (blue). The ReaLSAT dataset identifies almost three times as many lakes and reservoirs worldwide as HydroLAKES. Credit: ReaLSAT, University of Minnesota.

posted on August 16, 2022
