IN11C-1787
Light-weight Parallel Python Tools for Earth System Modeling Workflows

Monday, 14 December 2015
Poster Hall (Moscone South)
Kevin Paul, Sheri A Mickelson, Haiying Xu, John Dennis and David I Brown, National Center for Atmospheric Research, Boulder, CO, United States
Abstract:
With the growth in computing power over the last 30 years, earth system modeling codes have become increasingly data-intensive. As an example, it is expected that the data required for the next Intergovernmental Panel on Climate Change (IPCC) Assessment Report (AR6) will increase by more than 10x to an expected 25PB per climate model. Faced with this daunting challenge, developers of the Community Earth System Model (CESM) have chosen to change the format of their data for long-term storage from time-slice to time-series, in order to reduce the required download bandwidth needed for later analysis and post-processing by climate scientists. Hence, efficient tools are required to (1) perform the transformation of the data from time-slice to time-series format and to (2) compute climatology statistics, needed for many diagnostic computations, on the resulting time-series data. To address the first of these two challenges, we have developed a parallel Python tool for converting time-slice model output to time-series format. To address the second of these challenges, we have developed a parallel Python tool to perform fast time-averaging of time-series data. These tools are designed to be light-weight, be easy to install, have very few dependencies, and can be easily inserted into the Earth system modeling workflow with negligible disruption. In this work, we present the motivation, approach, and testing results of these two light-weight parallel Python tools, as well as our plans for future research and development.