Application of data cubes for improving detection of water cycle extreme events
Thursday, 17 December 2015
Poster Hall (Moscone South)
As part of an ongoing NASA-funded project to remove a longstanding barrier to accessing NASA data (i.e., accessing archived time-step array data as point-time series), for the hydrology and other point-time series-oriented communities, "data cubes" are created from which time series files (aka "data rods") are generated on-the-fly and made available as Web services from the Goddard Earth Sciences Data and Information Services Center (GES DISC). Data cubes are data as archived rearranged into spatio-temporal matrices, which allow for easy access to the data, both spatially and temporally. A data cube is a specific case of the general optimal strategy of reorganizing data to match the desired means of access. The gain from such reorganization is greater the larger the data set. As a use case for our project, we are leveraging existing software to explore the application of the data cubes concept to machine learning, for the purpose of detecting water cycle extreme (WCE) events, a specific case of anomaly detection, requiring time series data. We investigate the use of the sequential probability ratio test (SPRT) for anomaly detection and support vector machines (SVM) for anomaly classification. We show an example of detection of WCE events, using the Global Land Data Assimilation Systems (GLDAS) data set.