IN31A-1750
EXTENDING CLIMATE ANALYTICS-AS-A-SERVICE TO THE EARTH SYSTEM GRID FEDERATION

Wednesday, 16 December 2015
Poster Hall (Moscone South)
Glenn Tamkin, NASA Goddard Space Flight Center, Greenbelt, MD, United States and NASA GSFC Office of Computational and Information Sciences and Technology (CISTO)
Abstract:
We are building three extensions to prior-funded work on climate analytics-as-a-service that will benefit the Earth System Grid Federation (ESGF) as it addresses the Big Data challenges of future climate research: (1) We are creating a cloud-based, high-performance Virtual Real-Time Analytics Testbed supporting a select set of climate variables from six major reanalysis data sets. This near real-time capability will enable advanced technologies like the Cloudera Impala-based Structured Query Language (SQL) query capabilities and Hadoop-based MapReduce analytics over native NetCDF files while providing a platform for community experimentation with emerging analytic technologies. (2) We are building a full-featured Reanalysis Ensemble Service comprising monthly means data from six reanalysis data sets. The service will provide a basic set of commonly used operations over the reanalysis collections. The operations will be made accessible through NASA's climate data analytics Web services and our client-side Climate Data Services (CDS) API. (3) We are establishing an Open Geospatial Consortium (OGC) WPS-compliant Web service interface to our climate data analytics service that will enable greater interoperability with next-generation ESGF capabilities. The CDS API will be extended to accommodate the new WPS Web service endpoints as well as ESGF's Web service endpoints. These activities address some of the most important technical challenges for server-side analytics and support the research community's requirements for improved interoperability and improved access to reanalysis data.