Provenance for Earth Science Data Systems

Wednesday, 17 December 2014
Hook Hua, NASA Jet Propulsion Laboratory, Pasadena, CA, United States; Curt Tilmes, NASA Headquarters, Washington, DC, United States; Hampapuram K Ramapriyan, NASA Goddard Space Flight Center, Greenbelt, MD, United States; Brian Duggan, US Global Change Research Program, Washington, DC, United States; Brian D Wilson, Jet Propulsion Laboratory, Pasadena, CA, United States; and Gerald John Maramba Manipon, Raytheon Company Pasadena, Pasadena, CA, United States
Earth Science Data Systems across NASA play a critical role in the processing, management, and analysis of NASA observations. There is a growing need to provide the provenance of the resulting datasets, because scientists increasingly require transparency about how data products were made in order to understand and trust the science results. Lessons learned from Climategate show that there is also public demand for more transparency and understanding of the science process. Science data systems are key to enabling the capture, management, and use of production provenance information. Science analysis may now also involve merging multi-sensor datasets, where lineage can help in understanding the data. However, no formal recommendation exists for an interoperable provenance representation standard for use in NASA's Earth Science Data Systems.

The W3C Provenance Working Group has defined a specification (PROV) for the representation of provenance information. The standard is very general and is intended to support any domain. To better serve the needs of specific domain communities, the standard has several built-in points of extensibility. We will present efforts by NASA’s Earth Science Data Systems Working Group (ESDSWG) on Provenance to develop an Earth Science extension to the PROV specification (PROV-ES), and show how it can be used in science data systems to capture, consume, and interpret provenance information.
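
As a rough illustration of the kind of record involved, the sketch below builds a small provenance document for one processing run and then traces a product back to its inputs. It loosely follows the layout of the PROV-JSON serialization (entities, activities, and `used` / `wasGeneratedBy` relations) and uses `prov:type` with a domain namespace as the extension point. The `ex:` namespace, the type names, the identifiers, and the helper functions are all hypothetical; they are not taken from the PROV-ES vocabulary.

```python
import json

# Hypothetical namespace and type names, for illustration only;
# they are NOT the actual PROV-ES vocabulary.
EX_NS = "http://example.org/earth-science#"


def build_provenance(product_id, input_id, run_id):
    """Return a PROV-JSON-style dict describing one processing run."""
    return {
        "prefix": {"ex": EX_NS},
        # Entities: the input granule and the generated data product.
        "entity": {
            input_id: {"prov:type": "ex:InputGranule"},
            product_id: {"prov:type": "ex:DataProduct"},
        },
        # Activity: the processing run that consumed and produced data.
        "activity": {
            run_id: {"prov:type": "ex:ProcessingRun"},
        },
        # Relations linking the activity to its input and output.
        "used": {
            "_:u1": {"prov:activity": run_id, "prov:entity": input_id},
        },
        "wasGeneratedBy": {
            "_:g1": {"prov:entity": product_id, "prov:activity": run_id},
        },
    }


def inputs_of(doc, product_id):
    """List entities used by the activities that generated product_id."""
    runs = {
        g["prov:activity"]
        for g in doc["wasGeneratedBy"].values()
        if g["prov:entity"] == product_id
    }
    return sorted(
        u["prov:entity"]
        for u in doc["used"].values()
        if u["prov:activity"] in runs
    )


doc = build_provenance("ex:product-001", "ex:granule-A", "ex:run-42")
print(json.dumps(doc, indent=2))
print(inputs_of(doc, "ex:product-001"))
```

The generic PROV terms stay untouched, so a consumer that knows only PROV could still follow the lineage, while a domain-aware consumer could additionally interpret the `ex:` types.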