Provenance of Earth Science Datasets – How Deep Should One Go?

Tuesday, 15 December 2015
Poster Hall (Moscone South)
Gerald John Maramba Manipon (1), Hampapuram Ramapriyan (2), Steve Aulenbach (3), Brian Duggan (3), Justin Goldstein (3), Hook Hua (4), Dexter Tan (5), Curt Tilmes (6), Brian D Wilson (1), Robert Wolfe (7) and Stephan Zednik (8)
(1) Jet Propulsion Laboratory, Pasadena, CA, United States, (2) Science Systems and Applications, Inc., Lanham, MD, United States, (3) US Global Change Research Program, Washington D.C., DC, United States, (4) NASA Jet Propulsion Laboratory, Pasadena, CA, United States, (5) Raytheon Company Pasadena, Pasadena, CA, United States, (6) NASA Goddard Space Flight Center, Greenbelt, MD, United States, (7) University Corporation for Atmospheric Research, Boulder, CO, United States, (8) Rensselaer Polytechnic Institute, Troy, NY, United States
Transparency and reproducibility are essential to the credibility of scientific research. This fundamental tenet has been emphasized for centuries and has received increased attention in recent years. The Office of Management and Budget (2002) addressed reproducibility and other aspects of the quality and utility of information from federal agencies. NASA's specific guidelines (2002) derive from that OMB guidance. According to these guidelines, “NASA requires a higher standard of quality for information that is considered influential. Influential scientific, financial, or statistical information is defined as NASA information that, when disseminated, will have or does have clear and substantial impact on important public policies or important private sector decisions.” For information to be compliant, “the information must be transparent and reproducible to the greatest possible extent.”

We present how the principles of transparency and reproducibility have been applied to NASA data supporting the Third National Climate Assessment (NCA3). How deeply the provenance of data used to derive conclusions in NCA3 needs to be traced depends on how the data were used (e.g., qualitatively or quantitatively). Provided that the information is diligently maintained in the agency archives, it is possible to trace from a figure in the publication back through the datasets, specific files, algorithm versions, instruments used for data collection, and satellites, as well as the individuals and organizations involved in each step. Such a traceback enables transparency and reproducibility.
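
As an illustration only, the trace from a figure back to a satellite can be pictured as a walk over a small "derived from" graph with an optional depth limit. The sketch below is not the system used for NCA3; the `DERIVED_FROM` table, the `trace` function, and every identifier in it are hypothetical placeholders chosen to mirror the chain described above (figure, dataset, files, algorithm version, instrument, satellite). People and organizations are omitted for brevity.

```python
from collections import deque

# Hypothetical provenance links: each entity maps to the entities it was
# derived from. All identifiers are illustrative placeholders, not real records.
DERIVED_FROM = {
    "figure:example-figure": ["dataset:example-dataset"],
    "dataset:example-dataset": [
        "file:example-granule-001",
        "file:example-granule-002",
    ],
    "file:example-granule-001": ["algorithm:example-algorithm-v2"],
    "file:example-granule-002": ["algorithm:example-algorithm-v2"],
    "algorithm:example-algorithm-v2": ["instrument:example-instrument"],
    "instrument:example-instrument": ["satellite:example-satellite"],
}


def trace(start, max_depth=None):
    """Breadth-first walk from `start` toward its sources.

    Returns a list of (depth, entity) pairs, visiting each entity once.
    If `max_depth` is given, entities at that depth are reported but
    not expanded further.
    """
    seen = {start}
    queue = deque([(0, start)])
    result = []
    while queue:
        depth, entity = queue.popleft()
        result.append((depth, entity))
        if max_depth is not None and depth >= max_depth:
            continue  # deep enough for this use; stop expanding here
        for source in DERIVED_FROM.get(entity, []):
            if source not in seen:
                seen.add(source)
                queue.append((depth + 1, source))
    return result


if __name__ == "__main__":
    # A shallow trace (figure and dataset only) versus a full trace.
    for depth, entity in trace("figure:example-figure", max_depth=1):
        print(depth, entity)
    for depth, entity in trace("figure:example-figure"):
        print(depth, entity)
```

In this sketch, a shallow `max_depth` might suit data that were used only qualitatively, while a full trace might suit data used quantitatively. That mapping is an assumption made for illustration, not a rule stated in the abstract.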