Data Preservation, Information Preservation, and life-cycle of information management at NASA GES DISC

Friday, 19 December 2014
Mo G Khayat (1,2), Barbara Deshong (1,2), Asghar E Esfandiari (1,2), Irina V Gerasimov (1,2), James E Johnson (1,2), Steven J Kempler (1) and Jennifer C Wei (1,2); (1) NASA Goddard Space Flight Center, Greenbelt, MD, United States; (2) ADNET Systems Inc. Greenbelt, Greenbelt, MD, United States
Awareness of data lifecycle management is common today, and planners are more likely to consider lifecycle issues at mission start. NASA remote sensing missions are typically subject to the lifecycle management plans of a Distributed Active Archive Center (DAAC), and NASA invests in these national centers to safeguard the data over the long term for the benefit of future generations. As stewards of older missions, it is incumbent upon us to ensure that a sufficiently comprehensive set of information is preserved to reduce the risk of “information loss”. This risk is greater when the original data experts have moved on or are no longer available. Documentation of processing algorithms, pre-flight calibration data, and input/output configuration parameters used in product generation are examples of digital artifacts that are sometimes not fully preserved. This is the grey area of “information preservation”: the importance of these items is not always clear and requires careful consideration. Missing “metadata” about the intermediate steps used to derive a product could seriously hinder the reproducibility of results or conclusions.

Organizations are rapidly recognizing that the focus of lifecycle preservation needs to broaden from the raw data alone to the more encompassing arena of “information lifecycle management”. By understanding what constitutes information, and the complexities involved, we are better equipped to deliver longer-lasting value from the original data and the knowledge (information) derived from them.

The “NASA Earth Science Data Preservation Content Specification” is an attempt to define the content necessary for long-term preservation. It requires a new lifecycle infrastructure approach, along with content repositories that accommodate artifacts other than raw data. The NASA Goddard Earth Sciences Data and Information Services Center (GES DISC) has set up an open-source Preservation System capable of long-term archiving of digital content to augment its raw data holdings. This repository is being used for missions such as HIRDLS, UARS, TOMS, and OMI, among others. We will report on the status of this implementation, the challenges and lessons learned, and our plans for future evolution to include other missions and services.