ISO TC211 standards on Provenance for Earth science

Wednesday, 17 December 2014: 5:00 PM
Liping Di and Meixia Deng, George Mason University, Fairfax, VA, United States
Data provenance, also called lineage, records the derivation history of a data product. That history can include the algorithms used, the process steps taken, the computing environment in which they ran, the input data sources, and the organization or person responsible for the product. Provenance gives data users the information they need to judge the usability and reliability of a product. In science, data provenance is especially important because scientists use it to determine the scientific validity of a data product and to decide whether the product can serve as the basis for further scientific analysis.

Provenance is a kind of metadata. In the Earth science domain, the International Organization for Standardization (ISO) Technical Committee 211 (ISO TC 211) has set metadata standards for geospatial data, including ISO 19115:2003 (Metadata), ISO 19115-2:2009 (Metadata, Part 2: Extensions for imagery and gridded data), and ISO 19115-1:2014 (Metadata, Part 1: Fundamentals). ISO 19115 and ISO 19115-1 define the fundamental metadata for documenting geospatial data products, and ISO 19115-2 provides additional metadata classes for imagery and gridded data. ISO 19115-1:2014 is the revised version of ISO 19115:2003.

ISO 19115 and ISO 19115-1 define fundamental lineage information classes and subclasses. They lack some key information classes needed for documenting provenance in the Web service environment, such as the running environment, the algorithms, and the software executables. ISO 19115-2, however, extends the lineage model in ISO 19115 and provides the additional metadata classes needed for documenting this provenance information. Together, the lineage models in ISO 19115 and ISO 19115-2 provide a comprehensive provenance information model for the Web service environment. Currently, the ISO provenance standard is not compatible with the W3C PROV standard. The revision of ISO 19115-2 will start in November 2014. The revision process will provide an opportunity to harmonize the ISO provenance model with the W3C PROV standard and for the Earth science community to provide input for improving the ISO provenance model.
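
As a rough illustration of how the two lineage models combine, the sketch below represents a lineage record with Python dataclasses. The class names loosely follow the LI_ (ISO 19115) and LE_ (ISO 19115-2) lineage classes, but the attribute names are simplified, and the example values and the helper function steps_without_processing are hypothetical and not taken from the standards. It is a minimal sketch under those assumptions, not an implementation of either standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Source:
    """Input data source, loosely after LI_Source (ISO 19115)."""
    description: str


@dataclass
class Algorithm:
    """Algorithm used by a process, loosely after LE_Algorithm (ISO 19115-2)."""
    citation: str
    description: str


@dataclass
class Processing:
    """Software and run-time details, loosely after LE_Processing (ISO 19115-2)."""
    identifier: str
    software_reference: Optional[str] = None
    run_time_parameters: Optional[str] = None
    algorithms: List[Algorithm] = field(default_factory=list)


@dataclass
class ProcessStep:
    """Process step: the ISO 19115 core plus the optional ISO 19115-2 extension."""
    description: str
    processor: Optional[str] = None             # responsible organization or person
    sources: List[Source] = field(default_factory=list)
    processing: Optional[Processing] = None     # present only with the extension


@dataclass
class Lineage:
    """Top-level lineage record, loosely after LI_Lineage (ISO 19115)."""
    statement: str
    process_steps: List[ProcessStep] = field(default_factory=list)


def steps_without_processing(lineage: Lineage) -> List[str]:
    """Hypothetical helper: list steps that carry no software/algorithm details."""
    return [s.description for s in lineage.process_steps if s.processing is None]


# Hypothetical example values, for illustration only.
lineage = Lineage(
    statement="Hypothetical vegetation index product",
    process_steps=[
        ProcessStep(
            description="Compute vegetation index",
            processor="Example Data Center",
            sources=[Source("Hypothetical surface reflectance granule")],
            processing=Processing(
                identifier="vi-service-1.0",
                software_reference="Hypothetical web service executable",
                run_time_parameters="window=16d",
                algorithms=[Algorithm("Hypothetical algorithm document",
                                      "Band ratio index")],
            ),
        ),
        ProcessStep(description="Reproject to a common grid"),
    ],
)

print(steps_without_processing(lineage))
```

In this sketch, a step recorded with only the core fields (description, processor, sources) says nothing about the software or algorithm that produced it, which is the gap the ISO 19115-2 extension classes are meant to fill.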