IN33A-1784
Provenance of things - describing geochemistry observation workflows using PROV-O

Wednesday, 16 December 2015
Poster Hall (Moscone South)
Simon J D Cox, CSIRO, Land and Water, Clayton, Vic., Australia and Nicholas J Car, CSIRO, Land and Water, Brisbane, Australia
Abstract:
Geochemistry observations typically follow a complex preparation process after sample retrieval from the field. Descriptions of these processes are required to allow readers and other data users to assess the reliability of the data produced, and to ensure reproducibility. While laboratory notebooks are used for private record-keeping, and laboratory information management systems (LIMS) are used on a facility basis, these records are not generally published, and there are no standard formats for transfer. And while there is some standardization of workflows, this is often scoped to a single lab or instrument. New procedures and workflows are being developed continually - in fact this is a key expectation in the development of the science. Thus formalization of the description of sample preparation and observations must be both rigorous and flexible.

We have been exploring the use of the W3C Provenance model (PROV) to capture complete traces, including both the real-world things and the data generated. PROV has a core data model that distinguishes between the entities, agents and activities involved in producing a piece of data or a thing in the world. While the design of PROV was primarily conditioned by stories concerning information resources, its application is not restricted to the production of digital or information assets. PROV allows a comprehensive trace of predecessor entities and transformations at any level of detail. In this paper we demonstrate the use of PROV for describing specimens managed for scientific observations. Two examples are considered: a geological sample which undergoes a typical preparation process for measurements of the concentration of a particular chemical substance, and the collection, taxonomic classification and eventual publication of an insect specimen.

PROV enables the material that goes into the instrument to be linked back to the sample retrieved in the field. This complements the IGSN system, which focuses on registration of field sample identity to support the correlation of results from different investigations on the same samples. Together with the sampling model from OGC O&M, these provide a rigorous and flexible system for formalizing geochemistry observation metadata.
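
As an illustration of how such a trace links an instrument result back to the field sample, the following minimal Python sketch models a preparation chain using the three PROV core classes (entities, activities, agents) and the relations prov:used, prov:wasGeneratedBy and prov:wasAssociatedWith. The preparation steps, entity names and agent names are hypothetical and chosen only for illustration; they are not taken from the workflows described in the poster, and the sketch uses plain Python structures rather than any particular PROV library.

```python
from typing import List, NamedTuple


class Activity(NamedTuple):
    """A PROV activity, with the entities it used and generated."""
    name: str
    used: List[str]        # prov:used (input entities)
    generated: List[str]   # inverse of prov:wasGeneratedBy (output entities)
    agent: str             # prov:wasAssociatedWith


# Hypothetical preparation chain: every name below is an assumption.
ACTIVITIES = [
    Activity("crushing", ["field_sample"], ["powder"], "prep_lab_technician"),
    Activity("dissolution", ["powder"], ["solution"], "prep_lab_technician"),
    Activity("measurement", ["solution"], ["concentration_result"], "instrument_operator"),
]


def trace_back(entity: str, activities: List[Activity]) -> List[str]:
    """Return the entity followed by all of its predecessor entities.

    Follows wasGeneratedBy then used links until reaching entities
    that no recorded activity generated (e.g. the field sample).
    """
    generated_by = {e: a for a in activities for e in a.generated}
    lineage, stack, seen = [], [entity], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        lineage.append(current)
        activity = generated_by.get(current)
        if activity is not None:
            stack.extend(activity.used)
    return lineage


if __name__ == "__main__":
    print(trace_back("concentration_result", ACTIVITIES))
```

In this sketch the field sample is the only entity with no generating activity, so it terminates the trace; in practice it is the item that would carry a registered sample identifier such as an IGSN.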