Linking Geobiology Fieldwork and Data Curation Through Workflow Documentation

Thursday, 18 December 2014
Andrea Thomer (1), Karen S. Baker (1), Jacob G. Jett (1), Sean Gordon (2) and Carole L. Palmer (1); (1) University of Illinois at Urbana-Champaign, Champaign, IL, United States; (2) University of Illinois at Urbana-Champaign, Urbana, IL, United States
Describing the specific processes and artifacts that lead to the creation of data products provides a detailed picture of data provenance in the form of a high-level workflow. The resulting diagram identifies:
1. “points of intervention” at which curation processes can be moved upstream, and
2. data products that may be important for sharing and preservation.

The Site-Based Data Curation project, an Institute of Museum and Library Services-funded project hosted by the Center for Informatics Research in Science and Scholarship at the University of Illinois, previously inferred a geobiologist’s planning, field and laboratory workflows through close study of the data products produced during a single field trip to Yellowstone National Park (Wickett et al., 2013). We have since built on this work by documenting post hoc curation processes and integrating them with the existing workflow. By considering data collection and curation together, we are able to identify concrete steps that scientists can take to begin curating data in the field. This field-to-repository workflow represents a first step toward a more comprehensive and nuanced model of the research data lifecycle.

Using our initial three-phase workflow, we identify key data products to prioritize for curation, and the points at which data curation best practices can be integrated into research processes with minimal interruption. We then document the processes that make key data products sharable and ready for preservation. We append the resulting three curatorial phases (Data Staging, Data Standardizing and Data Packaging) to the field data collection workflow. These refinements demonstrate:
1) the interdependence of research and curatorial phases;
2) the links between specific research products, research phases and curatorial processes; and
3) the interdependence of laboratory-specific standards and community-wide best practices.
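
As an illustration only, the six-phase workflow can be sketched as a small data structure. The phase names are taken from this abstract; the class names, the helper function and the assumed ordering of the three research phases are hypothetical and are not part of the project's own tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Whether a phase belongs to the research or the curatorial part."""
    RESEARCH = "research"
    CURATION = "curation"


@dataclass(frozen=True)
class Phase:
    name: str
    kind: Kind


# Phase names come from the abstract. The order of the three research
# phases is an assumption based on the order in which they are listed.
WORKFLOW = [
    Phase("Planning", Kind.RESEARCH),
    Phase("Field", Kind.RESEARCH),
    Phase("Laboratory", Kind.RESEARCH),
    Phase("Data Staging", Kind.CURATION),
    Phase("Data Standardizing", Kind.CURATION),
    Phase("Data Packaging", Kind.CURATION),
]


def phases_of(kind):
    """Return the names of all phases of the given kind, in workflow order."""
    return [phase.name for phase in WORKFLOW if phase.kind is kind]


if __name__ == "__main__":
    print(phases_of(Kind.RESEARCH))
    print(phases_of(Kind.CURATION))
```

A structure like this makes the split between research and curatorial phases explicit, which is the distinction the workflow diagram is meant to expose.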

We propose a poster that shows the six-phase workflow described above. We plan to discuss with attendees how well this workflow applies to other types of field-based research, and to seek opportunities for future case studies.

References: Wickett, K., Thomer, A. K., Baker, K. S., DiLauro, T., & Asangba, A. E. (2013). How Workflow Documentation Facilitates Curation Planning. In AGU Fall Meeting Abstracts (Vol. 1, p. 1556).