Realizing the Living Paper using the ProvONE Model for Reproducible Research

Friday, 18 December 2015
Poster Hall (Moscone South)
Matthew B. Jones (1), Christopher S. Jones (1), Bertram Ludäscher (2), Paolo Missier (3), Lauren Walker (1), Peter Slaughter (1), Mark Schildhauer (1) and Victor Cuevas-Vicenttín (4); (1) National Center for Ecological Analysis and Synthesis, Santa Barbara, CA, United States, (2) University of Illinois at Urbana Champaign, Urbana, IL, United States, (3) Newcastle University, Newcastle Upon Tyne, United Kingdom, (4) Universidad Popular Autónoma del Estado de Puebla, Puebla, Mexico
Science has advanced through traditional publications that codify research results as a permanent part of the scientific record. But because publications are static and atomic, researchers can only cite and reference a whole work when building on the prior work of colleagues. The open-source software model has demonstrated a new approach in which strong version control in an open environment can nurture an open ecosystem of software. Developers now commonly fork and extend software, giving proper credit, with less repetition, and with confidence in the relationship to the original software. Through initiatives like 'Beyond the PDF', an analogous model has been imagined for open science, in which software, data, analyses, and derived products become first-class objects within a publishing ecosystem that has evolved to be finer-grained and is realized through a web of linked open data.

We have prototyped a Living Paper concept by developing the ProvONE provenance model for scientific workflows, with prototype deployments in DataONE. ProvONE promotes transparency and openness by describing the authenticity, origin, structure, and processing history of research artifacts and by detailing the steps in computational workflows that produce derived products. To realize the Living Paper, we decompose scientific papers into their constituent products and publish these as compound objects in the DataONE federation of archival repositories. Each individual finding and sub-product of a research project (such as a derived data table, a workflow or script, a figure, an image, or a finding) can be independently stored, versioned, and cited. ProvONE provenance traces link these fine-grained products within and across versions of a paper, and across related papers that extend an original analysis. This allows for open scientific publishing in which researchers extend and modify findings, creating a dynamic, evolving web of results that collectively represent the scientific enterprise. The Living Paper provides detailed metadata for properly interpreting and verifying individual research findings, for tracing the origin of ideas, for launching new lines of inquiry, and for implementing transitive credit for research and engineering.
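
The idea of provenance traces linking fine-grained products can be illustrated with a minimal sketch. It does not use the ProvONE vocabulary or any DataONE API; the identifiers, the `was_derived_from` mapping, and the `lineage` function are all hypothetical names chosen for illustration, and the edges only loosely follow the "was derived from" relation familiar from provenance models. The sketch shows how, once each product is stored separately and linked to its sources, the upstream products that a figure depends on (and that might share in transitive credit) can be found by walking the links.

```python
from collections import deque

# Hypothetical identifiers for illustration only; real archived objects
# would carry persistent identifiers assigned by the repository.
# Each entry reads: product -> the products it was derived from.
was_derived_from = {
    "paper_v2/figure_1": ["paper_v2/derived_table"],
    "paper_v2/derived_table": ["paper_v1/derived_table", "paper_v2/script"],
    "paper_v1/derived_table": ["raw/field_data"],
    "paper_v2/script": ["paper_v1/script"],
}


def lineage(product, edges):
    """Return every upstream product reachable from `product`, breadth-first."""
    seen = []
    queue = deque(edges.get(product, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(edges.get(current, []))
    return seen


if __name__ == "__main__":
    # List everything that figure 1 of the second paper version depends on.
    for upstream in lineage("paper_v2/figure_1", was_derived_from):
        print(upstream)
```

In this toy graph, the figure in version 2 of a paper traces back through its derived table and script to the table, script, and raw data of version 1, which is the kind of cross-version link the abstract describes.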