IN13C-1848
What Does it Mean to Publish Data in Earth System Science Data Journal?

Monday, 14 December 2015
Poster Hall (Moscone South)
David Carlson, World Meteorological Organization, Geneva, Switzerland
Abstract:
The availability of more than 120 data sets in ESSD represents an unprecedented effort by providers, data centers and ESSD. ESSD data sets and their accompanying data descriptions undergo rigorous review. The data sets reside at any of more than 20 cooperating data centers. The ESSD publication process depends on but challenges the concepts of digital object identification and exacerbates the varied interpretations of the phrase ‘data publication’. 

ESSD adopts the digital object identifier (doi). Key questions apply to doi’s and other identifiers. How will persistent identifiers point accurately to distributed or replicated data? How should data centers and data publishers use identifier technologies to ensure authenticity and integrity? Should metadata associated with identifiers distinguish among raw, quality controlled and derived data processing levels, or indicate license or copyright status?

Data centers publish data sets according to internal metadata standards but without indicators of quality control. Publication in this sense indicates availability. National data portals compile, serve and publish data products as a service to national researchers and, often, to meet national requirements. Publication in this second case indicates availability in a national context; the data themselves may still reside at separate data centers. Data journals such as ESSD or Scientific Data publish peer-reviewed, quality controlled data sets. These data sets almost always reside at a separate data center - the journal and the center maintain explicit identifier linkages. Data journals add quality to the feature of availability. A single data set processed through these layers will generate three independent doi's but the doi’s will provide little information about availability or quality. Could the data world learn from the URL world to consider additions? Suffixes? Could we use our experience with processing levels or data maturity to propose and agree such extensions?