Community-Supported Data Repositories in Paleobiology: A ‘Middle Tail’ Between the Geoscientific and Informatics Communities
Abstract:Community-supported data repositories (CSDRs) in paleoecology and paleoclimatology have a decades-long tradition and serve multiple critical scientific needs. CSDRs facilitate synthetic large-scale scientific research by providing open-access and curated data that employ community-supported metadata and data standards. CSDRs serve as a ‘middle tail’ or boundary organization between information scientists and the long-tail community of individual geoscientists collecting and analyzing paleoecological data. Over the past decades, a distributed network of CSDRs has emerged, each serving a particular suite of data and research communities, e.g. Neotoma Paleoecology Database, Paleobiology Database, International Tree Ring Database, NOAA NCEI for Paleoclimatology, Morphobank, iDigPaleo, and Integrated Earth Data Alliance. Recently, these groups have organized into a common Paleobiology Data Consortium dedicated to improving interoperability and sharing best practices and protocols.
The Neotoma Paleoecology Database offers one example of an active and growing CSDR, designed to facilitate research into ecological and evolutionary dynamics during recent past global change. Neotoma combines a centralized database structure with distributed scientific governance via multiple virtual constituent data working groups. The Neotoma data model is flexible and can accommodate a variety of paleoecological proxies from many depositional contests. Data input into Neotoma is done by trained Data Stewards, drawn from their communities. Neotoma data can be searched, viewed, and returned to users through multiple interfaces, including the interactive Neotoma Explorer map interface, REST-ful Application Programming Interfaces (APIs), the neotoma R package, and the Tilia stratigraphic software. Neotoma is governed by geoscientists and provides community engagement through training workshops for data contributors, stewards, and users. Neotoma is engaged in the Paleobiological Data Consortium and other efforts to improve interoperability among cyberinfrastructure in the paleogeosciences.