GeoCSV: tabular text formatting for geoscience data

Monday, 14 December 2015
Poster Hall (Moscone South)
Mike Stults (1), Robert A Arko (2), Ethan Davis (3), Douglas J Ertz (4), Mark Turner (5), Chad M Trabant (1), David W Valentine Jr (6), Timothy Keith Ahern (1), Suzanne M Carbotte (7), Michael Gurnis (5), Charles Meertens (4), Mohan K Ramamurthy (8) and Ilya Zaslavsky (9); (1) Incorporated Research Institutions for Seismology, Seattle, WA, United States, (2) Columbia University of New York, Palisades, NY, United States, (3) University Corporation for Atmospheric Research, Boulder, CO, United States, (4) UNAVCO, Inc. Boulder, Boulder, CO, United States, (5) California Institute of Technology, Pasadena, CA, United States, (6) University of California San Diego, San Diego Supercomputer Center, La Jolla, CA, United States, (7) Lamont-Doherty Earth Obs, Palisades, NY, United States, (8) Organization Not Listed, Washington, DC, United States, (9) San Diego Supercomputer Center, Spatial Information Systems Lab, La Jolla, CA, United States
The GeoCSV design was developed within the GeoWS project to provide a baseline of compatibility among tabular text data sets from various sub-domains of geoscience. Funded through NSF's EarthCube initiative, the GeoWS project aims to develop common web service interfaces for data access across hydrology, geodesy, seismology, marine geophysics, atmospheric science and other areas. The GeoCSV format is an essential part of delivering data via simple web services for discovery and use by both humans and machines. Because most geoscience disciplines have developed, and continue to use, data formats specific to their needs, tabular text can play a key role as a lowest common denominator for exchanging and integrating data across sub-domains. The design starts with a core definition compatible with the best practices described by the W3C CSV on the Web Working Group (CSVW). Compatibility with CSVW is intended to ensure the broadest usability of data expressed as GeoCSV. An optional, simple, but limited metadata description mechanism was added to allow important metadata to be included with comma-separated data while staying within the CSVW definition of a "dialect". The format is designed both for creating new data sets and for annotating data sets already in a tabular text format so that they comply with GeoCSV.
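
The idea of carrying optional metadata alongside comma-separated data can be illustrated with a minimal sketch. This abstract does not specify the metadata syntax, so the sketch assumes that metadata appears as `#`-prefixed `keyword: value` lines ahead of an ordinary CSV header row. The keyword names (`dataset`, `field_unit`, `field_type`, `delimiter`), the function name `parse_geocsv`, and the sample values are illustrative assumptions, not taken from the text above.

```python
import csv
import io


def parse_geocsv(text):
    """Split tabular text into (metadata dict, list of row dicts).

    Assumption: metadata lines start with '#' and have the form
    '# keyword: value'. All other non-empty lines are plain CSV.
    """
    metadata = {}
    data_lines = []
    for line in text.splitlines():
        if line.startswith("#"):
            # Split on the first colon only, so values may contain colons.
            key, sep, value = line[1:].partition(":")
            if sep:
                metadata[key.strip()] = value.strip()
        elif line.strip():
            data_lines.append(line)

    # Hypothetical 'delimiter' keyword; fall back to a comma if absent.
    delimiter = metadata.get("delimiter", ",")
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)),
                            delimiter=delimiter)
    return metadata, list(reader)


# Made-up sample data, for illustration only.
SAMPLE = """\
# dataset: GeoCSV 2.0
# field_unit: ISO_8601, degrees_north, degrees_east, meters
# field_type: datetime, float, float, float
time,latitude,longitude,elevation
2015-12-14T08:00:00Z,37.784,-122.401,12.0
2015-12-14T09:00:00Z,37.785,-122.402,12.5
"""

metadata, rows = parse_geocsv(SAMPLE)
# Per-column units, in the same order as the CSV header fields.
units = [u.strip() for u in metadata["field_unit"].split(",")]
```

Under these assumptions, a reader that understands the metadata lines gets units and types for each column, while a generic CSV tool only needs to skip the `#` lines to read the same file as plain comma-separated data.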