IN11F-1804
Elements of a next generation time-series ASCII data file format for Earth Sciences

Monday, 14 December 2015
Poster Hall (Moscone South)
Christopher James Webster, National Center for Atmospheric Research, Boulder, CO, United States
Abstract:
Data in ASCII comma separated value (CSV) format are recognized as the most simple, straightforward and readable type of data present in the geosciences. Many scientific workflows developed over the years rely on data using this simple format. However, there is a need for a lightweight ASCII header format standard that is easy to create and easy to work with. Current OGC grade XML standards are complex and difficult to implement for researchers with few resources.

Ideally, such a format should provide the data in CSV for easy consumption by generic applications such as spreadsheets. The format should use an existing time standard. The header should be easily human readable as well as machine parsable. The metadata format should be extendable to allow vocabularies to be adopted as they are created by external standards bodies. The creation of such a format will increase the productivity of software engineers and scientists because fewer translators and checkers would be required.


Data in ASCII comma separated value (CSV) format are recognized as the most simple, straightforward and readable type of data present in the geosciences. Many scientific workflows developed over the years rely on data using this simple format. However, there is a need for a lightweight ASCII header format standard that is easy to create and easy to work with. Current OGC grade XML standards are complex and difficult to implement for researchers with few resources.

Ideally, such a format would provide the data in CSV for easy consumption by generic applications such as spreadsheets. The format would use existing time standard. The header would be easily human readable as well as machine parsable. The metadata format would be extendable to allow vocabularies to be adopted as they are created by external standards bodies. The creation of such a format would increase the productivity of software engineers and scientists because fewer translators would be required.