Information Requirements for Integrating Spatially Discrete, Feature-Based Earth Observations
Thursday, 18 December 2014: 5:18 PM
Several cyberinfrastructures have emerged for sharing observational data collected at densely sampled and/or highly instrumented field sites. These include the CUAHSI Hydrologic Information System (HIS), the Critical Zone Observatory Integrated Data Management System (CZOData), the Integrated Earth Data Applications (IEDA) and EarthChem system, and the Integrated Ocean Observing System (IOOS). These systems rely on standard data encodings and, in some cases, standard semantics for classes of geoscience data. Their focus is on sharing data on the Internet via web services in domain specific encodings or markup languages. While they have made progress in making data available, it still takes investigators significant effort to discover and access datasets from multiple repositories because of inconsistencies in the way domain systems describe, encode, and share data. Yet, there are many scenarios that require efficient integration of these data types across different domains. For example, understanding a soil profile’s geochemical response to extreme weather events requires integration of hydrologic and atmospheric time series with geochemical data from soil samples collected over various depth intervals from soil cores or pits at different positions on a landscape. Integrated access to and analysis of data for such studies are hindered because common characteristics of data, including time, location, provenance, methods, and units are described differently within different systems. Integration requires syntactic and semantic translations that can be manual, error-prone, and lossy. We report information requirements identified as part of our work to define an information model for a broad class of earth science data – i.e., spatially-discrete, feature-based earth observations resulting from in-situ sensors and environmental samples. We sought to answer the question: “What information must accompany observational data for them to be archivable and discoverable within a publication system as well as interpretable once retrieved from such a system for analysis and (re)use?” We also describe development of multiple functional schemas (i.e., physical implementations for data storage, transfer, and archival) for the information model that capture the requirements reported here.