IN22A-05:
Semantically aided interpretation and querying of Jefferson Project data using the SemantEco framework

Tuesday, 16 December 2014: 11:20 AM
Evan W Patton, Paulo Pinheiro and Deborah L McGuinness, Rensselaer Polytechnic Institute, Troy, NY, United States
Abstract:
We will describe the benefits we realized by using semantic technologies to address the often challenging and resource-intensive task of ontology alignment in service of data integration. Ontology alignment became relatively simple because we reused our existing semantic data integration framework, SemantEco. We work in the context of the Jefferson Project (JP), an effort to monitor and predict the health of Lake George in NY by deploying a large-scale sensor network in the lake and analyzing the resulting high-resolution sensor data. SemantEco is an open-source framework for building semantically aware applications that assist users, particularly non-experts, in exploring and interpreting integrated scientific data. SemantEco applications are composed of a set of modules that incorporate new datasets, extend the semantic capabilities of the system to integrate and reason about data, and provide facets for extending or controlling semantic queries. Whereas earlier SemantEco work focused on integrating water, air, and species data from government sources, here we redeploy it to provide a provenance-aware semantic query and interpretation interface for JP’s sensor data. By employing a minor alignment between SemantEco's ontology and the Human-Aware Sensor Network Ontology used to model the JP’s sensor deployments, we were able to bring SemantEco's capabilities to bear on the JP sensor data and metadata. This alignment enabled SemantEco to perform the following tasks: (1) select JP datasets related to water quality; (2) understand how the JP's notion of water quality relates to water quality concepts in previous work; and (3) reuse existing SemantEco interactive data facets, e.g., maps and time series visualizations, and modules, e.g., the regulation module that interprets water quality data through the lens of various federal and state regulations.
Semantic technologies, both as the engine driving SemantEco and as the means of modeling the JP data, enabled us to rapidly align the two ontologies without requiring either project to change its model. They also allowed us to reuse the software development effort already invested in SemantEco by adopting it as a portal for exploring Lake George's water quality data. We plan to extend the registration of modules and facets to handle climate data, hydrology data, and food web data.
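
As a rough illustration of how a small alignment can let one system select another project's datasets, the sketch below models both vocabularies as plain triples and adds a single bridging subclass axiom. It is not the SemantEco implementation: the prefixes `sensor:`, `semanteco:`, and `jp:`, the class names, and the dataset names are all invented for this example, and a real deployment would rely on an RDF store and a reasoner rather than Python sets.

```python
# Hypothetical sketch: every class, prefix, and dataset name here is invented
# for illustration and does not come from SemantEco or the JP ontologies.
RDFS_SUBCLASS = "rdfs:subClassOf"
RDF_TYPE = "rdf:type"

# Triples standing in for the sensor-network side (assumed names).
sensor_triples = {
    ("jp:DissolvedOxygenSeries", RDF_TYPE, "sensor:WaterChemistryDataset"),
    ("jp:WindSpeedSeries", RDF_TYPE, "sensor:WeatherDataset"),
}

# The "minor alignment": one bridging axiom between the two vocabularies.
alignment_triples = {
    ("sensor:WaterChemistryDataset", RDFS_SUBCLASS, "semanteco:WaterQualityData"),
}


def superclasses(cls, triples):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    seen = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == RDFS_SUBCLASS and o not in seen:
                seen.add(o)
                frontier.append(o)
    return seen


def instances_of(target, triples):
    """Return the sorted subjects whose declared type is target or a subclass of it."""
    return sorted(
        s
        for s, p, o in triples
        if p == RDF_TYPE and target in superclasses(o, triples)
    )


# Without the alignment nothing matches; with it, the water dataset is selected.
merged = sensor_triples | alignment_triples
water_quality = instances_of("semanteco:WaterQualityData", merged)
print(water_quality)
```

In this toy setting, neither set of triples has to change: the bridging axiom alone determines which sensor datasets a water-quality query can see, which mirrors the idea of aligning two ontologies without asking either project to alter its model.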