An ODIP Effort to Map R2R Ocean Data Terms to International Vocabularies

Wednesday, 17 December 2014
Renata Ferreira, University of California San Diego, La Jolla, CA, United States, Karen I Stocks, University of California San Diego, La Jolla, CA, United States and Robert A Arko, Lamont-Doherty Earth Observatory, Palisades, NY, United States
The diversity of terminology used to describe ocean data creates a barrier to efficient discovery and re-use of data, particularly across institutional, programmatic, and disciplinary boundaries. Here we report the outcomes of a student project to crosswalk terms between the Rolling Deck to Repository (R2R) program and other international systems, as part of the Ocean Data Interoperability Platform (ODIP).

R2R is a U.S. program developing and implementing an information management system to preserve and provide access to routine underway data collected by U.S. academic research vessels. R2R participates in ODIP, an international forum for improving interoperability and effective sharing of marine data resources through technical workshops and joint prototypes. The vocabulary mapping effort lays a foundation for future ocean data portals through which users search and access ocean data using familiar terms.

R2R describes its data with a suite of controlled vocabularies, some of which were developed within R2R or are specific to the U.S. The goal of this student project is to crosswalk local/national vocabularies to authoritative international ones, where they exist, or to vocabularies widely used by ODIP partners. Specifically, R2R developed the following crosswalks: UNOLS ports to the SeaDataNet Ports Gazetteer, R2R Device Models to the NVS SeaVoX Device Catalog, R2R Organizations to the European Directory of Marine Organizations (EDMO), and R2R chief scientist names to well-known professional identifiers such as ORCID, ResearchGate, and LinkedIn. Mappings were done in simple spreadsheets using synonymy relationships and will be published as part of the R2R Linked Data resources.
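As an illustration of how a spreadsheet crosswalk might be turned into Linked Data, the following is a minimal Python sketch, not the project's actual tooling. It assumes the spreadsheet is exported as a two-column CSV and that each synonymy relationship is expressed as a SKOS `exactMatch` statement; the column names (`local_uri`, `external_uri`), the function name, and the `example.org` URIs are all hypothetical placeholders rather than real R2R or SeaDataNet identifiers.

```python
import csv
import io

# Assumption: synonymy is expressed with the SKOS exactMatch property.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

# Hypothetical crosswalk export; the URIs are placeholders, not real identifiers.
SAMPLE_CSV = """local_uri,external_uri
http://example.org/r2r/port/0001,http://example.org/sdn/port/A1
http://example.org/r2r/port/0002,
http://example.org/r2r/port/0003,http://example.org/sdn/port/C3
"""


def crosswalk_to_ntriples(csv_text):
    """Convert crosswalk rows to N-Triples lines; also list unmapped local terms."""
    triples, unmapped = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        local = row["local_uri"].strip()
        external = (row["external_uri"] or "").strip()
        if external:
            # One triple per synonymy mapping.
            triples.append(f"<{local}> <{SKOS_EXACT_MATCH}> <{external}> .")
        else:
            # No counterpart found yet in the target vocabulary.
            unmapped.append(local)
    return triples, unmapped


triples, unmapped = crosswalk_to_ntriples(SAMPLE_CSV)
coverage = len(triples) / (len(triples) + len(unmapped))
```

Keeping the unmapped terms in a separate list makes it straightforward to report coverage per vocabulary and to identify terms that may need to be proposed as additions to the target vocabulary.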

The level of success in crosswalking was variable. All ports were successfully mapped. Both organizations and device models have initial mappings, and R2R has added new terms to the EDMO and SeaVoX Device Catalog vocabularies, allowing for nearly complete coverage of terms. An initial search for R2R scientists' identifiers on ORCID returned few potential matches, and most of those lacked sufficient metadata to confirm the match. R2R is now adopting an alternative approach: asking chief scientists to self-report the professional identifiers they use to expose their work.