A Comprehensive Data Architecture for Multi-Disciplinary Marine Mammal Research

ABSTRACT WITHDRAWN

Abstract:
The Oregon State University Marine Mammal Institute (MMI) comprises five research laboratories, each with its own research objectives, technological approaches, and data requirements. Among the types of data under management are individual photo-ID and field observations, telemetry (e.g., locations, dive characteristics, temperature, acoustics), genetics (and relatedness), stable isotope and toxicology assays, and remotely sensed environmental data. Coordinating data management in a way that facilitates collaboration and comparative exploration among researchers has been a longstanding challenge for our groups and for the wider wildlife research community. Research data are commonly stored locally in flat files or spreadsheets; copies are made and analyses performed in various packages with no common standards for interoperability, which is a potential source of error. Database design, where it exists, is frequently ad hoc, and new types of data are generally tacked on as technological advances make them available. A data management solution that addresses these issues should be scalable, modular (i.e., able to incorporate new types of data as they arise), spatiotemporally explicit, and compliant with existing data standards such as Darwin Core. The MMI has developed a data architecture that allows any type of animal-associated data to be incorporated into a modular and portable format that can be integrated with any other dataset sharing the core format. It allows browsing, querying, and visualization across any of the attributes that can be associated with individual animals, groups, sensors, or environmental datasets. We have implemented this architecture in an open-source, geo-enabled relational database system (PostgreSQL, PostGIS) and have designed a suite of software tools (Python, R) to load, preprocess, visualize, analyze, and export data.
This architecture could benefit organizations with similar data challenges.
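
To illustrate what a modular core format of this kind might look like, the following is a minimal sketch and not the MMI schema. It assumes a hypothetical layout in which a core pair of tables (animal, event) carries identity, time, and location, and each data type (here dive telemetry and stable isotope assays) lives in its own module table keyed to the core event. All table names, column names, and values are invented for illustration; the column names loosely echo Darwin Core terms such as scientificName and eventDate. SQLite from the Python standard library stands in for PostgreSQL/PostGIS so that the example is self-contained, and plain latitude/longitude columns stand in for a PostGIS geometry.

```python
import sqlite3

# Hypothetical core-plus-modules layout. Every name below is illustrative,
# not taken from the architecture described in the abstract.
SCHEMA = """
CREATE TABLE animal (
    animal_id        INTEGER PRIMARY KEY,
    scientific_name  TEXT NOT NULL
);
CREATE TABLE event (                      -- core: who, when, where
    event_id          INTEGER PRIMARY KEY,
    animal_id         INTEGER NOT NULL REFERENCES animal(animal_id),
    event_date        TEXT NOT NULL,      -- ISO 8601 timestamp
    decimal_latitude  REAL,
    decimal_longitude REAL
);
CREATE TABLE dive (                       -- module: telemetry dive summary
    event_id     INTEGER PRIMARY KEY REFERENCES event(event_id),
    max_depth_m  REAL NOT NULL
);
CREATE TABLE isotope_assay (              -- module: stable isotope assay
    event_id   INTEGER PRIMARY KEY REFERENCES event(event_id),
    delta_13c  REAL,
    delta_15n  REAL
);
"""


def build_demo():
    """Create an in-memory database holding a few made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO animal VALUES (1, 'Balaenoptera musculus')")
    conn.executemany(
        "INSERT INTO event VALUES (?, ?, ?, ?, ?)",
        [
            (10, 1, "2010-08-01T12:00:00Z", 34.1, -120.5),
            (11, 1, "2010-08-02T09:30:00Z", 34.3, -120.9),
        ],
    )
    conn.execute("INSERT INTO dive VALUES (10, 182.5)")
    conn.execute("INSERT INTO isotope_assay VALUES (11, -17.2, 13.4)")
    conn.commit()
    return conn


def events_for_animal(conn, animal_id):
    """Return one row per core event, with module columns where present.

    LEFT JOINs keep events that have no record in a given module, so a new
    data type can be added as another table without touching the core.
    """
    cursor = conn.execute(
        """
        SELECT e.event_id, e.event_date, d.max_depth_m, i.delta_15n
        FROM event AS e
        LEFT JOIN dive AS d ON d.event_id = e.event_id
        LEFT JOIN isotope_assay AS i ON i.event_id = e.event_id
        WHERE e.animal_id = ?
        ORDER BY e.event_date
        """,
        (animal_id,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for row in events_for_animal(build_demo(), 1):
        print(row)
```

In this sketch, adding a new data type means adding one more module table that references event, which is one way the "modular" requirement could be met; cross-type queries then reduce to joins on the shared core keys.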