The Ocean Observatories Initiative: Data Acquisition Functions and Its Built-In Automated Python Modules

Michael J Smith1, Michael Vardaro1, Michael F Crowley2, Scott M Glenn3, Oscar Schofield1, Leila Belabbassi4, Lori M Garzio1, Friedrich Knuth4, Jonathan P Fram5 and John Kerfoot6, (1)Rutgers University, Department of Marine and Coastal Sciences, New Brunswick, NJ, United States, (2)Rutgers University, New Brunswick, NJ, United States, (3)Rutgers University New Brunswick, New Brunswick, NJ, United States, (4)Rutgers University, Department of Marine and Coastal Sciences, New Brunswick, NJ, United States, (5)Oregon State University, College of Earth, Ocean, and Atmospheric Sciences, Corvallis, OR, United States, (6)Rutgers University, Marine and Coastal Sciences, New Brunswick, NJ, United States
Abstract:
The Ocean Observatories Initiative (OOI), funded by the National Science Foundation, provides users with access to long-term datasets from a variety of oceanographic sensors. The Endurance Array in the Pacific Ocean consists of two separate lines off the coasts of Oregon and Washington. The Oregon line consists of 7 moorings, 2 cabled benthic experiment packages, and 6 underwater gliders. The Washington line comprises 6 moorings and 6 gliders. Each mooring is outfitted with a variety of instrument packages. The raw data from these instruments are sent to shore via satellite communication and, in some cases, via fiber-optic cable. The raw data are then sent to the cyberinfrastructure (CI) group at Rutgers, where they are aggregated, parsed into thousands of different data streams, and integrated into a software package called uFrame. The OOI CI delivers the data to the general public via a web interface that outputs data in commonly used scientific data file formats such as JSON, netCDF, and CSV. The Rutgers data management team has developed a series of command-line Python tools that streamline data acquisition in order to facilitate the QA/QC review process. The first step in the process is querying the uFrame database for a list of all available platforms. From this list, a user can choose a specific platform and automatically download all available datasets from that platform. The downloaded datasets are plotted using a generalized Python netCDF plotting routine built on the matplotlib data visualization library. This routine loads each netCDF file separately and outputs a plot for each available parameter. These Python tools have been uploaded to a GitHub repository that is openly available to help facilitate OOI data access and visualization.
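
The platform-selection and per-parameter plotting steps described above might look roughly like the following minimal sketch. It is not the code in the team's repository, and it does not query uFrame. Both function names are hypothetical. The identifier strings are illustrative only, and the sketch assumes a platform code is the text before the first dash. The plotting function relies on the third-party netCDF4 and matplotlib packages and assumes each file has a time variable and dimension both named "time"; real files may need a different name.

```python
from collections import defaultdict
from pathlib import Path


def group_by_platform(reference_designators):
    """Group instrument identifiers by platform (text before the first dash)."""
    platforms = defaultdict(list)
    for refdes in reference_designators:
        platforms[refdes.split("-", 1)[0]].append(refdes)
    return dict(platforms)


def plot_netcdf_by_parameter(nc_path, out_dir, time_name="time"):
    """Save one PNG per numeric variable defined only along the time dimension."""
    # Third-party imports stay local so group_by_platform has no dependencies.
    import matplotlib
    matplotlib.use("Agg")  # write image files without needing a display
    import matplotlib.pyplot as plt
    from netCDF4 import Dataset

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with Dataset(nc_path) as ds:
        time = ds.variables[time_name][:]  # assumed name of the time axis
        for name, var in ds.variables.items():
            if name == time_name or var.dimensions != (time_name,):
                continue
            if getattr(var.dtype, "kind", "U") not in "iuf":
                continue  # skip string and other non-numeric variables
            fig, ax = plt.subplots()
            ax.plot(time, var[:])
            ax.set_xlabel(time_name)
            ax.set_ylabel(f"{name} ({getattr(var, 'units', 'units unknown')})")
            ax.set_title(Path(nc_path).name)
            target = out_dir / f"{Path(nc_path).stem}_{name}.png"
            fig.savefig(target)
            plt.close(fig)  # free the figure before moving to the next parameter
            written.append(target)
    return written


# Illustrative identifiers only; this is not a real inventory listing.
inventory = [
    "CE01ISSM-MFD35-02-PRESFA000",
    "CE01ISSM-RID16-03-CTDBPC000",
    "CE02SHSM-RID27-04-DOSTAD000",
]
by_platform = group_by_platform(inventory)
selected = by_platform["CE01ISSM"]  # the platform a user chooses to download
```

Once the files for the selected platform are on disk, calling plot_netcdf_by_parameter on each file in turn would mirror the abstract's description of loading every netCDF file separately and writing one plot per parameter.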