Development of a Call Catalog to Support Automated Acoustic Data Processing Techniques for Coral Reef Soundscapes

Shannon Whitney Ricci, North Carolina State University Raleigh, Center for Geospatial Analytics, Raleigh, NC, United States; Delwayne R Bohnenstiehl, North Carolina State University, Marine, Earth and Atmospheric Sciences, Raleigh, United States; David Eggleston, North Carolina State University Raleigh, MEAS, Raleigh, NC, United States; Jenni Stanley, University of Waikato, New Zealand; and Timothy Rowell, Southeast Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Woods Hole, United States
Underwater soundscapes provide information on both biotic and abiotic processes that can be used to understand ecological patterns within a variety of marine habitats. Acoustic monitoring programs can generate large volumes of data, and extracting ecologically relevant information from these datasets can be challenging and time-consuming. Automated and effective data processing and reduction techniques are needed to improve our ability to interpret soundscapes. This study developed a database of fish calls recorded at three sites within the Florida Keys National Marine Sanctuary. For each site, calls were labeled within spectrograms generated at weekly intervals from 30-second audio segments extracted from the continuous field recordings at dawn, midday, dusk, and midnight. Waveform similarity was calculated by cross-correlating each call with every other labeled call. Major call types were identified through hierarchical clustering, with groups formed where waveform similarity exceeded 60%. For calls shorter than 50 milliseconds and within the 20-1800 Hz frequency range, this unsupervised classification approach identified four major call types (thousands of examples each) and several less common call types (tens of examples each) that were present from December 2018 to April 2019. The resulting call database includes both archived waveforms and acoustic feature information (peak frequency, bandwidth, duration, etc.) to facilitate the development of automated call identification approaches using template matching and machine learning techniques. Integrating call time series with other biological and environmental datasets can provide a sustainable method for monitoring and managing changing ocean ecosystems.
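The pairwise cross-correlation and hierarchical clustering steps described above might be sketched as follows. This is a minimal illustration and not the study's actual code: the function names, the use of the peak normalized cross-correlation as the similarity measure, the conversion to a distance as 1 minus similarity, and the choice of average linkage are all assumptions, since the abstract states only that calls were cross-correlated and grouped at a similarity above 60%.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.signal import correlate
from scipy.spatial.distance import squareform


def waveform_similarity(a, b):
    """Peak of the normalized cross-correlation of two waveforms, in [0, 1].

    Assumption: similarity is the maximum over all lags, so calls that are
    alike but offset in time still score highly.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    norm = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    if norm == 0.0:
        return 0.0
    xcorr = correlate(a, b, mode="full")
    return float(np.clip(np.max(xcorr) / norm, 0.0, 1.0))


def cluster_calls(calls, threshold=0.60):
    """Group calls (two or more) whose waveform similarity exceeds `threshold`.

    Assumptions: distance = 1 - similarity, and average linkage. The
    abstract specifies neither.
    """
    n = len(calls)
    sim = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            sim[i, j] = sim[j, i] = waveform_similarity(calls[i], calls[j])

    dist = 1.0 - sim
    np.fill_diagonal(dist, 0.0)

    # linkage expects the condensed (upper-triangle) form of the distance matrix.
    tree = linkage(squareform(dist, checks=False), method="average")

    # Cut the tree where distance exceeds 1 - threshold (similarity of 60%).
    labels = fcluster(tree, t=1.0 - threshold, criterion="distance")
    return labels, sim
```

Here `calls` would be a list of one-dimensional NumPy arrays, one per labeled call, all sampled at the same rate. The pairwise loop scales quadratically with the number of calls, so a catalog with thousands of examples per type may need batching or a faster correlation strategy.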