Machine Learning Techniques Applied to Flow Cytometry and Flow Imaging Data to Assess Phytoplankton Community Dynamics in a Florida Coastal Lagoon and Estuary
Abstract:
Phytoplankton community dynamics in coastal ecosystems are often highly variable and challenging to characterize over ecologically important spatial and temporal scales. Analytical techniques such as flow cytometry and automated imaging in flow allow for efficient detection and enumeration of cells and colonies but produce large data sets that require substantial effort to analyze. We have applied supervised and unsupervised machine learning algorithms to automate analyses and help extract ecologically important information from these types of data sets collected in the Indian River Lagoon (IRL) and St. Lucie Estuary (SLE), Florida. Water samples have been collected monthly since 2016 from 10 locations in the southern half of the IRL and SLE. Phytoplankton have been characterized and enumerated using a BD Accuri C6 flow cytometer and Fluid Imaging Technologies FlowCam. Environmental conditions are continuously monitored at these sites by the Indian River Lagoon Observatory’s (IRLO) network of in situ sensors. Unsupervised machine learning techniques have been primarily used to cluster flow cytometry data while supervised convolutional neural networks are being trained to classify FlowCam images of phytoplankton and other suspended particles. Results show variation in phytoplankton community composition over seasonal time scales and in response to local environmental conditions influenced by proximity to ocean inlets or river outflows. Application of machine learning techniques can improve monitoring and management of the IRL and SLE, ecologically and economically important coastal ecosystems that have been heavily impacted by freshwater discharges from Lake Okeechobee, high nutrient inputs, and frequent harmful algal blooms. In combination with IRLO environmental data, machine learning based analyses of phytoplankton can help improve understanding of community ecology in these productive coastal ecosystems.
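As an illustration of the unsupervised clustering step mentioned above, the following is a minimal sketch of how flow cytometry events might be grouped into populations. The abstract does not name the clustering algorithm used, so k-means stands in here purely as an example. The three channels (forward scatter, red fluorescence, orange fluorescence), the synthetic populations, and the function names `synthetic_events`, `log_transform`, and `kmeans` are all assumptions made for illustration and are not taken from the study.

```python
import math
import random


def synthetic_events(n_per_group=200, seed=1):
    """Generate made-up events for two hypothetical populations.

    Each event is [forward scatter, red fluorescence, orange fluorescence].
    The log10 means below are invented for illustration only.
    """
    rng = random.Random(seed)
    group_log_means = [(3.0, 3.5, 4.5), (4.5, 5.0, 2.5)]
    events = []
    for means in group_log_means:
        for _ in range(n_per_group):
            events.append([10 ** rng.gauss(m, 0.1) for m in means])
    return events


def log_transform(events):
    """Log10-transform every channel, since signals span several decades."""
    return [[math.log10(max(v, 1.0)) for v in event] for event in events]


def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means with farthest-point initialisation."""
    rng = random.Random(seed)
    # First centre is a random event; each further centre is the event
    # farthest from all centres chosen so far.
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))

    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each event joins its nearest centre.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's centre
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


events = synthetic_events()
labels, centers = kmeans(log_transform(events), k=2)
for j in range(2):
    print(f"cluster {j}: {labels.count(j)} events")
```

In practice the number of clusters is not known in advance, and real cytometry populations overlap and vary in shape, so a density-based or mixture-model method may be a better fit than k-means; the sketch only shows the general shape of the workflow (transform, cluster, count per population).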