Using Machine Learning to Estimate Relative Abundances of Marine Heterotrophic Protists

Keomony Diep, Greta Kcomt Del Rio, Sara Bailey and Darcy Taniguchi, California State University San Marcos, Biology, San Marcos, CA, United States
Heterotrophic protists serve a vital ecological role in marine food webs as major consumers of primary production and as a food source for higher trophic levels. Despite their importance, the abundances of heterotrophic protists can be difficult to measure because of their small size and lack of pigments. To estimate relative abundances of heterotrophic protists and their correlations with their prey, we are using a machine learning tool, convolutional neural networks (CNNs). CNNs are algorithms that can be trained to classify images accurately and efficiently into predetermined categories. A novel microscope system off the Scripps Memorial Pier, the Scripps Plankton Camera System, takes continuous, real-time, in situ images of passively drifting particles and organisms in the waters off La Jolla, California. Using images from this system, we have created a training set of labeled heterotrophic protists. Using that dataset, we have trained 3-layer CNNs and compared them with fine-tuned preexisting CNN architectures. Accuracy differed by ~7% between the small networks and the fine-tuned networks. Using the network with the highest accuracy, we can estimate relative abundances of heterotrophic protists as they change through time. This information can then be used to address important ecological questions, such as the role of heterotrophic protists in the formation and decline of algal blooms. This work will thus address ecological concerns such as planktonic interactions while simultaneously advancing the field of machine learning.
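
As an illustration of the final step described above, the following minimal sketch shows one way classifier output could be turned into relative abundances through time. It assumes each image has already been assigned a predicted label and a time bin; the function name, the category names, and the dates are hypothetical placeholders and are not taken from this work.

```python
from collections import Counter, defaultdict


def relative_abundances(predictions):
    """Convert (time_bin, predicted_label) pairs into per-bin fractions.

    Returns a dict of the form {time_bin: {label: fraction_of_images}}.
    """
    # Count how many images were assigned to each label within each time bin.
    counts = defaultdict(Counter)
    for time_bin, label in predictions:
        counts[time_bin][label] += 1

    # Normalize the counts in each bin so the fractions sum to 1.
    result = {}
    for time_bin, counter in counts.items():
        total = sum(counter.values())
        result[time_bin] = {label: n / total for label, n in counter.items()}
    return result


# Hypothetical classifier output: placeholder labels and dates, not real data.
example = [
    ("2020-01-01", "ciliate"),
    ("2020-01-01", "ciliate"),
    ("2020-01-01", "dinoflagellate"),
    ("2020-01-01", "other"),
    ("2020-01-02", "dinoflagellate"),
    ("2020-01-02", "other"),
]

abundances = relative_abundances(example)
for time_bin in sorted(abundances):
    print(time_bin, abundances[time_bin])
```

Because the fractions are computed from predicted labels, they inherit the classifier's errors, so in practice the per-class accuracy of the chosen network would need to be taken into account when interpreting them.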