Comparing Data Input Requirements of Statistical vs. Process-based Watershed Models Applied for Prediction of Fecal Indicator and Pathogen Levels in Recreational Beaches

Monday, 15 December 2014
Marirosa Molina1, Mike Cyterski1, Gene Whelan1 and Richard G Zepp2, (1)Environmental Protection Agency Athens, Athens, GA, United States, (2)US Environmental Protection Ag, Athens, GA, United States
Same-day prediction of fecal indicator bacteria (FIB) concentrations and bather protection from the risk of exposure to pathogens are two important goals of implementing a modeling program at recreational beaches. Sampling efforts for modeling applications can be expensive and time consuming and can lead to the collection of large data sets that go unused. In this study, we assessed the accuracy, sensitivity, and specificity of model predictions of FIB concentrations (culturable and qPCR) using environmental data collected on-site vs. publicly available data (for example, from EnDDaT), with the goal of offering states and beach managers a cost-efficient alternative for model development. Multilinear regression (MLR) models were developed to predict the concentration of enterococci at freshwater and marine beaches using model input data from on-site monitoring equipment as well as publicly available, near-site data. False negative and false positive predictions of each model were calculated via a threshold analysis. Comparison of model performance at a Great Lakes beach revealed that adding on-site data inputs yielded an adjusted R-squared about 38% higher (indicating a better fit to the data) and better predictive performance than using only publicly available data inputs. Although the models using both datasets were 14% better at predicting regulatory exceedances, the model using only publicly available data was slightly better at predicting non-exceedances. We also compared MLR model input data requirements with the input data requirements of watershed process models. In a simulation where six different manure-based contaminant sources were evaluated to determine the health risk impacts to a receptor location downstream from the sources of contamination, the watershed model predicted the distributions of three waterborne pathogens (Salmonella, Cryptosporidium, and E. coli O157) based on rainfall impacting the watershed.
Although this type of analysis identifies which pathogens are important, when they are important, and which sources contribute to their presence, the effort is data intensive and might only be warranted when assessment of FIB or pathogen concentrations requires a continuous linkage between source and receptor, as is the case for tributary-impacted beaches.
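
The MLR evaluation described above (adjusted R-squared plus a threshold analysis of false positives and false negatives) can be illustrated with a minimal sketch. This is not the authors' code: the function names (fit_mlr, predict, adjusted_r2, threshold_analysis), the synthetic inputs, and the threshold value are all hypothetical, and ordinary least squares via NumPy stands in for whatever MLR software was actually used.

```python
import numpy as np


def fit_mlr(X, y):
    """Fit y = b0 + b1*x1 + ... by ordinary least squares; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a predictor matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted R-squared: R-squared penalized for the number of predictors."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def threshold_analysis(y_obs, y_pred, threshold):
    """Count exceedance hits and misses relative to a regulatory threshold."""
    obs_exc = np.asarray(y_obs) > threshold
    pred_exc = np.asarray(y_pred) > threshold
    tp = int(np.sum(obs_exc & pred_exc))     # exceedance correctly predicted
    tn = int(np.sum(~obs_exc & ~pred_exc))   # non-exceedance correctly predicted
    fp = int(np.sum(~obs_exc & pred_exc))    # false positive (false alarm)
    fn = int(np.sum(obs_exc & ~pred_exc))    # false negative (missed exceedance)
    nan = float("nan")
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / len(obs_exc),
    }


if __name__ == "__main__":
    # Synthetic, made-up data: three hypothetical predictors and a
    # log10 enterococci response. Not data from the study.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 3))
    y = 1.5 + X @ np.array([0.6, -0.3, 0.4]) + rng.normal(scale=0.4, size=60)

    coef = fit_mlr(X, y)
    y_hat = predict(coef, X)
    print("adjusted R-squared:", adjusted_r2(y, y_hat, n_predictors=3))

    # Placeholder threshold on the log10 scale, chosen only for illustration.
    print(threshold_analysis(y, y_hat, threshold=2.0))
```

Running the same functions once with on-site predictors and once with only publicly available predictors would give the kind of side-by-side comparison of fit, sensitivity, and specificity that the abstract reports.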