A machine learning approach to simulate dissolved oxygen and summer hypoxic volume in Chesapeake Bay
Abstract:
Dissolved oxygen (DO) is essential for the survival of almost all aquatic organisms. Seasonal hypoxia (DO < 2 mg/L) in subpycnocline waters is often observed in estuaries, lakes, and coastal waters. Accurately modeling the interannual variations of DO and hypoxic volume is essential for environmental management, yet it remains very challenging despite great advances in numerical techniques. We present a data-driven model to simulate DO conditions and apply it to Chesapeake Bay, where hypoxia is a major concern for the bay’s ecosystem. The model combines empirical orthogonal function (EOF) analysis with a neural network. EOF analysis is used to reduce the data dimension and convert the 3D problem into multiple 1D problems, while the neural network is used to simulate the nonlinear relationships between input forcings (e.g., river discharge, nutrient load, air temperature, wind) and DO. We collect 32 years (1985-2016) of continuous monthly DO monitoring data at 41 stations in Chesapeake Bay as the target variable, and the corresponding forcings from various sources as model input. We use 75% of the dataset for training and the remaining 25% for validation. The model performs well in predicting the seasonal and interannual variations of DO and hypoxic volume. The combined EOF and neural network approach enables the model to capture the variations in the spatial distribution of DO. Because the approach relies purely on observational data, it reduces the errors that cascade from the underlying hydrodynamic simulation to the water quality model. With the rapid accumulation of observational data, the data-driven approach is a promising tool for environmental assessment, given its high computational efficiency.
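The EOF and neural network combination described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, and the number of retained modes, the chronological 75%/25% split, the network size, and the helper name `fit_mlp` are all assumptions chosen for illustration. The sketch decomposes a time-by-station DO matrix into spatial patterns (EOFs) and 1D time series (principal components), fits one small network per component from the forcings, and rebuilds the DO field for the validation period.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the monitoring record (NOT real data):
# 384 months (32 years) at 41 stations, driven by 4 hypothetical forcings.
n_months, n_stations, n_forcings = 384, 41, 4
forcings = rng.normal(size=(n_months, n_forcings))
mixing = rng.normal(size=(n_forcings, n_stations))
noise = 0.1 * rng.normal(size=(n_months, n_stations))
do_obs = 6.0 + np.tanh(forcings) @ mixing + noise

# 75% / 25% split; splitting chronologically is an assumption made here.
n_train = int(0.75 * n_months)
x_train, x_val = forcings[:n_train], forcings[n_train:]
do_train, do_val = do_obs[:n_train], do_obs[n_train:]

# EOF analysis: SVD of the training anomalies (time x station).
do_mean = do_train.mean(axis=0)
anom = do_train - do_mean
_, s, vt = np.linalg.svd(anom, full_matrices=False)
n_modes = 4                          # assumed number of retained modes
eofs = vt[:n_modes]                  # spatial patterns, (n_modes, n_stations)
pcs = anom @ eofs.T                  # one 1D time series per mode
explained = (s[:n_modes] ** 2).sum() / (s ** 2).sum()
pc_scale = pcs.std(axis=0)           # used to scale targets to unit variance


def fit_mlp(x, y, hidden=8, lr=0.05, epochs=2000, seed=0):
    """Fit a one-hidden-layer tanh network to a 1D target by gradient descent."""
    r = np.random.default_rng(seed)
    w1 = r.normal(scale=0.5, size=(x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = r.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)                    # hidden activations
        err = h @ w2 + b2 - y                       # prediction error
        grad_h = np.outer(err, w2) * (1 - h ** 2)   # backprop through tanh
        w2 -= lr * h.T @ err / len(y)
        b2 -= lr * err.mean()
        w1 -= lr * x.T @ grad_h / len(y)
        b1 -= lr * grad_h.mean(axis=0)
    return lambda z: np.tanh(z @ w1 + b1) @ w2 + b2


# One small network per retained mode, then rebuild the DO field.
models = [fit_mlp(x_train, pcs[:, k] / pc_scale[k], seed=k)
          for k in range(n_modes)]
pcs_pred = np.column_stack([m(x_val) * pc_scale[k]
                            for k, m in enumerate(models)])
do_pred = do_mean + pcs_pred @ eofs
rmse = np.sqrt(np.mean((do_pred - do_val) ** 2))

print(f"variance explained by {n_modes} modes: {explained:.3f}")
print(f"validation RMSE: {rmse:.3f}")
```

In this sketch the EOFs are computed from the training period only, so the validation months do not influence the spatial patterns. With real observations, hypoxic volume could then be estimated from the reconstructed field by summing the volumes of cells where the predicted DO falls below 2 mg/L, which would require bathymetry or cell volumes not shown here.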