Reconstruction of Ocean Color Data Using Machine Learning Techniques in Polar Regions: Focusing on Off Cape Hallett, Ross Sea

Jinku Park, United States; Jeong-Hoon Kim, Korea Polar Research Institute, Incheon, South Korea; Hyun Cheol Kim, Korea Polar Research Institute, Incheon, South Korea; Bong-Guk Kim, Seoul National University, Seoul, South Korea; Dukwon Bae, Pusan National University, Department of Oceanography, Busan, South Korea; Young-Heon Jo, Pusan National University, Department of Oceanography, Busan, South Korea; Naeun Jo, Pusan National University, South Korea; and Sang Lee, Pusan National University, Busan, South Korea
Abstract:
The most problematic issue in ocean color applications is heavy cloud cover, especially in polar regions. For that reason, demand for approaches that improve the usability of ocean color data in polar regions has increased. To address this issue, we reconstructed chlorophyll-a concentration (CHL) data using machine learning-based models. The analysis was first conducted on a regional scale, focusing on the biologically valuable waters off Cape Hallett, Ross Sea, Antarctica. Environmental factors and geographic information associated with phytoplankton dynamics were used as predictors for the CHL reconstruction; these were obtained from microwave and reanalysis data, which are not affected by cloud cover. Two ensemble-based models, random forest (RF) and extremely randomized trees (ET), were selected and evaluated with 10-fold cross-validation. The CHL reconstructions from both models showed significant agreement with the standard satellite-derived CHL data. In addition, the reconstructed CHL values were close to the actual CHL values even where the satellites made no observations. However, the reconstructions from the RF and the ET differ slightly, which is likely caused by differences in the contribution of each predictor. We also examined the variable importance for the CHL reconstruction quantitatively. Sea surface temperature, atmospheric temperature, and photosynthetically available radiation contributed strongly to model development, whereas geographic information mostly contributed less than the environmental predictors. Lastly, to study the variable contributions further, we estimated the partial dependence of the reconstruction on each predictor and investigated how the reconstructed CHL responds to changes in the predictors.