Abstract: Bayesian estimation of observation error covariance matrix in the equatorial Pacific (2016 Ocean Sciences Meeting)

Genta Ueno, The Institute of Statistical Mathematics, Tokyo, Japan

Abstract:

We develop a Bayesian technique for estimating the parameters of the observation noise covariance matrix R_t in ensemble data assimilation. We construct a posterior distribution from the ensemble-approximated likelihood and a Wishart prior distribution, and present an iterative algorithm for parameter estimation. The algorithm is identified as the expectation-maximization (EM) algorithm for a Gaussian mixture model and can estimate a number of parameters in R_t. It extends the maximum-likelihood algorithm of Ueno and Nakamura (2014). An advantage of the proposed method is that R_t can be estimated online and, more importantly, that the temporal smoothness of R_t can be controlled by suitably choosing two parameters of the prior distribution: the covariance matrix S and the number of degrees of freedom ν. The parameters S and ν may vary with the time at which R_t is estimated, and ν can be estimated objectively by maximizing the marginal likelihood. The formalism can handle cases in which the number of data points or the data positions vary with time; the former case is exemplified in the experiments. We present an application to a coupled atmosphere-ocean model under each of the following assumptions: R_t is a scalar multiple of a fixed matrix (R_t=α_tΣ, where α_t is the scalar parameter and Σ is the fixed matrix), R_t is a diagonal matrix, R_t has fixed eigenvectors, or R_t has no specific structure. We verify that the proposed algorithm works well and that only a limited number of iterations are necessary. When R_t has one of the first three structures, taking the prior covariance matrix to be the previous estimate, namely S=\hat{R}_{t-1}, yields a Bayesian estimate of R_t that varies more smoothly in time than the maximum-likelihood estimate obtained at each time. When R_t has no specific structure, S=\hat{R}_{t-1} must be regularized to maintain the positive-definiteness of S.
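The iterative estimate described above can be illustrated with a minimal sketch for the structure-free case. It assumes the ensemble-approximated likelihood is an equal-weight Gaussian mixture centered on the ensemble members mapped to observation space, and it uses a simplified penalized M-step, R = (νS + A)/(ν + 1), which corresponds to one particular inverse-Wishart-type parameterization of the prior. The function name `em_update_R`, its arguments, and this normalization are assumptions made for illustration and may differ from the formulation used in the study.

```python
import numpy as np

def em_update_R(y, Hx, R0, S, nu, n_iter=5):
    """Sketch of an EM iteration for R at a single time step.

    y    : (p,) observation vector
    Hx   : (N, p) ensemble members mapped to observation space
    R0   : (p, p) initial guess for R (positive definite)
    S    : (p, p) prior covariance matrix (e.g. the previous estimate)
    nu   : prior weight; the normalization used here is an assumption
    """
    R = R0.copy()
    resid = y - Hx  # (N, p) innovation of each ensemble member
    for _ in range(n_iter):
        # E-step: responsibility of each member under N(y; Hx_n, R).
        # The log-determinant of R is common to all members and cancels.
        sol = np.linalg.solve(R, resid.T)            # (p, N)
        logw = -0.5 * np.sum(resid.T * sol, axis=0)  # (N,)
        logw -= logw.max()                           # numerical stability
        w = np.exp(logw)
        w /= w.sum()
        # M-step: responsibility-weighted scatter, shrunk toward the prior S.
        A = (resid * w[:, None]).T @ resid           # sum_n w_n r_n r_n^T
        R = (nu * S + A) / (nu + 1.0)
    return R
```

In this sketch a large ν keeps the estimate close to S, which is how choosing S as the previous estimate would produce temporal smoothness, while ν = 0 reduces the update to the unpenalized maximum-likelihood step.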
Through twin experiments, we find that the best estimate of R_t is, in general, obtained by combining a structure-free R_t with an S tapered using decorrelation lengths of half the size of the model ocean basin. From experiments using real observations, we find that the structured estimates of R_t (those with the fixed structure Σ and those that are diagonal) overfit the data compared with the structure-free R_t.
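One common way to taper a covariance matrix while keeping it positive definite is an element-wise (Schur) product with a distance-based correlation matrix. The sketch below assumes a Gaussian taper and one-dimensional observation positions; neither choice is stated in the abstract, and `taper_covariance`, `positions`, and `length` are hypothetical names introduced for illustration.

```python
import numpy as np

def taper_covariance(R_hat, positions, length):
    """Taper R_hat by a Gaussian correlation with decorrelation length `length`.

    R_hat     : (p, p) covariance estimate, possibly rank-deficient
    positions : (p,) observation coordinates (1-D, assumed distinct)
    length    : decorrelation length, in the same units as `positions`
    """
    d = positions[:, None] - positions[None, :]  # pairwise separations
    C = np.exp(-0.5 * (d / length) ** 2)         # Gaussian correlation matrix
    # Element-wise product: the diagonal of R_hat is kept and long-range
    # covariances are damped.
    return R_hat * C
```

Because the Gaussian correlation matrix is positive definite for distinct positions, the product is positive definite whenever R_hat is positive semi-definite with a strictly positive diagonal, which is the kind of regularization that a prior matrix S built from a previous estimate would need.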

Session: Developments and Ocean Applications of Data Assimilation, Uncertainty, and Sensitivity Analyses III