H31M-03
The Scientific Method, Diagnostic Bayes, and How to Detect Epistemic Errors

Wednesday, 16 December 2015: 08:30
3016 (Moscone West)
Jasper A Vrugt, University of California Irvine, Irvine, CA, United States
Abstract:
In the past decades, Bayesian methods have found widespread application and use in environmental systems modeling. Bayes' theorem states that the posterior probability, P(H|\hat{D}), of a hypothesis, H, is proportional to the product of the prior probability, P(H), of this hypothesis and the likelihood, L(H|\hat{D}), of the same hypothesis given the new/incoming observations, \hat{D}. In science and engineering, H often constitutes some numerical simulation model, D = F(x,\cdot), which summarizes, using algebraic, empirical, and differential equations, state variables, and fluxes, all our theoretical and/or practical knowledge of the system of interest, and x are the d unknown parameters that are subject to inference using data, \hat{D}, of the observed system response. The Bayesian approach is intimately related to the scientific method and uses an iterative cycle of hypothesis formulation (model), experimentation and data collection, and theory/hypothesis refinement to elucidate the rules that govern the natural world. Unfortunately, model refinement has proven to be very difficult, in large part because of the poor diagnostic power of residual-based likelihood functions \citep{gupta2008}. This has inspired \cite{vrugt2013} to advocate the use of 'likelihood-free' inference using approximate Bayesian computation (ABC). This approach uses one or more summary statistics, S(\hat{D}), of the original data, \hat{D}, each designed, ideally, to be sensitive to only one particular process in the model. Any mismatch between the observed and simulated summary metrics is then easily linked to a specific model component. A recurrent issue with the application of ABC is the sufficiency of the summary statistics: in theory, S(\cdot) should contain as much information as the original data itself, yet complex systems rarely admit sufficient statistics. In this article, we propose to combine the ideas of ABC and regular Bayesian inference to guarantee that no information is lost in diagnostic model evaluation. This hybrid approach, coined diagnostic Bayes, uses the summary metrics to define the prior distribution and the original data in the likelihood function, or P(x|\hat{D}) \propto P(x|S(\hat{D})) L(x|\hat{D}). A case study illustrates the ability of the proposed methodology to diagnose epistemic errors and provide guidance on model refinement.
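
As a rough illustration of the hybrid posterior P(x|\hat{D}) \propto P(x|S(\hat{D})) L(x|\hat{D}), the sketch below evaluates such a target for a toy problem. The exponential-decay model F, the choice of the sample mean as summary statistic S, the Gaussian residual-based likelihood, and the kernel bandwidth eps are all illustrative assumptions made here for concreteness; they are not specified in the abstract, which leaves the summary metrics, likelihood, and sampler (e.g., MCMC) to the full presentation.

```python
import numpy as np
from scipy.stats import norm

# --- Illustrative setup (assumed, not from the abstract): a toy
# --- exponential-decay model D = F(x, t) with one unknown parameter x.
t = np.linspace(0.0, 10.0, 50)

def F(x, t):
    """Simulated system response D for parameter value x."""
    return np.exp(-x * t)

rng = np.random.default_rng(1)
x_true, sigma = 0.5, 0.05
D_hat = F(x_true, t) + rng.normal(0.0, sigma, t.size)  # observed data \hat{D}

# Summary statistic S(\hat{D}): here simply the sample mean of the series,
# a stand-in for a process-sensitive signature metric.
def S(d):
    return d.mean()

S_obs = S(D_hat)
eps = 0.02  # assumed kernel bandwidth for the summary-metric "prior"

def log_prior_from_summary(x):
    """P(x | S(\hat{D})): ABC-style Gaussian kernel on the summary mismatch."""
    return norm.logpdf(S(F(x, t)) - S_obs, scale=eps)

def log_likelihood(x):
    """L(x | \hat{D}): Gaussian residual-based likelihood on the full data."""
    return norm.logpdf(D_hat - F(x, t), scale=sigma).sum()

def log_posterior(x):
    """Diagnostic-Bayes target: log P(x|S(\hat{D})) + log L(x|\hat{D})."""
    return log_prior_from_summary(x) + log_likelihood(x)

# Evaluate the hybrid posterior on a parameter grid; a real application
# would instead sample it with an MCMC algorithm.
xs = np.linspace(0.1, 1.0, 200)
lp = np.array([log_posterior(x) for x in xs])
print("MAP estimate of x:", xs[np.argmax(lp)])
```

In this sketch, a poor fit of the summary-metric term relative to the residual-based term would point to the model component the signature was designed to probe, which is the diagnostic use of the summary metrics the abstract describes.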