IN43B-3693:
Comparing Absolute Error with Squared Error for Evaluating Empirical Models of Continuous Variables: Compositions, Implications, and Consequences
Thursday, 18 December 2014
Jing Gao, University of Illinois at Urbana-Champaign, Urbana, IL, United States
Abstract:
Reducing modeling error is often a major concern in empirical geophysical modeling. However, modeling error can be defined in different ways: when the response variable is continuous, the most commonly used metrics are squared (SQ) and absolute (ABS) errors. For most applications, ABS error is the more natural choice, but SQ error is mathematically more tractable, so it is often used as a substitute with little scientific justification. Existing literature has not thoroughly investigated the implications of using SQ error in place of ABS error, especially in a geospatial context. This study compares the two metrics through the lens of bias-variance decomposition (BVD). BVD breaks down the expected modeling error at each model evaluation point into bias (systematic error), variance (model sensitivity), and noise (observation instability). It offers a way to probe the composition of various error metrics. I analytically derived the BVD of ABS error and compared it with the well-known BVD of SQ error. I found that the two metrics not only measure the characteristics of the probability distributions of modeling errors differently, but that the effects of these characteristics on the overall expected error also differ. Most notably, under SQ error, bias, variance, and noise all increase expected error, while under ABS error certain parts of the error components reduce expected error. Since manipulating these subtractive terms is a legitimate way to reduce expected modeling error, SQ error can never capture the complete story embedded in ABS error. I then empirically compared the two metrics using a supervised remote sensing model for mapping surface imperviousness. Pairwise, spatially explicit comparison of each error component showed that SQ error overstates all error components relative to ABS error, especially variance-related terms.
Hence, substituting SQ error for ABS error makes model performance appear worse than it actually is, and makes the analyst more likely to accept a model with higher bias but greater stability. Further, my experiments showed that the two metrics can and will lead to different conclusions about the impacts of certain operations and may suggest different strategies for model improvement. Therefore, SQ error is a poor substitute for ABS error, and the use of the two metrics should be clearly differentiated.
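As a rough illustration of the well-known SQ error BVD mentioned above, the following minimal Python sketch simulates errors at a single evaluation point and checks that expected SQ error is approximately bias^2 + variance + noise. It is not the author's derivation or remote sensing experiment: the Gaussian model, the parameter values, and the function names `simulate_errors` and `summarize` are assumptions made for illustration only. The ABS error BVD is not reproduced here because the abstract does not state its form; the sketch only estimates expected ABS error empirically for comparison.

```python
import random
import statistics


def simulate_errors(true_value=1.0, bias=0.3, model_sd=0.5, noise_sd=0.2,
                    n_draws=100_000, seed=0):
    """Draw (observation - prediction) errors at one evaluation point.

    Assumed toy setup (not from the abstract):
      observation = true_value + Gaussian noise          (noise)
      prediction  = true_value + bias + Gaussian jitter  (bias and variance)
    Noise and model jitter are drawn independently.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_draws):
        observation = true_value + rng.gauss(0.0, noise_sd)
        prediction = true_value + bias + rng.gauss(0.0, model_sd)
        errors.append(observation - prediction)
    return errors


def summarize(errors, bias, model_sd, noise_sd):
    """Compare Monte Carlo expected errors with the SQ error decomposition."""
    return {
        # Monte Carlo estimate of expected squared error.
        "expected_sq": statistics.fmean(e * e for e in errors),
        # Monte Carlo estimate of expected absolute error.
        "expected_abs": statistics.fmean(abs(e) for e in errors),
        # Classical SQ decomposition: bias^2 + variance + noise.
        "sq_from_components": bias ** 2 + model_sd ** 2 + noise_sd ** 2,
    }


if __name__ == "__main__":
    params = {"bias": 0.3, "model_sd": 0.5, "noise_sd": 0.2}
    summary = summarize(simulate_errors(**params), **params)
    for name, value in summary.items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions all three SQ components are additive, so shrinking any one of them lowers expected SQ error. Note that expected ABS error is in the units of the response variable while expected SQ error is in squared units, so the two numbers are not directly comparable without taking a square root.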