Incorporating Locally Averaged Distributions Based on Categorical Land-Use Information into Point Estimation
Abstract: This paper presents a novel two-tiered approach for incorporating categorical land-use information into point estimation: (1) The influence of the composition and size of land-use within a neighbourhood of a point location is detected in the data and can be described via locally mixed distributions. (2) The averaged information of these locally mixed distributions can then be included to improve the estimation of values at point locations.
This approach is applicable to a variety of problems and variables, for example the ground-truthing of meteorological models. This paper demonstrates the method by using categorical land-use information to estimate full distributions of contaminant concentrations at the shallow water table. These distributions are subsequently used as secondary information for point estimates of concentrations. Both secondary information and censored measurements are taken into account in a fully stochastic copula-based model.
Within the proposed method, the secondary information represents the predominantly small-scale, vertical infiltration process, whereas the spatial dependence model represents the predominantly horizontal, larger-scale groundwater flow and solute transport process.
At each interpolation location, a full local distribution is obtained by mixing the pure distributions of the land-use categories according to the composition of the land-use in the vicinity of the interpolation point. The pure distributions for a given neighbourhood size are jointly optimized for all groups of secondary information. A contaminant-specific, spatially distributed measure of the information content of the secondary information is presented.
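The mixing step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each land-use category has a lognormal pure distribution and that the local distribution is the composition-weighted mixture of the pure cumulative distribution functions; the category names, the parameter values, and the helper functions `composition` and `local_cdf` are hypothetical and not taken from the paper.

```python
import math
from collections import Counter

def lognormal_cdf(x, mu, sigma):
    """CDF of a lognormal distribution with log-mean mu and log-std sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

# Hypothetical pure distributions per land-use category: (mu, sigma).
# In the paper these are optimized jointly; here they are simply assumed.
PURE = {
    "forest": (0.0, 0.5),
    "arable": (1.0, 0.5),
    "urban": (0.5, 0.8),
}

def composition(cells):
    """Fraction of each land-use category among the raster cells
    that fall inside the neighbourhood of a point."""
    counts = Counter(cells)
    total = len(cells)
    return {cat: n / total for cat, n in counts.items()}

def local_cdf(x, weights):
    """Locally mixed CDF: pure CDFs weighted by the land-use composition."""
    return sum(w * lognormal_cdf(x, *PURE[cat]) for cat, w in weights.items())

# Example neighbourhood: 10 raster cells around an interpolation point.
cells = ["forest"] * 6 + ["arable"] * 3 + ["urban"] * 1
weights = composition(cells)
prob_below_one = local_cdf(1.0, weights)  # P(concentration <= 1) at this point
```

Because the weights sum to one and each pure CDF is a valid distribution function, the mixture is again a valid distribution function. Changing the neighbourhood size changes `cells`, and therefore the weights, which is why the pure distributions are tied to a given neighbourhood size.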
The impact of these improvements to the geostatistical model is evaluated using regional groundwater quality data from a large monitoring network (~2500 measurement locations) within the state of Baden-Württemberg (~36,000 km²), Germany. Land-use as secondary information is available as a categorical variable at a resolution of 30 m × 30 m.
Different approaches for incorporating the secondary information are presented and their effectiveness is assessed. The improved quality of the interpolation, particularly the improved quantification of the uncertainty at each location, is demonstrated by cross-validation results.