H13D-1151:
Evaluation of Methods for Estimating Long-Range Dependence in Water Quality Time Series with Missing Data and Irregular Sampling

Monday, 15 December 2014
Qian Zhang, Ciaran J Harman and William P Ball, Johns Hopkins University, Geography and Environmental Engineering, Baltimore, MD, United States
Abstract:
Water-quality time series have been observed to exhibit long-range dependence (LRD) phenomena (e.g., Kirchner & Neal, PNAS 110(30), 2013). LRD means that the autocorrelation between values decays more slowly than exponentially with lag, which complicates the identification of deterministic trends. To quantify the strength of LRD, a variety of methods have been developed, e.g., rescaled range analysis, detrended fluctuation analysis, and spectral analysis. However, these methods are generally inapplicable to water-quality monitoring data that may be sampled irregularly or have missing values. This work systematically evaluates and compares two broad types of methods for estimating LRD in gappy water-quality time series using Monte Carlo simulation. The first type uses several forms of interpolation to fill in gaps, thus making the data analyzable by the traditional methods. The second type, which includes some newly developed wavelet techniques, can be applied directly to gappy data. However, such methods have not been evaluated in the context of water-quality data, which usually contain irregularly distributed gaps. Here we present simulation results obtained in three steps. First, we generate an ensemble of 1,000 replicates of synthetic data with known LRD using the fractional autoregressive integrated moving average (ARFIMA) model. Second, we remove portions of the data to create the irregular sampling intervals that are typical of water-quality time series. Third, we apply the candidate methods to estimate the LRD of the irregular data. The performance of the candidate methods is evaluated for values of the fractional differencing parameter (d) ranging from 0 (i.e., no LRD) to 0.499 (strong LRD yet stationary).
For practicality, the simulations have been designed to mimic the Chesapeake Non-tidal Monitoring Program data, which are typical of long-term water-quality records in terms of sampling frequency (weekly to monthly), length (15-40 years), and gaps (irregular and frequent). The results will help hydrologists choose appropriate methods to estimate LRD in water-quality time series, and the findings and approaches may be applicable to gappy data in other scientific fields.
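
The three simulation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every function name and parameter is hypothetical. It rests on several assumptions: the synthetic series is ARFIMA(0, d, 0) approximated by a truncated moving-average filter, gaps are created by uniform random thinning rather than by a realistic monitoring schedule, and the estimator is one example of the first type of method (linear interpolation followed by a spectral estimate, here the Geweke and Porter-Hudak log-periodogram regression).

```python
import numpy as np


def simulate_arfima_0d0(n, d, rng, burn_in=1000):
    """Step 1 (assumed form): approximate ARFIMA(0, d, 0), 0 <= d < 0.5,
    by filtering white noise with truncated MA(infinity) weights."""
    m = n + burn_in
    # Weights of (1 - B)^(-d): psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k
    k = np.arange(1, m)
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + d) / k)))
    eps = rng.standard_normal(2 * m)
    # "valid" keeps only outputs that use all m weights (length m + 1)
    x = np.convolve(eps, psi, mode="valid")
    return x[-n:]


def make_gappy(x, keep_fraction, rng):
    """Step 2 (assumed gap model): keep each sample with probability
    keep_fraction; the endpoints are always kept so interpolation is bounded."""
    keep = rng.random(x.size) < keep_fraction
    keep[0] = keep[-1] = True
    t = np.flatnonzero(keep)
    return t, x[t]


def interpolate_linear(t, y, n):
    """Fill gaps by linear interpolation onto the regular grid 0..n-1."""
    return np.interp(np.arange(n), t, y)


def estimate_d_gph(x, power=0.5):
    """Step 3 (one candidate estimator): log-periodogram regression.
    Regress log I(lambda_j) on -log(4 sin^2(lambda_j / 2)) over the lowest
    m = n**power Fourier frequencies; the slope estimates d."""
    n = x.size
    m = int(n ** power)
    x = x - x.mean()
    freqs = 2 * np.pi * np.arange(1, m + 1) / n
    periodogram = np.abs(np.fft.fft(x)[1:m + 1]) ** 2 / (2 * np.pi * n)
    regressor = -2 * np.log(2 * np.sin(freqs / 2))
    slope, _intercept = np.polyfit(regressor, np.log(periodogram), 1)
    return slope


if __name__ == "__main__":
    rng = np.random.default_rng(2014)
    n, n_rep = 1024, 20  # far fewer replicates than the 1,000 in the abstract
    for d in (0.0, 0.2, 0.4):
        full, gappy = [], []
        for _ in range(n_rep):
            x = simulate_arfima_0d0(n, d, rng)
            t, y = make_gappy(x, keep_fraction=0.5, rng=rng)
            full.append(estimate_d_gph(x))
            gappy.append(estimate_d_gph(interpolate_linear(t, y, n)))
        print(f"d={d:.1f}  complete: {np.mean(full):+.3f}  "
              f"gappy+interp: {np.mean(gappy):+.3f}")
```

Comparing the two columns for each d gives a rough sense of how much a given gap-filling choice shifts the estimate. A fuller evaluation along the lines of the abstract would swap in other interpolation schemes, other estimators (including ones that operate on the irregular samples directly), and a gap pattern modeled on the monitoring record of interest.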