Prospects and Pitfalls in the Coming Wave of High-frequency Environmental Data: What to Look Forward to, and Watch out for

Monday, 15 December 2014: 4:00 PM
James W Kirchner, ETH Swiss Federal Institute of Technology Zurich, Zurich, Switzerland
Advances in sensor technology and wireless networking are making environmental data available at unprecedented sampling frequency and with unprecedented spatial coverage. These measurements are allowing us to observe phenomena and process interactions that were previously invisible. This talk will present examples of promising developments spurred by the sensor revolution in hydrological sciences.

Caution is needed, however, in interpreting these high-frequency observations. Typical environmental time series have reddened spectra (spectral power declines with increasing frequency), so as measurements are made at finer temporal or spatial intervals, successive values will lie closer together as well. Above some sampling frequency, successive real-world values will differ by less than the measurement uncertainties themselves, with the result that the measured time series will be dominated by measurement noise rather than real-world dynamics.
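
The crossover described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the real-world signal behaves like a random walk (one simple red-noise model) whose variance grows as `sigma**2 * dt`, and that measurement errors are independent with standard deviation `noise_sd`. These names and the random-walk model are assumptions of this sketch, not part of the talk.

```python
import math

def real_step_rms(sigma, dt):
    """RMS real-world change between successive samples, assuming a
    random walk whose increment variance is sigma**2 * dt."""
    return sigma * math.sqrt(dt)

def noise_step_rms(noise_sd):
    """RMS difference between two independent measurement errors,
    each with standard deviation noise_sd."""
    return noise_sd * math.sqrt(2.0)

def crossover_interval(sigma, noise_sd):
    """Sampling interval at which the two RMS differences are equal.
    At shorter intervals, successive readings differ mostly by noise."""
    return 2.0 * noise_sd ** 2 / sigma ** 2

# Hypothetical values: sigma in units per sqrt(hour), noise_sd in units.
dt_star = crossover_interval(sigma=1.0, noise_sd=0.1)
print(dt_star)  # sampling interval in hours
```

With these hypothetical values the crossover falls at roughly 0.02 hours (about 72 seconds); under this model, sampling faster than that would mostly record sensor noise.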

Environmental sensors, like all measurement devices, will be characterized by noise spectra, and environmental signals will only be observable in frequency ranges where they are much stronger than the measurement noise. Thus it is important to know the noise spectra of environmental sensors, through long-term time series measurements on standards and blanks. To date, however, measurement noise spectra have rarely been measured or reported.
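
One way to approach this, sketched here under stated assumptions, is to compute a periodogram of a long series of readings on a blank or standard and compare it, frequency by frequency, with the periodogram of the environmental record. The function names, the simple `|X_k|^2 / n` scaling, and the threshold `ratio=10.0` standing in for "much stronger" are illustrative choices, not values from the talk.

```python
import cmath
import math

def periodogram(x, dt=1.0):
    """Raw periodogram of an evenly sampled series (mean removed).
    Returns frequencies k/(n*dt) for k = 1..n//2 and power |X_k|^2 / n.
    Uses a direct DFT, so it is only practical for short series."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        s = sum(xc[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k / (n * dt))
        power.append(abs(s) ** 2 / n)
    return freqs, power

def observable_band(freqs, signal_power, noise_power, ratio=10.0):
    """Frequencies at which the signal spectrum exceeds the blank
    (noise) spectrum by at least the assumed factor `ratio`."""
    return [f for f, s, b in zip(freqs, signal_power, noise_power)
            if s > ratio * b]
```

Both series would need the same length and sampling interval for the comparison to line up bin by bin. In practice a smoothed or averaged spectral estimate would be less erratic than the raw periodogram used here.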

The reddened spectra of environmental fluctuations imply that measurements at sufficiently high frequencies will mostly "connect the dots" between nearby data points. In other words, measurements at higher and higher sampling frequencies will be more and more strongly autocorrelated, with the result that the number of effective degrees of freedom in the time series will be far smaller than the number of measurements. Statistical analyses of such time series will greatly exaggerate significance levels, and greatly underestimate confidence bounds, unless this loss of degrees of freedom is taken into account. Illustrative examples of these pitfalls will be presented.
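
The loss of degrees of freedom can be made concrete with a common textbook approximation, shown here as a sketch rather than as the method used in the talk. It assumes the series behaves like a first-order autoregressive (AR(1)) process, for which the effective sample size for estimating the mean is approximately n(1 - r1)/(1 + r1), where r1 is the lag-1 autocorrelation. The function names are illustrative.

```python
import math

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def effective_sample_size(n, r1):
    """Approximate effective sample size for the mean of an AR(1)
    process with lag-1 autocorrelation r1 (an assumed model)."""
    return n * (1.0 - r1) / (1.0 + r1)

def standard_error_of_mean(sd, n, r1):
    """Standard error of the mean using the effective sample size
    instead of the raw number of measurements."""
    return sd / math.sqrt(effective_sample_size(n, r1))

# Hypothetical case: 1000 measurements with lag-1 autocorrelation 0.9.
print(effective_sample_size(1000, 0.9))
```

In this hypothetical case 1000 measurements carry roughly the information of about 53 independent ones, so a standard error computed with n = 1000 would be too small by a factor of about 4.4.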