H11D-1369
A ‘Large Catchment Sample’ Investigation of the Performance, Reliability and Robustness of Two Conceptual Rainfall-Runoff Models.

Monday, 14 December 2015
Poster Hall (Moscone South)
Thibault Mathevet, EDF-DTG, Grenoble, France, Hoshin Vijai Gupta, University of Arizona, Tucson, AZ, United States, Charles Perrin, IRSTEA, HBAN, Antony Cedex, France, Nicolas Le Moine, Universite Pierre et Marie Curie, Paris, France and Vazken Andreassian, IRSTEA, Antony Cedex, France
Abstract:
This presentation reports a ‘large-sample’ (2050 watersheds from around the world) intercomparison of two conceptual rainfall-runoff (CRR) model structures (GRX and MRX, both used in many research and operational applications), using a multi-objective evaluation process and two-way split-sample testing. The watershed sample represents a diversity of hydro-meteorological and measurement contexts, thereby lending confidence and statistical robustness to the analyses and the inferences drawn from them (Gupta et al., 2014). Overall, our results indicate that:

(i) The GRX and MRX models provide similar levels of long-term (aggregate) performance during both calibration and evaluation (as assessed by the various metrics computed). Moreover, their simulations (and biases) are strongly correlated.

(ii) Both models lack robustness when simulating the water balance and streamflow variability, although their simulation of streamflow timing and rate of change is quite good (as indicated by the long-term linear correlation between the observed and simulated time series).

(iii) The MRX model tends to reproduce short-term processes better and more robustly than the GRX model (as indicated by the distributions of short-term linear correlations between the observed and simulated time series).

(iv) Variations in model performance from one period to another appear to be due mainly to temporal variations in the hydro-meteorological properties of the period.

(v) Using KGE (Gupta et al., 2009) as the objective function tends to reduce long-term model bias (on average), which is not the case with NSE (Nash & Sutcliffe, 1970).
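The two objective functions contrasted in (v) have standard published definitions. A minimal sketch of both (not taken from this abstract; the function names and the pure-Python implementation are our own) illustrates why KGE constrains bias while NSE does not: KGE explicitly penalizes the bias ratio beta, whereas NSE only measures squared-error skill against the observed mean.

```python
# Sketch of the two objective functions compared in the abstract:
# NSE (Nash & Sutcliffe, 1970) and KGE (Gupta et al., 2009).
import math

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for o, s in zip(obs, sim))
    var = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / var

def kge(obs, sim):
    """Kling-Gupta efficiency: 1 minus the Euclidean distance from the
    ideal point (r=1, alpha=1, beta=1) in correlation / variability /
    bias space."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    so = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim) / n)
    r = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / (n * so * ss)
    alpha = ss / so  # variability ratio
    beta = ms / mo   # bias ratio (ratio of means)
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

For example, a simulation with perfect timing and variability but a constant positive offset keeps r = 1 and alpha = 1 yet degrades beta, so KGE penalizes the bias directly while NSE folds it into a single squared-error term.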

Further, our results clearly show that sub-period variability in model performance can be quite high (especially for the water balance), and that aggregate long-term (full-period) statistics tend to overestimate the true predictive performance of a hydrologic model. Our preliminary results indicate that there may be value in computing and examining distributions of the various model performance metrics over sub-period samples, instead of relying on a single period-average deterministic value. This could greatly improve model diagnosis by helping to reveal situations involving model structural inadequacy, non-stationarity of hydro-meteorological processes and/or problems with measurement data.
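The sub-period idea above can be sketched very simply: evaluate a metric on non-overlapping windows of the record and inspect the resulting distribution rather than a single full-period value. This is an illustrative sketch only; the function names, the mean-bias metric, and the window length are our assumptions, not details from the abstract.

```python
def mean_bias(obs, sim):
    """Illustrative metric: mean simulated-minus-observed error."""
    return sum(s - o for o, s in zip(obs, sim)) / len(obs)

def subperiod_scores(obs, sim, metric, window):
    """Evaluate `metric` on each non-overlapping sub-period of length
    `window`, returning one score per sub-period (a sample of the
    metric's distribution, rather than one full-period value)."""
    scores = []
    for start in range(0, len(obs) - window + 1, window):
        o = obs[start:start + window]
        s = sim[start:start + window]
        scores.append(metric(o, s))
    return scores
```

A record whose simulation is unbiased early on but drifts later would yield a near-zero full-period mean bias, while the sub-period scores would expose the drift, which is exactly the kind of diagnosis the abstract suggests.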