Background on LISFLOOD calibration for CEMS-Flood

The LISFLOOD model calibration process aims to generate a set of 14 parameter maps (see CEMS-Flood calibration parameters) over the relevant domain (pan-European for EFAS, global for GloFAS) by minimising the differences between simulated and observed river discharge at a network of gauging stations with observations of adequate quality (see CEMS-Flood diagnostic and web reporting points).

Once calibrated, the parameter maps are used to execute the long-term run (LTR), a continuous simulation in which the model is forced with observations over as long a period as possible (e.g. 1990-01-01 to 2017-12-31). The simulations are then compared against observed discharge to assess how well the model reproduces the hydrological time series. Because the length of the observation record varies across the domain, especially for nested catchments, hydrological model performance is evaluated over all available discharge data rather than over the calibration and validation periods separately.

Hydrological model performance criteria

As is common practice, a single evaluation criterion is used, which expresses the similarity between observed and simulated discharge in one number (Gupta et al., 2009; Knoben et al., 2019).

For CEMS-Flood, the hydrological performance criterion is the modified Kling-Gupta Efficiency metric (KGE'; Gupta et al., 2009; Kling et al., 2012). The KGE' expresses the distance from the point of ideal model performance in the space described by its three components (correlation, variability bias and mean bias). KGE' = 1 indicates perfect agreement between simulations and observations. The KGE' score of a mean flow benchmark is KGE' ≈ -0.41.

$$
\mathrm{KGE}' = 1 - \sqrt{(r-1)^2 + (\beta-1)^2 + (\gamma-1)^2}, \qquad \beta = \frac{\mu_s}{\mu_o}, \qquad \gamma = \frac{\sigma_s/\mu_s}{\sigma_o/\mu_o}
$$

where r is the Pearson correlation coefficient between long-term run simulations (s) and observations (o), β is the bias ratio, γ is the variability ratio, μ the mean discharge, and σ the discharge standard deviation.
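
As an illustration of how these quantities combine, the following is a minimal sketch rather than the CEMS-Flood implementation. It assumes the Kling et al. (2012) form, KGE' = 1 - sqrt((r-1)² + (β-1)² + (γ-1)²) with β = μs/μo and γ = (σs/μs)/(σo/μo). The function name kge_prime, the use of population standard deviations, and the convention of setting r to 0 when one series is constant are assumptions made for this example.

```python
import math
from statistics import fmean, pstdev


def kge_prime(sim, obs):
    """Return (KGE', r, beta, gamma) for paired simulated/observed discharge.

    Hypothetical helper for illustration only.
    """
    if len(sim) != len(obs) or len(sim) < 2:
        raise ValueError("sim and obs must be paired series with at least two values")
    mu_s, mu_o = fmean(sim), fmean(obs)
    if mu_s <= 0 or mu_o <= 0:
        raise ValueError("mean discharge must be positive")
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    if sd_o == 0:
        raise ValueError("observations must not be constant")
    # Pearson correlation; assumed to be 0 when the simulation is constant.
    cov = fmean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o) if sd_s > 0 else 0.0
    beta = mu_s / mu_o                        # bias ratio
    gamma = (sd_s / mu_s) / (sd_o / mu_o)     # variability ratio (ratio of CVs)
    kge = 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
    return kge, r, beta, gamma


# Illustrative values only, not real station data.
obs = [10.0, 30.0, 55.0, 41.0, 20.0, 12.0]
mean_flow = [fmean(obs)] * len(obs)
print(kge_prime(obs, obs)[0])        # perfect simulation
print(kge_prime(mean_flow, obs)[0])  # mean flow benchmark, 1 - sqrt(2)
```

With a constant simulation equal to the observed mean, r = 0, β = 1 and γ = 0 under the conventions above, which gives the mean flow benchmark score of 1 - √2.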

The KGE’ and its three decomposed components (correlation, bias ratio, and variability ratio) are all dimensionless with an optimum value of 1:

  • Pearson correlation (r) in KGE' highlights temporal errors through the strength of the linear relationship between simulation and observation time series. It ranges from -1 to 1.
  • Bias ratio represents the bias errors, ranging from 0 to +Inf. 
  • Variability ratio shows the variability related errors in the simulation. It ranges from 0 to +Inf. 

Note that, by construction, the KGE' can never be better than its worst component: it is at most 1 minus the largest deviation of any component from its optimum of 1. For example, if the correlation is the worst component with a value of, say, 0.75, then the KGE' cannot exceed 0.75. This guarantees that high KGE' values reflect a very good correspondence between simulated and observed discharge.
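
This limit follows directly from the distance form of the metric. A short derivation, assuming KGE' is defined as 1 minus the Euclidean distance of (r, β, γ) from (1, 1, 1), with illustrative component values that are not taken from any CEMS-Flood station:

```latex
\begin{align*}
\mathrm{KGE}' &= 1 - \sqrt{(r-1)^2 + (\beta-1)^2 + (\gamma-1)^2} \\
              &\le 1 - \max\bigl(|r-1|,\ |\beta-1|,\ |\gamma-1|\bigr) \\[4pt]
r = 0.75,\ \beta = 1.1,\ \gamma = 0.9:\quad
\mathrm{KGE}' &= 1 - \sqrt{0.0625 + 0.01 + 0.01} \approx 0.71 \le 0.75
\end{align*}
```

Equality holds only when the other two components are exactly 1, so any additional bias or variability error pushes the score further below the weakest component.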

Additionally, a general model evaluation is also conducted with the following indicators:

  • Relative bias pbias (ideal value = 0) defined as β-1 and its absolute value abspbias
  • Relative variability var (ideal value = 0) defined as γ-1 and its absolute value absvar
  • Timing index to measure timing errors (timing, in days; ideal value = 0), which shows the time delay between the simulated and observed river discharge time series, and its absolute value abstiming. Timing is the time lag (or shift) L that maximises the cross-correlation function Rxy(L) between the simulated (x) and observed (y) time series, with one series shifted by L days relative to the other. A positive/negative timing error indicates delayed/advanced simulated river discharge. For example, a timing error of +5 means the simulation needs to be shifted 5 days backwards (brought earlier) to reach the highest correlation, i.e. the simulation is generally 5 days late in predicting the ups and downs of the flow time series. Although this is not directly equivalent to measuring the timing error of the highest flood peaks, it is closely related to it and can be used as a simple estimate.
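
The three indicators above can be sketched as follows. This is a minimal illustration under stated assumptions, not the operational code: the function names, the max_lag search window and the use of the Pearson correlation of the overlapping part of the two series as the cross-correlation measure are all choices made for this example. One time step is taken to be one day.

```python
import math
from statistics import fmean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = fmean(x), fmean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    return fmean([(a - mx) * (b - my) for a, b in zip(x, y)]) / (sx * sy)


def relative_bias(sim, obs):
    """pbias = beta - 1, with beta the ratio of simulated to observed mean."""
    return fmean(sim) / fmean(obs) - 1.0


def relative_variability(sim, obs):
    """var = gamma - 1, with gamma the ratio of coefficients of variation."""
    cv_s = pstdev(sim) / fmean(sim)
    cv_o = pstdev(obs) / fmean(obs)
    return cv_s / cv_o - 1.0


def timing_error(sim, obs, max_lag=10):
    """Lag L (in time steps) maximising the correlation of sim[t + L] with obs[t].

    Positive L: the simulation is late; negative L: the simulation is early.
    max_lag is a hypothetical search window, not a documented setting.
    """
    if len(sim) != len(obs) or max_lag >= len(sim) - 1:
        raise ValueError("series must be paired and longer than max_lag + 1")
    best_lag, best_r = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = sim[lag:], obs[:len(obs) - lag]
        else:
            x, y = sim[:len(sim) + lag], obs[-lag:]
        r = pearson(x, y)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag


# Synthetic flood pulse; the simulation is the same pulse arriving 5 days late.
obs = [10.0] * 20 + [20.0, 50.0, 90.0, 60.0, 35.0, 20.0, 14.0, 11.0] + [10.0] * 20
sim = [10.0] * 5 + obs[:-5]
print(timing_error(sim, obs))
```

The absolute variants (abspbias, absvar, abstiming) are simply abs() applied to the corresponding result.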

Verification period

The verification covers the whole period for which river discharge observations are available.

Communication of results: graphical representation of the hydrological model performance

For each river gauge where the observational data allow hydrological model performance to be evaluated, a number of summary graphs are shown in the hydrological model performance layer of the CEMS-Flood map viewers to give an overview of the model's behaviour.

Hydrological model performance metric

The evaluation metric KGE' and its components are represented in speedometer-like gauges.

Figure 1: KGE' and its decomposition. Ideal values are 1 for each component.

Monthly discharge climatology

Climatological discharge main statistics (median, interquartile range and outliers) are calculated for each month (displayed over a 14-month period starting in September) for both observed and simulated discharges. The superposition of both shows the (dis)agreement between the two time series, giving a visual confirmation of the KGE' and its components.

While the correlation is high if simulation and observation co-vary, the correspondence between the medians reflects the bias ratio and the correspondence between the outliers reflects the variability ratio. The following figure helps to detect systematic errors in specific months or seasons and to identify the cause of a potentially low KGE' score.

Figure 2: Monthly discharge climatology. Shown are boxplots with median and quartiles. The purple boxplots represent the simulated discharge throughout the entire calibration period. The blue shading represents the boxplots of the observed discharge.
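
The monthly statistics behind such a plot can be sketched as below. This is an illustration only: the function name monthly_climatology, the DISPLAY_MONTHS constant and the choice of the "inclusive" quantile method are assumptions, and the 14-month axis is assumed to wrap around so that September and October appear twice. Outlier detection is left out.

```python
from collections import defaultdict
from datetime import date
from statistics import quantiles

# Assumed display order: 14 months starting in September (Sep ... Aug, Sep, Oct).
DISPLAY_MONTHS = [(8 + k) % 12 + 1 for k in range(14)]


def monthly_climatology(dates, values):
    """Map month (1-12) -> (first quartile, median, third quartile).

    All values falling in the same calendar month, across all years, are pooled.
    Each month is assumed to hold at least two values.
    """
    by_month = defaultdict(list)
    for d, v in zip(dates, values):
        by_month[d.month].append(v)
    stats = {}
    for month, vals in sorted(by_month.items()):
        q1, med, q3 = quantiles(vals, n=4, method="inclusive")
        stats[month] = (q1, med, q3)
    return stats


# Tiny made-up series: five January values and three February values.
dates = [date(2000, 1, d) for d in range(1, 6)] + [date(2000, 2, d) for d in range(1, 4)]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0]
for month, (q1, med, q3) in monthly_climatology(dates, values).items():
    print(month, q1, med, q3)
```

Running the same function on the observed and on the simulated series, then plotting the two sets of boxes in DISPLAY_MONTHS order, gives the kind of overlay described above.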

Discharge hydrographs

Full discharge time series at the model time step are shown as hydrographs, with observed values highlighted as polygons. The time series plots help to identify particular periods of low model performance and to understand where a low KGE' value might come from. For example, if the correlation is fairly good but the model fails to capture low flows or peak amplitudes correctly, this is visible in the hydrograph. The hydrograph also helps to visually assess the quality of the gauging station's observations.


Figure 3: Simulated (purple) and observed (blue) discharge time series. The return periods are also shown. 


References

Gupta, H.V., Kling, H., Yilmaz, K.K. and Martinez, G.F. (2009) Decomposition of the Mean Squared Error and NSE Performance Criteria: Implications for Improving Hydrological Modelling. Journal of Hydrology, 377, 80-91.
http://dx.doi.org/10.1016/j.jhydrol.2009.08.003

Kling, H., Fuchs, M. and Paulin, M. (2012) Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios. Journal of Hydrology, 424-425, 264-277. https://doi.org/10.1016/j.jhydrol.2012.01.011

Knoben, W. J. M., Freer, J. E., and Woods, R. A.: Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores, Hydrol. Earth Syst. Sci., 23, 4323–4331, https://doi.org/10.5194/hess-23-4323-2019, 2019.