Background on LISFLOOD calibration for EFAS

The LISFLOOD model calibration process aims to generate a set of 14 parameter maps (see EFAS v4.0 calibration parameters) over the pan-European EFAS extent by minimising the differences between simulated and observed river discharge at a network of gauging stations with observations of adequate quality (see EFAS v4.0 calibration stations).

Once calibrated, the maps are used to execute the long-term run (LTR), a continuous simulation in which the model is forced with observations over as long a period as possible (e.g. 1990-01-01 to 2017-12-31). The simulated discharge is then compared against observed discharge to assess how well the model reproduces the hydrological time series. Because the length of the observation record varies across the domain, especially for nested catchments, hydrological model performance is evaluated over all available discharge data rather than over the calibration and validation periods separately.

Hydrological model performance criteria

As is common practice, a single evaluation criterion is used, expressing in one number the similarity between observed and simulated discharge (Gupta et al., 2009; Knoben et al., 2019). For EFAS v4.0, the hydrological performance criterion is the modified Kling-Gupta Efficiency (KGE'; Gupta et al., 2009; Kling et al., 2012). KGE' measures the distance from the point of ideal model performance in the space spanned by its three components (correlation, bias ratio and variability ratio). KGE' = 1 indicates perfect agreement between simulations and observations. For reference, a mean flow benchmark scores KGE' ≈ -0.41 (Knoben et al., 2019).

      

$$ KGE' = 1 - \sqrt{(r - 1)^2 + (\beta - 1)^2 + (\gamma - 1)^2}, \qquad \beta = \frac{\mu_s}{\mu_o}, \qquad \gamma = \frac{\sigma_s / \mu_s}{\sigma_o / \mu_o} $$

where r is the Pearson correlation coefficient between long-term run simulations (s) and observations (o), β is the bias ratio, γ is the variability ratio, μ is the mean discharge, and σ is the discharge standard deviation. The KGE' and its three components (correlation, bias ratio and variability ratio) are all dimensionless, with an optimum value of 1. Note that, by construction, KGE' can never be better than its worst component: for any single component c, KGE' is at most 1 - |c - 1|. For example, if the correlation is the weakest component with a value of, say, 0.75, then KGE' is also limited to 0.75. This guarantees that high KGE' values reflect a very good correspondence between simulated and observed discharge.
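As an illustration of how the three components combine, the following is a minimal Python sketch, not the EFAS implementation. It assumes the standard Kling et al. (2012) form, KGE' = 1 - sqrt((r - 1)^2 + (β - 1)^2 + (γ - 1)^2) with β = μs/μo and γ = (σs/μs)/(σo/μo). The function name kge_prime is hypothetical, and the inputs are assumed to be paired, gap-free series with non-zero mean and non-zero variance.

```python
import math
from statistics import fmean, pstdev


def kge_prime(sim, obs):
    """Return (KGE', r, beta, gamma) for paired simulated/observed discharge.

    Assumes gap-free series of equal length with non-zero mean and variance.
    """
    if len(sim) != len(obs) or len(sim) < 2:
        raise ValueError("sim and obs must be paired series of length >= 2")

    mu_s, mu_o = fmean(sim), fmean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)

    # Pearson correlation between simulation and observation
    cov = fmean((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs))
    r = cov / (sd_s * sd_o)

    beta = mu_s / mu_o                     # bias ratio
    gamma = (sd_s / mu_s) / (sd_o / mu_o)  # variability ratio (ratio of CVs)

    # Euclidean distance from the ideal point (1, 1, 1), subtracted from 1
    kge = 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
    return kge, r, beta, gamma


# Simulation is 20% too high everywhere: r = 1, beta = 1.2, gamma = 1,
# so KGE' is limited by the bias ratio alone (about 0.8).
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 24.0, 36.0, 48.0]
print(kge_prime(sim, obs))
```

Because the simulation in this example is a constant multiple of the observations, only the bias ratio departs from 1, which shows how a single poor component caps the overall score.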

Communication of results: graphical representation of the hydrological model performance

For each river gauge where the observational data allow a hydrological model performance evaluation, a number of summary graphs are shown to give an overview of the model's behaviour.

Hydrological model performance metric

The evaluation metric KGE' and its components are represented in speedometer-like gauges.

Figure 1: KGE' and its decomposition. Ideal values are 1 for each component.

Monthly discharge climatology

The main statistics of the discharge climatology (median, interquartile range and outliers) are calculated for each month (displayed over a 14-month period starting in September) for both observed and simulated discharge. Superimposing the two shows the (dis)agreement between the two time series, giving a visual confirmation of the KGE' and its components.

The correlation is high if simulation and observation co-vary, while the correspondence between the medians reflects the bias ratio and the correspondence between the outliers reflects the variability ratio. The figure below helps to reveal systematic errors in specific months or seasons and to identify the cause of a potentially low KGE' score.

Figure 2: Monthly discharge climatology. Shown are boxplots with median and quartiles. The purple boxplots represent the simulated discharge over the entire calibration period. The blue shading represents the boxplots of the observed discharge.
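The monthly statistics behind such boxplots can be sketched as follows. This is a hedged illustration only: the function name monthly_climatology, the input layout (a list of (date, discharge) pairs) and the "inclusive" quartile method are assumptions, and the method used for the EFAS figures may differ. Each month is assumed to have at least two values.

```python
from collections import defaultdict
from datetime import date
from statistics import quantiles


def monthly_climatology(series):
    """Map month number -> (q1, median, q3) for (date, discharge) pairs."""
    by_month = defaultdict(list)
    for day, discharge in series:
        by_month[day.month].append(discharge)

    stats = {}
    for month, values in sorted(by_month.items()):
        # Quartiles of all values falling in this calendar month
        q1, med, q3 = quantiles(values, n=4, method="inclusive")
        stats[month] = (q1, med, q3)
    return stats


# Five hypothetical January values, one per year
series = [(date(2000 + i, 1, 15), float(i + 1)) for i in range(5)]
print(monthly_climatology(series))
```

Applying the same function to the observed and the simulated series and comparing the results month by month gives the numbers that the superimposed boxplots display.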

Discharge hydrographs

Full discharge time series at the model time step are shown as hydrographs, with observed values highlighted as polygons. The time series plots help to identify particular periods of low model performance and to understand where a low KGE' value might come from. For example, if the correlation is fairly good but the model fails to capture low flows or peak amplitudes correctly, the discharge hydrograph helps to visually assess the quality of the gauging station's observations.


Figure 3: Simulated (purple) and observed (blue) discharge time series. The return periods are also shown. 


References

Gupta, H.V., Kling, H., Yilmaz, K.K. and Martinez, G.F. (2009) Decomposition of the Mean Squared Error and NSE Performance Criteria: Implications for Improving Hydrological Modelling. Journal of Hydrology, 377, 80-91.
http://dx.doi.org/10.1016/j.jhydrol.2009.08.003

Kling, H., Fuchs, M. and Paulin, M. (2012) Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios. Journal of Hydrology, 424-425, 264-277. doi:10.1016/j.jhydrol.2012.01.011

Knoben, W. J. M., Freer, J. E., and Woods, R. A.: Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores, Hydrol. Earth Syst. Sci., 23, 4323–4331, https://doi.org/10.5194/hess-23-4323-2019, 2019.