For EFAS v4.0, LISFLOOD model calibration was performed on 1137 stations from 215 different catchments across the pan-European EFAS domain, using a mix of 6-hourly (406) and daily (731) observed discharge data. Multiple calibration points were generally available in one catchment, often with a mix of 6-hourly and daily data and with calibration periods spanning different and often overlapping years.
At the end of the calibration, 14 parameter maps with pan-European EFAS extent were produced. These parameter maps were used to execute the long-term run (LTR), a continuous simulation with the model forced with observations, for the period January 1990 to December 2017 (1990 is excluded from the evaluation). Simulated 6-hourly discharge was then compared against observed discharge from the 1137 calibration stations. For calibration stations with daily observations, 6-hourly LISFLOOD time series were aggregated to daily steps to allow comparison with daily observed discharge data. Because the length of the observation records varies across the domain, especially for nested catchments, hydrological modelling performance is evaluated over all available discharge data rather than over the calibration and validation periods separately.
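The aggregation of 6-hourly simulated discharge to daily steps can be sketched as follows. This is a minimal illustration, not the EFAS processing code: the function name `aggregate_to_daily`, the grouping by calendar day of the timestamp, and the rule of dropping incomplete days are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_to_daily(timestamps, discharge, steps_per_day=4):
    """Average 6-hourly discharge values into daily means.

    Values are grouped by the calendar day of their timestamp. Days with
    fewer than `steps_per_day` values are dropped; both choices are
    assumptions of this sketch, not documented EFAS rules.
    """
    by_day = defaultdict(list)
    for ts, q in zip(timestamps, discharge):
        by_day[ts.date()].append(q)
    return {
        day: sum(values) / len(values)
        for day, values in sorted(by_day.items())
        if len(values) == steps_per_day
    }


# Made-up 6-hourly series: two complete days plus one extra step.
start = datetime(1991, 1, 1, 0)
timestamps = [start + timedelta(hours=6 * i) for i in range(9)]
discharge = [10.0, 12.0, 14.0, 16.0, 20.0, 20.0, 22.0, 22.0, 30.0]

daily = aggregate_to_daily(timestamps, discharge)
for day, q in daily.items():
    print(day, q)
```

The resulting daily means can then be paired with daily observed discharge before any score is computed.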
This page summarises EFAS v4.0 hydrological skill. Details of the method used are provided in EFAS hydrological model performance.
Overview of EFAS v4.0 hydrological performance
The hydrological performance of EFAS v4.0 is expressed by the modified Kling-Gupta Efficiency (KGE'). Figure 1 provides the cumulative distribution function of KGE' values and the KGE' distribution for the 1137 calibration stations. The cumulative distribution is provided for all calibration stations (black), 6-hourly stations only (blue) and daily stations only (red). The subdivision between 6-hourly (blue) and daily stations (red) is also shown in the KGE' distribution histogram. KGE' values show that ~50% of the stations have KGE'>0.75, with a slightly higher percentage for 6-hourly stations. In general, the KGE' cumulative distribution functions are very similar for 6-hourly and daily stations, with half of the calibration points having KGE' values between 0.7 and 0.9.
Figure 1 - EFAS v4.0 KGE' Cumulative distribution function and KGE' distribution for all stations (black), 6-hourly stations only (blue) and daily stations only (red)
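Statements such as "~50% of the stations have KGE'>0.75" are read off the empirical cumulative distribution of the per-station scores. The sketch below shows one way to compute such a distribution and the share of stations above a threshold. The helper names and the sample values are invented for illustration and are not EFAS results.

```python
def empirical_cdf(values):
    """Return (value, cumulative fraction) pairs for sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


def fraction_above(values, threshold):
    """Share of values strictly greater than the threshold."""
    return sum(v > threshold for v in values) / len(values)


# Made-up KGE' values for six hypothetical stations.
kge_values = [0.35, 0.62, 0.71, 0.78, 0.80, 0.91]

cdf = empirical_cdf(kge_values)
share = fraction_above(kge_values, 0.75)
print(f"{share:.0%} of stations have KGE' > 0.75")
```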
Figure 2 presents EFAS v4.0 results in terms of KGE' components that represent respectively: the linear correlation between observations and simulations (correlation), a bias term (mean bias) and a measure of the flow variability error (variability bias) (Knoben et al., 2019). Results are presented for all calibration stations (grey), 6-hourly stations only (blue) and daily stations only (red).
Figure 2 - EFAS v4.0 KGE' components distribution and cumulative distribution function for all stations (grey), 6-hourly stations only (blue) and daily stations only (red)
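For readers who want to see how the three components combine, the sketch below computes KGE' in its commonly used modified form: correlation r, a bias ratio of the means (here called `beta`) and a variability ratio of the coefficients of variation (here called `gamma`). The exact formulation used for EFAS is documented on the methods page. The names `kge_prime`, `beta` and `gamma` are assumptions of this example.

```python
import math


def kge_prime(sim, obs):
    """Return (KGE', r, beta, gamma) for paired simulated/observed series.

    r     : linear correlation between simulations and observations
    beta  : ratio of simulated to observed mean (mean bias)
    gamma : ratio of simulated to observed coefficient of variation
            (variability bias)
    """
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)
    beta = mean_s / mean_o
    gamma = (sd_s / mean_s) / (sd_o / mean_o)
    kge = 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
    return kge, r, beta, gamma


# Made-up example: the simulation doubles every observed value.
obs = [1.0, 2.0, 3.0, 4.0]
sim_scaled = [2.0 * q for q in obs]
kge, r, beta, gamma = kge_prime(sim_scaled, obs)
print(kge, r, beta, gamma)
```

In this example the timing and relative variability are perfect (r and gamma are 1 up to rounding), yet the doubled mean (beta of 2) alone brings KGE' down to about 0. This is why the components are worth inspecting separately.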
Spatial distribution of the hydrological performance across the EFAS domain
Figure 3 shows the spatial distribution of the EFAS v4.0 hydrological performance across the EFAS domain. KGE' values were grouped in 5 categories: KGE'<0.2 (yellow); 0.2<=KGE'<0.4 and 0.4<=KGE'<0.7 (pink); 0.7<=KGE'<0.9 and 0.9<=KGE'<1 (blue). KGE' is generally uniformly distributed across the EFAS pan-European domain, with higher performance (blue) in large parts of Central Europe and on the main European rivers, and lower performance (yellow) mostly concentrated in catchments with strongly regulated rivers, such as those of the Iberian Peninsula.
Figure 3 - Spatial distribution of EFAS v4.0 hydrological skill (KGE') across the EFAS domain for calibration stations. For each point, size of the dot represents area of the upstream catchment.
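The five map categories amount to a lookup against four class boundaries. A minimal sketch is given below. The names `BOUNDS`, `LABELS`, `COLOURS` and `kge_class` are hypothetical, and only the colour family of each class is taken from the text (the exact shades used in the figure are not specified).

```python
from bisect import bisect_right

# Lower bounds of classes 2 to 5, as listed in the text.
BOUNDS = [0.2, 0.4, 0.7, 0.9]
LABELS = [
    "KGE'<0.2",
    "0.2<=KGE'<0.4",
    "0.4<=KGE'<0.7",
    "0.7<=KGE'<0.9",
    "0.9<=KGE'<1",
]
COLOURS = ["yellow", "pink", "pink", "blue", "blue"]


def kge_class(kge):
    """Return (label, colour family) for a KGE' value.

    bisect_right makes each lower bound inclusive, matching the
    '<=' on the left side of each interval.
    """
    idx = bisect_right(BOUNDS, kge)
    return LABELS[idx], COLOURS[idx]


for value in (-1.5, 0.2, 0.69, 0.7, 0.93):
    print(value, kge_class(value))
```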
A low score in the evaluation of the LISFLOOD model calibration does not necessarily indicate reduced forecast performance of the European Flood Awareness System, because EFAS forecasts are compared against model-derived thresholds (Thielen et al., 2009; Bartholmes et al., 2009). This comparison removes the systematic bias that lowers the overall hydrological performance score. Correlation, however, is a desired quality in hydrological performance because it reflects the timing of flood peaks. Given the mathematical structure of KGE', all stations where KGE'>=0.7 have correlation>=0.7 (Gupta et al., 2009), but some stations where KGE'<0.7 can still have correlation>=0.7 when a large mean bias and/or variability bias is present. Calibration points with low KGE' but correlation>=0.7 will not decrease the forecast performance of the European Flood Awareness System, even if the forecast discharge exhibits a large bias.
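The bound on correlation follows directly from the form of the score. The short derivation below assumes the usual Euclidean-distance definition of KGE', with r the correlation, β the mean bias ratio and γ the variability bias ratio. The symbols β and γ are notation introduced here for illustration.

```latex
\begin{align}
  \mathrm{KGE}' &= 1 - \sqrt{(r-1)^2 + (\beta-1)^2 + (\gamma-1)^2} \\
  \sqrt{(r-1)^2 + (\beta-1)^2 + (\gamma-1)^2} &\ge |r-1| = 1 - r
      \qquad (\text{since } r \le 1) \\
  \Rightarrow\quad \mathrm{KGE}' &\le 1 - (1 - r) = r \\
  \Rightarrow\quad \mathrm{KGE}' \ge 0.7 &\implies r \ge 0.7
\end{align}
```

The converse does not hold. For example, r = 0.9, β = 1.5 and γ = 1 give KGE' = 1 - sqrt(0.01 + 0.25) ≈ 0.49, a station with good timing but a large mean bias.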
Figure 4 shows a combination of the spatial distribution of EFAS v4.0 KGE' and correlation. Stations with KGE'<0.7 and correlation>=0.7 are highlighted in cyan. Compared to Figure 3, a large number of calibration stations with KGE'<0.7 (pink and yellow) show correlation>=0.7 (cyan).
Figure 4 - Spatial distribution of the EFAS v4.0 hydrological performance (KGE') across the EFAS domain combined with correlation: stations with KGE'<0.7 and correlation>=0.7 are highlighted in cyan. For each point, size of the dot represents area of the upstream catchment.
Figures 5, 6 and 7 present the spatial distribution of EFAS v4.0 hydrological performance across the EFAS domain in terms of the KGE' components: correlation, mean bias and variability bias.
Figure 5 - Spatial distribution of EFAS v4.0 correlation across the EFAS domain. For each point, size of the dot represents area of the upstream catchment.
Figure 6 - Spatial distribution of EFAS v4.0 mean bias across the EFAS domain. For each point, size of the dot represents area of the upstream catchment.
Figure 7 - Spatial distribution of EFAS v4.0 variability bias across the EFAS domain. For each point, size of the dot represents area of the upstream catchment.
Comparison between EFAS v4.0 and EFAS v3.0
Overview of hydrological performance comparison
EFAS v4.0 is the first EFAS version using 6-hourly model computation steps and 6-hourly observed discharge data for the calibration of the LISFLOOD model. Many of the calibration stations used in previous LISFLOOD calibration exercises (EFAS v3.0) now provide both daily and sub-daily discharge data to EFAS. Whenever both daily and sub-daily data were available at one station, the sub-daily data were used for calibration, provided that at least 4 years of data were available and the data quality was comparable with that of the daily data. This means that many of the daily data used in the past were replaced by 6-hourly data. The 6-hourly data generally cover different periods (they are generally more recent) and often provide substantially fewer years for model calibration. In addition, some of the stations were relocated on the model network to pixels that represent the station more correctly, meaning that model outputs are no longer comparable between EFAS v4.0 and EFAS v3.0.
The different approach to the selection of calibration data, with priority to sub-daily data, the sometimes revised location of the stations on the model drainage network and the different model computation step make comparing different LISFLOOD model calibrations tricky. However, an attempt was made to show the improvements of the new 6-hourly calibration.
Out of the 1137 calibration stations, only 540 could be used for the comparison with the previous calibration. The LISFLOOD version from EFAS v3.0 (LISFLOOD ec_2.8.9) was run over the period 1990-2017 using daily forcings, and KGE' values were computed using observed daily discharge. In order to make a fair comparison, LISFLOOD outputs from the 6-hourly long-term run (LISFLOOD ec_2.12.6) over the period 1990-2017 were aggregated to daily steps, and KGE' values were computed using the same observed daily discharge. This slightly improved KGE' for EFAS v4.0, as correlation generally increases when outputs are aggregated to longer time steps, but it was considered a better option than running the EFAS v3.0 model version at 6-hourly steps. In addition, calibration stations from EFAS v3.0 are mainly located on larger rivers, where the benefits of the 6-hourly computation step and calibration are less dominant.
Figure 8 shows the KGE' cumulative distribution functions for the 540 shared stations: EFAS v4.0 in blue and EFAS v3.0 in black. The new calibration shows an increase in the percentage of stations with KGE' > 0.75 from 40% to 60%.
Figure 8 - KGE' Cumulative distribution function for EFAS v3.0 (black) and EFAS v4.0 (blue).
Figure 9 shows the comparison between EFAS v3.0 and EFAS v4.0 KGE' for the 540 shared stations. Most stations show an improvement in KGE', although some have lower KGE' values in EFAS v4.0.
Figure 9 - Comparison between EFAS v3.0 and EFAS v4.0 KGE' for the 540 shared stations. For each point, size of the dot represents area of the upstream catchment.
Hydrological performance comparison across the EFAS domain
Figure 10 presents the spatial distribution of the KGE' skill score between EFAS v4.0 and EFAS v3.0 (benchmark). Improvements are represented in blue, while substantially similar values (KGE' skill score within ±0.05) have no colour (white). The KGE' skill score is generally positive over the entire model domain, with only a few stations showing a KGE' skill score <-0.05.
Figure 10 - Spatial distribution of KGE' skill score between EFAS v4.0 and EFAS v3.0 (benchmark). For each point, size of the dot represents area of the upstream catchment.
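The skill score is not defined on this page. The sketch below assumes the usual skill-score form, in which the gain over the benchmark is scaled by the benchmark's distance from a perfect score of 1. The formula, the function names and the three class labels are assumptions of this example. The definition actually used is given on the methods page.

```python
def kge_skill_score(kge_new, kge_benchmark, perfect=1.0):
    """Skill of kge_new relative to kge_benchmark (assumed formulation).

    Positive values mean the new calibration is closer to a perfect
    score than the benchmark. Undefined when the benchmark equals
    `perfect`.
    """
    return (kge_new - kge_benchmark) / (perfect - kge_benchmark)


def classify_change(skill, tolerance=0.05):
    """Map a skill score to the three cases described in the text."""
    if skill > tolerance:
        return "improved"
    if skill < -tolerance:
        return "degraded"
    return "similar"


# Made-up values for one hypothetical station.
skill = kge_skill_score(kge_new=0.8, kge_benchmark=0.6)
print(round(skill, 3), classify_change(skill))
```

With this form, a station moving from KGE' = 0.6 to 0.8 closes half of the remaining gap to a perfect score, giving a skill score of about 0.5.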
References
Knoben, W. J. M., Freer, J. E., and Woods, R. A.: Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores, Hydrol. Earth Syst. Sci., 23, 4323–4331, https://doi.org/10.5194/hess-23-4323-2019, 2019.
Bartholmes, J. C., Thielen, J., Ramos, M. H., and Gentilini, S.: The european flood alert system EFAS – Part 2: Statistical skill assessment of probabilistic and deterministic operational forecasts, Hydrol. Earth Syst. Sci., 13, 141–153, https://doi.org/10.5194/hess-13-141-2009, 2009.