This analysis evaluated the skill of the ERIC flash flood products against flash flood observations. The results are used to decide the criteria for issuing flash flood notifications. A new skill assessment was necessary for EFAS v5.0, rather than reusing the criteria from EFAS v4.1, because of the following changes:

  • The LISFLOOD hydrological model, which is used in generating the ERIC flash flood products, has been recalibrated.
  • The ERIC products are now calculated directly from the LISFLOOD surface runoff predictions. Previously, precipitation was combined with LISFLOOD predictions of soil moisture to estimate surface runoff.

The evaluation was performed for ERIC flash flood predictions between 1 January 2022 and 31 December 2022.

Evaluation Methodology

ERIC forecasts produced at 00 UTC on each day of the evaluation period were evaluated. Exceedance probability thresholds for the 2, 5 and 20 year return periods were tested, ranging from 0 to 100% in increments of 10%. The evaluation was performed separately for lead times of 0-24h, 24-48h, 48-72h, 72-96h and 96-120h.
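The set of threshold combinations described above can be enumerated with a short sketch. The constant and function names are hypothetical and are not taken from the EFAS code base.

```python
from itertools import product

# Return periods (years) whose exceedance probabilities were tested.
RETURN_PERIODS_YEARS = (2, 5, 20)
# Exceedance probability thresholds: 0 to 100% in increments of 10%.
PROBABILITY_THRESHOLDS_PCT = tuple(range(0, 101, 10))
# Lead-time windows in hours, as (start, end) pairs.
LEAD_TIME_WINDOWS_H = ((0, 24), (24, 48), (48, 72), (72, 96), (96, 120))


def threshold_grid():
    """Return every (return period, probability threshold, lead-time window) combination."""
    return list(product(RETURN_PERIODS_YEARS,
                        PROBABILITY_THRESHOLDS_PCT,
                        LEAD_TIME_WINDOWS_H))


grid = threshold_grid()
print(len(grid))  # 3 return periods * 11 thresholds * 5 windows = 165
```

Each element of the grid corresponds to one set of hits, misses and false alarms accumulated over the evaluation period.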

For each forecast, the reporting points were extracted and each combination of exceedance probability and lead time thresholds was applied. For each combination, the ERIC reporting points which satisfied the threshold criteria were extracted. For each extracted reporting point, the date of the forecasted flash flood event and the ID of the containing region in the EFAS administration regions layer were extracted. These were then compared with the observations to determine whether there was a corresponding observation in the same region on the same day (hit), no corresponding observation (false alarm), or an observation without a forecasted event (miss). This process was repeated for every forecast during the evaluation period, and the total number of hits, misses, false alarms and correct negatives (no event forecast and none observed) was calculated.
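The matching step can be sketched as a set comparison. This is a minimal illustration, not the code used in the evaluation. It assumes that forecast and observed events have already been reduced to (region ID, date) pairs and that several reporting points in the same region on the same day count as a single event. The function name and the region IDs in the usage example are hypothetical.

```python
from datetime import date


def count_outcomes(forecast_events, observed_events):
    """Count hits, misses and false alarms.

    Both arguments are iterables of (region_id, event_date) pairs.
    Assumption: duplicates within a region and day collapse to one event.
    """
    forecast = set(forecast_events)
    observed = set(observed_events)
    hits = len(forecast & observed)          # forecast and observed
    false_alarms = len(forecast - observed)  # forecast but not observed
    misses = len(observed - forecast)        # observed but not forecast
    return hits, misses, false_alarms


# Hypothetical region IDs and dates, for illustration only.
forecast = [(101, date(2022, 6, 1)), (101, date(2022, 6, 1)), (202, date(2022, 6, 2))]
observed = [(101, date(2022, 6, 1)), (303, date(2022, 6, 3))]
print(count_outcomes(forecast, observed))  # (1, 1, 1)
```

Correct negatives are not counted here because they do not enter the f(beta) score.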

The skill of the forecast for each unique combination of exceedance probability and lead time threshold was calculated from the corresponding hits, misses and false alarms using the f(beta) score, which combines the recall (also known as the hit rate) and the precision (equal to one minus the false alarm ratio) of the forecast. This score was also used by Jesus Casado from the JRC in their evaluation of the EFAS formal and informal notifications.

f(beta) = ((1 + beta^2) * Hits) / ((1 + beta^2) * Hits + beta^2 * Misses + False Alarms)

beta is a predefined parameter which by default is 1.0, giving equal weight to recall and precision. A sensitivity assessment was performed with different values of this parameter.
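As a minimal sketch, the standard F-beta definition can be computed from the contingency counts as follows. The function names are illustrative, and returning 0.0 when the score is undefined (no hits, misses or false alarms) is an assumption rather than something stated in the evaluation.

```python
def precision(hits, false_alarms):
    """Fraction of forecast events that were observed (1 - false alarm ratio)."""
    total = hits + false_alarms
    return hits / total if total else 0.0


def recall(hits, misses):
    """Fraction of observed events that were forecast (hit rate)."""
    total = hits + misses
    return hits / total if total else 0.0


def f_beta(hits, misses, false_alarms, beta=1.0):
    """F-beta score from contingency counts.

    Assumption: returns 0.0 when the denominator is zero.
    """
    b2 = beta ** 2
    denominator = (1 + b2) * hits + b2 * misses + false_alarms
    if denominator == 0:
        return 0.0
    return (1 + b2) * hits / denominator
```

The count-based form is algebraically equal to (1 + beta^2) * precision * recall / (beta^2 * precision + recall), so a beta below 1.0 moves the score towards precision and a beta above 1.0 moves it towards recall.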

Evaluation Results

The results below show the f(beta) score for the different exceedance probabilities of the 2, 5 and 20 year return periods. The highest scores occurred for a 30-40% exceedance probability of the 2 year return period at 0-24 and 24-48 hours lead time, and for a 10-20% exceedance probability of the 5 year return period at the same two lead times. The 20 year return period had lower f(beta) scores. This could be because fewer forecasts exceed this threshold, which results in a greater number of missed events.

[Figure panels: 2 year return period, 5 year return period, 20 year return period]

Sensitivity Analysis Beta Parameter

The beta parameter used in the f(beta) score controls the relative weight given to recall and precision. A beta value lower than 1.0 gives more weight to precision, which places a greater emphasis on reducing false alarms. Feedback from EFAS partners has suggested that the large number of false alarms received from the flash flood notifications is a concern for them. Therefore, a sensitivity analysis was conducted in which different beta values below 1.0 were used to compute the f(beta) score.

Results show that decreasing the beta parameter can move the optimum f(beta) score towards a higher exceedance probability value. This is most noticeable for the 24-48 hour lead time: the original optimum for the 2 year return period was at 30%, but with a beta value of 0.7 this moved to 40%. Likewise, for the 5 year return period the optimum moved from a 10% exceedance probability to 20% with a beta value of 0.7.
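The mechanism behind this shift can be illustrated with a small sketch. The counts below are invented purely to show the effect and are not results from the evaluation. The function names are hypothetical.

```python
def f_beta(hits, misses, false_alarms, beta=1.0):
    """F-beta score from contingency counts (0.0 if undefined, by assumption)."""
    b2 = beta ** 2
    denominator = (1 + b2) * hits + b2 * misses + false_alarms
    return (1 + b2) * hits / denominator if denominator else 0.0


def best_threshold(counts_by_threshold, beta):
    """Return the probability threshold with the highest f(beta) score."""
    return max(counts_by_threshold,
               key=lambda t: f_beta(*counts_by_threshold[t], beta=beta))


# Made-up (hits, misses, false alarms) per exceedance probability threshold (%).
# A higher threshold issues fewer forecasts: fewer hits, but far fewer false alarms.
ILLUSTRATIVE_COUNTS = {
    20: (22, 28, 70),
    30: (20, 30, 40),
    40: (16, 34, 26),
}

for beta in (1.0, 0.9, 0.8, 0.7):
    print(beta, best_threshold(ILLUSTRATIVE_COUNTS, beta))
```

With these invented counts the optimum is the 30% threshold at beta = 1.0 and the 40% threshold at beta = 0.7, because a lower beta penalises the false alarms of the less strict threshold more heavily than the extra misses of the stricter one.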


[Figure panels: 2 year return period and 5 year return period, for beta = 1.0, 0.9, 0.8 and 0.7]

Assessment of Hits, Misses and False Alarms

From the above results, the optimum f(beta) score is found for a 30-40% exceedance probability of the 2 year return period threshold when using a beta value of 0.7. However, it is important to analyse the number of hits, misses and false alarms associated with these results, in order to understand the consequences for the number of flash flood notifications issued to EFAS partners. Therefore, the number of hits, misses and false alarms was analysed for four cases:

  1. 30% exceedance probability of the 2 year return period threshold
  2. 40% exceedance probability of the 2 year return period threshold
  3. 20% exceedance probability of the 5 year return period threshold
  4. 30% exceedance probability of the 5 year return period threshold

Results show a large number of false alarms for both exceedance probabilities of the 2 year return period and for the 20% exceedance probability of the 5 year return period. The comparatively high f(beta) scores associated with these cases could be due to an increase in the number of hits and a reduction in the number of misses. However, the large number of false alarms would not be acceptable to EFAS partners. Results for the 30% exceedance probability of the 5 year return period show a much lower number of false alarms. This comes at the cost of fewer hits and more misses, but the f(beta) score above (when beta = 0.7) shows only a small reduction in overall skill. It is therefore recommended that the EFAS flash flood notification criteria be based on a 30% exceedance probability of the 5 year return period.

It should be noted that the large number of missed events in all four cases is likely due to the ERIC forecasts being unable to capture flash flood events associated with localised convective activity. However, missed events due to other causes are also likely.



[Figure panels: (a) 30% exceedance probability of 2 year return period, (b) 40% exceedance probability of 2 year return period, (c) 20% exceedance probability of 5 year return period, (d) 30% exceedance probability of 5 year return period]

Upstream Area Threshold

The maximum upstream area for which flash flood notifications were issued was 2000 km² in EFAS versions prior to 5.0. However, the increased resolution of the EFAS v5.0 hydrological forecasts means that this value could be reduced. Lower maximum upstream area values of 1000 km² and 500 km² were evaluated for different exceedance probabilities of the 5 year return period threshold.

[Figure panels: max upstream area 2000 km², max upstream area 1000 km², max upstream area 500 km²]

The above results show that reducing the maximum upstream area threshold to 1000 km² does not have a major impact on the skill, but a slight reduction in skill is observed when reducing the threshold to 500 km². It is therefore recommended to reduce the maximum upstream area threshold to 1000 km².
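Applying a maximum upstream area before scoring amounts to a simple filter on the reporting points. The sketch below assumes each reporting point is a dictionary with an "upstream_area_km2" field. That field name, the function name and the example values are hypothetical.

```python
def filter_by_upstream_area(reporting_points, max_upstream_area_km2):
    """Keep reporting points whose upstream area does not exceed the maximum."""
    return [
        point for point in reporting_points
        if point["upstream_area_km2"] <= max_upstream_area_km2
    ]


# Hypothetical reporting points, for illustration only.
points = [
    {"id": "A", "upstream_area_km2": 350.0},
    {"id": "B", "upstream_area_km2": 900.0},
    {"id": "C", "upstream_area_km2": 1800.0},
]

for max_area in (2000, 1000, 500):
    kept = filter_by_upstream_area(points, max_area)
    print(max_area, [p["id"] for p in kept])
```

The hits, misses and false alarms would then be recomputed from the retained points for each maximum area.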

Conclusions

Based on the results above, the recommended flash flood notification criteria for EFAS version 5 are:

  • 30% exceedance probability of the 5 year return period threshold
  • Upstream area <= 1000 km²
  • Lead time <=48 hours
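The three criteria can be expressed as a single check, sketched below. The class, function and parameter names are hypothetical. Treating the 30% threshold as inclusive (>=) is an assumption, since the text does not state whether the comparison is strict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotificationCriteria:
    """Recommended criteria; probabilities refer to the 5 year return period."""
    min_exceedance_probability_pct: float = 30.0
    max_upstream_area_km2: float = 1000.0
    max_lead_time_h: float = 48.0


def meets_criteria(exceedance_probability_5yr_pct, upstream_area_km2,
                   lead_time_h, criteria=NotificationCriteria()):
    """Return True if a reporting point satisfies all three criteria.

    Assumption: the probability threshold is inclusive (>=).
    """
    return (
        exceedance_probability_5yr_pct >= criteria.min_exceedance_probability_pct
        and upstream_area_km2 <= criteria.max_upstream_area_km2
        and lead_time_h <= criteria.max_lead_time_h
    )


print(meets_criteria(40.0, 800.0, 36.0))   # True
print(meets_criteria(40.0, 1500.0, 36.0))  # False: upstream area too large
```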