CEMS-Flood Post-processing

The aim of the CEMS-Flood post-processing methodology is to adjust the CEMS-Flood medium-range ensemble forecasts at specific locations so that they become predictors of future observed river discharge values. Because generating the product requires near real-time observed river discharge time series, post-processing is applied only to the EFAS system.

The CEMS-Flood post-processing methodology combines two post-processing techniques: the Model Conditional Processor (MCP; Todini, 2008) and the Ensemble Model Output Statistics (EMOS; Gneiting et al., 2005) method. The post-processed forecast is represented by a probability distribution that depends on recent observations, the simulation forced by observations (also known as the CEMS hydrological reanalysis), and forecasts. The output of this process is the 'CEMS Post-processed Hydrograph' (formerly the 'Real-time Hydrograph'), which is available in the pop-out windows of the Reporting Point layer for static reporting points where near real-time and past river discharge observations are available. Since EFAS version 4.5, the post-processing has been performed at 6-hourly time steps, where possible.

In CEMS-Flood, the post-processing is composed of two parts: the calibration (offline) and the forecast update (online).

Calibration (offline)

The offline calibration of the post-processing is performed twice a year to include the most recent observations.

Data for the offline calibration

The offline calibration requires at least 2 years of river discharge observations (although longer time series are preferable) and the simulation forced by meteorological observations for the same period. Where possible, 6-hourly observations and simulations are used, as this allows the forecasts to be post-processed at a 6-hourly time step in the forecast update part; otherwise, daily observations are used and the simulation is aggregated to a daily time step. For each station, the simulation comes from the most recent LISFLOOD historical run (available at https://cds.climate.copernicus.eu/cdsapp#!/dataset/efas-historical?tab=overview).
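As an illustration of the aggregation mentioned above, the following minimal sketch averages a 6-hourly simulated series into daily values. The function name aggregate_to_daily, the use of a simple mean, and the choice to drop an incomplete final day are assumptions made for this example; the operational aggregation may differ.

```python
from statistics import fmean

def aggregate_to_daily(six_hourly, steps_per_day=4):
    """Average consecutive groups of 6-hourly values into daily means.

    Assumption: trailing values that do not fill a whole day are dropped.
    """
    n_days = len(six_hourly) // steps_per_day
    return [
        fmean(six_hourly[d * steps_per_day:(d + 1) * steps_per_day])
        for d in range(n_days)
    ]

# Hypothetical 6-hourly simulated discharge (m3/s): two full days plus one extra step
sim_6h = [10.0, 12.0, 14.0, 16.0, 20.0, 20.0, 22.0, 22.0, 30.0]
daily = aggregate_to_daily(sim_6h)
print(daily)
```

The final value (30.0) is ignored here because it does not belong to a complete day.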

In the case of EFAS, all observations are provided by the EFAS Data Providers. More information on how to provide meteorological and hydrological data to EFAS is available on the EFAS website. New stations are added during the next scheduled offline calibration process.

The offline procedure has two main objectives:

1) Estimation of separate river discharge distributions for the observed and simulated river discharge values.

This estimation is performed by fitting a Generalised Pareto distribution to the extreme river discharge values and applying a kernel density estimation procedure for the remainder of the distribution (see Figure 1).

Figure 1: An example of the estimated river discharge distribution for a station from the offline calibration. Orange shows the part estimated by the Generalised Pareto distribution. Purple shows the main part of the distribution. Small black lines show the individual river discharge values. Modified from Matthews et al., 2022.
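The idea of combining a kernel density estimate for the bulk of the distribution with a Generalised Pareto tail can be sketched as follows. This is not the operational implementation: the 90% tail threshold, the normal reference bandwidth rule, the method-of-moments Pareto fit, and all function names are assumptions chosen to keep the example short and dependent only on the Python standard library.

```python
import math
import random
from statistics import NormalDist, fmean, stdev, variance

def fit_gpd_moments(exceedances):
    """Method-of-moments estimate of the GPD shape and scale (a swapped-in, simple estimator)."""
    m = fmean(exceedances)
    ratio = m * m / variance(exceedances)
    return 0.5 * (1.0 - ratio), 0.5 * m * (1.0 + ratio)

def gpd_cdf(y, shape, scale):
    """CDF of the Generalised Pareto distribution for an exceedance y."""
    if y <= 0.0:
        return 0.0
    if abs(shape) < 1e-12:
        return 1.0 - math.exp(-y / scale)
    base = 1.0 + shape * y / scale
    if base <= 0.0:  # beyond the upper end point, which exists when shape < 0
        return 1.0
    return 1.0 - base ** (-1.0 / shape)

def make_discharge_cdf(values, tail_quantile=0.9):
    """Return (cdf, threshold): Gaussian KDE below the threshold, GPD tail above it."""
    data = sorted(values)
    n = len(data)
    threshold = data[int(tail_quantile * n)]
    shape, scale = fit_gpd_moments([v - threshold for v in data if v > threshold])
    bandwidth = 1.06 * stdev(data) * n ** -0.2  # normal reference rule of thumb
    kernels = [NormalDist(v, bandwidth) for v in data]

    def kde_cdf(x):
        return fmean([k.cdf(x) for k in kernels])

    p_threshold = kde_cdf(threshold)

    def cdf(x):
        if x <= threshold:
            return kde_cdf(x)
        # Rescale the tail so that the two parts join continuously at the threshold
        return p_threshold + (1.0 - p_threshold) * gpd_cdf(x - threshold, shape, scale)

    return cdf, threshold

random.seed(1)
discharge = [random.lognormvariate(3.0, 0.6) for _ in range(500)]  # synthetic data
cdf, threshold = make_discharge_cdf(discharge)
print(round(threshold, 1), round(cdf(threshold), 3), round(cdf(2.0 * threshold), 3))
```

Rescaling the tail by the probability mass above the threshold keeps the combined distribution function continuous and non-decreasing.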

2) Estimation of a joint probability distribution of observations and simulations across multiple timesteps.

The joint probability distribution describes the relationship between observed and simulated values at different times over a 55-day period. Figure 2 shows an example of a joint distribution between 2 variables: a simulated variable (model) and an observed variable (reality). The joint distribution defined in the offline calibration is between 440 variables for the 6-hourly stations (55 days × 4 time steps per day × 2 series) and 110 variables for the daily stations (55 days × 2 series). The joint distribution allows a first estimate to be made of future river discharge observations given the observations and simulation from the past 40 days.

Figure 2: Representation of the joint probability distribution of observations and simulations. From Biondi and Todini (2018).
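For the two-variable case illustrated in Figure 2, conditioning a joint distribution on a known value can be sketched with a bivariate normal distribution. The assumption that both variables are jointly Gaussian (for example after a transformation to normal space), the parameter values, and the function name are illustrative only; the operational joint distribution has many more variables.

```python
import math
from statistics import NormalDist

def condition_bivariate_normal(mu_obs, mu_sim, sd_obs, sd_sim, rho, sim_value):
    """Distribution of the observed variable given a known simulated value.

    Assumes (obs, sim) are jointly normal with correlation rho.
    """
    cond_mean = mu_obs + rho * sd_obs / sd_sim * (sim_value - mu_sim)
    cond_sd = sd_obs * math.sqrt(1.0 - rho ** 2)
    return NormalDist(cond_mean, cond_sd)

# Hypothetical standardised variables with a correlation of 0.8
predictive = condition_bivariate_normal(0.0, 0.0, 1.0, 1.0, 0.8, sim_value=1.5)
print(predictive.mean, predictive.stdev)
```

The stronger the correlation between the two variables, the narrower the conditional distribution becomes relative to the unconditional one.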

The distributions defined in the offline calibration are used in the forecast update part of the post-processing method. The length of the observation record and the quality of the observations can affect the accuracy of the distributions. Since EFAS version 5.0, the calibration period and the maximum and minimum values observed during that period are provided in a table within the EFAS Reporting Point pop-out window.

Forecast Update (online)

The online part of the post-processing method is performed for each station where the offline calibration was successful and near real-time river discharge data are available. Currently, over 1600 stations are post-processed in EFAS.

Data for forecast update step

The forecast update step requires the observations, simulation forced with observations (water balance), and CEMS-Flood ensemble forecasts for the past 40 days (although some leniency is given for missing values). It also requires the current CEMS-Flood ensemble forecast (i.e., the forecast that is being post-processed). The distributions defined in the offline calibration are also required. Where possible, 6-hourly observations are used and daily observations are used otherwise. However, the offline calibration and the real-time post-processing must use the same time step.

In the case of EFAS, observations are extracted each day from the EFAS hydrological database (maintained by the CEMS Hydrological Data Collection Centre (HYDRO)) at approximately 07:00 UTC for the 00 EFAS cycle and at approximately 21:00 UTC for the 12 EFAS cycle. Therefore, any near real-time observations received after these times will only be included in the following EFAS post-processed forecasts.
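The effect of these extraction times can be sketched with a small helper that returns the first forecast cycle able to use an observation, given the time it was received. The function name is hypothetical and the cut-off times are treated as exact here, whereas the text above gives them only as approximate.

```python
from datetime import datetime, time, timedelta, timezone

CUTOFF_00 = time(7, 0)   # approximate extraction time for the 00 cycle (UTC)
CUTOFF_12 = time(21, 0)  # approximate extraction time for the 12 cycle (UTC)

def first_cycle_including(received_utc):
    """Return (date, cycle) of the first EFAS cycle that can use the observation."""
    day = received_utc.date()
    clock = received_utc.time()
    if clock <= CUTOFF_00:
        return day, "00"
    if clock <= CUTOFF_12:
        return day, "12"
    # Received after the evening extraction: wait for the next day's 00 cycle
    return day + timedelta(days=1), "00"

received = datetime(2024, 3, 1, 22, 30, tzinfo=timezone.utc)
print(first_cycle_including(received))
```

In this example an observation received at 22:30 UTC misses the 12 cycle and is first used by the 00 cycle of the following day.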

The forecast update step is further split into four steps:

1) The joint probability distribution defined in the offline calibration is used to condition the river discharge distributions defined in the offline calibration on recent river discharge observations (see Fig. 3a). Using the recent river discharge values in this way restricts the forecast probability distribution to the values that are likely given the recent state of the river. This is the MCP portion of the method, and it is used to correct errors and uncertainty due to the hydrological model and initial conditions.
2) The current CEMS-Flood ensemble forecast is spread-corrected (see Fig. 3b). This is done by calculating the average spread correction parameter needed to match the spread of the CEMS-Flood ensemble forecasts with the root mean square error of the ensemble mean for the past 40 days. This is the EMOS portion of the method, and it is used to correct errors and uncertainty due to meteorological forcings.
3) Steps 1 and 2 result in two probability distributions, which are combined using a Kalman filter (see Fig. 3). The Kalman filter weights the two distributions from steps 1 and 2 according to their uncertainty (or spread). This creates a probability distribution that is consistent with recent observations and influenced by the predicted meteorological forcings.
4) The CEMS Post-processed Hydrograph product is created (see below).
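The spread correction in the EMOS part can be sketched as follows: a single factor is estimated so that the average ensemble spread over the recent past matches the root mean square error of the ensemble mean, and the current ensemble is then stretched or shrunk about its mean. The exact formulation used operationally is not given here, so the ratio-based factor, the function names, and the data are assumptions.

```python
import math
from statistics import fmean, stdev

def spread_correction_factor(past_ensembles, past_observations):
    """Ratio of the RMSE of the ensemble mean to the average ensemble spread."""
    squared_errors = []
    spreads = []
    for members, observed in zip(past_ensembles, past_observations):
        squared_errors.append((fmean(members) - observed) ** 2)
        spreads.append(stdev(members))
    rmse = math.sqrt(fmean(squared_errors))
    return rmse / fmean(spreads)

def apply_spread_correction(members, factor):
    """Scale each member's deviation from the ensemble mean by the factor."""
    centre = fmean(members)
    return [centre + factor * (m - centre) for m in members]

# Hypothetical past forecasts (three members each) and matching observations
past_ensembles = [[9.0, 10.0, 11.0], [19.0, 20.0, 21.0], [29.0, 30.0, 31.0]]
past_observations = [12.0, 18.0, 32.0]
factor = spread_correction_factor(past_ensembles, past_observations)
corrected = apply_spread_correction([14.0, 15.0, 16.0], factor)
print(factor, corrected)
```

A factor greater than 1 indicates that the raw ensemble was under-dispersive over the recent period, so the corrected ensemble is wider than the original while keeping the same mean.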

Figure 3: The forecast update part of the post-processing method uses a) the MCP method and b) the EMOS method. The outputs from these two methods are combined using the Kalman filter to produce the CEMS Post-processed Hydrograph.
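For a single time step, the uncertainty weighting performed by a Kalman filter update can be sketched with two Gaussian distributions. Treating the MCP output as the prior and the EMOS output as the measurement, as well as the numerical values, are assumptions for illustration; the operational method works across many time steps at once.

```python
from statistics import NormalDist

def kalman_combine(prior, measurement):
    """Scalar Kalman update: weight two Gaussian estimates by their variances."""
    gain = prior.variance / (prior.variance + measurement.variance)
    mean = prior.mean + gain * (measurement.mean - prior.mean)
    combined_variance = (1.0 - gain) * prior.variance
    return NormalDist(mean, combined_variance ** 0.5)

mcp_estimate = NormalDist(100.0, 20.0)   # hypothetical MCP distribution (m3/s)
emos_estimate = NormalDist(130.0, 10.0)  # hypothetical EMOS distribution (m3/s)
combined = kalman_combine(mcp_estimate, emos_estimate)
print(round(combined.mean, 2), round(combined.stdev, 2))
```

The combined mean lies closer to the estimate with the smaller spread, and the combined spread is smaller than either input spread. In this scalar case the result is the same whichever distribution is treated as the prior.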

CEMS Post-processed Hydrograph

CEMS-Flood post-processed forecasts are available for stations with at least 2 years of river discharge data and that provide near real-time river discharge observations to the CEMS Hydrological Data Collection Centre (HYDRO). The post-processed forecast is shown by the 'CEMS Post-processed Hydrograph' product (previously known as the real-time hydrograph) shown in the pop-up window of the 'Reporting Points' layer. Stations for which the Post-processed hydrograph is available are represented by light blue points in the Reporting Points layer.

The main panel shows the probability distribution (blue shading) for each time step of the forecast and the recent observations (black dots). Darker blues show values closer to the forecast median. In addition to the hydrograph, the probabilities of exceeding up to 6 thresholds are shown as boxplots. The two panels on the right show the probability of exceeding the mean annual maximum flow (MHQ; top) and the mean flow (MQ; bottom) thresholds, respectively. These thresholds are calculated from observed river discharge values from the calibration period. The four lower panels show the probability of exceeding up to four thresholds provided to HYDRO by the EFAS hydrological data providers.
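The MQ and MHQ thresholds and the exceedance probabilities can be illustrated with a minimal sketch. Grouping the observations by calendar year, representing the forecast distribution at one time step by a set of sample values, and all names and numbers are assumptions for this example.

```python
from statistics import fmean

def mean_flow(obs_by_year):
    """MQ: mean of all observed discharge values in the calibration period."""
    return fmean([q for values in obs_by_year.values() for q in values])

def mean_annual_maximum(obs_by_year):
    """MHQ: mean of the largest observed value in each year."""
    return fmean([max(values) for values in obs_by_year.values()])

def exceedance_probability(samples, threshold):
    """Fraction of forecast samples strictly above the threshold."""
    return sum(s > threshold for s in samples) / len(samples)

# Hypothetical observations (m3/s) grouped by year
obs_by_year = {2020: [10.0, 20.0, 60.0], 2021: [30.0, 40.0, 80.0]}
mq = mean_flow(obs_by_year)
mhq = mean_annual_maximum(obs_by_year)

# Hypothetical samples from the post-processed distribution at one time step
samples = [35.0, 45.0, 55.0, 65.0, 75.0, 85.0, 95.0, 50.0, 60.0, 72.0]
p_mq = exceedance_probability(samples, mq)
p_mhq = exceedance_probability(samples, mhq)
print(mq, mhq, p_mq, p_mhq)
```

Repeating the exceedance calculation for every time step of the forecast gives one probability per time step for each threshold.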

Thresholds for EFAS

Since EFAS version 5, thresholds from data providers have been included in the EFAS Post-processed Hydrograph, following a suggestion from EFAS partners during a workshop at the 17th EFAS Annual Meeting. Up to four river discharge thresholds can be provided; they are named in increasing order from TL1(D) to TL4(D). To provide thresholds for a station, please contact HYDRO. More details on how to provide hydrological data (including thresholds) to EFAS are available on the EFAS-IS.

As thresholds are not provided for all stations, the MQ and MHQ thresholds are still calculated and shown in the EFAS Post-processed Hydrograph. However, all thresholds are provided in a table shown in the pop-up window of the 'Reporting Points' layer to give greater context for the MQ and MHQ, which may differ from those calculated locally because of different calibration periods. All thresholds are updated during the offline calibration process.

Two examples are shown below, for stations Gaulfoss, Gaula in Norway (ID 1099) at 6-hourly timesteps, and Sevlievo, Rositsa in Bulgaria (ID 582) at daily timesteps.

Figure 4: Post-processed hydrographs for stations Gaulfoss (left) and Sevlievo (right).

Note: The post-processed forecasts have the tendency to slightly underestimate peaks, particularly in catchments with quick hydrological response times. We are investigating improvements to the method.

Known Issues

The CEMS-Flood post-processing method is highly dependent on both the past and near real-time observed and simulated river discharge values. Issues can arise for a forecast cycle if either the observed or simulated river discharge values are much higher than those previously recorded, or if too few near real-time observations are available at the time the forecast is created.

If the ensemble forecast predicts river discharge values higher than those recorded in the CEMS-Flood historical river discharge simulation, the Post-processed hydrograph will show as:

If the observations in the 3 days before the forecast time are larger than those recorded in the observed record made available to CEMS-Flood, the Post-processed hydrograph will show as:

If too few near real-time river discharge observations are made available to CEMS-Flood, the Post-processed hydrograph will show as:

Additionally, erroneous river discharge observations in the offline calibration can cause severe errors in the CEMS-Flood post-processed forecasts. We aim to remove all erroneous observations, but it is possible that some are missed. While we are continuously trying to improve our quality control procedures, users are encouraged to provide feedback if they identify a station that shows large errors, and we will investigate the issue.

References

Biondi, D., & Todini, E. (2018). Comparing hydrological postprocessors including ensemble predictions into full predictive probability distribution of streamflow. Water Resources Research. https://doi.org/10.1029/2017WR022432

Gneiting, T., Raftery, A. E., Westveld, A. H., & Goldman, T. (2005). Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Monthly Weather Review, 133(5), 1098-1118.

Matthews, G., Barnard, C., Cloke, H., Dance, S. L., Jurlina, T., Mazzetti, C., & Prudhomme, C. (2022). Evaluating the impact of post-processing medium-range ensemble streamflow forecasts from the European Flood Awareness System. Hydrology and Earth System Sciences, 26(11), 2939-2968.

Todini, E. (2008). A model conditional processor to assess predictive uncertainty in flood forecasting. International Journal of River Basin Management, 6(2), 123-137.
