As a preliminary solution for the verification of the seasonal forecast products published by C3S (at https://climate.copernicus.eu/seasonal-forecasts), this article presents graphics for a few verification metrics relevant to ensemble predictions. These metrics and scores, consistent with the WMO guidance on forecast verification, describe statistics of forecast performance over a past period, giving an indication of the confidence that can be placed in the real-time outputs.
Seasonal forecasts are, inevitably, based on ensembles which sample and quantify the effect of uncertainties from a variety of sources (initial conditions, model formulation, model errors) on the expected outcomes. As most forecast products are expressed as probabilities - to preserve as much of the information from the ensemble simulation as possible - their evaluation requires a large sample of pairs of forecasts and observations. For long-range predictions - including seasonal - this is achieved by using hindcasts (also called retrospective forecasts or, simply, re-forecasts). Hindcasts are created with the same forecast model as the real-time forecasts, with initial conditions and ensembles generated in the same way, for start dates in a past period. They have the same length as the real-time forecasts and are created without any knowledge of future data, thus representing, as closely as possible, what forecasts produced with the same methodology would have been in the past. These can then be compared with observations to determine their quality in terms of a variety of attributes (e.g. accuracy, association, reliability, discrimination).
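To make this concrete, the minimal sketch below shows one common way of turning an ensemble hindcast into tercile-category probabilities and pairing them with the verifying observations. The array names, shapes and random data are illustrative assumptions for a single grid point and season, not the C3S production code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_members = 24, 25                        # e.g. hindcast period 1993-2016, 25 members (assumed)
hindcast = rng.normal(size=(n_years, n_members))   # ensemble hindcast values for one grid point and season
observed = rng.normal(size=n_years)                # verifying observations for the same years

# Tercile boundaries estimated from the hindcast climatology (all years and members pooled)
lower, upper = np.percentile(hindcast, [100 / 3, 200 / 3])

# Forecast probability of each category = fraction of ensemble members falling in it
p_below = (hindcast < lower).mean(axis=1)
p_above = (hindcast > upper).mean(axis=1)
p_normal = 1.0 - p_below - p_above

# Observed category for each year (0 = below normal, 1 = near normal, 2 = above normal)
obs_category = np.digitize(observed, [lower, upper])
```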
The scores presented here are for the operational products published by C3S (the definition of the product is identical in the verification and in the real-time forecast); the hindcast period used is 1993-2016. The scores/metrics used are the temporal correlation (Spearman rank correlation), the area under the relative (or 'receiver') operating characteristic (ROC) curve and the ranked probability score (RPS) for tercile categories, all defined as in the WMO guidance on forecast verification (https://library.wmo.int/doc_num.php?explnum_id=4886). Therefore, for correlation and ROC area, higher values are better; for RPS, the opposite is the case.
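For readers who want to compute these scores themselves, the following sketch (continuing from the arrays in the previous example, and using standard textbook definitions rather than the exact C3S implementation) shows one way to obtain the three metrics for a single grid point with common Python libraries.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import roc_auc_score

# 1) Temporal (Spearman rank) correlation between the ensemble mean and the observations
corr, _ = spearmanr(hindcast.mean(axis=1), observed)

# 2) ROC area for one binary event, here "above the upper tercile"
roc_area = roc_auc_score(obs_category == 2, p_above)

# 3) Ranked probability score for the three tercile categories: mean over the hindcast years
#    of the summed squared differences between cumulative forecast and observed probabilities
#    (some texts additionally divide by the number of categories minus one)
fcst_cum = np.cumsum(np.column_stack([p_below, p_normal, p_above]), axis=1)
obs_cum = np.cumsum(np.eye(3)[obs_category], axis=1)
rps = np.mean(np.sum((fcst_cum - obs_cum) ** 2, axis=1))

print(f"correlation={corr:.2f}  ROC area={roc_area:.2f}  RPS={rps:.2f}")
```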
The drop-down menu selections are self-explanatory (or easy to interpret alongside the information on the forecast systems in the relevant datasets' CDS documentation), with a possible exception: the 'C3S multi-system' options. For these, the year in brackets refers to the 'version' of the multi-system, indicating the year in which that particular choice of systems was used for the real-time forecasts. An explicit description of the relevant system versions is available in the CDS documentation on data availability.
These diagnostics will soon be deployed alongside the graphical products on the C3S seasonal forecast webpages. In addition, a Jupyter Notebook that allows users to reproduce these plots, and provides a starting point for creating similar ones for variables of their interest, can be found in the C3S Data Tutorials: https://ecmwf-projects.github.io/copernicus-training-c3s/sf-verification.html