The data file consists of score values and their metadata in ASCII format.
Every score value is described by a full set of key attributes, such as its parameter, station id, month and step. The attributes describing one score value at one station are organised into a record, so each record corresponds to exactly one score value. A record is a collection of key=value pairs separated by commas, and it spans one line. If a key is not given in the current record, its value is inherited from the previous record; the exception is the score value key v, which has to be present in every record.
A record has the following format:
centre=centre, model=model_id, d=yyyymm, t=time, s=forecast_step, st=station_id, lat=latitude, lon=longitude, lam=model_grid_latitude, lom=model_grid_longitude, se=station_elevation, me=model_orography_elevation, par=parameter, sc=score, th=event_thresholds, n=sample_size, v=score_value
If the value of any key is unknown, it is encoded as "na". However, every record has to have a valid score_value (the "v" key); if the score value is not known, the record should not be included in the data file.
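The inheritance rule and the "na" encoding can be illustrated with a small reader. This is a minimal sketch, not a reference implementation: the function name `parse_records` is hypothetical, values are kept as strings, and "na" is mapped to `None` by assumption.

```python
def parse_records(lines):
    """Yield one dict per record, applying key inheritance.

    Assumptions (not part of the format description): values are kept
    as strings and the unknown marker "na" is returned as None.
    """
    current = {}  # key values carried over from the previous records
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        fields = {}
        for pair in line.split(","):
            key, sep, value = pair.partition("=")
            if not sep:
                raise ValueError(f"malformed key=value pair: {pair!r}")
            value = value.strip()
            fields[key.strip()] = None if value == "na" else value
        # v is never inherited and must hold a known value
        if fields.get("v") is None:
            raise ValueError(f"record without a valid v: {line!r}")
        current.update(fields)  # keys not given keep their previous value
        yield dict(current)
```

Each yielded dictionary carries the full set of keys seen so far, so a short record such as `t=3,s=3,v=59.92` comes back with the centre, station and parameter of the preceding record.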
For information on how these reports are assembled at ECMWF, please refer to the ECMWF implementation notes.
Example
centre=ecmf,model=hr_0001,d=201602,t=0,s=0,st=97146,lat=-4.1,lon=122.43,lam=-4.147,lom=122.484,se=50,me=163,par=tcc,sc=ct,th=2/6,v=0/0/0/0/0/7/0/0/21
t=3,s=3,v=0/0/0/0/0/4/0/0/24
t=6,s=6,v=0/0/0/0/0/2/0/0/26
t=12,s=12,v=0/0/0/0/0/6/0/0/22
t=15,s=15,v=0/0/0/0/0/3/0/0/25
t=18,s=18,v=0/0/0/0/0/4/0/0/24
t=0,s=24,v=0/0/0/0/0/5/0/0/23
s=0,sc=mae,th=na,v=60.92
t=3,s=3,v=59.92
t=6,s=6,v=62.01
t=12,s=12,v=60.59
t=15,s=15,v=59.81
t=18,s=18,v=62.08
t=0,s=24,v=65.01
s=0,sc=me,v=-60.92
t=3,s=3,v=-59.92
t=6,s=6,v=-62.01
t=9,s=9,n=26,v=-66.37
...
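A writer can exploit the inheritance rule by emitting only the keys whose values changed since the previous record, which is what keeps the example above compact. The following is a minimal sketch under stated assumptions: `KEY_ORDER`, `format_field` and `write_records` are hypothetical names, records are passed as fully specified dictionaries with `None` for unknown values, and score values are numbers (or lists of numbers for contingency tables).

```python
# Key order taken from the record format given above.
KEY_ORDER = ["centre", "model", "d", "t", "s", "st", "lat", "lon", "lam",
             "lom", "se", "me", "par", "sc", "th", "n", "v"]


def format_field(key, value):
    """Return the text written after 'key=' for one value."""
    if value is None:
        return "na"  # unknown values are encoded as "na"
    values = value if isinstance(value, (list, tuple)) else [value]
    if key == "v":
        # printf-style "g" format, as recommended for score values
        return "/".join("%g" % x for x in values)
    return "/".join(str(x) for x in values)


def write_records(records):
    """Turn fully specified records (dicts) into compact text lines."""
    previous = {}  # text last written for each key
    lines = []
    for record in records:
        if record.get("v") is None:
            continue  # a record without a known score value is left out
        pairs = []
        for key in KEY_ORDER:
            text = format_field(key, record.get(key))
            # v is always written; other keys only when they change
            if key == "v" or previous.get(key) != text:
                pairs.append(f"{key}={text}")
                previous[key] = text
        lines.append(",".join(pairs))
    return lines
```

Because an omitted key means "same as before", a value that becomes unknown has to be written explicitly as `na` (as with `th=na` in the example); the sketch handles this by comparing the formatted text rather than the raw value.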
Values of keys
- centre (4-character string) is the WMO identifier of the originating centre (ammc, cwao, ecmf, edzw, egrr, kwbc, lfpw, rjtd, rksl, rums etc);
- model_id (a string not containing a comma or vertical bar) is a free model identifier assigned by the originating centre (to distinguish between potentially different models provided by the centre);
- yyyymm is the month of the mean, where yyyy is the year and mm is the month (01-12);
- time is the validity time (in hours UTC) of the forecasts verified;
- forecast_step is the length of the forecast (in hours);
- station_id (a number) is the WMO ID of the observation station verifying the forecasts;
- latitude is the latitude of the observation station verifying the forecasts;
- longitude is the longitude of the observation station verifying the forecasts;
- model_grid_latitude is the latitude of the model grid point used to extract the forecast at the observation location (*);
- model_grid_longitude is the longitude of the model grid point used to extract the forecast at the observation location (*);
- station_elevation is the elevation of the observation station above the mean sea level in meters;
- model_orography_elevation is the elevation of the model orography at the observation location (*);
- parameter is the verified model output parameter:
parameter | name | units |
---|---|---|
t2m | air temperature at 2 meters above the model orography | K |
td2m | dewpoint at 2 meters above the model orography | K |
rh2m | relative humidity at 2 meters above the model orography | % |
tp06 | total precipitation accumulated over previous 6 hours | mm |
tp24 | total precipitation accumulated over previous 24 hours | mm |
ff10m | speed of wind at 10 meters above the model orography | m/s |
dd10m | direction of wind at 10 meters above the model orography | deg |
tcc | total cloud cover | okta (rounded to nearest okta for contingency table) |
mslp | mean-sea-level pressure | hPa |
z500hPa | geopotential height of 500hPa level | m |
t850hPa | air temperature at 850hPa level | K |
w850hPa | (vector) wind speed at 850hPa level | m/s |
r700hPa | relative humidity at 700hPa level | % |
- score is the name of the verification score or statistic:
score | description |
---|---|
me | mean error (bias) |
mae | mean absolute error |
rmse | root mean square error |
ct | contingency table values; the rank of the contingency table is defined by the number of values in the key event_thresholds |
NB. Please note how those contingency tables are constructed. Following the cell lettering used in the definition of the contingency tables (A, B, C, D for the 2x2 table, and so on), the values in v should be given in this order:
contingency table | order of values in v |
---|---|
2x2 | v=C/A/D/B |
3x3 | v=G/D/A/H/E/B/I/F/C |
4x4 | v=M/I/E/A/N/J/F/B/O/K/G/C/P/L/H/D |
- event_thresholds is the threshold value or values defining the events for contingency tables; the number of values in event_thresholds defines the rank of the contingency table; multiple values are separated by a forward slash (/)
event_threshold | description |
---|---|
15 | threshold for a 2x2 contingency table, e.g. if par=ff10m this is an event of 10m wind speed |
5/10/15 | thresholds for a 4x4 contingency table for 10m wind speed |
- sample_size is the number of observations used to compute the monthly mean at the given station;
- score_value is the value or values of the mean score computed from the forecasts valid at time UTC, verifying in the month yyyymm, for the forecast length of forecast_step hours; for an n x n contingency table these are the n² values delimited by forward slashes (see the score table above); if possible, the value should be printed using the printf format specifier "g" (or equivalent).
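The relation between event_thresholds and the contingency table values can be checked mechanically: k thresholds give a (k+1) x (k+1) table, so v must hold (k+1)² values. The sketch below also reproduces the cell orders listed in the score table (v=C/A/D/B and so on). It assumes that the letters label the table cells row by row starting at the top-left cell; under that assumption the listed orders correspond to reading the table column by column, from the bottom row to the top row. That layout is inferred from the listed orders, not stated in the text, and which axis holds forecasts or observations is not specified here. The names `cell_order` and `decode_table` are hypothetical.

```python
import string


def cell_order(size):
    """Cell letters in the order used in v, for a size x size table.

    Assumption: letters label the cells row by row from the top-left.
    The order then runs column by column, bottom row to top row.
    """
    letters = string.ascii_uppercase[: size * size]
    rows = [letters[r * size:(r + 1) * size] for r in range(size)]
    return [rows[r][c] for c in range(size) for r in reversed(range(size))]


def decode_table(th, v):
    """Map cell letters to values for a ct record with keys th and v."""
    size = len(th.split("/")) + 1  # k thresholds -> (k+1) x (k+1) table
    values = [float(x) for x in v.split("/")]
    if len(values) != size * size:
        raise ValueError(f"expected {size * size} values, got {len(values)}")
    return dict(zip(cell_order(size), values))
```

For the first record of the example (th=2/6, nine values), `decode_table` returns a 3x3 table keyed by the letters A to I.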
Remarks
(*) If the model grid or orography changed during the reported month (due to a model upgrade etc) the values of lam, lom and me should be those of the latest model run.