Contributors: Hendrik Boogaard (WAGENINGEN ENVIRONMENTAL RESEARCH), Allard de Wit (WAGENINGEN ENVIRONMENTAL RESEARCH), Jenny Lazebnik (WAGENINGEN ENVIRONMENTAL RESEARCH), Jonathan Schubert (METEOGROUP), Gerald van der Grijn (METEOGROUP)


History of Modifications

| Version | Date | Description of modification | Editor |
|---|---|---|---|
| 0.9 | 6 May 2019 | Full draft | Hendrik Boogaard |
| 1.0 | 18 May 2019 | Review and minor edits | Ronald Hutjes |
| 1.1 | Sept 2020 | Updated dataset temporal coverage due to dataset update | ECMWF |





Acronyms

| Acronym | Description or definition |
|---|---|
| AgERA5 | Daily surface meteorological data set for agronomic use, based on ERA5 |
| CDS | Climate Data Store (of ECMWF) |
| ECMWF | European Centre for Medium-Range Weather Forecasts |
| ERA5 | ECMWF Re-Analysis |
| HRES | High Resolution Forecast |
| JRC | Joint Research Centre of the European Commission |
| LT | Local Time |
| MARS | Monitoring Agricultural ResourceS |
| NN | Nearest Neighbor |

1. Scope of the document


This document provides an overview of the AgERA5 product, the underlying data sets, and the underlying algorithms and workflow. The AgERA5 dataset provides daily, agronomically relevant meteorological data for the period 1979 to present at a spatial resolution of 0.1°.

2. Executive summary

The AgERA5 dataset provides daily surface meteorological data for the period 1979 to present at a spatial resolution of 0.1°. The service is based on the fifth generation of ECMWF atmospheric re-analyses of the global climate, better known as ERA5. AgERA5 'connects' users in the agricultural domain to the new ERA5 data set. It includes daily aggregates of agronomically relevant variables, tuned to local day definitions and adapted to the finer topography, finer land use pattern and finer land-sea delineation of the ECMWF HRES operational model. The variables cover temperature, precipitation, snow depth, humidity, cloud cover and radiation.

3. Product description

The following text applies to AgERA5 version 1.0.

3.1. Introduction


Climate forcing data are used in analysis and agro-environmental modelling to study aspects of productivity and externalities of agriculture (e.g. Toreti et al., 2019; Glotter et al., 2016; De Wit et al., 2010). This service starts from the hourly ECMWF ERA5 model data and converts them into meaningful input for such analyses and models. A large amount of data needs to be processed. Acquisition and pre-processing of ERA5 data, both archive and near real-time (NRT) data, is a large and specialized job. It requires a heavy investment from users such as technical policymakers, information agencies, NGOs, commodity traders, agri-businesses and insurance providers. The complexity of the task and the required effort may even be a barrier to starting to use the data.

This service is based on the original hourly deterministic ECMWF ERA5 data, at surface level and available at a spatial resolution of 30 km (~0.28125°). Data were aggregated to daily time steps and corrected towards a finer topography at a 0.1° spatial resolution. Aggregated data at daily time steps follow a local time zone definition and include a number of major agronomic parameters. The correction to the 0.1° grid was realized by applying grid- and variable-specific regression equations to an ERA5 data set interpolated to the 0.1° grid. The equations were trained on operational ECMWF HRES model data at a 0.1° resolution. The final data set is referred to as AgERA5. AgERA5 will save potential users money and stimulate businesses to use such a high-quality data set. It also avoids a possible proliferation of different data sets originating from the basic hourly ERA5 data set.

3.2. Variable definitions


AgERA5 includes 22 agronomically relevant variables (see Table 3-1).

Table 3-1: List of variables in the AgERA5 data set

| Short name | Long name | Unit | Aggregation | AGROVOC URI |
|---|---|---|---|---|
| Cloud_Cover_Mean | Total cloud cover (00-00LT) | (0 - 1) | Mean | |
| Dew_Point_Temperature_2m_Mean | 2 meter dewpoint temperature (00-00LT) | K | Mean | |
| Precipitation_Flux | Total precipitation (00-00LT) | mm d-1 | Sum | |
| Precipitation_Rain_Duration_Fraction | Precipitation type duration - rain (00-00LT) | - | Count | |
| Precipitation_Solid_Duration_Fraction | Precipitation type duration - solid fraction (no hail) composed of: precipitation types freezing rain (3), snow (5), wet snow (6), mixture of rain and snow (7) and ice pellets (8) (00-00LT) | - | Count | |
| Relative_Humidity_2m_06h | Relative humidity at 06LT | % | - | |
| Relative_Humidity_2m_09h | Relative humidity at 09LT | % | - | |
| Relative_Humidity_2m_12h | Relative humidity at 12LT | % | - | |
| Relative_Humidity_2m_15h | Relative humidity at 15LT | % | - | |
| Relative_Humidity_2m_18h | Relative humidity at 18LT | % | - | |
| Snow_Thickness_LWE_Mean | Snow liquid water equivalent (00-00LT) | cm of liquid water equivalent | Mean | |
| Snow_Thickness_Mean | Snow depth (00-00LT) | cm snow | Mean | |
| Solar_Radiation_Flux | Surface solar radiation downwards (00-00LT) | J m-2 d-1 | Sum | |
| Temperature_Air_2m_Max_24h | Maximum air temperature at 2 meter (00-00LT) | K | Maximum | |
| Temperature_Air_2m_Max_Day_Time | Maximum air temperature at 2 meter (06-18LT) | K | Maximum | |
| Temperature_Air_2m_Mean_24h | 2 meter air temperature (00-00LT) | K | Mean | |
| Temperature_Air_2m_Mean_Day_Time | 2 meter air temperature (06-18LT) | K | Mean | |
| Temperature_Air_2m_Mean_Night_Time | 2 meter air temperature (18-06LT) | K | Mean | |
| Temperature_Air_2m_Min_24h | Minimum air temperature at 2 meter (00-00LT) | K | Minimum | |
| Temperature_Air_2m_Min_Night_Time | Minimum air temperature at 2 meter (18-06LT) | K | Minimum | |
| Vapour_Pressure_Mean | Vapour pressure (00-00LT) | hPa | Mean | |
| Wind_Speed_10m_Mean | 10 meter wind component (00-00LT) | m s-1 | Mean | |

3.3. Input data used

Logically, the ERA5 data set is the main input data set (see https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5). ERA5 provides hourly estimates of a large number of atmospheric, land and oceanic climate variables. The data cover the earth on a 30 km grid and resolve the atmosphere using 137 levels from the surface up to a height of 80 km. ERA5 includes information about uncertainties for all variables at reduced spatial and temporal resolutions.

Concerning the archive, the years 1979 to present were available during the project. Note that two versions of ERA5 are available through the CDS:

  • interpolated to a 0.25° grid
  • original ERA5 model level data (reanalysis-era5-complete)

The latter version was used in this project.

 
ERA5 has a wide list of variables. See the following link: ERA5: data documentation, especially the tables:

  • 2: surface, instantaneous (averages)
  • 3: surface, accumulations
  • 4: surface, minimum/maximum

The following table shows the variables used for the AgERA5 product.

Table 3-2: Essential variables used for the AgERA5 product

| Variable name | Unit | Short | Reference | Group |
|---|---|---|---|---|
| Snow density | kg m-3 | rsn | table 2 | INST1 |
| Snow depth | m of water equivalent | sd | table 2 | INST1 |
| 10 metre U wind component | m s-1 | u10 | table 2 | INST1 |
| 10 metre V wind component | m s-1 | v10 | table 2 | INST1 |
| Total cloud cover | (0 - 1) | tcc | table 2 | INST1 |
| 2 metre temperature | K | 2t | table 2 | INST1 |
| 2 metre dewpoint temperature | K | 2d | table 2 | INST1 |
| Surface solar radiation downwards | J m-2 | ssrd | table 3 | ACCMNMX |
| Total precipitation | m | tp | table 3 | ACCMNMX |
| Precipitation type | code table (4.201), see footnote 1 | ptype | table 2 | INST2 |
| Maximum temperature at 2 metres since previous post-processing (last hour) | K | mx2t | table 5 | ACCMNMX |
| Minimum temperature at 2 metres since previous post-processing (last hour) | K | mn2t | table 5 | ACCMNMX |

Data of the HRES model were needed as a training data set to derive the bias correction. HRES data is not part of the C3S catalogue and was accessed through the contract (C3S422Lot1WEnR).

3.4. Algorithms used

The workflow includes:

  • 0) Retrieving original hourly data of ERA5 from the CDS
  • 1) Nearest Neighbor interpolation to 0.1° grid (ECMWF HRES grid)
  • 2) Temporal aggregation and calculation of additional variables
  • 3) Apply location, variable and seasonal specific bias correction plus sea mask

The workflow is further described in Chapter 4 (workflow) and Chapter 5 (develop bias corrections).

1 The following types are distinguished: 0 = No precipitation, 1 = Rain, 3 = Freezing rain (i.e. super cooled), 5 = Snow, 6 = Wet snow, 7 = Mixture of rain and snow, 8 = Ice pellets 

4. Workflow

The AgERA5 workflow includes (see Figure 4-1):

  • 0) Retrieving original hourly data of ERA5 from the CDS
  • 1) Nearest Neighbor interpolation to 0.1° grid (ECMWF HRES grid)
  • 2) Temporal aggregation and calculation of additional variables
  • 3) Apply location, variable and seasonal specific bias correction plus sea mask

4.1. Step 0: Retrieving hourly data

The original ERA5 data (see footnote 2) are stored in the MARS archive and were retrieved via the CDS (version: reanalysis-era5-complete) and prepared for further processing (see also section 3.3). ERA5 is originally calculated in a T639 spectral space and on an N320 Gaussian grid (see footnote 3). This corresponds most closely to a 0.28125° grid, and therefore this grid definition was used in the download.
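As an illustration of such a retrieval, the sketch below builds a MARS-style request for one day of hourly surface analyses on the 0.28125° grid. It is not the project's actual download script: the helper name `build_era5_request`, the parameter code `167.128` (assumed to denote 2 metre temperature) and the output file name are assumptions, and the download call itself is left as a comment because it needs the `cdsapi` package and CDS credentials.

```python
def build_era5_request(date, param="167.128"):
    """Return a MARS-style request for one day of hourly ERA5 surface analyses.

    Hypothetical helper for illustration only.
    """
    return {
        "class": "ea",
        "expver": "1",
        "stream": "oper",            # ERA5-HRES stream (footnote 2)
        "type": "an",                # analyses (footnote 2)
        "levtype": "sfc",            # surface level
        "param": param,              # assumed code: 167.128 = 2 metre temperature
        "date": date,
        "time": "00/to/23/by/1",     # all 24 hourly fields
        "grid": "0.28125/0.28125",   # grid definition used in the download
    }


request = build_era5_request("2018-03-31")

# With cdsapi installed and configured, the download could look like this:
# import cdsapi
# cdsapi.Client().retrieve("reanalysis-era5-complete", request, "era5_2t.grib")
```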

4.2. Step 1: NN interpolation to 0.1° grid

Downloaded data were interpolated to a 0.1° grid which is close to the current HRES resolution. To preserve variability and extremes in the original data the Nearest Neighbor (NN) technique was applied.
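A minimal sketch of nearest-neighbour regridding on regular latitude/longitude axes is given below, assuming NumPy is available. The function names and the small regional grid are hypothetical, and the sketch ignores longitude wrap-around at the dateline. It shows why the technique preserves extremes: every fine grid cell receives an unmodified value from one coarse grid cell.

```python
import numpy as np


def nearest_index(src_coords, dst_coords):
    """For each target coordinate, return the index of the closest source coordinate."""
    src = np.asarray(src_coords, dtype=float)
    dst = np.asarray(dst_coords, dtype=float)
    return np.abs(dst[:, None] - src[None, :]).argmin(axis=1)


def nn_regrid(field, src_lat, src_lon, dst_lat, dst_lon):
    """Nearest-neighbour regridding of a 2-D field (lat x lon) on regular axes."""
    rows = nearest_index(src_lat, dst_lat)
    cols = nearest_index(src_lon, dst_lon)
    return field[np.ix_(rows, cols)]


# Small regional example: coarse 0.28125 degree grid to fine 0.1 degree grid
src_lat = 50.0 + 0.28125 * np.arange(5)
src_lon = 4.0 + 0.28125 * np.arange(5)
dst_lat = 50.0 + 0.1 * np.arange(11)
dst_lon = 4.0 + 0.1 * np.arange(11)

coarse = np.random.default_rng(0).normal(280.0, 5.0, size=(5, 5))  # synthetic field
fine = nn_regrid(coarse, src_lat, src_lon, dst_lat, dst_lon)
```

No averaging takes place, so the fine field contains only values that already occur in the coarse field.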

4.3. Step 2: Temporal aggregation and additional variables

Next, hourly data were aggregated into daily values by applying variable- and longitude-specific aggregation schemes. These schemes honor local time (LT), so that agronomically relevant weather variables such as maximum temperature over daytime and minimum temperature over nighttime could be computed. As a result, the data comply with local calendar day definitions and with the aggregation schemes used by NMIs (see footnote 4). Examples of such aggregation schemes, used to aggregate 3-hourly ERA-Interim data, can be found via the following URL: http://marswiki.jrc.ec.europa.eu/agri4castwiki/index.php/Meteorological_data_from_ECMWF_models.

In contrast to the study provided at the above URL, the number of longitudinal aggregation zones was increased from three to eight (see footnote 5). Each zone was assigned a longitude range for which a specific aggregation scheme was defined. See Appendix I for the zone definition and Appendix II for the aggregation schemes.

2 ERA5 pertains to ERA5-HRES (stream=oper) and the analyses (type=an)

3 https://confluence.ecmwf.int//display/CKB/ERA5+data+documentation#ERA5datadocumentation-Spatialgrid; https://confluence.ecmwf.int/display/CKB/ERA5%3A+What+is+the+spatial+reference

4 For example, JRC asked for a definition that is compatible with the ones used in the stations observations, for possible validation purposes. Furthermore, definitions (for daily averages) should roughly match a local calendar day or (for certain other elements) the corresponding day/night period, in all areas.

5 Using 24 zones was not possible because the HRES operational model data, required for training the bias correction, were not available at 1-hourly time steps.


Figure 4-1: Overview of the different processing steps in the whole workflow

An example: the ERA5 archive includes the maximum temperature of the previous hour. The 24 values of maximum temperature can be used to:

  • Derive the maximum temperature over daytime by taking the maximum of the 12 hourly maximum temperature values occurring during local daytime (e.g. London between 06 and 18 UTC).
  • Derive the maximum temperature over 24 hours by taking the maximum of the 24 hourly maximum temperature values occurring during the local day (e.g. London between 00 UTC day X and 00 UTC day X + 1).

Similar aggregation can be done for minimum temperature but then taking the minimum over a range of hourly values. Most other elements were aggregated as the mean or sum over 24 hours of the local day. To obtain the set of 24 hours for a certain zone, hourly data of ERA5 is needed of day X, and possibly day X – 1 or X + 1. The exact dataset depends obviously on the zone (longitude range).

In case of precipitation type (rain, snow) the aggregation to a daily time step can be done type specific, thus counting the hours that the type appeared.
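The aggregation described above can be sketched as follows, assuming NumPy. The sketch works on a 72-value hourly UTC series covering days X - 1, X and X + 1, and selects the hours of local day X from a zone offset in hours (local time = UTC + offset). The function name, the synthetic data and the simplification that each hourly value is assigned to the hour of its time stamp are assumptions for illustration; the schemes actually used are those of Appendix II.

```python
import numpy as np


def local_day_slice(utc_offset_hours, start_lt=0, end_lt=24):
    """Indices into a 72-value hourly UTC series (days X-1, X, X+1) that cover
    local hours [start_lt, end_lt) of day X. Index 0 is 00 UTC of day X-1."""
    first = 24 + start_lt - utc_offset_hours
    last = 24 + end_lt - utc_offset_hours
    return slice(first, last)


hours = np.arange(72)

# Synthetic hourly maximum temperature (K), diurnal cycle peaking at 14 UTC
tmax_hourly = 285.0 + 5.0 * np.cos(2 * np.pi * (hours % 24 - 14) / 24)

# Synthetic precipitation type codes (footnote 1): 0 = none, 1 = rain, 5 = snow
ptype_hourly = np.zeros(72, dtype=int)
ptype_hourly[30:34] = 1
ptype_hourly[34:36] = 5

offset = 0  # zone around zero longitude (London)
day = local_day_slice(offset)               # 00-00 LT
daytime = local_day_slice(offset, 6, 18)    # 06-18 LT

tmax_24h = tmax_hourly[day].max()
tmax_day_time = tmax_hourly[daytime].max()
rain_hours = int(np.count_nonzero(ptype_hourly[day] == 1))
solid_hours = int(np.isin(ptype_hourly[day], [3, 5, 6, 7, 8]).sum())
```

For a zone further east or west only `offset` changes, which is why data of day X - 1 or X + 1 may be needed to complete the local day.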

The applied aggregation zone definitions work very well with the local time zones of Western and Eastern Europe and mostly for the North American continent. For Asia there is a shift of 2-3 hours between the actual local time and the definition used here. The only extreme mismatch occurs eastward of the dateline in zone E4. Fortunately, the affected areas (Pacific islands and the far western coast of Alaska) are not particularly significant from an agricultural perspective.

The following conversions were done:

  • unit conversion of precipitation (tp): m d-1 -> mm d-1
  • unit conversion of snow (sd; liquid water equivalent): m -> cm

In addition, the following variables were calculated:

  • 10 m wind speed (m s-1) from the 10 m u (10u) and 10 m v (10v) wind components: sqrt(10u^2 + 10v^2)
  • snow depth (cm) from snow density (rsn) and snow depth in liquid water equivalent (sd): (sd / rsn) * 1000 * 100
  • partial water vapour pressure (hPa) from dewpoint temperature (d2m; Priestley and Taylor, 1972): 10 * 0.6108 * exp((17.27 * d2m) / (d2m + 237.3)), with d2m in °C
  • relative humidity (%) from 2 m temperature (t2m) and dewpoint temperature (d2m): 100 * (exp((17.27 * d2m) / (237.3 + d2m)) / exp((17.27 * t2m) / (237.3 + t2m))), with t2m and d2m in °C
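The conversions and derived variables listed above can be written as small functions, sketched here with NumPy. The function names are hypothetical. The constants 17.27 and 237.3 apply to temperatures in °C, so the sketch assumes that ERA5 temperatures in K are converted first.

```python
import numpy as np

KELVIN_OFFSET = 273.15


def precipitation_mm(tp_m):
    """Total precipitation: m d-1 -> mm d-1."""
    return tp_m * 1000.0


def snow_lwe_cm(sd_m):
    """Snow liquid water equivalent: m -> cm."""
    return sd_m * 100.0


def wind_speed(u10, v10):
    """10 m wind speed (m s-1) from the u and v components."""
    return np.hypot(u10, v10)


def snow_depth_cm(sd_m, rsn):
    """Snow depth (cm) from water equivalent sd (m) and snow density rsn (kg m-3)."""
    return sd_m / rsn * 1000.0 * 100.0


def vapour_pressure_hpa(td_celsius):
    """Partial water vapour pressure (hPa) from dewpoint temperature in degrees C."""
    return 10.0 * 0.6108 * np.exp(17.27 * td_celsius / (td_celsius + 237.3))


def relative_humidity(t_celsius, td_celsius):
    """Relative humidity (%) from temperature and dewpoint, both in degrees C."""
    return 100.0 * (np.exp(17.27 * td_celsius / (237.3 + td_celsius))
                    / np.exp(17.27 * t_celsius / (237.3 + t_celsius)))


# Example with ERA5-style inputs in K
t2m_k, d2m_k = 293.15, 283.15
rh = relative_humidity(t2m_k - KELVIN_OFFSET, d2m_k - KELVIN_OFFSET)
vp = vapour_pressure_hpa(d2m_k - KELVIN_OFFSET)
```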

The temporal aggregation and calculation of additional variables lead to the final list of variables presented in Table 3-1.

The variables in the dataset answer the needs of the most common crop models (see footnote 6), which work at a daily time step, and of their regional implementations, as well as the needs of users inventoried at the first stage of the project.

4.4. Step 3: Bias correction of data at 0.1° grid

A location, variable and season specific bias correction towards the HRES operational model was applied. This way the finer topography, finer land use pattern and finer land-sea delineation of the HRES operational model is more or less included in the downscaled ERA5. In fact, the ERA5 data set is tuned to the detailed topography of the HRES operational model also leading to more consistent time series between ERA5 and the HRES operational model.

For each grid cell and all variables, except precipitation and snow related variables, a linear equation is applied:

\[ Y_{i,j}^{ERA-5,corr} = \alpha_{i,j}Y_{i,j}^{ERA-5} + \beta_{i,j} + [T_{i,j}] \]

in which \( Y_{i,j}^{ERA-5} \) is the ERA5 NN-interpolated variable (e.g. temperature, wind) for grid box [i,j], \( Y_{i,j}^{ERA-5,corr} \) is the ERA5 NN-interpolated and bias corrected variable for grid box [i,j], and \( \alpha_{i,j} \), \( \beta_{i,j} \) are correction coefficients (hereinafter referred to as slope and intercept, respectively).

The parameter \( T_{i,j} \) accounts for an additional seasonal correction and reads:

\[ T_{i,j} = \gamma_{1,i,j}T_{1} + \gamma_{2,i,j}T_{2} + \gamma_{3,i,j}T_{3} + \gamma_{4,i,j}T_{4} \]

The correction towards the HRES operational model is very relevant for users that do near real-time monitoring of growing conditions and agricultural production. Note that the final ERA5 product will become available with a time lag of one week, including the temporary ERA5 line. For monitoring systems like JRC's Monitoring Agricultural ResourceS (MARS) such a time lag is too large, and therefore data in such systems have to be completed with data from the HRES operational model. When combining two datasets that originate from different resolutions, biases might be introduced that negatively affect the monitoring performance. This can be avoided by correcting ERA5 towards the HRES operational model. Similar reasoning applies to forecast products like the ENS forecasts (15/30 day ensemble forecasts). These products can also be downscaled and bias corrected towards the HRES operational model. This way more or less consistent time series are obtained, linking reanalysis, HRES and ENS data around a common 'HRES' reference. Some remarks:

  • To improve the timeliness of the foreseen service the preliminary ERA5 product, ERA5t, needs also to be processed. We hereby assume that the bias correction algorithms, which are based on ERA5 data, can also be applied on ERA5t data.
  • Specifically for users that need to link ERA5 to HRES for NRT monitoring purposes the following issue is relevant. The merge with the HRES operational model would need an additional service relying on specific data contracts with ECMWF. And the HRES operational model data must be processed in a similar way (daily aggregation, possibly elevation corrections etc.) as the ERA5 data.
  • Note that the HRES model is constantly improving (improved model physics, increased spatial resolution etc.). Therefore, with each additional HRES model upgrade, the established statistical relationship between ERA5 and HRES will become less valid. Over time, this may lead to jumps in the time series as the bias correction is correcting for aspects that changed in the HRES model. In such case users, that link ERA5 to HRES, need to be warned and eventually the bias correction needs to be updated.

6 CGMS-WOFOST, EPIC-BOKU, EPIC-IIASA, EPIC-TAMU, GEPIC, LPJ-GUESS, LPJmL, pAPSIM, pDSSAT, PEGASUS, PEPIC, PRYSBI2 

During the processing only the 'land' locations at the surface level (topographical elevation) were maintained using the HRES land-sea mask. This mask includes the area fraction of land within each 0.1° grid cell. As threshold, the fraction 0.05 has been selected: above it is land, below it is sea (see Figure 4-2).
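A minimal sketch of this land selection, assuming NumPy, is given below. The function name is hypothetical, and treating a cell with a land fraction of exactly 0.05 as sea is an assumption, since the text only specifies "above" and "below".

```python
import numpy as np

LAND_THRESHOLD = 0.05  # area fraction of land within a 0.1 degree grid cell


def mask_sea(field, land_fraction, threshold=LAND_THRESHOLD):
    """Keep values on land cells (fraction above threshold), set sea cells to NaN."""
    return np.where(land_fraction > threshold, field, np.nan)


land_fraction = np.array([0.00, 0.04, 0.06, 1.00])
temperature = np.array([285.0, 286.0, 287.0, 288.0])  # synthetic values (K)
masked = mask_sea(temperature, land_fraction)
```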

Figure 4-2: Selection of land 0.1° grid cells: the area fraction of land within a 0.1° grid cell (top) and the selection of land grid cells after applying the threshold of 0.05 area fraction

5. Develop bias corrections

Step 3, as described in the previous chapter, covers the bias correction towards the HRES grid (0.1° grid). The grid and variable-specific regression equations are trained on operational ECMWF HRES model data.

The approach, to develop the equations, consists of the following main steps:

  1. Interpolate the data towards a 0.1° grid (see step 1)
  2. Aggregate hourly model data to daily variables (see step 2)
  3. Train statistical correction equations for each variable and grid point
  4. Apply the trained equations to the ERA5 data set

The development of the equations (using the HRES operational model as training set) is a one-off action and has been documented in a separate document named “C3S422Lot1.WEnR.DS2_Downscaling and bias correction v1.7.pdf”. This section provides a summary of this work.

The input data:

  1. ECMWF ERA5 reanalysis (grid1: 0.28125° x 0.28125°)
  2. ECMWF HRES (grid2: 0.10° x 0.10°)

Both data sets are covering the globe, including land and sea grid boxes.

Originally, ERA5 data is available as hourly fields, while HRES has a temporal resolution of 3 hours. For both models, a set of 12 base parameters (see Table 3-2) was retrieved from the ECMWF MARS archive covering a period of two years. These base parameters with 1-hourly/3-hourly resolution were then aggregated to 22 (derived) daily parameters over 8 different longitudinal bands (see section 4.3; note that schemes given in Annex II only apply to ERA5, the schemes for HRES-data are available on request). Note that the ERA5 data was first interpolated towards the 0.1° grid using the NN-technique (see section 4.2) before applying the aggregation to days.

To train the regression equations, a data set of 2-3 years is desired. Both ERA5 and HRES need to be available for this period. Based on the recent HRES model upgrades outlined in the separate report, the period between 2016-04-01 and 2018-03-31 was chosen as the training period for the final bias correction equations. Most importantly, this period does not include any horizontal grid or resolution changes. Also, data of both models were available through ECMWF's MARS archive at the moment the bias correction analysis took place. As a consequence, the generated equations correct ERA5 data towards a mixture of HRES model cycles (41r2, 43r1 and 43r3).

The equations were derived by means of multiple linear regression.

Not all daily aggregated elements (see Table 3-1) are suitable for correction by this method. For instance, for most parts of the world the snow parameters have too few snow cases to build robust correction statistics. Similar issues are expected for the precipitation parameters (sum and type) in arid regions.

The MOS (Model Output Statistics) routine was used to carry out a multiple linear regression between the ECMWF HRES data and the NN-interpolated ERA5 data for each grid cell. The outcome is a linear equation (in this case demonstrated for the ERA5 data set):

\[ Y_{i,j}^{ERA-5,corr} = \alpha_{i,j}Y_{i,j}^{ERA-5} + \beta_{i,j} + [T_{i,j}] \]

in which \( Y_{i,j}^{ERA-5} \) is the ERA5 NN-interpolated variable (e.g. temperature, wind) for grid box [i,j], \( Y_{i,j}^{ERA-5,corr} \) is the ERA5 NN-interpolated and bias corrected variable for grid box [i,j], and \( \alpha_{i,j} \), \( \beta_{i,j} \) are correction coefficients (hereinafter referred to as slope and intercept, respectively).

The parameter \( T_{i,j} \) accounts for an additional seasonal correction and reads:

\[ T_{i,j} = \gamma_{1,i,j}T_{1} + \gamma_{2,i,j}T_{2} + \gamma_{3,i,j}T_{3} + \gamma_{4,i,j}T_{4} \]

in which \( T_{1} \) to \( T_{4} \) are sinusoidal time functions with a period of one year, and \( \gamma_{1,i,j} \) to \( \gamma_{4,i,j} \) are the respective coefficients. The sinusoidal time functions that were used read:

\[ T_{1} = 100\sin \left(2\pi \frac{day-21}{365} \right) \]

\[ T_{2} = 100\sin \left(2\pi \frac{day-81}{365} \right) \]

\[ T_{3} = 100\sin \left(2\pi \frac{day-111}{365} \right) \]

\[ T_{4} = 100\sin \left(2\pi \frac{day-141}{365} \right) \]

With the combination of the above sine functions and coefficients, any grid-specific time correction function can be constructed. To achieve this, it is enough to use only the 2 best sinusoidal time functions of the 4 available for each grid point in the final equation.
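For a single grid cell, the regression and its application can be sketched as below with NumPy. This is not the MOS routine itself: the function names are hypothetical, the sinusoids are assumed to use phase shifts of 21, 81, 111 and 141 days in order, and selecting the "2 best" functions as the pair with the smallest sum of squared residuals is an assumption.

```python
import itertools

import numpy as np

PHASES = (21, 81, 111, 141)  # assumed phase shifts (days) of T1..T4


def seasonal_terms(day_of_year):
    """Return an (n, 4) array with T1..T4 evaluated for each day of the year."""
    day = np.asarray(day_of_year, dtype=float)
    return np.stack(
        [100.0 * np.sin(2 * np.pi * (day - p) / 365.0) for p in PHASES], axis=1
    )


def fit_correction(era5, hres, day_of_year):
    """Fit hres ~ alpha * era5 + beta + two seasonal terms by least squares.

    Tries every pair of seasonal functions and keeps the pair with the
    smallest sum of squared residuals (assumed selection criterion).
    """
    T = seasonal_terms(day_of_year)
    best = None
    for pair in itertools.combinations(range(4), 2):
        X = np.column_stack([era5, np.ones_like(era5), T[:, list(pair)]])
        coef, *_ = np.linalg.lstsq(X, hres, rcond=None)
        sse = float(np.sum((X @ coef - hres) ** 2))
        if best is None or sse < best[0]:
            best = (sse, pair, coef)
    _, pair, coef = best
    gamma = np.zeros(4)            # unused seasonal functions keep coefficient 0
    gamma[list(pair)] = coef[2:]
    return float(coef[0]), float(coef[1]), gamma


def apply_correction(era5, day_of_year, alpha, beta, gamma):
    """Evaluate Y_corr = alpha * Y + beta + sum_k gamma_k * T_k."""
    return alpha * era5 + beta + seasonal_terms(day_of_year) @ gamma


# Synthetic two-year training set for one grid cell
rng = np.random.default_rng(42)
day = np.tile(np.arange(1, 366), 2)
era5 = 280.0 + rng.normal(0.0, 5.0, day.size)
T = seasonal_terms(day)
hres = 0.9 * era5 + 30.0 + 0.01 * T[:, 0] + 0.02 * T[:, 2]

alpha, beta, gamma = fit_correction(era5, hres, day)
corrected = apply_correction(era5, day, alpha, beta, gamma)
```

Because all four functions share the same one-year period and differ only in phase, any two of them can represent an annual sine wave of arbitrary amplitude and phase, which is consistent with the remark that two functions per grid point are enough.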

The objects created by the bias correction application are twofold. The trained regression equation of a particular parameter was written to a NetCDF file, having the slope, the intercept and each of the seasonal cycle coefficients stored as a normal NetCDF parameter. The evaluation metrics were handled similarly. For analysis purposes the MAE, RMSE and R-squared were calculated and stored in a second NetCDF file.
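The evaluation metrics can be sketched as follows, assuming NumPy and taking HRES as the reference. The function names are hypothetical. R-squared is assumed here to be the coefficient of determination, and the relative MAE improvement reported in Table 5-1 is assumed to be the reduction of the MAE relative to the MAE of the uncorrected ERA5 data.

```python
import numpy as np


def mae(reference, estimate):
    """Mean absolute error."""
    return float(np.mean(np.abs(reference - estimate)))


def rmse(reference, estimate):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((reference - estimate) ** 2)))


def r_squared(reference, estimate):
    """Coefficient of determination (assumed definition of R-squared)."""
    ss_res = np.sum((reference - estimate) ** 2)
    ss_tot = np.sum((reference - np.mean(reference)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def mae_improvement(reference, uncorrected, corrected):
    """Relative MAE reduction in percent (assumed definition)."""
    mae_raw = mae(reference, uncorrected)
    return 100.0 * (mae_raw - mae(reference, corrected)) / mae_raw


hres = np.array([280.0, 282.0, 284.0, 286.0])          # synthetic reference
era5_raw = hres + 1.0                                   # uncorrected, MAE = 1.0
era5_corr = hres + np.array([0.5, -0.5, 0.5, -0.5])     # corrected, MAE = 0.5

improvement = mae_improvement(hres, era5_raw, era5_corr)
```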

A detailed analysis of the significance of the bias correction can be found in document “C3S422Lot1.WEnR.DS2_Downscaling and bias correction v1.7.pdf”.

Table 5-1 summarizes how the ERA5 improves (in terms of MAE for the main elements) after applying the bias correction.

Table 5-1: MAE (HRES-ERA5corrected) and MAE improvement of different bias corrected variables. The MAE improvements indicate the added value through the bias correction. All metrics were calculated for different regions and for subsets of grid points meeting certain conditions. E.g. “Land & above 800m” only uses grid points being located on land and above 800m. “Coasts & Lakes” subsets all grid points with a land fraction between 10% and 90%.


| Variable | Region | Land: MAE | Land: MAE Impr | Land & below 800m: MAE | Land & below 800m: MAE Impr | Land & above 800m: MAE | Land & above 800m: MAE Impr | Coasts & Lakes: MAE | Coasts & Lakes: MAE Impr |
|---|---|---|---|---|---|---|---|---|---|
| 2t_davg [K] | Africa | 0.44 | 40% | 0.42 | 36% | 0.47 | 48% | 0.36 | 50% |
| 2t_davg | Asia | 0.72 | 36% | 0.67 | 27% | 0.86 | 48% | 0.66 | 32% |
| 2t_davg | Australia | 0.43 | 42% | 0.43 | 35% | 0.37 | 83% | 0.30 | 49% |
| 2t_davg | Europe | 0.51 | 36% | 0.47 | 30% | 0.75 | 55% | 0.45 | 38% |
| 2t_davg | N-America | 0.71 | 31% | 0.67 | 25% | 0.85 | 41% | 0.68 | 28% |
| 2t_davg | S-America | 0.45 | 50% | 0.42 | 41% | 0.61 | 65% | 0.38 | 48% |
| 2d_davg [K] | Africa | 0.76 | 38% | 0.77 | 38% | 0.76 | 39% | 0.55 | 46% |
| 2d_davg | Asia | 0.90 | 29% | 0.81 | 25% | 1.09 | 35% | 0.73 | 28% |
| 2d_davg | Australia | 0.57 | 34% | 0.57 | 28% | 0.43 | 78% | 0.36 | 43% |
| 2d_davg | Europe | 0.58 | 28% | 0.55 | 22% | 0.81 | 46% | 0.54 | 27% |
| 2d_davg | N-America | 0.80 | 23% | 0.73 | 18% | 0.97 | 32% | 0.70 | 21% |
| 2d_davg | S-America | 0.54 | 42% | 0.44 | 37% | 0.99 | 50% | 0.41 | 40% |
| ff_davg [m/s] | Africa | 0.27 | 25% | 0.26 | 22% | 0.28 | 32% | 0.33 | 47% |
| ff_davg | Asia | 0.29 | 28% | 0.27 | 24% | 0.34 | 35% | 0.36 | 35% |
| ff_davg | Australia | 0.24 | 31% | 0.25 | 30% | 0.22 | 41% | 0.31 | 53% |
| ff_davg | Europe | 0.25 | 31% | 0.24 | 31% | 0.32 | 33% | 0.33 | 48% |
| ff_davg | N-America | 0.29 | 28% | 0.28 | 26% | 0.33 | 31% | 0.33 | 34% |
| ff_davg | S-America | 0.23 | 30% | 0.22 | 26% | 0.27 | 42% | 0.32 | 51% |
| tcc_davg [0-1] | Africa | 0.08 | 3% | 0.08 | 2% | 0.08 | 4% | 0.08 | 5% |
| tcc_davg | Asia | 0.07 | 0% | 0.07 | -2% | 0.08 | 4% | 0.08 | -2% |
| tcc_davg | Australia | 0.06 | -1% | 0.06 | -1% | 0.06 | 5% | 0.07 | 2% |
| tcc_davg | Europe | 0.07 | -1% | 0.07 | -1% | 0.07 | 2% | 0.07 | -1% |
| tcc_davg | N-America | 0.08 | 0% | 0.08 | -1% | 0.07 | 2% | 0.08 | -1% |
| tcc_davg | S-America | 0.07 | 4% | 0.07 | 3% | 0.07 | 8% | 0.07 | 5% |
| ssrd_dsumdiff [J/m2d] | Africa | 1055575 | 7% | 1030480 | 7% | 1118699 | 8% | 1151300 | 13% |
| ssrd_dsumdiff | Asia | 872717 | 4% | 836249 | 3% | 958997 | 7% | 899084 | 5% |
| ssrd_dsumdiff | Australia | 1205911 | 6% | 1177253 | 6% | 1772895 | 14% | 1497494 | 12% |
| ssrd_dsumdiff | Europe | 832226 | 2% | 815116 | 2% | 951428 | 5% | 782759 | 4% |
| ssrd_dsumdiff | N-America | 899054 | 4% | 902781 | 3% | 888809 | 6% | 916596 | 4% |
| ssrd_dsumdiff | S-America | 1427243 | 9% | 1448626 | 9% | 1328043 | 13% | 1316248 | 11% |

The MAE indicates the error of the corrected data (HRES-ERA5corrected), while the MAE improvement compares the error of the corrected with that of the uncorrected ERA5 data. All metrics were aggregated for different regions and certain subsets of grid points. Overall, the temperature, humidity and wind speed variables benefit most from the correction. The MAE is reduced by 30% to 60% in the majority of cases. Grid points located in mountainous areas or along coasts and lakes are improved most. This is not surprising, as these are the areas where the largest systematic differences between ERA5 and HRES can be expected. Not only are the relative improvements quite large, the absolute MAE values after the correction are also small. For example, the MAE for the 24h mean of the 2m temperature (2t_davg) over land is at most 0.72 K for all continents, and at most 0.51 K for 4 of 6 continents.

For the solar radiation flux (ssrd_dsumdiff) the MAE improvement is solid and ranges between 2% and 14%, depending on the region and subset. The results for the 24h mean cloud cover (tcc_davg) are mixed. For most grid points the correction does not add any value. The MAE improvement for the majority of all grid points (land and below 800m) is between -2% and +4%, and therefore near zero. Only for grid points above 800m is there a small but clear improvement (2% - 8%).
The following conclusions were drawn from the evaluation study:

  1. The selected bias correction method has its largest benefits in mountainous areas, at coast lines and at lakes.
  2. Seasonal correction on top of the simple bias correction further improves the accuracy of the derived correction equations.
  3. The approach works remarkably well for 3 out of the 4 groups of variables. The averaged relative reduction of MAE is between 30% and 60%. These are:
    1. Temperature parameters
    2. Humidity parameters
    3. Wind speed
  4. The correction models for solar radiation flux reach a MAE improvement of 2% to 14%.
  5. For cloud cover the correction has only a minor effect for most of the grid points. However, mountainous regions still benefit from the correction with a MAE improvement of 2%-8%.

6. Appendix I Longitudinal aggregation zones

Longitudinal aggregation zones are defined around central longitudes. The first zone is centered on zero longitude (London) and stretches from 22.5° west to 22.5° east. The next zone is centered on 45° east and stretches from 22.5° east to 67.5° east, and so on. This definition works very well with the local time zone configuration of Western and Eastern Europe and mostly with the American continent. For Asia there will be a shift of 2-3 hours between the real local time and our definition. The only extreme mismatch of the local time definitions occurs eastward of the dateline in zone E4. Fortunately, the affected areas (islands in the Pacific and the far western coast of Alaska) are of limited interest from an agricultural perspective.
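The zone definition can be expressed as a small lookup, sketched below in plain Python. Only the zone name E4 appears in this document; the other names, the function names, the assignment of a boundary longitude to the zone on its eastern side, and the offset of one hour per 15° of central longitude are assumptions for illustration.

```python
# Eight zones of 45 degrees width, centred on 0, 45E, 90E, 135E, 180, 135W, 90W, 45W.
# Only "E4" is named in the text; the other names are assumed.
ZONE_NAMES = ("0", "E1", "E2", "E3", "E4", "W3", "W2", "W1")


def zone_index(lon_deg):
    """Index (0..7) of the aggregation zone containing a longitude in degrees east."""
    lon = lon_deg % 360.0
    return int(((lon + 22.5) % 360.0) // 45.0)


def zone_utc_offset_hours(lon_deg):
    """Assumed local time offset of the zone: central longitude divided by 15."""
    centre = zone_index(lon_deg) * 45.0
    if centre > 180.0:
        centre -= 360.0
    return int(centre / 15.0)


london = ZONE_NAMES[zone_index(0.0)]
west_alaska = ZONE_NAMES[zone_index(-165.0)]
```

In this sketch the far western coast of Alaska (around 165° west) falls in zone E4 together with longitudes just west of the dateline, which illustrates the mismatch mentioned above.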

7. Appendix II Aggregation schemes

Some remarks:

  • An "hour box" in the top row always represents the hour on the left border of the box
  • Variables 2t, 2d, ff, tcc, sd, rsn, vp, ptype and rh are all instantaneous values. To align with HRES (only available with a 3-hour time step) the period 03-00 has been selected: 8 values are aggregated, i.e. 03, 06, 09, 12, 15, 18, 21 and 00
  • Variables mn2t, mx2t, ssrd and tp summarize the conditions over 1 hour (sum, min, max, type)
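As a sketch of the second remark, the daily mean of an instantaneous variable can be taken over the 8 values at 03, 06, ..., 21 and 00 local time, where the final 00 is assumed to be the start of the next day. The function name and the input layout (25 hourly values from 00 LT to 00 LT of the next day) are assumptions.

```python
import numpy as np


def mean_of_3hourly(hourly_local):
    """Mean of the 8 values at 03, 06, ..., 21 LT and 00 LT of the next day.

    hourly_local: 25 instantaneous values at 00, 01, ..., 24 LT.
    """
    values = np.asarray(hourly_local, dtype=float)
    if values.size != 25:
        raise ValueError("expected 25 hourly values (00 LT to 00 LT next day)")
    return float(values[3::3].mean())


hourly = np.arange(25, dtype=float)   # synthetic series: value equals the local hour
daily_mean = mean_of_3hourly(hourly)
```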

8. References

Toreti, A. Maiorano, G. De Sanctis, H. Webber, A.C. Ruane, D. Fumagalli, A. Ceglar, S. Niemeyer, Zampieri, 2019. Using reanalysis in crop monitoring and forecasting systems. Agricultural Systems, Volume 168, pp. 144-15.

M.J. Glotter, A.C. Ruane, E.J. Moyer, J.W. Elliott, 2016. Evaluating the sensitivity of agricultural model performance to different climate inputs. Appl. Meteorol. Climatol., 55, pp. 579-594.

Wit, A.J.W. de, Baruth, B., Boogaard, H., Diepen, K. van, Kraalingen, D.W.G. van, Micale, F., Roller, J.A. te, Supit, I., Wijngaart, R. van der, 2010. Using ERA-INTERIM for regional crop yield forecasting in Europe. Climate Research 44 (2010)1. - ISSN 0936-577X - p. 41 - 53.

https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5

https://software.ecmwf.int/wiki/display/CKB/ERA5+data+documentation

http://marswiki.jrc.ec.europa.eu/agri4castwiki/index.php/Meteorological_data_from_ECMWF_models

https://confluence.ecmwf.int/display/CKB/ERA5%3A+What+is+the+spatial+reference


This document has been produced in the context of the Copernicus Climate Change Service (C3S).

The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.

The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which is merely representing the author's view.
