Contributors: Hendrik Boogaard (WAGENINGEN ENVIRONMENTAL RESEARCH), Gerald van der Grijn (METEOGROUP), Jonathan Schubert (METEOGROUP)
1. Introduction
This report gives an overview of the statistical downscaling of daily aggregated ERA5 reanalysis data towards a 0.1° grid. This downscaling is realized by applying grid and parameter-specific regression equations to an interpolated ERA5 data set. The equations are trained on operational ECMWF HRES model data.
The approach consists of the following main steps:
- Interpolate the data towards a 0.1° grid
- Aggregate hourly model data to daily parameters
- Train statistical correction equations for each parameter and grid point
- Apply the trained equations to the ERA5 data set
The development of the bias correction equations is a one-off effort. These trained equations will be used later to downscale and bias correct the operational ERA5 data.
2. Input data
The following models are used as input data:
- ECMWF ERA5 reanalysis (grid1: 0.28125° x 0.28125°)
- ECMWF HRES (grid2: 0.10° x 0.10°)
Both data sets cover the globe, including land and sea grid boxes.
Originally, ERA5 data is available as hourly fields, while HRES has a temporal resolution of 3 hours. For both models a set of 12 base parameters has been retrieved from the ECMWF MARS archive covering a period of two years (see section 2.2). These base parameters with 1-hourly/3-hourly resolution have then been aggregated to 22 (derived) daily parameters over 8 different longitudinal bands.
2.1. Parameters and daily aggregation
In order to adhere to local time as much as possible, the aggregation schemes have been developed for different zones centred around 8 central longitudes. The first zone is centred at zero longitude (Greenwich) and stretches from 22.5° W to 22.5° E. The next zone is centred around 45° E and stretches from 22.5° E to 67.5° E. The other zones are constructed in the same manner and are visualized in Figure 1. For each aggregation zone, different forecast lead times of the models are considered when computing the daily parameters.
This process ensures that, for example, the night-time minimum temperature actually covers the night hours in each part of the world. It therefore also harmonizes with the time-zone-dependent reporting practice of surface weather stations.
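The zone construction described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the assignment of points lying exactly on a zone boundary is an assumption, and the nominal offset of one hour per 15° of longitude is an assumption about how a zone relates to local time, not the configuration from the aggregation scheme documents.

```python
import math

def zone_centre(lon_deg: float) -> float:
    """Central longitude (degrees east, in (-180, 180]) of the 45-degree zone containing lon_deg."""
    lon = (lon_deg + 180.0) % 360.0 - 180.0   # normalise to [-180, 180)
    index = math.floor((lon + 22.5) / 45.0)   # -4 .. 4
    if index == -4:                           # assumption: one zone spans the dateline, centred at 180
        index = 4
    return index * 45.0

def nominal_utc_offset_hours(lon_deg: float) -> float:
    """Assumed local-time offset of a zone: 15 degrees of longitude per hour."""
    return zone_centre(lon_deg) / 15.0

# Print the zone centre and nominal offset for a few example longitudes.
for lon in (-3.8, 30.0, 100.0, -170.0):
    print(lon, zone_centre(lon), nominal_utc_offset_hours(lon))
```

Under these assumptions a point at 170° W falls into the zone centred at 180°, which is the situation east of the dateline where the zone definition and the actual local time disagree most.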
Similar aggregation methods (minimum, maximum, mean, sum) are applied to all parameters. For precipitation type (wet, solid), the aggregation to a daily time step is type specific: it counts the hours in which a certain type was significant. Additionally, some parameters, such as 10m wind speed, the humidity parameters and snow thickness, have been calculated from the original base parameters.
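As a rough illustration of such a daily aggregation, the sketch below selects the hourly values that fall into a local-time window and applies one of the four methods. The data layout (a list of 72 hourly values starting at 00 UTC of the day before the target day), the integer UTC offset and the function names are assumptions made for this example; the actual model time steps are defined in the aggregation scheme documents.

```python
from statistics import mean

def local_window(hourly_utc, utc_offset_hours, start_lt, end_lt):
    """Return the hourly values in local hours [start_lt, end_lt) of the target day.

    Assumption: hourly_utc[k] is the value valid k hours after 00 UTC of the day
    BEFORE the target day, so 72 values cover the previous, target and next day.
    """
    first = 24 + start_lt - utc_offset_hours
    last = 24 + end_lt - utc_offset_hours
    return hourly_utc[first:last]

def aggregate(hourly_utc, utc_offset_hours, start_lt, end_lt, method):
    """Aggregate the local-time window with 'mean', 'min', 'max' or 'sum'."""
    functions = {"mean": mean, "min": min, "max": max, "sum": sum}
    return functions[method](local_window(hourly_utc, utc_offset_hours, start_lt, end_lt))

# Made-up series: the value equals the hour index, which makes the window easy to check.
temps = [float(k) for k in range(72)]
print(aggregate(temps, 3, 0, 24, "mean"))  # 24h mean (00-00LT) for a zone 3 hours ahead of UTC
print(aggregate(temps, 3, 6, 18, "max"))   # day-time maximum (06-18LT) for the same zone
```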
The applied aggregation zone definitions work very well with the local time zones of Western and Eastern Europe, and mostly for the North American continent. For Asia there will be a shift of 2-3 hours between the actual local time definition and our definition. The only extreme mismatch of the local time definitions occurs eastward of the dateline in zone E4. Fortunately, the affected areas (Pacific islands and the very western coast of Alaska) are, from an agricultural perspective, not particularly significant.
The configuration of the used model time steps differs between ERA5 and HRES, as their temporal resolution differs. The detailed configuration can be found in Excel documents:
- C3S_D422Lot1.DS2_User_Documentation_HRES_aggregation_scheme.xlsx
- C3S_D422Lot1.DS2_User_Documentation_AgERA5_aggregation_scheme.xlsx
Figure 1: Longitudinal aggregation zones
2.2. Choice of training period
To train the regression equations, a data set of 2-3 years is desired. Both ERA5 and HRES need to be available for this period and should be mostly consistent, meaning that no major model updates happened during that period.
The ERA5 data is based on IFS model cycle 41r2 and is therefore self-consistent. In contrast, HRES has undergone a continuous improvement process and has been updated 5 times within the last 3 years.
The most significant changes of each update have been:
- 2015-05-12 – IFS cycle 41r1: Significant changes to the model physics, assimilation, observation usage and the ensemble configuration. HRES resolution unchanged.
- 2016-03-08 – IFS cycle 41r2: Horizontal resolution and grid change (T1279 -> O1280).
- 2016-11-22 – IFS cycle 43r1: Many scientific contributions, including changes in data assimilation and in the use of observations. Ocean model resolution changed. Changes in ENS.
- 2017-07-11 – IFS cycle 43r3: Improvements mainly in medium-range and monthly forecasts. HRES resolution unchanged.
- 2018-06-05 – IFS cycle 45r1: Changes to the assimilation and observation usage. HRES resolution unchanged.
Based on the recent HRES model upgrades outlined above, the period between 2016-04-01 and 2018-03-31 has been chosen as the training period for the final bias correction equations. Most importantly, this period does not include any horizontal grid or resolution changes. Also, data of both models was available through ECMWF's MARS archive at the time the bias correction analysis took place. Therefore, the generated equations will correct ERA5 data towards a mixture of HRES model cycles (41r2, 43r1 and 43r3).
Table 1: List of elements in Ag-ERA5 data set
Proposed name | Element name | Unit | Aggregation | Bias correction |
Wind-Speed-10m-Mean | 10 meter wind component (00-00LT) | m s-1 | Mean | Yes |
Dew-Point-Temperature-2m-Mean | 2 meter dewpoint temperature (00-00LT) | K | Mean | Yes |
Temperature-Air-2m-Mean | 2 meter air temperature (00-00LT) | K | Mean | Yes |
Temperature-Air-2m-Mean-Day-Time | 2 meter air temperature (06-18LT) | K | Mean | Yes |
Temperature-Air-2m-Mean-Night-Time | 2 meter air temperature (18-06LT) | K | Mean | Yes |
Temperature-Air-2m-Max-Day-Time | Maximum air temperature at 2 meter (06-18LT) | K | Maximum | Yes |
Temperature-Air-2m-Max-24h | Maximum air temperature at 2 meter (00-00LT) | K | Maximum | Yes |
Temperature-Air-2m-Min-Night-Time | Minimum air temperature at 2 meter (18-06LT) | K | Minimum | Yes |
Temperature-Air-2m-Min-24h | Minimum air temperature at 2 meter (00-00LT) | K | Minimum | Yes |
Precipitation-Rain-Duration-Fraction | Precipitation type duration - rain (00-00LT) | - | Count | No |
| Precipitation type duration - solid fraction (no hail) composed of: precipitation types freezing rain (3), snow (5), wet snow (6), mixture of rain | | | |
Relative-Humidity-2m-06h | Relative humidity at 06LT | % | - | Yes |
Relative-Humidity-2m-09h | Relative humidity at 09LT | % | - | Yes |
Relative-Humidity-2m-12h | Relative humidity at 12LT | % | - | Yes |
Relative-Humidity-2m-15h | Relative humidity at 15LT | % | - | Yes |
Relative-Humidity-2m-18h | Relative humidity at 18LT | % | - | Yes |
| | cm of liquid water | | |
Snow-Thickness-Mean | Snow depth (00-00LT) | cm snow | Mean | No |
Solar-Radiation-Flux | Surface solar radiation downwards (00-00LT) | J m-2 d-1 | Sum | Yes |
Cloud-Cover-Mean | Total cloud cover (00-00LT) | (0 - 1) | Mean | Yes |
Precipitation-Flux | Total precipitation (00-00LT) | mm d-1 | Sum | No |
Vapour-Pressure-Mean | Vapour pressure (00-00LT) | hPa | Mean | Yes |
3. Approach: Bias corrected downscaling
A simple interpolation of the ERA5 data would still miss out on most local effects related to sub grid details in topography, land use and land/sea cover. To incorporate those effects, a bias correction is applied to the interpolated ERA5 data. The bias correction equations are trained on a 2-year data set of the ECMWF HRES model.
3.1. Downscaling / Interpolation
The ECMWF ERA5 reanalysis data is originally retrieved on a 0.28125° lat/lon grid. To downscale towards a 0.1° lat/lon grid, a nearest neighbour (NN) interpolation is used. This interpolation technique is preferred over previously used methods, as it retains extreme values in the data set. Extreme values are especially important for the studies later carried out on this data set. Figure 2 shows an example of the original and NN interpolated grid. During the design phase of the downscaling method, an inverse distance weighted interpolation technique was also discussed, but it was dismissed because it preserves the extreme values in the data set poorly.
Technically, the CDO software package is used to apply the NN interpolation to the ERA5 data set:
- Generation of weight coefficients:
- cdo gennn,global_0.1 <ifile> <weights_file>
- Applying the remapping:
- cdo remap,global_0.1,<weights_file> <ifile> <ofile>
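To illustrate why a nearest neighbour remapping retains extreme values, the following one-dimensional sketch copies, for every target coordinate, the value of the closest source coordinate. It is not how CDO is implemented (CDO works on the sphere with precomputed weights); the function name, the coordinates and the temperature values are made up for this example.

```python
def nearest_neighbour(src_coords, src_values, dst_coords):
    """For each target coordinate, copy the value of the closest source coordinate."""
    out = []
    for x in dst_coords:
        k = min(range(len(src_coords)), key=lambda i: abs(src_coords[i] - x))
        out.append(src_values[k])
    return out

src_coords = [k * 0.28125 for k in range(5)]       # coarse, ERA5-like spacing in degrees
src_values = [280.0, 281.5, 290.2, 279.9, 283.0]   # made-up temperatures in K
dst_coords = [m / 10.0 for m in range(12)]         # 0.1 degree target spacing
dst_values = nearest_neighbour(src_coords, src_values, dst_coords)
print(dst_values)
```

Every output value is an unmodified input value, so the maximum and minimum of the coarse field survive on the fine grid. A distance-weighted scheme would instead average neighbouring cells and damp such extremes.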
3.2. Bias correction
Training the aggregated and downscaled ECMWF ERA5 on 2 recent years of HRES model data yields grid-specific bias correction equations. The equations are derived by means of multiple linear regression.
Not all daily aggregated elements are suited to correction by this method. For instance, for most parts of the world the snow parameters lack sufficient snow cases to build a robust correction statistic. Similar issues are expected for the precipitation parameters in arid regions.
The MOS (Model Output Statistics) routine is used to carry out a multiple linear regression between the ECMWF HRES data and the NN-interpolated ERA5 data for each grid cell. The outcome is a linear equation (in this case demonstrated for the ERA5 data set):
\[ Y_{i,j}^{ERA-5,corr} = \alpha_{i,j} Y_{i,j}^{ERA-5} + \beta_{i,j} + T_{i,j} \]
in which \( Y_{i,j}^{ERA-5} \) is the ERA5 NN-interpolated variable (e.g. temperature, wind) for grid box [i,j], \( Y_{i,j}^{ERA-5,corr} \) is the ERA5 NN-interpolated and bias corrected variable for grid box [i,j], and \( \alpha_{i,j} \), \( \beta_{i,j} \) are correction coefficients (hereinafter referred to as slope and intercept, respectively).
The parameter \( T_{i,j} \) accounts for an additional seasonal correction and reads:
\[ T_{i,j} = \gamma_{1,i,j}T_{1} + \gamma_{2,i,j}T_{2} + \gamma_{3,i,j}T_{3} + \gamma_{4,i,j}T_{4} \]
in which \( T_{1} \) to \( T_{4} \) are sinusoidal time functions with a period of one year, and \( \gamma_{1,i,j} \) to \( \gamma_{4,i,j} \) are the respective coefficients. The sinusoidal time functions that were used read:
\[ T_{1} = 100\sin \left(2\pi \frac{day-21}{365} \right) \]
\[ T_{2} = 100\sin \left(2\pi \frac{day-81}{365} \right) \]
With the combination of the above sine functions and coefficients, any grid-specific time correction function can be constructed. To achieve this, it is enough to use only the 2 best sinusoidal time functions of the 4 available for each grid point in the final equation.
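A minimal sketch of such a fit for a single grid cell is given below. It assumes that the corrected value is the slope times the ERA5 value, plus the intercept, plus the seasonal term, and it uses only T1 and T2 as seasonal predictors (the selection of the two best out of four functions is not reproduced). The ordinary least squares solver, the function names and the synthetic data are illustrative assumptions and not the MOS routine itself.

```python
import math

def seasonal(day, shift):
    """Sinusoidal time function with a period of one year, e.g. shift=21 for T1, shift=81 for T2."""
    return 100.0 * math.sin(2.0 * math.pi * (day - shift) / 365.0)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit(era5, hres, days):
    """Least squares fit of [slope, intercept, gamma1, gamma2] via the normal equations."""
    rows = [[y, 1.0, seasonal(d, 21), seasonal(d, 81)] for y, d in zip(era5, days)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * h for r, h in zip(rows, hres)) for i in range(k)]
    return solve(xtx, xty)

def apply_correction(y, day, coef):
    """Apply the assumed correction equation to one ERA5 value."""
    slope, intercept, g1, g2 = coef
    return slope * y + intercept + g1 * seasonal(day, 21) + g2 * seasonal(day, 81)

# Synthetic two-year series (made up): the "HRES" values follow known coefficients exactly.
days = list(range(1, 731))
era5 = [285.0 + 8.0 * math.sin(2.0 * math.pi * (d - 111) / 365.0) + 3.0 * math.sin(0.9 * d) for d in days]
true_coef = [0.9, 2.0, 0.01, -0.02]
hres = [apply_correction(y, d, true_coef) for y, d in zip(era5, days)]
coef = fit(era5, hres, days)
print(coef)
```

Because the synthetic target was generated from known coefficients without noise, the fit should recover them up to rounding. With real data the residual scatter determines how well the coefficients are constrained.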
3.3. Results
The bias correction application produces two kinds of output. The trained regression equation of a particular parameter is written to a NetCDF file, with the slope, the intercept and each of the seasonal cycle coefficients stored as a normal NetCDF parameter. The evaluation metrics are handled similarly: for analysis purposes the MAE, RMSE and R-squared are calculated and stored in a second NetCDF file. Handling the components of the trained equation and the verification metrics as NetCDF files simplifies the evaluation of the generated equations. It also makes the final application of the bias correction to the complete ERA5 history much simpler.
Figure 2: Comparison of regridding techniques (Top: original ERA5, middle: NN interpolation, bottom: inverse-distance weighted interpolation)
4. Impact evaluation
4.1. Overview and approach
The presented downscaling and bias correction methodology is meant to learn the systematic differences between the ERA5 data and the HRES data and is therefore expected to introduce local HRES effects into ERA5. To ensure the validity and quality of the applied algorithms, a variety of analyses have been carried out. These evaluate how well the model is able to mimic the HRES data based on ERA5 data. The models have been trained on 2 years of historical data (see section 2.2, Choice of training period). The following analysis uses the same period for the evaluations (in-sample verification).
The following metrics and visualizations have been created and carried out for all elements:
- Global visualizations of the model (regression equation parts, such as slopes and intercept) and model performance metrics (MAE, RMSE, R-squared, …)
- Applied bias correction: Global plots covering a short time period
- full spatial extent (global)
- temporal extent limited (day)
- Applied bias correction: Time series and scatter plots for single grid points of interest
- spatial extent limited (1 grid point)
- full temporal extent (2 years)
This report only features a sub-selection of the plots mentioned above. A complete set of all results for all elements is available upon request.
As explained previously (3.2 Bias correction), the statistical bias correction was applied to 17 elements out of all 22 processed elements. Those corrected elements can be grouped into temperature parameters (7 different aggregations), humidity parameters (7 different parameters and aggregation), the wind speed, the solar radiation and the cloud cover.
For these main element groups, the overall MAE improvements achieved by applying the bias correction are summarized in Table 2. The MAE indicates the error of the corrected data (HRES-ERA5corrected), while the MAE improvement compares the error of the corrected ERA5 data with that of the uncorrected ERA5 data. All metrics are aggregated for different regions and certain subsets of grid points. Overall, the temperature, humidity and wind speed elements benefit most from the correction: the MAE is reduced by 30% to 60% in the majority of cases. Grid points located in mountainous areas or along coasts and lakes improve most. This is not surprising, as these are the areas where the largest systematic differences between ERA5 and HRES can be expected. Not only are the relative improvements large, the absolute MAE values after the correction are also small. For example, the MAE of the 24h mean 2m temperature (2t_davg) over land does not exceed 0.72K on any continent, and on 4 of 6 continents it does not exceed 0.51K.
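The two metrics can be sketched as follows. The exact definition of the MAE improvement is not spelled out in this report; the sketch assumes it is the relative reduction of the MAE in percent, and all values are made up for illustration.

```python
def mae(reference, candidate):
    """Mean absolute error between a reference series (e.g. HRES) and a candidate series."""
    return sum(abs(r - c) for r, c in zip(reference, candidate)) / len(reference)

def mae_improvement(mae_original, mae_corrected):
    """Assumed definition: relative MAE reduction in percent (positive means the correction helps)."""
    return 100.0 * (mae_original - mae_corrected) / mae_original

# Made-up daily values for one grid point.
hres = [10.0, 12.0, 11.0, 9.0]
era5_original = [12.0, 13.0, 13.0, 10.0]
era5_corrected = [10.5, 12.0, 11.5, 9.0]

mae_orig = mae(hres, era5_original)
mae_corr = mae(hres, era5_corrected)
print(mae_orig, mae_corr, mae_improvement(mae_orig, mae_corr))
```

With this definition, a negative value means that the corrected data is further from HRES than the uncorrected data.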
Table 2: MAE (HRES-ERA5corrected) and MAE improvement of different bias corrected parameters. The MAE relates to the bias corrected ERA5 data. The MAE improvements indicate the added value through the bias correction. All metrics are calculated for different regions and for subsets of grid points meeting certain conditions. E.g. "Land & above 800m" only uses grid points being located on land and above 800m. "Coasts & Lakes" subsets all grid points with a land fraction between 10% and 90%.
| | Land | | Land & below 800m | | Land & above 800m | | Coasts & Lakes | |
Variable | Region | MAE | MAE Impr | MAE | MAE Impr | MAE | MAE Impr | MAE | MAE Impr |
2t_davg [K] | Africa | 0.44 | 40% | 0.42 | 36% | 0.47 | 48% | 0.36 | 50% |
2t_davg | Asia | 0.72 | 36% | 0.67 | 27% | 0.86 | 48% | 0.66 | 32% |
2t_davg | Australia | 0.43 | 42% | 0.43 | 35% | 0.37 | 83% | 0.30 | 49% |
2t_davg | Europe | 0.51 | 36% | 0.47 | 30% | 0.75 | 55% | 0.45 | 38% |
2t_davg | N-America | 0.71 | 31% | 0.67 | 25% | 0.85 | 41% | 0.68 | 28% |
2t_davg | S-America | 0.45 | 50% | 0.42 | 41% | 0.61 | 65% | 0.38 | 48% |
2d_davg [K] | Africa | 0.76 | 38% | 0.77 | 38% | 0.76 | 39% | 0.55 | 46% |
2d_davg | Asia | 0.90 | 29% | 0.81 | 25% | 1.09 | 35% | 0.73 | 28% |
2d_davg | Australia | 0.57 | 34% | 0.57 | 28% | 0.43 | 78% | 0.36 | 43% |
2d_davg | Europe | 0.58 | 28% | 0.55 | 22% | 0.81 | 46% | 0.54 | 27% |
2d_davg | N-America | 0.80 | 23% | 0.73 | 18% | 0.97 | 32% | 0.70 | 21% |
2d_davg | S-America | 0.54 | 42% | 0.44 | 37% | 0.99 | 50% | 0.41 | 40% |
ff_davg [m/s] | Africa | 0.27 | 25% | 0.26 | 22% | 0.28 | 32% | 0.33 | 47% |
ff_davg | Asia | 0.29 | 28% | 0.27 | 24% | 0.34 | 35% | 0.36 | 35% |
ff_davg | Australia | 0.24 | 31% | 0.25 | 30% | 0.22 | 41% | 0.31 | 53% |
ff_davg | Europe | 0.25 | 31% | 0.24 | 31% | 0.32 | 33% | 0.33 | 48% |
ff_davg | N-America | 0.29 | 28% | 0.28 | 26% | 0.33 | 31% | 0.33 | 34% |
ff_davg | S-America | 0.23 | 30% | 0.22 | 26% | 0.27 | 42% | 0.32 | 51% |
tcc_davg [0-1] | Africa | 0.08 | 3% | 0.08 | 2% | 0.08 | 4% | 0.08 | 5% |
tcc_davg | Asia | 0.07 | 0% | 0.07 | -2% | 0.08 | 4% | 0.08 | -2% |
tcc_davg | Australia | 0.06 | -1% | 0.06 | -1% | 0.06 | 5% | 0.07 | 2% |
tcc_davg | Europe | 0.07 | -1% | 0.07 | -1% | 0.07 | 2% | 0.07 | -1% |
tcc_davg | N-America | 0.08 | 0% | 0.08 | -1% | 0.07 | 2% | 0.08 | -1% |
tcc_davg | S-America | 0.07 | 4% | 0.07 | 3% | 0.07 | 8% | 0.07 | 5% |
ssrd_dsumdiff [J/m2d] | Africa | 1055575 | 7% | 1030480 | 7% | 1118699 | 8% | 1151300 | 13% |
ssrd_dsumdiff | Asia | 872717 | 4% | 836249 | 3% | 958997 | 7% | 899084 | 5% |
ssrd_dsumdiff | Australia | 1205911 | 6% | 1177253 | 6% | 1772895 | 14% | 1497494 | 12% |
ssrd_dsumdiff | Europe | 832226 | 2% | 815116 | 2% | 951428 | 5% | 782759 | 4% |
ssrd_dsumdiff | N-America | 899054 | 4% | 902781 | 3% | 888809 | 6% | 916596 | 4% |
ssrd_dsumdiff | S-America | 1427243 | 9% | 1448626 | 9% | 1328043 | 13% | 1316248 | 11% |
For the solar radiation flux (ssrd_dsumdiff) the MAE improvement is solid and ranges between 2% and 14%, depending on the region and subset. The results for the element "24h mean cloud cover" (tcc_davg) are mixed. For most grid points the correction does not add any value. The MAE improvement of the majority of all grid points (land and below 800m) is between -2% and +4%, and therefore near zero. Only for grid points above 800m can a small but clear improvement (2% - 8%) be observed.
4.2. Global Analysis
This section concentrates on analyzing the bias correction on a global scale. The individual parts of the trained equations and various model performance metrics are visualized in a way that makes global distributions and patterns clear. The focus here is on the "2m Temperature (24h Mean)" element; the other elements of the temperature group reveal very similar behavior. All remaining element groups have been analyzed in a similar way; they are not presented here but are available upon request.
4.2.1. Model equation: 2m Temperature (24h Mean)
In the following the bias correction equations for the 2m Temperature (24h Mean) element are shown in global contour plots. Figure 3 visualizes the intercept and the slope of the temperature predictor. Note that a slope of 1 and an intercept of 0 means no correction is applied to the ERA5 data. A slope value larger than 1 results in an increased value range through the correction, while a value smaller than 1 reduces the variability in the corrected ERA5 data set. The y-intercept does not have a direct physical meaning in this context.
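The effect of the slope on the value range can be checked with a small example. The sketch ignores the seasonal term and uses made-up values and coefficients; the function names are hypothetical.

```python
def correct(values, slope, intercept):
    """Apply only the linear part of the correction (seasonal term ignored)."""
    return [slope * v + intercept for v in values]

def value_range(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)

era5 = [270.0, 280.0, 300.0]           # made-up temperatures in K, range 30 K
narrowed = correct(era5, 0.7, 84.0)    # slope below 1: range shrinks
widened = correct(era5, 1.3, -84.0)    # slope above 1: range grows
unchanged = correct(era5, 1.0, 0.0)    # slope 1, intercept 0: no correction

print(value_range(era5), value_range(narrowed), value_range(widened))
```

For a positive slope, the range of the corrected values equals the slope times the range of the original values, which is why the slope maps can be read directly as maps of reduced or increased variability.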
The plots show that the values are non-uniformly distributed throughout the globe. The predominant pattern is a slope below 1 which relates to a reduced variability through the correction process. For some regions, especially coastal areas, also an increased variability is found. In the inter-tropical convergence zone (ITCZ) and the Himalaya region, the largest corrections in terms of variability ranges are found. In regions with strong seasonal cycle (mid-latitudes, temperature differences of ~40K during the year) the resulting correction effect on the variability will be different as compared to tropical regions (value range of 5K throughout the year). The y-intercept shows roughly the inverse global pattern of the slope.
In Figure 4 the slopes of all 4 seasonal predictors are shown. During the model training phase only the two best seasonal predictors were used in the training of each grid point. This becomes clear in Figure 4 where, for each point in the maps, only two subplots give a value while the other two indicate a null value. For example, large parts of continental Russia use the seasonal predictors T2 and T3, while Western Europe uses a sinusoidal function constructed from T3 and T4. For India, with its predominant Monsoon pattern, the yearly cycle is best represented by a superposition of T1 and T2.
Figure 3: Global plots of 2m temperature (24h mean) correction equations. Slope of the temperature predictor (top), intercept of the equation (bottom)
4.2.2. Model performance: 2m Temperature (24h Mean)
This section describes the performance of the bias correction for the 2m Temperature (24h Mean) element. The maps in Figure 5 visualize different metrics derived from the complete 2 years training period:
- The MAE of the original ERA5 data
- The MAE of the bias corrected ERA5 data
- The coefficient of determination (R-squared) of the model
The coefficient of determination (3) describes how well the model predicts the target data, in our case the HRES data. In most regions of the world the value is close to 1, which means the model is able to predict nearly 100% of the original variability. It does not give an indication about the total level of variability, only the relative level of explained variability.
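For reference, the sketch below computes R-squared and RMSE with their standard textbook definitions, which are assumed here to match the ones used in the evaluation. The observation and prediction values are made up.

```python
import math

def rmse(obs, pred):
    """Root mean squared error between target values and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r_squared(obs, pred):
    """Coefficient of determination: 1 minus residual sum of squares over total sum of squares."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

obs = [1.0, 2.0, 3.0, 4.0]    # made-up target values (the role played by HRES)
pred = [1.1, 1.9, 3.2, 3.8]   # made-up predictions (the role played by corrected ERA5)
print(r_squared(obs, pred), rmse(obs, pred))
```

Since the total sum of squares is the denominator, the same residual error yields a lower R-squared where the target itself varies little, which is consistent with the lower values found in the tropics.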
In the tropics the amount of explained variability is reduced to 60% to 90%, in some rare cases even to 30%. In these regions most of the variability originates from convective weather patterns that are inherently random in nature. The correction model in the tropics is therefore rather weak, as there is no clear systematic difference between HRES and ERA5. In contrast, for mid- and high-latitudes the absolute variability is much larger. The models are able to pick up most of this variability; only a minor part is not represented in the models.
In Figure 5 the first MAE plot shows where the two models (HRES and ERA5) differ. As already mentioned, most prominent are mountainous and coastal regions with values between 1.5K and 4K. Most other land locations are much more similar in both models (0.5K to 1.5K). For ocean grid points the models mostly behave similarly, with MAE values well below 0.5K. The second MAE plot in Figure 5 indicates the mean absolute differences after the bias correction was applied. For most of the challenging regions the difference reduces from up to 4K down to roughly 1.5K. For Europe, Africa, South America and Australia nearly all locations with originally large differences have seen a reduction in MAE to values below 1K.
In Figure 6 a similar evaluation is presented. This visualization shows the residual of the original and of the bias corrected ERA5 data, averaged over January and July. According to these, ERA5 is quite a bit colder in large parts of the Northern Hemisphere and warmer in the Southern Hemisphere during January 2018. For July 2017 this pattern is reversed. In both cases, the bias correction achieves a large reduction of these systematic differences.
Figure 4: Global plots of 2m temperature (24h mean) correction equations. Slope of seasonal predictors T1 (top left), T2 (top right), T3 (bottom left) and T4 (bottom right)
Figure 5: Global plots of 2m temperature (24h mean) model performance. MAE of bias corrected (center) and not bias corrected (top) ERA5 data and R-squared of the trained models (bottom)
Figure 6: Global plots of 2m temperature (24h mean). Top plots show the residual between HRES and the not corrected ERA5 data, while the bottom plots compare HRES to the bias corrected ERA5 data. The data is averaged over January 2018 (left) and July 2017 (right). Positive values: HRES warmer than ERA5
4.3. Case studies for single locations
While the analysis in the previous section focuses on the global pattern, this section will select specific locations and analyze their specific bias correction equation. The visualization dashboards used here (see Figure 7 and Figure 8) contain different subplots. The scatter plot in the top left shows the original ERA5 versus the HRES data in black as well as the corrected ERA5 versus HRES data in green. The perfect model would be able to fit all green dots on the corresponding black dots. A more basic bias correction without seasonal predictors would appear in this plot as a simple fitted line through the black dot cloud. In the top center a scatter plot visualizing the residuals is displayed. Each point represents the residual value of one day, the color indicates the season of the day. The table on the top right gives a summary of the location and shows several performance metrics of this location's equation. In the lower part of the dashboard, two line plots show the temporal behavior of the HRES, original ERA5 and the bias corrected ERA5 data. The bottom time series shows the difference between HRES and both ERA5 data sets.
4.3.1. Spanish Coast: 10m Wind Speed
One advantage of the bias correction is that it introduces the finer land-sea delineation of HRES to the ERA5 data (Figure 9). This effect can be understood by further analyzing data of an example grid point located right at the Spanish coastline, at 36.7°N / 3.8°W (see Figure 7). On the coarse ERA5 grid this location is a water grid point, while HRES, with its finer mask, considers it as a land location. Figure 7 shows the 24h mean wind speed element.
The equation, specifically trained for this location, has a slope of 0.7 for the main wind predictor. This slope value leads to a reduction of the wind variability from ERA5 to HRES and is expected, as the HRES land location has a larger friction. The scatter plot shows this behavior and also indicates that the seasonal features are able to pick up some of the variance. The residual scatter plot reveals that most strong wind events happen during the winter season and have the largest residuals.
This strong positive bias, which is more prominent for higher wind speeds, is well corrected by the equation. The average bias that the correction is able to account for is 1.02 m/s; the total MAE is reduced from 1.21 m/s to 0.71 m/s.
Figure 7: Different visualizations of the original ERA5 data, the bias corrected ERA5 data and the HRES data for one grid point located at the Spanish coast line. Parameter: 10m wind speed. For a detailed description of the sub plots see Section 4.3.
4.3.2. Central Alps: 2m Temperature 24h Mean
The second location analyzed in detail is situated in the Central Alps (44.3°N / 6.6°E, Figure 8). Due to differences in model topography, this grid point is located higher up in the mountains in HRES than it is in ERA5 (Figure 10). As the coarser grid is expected to have less prominent elevation differences, the ERA5 data is expected to be warmer.
Both the scatter plot and the time plot clearly show this strong positive bias. Averaged over the training period, the bias correction is able to remove a systematic difference of 3.07K. The day-to-day variability throughout the 2 years is very similar between ERA5 and HRES; therefore the model can correct for most of the difference. Also, the coefficient of determination of 97% underlines that the residual is very small in comparison to the systematic error. In the end, the bias correction reduces the MAE from 3.12K to just 0.98K.
Figure 8: Different visualizations of the original ERA5 data, the bias corrected ERA5 data and the HRES data for one gridpoint located in the Central Alps. Parameter: 2m Temperature (24h Mean). For a detailed description of the sub plots see Section 4.3
Figure 9: Land-sea mask in HRES (bottom) and ERA5 (top). Left side: Southern coast of Spain; right side: Canary Islands
Figure 10: Model elevation of western Alps in HRES (bottom) and ERA5 (top)
4.4. Conclusions
The following conclusions are drawn from the evaluation study:
- The selected bias correction method has its largest benefits in mountainous areas, at coast lines and at lakes
- Seasonal correction on top of the simple bias correction further improves the accuracy of the derived correction equations.
- The approach works remarkably well for 3 of 5 groups of elements (temperature parameters, humidity parameters and wind speed). The averaged relative reduction of MAE is between 30% and 60%.
- The correction models for solar radiation flux reach a MAE improvement of 2% to 14%.
- For cloud cover the correction has only a minor effect for most of the grid points. However, mountainous regions still benefit from the correction with a MAE improvement of 2%-8%.