Introduction
The Decadal Climate Prediction Project (DCPP)
The Decadal Climate Prediction Project (Boer et al., 2016) addresses the ability of the climate system to be predicted on annual, multi-annual and decadal timescales. The information generated by the DCPP, archived on the Earth System Grid Federation (ESGF) nodes and made accessible in the Climate Data Store (CDS), can provide a basis for socially relevant operational climate predictions on annual to decadal timescales.
DCPP: Part of CMIP6
The Decadal Climate Prediction Project (DCPP) is a contributing MIP (Model Intercomparison Project) to the 6th Coupled Model Intercomparison Project (CMIP6 – Eyring et al., 2016), which is running as part of the World Climate Research Programme (WCRP). DCPP addresses a range of scientific issues involving the ability of the climate system to be predicted on annual to decadal timescales, the skill that is currently and potentially available, the mechanisms involved in long timescale variability, and the production of forecasts of benefit to both science and society.
The CMIP6 data archive is distributed through the ESGF. A quality-controlled subset of CMIP6 global climate projection data are made available through the Climate Data Store (CDS) for the users of the Copernicus Climate Change Service (C3S). Dedicated ESGF data nodes are used for C3S in France (at IPSL) and in Germany (DKRZ). Similarly, the decadal climate prediction project (DCPP) data in the Climate Data Store (CDS) are a targeted, quality-controlled subset of the DCPP commissioned by C3S.
The published datasets are the ones which took part in the C3S sectoral demonstrator service. This demonstrator provided decadal prediction products tailored to specific users from the agriculture, energy, infrastructure and insurance sectors (see details at https://climate.copernicus.eu/sectoral-applications-decadal-predictions). The data were used in these demonstrators following processing procedures necessary to extract valid information (e.g., bias adjustment); details on this processing are available in the technical appendix at https://climate.copernicus.eu/sites/default/files/2021-09/Technical_appendix_2020.pdf. Any application, whether similar to or different from these examples, needs to consider and apply the required data processing with care.
Decadal Climate Prediction Project Data in the CDS
DCPP Experiments
The CDS provides data access to two DCPP experiments: dcppA-hindcast, which consists of retrospective decadal forecasts that can be used to assess historical decadal prediction skill, and dcppB-forecast, which consists of experimental quasi-real-time decadal forecasts that form a basis for potential operational forecast production. For these DCPP experiments, each model performs multiple overlapping simulations that are initialised annually throughout the experiment. The dcppA-hindcast and dcppB-forecast experiments are further described in the table below. The DCPP experiment descriptions presented here are based on information harvested from Earth System Documentation (ES-DOC).
Experiment Name | Experiment Long Name | Extended Description |
---|---|---|
dcppA-hindcast | hindcasts initialized from observations with historical forcing | dcppA-hindcast is a set of retrospective decadal forecasts (known as hindcasts) that are initialised every year mostly from 1960-2019 and performed with a coupled atmosphere-ocean general circulation model (AOGCM). The hindcasts begin in November to allow for DJF (December, January, February) seasonal averages to be calculated. There are 10 hindcasts for each start date and hindcasts run for 10 years. The models running these hindcasts are initialised using observed data. Prior to the year 2020, the models are forced with historical conditions that are consistent with observations; these conditions include atmospheric composition, land use, volcanic aerosols and solar forcing. When hindcasts extend beyond 2020, the models are forced with future conditions from the ssp245 scenario from 2020 until the end of the simulation. DCPP hindcast experiments can be used to assess and understand the historical decadal prediction skill of climate models. |
dcppB-forecast | forecasts initialised from observations with ssp245 scenario forcing | dcppB-forecast is a set of quasi-real-time decadal forecasts that are initialised every year from 2019 in real time and ongoing (although only the data used in the sectoral demonstrator service is available from the CDS). The forecasts are performed with the same coupled atmosphere-ocean general circulation model (AOGCM) that was used to generate the hindcast data. The forecasts begin in November to allow DJF (December, January, February) seasonal averages to be calculated. There are 10 forecasts for each start date and forecasts run for 10 years. The models running these forecasts are initialised using observed data. Prior to the year 2020, the models are forced with historical conditions that are consistent with observations; these conditions include atmospheric composition, land use, volcanic aerosols and solar forcing. When forecasts extend beyond 2020, the models are forced with future conditions from the ssp245 scenario from 2020 until the end of the simulation. DCPP forecast experiments form a basis for potential operational decadal forecast production. |
Models
Data for the dcppA-hindcast and dcppB-forecast experiments published in the CDS are generated from simulations run by the models described in the table below. The model descriptions presented here are harvested from the dataset DOI pages held at the World Data Centre for Climate (WDCC); further model details can be found on ES-DOC. The EC-Earth3, MPI-ESM1-2-HR, MPI-ESM1-2-LR and HadGEM3-GC31-MM models were configured with 360-day years (where every month has 30 days), whereas the CMCC-CM2-SR5 model was configured with a 365-day year (with an irregular number of days in each month).
Model | Centre | Description |
---|---|---|
EC-Earth3 | EC Earth Consortium | The model used in climate research named EC Earth 3.3, released in 2019, includes the components:
The model was run in native nominal resolutions: atmos: 100 km, land: 100 km, ocean: 100 km, seaIce: 100 km. |
CMCC-CM2-SR5 | The Euro-Mediterranean Center on Climate Change (Centro Euro-Mediterraneo per I Cambiamenti Climatici, CMCC) | The model used in climate research named CMCC-CM2-SR5, released in 2016, includes the components:
The model was run in native nominal resolutions: aerosol: 100 km, atmos: 100 km, land: 100 km, ocean: 100 km, seaIce: 100 km. |
MPI-ESM1-2-HR | The German Weather Service (Deutscher Wetterdienst, DWD) / Max Planck Institute for Meteorology (MPI-M) | The model used in climate research named MPI-ESM1.2-HR, released in 2017, includes the components:
The model was run in native nominal resolutions: aerosol: 100 km, atmos: 100 km, land: 100 km, landIce: none, ocean: 50 km, ocnBgchem: 50 km, seaIce: 50 km. |
MPI-ESM1-2-LR | The German Weather Service (Deutscher Wetterdienst, DWD) / Max Planck Institute for Meteorology (MPI-M) | The model used in climate research named MPI-ESM1.2-LR, released in 2017, includes the components:
The model was run in native nominal resolutions: aerosol: 250 km, atmos: 250 km, land: 250 km, landIce: none, ocean: 250 km, ocnBgchem: 250 km, seaIce: 250 km. https://www.wdc-climate.de/ui/cmip6?input=CMIP6.DCPP.MPI-M.MPI-ESM1-2-LR |
HadGEM3-GC31-MM | Met Office Hadley Centre (MOHC) | The model used in climate research named HadGEM3-GC3.1-N216ORCA025, released in 2016, includes the components:
The model was run in native nominal resolutions: aerosol: 100 km, atmos: 100 km, land: 100 km, ocean: 25 km, seaIce: 25 km. |
Start-Date Ensembles
The DCPP experiments published in the CDS are a suite of overlapping simulations that are initialised every year throughout the start-date range specified by the experiment. The simulations begin in November to allow for DJF (December, January, February) seasonal averages to be calculated. There are 10 simulations (ensemble members) for each start date (called "Base year" in the CDS form), except for the MPI-ESM1-2-LR model, which has 16 ensemble members.
The start-date ensemble is reflected in the DCPP data naming convention with the addition of an s<yyyy> start-date ensemble identifier. Please note that the conventional CMIP6 ripf ensemble identifiers are omitted for this particular dataset since all the ensemble members are concatenated into one file.
See more details in the File naming conventions and In-file metadata modifications sections below.
Practical details of the published data
The table below shows some practical details of the data, including the base year (or start year) period covered and the number of ensemble members. For each start year there are (at least) 10 years of corresponding hindcast or forecast data available. Hindcast and forecast start years are not distinguished in the CDS form. Please note that the ensemble members are not available individually: they are concatenated into one file when the data is downloaded, and users are generally encouraged to use all members instead of selecting one member of the predictions.
Model | Hindcast start years* | Forecast start years* | Ensemble members | Nominal resolution | Monthly variables | Daily variables |
---|---|---|---|---|---|---|
CMCC (Italy) | 1960 - 2018 | 2019 - 2020 | 10 | 100 km | Near surface air temperature, precipitation, sea level pressure | --- |
EC-EARTH (Europe) | 1960 - 2018 | 2019 - 2020 | 10 | 100 km | Near surface air temperature, precipitation, sea level pressure | 500 hPa geopotential height, daily maximum near surface air temperature, daily minimum near surface air temperature, near surface air temperature, precipitation, sea level pressure |
HadGEM3 (UK) | 1960 - 2018 | 2019 - 2020 | 10 | 100 km | Near surface air temperature, precipitation, sea level pressure | 500 hPa geopotential height, daily minimum near surface air temperature, precipitation |
MPI-ESM1-2-HR (Germany) | 1960 - 2018 | --- | 10 | 100 km | Near surface air temperature, precipitation, sea level pressure | 500 hPa geopotential height, daily maximum near surface air temperature, daily minimum near surface air temperature, precipitation |
MPI-ESM1-2-LR (Germany) | 1960 - 2018 | 2019 - 2021 | 16 | 250 km | Near surface air temperature, precipitation, sea level pressure | Daily maximum near surface air temperature, daily minimum near surface air temperature |
*Note: Because hindcast and forecast data begin in November, the start year itself is covered only by November and December, while the following years have full coverage. For example, for the 1960 start year, 1960 includes only November and December, and 1961 - 1970 have full coverage.
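The coverage rule in the note can be sketched in a few lines of Python. This is a minimal illustration, not part of the CDS tooling: the function names are hypothetical, and the end of the covered period (December of the start year plus 10) is an assumption taken from the 1960 example above.

```python
from datetime import date

def coverage(start_year: int, run_years: int = 10) -> tuple[date, date]:
    """Return the first and last month covered by one start year.

    Assumption: coverage runs from November of the start year to December
    of start_year + run_years, as in the 1960 example (1960 has November
    and December only, 1961 - 1970 are complete).
    """
    return date(start_year, 11, 1), date(start_year + run_years, 12, 1)

def start_years_covering(year: int, month: int,
                         first: int = 1960, last: int = 2018) -> list[int]:
    """List the start years whose simulation covers the given month."""
    target = date(year, month, 1)
    covering = []
    for start_year in range(first, last + 1):
        begin, end = coverage(start_year)
        if begin <= target <= end:
            covering.append(start_year)
    return covering

print(coverage(1960))
print(start_years_covering(1965, 1))
```

Because the simulations overlap, a single calendar month is usually covered by several start years, each at a different lead time.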
Parameter listings
Data for the dcppA-hindcast experiments and the dcppB-forecast experiments will include parameters at monthly and daily resolution as described in the tables below. The parameter descriptions presented here are harvested from the CMIP6 Data Request via the CLIPC variable browser.
CDS parameter name | ESGF variable id | units | Standard name (CF) | Long name | Description |
---|---|---|---|---|---|
500 hPa geopotential height | zg500 | m | geopotential_height | Geopotential Height at 500hPa | Gravitational potential energy per unit mass normalised by the standard gravity at 500hPa at the same latitude. |
Daily maximum near-surface air temperature | tasmax | K | air_temperature | Daily Maximum Near-Surface Air Temperature | Daily maximum temperature of air at 2m above the surface of land, sea or inland waters. |
Daily minimum near-surface air temperature | tasmin | K | air_temperature | Daily Minimum Near-Surface Air Temperature | Daily minimum temperature of air at 2m above the surface of land, sea or inland waters. |
Near-surface air temperature | tas | K | air_temperature | Near-Surface Air Temperature | Temperature of air at 2m above the surface of land, sea or inland waters. 2m temperature is calculated by interpolating between the lowest model level and the Earth's surface, taking account of the atmospheric conditions. |
Precipitation | pr | kg m-2 s-1 | precipitation_flux | Precipitation | The sum of liquid and frozen water, comprising rain and snow, that falls to the Earth's surface. It is the sum of large-scale precipitation and convective precipitation. This parameter does not include fog, dew or the precipitation that evaporates in the atmosphere before it lands at the surface of the Earth. This variable represents amount of water per unit area and time. |
Sea level pressure | psl | Pa | air_pressure_at_sea_level | Sea Level Pressure | The pressure (force per unit area) of the atmosphere at the surface of the Earth, adjusted to the height of sea level. It is a measure of the weight that all the air in a column vertically above a point on the Earth's surface would have, if the point were located at sea level. It is calculated over all surfaces - land, sea and inland water. |
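The units in the table are SI units, which often need converting before use. The following is a minimal sketch of three common conversions; the function names are illustrative, and the precipitation conversion assumes a water density of 1000 kg m-3, so that 1 kg m-2 corresponds to a 1 mm layer of water.

```python
SECONDS_PER_DAY = 86400.0

def kelvin_to_celsius(temperature_k: float) -> float:
    """Convert tas, tasmax or tasmin from K to degrees Celsius."""
    return temperature_k - 273.15

def precipitation_flux_to_mm_per_day(pr: float) -> float:
    """Convert pr from kg m-2 s-1 to mm per day.

    Assumes a water density of 1000 kg m-3, so 1 kg m-2 equals 1 mm.
    """
    return pr * SECONDS_PER_DAY

def pascal_to_hectopascal(pressure_pa: float) -> float:
    """Convert psl from Pa to hPa."""
    return pressure_pa / 100.0

print(kelvin_to_celsius(288.15))
print(precipitation_flux_to_mm_per_day(1e-5))
print(pascal_to_hectopascal(101325.0))
```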
Grids
DCPP data, like the rest of CMIP6, are reported either on the model's native grid or re-gridded to one or more target grids, with data variables generally provided near the centre of each grid cell (rather than at the boundaries). A grid_label (found in the file name following the ensemble identifier and also in the file's global metadata attributes) indicates whether the data is provided on the model's native grid (gn) or has been re-gridded (gr) to a target grid. For DCPP data in the CDS, only data from the EC-Earth3 model have been re-gridded to a target grid; data from the other models are provided on each model's native grid. The file's "nominal_resolution" global metadata attribute gives an indication of the resolution of the data; for the DCPP data in the CDS the nominal resolution of the models is 100 km (except for the MPI-ESM1-2-LR model, which is 250 km).
Calendars
Climate models sometimes use different calendars. For example, Hadley Centre models (HadGEM3 in this entry) use a 360-day calendar, where every month has exactly 30 days. Other models use a fixed 365-day calendar, and others include leap years. Depending on the time period and models selected, these variations can result in time dimensions of different lengths when daily data are downloaded, or even in failed data requests. Users need to be careful, when using the CDS user interface download form or API, to avoid selecting days which may not be available in the calendar of the given model (for example, requests referring to day 31 for the Hadley Centre HadGEM3 model would fail, because it has a 360-day calendar). The CDS form for CMIP6 currently assumes a standard calendar, so it allows the selection of such missing days and, conversely, may not allow selection of all days from models with non-standard calendars (but these data can be retrieved using the API).
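One way to avoid requesting days that do not exist in a model's calendar is to build the list of days from the calendar itself. The sketch below is illustrative only: the calendar names "360_day", "noleap" and "standard" follow the CF conventions, while the function names and the zero-padded day strings are assumptions rather than a documented CDS request format. The calendar that applies to a given model should be taken from the model documentation or from the calendar attribute of the file's time coordinate.

```python
import calendar

def days_in_month(year: int, month: int, cal: str) -> int:
    """Number of days in a month for a given CF calendar name."""
    if cal == "360_day":
        return 30  # every month has exactly 30 days
    if cal == "noleap":
        # fixed 365-day year: use a non-leap reference year
        return calendar.monthrange(1999, month)[1]
    if cal == "standard":
        return calendar.monthrange(year, month)[1]
    raise ValueError(f"unsupported calendar: {cal!r}")

def valid_days(year: int, month: int, cal: str) -> list[str]:
    """Zero-padded day strings that exist in the given calendar."""
    last_day = days_in_month(year, month, cal)
    return [f"{day:02d}" for day in range(1, last_day + 1)]

# In a 360-day calendar January has no day 31, but February has a day 30.
print(valid_days(1961, 1, "360_day")[-1])
print(valid_days(1961, 2, "360_day")[-1])
```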
Data Format
The CDS subset of DCPP data are provided as NetCDF files. NetCDF (Network Common Data Form) is a file format that is freely available and commonly used in the climate modelling community. See more details: What are NetCDF files and how can I read them
A CMIP6 NetCDF file in the CDS contains:
- global metadata: these fields can describe many different aspects of the file, such as
  - when the file was created
  - the name of the institution and model used to generate the file
  - information on the horizontal grid and regridding procedure
  - links to peer-reviewed papers and technical documentation describing the climate model
  - links to supporting documentation on the climate model used to generate the file
  - software used in post-processing
- variable dimensions: such as time, latitude, longitude and height
- variable data: the gridded data
- variable metadata: e.g. the variable units, averaging period (if relevant) and additional descriptive data
File naming conventions
When you download a DCPP file from the CDS it will have a naming convention that is as follows:
<variable_id>_<table_id>_<source_id>_<variant_label>.nc
Where:
- variable_id: a short variable name, e.g. “tas” for “near-surface air temperature”.
- table_id: this refers to the MIP table being used. The MIP tables are used to organise the variables. For example, Amon refers to monthly atmospheric variables and Oday contains daily ocean data.
- source_id: this refers to the model that produced the data.
- variant_label: is a label constructed from the start year of the simulation as s<yyyy>, where yyyy is the start year.
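The naming convention above can be split back into its components with a small parser. This is a sketch under the assumption that a file name follows exactly the four-part pattern described here; the example file name is hypothetical, and downloaded files that carry additional elements would need an extended pattern.

```python
import re
from typing import NamedTuple

class DcppFileName(NamedTuple):
    variable_id: str
    table_id: str
    source_id: str
    start_year: int

# <variable_id>_<table_id>_<source_id>_s<yyyy>.nc
_PATTERN = re.compile(
    r"^(?P<variable_id>[^_]+)_(?P<table_id>[^_]+)_"
    r"(?P<source_id>[^_]+)_s(?P<start_year>\d{4})\.nc$"
)

def parse_dcpp_filename(name: str) -> DcppFileName:
    """Split a DCPP file name into the parts described above."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a file name of the documented form: {name!r}")
    return DcppFileName(
        match["variable_id"],
        match["table_id"],
        match["source_id"],
        int(match["start_year"]),
    )

# Hypothetical file name built from the convention
print(parse_dcpp_filename("tas_Amon_EC-Earth3_s1960.nc"))
```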
Quality control of the CDS-CMIP6-DCPP subset
The CDS subset of the DCPP data has been through a set of quality control checks before being made available through the CDS. The objective of the quality control process is to ensure that all files in the CDS meet a minimum standard. Data files were required to pass all stages of the quality control process before being made available through the CDS. Data files that fail the quality control process are excluded from the CDS-CMIP6-DCPP subset and the data providers are contacted. If they are able to release a new version of the data with the error corrected, then, provided this data passes all remaining QC steps, it may be included in the next DCPP data release.
The main aim of the quality control procedure is to check for metadata and gross data errors in the CMIP6 files and datasets. A brief description of each of the QC checks is provided here:
- CF-Checks: The CF-checker tool checks that each NetCDF4 file in a given dataset is compliant with the Climate and Forecast (CF) conventions. Compliance ensures that the files are interoperable across a range of software tools.
- PrePARE: The PrePARE software tool is provided by PCMDI (Program for Climate Model Diagnosis and Intercomparison) to verify that CMIP6 files conform to the CMIP6 data protocol. All CMIP6 data should meet this required standard; however, this check is included to ensure that all data supplied to the CDS have passed this QC test.
- nctime: The nctime checker checks the temporal axis of the NetCDF files. For each NetCDF file the temporal element of the file is compared with the time axis data within the file to ensure consistency. For a time-series of data comprised of several NetCDF files nctime ensures that the entire timeseries is complete, that there are no temporal gaps or overlaps in either the filename or in the time axes within the files.
- Errata: The dataset is checked to ensure that no outstanding Errata record exists.
- Data Ranges: A set of tests on the extreme values of the variables is performed to ensure that the values of the variables fall into physically realistic ranges.
- Handle record consistency checks: This check ensures that the version of the dataset used is the most recently published dataset by the modelling centre. It also checks for any inconsistency in the ESGF publication and excludes any datasets that may have inconsistent ESGF publication metadata.
- Exists at both partner sites: It is asserted that each dataset exists at both partner ESGF data nodes at IPSL and DKRZ.
It is important to note that passing these quality control tests should not be confused with validity: for example, it will be possible for a file to pass all QC steps but contain errors in the data that have not been identified by either data providers or data users.
In cases where the quality control picks up errors that are related to minor technical details of the conventions, or behaviour that is in line with expectations for climate model output despite being unexpected in a physical system, the data will be published with details of the errors referenced in the documentation. An example of the second type of error is given by negative salinity values which occur in one model as a result of rapid release of fresh water from melting sea-ice. These negative values are part of the noise associated with the numerical simulation and reflect what is happening in the numerical model.
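The idea behind the Data Ranges check can be illustrated with a simple bounds test. This is a sketch only and not the QC software used for the CDS: the bounds below are hypothetical values chosen for illustration, and the actual thresholds applied in the quality control are not given in this document.

```python
from typing import Iterable

# Hypothetical plausibility bounds, for illustration only
BOUNDS = {
    "tas": (150.0, 350.0),        # K
    "psl": (80000.0, 110000.0),   # Pa
    "pr": (0.0, 0.01),            # kg m-2 s-1
}

def out_of_range(variable_id: str, values: Iterable[float]) -> list[float]:
    """Return the values that fall outside the bounds for a variable.

    NaN values are also returned, because they fail both comparisons.
    """
    lower, upper = BOUNDS[variable_id]
    return [value for value in values if not (lower <= value <= upper)]

print(out_of_range("tas", [250.0, 400.0, 288.0]))
print(out_of_range("pr", [0.0, -1e-6]))
```

As the paragraph above explains, a flagged value is not necessarily an error: it is a prompt to check whether the behaviour is expected for the model in question.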
In-file metadata modifications
Some updates have been applied to the DCPP NetCDF files in the CDS. These conform with the CF Metadata Conventions and improve the usability of the time dimension when multiple overlapping decadal experiments with different start dates are used together (the additional time coordinates make it easier to use multiple datasets in parallel and enable unambiguous selection of time). The specific modifications are:
- A “realization” variable is added, to represent the ensemble member
- The “sub_experiment_id” global attribute is adjusted to include the start year and month of the simulation
- A “reftime” variable is added, representing the start time of the simulation
- A “leadtime” coordinate variable is added, which is the prediction range of the forecasts: this is calculated from the “reftime” and the valid times from the existing time variable
- The "long_name" attribute of the "time" coordinate is updated to "valid_time".
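The relationship between the start time, the valid time and the lead time can be sketched as follows. This is an illustration of the concept rather than the calculation used to produce the files: the function name is hypothetical, and the lead time is counted in whole months so that the result does not depend on the model calendar. The units of the “leadtime” variable in the files themselves are not specified here.

```python
def lead_month(ref_year: int, ref_month: int,
               valid_year: int, valid_month: int) -> int:
    """Whole months between the start (reftime) and the valid time.

    A lead of 0 is the month in which the simulation starts.
    """
    lead = (valid_year - ref_year) * 12 + (valid_month - ref_month)
    if lead < 0:
        raise ValueError("valid time is earlier than the start time")
    return lead

# The same valid month (January 1965) seen from different start dates:
for start_year in (1960, 1962, 1964):
    print(start_year, lead_month(start_year, 11, 1965, 1))
```

Selecting by valid time picks the same calendar month from every start date, whereas selecting by lead time picks the same forecast range; having both coordinates in the file makes either selection unambiguous.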
Citation information
The CMIP6 data Citation Service provides information for data users on how to cite CMIP6 DCPP data and on the data license. Available CMIP6 data citations are discoverable in the ESGF or in the Citation Search at: http://bit.ly/CMIP6_Citation_Search (search for DCPP at the top of the page).
Known issues
CDS users will be directed to the CMIP6 ES-DOC Errata Service (see dcppA-hindcast and dcppB-forecast for experiment ID) for known issues with the wider CMIP6 data pool. Data provided to the CDS should not contain any errors or be listed in the Errata Service. However, the service remains a useful resource for CDS users, because data they are looking for but cannot access may have been withheld from the CDS for justifiable reasons.
In particular, the "daily maximum near-surface air temperature" variable is missing for the HadGEM3-GC31-MM model because "grid point single time step spikes leading to excessively large daily maximum temperature value" were found in this model. Details of the problem can be found at https://errata.es-doc.org/static/view.html?uid=76b3f818-d65f-c76b-bfd8-cae5bc27825c
Subsetting and downloading data
CDS users are able to apply subsetting operations to CMIP6 decadal datasets. This mechanism (the "roocs" WPS framework) runs at each of the partner sites: IPSL and DKRZ. The WPS can receive requests for processing based on dataset identifiers, a temporal range, a bounding box and a range of vertical levels. Each request is converted to a job that is run asynchronously on the processing servers at the partner sites. NetCDF files are generated and the response contains download links to each of the files. Users of the CDS will be able to make subsetting selections using the web forms provided by the CDS catalogue web-interface. More advanced users will be able to define their own API requests in the CDS Toolbox that will call the WPS. Output files will be automatically retrieved so that users can access them directly within the CDS.
References
Boer, G. J., D. M. Smith, C. Cassou, F. Doblas-Reyes, G. Danabasoglu, B. Kirtman, Y. Kushnir, M. Kimoto, G. A. Meehl, R. Msadek, W. A. Mueller, K. E. Taylor, F. Zwiers, M. Rixen, Y. Ruprich-Robert, R. Eade (2016), The Decadal Climate Prediction Project (DCPP) contribution to CMIP6, Geosci. Model Dev., 9, 3751-3777. doi: 10.5194/gmd-9-3751-2016.
Eyring, V. et al. (2016) ‘Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization’, Geoscientific Model Development, 9(5), pp. 1937–1958. doi: 10.5194/gmd-9-1937-2016.
Climate Change 2021: The Physical Science Basis, the Working Group I contribution to the Sixth Assessment Report. Available at: https://www.ipcc.ch/report/sixth-assessment-report-working-group-i/ (Accessed: 14 September 2021)
World Climate Research Programme (2020) CMIP Phase 6 (CMIP6): Overview CMIP6 Experimental Design and Organization. Available at: https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6 (Accessed: 2 November 2020).