Introduction
What are global climate projections?
Global climate projections are climate model simulations that have been generated by multiple independent climate research centres in an effort coordinated by the World Climate Research Programme (WCRP) and assessed by the Intergovernmental Panel on Climate Change (IPCC). These climate projections underpin the conclusion of the IPCC 5th Assessment Report (published in 2013) that “Continued emission of greenhouse gases will cause further warming and long-lasting changes in all components of the climate system, increasing the likelihood of severe, pervasive and irreversible impacts for people and ecosystems”.
The Coupled Model Intercomparison Project (CMIP)
The Coupled Model Intercomparison Project (CMIP) was established in 1995 by the World Climate Research Programme (WCRP) to provide climate scientists with a database of coupled General Circulation Model (GCM) simulations.
The CMIP process involves institutions (such as national meteorological centres or research institutes) from around the world running their climate models with an agreed set of input parameters. The modelling centres produce a set of standardised output. When combined, these outputs form a multi-model dataset that can be shared internationally between modelling centres so that the results can be compared.
Analysis of the CMIP data allows for
- improving understanding of the climate, including its variability and change,
- improving understanding of the societal and environmental implications of climate change in terms of impacts, adaptation and vulnerability,
- informing the Intergovernmental Panel on Climate Change (IPCC) reports.
Comparison of different climate models allows for
- determining why similarly forced models produce a range of responses,
- evaluating how realistic the different models are in simulating the recent past,
- examining climate predictability.
CMIP5
The fifth phase of the Coupled Model Intercomparison Project (CMIP5, 2008-2012) involved 24 modelling centres running their climate models under prescribed conditions to produce a multi-model dataset designed to advance our knowledge of climate (Taylor et al. 2012). The scientific analyses from CMIP5 were used extensively in the Intergovernmental Panel on Climate Change (IPCC) 5th Assessment Report (IPCC AR5), published in September 2013.
The CMIP5 data archive is distributed through the Earth System Grid Federation (ESGF), though many national centres hold either a full or partial copy of the data for their scientists to use. A quality-controlled subset of CMIP5 data are made available through the Climate Data Store (CDS) for the users of the Copernicus Climate Change Service (C3S).
To obtain full details of the whole CMIP5 data archive please refer to the full documentation at the Program for Climate Model Diagnosis & Intercomparison (PCMDI). An introductory factsheet for an overview of an IPCC subset also provides a useful guide to the CMIP5 data.
CMIP6
The sixth phase of the Coupled Model Intercomparison Project (CMIP6) is in progress. Approximately 40 modelling centres are participating in this phase of CMIP. During the period 2019-2020 modelling centres are standardising and releasing their data to be distributed internationally through the Earth System Grid Federation (ESGF). It is expected that the Climate Data Store (CDS) will begin making CMIP6 data available from 2021.
Global climate projections in the CDS
The global climate projections in the Climate Data Store (CDS) are a quality-controlled subset of the wider CMIP5 data and represent only a small part of the CMIP5 archive. A set of 50 core variables, the most used of the CMIP5 data, was identified for the CDS. These variables are provided from seven of the most popular CMIP5 experiments.
The CDS subset of CMIP5 data have been through a metadata quality control procedure, which ensures a high standard of reliability of the data. Similar data may be found in the main CMIP5 archive; however, those data come with no quality assurance and may have metadata errors or omissions. The quality control process further reduces the CDS subset of CMIP5 data by excluding data that have metadata errors or inconsistencies. It is important to note that passing the quality control should not be confused with validity: for example, a file can have fully compliant metadata but contain gross errors in the data that have not been noted. In other words, the quality control is purely technical and does not include any scientific evaluation (for instance, a consistency check).
Experiments
The CDS-CMIP5 subset consists of the following CMIP5 experiments:
- amip: An atmosphere-only configuration of the model as in the Atmospheric Model Intercomparison Project (AMIP, a precursor to CMIP). Models impose sea surface temperatures (SSTs) and sea ice (from observations over 1979 to at least 2008), with other conditions, including CO2 concentrations and aerosols, prescribed in the same way as in the ‘historical’ experiment.
- historical: Models impose changing conditions (consistent with observations from 1850-2005), which may include: atmospheric composition due to both anthropogenic and volcanic influences, solar forcing, emissions or concentrations of short-lived species and natural and anthropogenic aerosols or their precursors, as well as land use.
- piControl (Pre-industrial Control): Models impose non-evolving, pre-industrial conditions, which may include prescribed atmospheric concentrations or non-evolving emissions of gases, aerosols or their precursors, as well as unperturbed land use.
  - The piControl experiment is often run for a large number of years (500 or more) so that the models can reach an equilibrium state. As a result, the time axis of the data from this experiment uses model years rather than representative calendar years. To avoid confusion, data from this experiment are currently available only through the CDS API and are not visible in the data download menu.
- Scenario experiments RCP2.6, RCP4.5, RCP6.0, RCP8.5: Future projections (2006-2100) forced by RCP2.6, 4.5, 6.0, and 8.5. RCPs (representative concentration pathways) approximately result in radiative forcings of 2.6, 4.5, 6.0 and 8.5 W m-2 at the year 2100 respectively, relative to pre-industrial conditions.
Models, grids and pressure levels
Models
The models included in the CDS-CMIP5 subset are detailed in the table below; these include most of the models from the main CMIP5 archive. A small number of models are not included because their data carry a research-only restriction on use, whereas all data in the CDS are released without restriction. For this reason the MIROC and MRI models from Japan are not included.
The following table lists the global climate models in use in the CDS, with a brief description of each model where this information is readily available. Further details can be found on the Earth System Documentation site.
Pressure levels
For pressure level data the model output is available on the pressure levels according to the table below. Note that not all models provide the same pressure levels.
Frequency | Number of Levels | Pressure Levels (hPa) |
Daily | 8 | 1000., 850., 700., 500., 250., 100., 50., 10. |
Monthly | 17 | 1000., 925., 850., 700., 600., 500., 400., 300., 250., 200., 150., 100., 70., 50., 30., 20., 10. |
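As a minimal sketch, the table above can be held in a small lookup structure to check whether a requested level is listed for a given frequency. The names PRESSURE_LEVELS_HPA, is_listed and monthly_only_levels are illustrative and are not part of the CDS or any CMIP5 tool. Individual models may still provide fewer levels than the table lists.

```python
# Pressure levels (hPa) copied from the table above.
PRESSURE_LEVELS_HPA = {
    "daily": [1000, 850, 700, 500, 250, 100, 50, 10],
    "monthly": [1000, 925, 850, 700, 600, 500, 400, 300, 250,
                200, 150, 100, 70, 50, 30, 20, 10],
}


def is_listed(frequency, level_hpa):
    """Return True if the level appears in the table for this frequency."""
    return level_hpa in PRESSURE_LEVELS_HPA[frequency]


def monthly_only_levels():
    """Return the monthly levels that have no daily counterpart."""
    daily = set(PRESSURE_LEVELS_HPA["daily"])
    return [lv for lv in PRESSURE_LEVELS_HPA["monthly"] if lv not in daily]


print(is_listed("daily", 925))
print(monthly_only_levels())
```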
Ensembles
Each modelling centre typically runs the same experiment with the same model several times to confirm the robustness of results and to inform sensitivity studies through the generation of statistical information. A model and its collection of runs is referred to as an ensemble. Within these ensembles, three different categories of sensitivity studies are done, and the resulting individual model runs are labelled by three integers indexing the experiments in each category.
- The first category, labelled “realization”, performs experiments which differ only in random perturbations of the initial conditions of the experiment. Comparing different realizations allows estimation of the internal variability of the model climate.
- The second category, labelled “initialization”, refers to variation in initialisation parameters. Comparing differently initialised output provides an estimate of how sensitive the model is to initial conditions.
- The third category, labelled “physics”, refers to variations in the way in which sub-grid scale processes are represented. Comparing different simulations in this category provides an estimate of the structural uncertainty associated with choices in the model design.
Each member of an ensemble is identified by a triad of integers associated with the letters r, i and p, which index the “realization”, “initialization” and “physics” variations respectively. For instance, the members “r1i1p1” and “r1i1p2” for the same model and experiment are simulations that differ in that the physical parameters of the model were changed for the second member relative to the first.
It is very important to distinguish between variations in experiment specifications, which are globally coordinated across all the models contributing to CMIP5, and the variations which are adopted by each modelling team to assess the robustness of their own results. The “p” index refers to the latter, with the result that values have different meanings for different models, but in all cases these variations must be within the constraints imposed by the specifications of the experiment.
For the scenario experiments, the ensemble member identifier is preserved from the historical experiment providing the initial conditions, so RCP 4.5 ensemble member “r1i1p2” is a continuation of historical ensemble member “r1i1p2”.
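The identifier scheme described above can be illustrated with a short sketch that splits a label such as “r1i1p2” into its three integers and reports which indices differ between two members. The names EnsembleMember, parse_member and differing_indices are hypothetical helpers written for this illustration, not part of any CMIP5 or CDS software.

```python
import re
from typing import List, NamedTuple


class EnsembleMember(NamedTuple):
    realization: int
    initialization: int
    physics: int


# Matches labels of the form r<X>i<Y>p<Z>, where X, Y and Z are integers.
_MEMBER_RE = re.compile(r"r(\d+)i(\d+)p(\d+)")


def parse_member(label: str) -> EnsembleMember:
    """Split an ensemble member label into its three integer indices."""
    match = _MEMBER_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"not an r<X>i<Y>p<Z> label: {label!r}")
    return EnsembleMember(*(int(group) for group in match.groups()))


def differing_indices(first: str, second: str) -> List[str]:
    """Name the categories in which two ensemble members differ."""
    a, b = parse_member(first), parse_member(second)
    return [name for name in EnsembleMember._fields
            if getattr(a, name) != getattr(b, name)]


print(differing_indices("r1i1p1", "r1i1p2"))
```

Because the identifier is preserved between the historical and scenario experiments, comparing the parsed labels for equality is one simple way to pair a scenario run with the historical run it continues.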
Parameter listings
Table 1: CMIP5 data on pressure levels
Table 2: CMIP5 data on single levels
Data Format
The CDS subset of CMIP5 data are provided as NetCDF files. NetCDF (Network Common Data Form) is a file format that is freely available and commonly used in the climate modelling community. See more details: What are NetCDF files and how can I read them
A CMIP5 NetCDF file in the CDS contains:
- global metadata: these fields can describe many different aspects of the file, such as
  - when the file was created,
  - the name of the institution and model used to generate the file,
  - links to peer-reviewed papers and technical documentation describing the climate model,
  - links to supporting documentation on the climate model used to generate the file,
  - software used in post-processing.
- variable dimensions: such as time, latitude, longitude and height
- variable data: the gridded data
- variable metadata: e.g. the variable units, averaging period (if relevant) and additional descriptive data
The metadata provided in NetCDF files adhere to the Climate and Forecast (CF) conventions (v1.4 for CMIP5 data). The rules within the CF-conventions ensure consistency across data files, for example ensuring that the naming of variables is consistent and that the use of variable units is consistent.
File naming conventions
When you download a CMIP5 file from the CDS, its name follows this convention:
<variable>_<cmor_table>_<model>_<experiment>_<ensemble_member>_<temporal_range>.nc
Where
- variable is a short variable name, e.g. “tas” for “near-surface air temperature”
- cmor_table is a reference to the realm (an earth system component such as atmosphere or ocean) and frequency of the variable, e.g. “Amon” indicates that a variable is present in the atmosphere realm at a monthly frequency (link to list of these)
- model is the name of the model that produced the data
- experiment is the name of the experiment, e.g. “historical”
- ensemble_member is the ensemble identifier in the form “r<X>i<Y>p<Z>”, where X, Y and Z are integers
- temporal_range is in the form YYYYMM[DDHH]-YYYYMM[DDHH], where Y is year, M is the month, D is day and H is hour. Note that day and hour are optional (indicated by the square brackets) and are only used if needed by the frequency of the data. For example, daily data from the 1st of January 1980 to the 31st of December 2010 would be written 19800101-20101231.
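The naming convention above can be unpacked with a few lines of code. The following is a minimal sketch, assuming that the fields are separated by underscores and that no field contains an underscore itself. The function name parse_cmip5_filename and the example file name are illustrative only. Files without a temporal range are not handled.

```python
from typing import Dict

# Field order as given in the naming convention above.
FIELDS = ("variable", "cmor_table", "model", "experiment",
          "ensemble_member", "temporal_range")


def parse_cmip5_filename(filename: str) -> Dict[str, str]:
    """Split a CDS CMIP5 file name into its named components."""
    if not filename.endswith(".nc"):
        raise ValueError(f"expected a .nc file name, got {filename!r}")
    parts = filename[: -len(".nc")].split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    info = dict(zip(FIELDS, parts))
    # The temporal range holds the start and end stamps joined by a hyphen.
    start, separator, end = info["temporal_range"].partition("-")
    if not separator:
        raise ValueError(f"no start-end range in {info['temporal_range']!r}")
    info["start"], info["end"] = start, end
    return info


# Illustrative file name, constructed to follow the convention.
example = parse_cmip5_filename(
    "tas_Amon_HadGEM2-ES_rcp45_r1i1p1_200601-210012.nc"
)
print(example["variable"], example["experiment"], example["start"], example["end"])
```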
Please note that the CMIP5 file names in the CDS do not contain a version tag, unlike the file names used for the corresponding data in the ESGF nodes, which carry the versioning information. The CDS serves the latest version of the CMIP5 data, which is now complete (no new version is expected). To find the version number of the latest version published in the CDS, see the metadata of the NetCDF file. This contains a tracking identifier for the file and an HTTP address from which the version number information may be checked.
Quality control of the CDS-CMIP5 subset
The CDS subset of the CMIP5 data have been through a set of quality control checks before being made available through the CDS. The objective of the quality control process is to ensure that all files in the CDS meet a minimum standard. Data files were required to pass all stages of the quality control process before being made available through the CDS. Data files that fail the quality control process are excluded from the CDS-CMIP5 subset or if possible the error is corrected and a note made in the history attribute of the file. The quality control of the CDS CMIP5 subset checks for metadata errors or inconsistencies against the Climate and Forecast (CF) Conventions and a set of CMIP5 specific file naming and file global metadata conventions.
Various software tools have been used to check the metadata of the CDS CMIP5 data:
- The Centre for Environmental Data Analysis (CEDA) compliance checking tool CEDA-CC is used to check that:
  - the file name adheres to the CMIP5 file naming convention,
  - the global attributes of the NetCDF file are consistent with the file name,
  - there are no omissions of required CMIP5 metadata.
- The CF-Checker, a checker for the Climate and Forecast (CF) conventions, ensures that any metadata that is provided is consistent with the CF conventions.
- A time-axis-checker is used to check the temporal dimension of the data:
  - for individual files the time dimension of the data is checked to ensure it is valid and is consistent with the temporal information in the file name,
  - where more than one file is required to generate a time-series of data, the files have been checked to ensure there are no temporal gaps or overlaps between the files.
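The gap and overlap check in the last item can be illustrated with a simplified sketch. This is not the time-axis-checker used for the CDS: it only compares the YYYYMM-YYYYMM ranges taken from monthly file names and does not open the files. The function names month_index and find_time_axis_problems are hypothetical.

```python
from typing import List


def month_index(yyyymm: str) -> int:
    """Convert a 'YYYYMM' stamp to a running count of months."""
    if len(yyyymm) != 6 or not yyyymm.isdigit():
        raise ValueError(f"expected YYYYMM, got {yyyymm!r}")
    year, month = int(yyyymm[:4]), int(yyyymm[4:])
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in {yyyymm!r}")
    return year * 12 + (month - 1)


def find_time_axis_problems(ranges: List[str]) -> List[str]:
    """Report gaps or overlaps between consecutive 'YYYYMM-YYYYMM' ranges."""
    spans = sorted(
        tuple(month_index(stamp) for stamp in item.split("-"))
        for item in ranges
    )
    problems = []
    for (_, previous_end), (next_start, _) in zip(spans, spans[1:]):
        if next_start > previous_end + 1:
            missing = next_start - previous_end - 1
            problems.append(f"gap of {missing} month(s)")
        elif next_start <= previous_end:
            shared = previous_end - next_start + 1
            problems.append(f"overlap of {shared} month(s)")
    return problems


# Two contiguous files followed by one that starts a year late.
print(find_time_axis_problems(["200601-203012", "203101-205012", "205201-210012"]))
```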
The data within the files were not individually checked. However, where it was known that a variable from a given model had a gross error, e.g. in the sign convention of a flux, these data were also omitted from the CDS-CMIP5 subset.
It is important to note that passing of these quality control tests should not be confused with validity: for example, it will be possible for a file to be fully CF compliant and have fully compliant CMIP5 metadata but contain gross errors in the data that have not been noted.
For a detailed description of all the quality control of the data, please see the accompanying documentation.
Known issues
- Please note that not all combinations of models and variables exist. This is inherited from the ESGF system, where the main aim is to publish as much data as possible, including incomplete datasets that might still be of use. More data are therefore available, at the price that not every dataset is complete.