I'm downloading daily CMIP6 data for maximum air temperature (tasmax) and asked for a spatial subset covering Brazil. When I open the NetCDF file in Panoply, I get a very strange image, with the values from the easternmost border interpolated into the western border and wrapping around the globe.

And if I use CDO to query the file information, I get an error:

    cdo sinfo tasmax_day_GFDL-ESM4_ssp585_r1i1p1f1_gr1_20150101-21001231_v20180701.nc
    Warning (cdf_set_var): Inconsistent variable definition for lat_bnds!
    Warning (cdf_set_var): Inconsistent variable definition for lon_bnds!
    Warning (cdf_set_var): Inconsistent variable definition for time_bnds!
    Segmentation fault (core dumped)

I tried both the MIROC and GFDL models, downloading both from the site and with the Python API.

I just checked, and if I don't do a spatial subset, the CDO command works fine. So I believe that the spatial subsetting is generating a malformed NetCDF file.

Has this happened to anyone else?

5 Comments

  1. I just found out that ncinfo works on the file. But when I use ncdump to look at the lat_bnds variable, I get what appears to be an infinite dump of the latitude values. Meanwhile, the same ncdump for the lat_bnds variable in the NON-subsetted dataset works fine.

    Here is the ncinfo output on the offending file:

    m330625@desk7802:~/geodb/cmip6$ ncinfo tasmax_day_GFDL-ESM4_ssp585_r1i1p1f1_gr1_20150101-21001231_v20180701.nc
    <class 'netCDF4._netCDF4.Dataset'>
    root group (NETCDF4 data model, file format HDF5):
    external_variables: areacella
    history: File was processed by fremetar (GFDL analog of CMOR). TripleID: [exper_id_FlJGh4Wo6W,realiz_id_kt2pvOSbWt,run_id_1S546GMKbs]
    table_id: day
    activity_id: ScenarioMIP
    branch_method: standard
    branch_time_in_child: 60225.0
    branch_time_in_parent: 60225.0
    comment: <null ref>
    contact: gfdl.climate.model.info@noaa.gov
    Conventions: CF-1.7 CMIP-6.0 UGRID-1.0
    creation_date: 2019-06-19T01:17:16Z
    data_specs_version: 01.00.27
    experiment: update of RCP8.5 based on SSP5
    experiment_id: ssp585
    forcing_index: 1
    frequency: day
    further_info_url: https://furtherinfo.es-doc.org/CMIP6.NOAA-GFDL.GFDL-ESM4.ssp585.none.r1i1p1f1
    grid: atmos data regridded from Cubed-sphere (c96) to 180,288; interpolation method: conserve_order2
    grid_label: gr1
    initialization_index: 1
    institution: National Oceanic and Atmospheric Administration, Geophysical Fluid Dynamics Laboratory, Princeton, NJ 08540, USA
    institution_id: NOAA-GFDL
    license: CMIP6 model data produced by NOAA-GFDL is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (https://creativecommons.org/licenses/). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file). The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law.
    mip_era: CMIP6
    nominal_resolution: 100 km
    parent_activity_id: CMIP
    parent_experiment_id: historical
    parent_mip_era: CMIP6
    parent_source_id: GFDL-ESM4
    parent_time_units: days since 1850-1-1
    parent_variant_label: r1i1p1f1
    physics_index: 1
    product: model-output
    realization_index: 1
    realm: atmos
    source: GFDL-ESM4 (2018):
    atmos: GFDL-AM4.1 (Cubed-sphere (c96) - 1 degree nominal horizontal resolution; 360 x 180 longitude/latitude; 49 levels; top level 1 Pa)
    ocean: GFDL-OM4p5 (GFDL-MOM6, tripolar - nominal 0.5 deg; 720 x 576 longitude/latitude; 75 levels; top grid cell 0-2 m)
    seaIce: GFDL-SIM4p5 (GFDL-SIS2.0, tripolar - nominal 0.5 deg; 720 x 576 longitude/latitude; 5 layers; 5 thickness categories)
    land: GFDL-LM4.1
    aerosol: interactive
    atmosChem: GFDL-ATMCHEM4.1 (full atmospheric chemistry)
    ocnBgchem: GFDL-COBALTv2
    landIce: GFDL-LM4.1
    (GFDL ID: 2019_0301)
    source_id: GFDL-ESM4
    source_type: AOGCM AER CHEM BGC
    sub_experiment: none
    sub_experiment_id: none
    title: NOAA GFDL GFDL-ESM4 model output prepared for CMIP6 update of RCP8.5 based on SSP5
    tracking_id: hdl:21.14100/67242666-b902-45dd-96fa-64875db6a5bb
    variable_id: tasmax
    variant_info: N/A
    references: see further_info_url attribute
    variant_label: r1i1p1f1
    dimensions(sizes): bnds(2), lat(39), time(31390), lon(32)
    variables(dimensions): float64 bnds(bnds), float64 height(), float64 lat(lat), float64 lat_bnds(time,lat,bnds), float64 lon(lon), float64 lon_bnds(time,lon,bnds), float32 tasmax(time,lat,lon), int64 time(time), float64 time_bnds(time,bnds)
    groups:


    And here is the ncinfo output for the variable with the inconsistent definition:

    m330625@desk7802:~/geodb/cmip6$ ncinfo -v lat_bnds tasmax_day_GFDL-ESM4_ssp585_r1i1p1f1_gr1_20150101-21001231_v20180701.nc
    <class 'netCDF4._netCDF4.Variable'>
    float64 lat_bnds(time, lat, bnds)
    _FillValue: nan
    long_name: latitude bounds
    coordinates: height
    unlimited dimensions:
    current shape = (31390, 39, 2)
    filling on
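
The shape reported above suggests why ncdump looks like it never finishes on lat_bnds. A rough sketch of the arithmetic, using only the dimension sizes from the ncinfo output and assuming the bounds would normally have the usual (lat, bnds) and (lon, bnds) shapes with no time axis:

```python
# Dimension sizes copied from the ncinfo output above.
dim_sizes = {"bnds": 2, "lat": 39, "time": 31390, "lon": 32}

# Bounds variables as they are stored in the subsetted file.
stored = {
    "lat_bnds": ("time", "lat", "bnds"),
    "lon_bnds": ("time", "lon", "bnds"),
}


def n_values(dims):
    """Number of elements in a variable with the given dimensions."""
    total = 1
    for name in dims:
        total *= dim_sizes[name]
    return total


def without_time(dims):
    """Same dimensions with the time axis dropped (assumed normal layout)."""
    return tuple(d for d in dims if d != "time")


for var, dims in stored.items():
    actual = n_values(dims)
    expected = n_values(without_time(dims))
    print(f"{var}: {actual} values stored, {expected} expected without time")
```

So the dump is not infinite, but with the extra time axis lat_bnds holds roughly 2.4 million values instead of 78, which would look endless on a terminal.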

  2. Found out that I can clean the file using the nccopy command, extracting just the lat, lon, time, and tasmax variables to a new file.

    nccopy -v lat,lon,time,tasmax tasmax_day_GFDL-ESM4_ssp585_r1i1p1f1_gr1_20150101-21001231_v20180701.nc test.nc

    But I'm not sure what I'm missing when I throw away the lat_bnds, lon_bnds, and time_bnds variables.

  3. Glad to find this thread. I thought someone else must have wondered why the time dimension is in lat_bnds and lon_bnds, and how to correct it. I took a similar approach to remove and replace the bounds so cdo functions work:

    ncks -C -O -x -v lat_bnds,lon_bnds,time_bnds in.nc out1.nc

    ncap2 -O -s 'defdim("bnds",2); lon_bnds=make_bounds(lon,$bnds); lat_bnds=make_bounds(lat,$bnds); time_bnds=make_bounds(time,$bnds)' out1.nc out2.nc
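
For anyone wondering what is lost and regained when the bounds are dropped and rebuilt: bounds variables describe the edges of each grid cell, and for a regularly spaced axis they can be inferred from the cell centers. The sketch below illustrates that idea with plain midpoints. It is not NCO's make_bounds implementation, which may differ in details, and the function name and sample latitudes are made up for illustration.

```python
def midpoint_bounds(centers):
    """Return [lower, upper] pairs for monotonic 1-D cell centers.

    Interior edges are midpoints between neighbouring centers; the two
    outer edges are extrapolated by half the adjacent spacing.
    """
    if len(centers) < 2:
        raise ValueError("need at least two centers to infer spacing")
    edges = [centers[0] - (centers[1] - centers[0]) / 2]
    for left, right in zip(centers, centers[1:]):
        edges.append((left + right) / 2)
    edges.append(centers[-1] + (centers[-1] - centers[-2]) / 2)
    return [[lo, hi] for lo, hi in zip(edges, edges[1:])]


# Hypothetical 1-degree latitudes, not taken from the file.
lats = [-10.5, -9.5, -8.5, -7.5]
print(midpoint_bounds(lats))
```

Each center ends up with one [lower, upper] pair, so the rebuilt variable has shape (lat, bnds) with no time axis, which is the layout CDO accepted for the non-subsetted file.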


  4. My attempt to use nccopy did not work. Something was left behind. I wish I knew how to use ncks and ncap2 so I could fix the file in an easier way. I ended up writing an R script that rewrites the NetCDF file without the bnds variables.

    If anyone is interested, here goes the script:

    https://gist.github.com/dvictori/b439f19fa1c67707059847442a4e9c3a


  5. I deleted the dimension "bnds" using the following code:

    $ ncwa -a bnds infile.nc outfile.nc

    And CDO worked perfectly fine afterwards, since the outfile.nc has regular time,lev,lat,lon dimensions.

    However, "nccopy" did not work for me.