I'm trying to extract a timeseries for a single point from one of the EFAS datasets. I can retrieve the data from the catalogue:

print(data)

which gives the following output:

<xarray.DataArray 'snow_amount' (time: 152, y: 950, x: 1000)>
dask.array<shape=(152, 950, 1000), dtype=float32, chunksize=(48, 950, 1000)>
Coordinates:
  * y           (y) float32 752500.0 757500.0 762500.0 ... 5492500.0 5497500.0
  * x           (x) float32 2502500.0 2507500.0 ... 7492500.0 7497500.0
  * time        (time) datetime64[ns] 2012-01-01T06:00:00 ... 2012-05-31T06:00:00
    leadtime    timedelta64[ns] ...
    surface     int64 ...
    lat         (y, x) float64 dask.array<shape=(950, 1000), chunksize=(950, 1000)>
    lon         (y, x) float64 dask.array<shape=(950, 1000), chunksize=(950, 1000)>
    valid_time  (time) datetime64[ns] dask.array<shape=(152,), chunksize=(48,)>
Attributes:
    long_name:                Snow Amount
    units:                    kg m^-2
    grid_mapping:             lambert_azimuthal_equal_area
    standard_name:            surface_snow_amount
    cell_methods:             area: mean where land time: mean
    cell_measures:            area: areacella
    comment:                  where land over land, this is computed as the m...
    type:                     real
    Conventions:              CF-1.7
    institution:              European Centre for Medium-Range Weather Forecasts
    history:                  2019-10-17T08:37:36 GRIB to CDM+CF via cfgrib-0...
    grid_mapping_short_name:  laean
    source:                   EFAS


However, following the extract time series example, this:

data_sel = ct.geo.extract_point(data, lon=lon, lat=lat)

gives me an error:

Traceback (most recent call last):
  File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 49, in handle_request
    result = cached(context.method, proc, context, *context.args, **context.kwargs)
(...)
  File "/usr/local/lib/python3.6/site-packages/xarray/core/indexing.py", line 207, in get_dim_indexers
    % invalid)
ValueError: dimensions or multi-index levels ['lon', 'lat'] do not exist


This is, of course, not particularly helpful. Googling this as an xarray error message suggests that the problem could be that the lon/lat coordinates are in fact 2-dimensional, but I'm stumped as to what the solution then would be. I tried several examples of indexing xarray DataArrays but none of them seem to work as expected. The cube.select function gives me the same error message as above.  The function geo.make_regular seemed promising, but throws up a key error on 'laean'.


How can I perform this basic task, which is to select the data for a single point based on it's lat/lon value? 


Many thanks in advance for any pointers!

9 Comments

  1. Dear Rutger,

    I have never used this particular data, do you have the CDS catalogue reference and maybe the Toolbox retrieve that you used so I can try things out?

    I am surprised that the retrieved data is not in a regular grid which makes me think that the data is not Toolbox ready yet but I might be wrong on that.

    ct.geo.extract_point does not work because the data does not have 'lat' and 'lon' as dimension. However extract_point has two keyword arguments "lon_dim" and "lat_dim" where you can specify which dimensions should be considered as lat and lon. In your case you could try:

    ct.geo.extract_point(data, lat_dim='y', lon_dim='x', lat=757500.0, lon=2507500.0)

    I have never tested this as I don't have the data to test it but it should work. Obviously specifying the location with x and y is not that handy but it might be better than nothing.

    I hope some of this helps.

    Vivien


  2. Hi Vivien


    I will try that solution, the challenge of course is to find the correct X/Y coordinates for my lon/lat of interest. 


    The catalogue entry of the dataset is https://cds.climate.copernicus.eu/cdsapp#!/dataset/efas-historical?tab=overview and I will copy the Toolbox request below (which I based on the Toolbox request shown in the catalogue). 


    data = ct.catalogue.retrieve(
    'efas-historical',
    {
    'format':'netcdf',
    'simulation_version':'version_3',
    'variable':'snow_depth_water_equivalent',
    'model_levels':'surface_level',
    'hyear':'2012',
    'hmonth':[
    'january','february',
    'march','april','may'
    ],
    'hday':[
    '01','02','03',
    '04','05','06',
    '07','08','09',
    '10','11','12',
    '13','14','15',
    '16','17','18',
    '19','20','21',
    '22','23','24',
    '25','26','27',
    '28','29','30',
    '31'
    ]
    })


    (Apologies, for whatever reason indentation is not preserved when I copy the code here). 


    Thanks

    Rutger


  3. Incidentally, I get the same error message when I try it out with the UERRA dataset, with the following code:


     # Time range
    data = ct.catalogue.retrieve(
    'reanalysis-uerra-europe-single-levels',
    {
    'format':'grib',
    'origin':'uerra_harmonie',
    'variable':'snow_depth_water_equivalent',
    'year':'2000',
    'month':[
    '01','02','03',
    '04','05'
    ],
    'day':[
    '01','02','03',
    '04','05','06',
    '07','08','09',
    '10','11','12',
    '13','14','15',
    '16','17','18',
    '19','20','21',
    '22','23','24',
    '25','26','27',
    '28','29','30',
    '31'
    ],
    'time':'12:00'
    })


    print(data)

    data_sel = ct.geo.extract_point(data, lon=lon, lat=lat)


    I get the following output:


    <xarray.DataArray 'snow_amount' (time: 152, y: 565, x: 565)>
    dask.array<shape=(152, 565, 565), dtype=float32, chunksize=(48, 565, 565)>
    Coordinates:
        forecast_reference_time  (time) datetime64[ns] dask.array<shape=(152,), chunksize=(48,)>
        leadtime                 timedelta64[ns] ...
        surface                  int64 ...
        lat                      (y, x) float64 dask.array<shape=(565, 565), chunksize=(565, 565)>
        lon                      (y, x) float64 dask.array<shape=(565, 565), chunksize=(565, 565)>
      * time                     (time) datetime64[ns] 2000-01-01T12:00:00 ... 2000-05-31T12:00:00
    Dimensions without coordinates: y, x
    Attributes:
        long_name:      Snow Amount
        units:          kg m^-2
        standard_name:  surface_snow_amount
        cell_methods:   area: mean where land time: mean
        cell_measures:  area: areacella
        comment:        where land over land, this is computed as the mean amount...
        type:           real
        Conventions:    CF-1.7
        institution:    Norrkoping
        history:        2019-10-17T16:23:47 GRIB to CDM+CF via cfgrib-0.9.5.5/ecC...
        source:         UERRA-SINGLE_LEVELS


    And then the following error:


    Traceback (most recent call last):
      File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 49, in handle_request
        result = cached(context.method, proc, context, *context.args, **context.kwargs)
    (...)
    File "/usr/local/lib/python3.6/site-packages/xarray/core/indexing.py", line 207, in get_dim_indexers % invalid) ValueError: dimensions or multi-index levels ['lon', 'lat'] do not exist


    The problem seems similar in that lat, lon are coordinates but not dimensions, but here x and y are 'dimensions without coordinates' so I wouldn't immediately know how to go about referencing them.


    Cheers!

    Rutger

  4. Dear Rutger,

    I looked into the make_regular problem and the function is poorly documented and is missing two keyword arguments 'xref' and 'yref' in the documentation. This will be corrected.

    This means you can actually use ct.geo.make_regular(data, xref='x', yref='y') on both UFAS and UERRA data you were trying to work with and get regular gridded data to work with. I have copied two examples of retrieve / make_regular / extract_point that worked for me. 

    I did have a few Run time errors while I was testing, it might have to do with where the UERRA and EFAS data are pulled from as I have not experienced any similar errors working on other datasets lately. My advice with run time errors is to just try again.


    import cdstoolbox as ct
    
    
    @ct.application(title='Hello World!')
    @ct.output.download()
    def application():
    
        data = ct.catalogue.retrieve(
            'reanalysis-uerra-europe-single-levels',
            {
                'format':'grib',
                'origin':'uerra_harmonie',
                'variable':'snow_depth_water_equivalent',
                'year':'2000',
                'month':[
                    '01',
                ],
                'day':[
                    '01', '02'
                ],
                'time':'12:00'
            })
    
        data_reg = ct.geo.make_regular(data, xref='x', yref='y')
        
        data_point = ct.geo.extract_point(data_reg, lat=45.8326, lon=6.8652)
        
        print(data_point)
        return data_point
    
    
    import cdstoolbox as ct
    
    
    @ct.application(title='Hello World!')
    @ct.output.download()
    def application():
    
        data = ct.catalogue.retrieve(
            'efas-historical',
            {
                'format':'netcdf',
                'simulation_version':'version_3',
                'variable':'snow_depth_water_equivalent',
                'model_levels':'surface_level',
                'hyear':'2012',
                'hmonth':[
                    'january',
                ],
                'hday':[
                    '01','02',
                ]
            })
        
        data_reg = ct.geo.make_regular(data, xref='x', yref='y')
        
        data_point = ct.geo.extract_point(data_reg, lat=45.8326, lon=6.8652)
        
        print(data_point)
        return data_point
  5. Thank you, Vivien! That seems to work indeed - and useful to know.


    All the best

    Rutger


  6. Great, happy I could help.

    All the best.

    Vivien

  7. Hello,

    As a beginner in using the Toolbox editor I would like to ask a question about extracting data from an EFAS set for a specific point. For the following application I receive an error and I have difficulties in solving it. How can I solve it? 

    Thank you.

    import cdstoolbox as ct

    @ct.application(title='Download data')
    @ct.output.download()
    def download_application():
    data = ct.catalogue.retrieve(
    'efas-forecast',
    {
    'originating_centre': 'ecmwf',
    'product_type': 'high_resolution_forecast',
    'variable': 'river_discharge_in_the_last_6_hours',
    'model_levels': 'surface_level',
    'year': '2021',
    'month': '03',
    'day': [
    '01', '02', '03',
    '04', '05', '06',
    '07',
    ],
    'time': '12:00',
    'leadtime_hour': [
    '0', '102', '108',
    '114', '12', '120',
    '126', '132', '138',
    '144', '150', '156',
    '162', '168', '174',
    '18', '180', '186',
    '192', '198', '204',
    '210', '216', '222',
    '228', '234', '24',
    '240', '30', '36',
    '42', '48', '54',
    '6', '60', '66',
    '72', '78', '84',
    '90', '96',
    ],
    }
    )

    data_reg = ct.geo.make_regular(data, xref='x', yref='y')

    data_point = ct.geo.extract_point(data_reg, lat=44.8087, lon=21.3901)

    return data_point


    Traceback (most recent call last):
      File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 55, in handle_request
        result = cached(context.method, proc, context, context.args, context.kwargs)
      File "/opt/cdstoolbox/cdscompute/cdscompute/caching.py", line 108, in cached
        result = proc(context, *context.args, **context.kwargs)
      File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 118, in __call__
        return p(*args, **kwargs)
      File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 59, in __call__
        return self.proc(context, *args, **kwargs)
      File "/home/cds/cdsservices/services/python_service.py", line 38, in execute
        raise exceptions.InternalError(logging + traceback, '')
    cdsclient.exceptions.InternalError: Traceback (most recent call last):
      File "/opt/cdstoolbox/jsonrequest/jsonrequest/requests.py", line 71, in jsonrequestcall
        resp = coding.encode(req.callable(*req.args, **req.kwargs), register=encoders, **context)
      File "/opt/cdstoolbox/cdscdo/cdscdo/cdo.py", line 274, in _make_regular
        _fwrite_regular(data, xc, yc, regular_path)
      File "/opt/cdstoolbox/cdscdo/cdscdo/cdo.py", line 332, in _fwrite_regular
        xsize = data[x].size
    TypeError: list indices must be integers or slices, not str
  8. Hi Catalina


    The primary reason you are getting that error is that you are getting an empty data array back from the CDS. In case of the EFAS datasets, you need to add a keyword "format":"netcdf" to your retrieve statement in the Toolbox. I've come across this issue before, which is why I knew there was a solution!

    If you include the keyword, you will actually get some data back (you can check this by adding a print statement) with the following coordinates:

    Coordinates:
      * y                        (y) float32 752500.0 757500.0 ... 5497500.0
      * x                        (x) float32 2502500.0 2507500.0 ... 7497500.0
      * forecast_reference_time  (forecast_reference_time) datetime64[ns] 2021-03...
      * leadtime                 (leadtime) timedelta64[ns] 0 days 06:00:00 ... 3...
        surface                  int64 ...
        time                     (forecast_reference_time, leadtime) datetime64[ns] ...
        lat                      (y, x) float64 ...
        lon                      (y, x) float64 ...


    Unfortunately, the script will then fail in the second step (make_regular) with a different error: 

    ValueError: Unsupported: dataset with more than one variable.

    As far as I can see, there's only one variable in the array, so I'm not sure why this error comes up. Maybe someone from the Toolbox team can shed some light here?


    All the best

    Rutger




  9. Hi Rutger,


    Thank you for help, I added the format and I got the error you mentioned. 

    I hope to find some solution.

    All the best,

    Catalina