Hello.

I would like to use the CDS Toolbox to process ERA5 data and download time series as CSV. I have tried the modified code below. It works for a single time series (see test2 in the code below). I would like to download two time series (see test3 in the code below) in one CSV file. How do I do that? Can I strip test (<xarray.DataArray 'tas' (time: 24)>) down to just its values, combine several of them, and save the result as CSV?


My goal is to have 16 time series from several locations in one CSV file, with a separate CSV file for each of several parameters. The simple code below is only a test and does not implement that goal.



import cdstoolbox as ct

variables = {
    'Near-Surface Air Temperature': '2m_temperature',
    'Eastward Near-Surface Wind': '10m_u_component_of_wind',
    'Northward Near-Surface Wind': '10m_v_component_of_wind',
    'Sea Level Pressure': 'mean_sea_level_pressure',
    'Surface Pressure': 'surface_pressure',
}

@ct.application(title="AWS Matching data acquisition", description="Download point data matching AWS location for TNB Project. Main purpose is to compare AWS observation with ERA5 and ERA5-Land. Data downloaded will be further processed in R.")
@ct.input.dropdown('variable', label='Variable', values=variables.keys())
@ct.output.download()
@ct.output.download()
def application(variable):    
    
    # Retrieve hourly data for the selected variable over Europe on 1 January 1999
    data = ct.catalogue.retrieve(
        'reanalysis-era5-single-levels',
        {
            'variable': variables[variable],
            'product_type': 'reanalysis',
            'year': 1999,
            'month': 1,
            'day': 1,
            'time': [
                '00:00', '01:00', '02:00', '03:00', '04:00', '05:00', 
                '06:00', '07:00', '08:00', '09:00', '10:00', '11:00',
                '12:00', '13:00', '14:00', '15:00', '16:00', '17:00',
                '18:00', '19:00', '20:00', '21:00', '22:00', '23:00' 
            ],
            'grid': [3, 3],
            'area': [60., -11., 34., 35.], # retrieve data for Europe only
        }
    )
    
 
    test = ct.geo.spatial_average(data)

    print(test)
    test2 = ct.cdm.to_csv(test)
    
    
    #Create an empty list
    empty_list = []
    #
    temporary_list=test
    empty_list.append(temporary_list)
    empty_list.append(temporary_list-50)
    test3 = ct.cdm.to_csv(empty_list)
    print(test2)
    print(test3)
    
    return test2, test3



Yunus.


Edit, 6 January 2022: After trying various methods based on the documentation and the API, I gave up. This is my approach now. It creates 16 CSV files for a single variable and a single period. Hopefully there will eventually be an option, and documentation, for downloading everything as a single CSV file.

import cdstoolbox as ct

variables = {
    'Near-Surface Air Temperature': '2m_temperature',
    'Eastward Near-Surface Wind': '10m_u_component_of_wind',
    'Northward Near-Surface Wind': '10m_v_component_of_wind',
    'Sea Level Pressure': 'mean_sea_level_pressure',
    'Surface Pressure': 'surface_pressure',
}

Case = {
    'Case 1 1993':0,
    'Case 2 1994':1,
}

Case_Year = {
    'Case 1 1993':1993,
    'Case 2 1994':1994,
}


Case_Month = {
    'Case 1 1993':2,
    'Case 2 1994':5,
}

Case_Day = {
    'Case 1 1993':[21,27],
    'Case 2 1994':[20,27],
}



@ct.application(title="AWS Matching data acquisition", description="Download point data matching AWS location for TNB Project. Main purpose is to compare AWS observation with ERA5 and ERA5-Land. Data downloaded will be further processed in R.")
@ct.input.dropdown('variable', label='Variable', values=variables.keys())
@ct.input.dropdown('Case', label='Case Study', values=Case.keys())
@ct.output.figure()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()

def application(variable,Case):    
    
    # Retrieve hourly data for the selected variable and case-study period
    data = ct.catalogue.retrieve(
        'reanalysis-era5-single-levels',
        {
            'variable': variables[variable],
            'product_type': 'reanalysis',
            'year': Case_Year[Case],
            'month': Case_Month[Case],
            'day': Case_Day[Case],
            'time': [
                '00:00', '01:00', '02:00', '03:00', '04:00', '05:00', 
                '06:00', '07:00', '08:00', '09:00', '10:00', '11:00',
                '12:00', '13:00', '14:00', '15:00', '16:00', '17:00',
                '18:00', '19:00', '20:00', '21:00', '22:00', '23:00' 
            ],
            'grid': [3, 3],
            'area': [-70., 140., -80., 170.],  # study area (Antarctica); order is North, West, South, East
        }
    )
    
    # Compute the daily mean of the selected variable over the study area
    temperature_daily_mean = ct.cube.resample(data, freq='day', how='mean')
    fig = ct.map.plot(temperature_daily_mean)
    # The figure is only a check that the server-side application is working.
    # If no figure is plotted and the console shows a string of errors, the problem is on the server side.
    data_point_01 = ct.geo.extract_point(data,lon=166.6211,lat=-73.5861)
    data_point_02 = ct.geo.extract_point(data,lon=162.9700,lat=-76.7150)
    data_point_03 = ct.geo.extract_point(data,lon=164.0922,lat=-74.6958)
    data_point_04 = ct.geo.extract_point(data,lon=145.8589,lat=-75.5361)
    data_point_05 = ct.geo.extract_point(data,lon=148.6556,lat=-71.6525)
    data_point_06 = ct.geo.extract_point(data,lon=164.1177,lat=-74.6944)
    data_point_07 = ct.geo.extract_point(data,lon=163.4306,lat=-74.1350)
    data_point_08 = ct.geo.extract_point(data,lon=161.7720,lat=-74.9506)
    data_point_09 = ct.geo.extract_point(data,lon=160.6456,lat=-74.6392)
    data_point_10 = ct.geo.extract_point(data,lon=159.1933,lat=-72.8292)
    data_point_11 = ct.geo.extract_point(data,lon=164.0331,lat=-74.7250)
    data_point_12 = ct.geo.extract_point(data,lon=169.6000,lat=-73.0500)
    data_point_13 = ct.geo.extract_point(data,lon=163.2333,lat=-74.8167)
    data_point_14 = ct.geo.extract_point(data,lon=158.5906,lat=-75.6166)
    data_point_15 = ct.geo.extract_point(data,lon=162.8956,lat=-74.1766)
    data_point_16 = ct.geo.extract_point(data,lon=164.2205,lat=-74.6159)
    #print(data_point_1)
    
    
    
    
    download_point_01=ct.cdm.to_csv(data=data_point_01)
    download_point_02=ct.cdm.to_csv(data=data_point_02)
    download_point_03=ct.cdm.to_csv(data=data_point_03)
    download_point_04=ct.cdm.to_csv(data=data_point_04)
    download_point_05=ct.cdm.to_csv(data=data_point_05)
    download_point_06=ct.cdm.to_csv(data=data_point_06)
    download_point_07=ct.cdm.to_csv(data=data_point_07)
    download_point_08=ct.cdm.to_csv(data=data_point_08)
    download_point_09=ct.cdm.to_csv(data=data_point_09)
    download_point_10=ct.cdm.to_csv(data=data_point_10)
    download_point_11=ct.cdm.to_csv(data=data_point_11)
    download_point_12=ct.cdm.to_csv(data=data_point_12)
    download_point_13=ct.cdm.to_csv(data=data_point_13)
    download_point_14=ct.cdm.to_csv(data=data_point_14)
    download_point_15=ct.cdm.to_csv(data=data_point_15)
    download_point_16=ct.cdm.to_csv(data=data_point_16)
    return fig, download_point_01, download_point_02, download_point_03, download_point_04, download_point_05, download_point_06, download_point_07, download_point_08, download_point_09, download_point_10, download_point_11, download_point_12, download_point_13, download_point_14, download_point_15, download_point_16
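
A possible tidy-up of the listing above, as a sketch only: the sixteen extract/convert pairs can be driven from a list of coordinates. The helper name extract_and_convert and the STATIONS list are made up for illustration. The two toolbox functions are passed in as arguments so the helper can be tried outside the Toolbox. This has not been run inside the Toolbox itself.

```python
# Station coordinates (lon, lat), copied from the listing above.
STATIONS = [
    (166.6211, -73.5861),
    (162.9700, -76.7150),
    (164.0922, -74.6958),
    (145.8589, -75.5361),
    (148.6556, -71.6525),
    (164.1177, -74.6944),
    (163.4306, -74.1350),
    (161.7720, -74.9506),
    (160.6456, -74.6392),
    (159.1933, -72.8292),
    (164.0331, -74.7250),
    (169.6000, -73.0500),
    (163.2333, -74.8167),
    (158.5906, -75.6166),
    (162.8956, -74.1766),
    (164.2205, -74.6159),
]


def extract_and_convert(data, extract_point, to_csv, stations=STATIONS):
    """Return one CSV download object per station (hypothetical helper).

    extract_point and to_csv are injected; inside an application they
    would be ct.geo.extract_point and ct.cdm.to_csv.
    """
    downloads = []
    for lon, lat in stations:
        point = extract_point(data, lon=lon, lat=lat)
        downloads.append(to_csv(data=point))
    return downloads
```

Inside the application this would presumably be called as downloads = extract_and_convert(data, ct.geo.extract_point, ct.cdm.to_csv), followed by return (fig, *downloads). The sixteen @ct.output.download() decorators are assumed to still be needed, one per returned file, as in the original listing.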

8 Comments

  1. Hi Yunus,

    The ct.cdm.to_csv function only works on 1-dimensional data objects. If you want a single CSV, it may be better to create a data object in the toolbox with all the data points, then download the data as netCDF and convert it to CSV on your local system. This page may help:

    How to convert NetCDF to CSV

    Kevin

    1. Kevin Marsh 

      I have gone through the link you gave me. Those methods work on locally downloaded data. I'm asking about doing this inside the CDS Toolbox.


      I tried using ct.observation.interp_from_grid, but the added location points are treated as another dimension, so it doesn't work with ct.cdm.to_csv.

      I tried two variables for one data point ('variable': ['2m_temperature','surface_pressure'],), but the added variable is treated as another dimension, so that doesn't work either.


      I tried creating a list of data. But I don't understand how the CDS Toolbox wants it to be done.

      From the description of cdstoolbox.cdm.to_csv, it says:

      data – A single, or list of, 1 dimensional CDS Toolbox remote data object(s). In the case of a list of data objects they must all have the same, single, dimension.

      I understand how to use a single 1 dimensional CDS Toolbox remote data object.

      How do I create and use a list of 1-dimensional CDS Toolbox remote data objects?


      Yunus.

  2. Hi Yunus,

    What I was saying is that I don't think it is currently possible to create a multi-dimensional CSV output file from the toolbox. If you had a number of data objects with one dimension (e.g. only time), then in theory you could create a list containing these and convert that to CSV in the toolbox.

    But converting a list of multi-dimensional data objects (e.g. time, location) to CSV in the toolbox would not work, so you would have to convert on your local system.

    Kevin 
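
As a rough illustration of the local route, here is a minimal sketch that merges several per-point CSV files (such as the 16 downloads from the application above) into one table with a column per station. It uses only the Python standard library. The function name merge_point_csvs and the column names time and t2m are assumptions; the actual headers in the Toolbox output may differ and should be checked first.

```python
import csv
import io


def merge_point_csvs(sources, time_col="time", value_col="t2m"):
    """Merge per-point CSV texts into one wide CSV text.

    sources maps a station label to the text of that station's CSV file.
    Each file is assumed to have a time column and one value column.
    """
    merged = {}  # time stamp -> {station label: value}
    for label, text in sources.items():
        for row in csv.DictReader(io.StringIO(text)):
            merged.setdefault(row[time_col], {})[label] = row[value_col]

    labels = list(sources)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([time_col] + labels)
    # Sorting the time stamps as strings is only correct for formats
    # that sort lexically, such as ISO 8601.
    for stamp in sorted(merged):
        writer.writerow([stamp] + [merged[stamp].get(label, "") for label in labels])
    return out.getvalue()
```

For files on disk, each value in sources would be the result of open(path).read(), and the returned text can be written to a new file.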

    1. Kevin Marsh I understand the current limitation. No multi-dimension data objects can be converted into a single csv.

      For now, I'm asking about the API in this page. https://cds.climate.copernicus.eu/toolbox/doc/modules_tmp/cdstoolbox.cdm.html?highlight=csv#cdstoolbox.cdm.to_csv.

      cdstoolbox.cdm.to_csv(*args, **kwargs)

          Return a remote csv object of the input remote netcdf data object. The remote csv object is for download only, it will not work in any of the other CDS-Toolbox services or tools. data must be 1 dimensional.

          Parameters

              data – A single, or list of, 1 dimensional CDS Toolbox remote data object(s). In the case of a list of data objects they must all have the same, single, dimension.

              drop_coordinates (list, str) – a string, or a list of strings, of coordinates to drop prior to merging. This allows handling of conflicting coordinates between variables when merging into a single dataframe.

              **kwargs – kwargs passed to pandas.to_csv. Supported keywords can be found in https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html.

          Returns

              A remote csv data object.
      
      
      

      The documentation mentions a list of 1-dimensional CDS Toolbox remote data objects. How do I go about creating this list?

  3. Hi Yunus,

    I usually work with merged data objects rather than lists, but would something like:

    data_all=[data_point_07,data_point_08, data_point_09]

    work?

    Kevin

    1. Kevin Marsh I have tried your approach. There was no error in the console and the application ran successfully. However, only one of the points is in the CSV file. I have attached the CSV file that was generated.

      import cdstoolbox as ct
      
      
      @ct.application(title='Hello World!')
      @ct.output.download()
      def application():
      
          data = ct.catalogue.retrieve(
              'reanalysis-era5-single-levels',
              {
                  'variable': '2m_temperature',
                  'product_type': 'reanalysis',
                  'year': '2017',
                  'month': '01',
                  'day': '01',
                  'time': ['12:00','13:00'],
                  #'area': [-71., 145.,-77. , 170.],
              }
          )
          data_point_01 = ct.geo.extract_point(data,lon=166.6,lat=-73.5)
          data_point_02 = ct.geo.extract_point(data,lon=162.8,lat=-76.7)
          data_all=[data_point_01,data_point_02]
          print(data_all)
          download_all=ct.cdm.to_csv(data_all)
          print(download_all)
      
          return download_all


      cdm.to_csv-1641558753.4026573-8269-3-697645a5-0b94-41ad-8bd0-5e6f298f04ce.csv

  4. Hi Yunus,

    Yes, I think the ct.cdm.to_csv works best for 'straightforward' 1 dimension data objects, rather than lists/multi-dimension objects.

    Thanks,

    Kevin

  5. Hi Yunus,


    Try this:

    #Extract point info

    data_point_01 = ct.cube.interpolate(data,lon=166.6211,lat=-73.5861)
    data_point_02 = ct.cube.interpolate(data,lon=162.9700,lat=-76.7150)
    data_point_03 = ct.cube.interpolate(data,lon=164.0922,lat=-74.6958)
    ...
    data_point_16 = ct.cube.interpolate(data,lon=164.2205,lat=-74.6159)


    #merge data points

    point_list = [data_point_01[0], data_point_02[0], data_point_03[0], ..., data_point_16[0]]

    points = ct.cube.concat(point_list, dim='time')


    #convert to csv

    csv = ct.cdm.to_csv(points, sep=';', na_rep='NoData')


    return csv


    The result should be a table like this:

    time              realization  experimentVersionNumber  lat       lon       pp1d
    01/01/1959 00:00  0            1                        -73.5861  166.6211  11.204933
    01/01/1959 01:00  0            1                        -76.7150  162.97    11.166359
    01/01/1959 02:00  0            1                        -74.6958  164.0922  11.130714
    01/01/1959 03:00  0            1                        -74.6159  164.2205  11.094582



    If you have more than one variable in the retrieved data, you have to do it like this:

    list1 = [data_point_01[0], data_point_02[0], data_point_03[0], ..., data_point_16[0]]

    list2 = [data_point_01[1], data_point_02[1], data_point_03[1], ..., data_point_16[1]]


    var1 = ct.cube.concat(list1, dim='time')

    var2 = ct.cube.concat(list2, dim='time')


    points = [var1, var2]


    #convert to csv

    csv = ct.cdm.to_csv(points, sep=';', na_rep='NoData')


    return csv


    Hope it works for you!


    Jorge
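
Once a long table like the one in Jorge's answer has been downloaded, it can be reshaped locally so that each station becomes its own column. The following is a minimal standard-library sketch, not part of the Toolbox. The function name long_to_wide is hypothetical, and the column names time, lat and lon are taken from the sample table above. The value column (pp1d in the sample) depends on the variable retrieved, so it is passed in.

```python
import csv
import io


def long_to_wide(text, value_col, sep=";", time_col="time"):
    """Pivot a long table (one row per time and point) into a wide one."""
    rows = csv.DictReader(io.StringIO(text), delimiter=sep)
    stations = []  # (lat, lon) pairs in order of first appearance
    table = {}     # time stamp -> {(lat, lon): value}
    for row in rows:
        key = (row["lat"], row["lon"])
        if key not in stations:
            stations.append(key)
        table.setdefault(row[time_col], {})[key] = row[value_col]

    out = io.StringIO()
    writer = csv.writer(out, delimiter=sep, lineterminator="\n")
    # One column per station, labelled "lat_lon".
    writer.writerow([time_col] + ["%s_%s" % key for key in stations])
    # Dicts keep insertion order, so time stamps stay in file order.
    for stamp, values in table.items():
        writer.writerow([stamp] + [values.get(key, "NoData") for key in stations])
    return out.getvalue()
```

The sep=';' default and the 'NoData' placeholder mirror the arguments passed to ct.cdm.to_csv in the answer above.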