Hello.
I would like to use the CDS Toolbox to process ERA5 data and download time series as CSV. I have tried the modified code below. It works for a single time series (see test2 in the code below). I would like to download two time series (see test3 in the code below) in one CSV file. How do I do that? Can I strip test (<xarray.DataArray 'tas' (time: 24)>) down to just its values, combine several of them, and save the result as CSV?
My goal is to have 16 time series from several locations in one CSV file, and several CSV files for several parameters. The simple code below is only a test and is not my actual use case.
import cdstoolbox as ct

variables = {
    'Near-Surface Air Temperature': '2m_temperature',
    'Eastward Near-Surface Wind': '10m_u_component_of_wind',
    'Northward Near-Surface Wind': '10m_v_component_of_wind',
    'Sea Level Pressure': 'mean_sea_level_pressure',
    'Surface Pressure': 'surface_pressure',
}

@ct.application(title="AWS Matching data acquisation", description="Download point data matching AWS location for TNB Project. Main purpose is to compare AWS observation with ERA5 and ERA5-Land. Data downloaded will be further processed in R.")
@ct.input.dropdown('variable', label='Variable', values=variables.keys())
@ct.output.download()
@ct.output.download()
def application(variable):
    # Retrieve the hourly 2m temperature over Europe for the selected year
    data = ct.catalogue.retrieve(
        'reanalysis-era5-single-levels',
        {
            'variable': variables[variable],
            'product_type': 'reanalysis',
            'year': 1999,
            'month': 1,
            'day': 1,
            'time': [
                '00:00', '01:00', '02:00', '03:00', '04:00', '05:00',
                '06:00', '07:00', '08:00', '09:00', '10:00', '11:00',
                '12:00', '13:00', '14:00', '15:00', '16:00', '17:00',
                '18:00', '19:00', '20:00', '21:00', '22:00', '23:00'
            ],
            'grid': [3, 3],
            'area': [60., -11., 34., 35.],  # retrieve data for Europe only
        }
    )
    test = ct.geo.spatial_average(data)
    print(test)
    test2 = ct.cdm.to_csv(test)

    #Create an empty list
    empty_list = []
    #
    temporary_list=test
    empty_list.append(temporary_list)
    empty_list.append(temporary_list-50)
    test3 = ct.cdm.to_csv(empty_list)
    print(test2)
    print(test3)
    return test2, test3
Yunus.
Edit, 6 January 2022: After trying various methods based on the documentation and API, I gave up. This is my approach now. It creates 16 CSV files for a single variable and a single period. Hopefully there will eventually be an option, and documentation, to simplify the download to a single CSV file.
import cdstoolbox as ct

variables = {
    'Near-Surface Air Temperature': '2m_temperature',
    'Eastward Near-Surface Wind': '10m_u_component_of_wind',
    'Northward Near-Surface Wind': '10m_v_component_of_wind',
    'Sea Level Pressure': 'mean_sea_level_pressure',
    'Surface Pressure': 'surface_pressure',
}
Case = {
    'Case 1 1993':0,
    'Case 2 1994':1,
}
Case_Year = {
    'Case 1 1993':1993,
    'Case 2 1994':1994,
}
Case_Month = {
    'Case 1 1993':2,
    'Case 2 1994':5,
}
Case_Day = {
    'Case 1 1993':[21,27],
    'Case 2 1994':[20,27],
}

@ct.application(title="AWS Matching data acquisation", description="Download point data matching AWS location for TNB Project. Main purpose is to compare AWS observation with ERA5 and ERA5-Land. Data downloaded will be further processed in R.")
@ct.input.dropdown('variable', label='Variable', values=variables.keys())
@ct.input.dropdown('Case', label='Case Study', values=Case.keys())
@ct.output.figure()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
@ct.output.download()
def application(variable,Case):
    # Retrieve the hourly 2m temperature over Europe for the selected year
    data = ct.catalogue.retrieve(
        'reanalysis-era5-single-levels',
        {
            'variable': variables[variable],
            'product_type': 'reanalysis',
            'year': Case_Year[Case],
            'month': Case_Month[Case],
            'day': Case_Day[Case],
            'time': [
                '00:00', '01:00', '02:00', '03:00', '04:00', '05:00',
                '06:00', '07:00', '08:00', '09:00', '10:00', '11:00',
                '12:00', '13:00', '14:00', '15:00', '16:00', '17:00',
                '18:00', '19:00', '20:00', '21:00', '22:00', '23:00'
            ],
            'grid': [3, 3],
            'area': [-70., 140., -80., 170.],  # retrieve data for Europe only #NESW
        }
    )

    # Compute the daily mean temperature over Europe
    temperature_daily_mean = ct.cube.resample(data, freq='day', how='mean')
    fig = ct.map.plot(temperature_daily_mean)
    #Creating a simple figure is just to make sure the server side application is working. If a figure isn't plotted and a bunch of errors in the console page, give it up, something wrong with the server.

    data_point_01 = ct.geo.extract_point(data,lon=166.6211,lat=-73.5861)
    data_point_02 = ct.geo.extract_point(data,lon=162.9700,lat=-76.7150)
    data_point_03 = ct.geo.extract_point(data,lon=164.0922,lat=-74.6958)
    data_point_04 = ct.geo.extract_point(data,lon=145.8589,lat=-75.5361)
    data_point_05 = ct.geo.extract_point(data,lon=148.6556,lat=-71.6525)
    data_point_06 = ct.geo.extract_point(data,lon=164.1177,lat=-74.6944)
    data_point_07 = ct.geo.extract_point(data,lon=163.4306,lat=-74.1350)
    data_point_08 = ct.geo.extract_point(data,lon=161.7720,lat=-74.9506)
    data_point_09 = ct.geo.extract_point(data,lon=160.6456,lat=-74.6392)
    data_point_10 = ct.geo.extract_point(data,lon=159.1933,lat=-72.8292)
    data_point_11 = ct.geo.extract_point(data,lon=164.0331,lat=-74.7250)
    data_point_12 = ct.geo.extract_point(data,lon=169.6000,lat=-73.0500)
    data_point_13 = ct.geo.extract_point(data,lon=163.2333,lat=-74.8167)
    data_point_14 = ct.geo.extract_point(data,lon=158.5906,lat=-75.6166)
    data_point_15 = ct.geo.extract_point(data,lon=162.8956,lat=-74.1766)
    data_point_16 = ct.geo.extract_point(data,lon=164.2205,lat=-74.6159)
    #print(data_point_1)

    download_point_01=ct.cdm.to_csv(data=data_point_01)
    download_point_02=ct.cdm.to_csv(data=data_point_02)
    download_point_03=ct.cdm.to_csv(data=data_point_03)
    download_point_04=ct.cdm.to_csv(data=data_point_04)
    download_point_05=ct.cdm.to_csv(data=data_point_05)
    download_point_06=ct.cdm.to_csv(data=data_point_06)
    download_point_07=ct.cdm.to_csv(data=data_point_07)
    download_point_08=ct.cdm.to_csv(data=data_point_08)
    download_point_09=ct.cdm.to_csv(data=data_point_09)
    download_point_10=ct.cdm.to_csv(data=data_point_10)
    download_point_11=ct.cdm.to_csv(data=data_point_11)
    download_point_12=ct.cdm.to_csv(data=data_point_12)
    download_point_13=ct.cdm.to_csv(data=data_point_13)
    download_point_14=ct.cdm.to_csv(data=data_point_14)
    download_point_15=ct.cdm.to_csv(data=data_point_15)
    download_point_16=ct.cdm.to_csv(data=data_point_16)

    return (fig,
            download_point_01, download_point_02, download_point_03, download_point_04,
            download_point_05, download_point_06, download_point_07, download_point_08,
            download_point_09, download_point_10, download_point_11, download_point_12,
            download_point_13, download_point_14, download_point_15, download_point_16)  #test2, test3
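The sixteen near-identical ct.geo.extract_point and ct.cdm.to_csv calls above could also be written as a loop over a list of coordinates. This is only a sketch: extract_points_to_csv and STATIONS are hypothetical names that are not part of the Toolbox, and ct is passed in as an argument purely so the helper can be read outside the Toolbox. It still produces one CSV per point, so it shortens the code but does not answer the single-file question.

```python
def extract_points_to_csv(ct, data, coords):
    """Return one CSV download per (lon, lat) pair, in the order of coords.

    ct is expected to be the imported cdstoolbox module. Passing it as an
    argument is an assumption of this sketch, not a Toolbox requirement.
    """
    downloads = []
    for lon, lat in coords:
        # Same two calls as in the application above, once per station.
        point = ct.geo.extract_point(data, lon=lon, lat=lat)
        downloads.append(ct.cdm.to_csv(data=point))
    return downloads


# First three of the 16 station coordinates used in the application above.
STATIONS = [
    (166.6211, -73.5861),
    (162.9700, -76.7150),
    (164.0922, -74.6958),
]
```

Inside application(variable, Case) this might be used as return (fig, *extract_points_to_csv(ct, data, STATIONS)), assuming the application still declares one @ct.output.download() per returned file, as the code above does.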
8 Comments
Kevin Marsh
Hi Yunus,
the ct.cdm.to_csv function only works on 1 dimensional data objects. If you want a single CSV, it may be better to create a data object in the toolbox with all the data points, and then download the data in netCDF and convert to CSV on your local system. This page may help:
How to convert NetCDF to CSV
Kevin
Muhammad Yunus Ahmad Mazuki
Kevin Marsh
I have gone through the link you gave me. Those methods work on locally downloaded data. I'm asking about doing this inside the CDS Toolbox.
I tried using ct.observation.interp_from_grid, but the added location points are treated as another dimension, so it doesn't work with ct.cdm.to_csv.
I tried 2 variables for one data point ('variable': ['2m_temperature','surface_pressure'],), but the added variable is treated as another dimension, so that doesn't work either.
I tried creating a list of data. But I don't understand how the CDS Toolbox wants it to be done.
From the description of cdstoolbox.cdm.to_csv, it says:
data – A single, or list of, 1 dimensional CDS Toolbox remote data object(s). In the case of a list of data objects they must all have the same, single, dimension.
I understand how to use a single 1 dimensional CDS Toolbox remote data object.
How do I create and use a list of 1 dimensional CDS Toolbox remote data objects?
Yunus.
Kevin Marsh
Hi Yunus,
What I was saying is that I don't think it is currently possible to create a multi-dimension CSV output file from the toolbox. If you had a number of data objects with 1 dimension (e.g. only time), then in theory you could create a list containing these and convert that to csv in the toolbox.
But converting a list of multi dimension data objects (e.g. time, location) to csv in the toolbox would not work, hence you would have to convert on your local system.
Kevin
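As an illustration of the local conversion described above, data with two dimensions (time, location) can be written to a flat CSV in "long" form, with one row per (time, location) pair. This is a minimal sketch using only the Python standard library. The function name to_long_csv, the column names and the numbers are all made up for the example, and it assumes the values have already been read from the downloaded file into nested lists.

```python
import csv
import io


def to_long_csv(times, locations, values):
    """Write values[i][j] (time i, location j) as rows: time, location, value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["time", "location", "value"])
    for i, time in enumerate(times):
        for j, location in enumerate(locations):
            writer.writerow([time, location, values[i][j]])
    return buffer.getvalue()


# Made-up numbers, for illustration only.
text = to_long_csv(
    times=["1993-02-21T00:00", "1993-02-21T01:00"],
    locations=["point_01", "point_02"],
    values=[[250.1, 251.3], [250.4, 251.0]],
)
print(text)
```

The long layout sidesteps the one-dimension restriction because every row is self-describing. The cost is that the time stamps are repeated once per location.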
Muhammad Yunus Ahmad Mazuki
Kevin Marsh I understand the current limitation. No multi-dimension data objects can be converted into a single csv.
For now, I'm asking about the API in this page. https://cds.climate.copernicus.eu/toolbox/doc/modules_tmp/cdstoolbox.cdm.html?highlight=csv#cdstoolbox.cdm.to_csv.
It says there that data can be a list of 1 dimensional CDS Toolbox remote data objects. How do I go about creating this list?
Kevin Marsh
Hi Yunus,
I usually work with merged data objects rather than lists, but would something like:
data_all=[data_point_07,data_point_08, data_point_09]
work?
Kevin
Muhammad Yunus Ahmad Mazuki
Kevin Marsh I have tried your approach. There is no error in the console and the application runs successfully. However, only one of the points is in the CSV file. I have attached the CSV file that was generated.
cdm.to_csv-1641558753.4026573-8269-3-697645a5-0b94-41ad-8bd0-5e6f298f04ce.csv
Kevin Marsh
Hi Yunus,
Yes, I think the ct.cdm.to_csv works best for 'straightforward' 1 dimension data objects, rather than lists/multi-dimension objects.
Thanks,
Kevin
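If ct.cdm.to_csv is used once per point, as in the 16-file approach above, the per-point files can be combined into a single table locally. The sketch below uses only the Python standard library and works on CSV text rather than files to stay self-contained. The function name merge_point_csvs and the column names "time" and "t2m" are assumptions, since the actual layout of the Toolbox CSV is not shown in this thread, and the sample values are made up.

```python
import csv
import io


def merge_point_csvs(named_csv_texts, time_col, value_col):
    """Combine per-point CSV texts into one table with a column per point.

    named_csv_texts maps a point label to the text of that point's CSV.
    Rows are matched on time_col; a point with no value at a time gets "".
    """
    table = {}  # time stamp -> {label: value}
    for label, text in named_csv_texts.items():
        for row in csv.DictReader(io.StringIO(text)):
            table.setdefault(row[time_col], {})[label] = row[value_col]

    labels = list(named_csv_texts)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([time_col] + labels)
    # Plain string sort: chronological only for ISO-like time stamps.
    for time in sorted(table):
        writer.writerow([time] + [table[time].get(label, "") for label in labels])
    return out.getvalue()


# Tiny made-up inputs; the column names "time" and "t2m" are assumptions.
merged = merge_point_csvs(
    {
        "point_01": "time,t2m\n1993-02-21 00:00,250.1\n1993-02-21 01:00,250.4\n",
        "point_02": "time,t2m\n1993-02-21 00:00,251.3\n1993-02-21 01:00,251.0\n",
    },
    time_col="time",
    value_col="t2m",
)
print(merged)
```

With real downloads, each file would be read with open(path).read() and passed in under a label of your choice, after checking the header row of one file for the actual column names.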
Jorge Costa
Hi Yunus,
Try this:
#Extract point info
data_point_01 = ct.cube.interpolate(data,lon=166.6211,lat=-73.5861)
data_point_02 = ct.cube.interpolate(data,lon=162.9700,lat=-76.7150)
data_point_03 = ct.cube.interpolate(data,lon=164.0922,lat=-74.6958)
...
data_point_16 = ct.cube.interpolate(data,lon=164.2205,lat=-74.6159)
#merge data points
point_list = [data_point_01[0], data_point_02[0], data_point_03[0], ..., data_point_16[0]]
points = ct.cube.concat(point_list, dim='time')
#convert to csv
csv = ct.cdm.to_csv(points, sep=';', na_rep='NoData')
return csv
The result should be a table like this:
If you have more than one variable in the retrieved data, you have to do it like this:
list1 = [data_point_01[0], data_point_02[0], data_point_03[0], ..., data_point_16[0]]
list2 = [data_point_01[1], data_point_02[1], data_point_03[1], ..., data_point_16[1]]
var1 = ct.cube.concat(list1, dim='time')
var2 = ct.cube.concat(list2, dim='time')
points = [var1, var2]
#convert to csv
csv = ct.cdm.to_csv(points, sep=';', na_rep='NoData')
return csv
Hope it works for you!
Jorge