CDS API queued indefinitely

Created by Gautam Pradhan, last modified on Sep 02, 2021

EDIT: The request took much longer than normal, but it did eventually return a response.

Hi,

I have noticed requests getting queued indefinitely over the last few days.

The debug log shows the client polling every 120 s (see below), but each reply still reports the request as queued, so the client goes back to sleep.

2021-09-02 08:34:28,749 DEBUG https://cds.climate.copernicus.eu:443 "GET /api/v2/tasks/8d8a8405-a2cf-4668-883f-d4f7fcbdce3a HTTP/1.1" 200 None
2021-09-02 08:34:28,751 DEBUG REPLY {'state': 'queued', 'request_id': '8d8a8405-a2cf-4668-883f-d4f7fcbdce3a', 'specific_metadata_json': {'top_request_origin': 'api'}}
2021-09-02 08:34:28,755 DEBUG Request ID is 8d8a8405-a2cf-4668-883f-d4f7fcbdce3a, sleep 120
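
For anyone who would rather bound the wait than poll forever, the poll-and-sleep cycle in the log can be sketched roughly as below. This is only an illustration, not the cdsapi implementation: wait_for_task and fetch_state are hypothetical names, the caller supplies the function that queries the task endpoint, and 'running' is an assumed second in-progress state (only 'queued' appears in the log).

```python
import time


def wait_for_task(fetch_state, poll_seconds=120, max_wait_seconds=6 * 3600,
                  sleep=time.sleep):
    """Poll until the task leaves the queue or the time budget is used up.

    fetch_state is a caller-supplied callable returning the reply dict
    (e.g. {'state': 'queued', ...}); it is hypothetical, not a cdsapi function.
    """
    waited = 0
    while True:
        reply = fetch_state()
        state = reply.get("state")
        # 'queued' is taken from the log; 'running' is an assumed
        # in-progress state. Anything else ends the wait.
        if state not in ("queued", "running"):
            return reply
        if waited >= max_wait_seconds:
            raise TimeoutError(f"still {state!r} after {waited} s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Raising after a fixed budget at least turns "queued indefinitely" into an explicit error that a script can log or retry.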

The request sent via the API is:

2021-09-02 08:18:06,232 DEBUG POST https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels {"product_type": "reanalysis", "variable": ["100m_u_component_of_wind", "100m_v_component_of_wind", \
"10m_u_component_of_neutral_wind", "10m_v_component_of_neutral_wind", "10m_u_component_of_wind", "10m_v_component_of_wind", "2m_temperature", "boundary_layer_height", "convective_inhibition", "mean_total_precipitation_rate", \
"2m_dewpoint_temperature", "mean_sea_level_pressure", "surface_net_solar_radiation", "surface_pressure", "total_cloud_cover"], "year": 2021, "month": 8, "day": ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", \
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31"], "time": ["00:00", "01:00", "02:00", "03:00", "04:00", "05:00", "06:00", "07:00", "08:00", "09:00", "10:00", "11:00", "12:00",\
"13:00", "14:00", "15:00", "16:00", "17:00", "18:00", "19:00", "20:00", "21:00", "22:00", "23:00"], "format": "netcdf", "area": [38, 68, 6, 98], "grid": [0.25, 0.25]}

Is there any scheduled maintenance or downtime going on?

Thanks!


17 Comments

edna espinosa
Hi, I need to download data from "ERA5 hourly data on single levels from 1979 to present" and I am using the following Python code, but it takes too long, and I need data for 1979-2016. Does anyone know what I can do so that the requests get processed?

import numpy as np
from datetime import datetime, timedelta
import cdsapi
import certifi
import urllib3

http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where()
)
c = cdsapi.Client()
Date_inicio = datetime(2002, 8, 1)
Date_fin = datetime(2016, 12, 31)
array_horario = np.arange(Date_inicio, Date_fin, timedelta(hours = 1)).astype(datetime)
for i in array_horario:
    year = i.strftime('%Y')
    month = i.strftime('%m')
    day = i.strftime('%d')
    hour = i.strftime('%H:%M')

    c.retrieve(
        'reanalysis-era5-single-levels',
        {
            'product_type': 'reanalysis',
            'variable': [
                '10m_u_component_of_wind', '10m_v_component_of_wind', 'skin_temperature', '2m_dewpoint_temperature',
                '2m_temperature','lake_cover', 'lake_depth', 'lake_mix_layer_depth', 'lake_shape_factor',
                'lake_mix_layer_temperature', 'lake_total_layer_temperature',
                'mean_surface_downward_long_wave_radiation_flux_clear_sky',
                'mean_surface_downward_short_wave_radiation_flux_clear_sky',
                'leaf_area_index_high_vegetation', 'leaf_area_index_low_vegetation', 'low_vegetation_cover',
                'high_vegetation_cover', 'mean_surface_downward_long_wave_radiation_flux',
                'mean_surface_downward_short_wave_radiation_flux', 'runoff', 'soil_type', 'sub_surface_runoff',
                'skin_reservoir_content', 'surface_pressure', 'surface_runoff', 'total_precipitation',
            ],
            'year': year,
            'month': month,
            'day': [
                day,
            ],
            'time': [
                hour,
            ],
            'format': 'netcdf',
        },
        f'.ERA5hourly_data_on_single_levels{year}{month}{day}.{i.strftime("%H")}.nc')
- Permalink
- Sep 14, 2021
Gautam Pradhan
In my experience, you have to break the request into chunks, for example one request per month.
- Permalink
- Sep 15, 2021
edna espinosa
This code returns one global file per hour. Yesterday it took almost four hours to download 15 hourly files. I then decided to work with a smaller region, and now the downloads are fast. Maybe the file size is why the requests were queued: each file was 52739 KB before and is 44 KB now. This process has to be automatic because I need to download around 333000 files.
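
As a rough check on that number, the count of one-hour requests for 1979-2016 can be compared with the count of one-month requests. The snippet is plain standard-library arithmetic; the helper names are made up for illustration.

```python
from datetime import datetime, timedelta


def hourly_request_count(first_year, last_year):
    """Number of one-hour requests needed to cover whole years, inclusive."""
    span = datetime(last_year + 1, 1, 1) - datetime(first_year, 1, 1)
    return span // timedelta(hours=1)


def monthly_request_count(first_year, last_year):
    """Number of one-month requests needed to cover the same years."""
    return (last_year - first_year + 1) * 12


print(hourly_request_count(1979, 2016))   # 333120
print(monthly_request_count(1979, 2016))  # 456
```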

Here is the new code with the selected area (the 'area' entry in the request) in case it helps someone. (Check the indentation.)
Sorry for my English!

import numpy as np
from datetime import datetime, timedelta
import cdsapi
import certifi
import urllib3

http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where()
)
c = cdsapi.Client()
Date_inicio = datetime(2002, 8, 1)
Date_fin = datetime(2016, 12, 31)
array_horario = np.arange(Date_inicio, Date_fin, timedelta(hours = 1)).astype(datetime)
for i in array_horario:
    year = i.strftime('%Y')
    month = i.strftime('%m')
    day = i.strftime('%d')
    hour = i.strftime('%H:%M')

    c.retrieve(
        'reanalysis-era5-single-levels',
        {
            'product_type': 'reanalysis',
            'variable': [
                '10m_u_component_of_wind', '10m_v_component_of_wind', 'skin_temperature', '2m_dewpoint_temperature',
                '2m_temperature','lake_cover', 'lake_depth', 'lake_mix_layer_depth', 'lake_shape_factor',
                'lake_mix_layer_temperature', 'lake_total_layer_temperature',
                'mean_surface_downward_long_wave_radiation_flux_clear_sky',
                'mean_surface_downward_short_wave_radiation_flux_clear_sky',
                'leaf_area_index_high_vegetation', 'leaf_area_index_low_vegetation', 'low_vegetation_cover',
                'high_vegetation_cover', 'mean_surface_downward_long_wave_radiation_flux',
                'mean_surface_downward_short_wave_radiation_flux', 'runoff', 'soil_type', 'sub_surface_runoff',
                'skin_reservoir_content', 'surface_pressure', 'surface_runoff', 'total_precipitation',
            ],
            'year': year,
            'month': month,
            'day': [
                day,
            ],
            'time': [
                hour,
            ],
            'format': 'netcdf',
            'area': [
                -24.08, -64.6, -30.52,
                -58.09,
            ],
        },
        f'.ERA5hourly_data_on_single_levels{year}{month}{day}.{i.strftime("%H")}.nc')
- Permalink
- Sep 15, 2021
Gautam Pradhan
Yes. Now that you have narrowed the request to a smaller region, you could also request a longer period at a time (say, a month). This may also be more efficient in terms of run time and storage.
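
A minimal sketch of what one request per month could look like, assuming the same dataset and request keys as the code above. The function names and the output file naming are hypothetical, and whether a full month with many variables stays within the CDS per-request limits depends on the size of the area and the variable list.

```python
import calendar


def monthly_requests(first_year, last_year, variables, area):
    """Yield (target_filename, request_dict) pairs, one per calendar month."""
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            n_days = calendar.monthrange(year, month)[1]
            request = {
                "product_type": "reanalysis",
                "variable": list(variables),
                "year": str(year),
                "month": f"{month:02d}",
                "day": [f"{d:02d}" for d in range(1, n_days + 1)],
                "time": [f"{h:02d}:00" for h in range(24)],
                "format": "netcdf",
                "area": list(area),  # [north, west, south, east]
            }
            # Hypothetical file naming: one file per year and month.
            yield f"era5_single_levels_{year}{month:02d}.nc", request


def download_all(first_year, last_year, variables, area):
    """Submit the monthly requests one after another."""
    import cdsapi  # imported here so monthly_requests works without it

    client = cdsapi.Client()
    for target, request in monthly_requests(first_year, last_year, variables, area):
        client.retrieve("reanalysis-era5-single-levels", request, target)
```

For the region in the code above, this would be called as download_all(2002, 2016, ['2m_temperature', 'total_precipitation'], [-24.08, -64.6, -30.52, -58.09]), which submits 180 requests instead of one per hour.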
- Permalink
- Sep 16, 2021
Olabosipo Osibanjo
Thank you so much for sharing this code. However, I noticed that the code only returned data for one hour of one day, in the last month of the last year I defined. How can I make it download data for all the years, e.g. 1979-2020?
Thank you for your help!
- Permalink
- Dec 01, 2021
edna espinosa
Hello, it could be an indentation problem. I would need to see how the blocks are indented in your code. Here is a photo of my code so you can see how it is indented.
- Permalink
- Dec 02, 2021
Olabosipo Osibanjo
Hello, thank you so much for your response. Please see my code. I used your code and modified it a little for my task.
Please help me check what I am doing wrong.

Thank you.
- Permalink
- Dec 02, 2021
edna espinosa
I don't really understand how decorators are used in Python (lines 28 and 29 of your code) or what the purpose of defining your application function is. But as I understand it, since lines 28 and 29 are not indented, the function (application) is executed only once, with the final values of the variables (year, month, day, hour). You should try indenting lines 28 and 29, or dropping the application function (putting all the lines at the same indentation level after the for), as I show in the following photo (although I couldn't run it because I don't know how you have defined ct).
- Permalink
- Dec 02, 2021
1. edna espinosa
  Another thing I see in your code is that the string in the return of the application function needs an f at the beginning (f'{year}{month}{day}...'), because without the f the names inside {} are kept as literal text instead of being replaced by the variables.
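
Both points can be seen in a small standalone example (the values are made up; nothing here touches the CDS):

```python
year, month, day = "1979", "01", "02"

plain = '{year}{month}{day}'       # no f prefix: braces are kept literally
formatted = f'{year}{month}{day}'  # f prefix: the variables are substituted

print(plain)      # {year}{month}{day}
print(formatted)  # 19790102


# Only the indented lines belong to the loop body and run on every pass.
names = []
for hour in ("00", "01", "02"):
    names.append(f'{year}{month}{day}.{hour}')  # inside the loop: runs 3 times
last_only = f'{year}{month}{day}.{hour}'        # after the loop: runs once

print(names)      # ['19790102.00', '19790102.01', '19790102.02']
print(last_only)  # 19790102.02
```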
  Permalink
  
  Dec 02, 2021
Olabosipo Osibanjo
Thank you for your reply. Please, see my code below:

import numpy as np
from datetime import datetime, timedelta
import cdstoolbox as ct
import certifi
import urllib3

http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where()
)
Date_ini = datetime(1979, 1, 1)
Date_fin = datetime(1979, 1, 2)
array_horary = np.arange(Date_ini, Date_fin, timedelta(hours = 1)).astype(datetime)
print(array_horary)

for i in array_horary:
    year = i.strftime('%Y')
    month = i.strftime('%m')
    day = i.strftime('%d')
    hour = i.strftime('%H:%M')
    print(year,month,day,hour,i.strftime('%H'))
    if year < '1979':
        product = 'reanalysis-era5-pressure-levels-preliminary-back-extension'
    else:
        product = 'reanalysis-era5-pressure-levels'
    @ct.application(title='Compute Effective Wind Shear')
    @ct.output.download()
    @ct.output.download()
    def application():

        u, v = ct.catalogue.retrieve(
            product,
            {
                'product_type': 'reanalysis',
                'variable': [
                    'u_component_of_wind', 'v_component_of_wind',
                ],
                'pressure_level': [
                    '500', '550', '600',
                    '650', '700', '750',
                    '775', '800', '825',
                    '850', '875', '900',
                    '925', '950', '975',
                    '1000',
                ],
                'year': year,
                'month': month,
                'day': day,
                'time': hour,

                'area': [
                    -10, 110, -45,
                    155,
                ],
            },
        )

        return u(f'{year}{month}{day}.{i.strftime("%H")}'), v(f'{year}{month}{day}.{i.strftime("%H")}')
- Permalink
- Dec 02, 2021
1. edna espinosa
  Does it work? I don't have cdstoolbox installed, so I can't run it. Is there a difference between ct.catalogue.retrieve and c.retrieve? What type of file are you downloading, a netCDF?
  Does ct.catalogue.retrieve return a tuple?
  I don't understand the return of your function now, or what u() and v() mean; I think it is wrong to do that.
  Try with:
  
  import numpy as np
  from datetime import datetime, timedelta
  import cdstoolbox as ct
  import certifi
  import urllib3
  http = urllib3.PoolManager(
      cert_reqs='CERT_REQUIRED',
      ca_certs=certifi.where()
  )
  Date_ini = datetime(1979, 1, 1)
  Date_fin = datetime(1979, 1, 2)
  array_horary = np.arange(Date_ini, Date_fin, timedelta(hours = 1)).astype(datetime)
  print(array_horary)
  for i in array_horary:

      year = i.strftime('%Y')
      month = i.strftime('%m')
      day = i.strftime('%d')
      hour = i.strftime('%H:%M')
      print(year,month,day,hour,i.strftime('%H'))
      if year < '1979':
          product = 'reanalysis-era5-pressure-levels-preliminary-back-extension'
      else:
          product = 'reanalysis-era5-pressure-levels'

      ct.catalogue.retrieve(
          product,
          {
              'product_type': 'reanalysis',
              'variable': [
                  'u_component_of_wind', 'v_component_of_wind',
              ],
              'pressure_level': [
                  '500', '550', '600',
                  '650', '700', '750',
                  '775', '800', '825',
                  '850', '875', '900',
                  '925', '950', '975',
                  '1000',
              ],
              'year': year,
              'month': month,
              'day': day,
              'time': hour,
              'area': [
                  -10, 110, -45,
                  155,
              ],
          },
          f'.{year}{month}{day}.{i.strftime("%H")}')
  Permalink
  
  Dec 02, 2021
Olabosipo Osibanjo
Thank you for your reply. c.retrieve did not work when I tried it in the CDS Toolbox editor. However, I have now run the code below from my Linux environment, and the request has been queued ever since:
import numpy as np
from datetime import datetime, timedelta
#import cdstoolbox as ct
import cdsapi
import certifi
import urllib3
http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where()
)
c = cdsapi.Client()
Date_ini = datetime(1950, 1, 1)
Date_fin = datetime(2021, 10, 31)
array_horary = np.arange(Date_ini, Date_fin, timedelta(hours = 1)).astype(datetime)

for i in array_horary:
    year = i.strftime('%Y')
    month = i.strftime('%m')
    day = i.strftime('%d')
    hour = i.strftime('%H:%M')
    print(year,month,day,hour,i.strftime('%H'))
    if year < '1979':
        product = 'reanalysis-era5-pressure-levels-preliminary-back-extension'
    else:
        product = 'reanalysis-era5-pressure-levels'
    # RETRIEVE WIND COMPONENTS ON PRESSURE LEVELS FROM 500 TO 1000 hPa

    c.retrieve(
        product,
        {
            'product_type': 'reanalysis',
            'variable': [
                'u_component_of_wind', 'v_component_of_wind',
            ],
            'pressure_level': [
                '500', '550', '600',
                '650', '700', '750',
                '775', '800', '825',
                '850', '875', '900',
                '925', '950', '975',
                '1000',
            ],
            'year': year,
            'month': month,
            'day': [
                day,
            ],
            'time': [
                hour,
            ],
            'format': 'netcdf',
            'area': [
                -10, 110, -45,
                155,
            ],
        },
        f'.ERA5hourly_data_on_pressure_levels{year}{month}{day}.{i.strftime("%H")}.nc')
- Permalink
- Dec 02, 2021
Olabosipo Osibanjo
I also tried your suggestions in the CDS Toolbox editor, but they still didn't work. So I am now using the c.retrieve code above in my Linux environment, but the request has been queued ever since.
- Permalink
- Dec 02, 2021
edna espinosa
I have never worked with Toolbox requests. Since September I have been downloading the hourly data from reanalysis-era5-single-levels, and I still can't finish the download. With the same code, sometimes it goes fast and sometimes it takes too long. I was just looking at the differences between the code of the API request and the Toolbox request, and I think that:
@ct.application(title='Compute Effective Wind Shear')
@ct.output.download()
determine the name of the file, and you should put an f'{year}{month}{day}.{hour}' to prevent the file from being overwritten.
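
When a download is spread over many runs like this, one option is to skip the files that are already on disk before submitting new requests. A small sketch with the standard library only; pending_targets is a hypothetical helper, not part of cdsapi or the Toolbox.

```python
from pathlib import Path


def pending_targets(targets, directory="."):
    """Return the target filenames that are not on disk yet (or are empty)."""
    base = Path(directory)
    todo = []
    for name in targets:
        path = base / name
        # A missing or zero-byte file is treated as "still to download".
        if not path.exists() or path.stat().st_size == 0:
            todo.append(name)
    return todo
```

The retrieve loop would then run only over the names this returns, so restarting the script does not request or overwrite files that are already complete.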
- Permalink
- Dec 02, 2021
Olabosipo Osibanjo
Hello, thank you for your suggestions. I tried them and it still didn't work. The other code, using cdsapi, ran but gave the error below:
File "/workflows/internal/code/47e2e5d178811b02d28aecc62c230038d1d23ac52d76ccbb67739c49/workflows.py", line 6, in <module>
import cdsapi
ModuleNotFoundError: No module named 'cdsapi'
- Permalink
- Dec 03, 2021
1. edna espinosa
  Hello, these are the steps I followed:
  1) Install the cdsapi library in your Python environment with: conda install cdsapi or pip install cdsapi
  2) Create a .cdsapirc file containing the key, the Copernicus URL, and verify = 1. Save this file in C:\Users\Your_user
  3) To obtain the key and URL, build a request, click "Show API request", and go to the linked documentation page:
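
Step 2 can also be done from Python. This is a sketch under assumptions: write_cdsapirc is a hypothetical helper, the URL is the API root that appears in the debug log at the top of this thread, the file uses one "name: value" line per setting, and the key passed in must be the one shown on your own CDS profile page (the value in the usage line is a placeholder).

```python
from pathlib import Path


def write_cdsapirc(key, url="https://cds.climate.copernicus.eu/api/v2",
                   verify=1, path=None):
    """Write a minimal .cdsapirc with url, key and verify lines."""
    # Default location is the user's home directory
    # (C:\Users\<user> on Windows, /home/<user> on Linux).
    path = Path(path) if path is not None else Path.home() / ".cdsapirc"
    path.write_text(f"url: {url}\nkey: {key}\nverify: {verify}\n")
    return path
```

Usage, with a placeholder key: write_cdsapirc("<UID>:<API-KEY>"). Note that this overwrites an existing .cdsapirc at the same path.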
  Permalink
  
  Dec 03, 2021
Olabosipo Osibanjo
Thank you so much for your help! Now, it's running in my Linux environment. I am using the cdsapi that you used.
- Permalink
- Dec 03, 2021
