EDIT: It took much longer than normal but did respond.

Hi,

I have noticed requests getting queued indefinitely over the last few days.

The debug log shows the client polling every 120 s (see below), but the task state never changes from 'queued', so the client just goes back to sleep.

2021-09-02 08:34:28,749 DEBUG https://cds.climate.copernicus.eu:443 "GET /api/v2/tasks/8d8a8405-a2cf-4668-883f-d4f7fcbdce3a HTTP/1.1" 200 None
2021-09-02 08:34:28,751 DEBUG REPLY {'state': 'queued', 'request_id': '8d8a8405-a2cf-4668-883f-d4f7fcbdce3a', 'specific_metadata_json': {'top_request_origin': 'api'}}
2021-09-02 08:34:28,755 DEBUG Request ID is 8d8a8405-a2cf-4668-883f-d4f7fcbdce3a, sleep 120

The request sent via the API is:

2021-09-02 08:18:06,232 DEBUG POST https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels {"product_type": "reanalysis", "variable": ["100m_u_component_of_wind", "100m_v_component_of_wind", \
"10m_u_component_of_neutral_wind", "10m_v_component_of_neutral_wind", "10m_u_component_of_wind", "10m_v_component_of_wind", "2m_temperature", "boundary_layer_height", "convective_inhibition", "mean_total_precipitation_rate", \
"2m_dewpoint_temperature", "mean_sea_level_pressure", "surface_net_solar_radiation", "surface_pressure", "total_cloud_cover"], "year": 2021, "month": 8, "day": ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", \
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31"], "time": ["00:00", "01:00", "02:00", "03:00", "04:00", "05:00", "06:00", "07:00", "08:00", "09:00", "10:00", "11:00", "12:00",\
"13:00", "14:00", "15:00", "16:00", "17:00", "18:00", "19:00", "20:00", "21:00", "22:00", "23:00"], "format": "netcdf", "area": [38, 68, 6, 98], "grid": [0.25, 0.25]}

Is there any scheduled maintenance or downtime going on?

Thanks!

17 Comments

  1. Hi, I need to download data from "ERA5 hourly data on single levels from 1979 to present" and I am using the Python code below. It takes too long, and I need data for 1979-2016. Does anyone know what I can do to get the requests accepted?


    import numpy as np
    from datetime import datetime, timedelta
    import cdsapi
    import certifi
    import urllib3


    http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where()
    )

    c = cdsapi.Client()

    Date_inicio = datetime(2002, 8, 1)
    Date_fin = datetime(2016, 12, 31)
    array_horario = np.arange(Date_inicio, Date_fin, timedelta(hours = 1)).astype(datetime)

    for i in array_horario:
          year = i.strftime('%Y')
          month = i.strftime('%m')
          day = i.strftime('%d')
          hour = i.strftime('%H:%M')

          # One request per hour; the call is indented so that it runs inside the loop.
          c.retrieve(
              'reanalysis-era5-single-levels',
              {
                  'product_type': 'reanalysis',
                  'variable': [
                      '10m_u_component_of_wind', '10m_v_component_of_wind', 'skin_temperature', '2m_dewpoint_temperature',
                      '2m_temperature', 'lake_cover', 'lake_depth', 'lake_mix_layer_depth', 'lake_shape_factor',
                      'lake_mix_layer_temperature', 'lake_total_layer_temperature',
                      'mean_surface_downward_long_wave_radiation_flux_clear_sky',
                      'mean_surface_downward_short_wave_radiation_flux_clear_sky',
                      'leaf_area_index_high_vegetation', 'leaf_area_index_low_vegetation', 'low_vegetation_cover',
                      'high_vegetation_cover', 'mean_surface_downward_long_wave_radiation_flux',
                      'mean_surface_downward_short_wave_radiation_flux', 'runoff', 'soil_type', 'sub_surface_runoff',
                      'skin_reservoir_content', 'surface_pressure', 'surface_runoff', 'total_precipitation',
                  ],
                  'year': year,
                  'month': month,
                  'day': [day],
                  'time': [hour],
                  'format': 'netcdf',
              },
              f'.ERA5hourly_data_on_single_levels{year}{month}{day}.{i.strftime("%H")}.nc')

  2. In my experience, you have to break the download into chunks, for example one request per month.

  3. This code returns a global hourly file. Yesterday it took almost four hours to download 15 hourly files. I then decided to work with a smaller region, and now it downloads fast. Maybe the size of the file is what gets the request queued: before it was 52739 KB; now it is 44 KB. This process has to be automatic because I need to download around 333000 files.


    Here is the new code with the selected area (the 'area' entry at the end of the request) in case it is useful to someone! (Check the indentation.)
    Sorry for my English!


    import numpy as np
    from datetime import datetime, timedelta
    import cdsapi
    import certifi
    import urllib3


    http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where()
    )

    c = cdsapi.Client()

    Date_inicio = datetime(2002, 8, 1)
    Date_fin = datetime(2016, 12, 31)
    array_horario = np.arange(Date_inicio, Date_fin, timedelta(hours = 1)).astype(datetime)

    for i in array_horario:
          year = i.strftime('%Y')
          month = i.strftime('%m')
          day = i.strftime('%d')
          hour = i.strftime('%H:%M')

          # One request per hour; the call is indented so that it runs inside the loop.
          c.retrieve(
              'reanalysis-era5-single-levels',
              {
                  'product_type': 'reanalysis',
                  'variable': [
                      '10m_u_component_of_wind', '10m_v_component_of_wind', 'skin_temperature', '2m_dewpoint_temperature',
                      '2m_temperature', 'lake_cover', 'lake_depth', 'lake_mix_layer_depth', 'lake_shape_factor',
                      'lake_mix_layer_temperature', 'lake_total_layer_temperature',
                      'mean_surface_downward_long_wave_radiation_flux_clear_sky',
                      'mean_surface_downward_short_wave_radiation_flux_clear_sky',
                      'leaf_area_index_high_vegetation', 'leaf_area_index_low_vegetation', 'low_vegetation_cover',
                      'high_vegetation_cover', 'mean_surface_downward_long_wave_radiation_flux',
                      'mean_surface_downward_short_wave_radiation_flux', 'runoff', 'soil_type', 'sub_surface_runoff',
                      'skin_reservoir_content', 'surface_pressure', 'surface_runoff', 'total_precipitation',
                  ],
                  'year': year,
                  'month': month,
                  'day': [day],
                  'time': [hour],
                  'format': 'netcdf',
                  # Sub-region as north, west, south, east
                  'area': [-24.08, -64.6, -30.52, -58.09],
              },
              f'.ERA5hourly_data_on_single_levels{year}{month}{day}.{i.strftime("%H")}.nc')

  4. Yes. Now that you have selected a smaller region, you could also request a longer time period in each call (say, a month). This may also be more efficient in terms of run time and storage.
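
A minimal sketch of the month-at-a-time approach suggested above, assuming the same request keys used earlier in this thread. The helper name `monthly_requests`, the two example variables and the output file names are illustrative and not part of cdsapi. Whether a whole month fits within the CDS per-request size limit depends on how many variables and how large an area you ask for, so treat this as a starting point.

```python
from calendar import monthrange


def monthly_requests(first_year, last_year, variables, area):
    """Yield (target_file, request) pairs, one per calendar month."""
    hours = [f"{h:02d}:00" for h in range(24)]
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            n_days = monthrange(year, month)[1]  # handles leap years
            request = {
                "product_type": "reanalysis",
                "variable": variables,
                "year": str(year),
                "month": f"{month:02d}",
                "day": [f"{d:02d}" for d in range(1, n_days + 1)],
                "time": hours,
                "area": area,  # north, west, south, east
                "format": "netcdf",
            }
            # Hypothetical file naming scheme: one file per month.
            yield f"era5_single_levels_{year}{month:02d}.nc", request


requests_1979_2016 = list(monthly_requests(
    1979, 2016,
    ["2m_temperature", "total_precipitation"],  # example subset only
    [-24.08, -64.6, -30.52, -58.09],
))
print(len(requests_1979_2016))  # 38 years x 12 months

# With cdsapi installed and configured, each pair could then be sent as:
#     import cdsapi
#     c = cdsapi.Client()
#     for target, request in requests_1979_2016:
#         c.retrieve("reanalysis-era5-single-levels", request, target)
```

For 1979-2016 this produces a few hundred requests, compared with the roughly 333000 hourly requests mentioned above.
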

  5. Thank you so much for sharing this code. However, I noticed that the code only gave me data for one hour of one day, in the last month of the last year I defined. How can I make it download data for all the years, e.g. 1979-2020?

    Thank you for your help!

  6. Hello, it could be an indentation problem. I would need to see how the blocks are indented in your code. Here is a photo of my code so you can see how it is indented.


  7. Hello, thank you so much for your response. Please see my code. I used your code and modified it a little for my task.

    Please help me check what I am doing wrong.



    Thank you.

  8. I don't really understand how decorators are used in Python (lines 28 and 29 of your code) or what the purpose of defining your application function is. As I understand it, though, since lines 28 and 29 are not indented, the function (application) is executed only once, with the final values of the variables (year, month, day, hour). You should try indenting lines 28 and 29, or dropping the application function altogether (putting all the lines at the same indentation after the for), as I show in the following photo (although I couldn't run it because I don't know how you have defined ct).

    1. Another thing I see in your code is that the string in the return of the application function needs an f at the beginning (f'{year}{month}{day}...'), because without the f the names inside {} are treated as literal text and not as variables.
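
To illustrate the point about the f prefix, here is a small standalone sketch. The values assigned to `year`, `month`, `day` and `hour` are made up for the demonstration.

```python
year, month, day, hour = "1979", "01", "01", "00"

# Without the f prefix the braces are kept as literal text.
plain = "{year}{month}{day}.{hour}"

# With the f prefix the variable values are substituted.
formatted = f"{year}{month}{day}.{hour}"

print(plain)      # {year}{month}{day}.{hour}
print(formatted)  # 19790101.00
```
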

  9. Thank you for your reply. Please, see my code below:


    import numpy as np
    from datetime import datetime, timedelta
    import cdstoolbox as ct
    import certifi
    import urllib3


    http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where()
    )

    Date_ini = datetime(1979, 1, 1)
    Date_fin = datetime(1979, 1, 2)
    array_horary = np.arange(Date_ini, Date_fin, timedelta(hours = 1)).astype(datetime)
    print(array_horary)

    for i in array_horary:
         year = i.strftime('%Y')
         month = i.strftime('%m')
         day = i.strftime('%d')
         hour = i.strftime('%H:%M')
         print(year,month,day,hour,i.strftime('%H'))
         if year < '1979':
             product = 'reanalysis-era5-pressure-levels-preliminary-back-extension'
         else:
             product = 'reanalysis-era5-pressure-levels'
         @ct.application(title='Compute Effective Wind Shear')
         @ct.output.download()
         @ct.output.download()
         def application():

       

          u, v = ct.catalogue.retrieve(
          product,
          {
                'product_type': 'reanalysis',
                'variable': [
                      'u_component_of_wind', 'v_component_of_wind',
                 ],
                 'pressure_level': [
                     '500', '550', '600',
                     '650', '700', '750',
                     '775', '800', '825',
                     '850', '875', '900',
                     '925', '950', '975',
                     '1000',
                ],
                'year': year,
                'month': month,
                'day': day,
                'time': hour,

                'area': [
                    -10, 110, -45,
                    155,
                ],
         },
         )

         return u(f'{year}{month}{day}.{i.strftime("%H")}'), v(f'{year}{month}{day}.{i.strftime("%H")}')


    1. And does it work? I don't have cdstoolbox installed, so I can't run it. Is there a difference between ct.catalogue.retrieve and c.retrieve? What type of file are you downloading, a NetCDF?
      Does ct.catalogue.retrieve return a tuple?
      I don't understand the return of your function now, or what u() and v() mean: I think it is wrong to do that.

      Try with:


      import numpy as np
      from datetime import datetime, timedelta
      import cdstoolbox as ct
      import certifi
      import urllib3

      http = urllib3.PoolManager(
      cert_reqs='CERT_REQUIRED',
      ca_certs=certifi.where()
      )

      Date_ini = datetime(1979, 1, 1)
      Date_fin = datetime(1979, 1, 2)
      array_horary = np.arange(Date_ini, Date_fin, timedelta(hours = 1)).astype(datetime)
      print(array_horary)

      for i in array_horary:
          
          year = i.strftime('%Y')
          month = i.strftime('%m')
          day = i.strftime('%d')
          hour = i.strftime('%H:%M')
          print(year,month,day,hour,i.strftime('%H'))
          if year < '1979':
              product = 'reanalysis-era5-pressure-levels-preliminary-back-extension'
          else:
              product = 'reanalysis-era5-pressure-levels'
           
          ct.catalogue.retrieve(
              product,
              {
                  'product_type': 'reanalysis',
                  'variable': [
                      'u_component_of_wind', 'v_component_of_wind',
                  ],
                  'pressure_level': [
                      '500', '550', '600',
                      '650', '700', '750',
                      '775', '800', '825',
                      '850', '875', '900',
                      '925', '950', '975',
                      '1000',
                  ],
                  'year': year,
                  'month': month,
                  'day': day,
                  'time': hour,

                  'area': [
                      -10, 110, -45,
                      155,
                  ],
              },
              f'.{year}{month}{day}.{i.strftime("%H")}')


  10. Thank you for your reply. c.retrieve did not work when I tried it in the CDS Toolbox editor. However, I have now tried the code below from my Linux environment, and the request has been queued ever since:

    import numpy as np
    from datetime import datetime, timedelta
    #import cdstoolbox as ct
    import cdsapi
    import certifi
    import urllib3
    http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where()
    )
    c = cdsapi.Client()

    Date_ini = datetime(1950, 1, 1)
    Date_fin = datetime(2021, 10, 31)
    array_horary = np.arange(Date_ini, Date_fin, timedelta(hours = 1)).astype(datetime)


    for i in array_horary:
        year = i.strftime('%Y')
        month = i.strftime('%m')
        day = i.strftime('%d')
        hour = i.strftime('%H:%M')
        print(year, month, day, hour, i.strftime('%H'))
        if year < '1979':
            product = 'reanalysis-era5-pressure-levels-preliminary-back-extension'
        else:
            product = 'reanalysis-era5-pressure-levels'

        # Retrieve wind components on pressure levels from 500 to 1000 hPa
        c.retrieve(
            product,
            {
                'product_type': 'reanalysis',
                'variable': [
                    'u_component_of_wind', 'v_component_of_wind',
                ],
                'pressure_level': [
                    '500', '550', '600',
                    '650', '700', '750',
                    '775', '800', '825',
                    '850', '875', '900',
                    '925', '950', '975',
                    '1000',
                ],
                'year': year,
                'month': month,
                'day': [day],
                'time': [hour],
                'format': 'netcdf',
                'area': [
                    -10, 110, -45,
                    155,
                ],
            },
            f'.ERA5hourly_data_on_pressure_levels{year}{month}{day}.{i.strftime("%H")}.nc')

  11. I also tried your suggestions in the CDS Toolbox editor, but they still didn't work. So I am now using the c.retrieve code above in my Linux environment, but the request has been queued ever since.

  12. I have never worked with toolbox requests. Since September I have been downloading the hourly data from reanalysis-era5-single-levels
    and I still haven't finished the download. With the same code, sometimes it goes fast and sometimes it takes too long. I was just comparing the code of the API request with the toolbox request, and I think that
    @ct.application(title='Compute Effective Wind Shear')
    @ct.output.download()
    determine the name of the file, so you should put an f'{year}{month}{day}.{hour}' there to prevent the file from being overwritten.

  13. Hello, thank you for your suggestions. I have tried that and it still didn't work. However, the other cdsapi code ran and gave the error below:

    File "/workflows/internal/code/47e2e5d178811b02d28aecc62c230038d1d23ac52d76ccbb67739c49/workflows.py", line 6, in <module>
    import cdsapi
    ModuleNotFoundError: No module named 'cdsapi'

    1. Hello, these are the steps I followed:
      1) Install the cdsapi library in your Python environment with: conda install cdsapi or pip install cdsapi
      2) Create a .cdsapirc file containing the key, the Copernicus URL and verify set to 1. Save this file in C:\Users\Your_user

      3) To obtain the key and URL, build a request, click "Show API request", and follow the link to the documentation page:


  14. Thank you so much for your help! Now, it's running in my Linux environment. I am using the cdsapi that you used.
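
The setup steps listed in the reply above (install cdsapi, then create a .cdsapirc file) can be sketched roughly as follows. This assumes the file holds one `name: value` entry per line and that cdsapi looks for it in the user's home directory by default. The helper `write_cdsapirc` is hypothetical, and the key shown is a placeholder to be replaced with the value from your own CDS account page.

```python
from pathlib import Path
import tempfile


def write_cdsapirc(directory, url, key, verify=1):
    """Write a .cdsapirc file with url, key and verify entries."""
    path = Path(directory) / ".cdsapirc"
    path.write_text(f"url: {url}\nkey: {key}\nverify: {verify}\n")
    return path


# Demo in a temporary directory so that no real configuration is overwritten.
demo_dir = tempfile.mkdtemp()
rc_path = write_cdsapirc(
    demo_dir,
    url="https://cds.climate.copernicus.eu/api/v2",
    key="<UID>:<API-KEY>",  # placeholder, not a real key
)
print(rc_path.read_text())
```

For real use, the directory would be the home directory (for example `Path.home()`), which corresponds to C:\Users\Your_user on Windows.
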