I'm trying to download daily averaged data using the Toolbox. The code I'm using is at the bottom of the post. As a side note, I can get hourly data through the API; it's only when I go through the Toolbox that things break.


Thanks for any help!  

--Brent


After I start the code, it tells me that the request is queued and then it fails with:

2023-04-17 18:31:07,036 INFO Welcome to the CDS
2023-04-17 18:31:07,037 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/tasks/services/tool/toolbox/orchestrator/workflow/clientid-805d8a96edb44aa7b209f7dfd30eda68
2023-04-17 18:31:07,215 INFO Request is queued
2023-04-17 18:31:58,379 INFO Request is failed
2023-04-17 18:31:58,379 ERROR Message: an internal error occurred processing your request
2023-04-17 18:31:58,379 ERROR Reason:  Cmd('git') failed due to: exit code(128)
  cmdline: git clone git@gitrepo:c3s/era5.git /home/cds/compute_workflows/c3s/era5/master
  stdout: 'Cloning into '/home/cds/compute_workflows/c3s/era5/master'...'
  stderr: 'Warning: Permanently added the RSA host key for IP address '192.168.0.248' to the list of known hosts.
GitLab: The project you were looking for could not be found.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.'
2023-04-17 18:31:58,379 ERROR   Traceback (most recent call last):
2023-04-17 18:31:58,379 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/cdshandlers/services/handler.py", line 59, in handle_request
2023-04-17 18:31:58,379 ERROR       result = cached(context.method, proc, context, context.args, context.kwargs)
2023-04-17 18:31:58,380 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/caching.py", line 108, in cached
2023-04-17 18:31:58,380 ERROR       result = proc(context, *context.args, **context.kwargs)
2023-04-17 18:31:58,380 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 124, in __call__
2023-04-17 18:31:58,380 ERROR       return p(*args, **kwargs)
2023-04-17 18:31:58,380 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/services.py", line 60, in __call__
2023-04-17 18:31:58,380 ERROR       return self.proc(context, *args, **kwargs)
2023-04-17 18:31:58,380 ERROR     File "/home/cds/cdsservices/services/workflow.py", line 26, in execute
2023-04-17 18:31:58,380 ERROR       cacheurl=cacheurl)
2023-04-17 18:31:58,380 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/workflow.py", line 56, in submit
2023-04-17 18:31:58,380 ERROR       gitcache.ensure_repo_version(local_repos_path, remote_repo_url, params['version'])
2023-04-17 18:31:58,380 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/gitcache.py", line 38, in ensure_repo_version
2023-04-17 18:31:58,380 ERROR       fetch_repo_version(remote_repo_url, local_repo_path, version)
2023-04-17 18:31:58,380 ERROR     File "/opt/cdstoolbox/cdscompute/cdscompute/gitcache.py", line 20, in fetch_repo_version
2023-04-17 18:31:58,380 ERROR       git.Git().clone(remote_repo_url, local_repo_path)
2023-04-17 18:31:58,380 ERROR     File "/usr/local/lib/python3.6/site-packages/git/cmd.py", line 548, in <lambda>
2023-04-17 18:31:58,380 ERROR       return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
2023-04-17 18:31:58,380 ERROR     File "/usr/local/lib/python3.6/site-packages/git/cmd.py", line 1014, in _call_process
2023-04-17 18:31:58,380 ERROR       return self.execute(call, **exec_kwargs)
2023-04-17 18:31:58,381 ERROR     File "/usr/local/lib/python3.6/site-packages/git/cmd.py", line 825, in execute
2023-04-17 18:31:58,381 ERROR       raise GitCommandError(command, status, stderr_value, stdout_value)
2023-04-17 18:31:58,381 ERROR   git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
2023-04-17 18:31:58,381 ERROR     cmdline: git clone git@gitrepo:c3s/era5.git /home/cds/compute_workflows/c3s/era5/master
2023-04-17 18:31:58,381 ERROR     stdout: 'Cloning into '/home/cds/compute_workflows/c3s/era5/master'...'
2023-04-17 18:31:58,381 ERROR     stderr: 'Warning: Permanently added the RSA host key for IP address '192.168.0.248' to the list of known hosts.
2023-04-17 18:31:58,381 ERROR   GitLab: The project you were looking for could not be found.
2023-04-17 18:31:58,381 ERROR   fatal: Could not read from remote repository.

Traceback (most recent call last):
  File "/Users/data/era5/get_daily_data.py", line 5, in <module>
    result = c.service("tool.toolbox.orchestrator.workflow",
  File "/usr/local/lib/python3.9/site-packages/cdsapi/api.py", line 382, in service
    result = self._api(
  File "/usr/local/lib/python3.9/site-packages/cdsapi/api.py", line 519, in _api
    raise Exception(
Exception: an internal error occurred processing your request. Cmd('git') failed due to: exit code(128)
  cmdline: git clone git@gitrepo:c3s/era5.git /home/cds/compute_workflows/c3s/era5/master
  stdout: 'Cloning into '/home/cds/compute_workflows/c3s/era5/master'...'
  stderr: 'Warning: Permanently added the RSA host key for IP address '192.168.0.248' to the list of known hosts.
GitLab: The project you were looking for could not be found.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.'.

____________________________ My code below ____________________________

import cdsapi

c = cdsapi.Client()

MONTHS = ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12"]

for month in MONTHS:
    result = c.service(
        "tool.toolbox.orchestrator.workflow",
        params={
            "realm": "c3s",
            "project": "era5",
            "version": "master",
            "kwargs": {
                "dataset": "reanalysis-era5-single-levels", "product_type": "reanalysis",
                "variable": "2m_temperature", "statistic": "daily_mean",
                "year": "2020", "month": month,
                "time_zone": "UTC+00:0", "frequency": "1-hourly", "grid": "2.5/2.5",
                "area": {"lat": [-90, 90], "lon": [-180, 180]},
            },
            "workflow_name": "application",
        },
    )
    c.download(result)

3 Comments

  1. Hi Brent,

    The values for "realm" and "project" were changed after the documentation was produced, which would explain the "project you were looking for could not be found" error for c3s/era5 in your log. You just need to update those two values in your script, i.e.:


    import cdsapi

    c = cdsapi.Client()

    MONTHS = ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12"]

    for month in MONTHS:
        result = c.service(
            "tool.toolbox.orchestrator.workflow",
            params={
                "realm": "user-apps",
                "project": "app-c3s-daily-era5-statistics",
                "version": "master",
                "kwargs": {
                    "dataset": "reanalysis-era5-single-levels", "product_type": "reanalysis",
                    "variable": "2m_temperature", "statistic": "daily_mean",
                    "year": "2020", "month": month,
                    "time_zone": "UTC+00:0", "frequency": "1-hourly", "grid": "2.5/2.5",
                    "area": {"lat": [-90, 90], "lon": [-180, 180]},
                },
                "workflow_name": "application",
            },
        )
        c.download(result)

    % python3 era5_daily_via_cdsapi_forum_180423.py
    2023-04-18 16:10:07,271 INFO Welcome to the CDS
    2023-04-18 16:10:07,271 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/tasks/services/tool/toolbox/orchestrator/workflow/clientid-fbb3f759881246c08d9f9ef64416cbd4
    2023-04-18 16:10:07,390 INFO Request is queued
    2023-04-18 16:18:26,966 INFO Request is completed
    2023-04-18 16:18:26,970 INFO Downloading https://download-0019.copernicus-climate.eu/cache-compute-0019/cache/data8/052b32c4-cf41-424c-9f79-e01fd6dec194.nc to 052b32c4-cf41-424c-9f79-e01fd6dec194.nc (1.3M)
    2023-04-18 16:18:27,680 INFO Download rate 1.8M/s
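
As a minimal sketch, the request could also be built by a small helper so that "realm" and "project" live in one place and are easy to update if they change again. `build_params`, `REALM`, and `PROJECT` are hypothetical names introduced here, not part of cdsapi; the keys and values are copied from the script above, and the cdsapi calls are left commented out so the block runs without CDS credentials.

```python
# Sketch only: build_params is a hypothetical helper, not part of cdsapi.
REALM = "user-apps"
PROJECT = "app-c3s-daily-era5-statistics"


def build_params(year, month, variable="2m_temperature", statistic="daily_mean"):
    """Return the params dict for one daily-statistics workflow request."""
    return {
        "realm": REALM,
        "project": PROJECT,
        "version": "master",
        "kwargs": {
            "dataset": "reanalysis-era5-single-levels",
            "product_type": "reanalysis",
            "variable": variable,
            "statistic": statistic,
            "year": year,
            "month": month,
            "time_zone": "UTC+00:0",
            "frequency": "1-hourly",
            "grid": "2.5/2.5",
            "area": {"lat": [-90, 90], "lon": [-180, 180]},
        },
        "workflow_name": "application",
    }


# One request per month of 2020, months formatted as "01" .. "12".
MONTHS = ["%02d" % m for m in range(1, 13)]
all_params = [build_params("2020", month) for month in MONTHS]

# With a configured client, the loop from the script above would become:
#   import cdsapi
#   c = cdsapi.Client()
#   for params in all_params:
#       result = c.service("tool.toolbox.orchestrator.workflow", params=params)
#       c.download(result)
```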


    % ncdump -h 052b32c4-cf41-424c-9f79-e01fd6dec194.nc
    netcdf \052b32c4-cf41-424c-9f79-e01fd6dec194 {
    dimensions:
        time = 31 ;
        lat = 73 ;
        lon = 144 ;
    variables:
        int64 time(time) ;
            time:long_name = "time" ;
            time:standard_name = "time" ;
            time:axis = "T" ;
            time:stored_direction = "increasing" ;
            time:type = "double" ;
            time:units = "days since 2020-01-01" ;
            time:calendar = "proleptic_gregorian" ;
        int64 realization ;
            realization:long_name = "realization" ;
            realization:units = "1" ;
            realization:standard_name = "realization" ;
            realization:stored_direction = "increasing" ;
            realization:type = "integer" ;
        double lat(lat) ;
            lat:_FillValue = NaN ;
            lat:units = "degrees_north" ;
            lat:standard_name = "latitude" ;
            lat:long_name = "latitude" ;
            lat:stored_direction = "decreasing" ;
            lat:axis = "Y" ;
            lat:positive = "up" ;
            lat:type = "double" ;
            lat:valid_max = 90. ;
            lat:valid_min = -90. ;
        double lon(lon) ;
            lon:_FillValue = NaN ;
            lon:units = "degrees_east" ;
            lon:standard_name = "longitude" ;
            lon:long_name = "longitude" ;
            lon:axis = "X" ;
            lon:positive = "up" ;
            lon:type = "double" ;
            lon:valid_max = 360. ;
            lon:valid_min = -180. ;
        float t2m(time, lat, lon) ;
            t2m:_FillValue = NaNf ;
            t2m:long_name = "2 metre temperature" ;
            t2m:units = "K" ;
            t2m:standard_name = "air_temperature" ;
            t2m:comment = "near-surface (usually, 2 meter) air temperature" ;
            t2m:cds_magics_style_name = "near-surface-air-temperature" ;
            t2m:type = "real" ;
            t2m:coordinates = "realization" ;

    // global attributes:
            :Conventions = "CF-1.7" ;
            :institution = "European Centre for Medium-Range Weather Forecasts" ;
            :history = "2023-04-18T15:18 GRIB to CDM+CF via cfgrib-0.9.9.1/ecCodes-2.27.0 with {\"source\": \"/nfs/compute-0019/data1/adaptor.mars.internal-1681831066.585602-8010-6-cc816243-dc94-442d-9747-037c4ce046e3.grib\", \"filter_by_keys\": {}, \"encode_cf\": [\"parameter\", \"time\", \"geography\", \"vertical\"]}" ;
            :source = "ECMWF" ;
    }
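
For reading the file afterwards, the header above says the time axis is stored as integer "days since 2020-01-01" and t2m is in K. Below is a rough sketch of decoding both by hand using only the standard library. The function names are made up for illustration, the temperature values are invented samples, and the day offsets 0..30 are an assumption (the header only shows `time = 31`, not the values). A netCDF library would normally handle the time decoding for you.

```python
from datetime import date, timedelta

# Reference date taken from time:units = "days since 2020-01-01" in the header.
TIME_ORIGIN = date(2020, 1, 1)


def decode_days(offsets, origin=TIME_ORIGIN):
    """Convert integer day offsets into calendar dates."""
    return [origin + timedelta(days=int(n)) for n in offsets]


def kelvin_to_celsius(values):
    """Convert temperatures from K (t2m:units) to degrees Celsius."""
    return [v - 273.15 for v in values]


# Assumption: the 31 time values are the offsets 0..30, one per day.
dates = decode_days(range(31))
print(dates[0], dates[-1])

# Invented sample values, only to show the unit conversion.
sample_celsius = kelvin_to_celsius([273.15, 283.15])
print(sample_celsius)
```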


    This page has more information and examples:

    Retrieve daily ERA5/ERA5-Land data using the CDS API


    Thanks,

    Kevin

  2. Hi Kevin,


    Thanks for your reply.  Changing those parameters fixed the issue.

    Are there updated docs for the API that list all the available parameters and functions? I was trying to specify the output filename and thought there would be an easy way, but I wasn't able to do it through the Toolbox. I was able to do it with some code (yours?) that I've copied below, but it seems kludgy and I feel like I'm missing some obvious way to do it.

    Thanks again!


    --Brent



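    import requests

    # file_name must be set beforehand, e.g. once per month inside the loop.
    location = result[0]['location']  # URL of the file produced by the request
    res = requests.get(location, stream=True)
    print("Writing data to " + file_name)
    with open(file_name, 'wb') as fh:  # the with block closes the file itself
        for r in res.iter_content(chunk_size=1024):
            fh.write(r)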
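
A slightly tidier variant of the snippet above might wrap the writing step in a function and build the filename per month. This is only a sketch: `monthly_name` and `save_chunks` are invented names, and the filename pattern is an arbitrary example. The demonstration uses fake bytes so it runs offline; the commented-out lines show where the same `requests.get` and `iter_content` calls would plug in.

```python
import os
import tempfile


def monthly_name(year, month, prefix="era5_daily_t2m"):
    """Build an output filename from a prefix, year and month (example pattern)."""
    return "%s_%s_%s.nc" % (prefix, year, month)


def save_chunks(chunks, file_name):
    """Write an iterable of byte chunks to file_name and return the byte count."""
    written = 0
    with open(file_name, "wb") as fh:  # closed automatically on exit
        for chunk in chunks:
            written += fh.write(chunk)
    return written


# Offline demonstration with fake data instead of a real download.
demo_path = os.path.join(tempfile.mkdtemp(), monthly_name("2020", "01"))
demo_bytes = save_chunks([b"abc", b"de"], demo_path)

# Inside the download loop, with `result` returned by c.service(...):
#   import requests
#   res = requests.get(result[0]["location"], stream=True)
#   res.raise_for_status()
#   save_chunks(res.iter_content(chunk_size=1024), monthly_name("2020", month))
```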

  3. Hi Brent,

    Re: API functions and parameters, I believe the best reference at the moment is the code itself: https://github.com/ecmwf/cdsapi

    I don't think the current version of the Toolbox lets you specify the output filename directly, but this may become possible in the future as the CDS develops.

    Thanks,

    Kevin