Hi all,

I'm trying to download ERA5-Land data for 1950-2022 over Senegal, using the latest version of the "ecmwfr" R package. I split the download into separate requests, one per variable and per year. An example of what my requests look like is at the bottom of this post.

Doing this, I came across the following issue: about half of the NetCDF files download successfully, but for the other half I get a "File Not Found (404)" error. There doesn't seem to be a relationship with the variable or the year (e.g. "surface_pressure" for 1976 is fine, but 1975 is not), and the more recent years (>1981) can have this problem too. The requests are marked "Complete", but the R package downloads a webpage containing a 404 error instead of the actual data. On the "Your Requests" page of the CDS website, the erroneous file is also marked "Complete", but when I click the Download button, its status changes to "Unavailable". See screenshots below.

Any idea what is going wrong?

I also found that it takes a very long time before the files become available. The files are processed one by one, and each takes about 25 minutes to process. For ~500 files that means more than a week before everything is ready. Is there a way I could optimize my requests? I thought 36.7 MB would be a reasonable file size.
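
As a rough check of the estimate above, the arithmetic can be written out as follows. This is only a sketch: it assumes the requests are processed strictly one after another, and both numbers are the approximate figures quoted in the post.

```python
# Approximate figures taken from the post; both are rough values.
n_files = 500          # number of variable-year files requested
minutes_per_file = 25  # observed processing time per file

# Assumes strictly sequential processing (no requests running in parallel).
total_minutes = n_files * minutes_per_file
total_days = total_minutes / (60 * 24)

print(f"{total_minutes} minutes is about {total_days:.1f} days")
```

Under these assumptions the total is closer to nine days than to seven, which matters if finished files are only kept for a few days.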

Thank you!

Wim


Before clicking Download:


After clicking Download:


Example of request:

{
  "area": [
    17,
    -17,
    13,
    -12
  ],
  "dataset_short_name": "reanalysis-era5-land",
  "day": [
    "01",
    "02",
    "03",
    "04",
    "05",
    "06",
    "07",
    "08",
    "09",
    "10",
    "11",
    "12",
    "13",
    "14",
    "15",
    "16",
    "17",
    "18",
    "19",
    "20",
    "21",
    "22",
    "23",
    "24",
    "25",
    "26",
    "27",
    "28",
    "29",
    "30",
    "31"
  ],
  "format": "netcdf",
  "month": [
    "01",
    "02",
    "03",
    "04",
    "05",
    "06",
    "07",
    "08",
    "09",
    "10",
    "11",
    "12"
  ],
  "target": "raw/surface_pressure/1975.nc",
  "time": [
    "00:00",
    "01:00",
    "02:00",
    "03:00",
    "04:00",
    "05:00",
    "06:00",
    "07:00",
    "08:00",
    "09:00",
    "10:00",
    "11:00",
    "12:00",
    "13:00",
    "14:00",
    "15:00",
    "16:00",
    "17:00",
    "18:00",
    "19:00",
    "20:00",
    "21:00",
    "22:00",
    "23:00"
  ],
  "variable": "surface_pressure",
  "year": "1975"
}
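
The request above can also be generated rather than written out by hand. The following is a minimal Python sketch (not the ecmwfr code used in the post); `build_request` is a hypothetical helper name, and the field names and values are copied from the example request. It only builds the request dictionary and does not contact the CDS.

```python
import json


def build_request(variable: str, year: int) -> dict:
    """Build one request for a single variable and year (hypothetical helper)."""
    return {
        # Bounding box from the post; assumed order is north, west, south, east.
        "area": [17, -17, 13, -12],
        "dataset_short_name": "reanalysis-era5-land",
        "day": [f"{d:02d}" for d in range(1, 32)],     # "01" .. "31"
        "format": "netcdf",
        "month": [f"{m:02d}" for m in range(1, 13)],   # "01" .. "12"
        "target": f"raw/{variable}/{year}.nc",
        "time": [f"{h:02d}:00" for h in range(24)],    # "00:00" .. "23:00"
        "variable": variable,
        "year": str(year),
    }


request = build_request("surface_pressure", 1975)
print(json.dumps(request, indent=2))
```

Looping `build_request` over the variables and over `range(1950, 2023)` would give the full set of variable-year requests described in the post.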


4 Comments

  1. Hi Wim,

    Re: the "Your requests" page: once your request has run, the data will only remain available from the CDS for a limited time (roughly a few days). After this, the request output is removed to free up resources on the CDS, although it may still be shown as "Available" on the request page until you actually click on the download link or refresh the page. The requests you show are ~5 days old, so I think this is what is happening, and to get the data you will need to re-run the request.

    Hope that helps,

    Kevin

  2. Hi Kevin,

    Thank you, I noticed that I indeed had more of those errors after the weekend! However, as you can see below, it also occurs with files that finished only yesterday.

    I'll re-run the requests and try to download the files as soon as they are available. I'll also immediately delete the requests for the downloaded files; maybe that will remove some pressure.

    Best regards,

    Wim
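
To work out which requests need re-running, one option is to check whether each downloaded ".nc" file really is NetCDF or is a saved 404 web page. The following is a minimal Python sketch, not part of ecmwfr; the helper names are hypothetical. It relies on the NetCDF file signatures: classic-format files start with the bytes "CDF", and NetCDF-4 files are HDF5 containers that start with "\x89HDF". The demo files are written to a temporary directory purely for illustration.

```python
import tempfile
from pathlib import Path

# Leading bytes of NetCDF files: classic formats start with "CDF",
# NetCDF-4 files are HDF5 containers and start with "\x89HDF".
NETCDF_SIGNATURES = (b"CDF", b"\x89HDF")


def looks_like_netcdf(path) -> bool:
    """Return True if the file starts with a known NetCDF signature."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    return head.startswith(NETCDF_SIGNATURES)


def find_bad_downloads(root) -> list:
    """List all *.nc files under root that do not look like NetCDF."""
    return sorted(str(p) for p in Path(root).rglob("*.nc")
                  if not looks_like_netcdf(p))


# Demo with two made-up files: one fake NetCDF header, one saved error page.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "1976.nc").write_bytes(b"CDF\x01" + b"\x00" * 16)
    (Path(tmp) / "1975.nc").write_bytes(b"<html><body>404 Not Found</body></html>")
    bad_files = [Path(p).name for p in find_bad_downloads(tmp)]

print(bad_files)
```

This only inspects the first few bytes, so it catches error pages saved under a ".nc" name but does not verify that a NetCDF file is complete.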


  3. Hi Wim,

    If you request the data using the R package/script, are the data not downloaded to your local system as soon as the requests complete?

    Thanks

    Kevin

  4. The download page returns a 404, and the download is flagged in red as an error.