How to efficiently download more than one year of CERA-20C daily data?

Created by sebastiano piccolroaz on Oct 10, 2019

I would like to download ensemble mean, daily data from the CERA-20C reanalysis product in a fast and efficient way.

On the following page:

CERA-20C Atmospheric model, daily data (enda) retrieval efficiency, section "Requesting ensemble mean (ep), multiple years, surface (sfc)"

an efficient script for downloading this type of data is provided. However, when I run the script (for a small area, not the entire globe), only one month at a time is downloaded from the server, and the download of the next month is queued. The resulting nc files are small because I am interested in a small area and a few parameters, so the corresponding extraction and download times are short. However, the queue time is around 1 hour. Since I need to download 110*12=1320 months, this procedure will take weeks, instead of the few minutes (or seconds) it would take if all the data were downloaded at once.

I wonder whether there is a procedure to download all the data at once.

Thank you.

Sebastiano


8 Comments

Michela Giusti
Hi,
maybe you could try requesting the data grouped by year, or retrieving GRIB files and converting them to nc files locally.

Regards
Michela
- Oct 10, 2019
sebastiano piccolroaz
Thank you for the reply.
I suspect that CERA-20C can only be downloaded month by month. At least this is the case when using the browser to download the data, and it is also the case if I define "date": "19010101/TO/20101231" in my Python script. In that case, multiple requests are sent in series, month by month.
Do you think that defining "format": "grib" would change anything?
Best,
Sebastiano
- Oct 10, 2019
Michela Giusti
Hi,
at least you would save the time needed to convert from GRIB to nc files.

Regards
Michela
- Oct 11, 2019
sebastiano piccolroaz
Thank you Michela,
the problem is not the processing time but the queue time. The processing (extraction, possible conversion, transfer, and downloading) takes a few seconds.
The main issue is being able to combine several months (possibly years) in the same request in order to reduce the queue time (hours).
Regards,
Sebastiano
- Oct 11, 2019
Anabelle Menochet
CERA-20C data is stored on tapes in MARS (the ECMWF archive). Each tape contains one month's worth of data. The most efficient way of retrieving the data is to retrieve everything you need from one tape at a time. In this case, that means one month at a time (you can loop through the months if you wish). If you submit a script that retrieves more than one month at a time, you will end up at the bottom of the queue, possibly waiting days before the data is retrieved. Depending on the workload, your request may even be cancelled, as inefficient requests affect the overall performance of the system for all users.
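The month-by-month loop mentioned above could be sketched roughly as follows. This is only an illustration, not the script from the linked page: the helper names (`monthly_requests`, `download_all`), the target file pattern, and the values in `EXAMPLE_BASE_REQUEST` are assumptions, and the real request keys (class, stream, type, param, area and so on) should be copied from the linked page. It also assumes the `ecmwfapi` Python client is installed and configured.

```python
import calendar


def monthly_requests(base_request, start_year, end_year,
                     target_pattern="cera20c_{year}{month:02d}.nc"):
    """Yield one request dict per calendar month, in chronological order.

    Each request covers exactly one month, i.e. one tape in the archive.
    The target file pattern is a hypothetical naming choice.
    """
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            # monthrange returns (weekday of day 1, number of days in month)
            last_day = calendar.monthrange(year, month)[1]
            request = dict(base_request)  # copy, so the base stays untouched
            request["date"] = (
                f"{year}{month:02d}01/TO/{year}{month:02d}{last_day:02d}"
            )
            request["target"] = target_pattern.format(year=year, month=month)
            yield request


def download_all(base_request, start_year, end_year):
    """Submit the monthly requests one after another."""
    # Imported here so the request builder works without the client installed.
    from ecmwfapi import ECMWFDataServer

    server = ECMWFDataServer()
    for request in monthly_requests(base_request, start_year, end_year):
        server.retrieve(request)


# Assumed placeholder values; replace with the keys from the linked page.
EXAMPLE_BASE_REQUEST = {
    "dataset": "cera20c",
    "format": "netcdf",
}

if __name__ == "__main__":
    requests = list(monthly_requests(EXAMPLE_BASE_REQUEST, 1901, 2010))
    print(len(requests), "monthly requests, first date:", requests[0]["date"])
```

Calling `download_all(EXAMPLE_BASE_REQUEST, 1901, 2010)` would then submit the 1320 monthly requests in series; each one still waits in the queue on its own.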
- Oct 11, 2019
sebastiano piccolroaz
Thank you, Anabelle. So I suppose that the script suggested on the web page I linked above is the most efficient. Still, each monthly request is queued for about 1 hour before the next one starts. However, if this is the most efficient procedure, I'll use it.
Best,
Sebastiano
- Oct 11, 2019
Anabelle Menochet
No problem, Sebastiano.
Indeed, the script on the web page is the most efficient.
Queueing for ~1 hour on MARS is very good going. It may go a little faster during the weekend, when activity slows down somewhat.
Anabelle
- Oct 11, 2019
Nilanjan Debsharma
When I try to download the datasets with the given script, it says that the dataset has been phased out. How can CERA-20C be downloaded through a script in 2023?
- Aug 24, 2023
