If a user submits exactly the same CDS API request several times, and the data from the first submission are still in the CDS cache, then the same data are retrieved from the cache rather than being re-extracted from the dataset.
For datasets which are updated on a daily basis (such as ERA5T), this can cause issues, because a later request may be served the older cached data instead of the updated data the user expects.
To avoid this, we suggest that users add the keyword 'nocache' to their CDS API request, with a random numeric string as its value, changed each time the request is submitted.
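As an illustration only, here is a minimal sketch of how the 'nocache' keyword might be added to a request. The helper names (make_nocache_token, with_nocache), the request keys and the three-digit length are assumptions made for this example, not an official recipe; adapt the request body to your own dataset.

```python
import random
import string


def make_nocache_token(length=3):
    """Return a random string made only of digits (length is an arbitrary choice)."""
    return "".join(random.choices(string.digits, k=length))


def with_nocache(request):
    """Return a copy of the request with a fresh 'nocache' value added."""
    fresh = dict(request)  # copy, so the original request is left unchanged
    fresh["nocache"] = make_nocache_token()
    return fresh


# Hypothetical request body; the keys and values depend on the dataset.
base_request = {
    "product_type": "reanalysis",
    "variable": "2m_temperature",
    "year": "2023",
    "month": "01",
    "day": "01",
    "time": "12:00",
    "format": "netcdf",
}

request = with_nocache(base_request)
print(request["nocache"])

# To submit the request (needs the cdsapi package and a configured API key):
# import cdsapi
# client = cdsapi.Client()
# client.retrieve("reanalysis-era5-single-levels", request, "download.nc")
```

Calling with_nocache again just before each submission gives the request a new value each time. With only three digits the same value can occasionally repeat, so a longer string may be worth trying if that matters for your use.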
Hope that helps!
Kevin
3 Comments
Joseph Yang
Thanks for this advice. When I try adding this to the request, it fails about half the time with the error shown below. I'm not sure what could be causing this - are there any requirements for the random string?
Anthony Russel
Joseph,
I was experiencing the same issue, and I have been having success by passing only digits (as a string) as the value for the 'nocache' parameter. I have been generating a random string of three digits and passing that. Here is the Python code I have been using:
import random
import string
...
# Build a random three-digit string
rand_str = ''.join(random.choices(string.digits, k=3))
...
# Inside the request dictionary:
'nocache': rand_str,
I don't know enough about the inner workings of cdsapi to say whether this is a permanent solution, but it has been working for me throughout my testing today.
Kevin Marsh
Hi Joseph, Anthony,
Thanks for your comments; there is a subtle 'feature' of the API which means it is better to use a random numeric string. I've updated the original forum posting accordingly.
Thanks,
Kevin