There are many situations where a user is only interested in a subset of the dataset's spatial domain.
For example, when comparing modelled river flow against observations, it makes sense to extract the time series at the observation point coordinates rather than handle many GB of data. Similarly, when focusing on a specific catchment, you will probably want to work with only that part of the spatial domain.
In summary, two data size reduction operations are very popular on CEMS-Flood datasets: area cropping and time-series extraction.
There are two ways to perform those operations:
- Remotely - Using the CDS API to perform the operation remotely on the CDS compute nodes and retrieve only the reduced data.
- Locally - Using the CDS API to retrieve the entire data and perform the operation locally.
This section provides scripts for both cases and for both CEMS-Flood products, GloFAS and EFAS.
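For the remote option, a CDS API request usually describes a sub-region with an `area` list ordered north, west, south, east. The sketch below only builds that list from a bounding box; the helper name `bbox_to_cds_area`, the `pad` argument and the example bounds are hypothetical, and whether a given dataset honours `area` should be checked on its CDS entry.

```python
def bbox_to_cds_area(lon_min, lat_min, lon_max, lat_max, pad=0.0):
    """Return a [north, west, south, east] list from bounding-box corners.

    Hypothetical helper: `pad` widens the box by the same number of
    degrees on every side, and latitudes are clamped to the valid range.
    """
    if lon_min > lon_max or lat_min > lat_max:
        raise ValueError("Minimum bounds must not exceed maximum bounds")
    north = min(lat_max + pad, 90.0)
    south = max(lat_min - pad, -90.0)
    west = lon_min - pad
    east = lon_max + pad
    return [north, west, south, east]


# Made-up catchment bounds in degrees, padded by half a degree.
area = bbox_to_cds_area(5.0, 45.0, 10.0, 48.0, pad=0.5)
print(area)
```

The resulting list can be placed under the `area` key of a request dictionary, so that only the cropped region is transferred.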
Set up a Python environment
If you have not done so yet, create a Python virtual environment.
Activate the conda environment and install the additional Python package rioxarray (https://corteva.github.io/rioxarray):
conda install rioxarray
Prepare and retrieve data (for local processing)
For the following exercises on extracting time series on the local machine, we are going to use the latitude and longitude coordinates from a tiny subset of the GRDC dataset.
Copy the content of the code block into an empty file named "GRDC.csv". The file should reside in your working folder.
Then, retrieve the following datasets into the same working folder.
EFAS
Removal of subsetting for EFAS
An issue has been identified with the EFAS sub-region extraction tool, whereby it serves data that is not correctly located on the river network. The sub-region extraction tool has therefore been removed from the EFAS CDS entries, and any area specified in cdsapi requests will return the entire domain.
Data previously downloaded using this tool should be disregarded.
For more information, please see EFAS-Known Issues.
Coordinates precision
When transforming from lat/lon (source coordinates) to projected LAEA (target coordinates), keep in mind that the number of decimal places in the source coordinates limits the precision of the target coordinates:
An interval of 0.001 degrees corresponds to about 100 metres in LAEA.
An interval of 0.00001 degrees corresponds to about 1 metre in LAEA.
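The orders of magnitude above can be checked with a rough spherical-Earth calculation. This is a sketch that assumes a mean Earth radius of 6,371 km. It is not the LAEA transformation itself, and the function name `degrees_to_metres` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean radius, spherical approximation


def degrees_to_metres(interval_deg, latitude_deg=0.0, along_longitude=False):
    """Approximate ground distance covered by an interval in degrees.

    Along a meridian the distance does not depend on latitude; along a
    parallel it shrinks with the cosine of the latitude.
    """
    metres = math.radians(interval_deg) * EARTH_RADIUS_M
    if along_longitude:
        metres *= math.cos(math.radians(latitude_deg))
    return metres


for interval in (0.001, 0.00001):
    print(f"{interval} degrees is roughly {degrees_to_metres(interval):.2f} m")
```

With this approximation, three decimal places resolve on the order of 100 metres and five decimal places on the order of 1 metre, consistent with the figures quoted above. East-west distances are shorter at European latitudes.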
Remote processing
Time series extraction:
Area cropping:
Local processing
Time series extraction:
Important - Download upstream area
When EFAS data are converted from GRIB to NetCDF, the x and y coordinates are not projected coordinates but matrix indexes (i, j). It is necessary to download the upstream area file, which contains the projected coordinates, and use them to replace the x and y values in EFAS, as described in the code block below.
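A minimal sketch of the coordinate replacement with xarray is given here. It assumes that both files share the same grid shape and use dimensions named `x` and `y`. The synthetic datasets, the variable names `dis` and `uparea`, the coordinate values and the helper `replace_index_coords` are placeholders for illustration, not the real file contents.

```python
import numpy as np
import xarray as xr


def replace_index_coords(efas, uparea):
    """Copy the projected x/y coordinates of the upstream area grid onto EFAS data."""
    for dim in ("x", "y"):
        if efas.sizes[dim] != uparea.sizes[dim]:
            raise ValueError(f"Grid size mismatch along {dim!r}")
    return efas.assign_coords(x=uparea["x"].values, y=uparea["y"].values)


# Synthetic stand-in for EFAS data whose x/y hold matrix indexes.
efas = xr.Dataset(
    {"dis": (("time", "y", "x"), np.random.rand(2, 3, 4))},
    coords={"time": [0, 1], "y": np.arange(3), "x": np.arange(4)},
)

# Synthetic stand-in for the upstream area file with projected coordinates
# (made-up values in metres on a 5 km grid).
uparea = xr.Dataset(
    {"uparea": (("y", "x"), np.ones((3, 4)))},
    coords={
        "y": 5497500.0 - 5000.0 * np.arange(3),
        "x": 2502500.0 + 5000.0 * np.arange(4),
    },
)

efas = replace_index_coords(efas, uparea)

# With projected coordinates in place, a station can be matched to its
# nearest grid cell and the time series extracted.
point = efas["dis"].sel(x=2508000.0, y=5492000.0, method="nearest")
print(point.x.item(), point.y.item(), point.values)
```

The size check guards against pairing files from different grids, which would otherwise assign coordinates silently to the wrong cells.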