Below is a list of the possible technical solutions identified so far for archiving time-series datasets. New proposals and links to subject matter experts are still welcome.
NetCDF is currently the first-choice candidate format for storing the TIGGE time-series data. The related technical work (choosing the exact file structure, defining the metadata, writing the maintenance scripts, etc.) is in progress.
Relational database (e.g. PostgreSQL)
pros:
- flexible
- conditional verification
- SQL queries
- easy to implement
cons:
- possibly very slow and hard to maintain, as the estimated annual increase is around 100 TB
- users would probably need a tool that exports database values into a common output data file
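As an illustration of the "SQL queries" and "conditional verification" points above, here is a minimal sketch, assuming that conditional verification means scoring forecasts only for cases that meet a condition. An in-memory SQLite database stands in for PostgreSQL, and the table name, column names and all values are hypothetical and made up for the example; they are not a proposed schema or real TIGGE data.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE forecast (
        station      TEXT,
        valid_time   TEXT,
        step_hours   INTEGER,
        member       INTEGER,
        t2m_forecast REAL,   -- 2 m temperature forecast in K
        t2m_observed REAL    -- matching observation in K
    )
    """
)

# Made-up sample rows: two ensemble members for two valid times.
rows = [
    ("A", "2024-01-01T12:00", 24, 0, 271.0, 272.5),
    ("A", "2024-01-01T12:00", 24, 1, 273.0, 272.5),
    ("A", "2024-01-02T12:00", 24, 0, 280.0, 279.0),
    ("A", "2024-01-02T12:00", 24, 1, 282.0, 279.0),
]
conn.executemany("INSERT INTO forecast VALUES (?, ?, ?, ?, ?, ?)", rows)

# Conditional verification: mean absolute error of 24 h forecasts,
# restricted to cases where the observation was below freezing.
query = """
    SELECT station,
           AVG(ABS(t2m_forecast - t2m_observed)) AS mae,
           COUNT(*) AS n
    FROM forecast
    WHERE step_hours = 24 AND t2m_observed < 273.15
    GROUP BY station
"""
result = conn.execute(query).fetchall()
print(result)
```

The condition lives entirely in the WHERE clause, so changing the verification subset does not require rewriting any stored data. Whether such queries stay fast at the data volumes quoted above is exactly the open question raised in the cons.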
ODB
pros:
- full support at ECMWF
- can be stored in ECMWF MARS and use all its capability
- SQL-like queries
- compression
cons:
- not a widely known format?
- not very flexible if any re-computations are needed
NetCDF
pros:
- well known format
- compression
cons:
- no ECMWF MARS support
- not very flexible if any re-computations are needed
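To give a feel for the "compression" point, the sketch below runs a lossless deflate round trip on a synthetic time series using only the Python standard library. It is not NetCDF code: NetCDF-4 can apply deflate compression internally, and `zlib` is used here purely as a stand-in for that step. The series values are made up, and real data will compress far less well than this perfectly periodic example.

```python
import array
import zlib

# Synthetic series: a 24-value daily cycle repeated for 100 days (made-up values).
daily_cycle = [280.0 + 0.5 * abs(12 - hour) for hour in range(24)]
series = array.array("f", daily_cycle * 100)  # 32-bit floats

# Compress the raw bytes with deflate, then restore them.
raw = series.tobytes()
packed = zlib.compress(raw, 6)
restored = array.array("f")
restored.frombytes(zlib.decompress(packed))

# Report the raw and compressed sizes in bytes.
print(len(raw), len(packed))
```

The round trip is lossless, so the restored series is identical to the original. The actual saving for an archive depends on the variable, its precision and the chosen file structure, and would need to be measured on real files.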
BUFR
pros:
- fast access
- full support at ECMWF - new API in preparation
- can be stored in ECMWF MARS and use all its capability (?)
- compression
cons:
- a special setup is probably required if a data policy applies to external users (one file cannot contain all information)
- not very flexible if any re-computations are needed
Other possibilities
- some sort of GRIB?