The odb import tool can import data in the "wide" format (as produced by the odb sql tool run with the "-f wide" option):
$ ./odb sql select \* -i 2000010106.1.0.odb -f wide -o 2000010106.1.0.csv
$ head -n 1 2000010106.1.0.csv
expver@desc:string andate@desc:integer antime@desc:integer seqno@hdr:integer obstype@hdr:integer obschar@hdr:Bitfield[codetype:9;instype:10;retrtype:6;geoarea:6] subtype@hdr:integer date@hdr:integer time@hdr:integer rdbflag@hdr:Bitfield[lat_humon:1;lat_qcsub:1;lat_override:1;lat_flag:2;lat_hqc_flag:1;lon_humon:1;lon_qcsub:1;lon_override:1;lon_flag:2;lon_hqc_flag:1;date_humon:1;date_qcsub:1;date_override:1;date_flag:2;date_hqc_flag:1;time_humon:1;time_qcsub:1;time_override:1;time_flag:2;time_hqc_flag:1;stalt_humon:1;stalt_qcsub:1;stalt_override:1;stalt_flag:2;stalt_hqc_flag:1] status@hdr:Bitfield[active:1;passive:1;rejected:1;blacklisted:1;monthly:1;constant:1;experimental:1;whitelist:1] ...
The header of the text format is a list of column descriptions, each in the format <column-name>:<type>.
The type can be:
- REAL
- DOUBLE
- INTEGER
- STRING
- BITFIELD
For the last type, BITFIELD, the list of fields and their sizes in bits follows in square brackets, for example:
rdbflag@hdr:Bitfield[lat_humon:1;lat_qcsub:1;lat_override:1;lat_flag:2;lat_hqc_flag:1;lon_humon:1;lon_qcsub:1;lon_override:1;lon_flag:2;lon_hqc_flag:1;date_humon:1;date_qcsub:1;date_override:1;date_flag:2;date_hqc_flag:1;time_humon:1;time_qcsub:1;time_override:1;time_flag:2;time_hqc_flag:1;stalt_humon:1;stalt_qcsub:1;stalt_override:1;stalt_flag:2;stalt_hqc_flag:1]
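For illustration, a minimal wide-format file could look like this (hypothetical column names and values; fields separated by TABs, shown here as spaces):

lat@hdr:REAL    lon@hdr:REAL    obsvalue@body:DOUBLE    nlev@body:INTEGER
51.150          -1.250          287.400                 3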
So, importing CSV text data (TAB-delimited, like the file produced with the odb sql tool in the example above) into ODB can be done as follows:
$ ./odb import -d TAB 2000010106.1.0.csv 2000010106.1.0.imported.odb
The delimiter can be changed with the -d option; by default it is ','.
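For instance, a comma-delimited file (hypothetical file names here) can be imported without the -d option:

$ ./odb import data.csv data.odb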
Regarding the CSV data, keep in mind the current limitation that strings can be at most 8 characters long.
Converting from other binary formats, e.g. NetCDF, to ODB via an intermediate ASCII file should be avoided due to loss of precision (unless the data is printed with full precision).
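To illustrate the precision point with a generic Python sketch (not ODB-specific): a double printed with too few digits cannot be recovered, whereas 17 significant digits are enough to round-trip it.

x = 287.123456789012345   # a double-precision value
print('%.3f' % x)         # '287.123'            -- everything past 3 decimals is lost
print('%.17g' % x)        # '287.12345678901235' -- enough digits to round-trip a double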
6 Comments
Ian Harris
Hi, what is the maximum number of fields that can be imported? I have to convert a PP file with 148 fields per line (plus 3 date fields at the start), and ODB's having none of it.
A version with only eight fields (plus 3) imported fine.
So, given that I'll have to convert the single file into multiples with a portion of the fields in each, what is the maximum safe field count please?
I'd rather not use trial and error!
Cheers
Harry
Peter Kuchta
I am not aware of any fixed limit on the number of fields. Can you please provide the file so I can try to reproduce the problem?
Ian Harris
Thanks Peter - I'll check the file itself first. I think the text editor I was using might have corrupted it. It's good to know that there's no limit - one less thing in the equation!
Have a good weekend.
Cheers, Harry
Ian Harris
I tried with a fresh copy, using a better editor (gedit) to add the INTEGER and REAL tags in the header line. I get exactly the same error, which can be summarised as:
Unknown type: 'REAL
(that's exactly how it displays). That error is coming from Columns.cc.
I will email you a ten-record version as the original has over 50,000 records. However this smaller version does have all the fields (3 + 148) and still gives the error.
I'm out until Tuesday so no rush!
Cheers
Harry
Peter Kuchta
Hi Ian, thanks for the email with the sample CSV file. I had a look and it turned out the problem was caused by the fact that some of the fields (column descriptions) in the header were separated with a tab and some with a space character. I converted the file so that only tabs were used as separators, using the following Python script (redirecting its output to a file converted.csv):
with open('C3-EURO4M-MEDARE_PP.1st10fields.txt') as f:
    for line in f:
        # split on any run of whitespace, then re-join with single tabs
        print('\t'.join(line.split()))
and then successfully imported with
$ odb import -d TAB converted.csv data.odb
Please note that although the import tool was able to create a file with column names like "60100-0", "60115-0", etc., we generally recommend column names that look like valid identifiers in programming languages such as C, Java or Python, i.e. consisting only of alphanumeric characters and starting with a letter. By convention, at ECMWF we use only lower-case letters in column names. We have not tested the ODB SQL engine with identifiers like "60100-0"; I suspect such names can cause problems when working with SQL.
Also, there are several columns with the same name in the provided text file. This will certainly be a problem when using the SQL engine; the columns in a file should have unique names.
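As a purely hypothetical sketch (not part of the ODB tools), a few lines of Python along these lines could rewrite such header tokens into unique, identifier-like names before import:

import re

def sanitize(names):
    # map raw header tokens to unique, identifier-like column names
    seen = {}
    out = []
    for name in names:
        clean = re.sub(r'\W', '_', name).lower()   # keep only alphanumerics/underscores
        if not clean or not clean[0].isalpha():
            clean = 'c_' + clean                   # ensure it starts with a letter
        n = seen.get(clean, 0)
        seen[clean] = n + 1
        out.append(clean if n == 0 else '%s_%d' % (clean, n))
    return out

print(sanitize(['60100-0', '60100-0', '60115-0']))
# -> ['c_60100_0', 'c_60100_0_1', 'c_60115_0']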
Ian Harris
Thank you, Peter - that's very helpful.
I'll get back to Manola and team; they have some reconsidering to do!
Cheers
Harry