Hi all
I'm running a coupled model using OpenIFS cy40r1v2 (with some slight modifications) T159 L91 + NEMO ORCA05 L46 + OASIS3-MCT3, and I'm trying to optimise for performance.
What I've found is that OpenIFS output is the biggest bottleneck. If I output T,U,V,W,q on 91 model levels every 6 hours (run name "lucia11"), the model does 1 month in about 600 seconds (600 OpenIFS CPUs + 400 NEMO CPUs), but if I reduce the output to only Temperature on 91 levels and only 12-hourly (run name "lucia12"), the total runtime is down to 200 seconds!
I'm wondering if this is normal for OpenIFS? Is it also the same for the IFS forecast system?
I'm attaching two plots of the time step runtime from ifs.stat (top plot is entire run, NSTOP=744, and lower plot is only first 25 steps).
The first and last time steps always take some time, which is to be expected (reading initial state etc.). The 3rd step also takes a while, which is the first coupling step and also first radiation step, so there's probably some initialisation there. But after that, every 6th step is really slow, and the only thing I can think of that happens every 6th step is output (coupling and radiation are every 3rd step).
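For reference, this is roughly how I'm pulling the per-step times out of ifs.stat (just a quick sketch; I'm assuming the step number and the step time are the two fields right after the STEPO tag, so the indices may need adjusting for a different ifs.stat layout):

# Flag unusually slow steps in ifs.stat.
# Assumption: each STEPO line has the step number and the step time
# in the two fields following the "STEPO" tag -- adjust if your layout differs.
import statistics

def read_steps(path="ifs.stat"):
    steps = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if "STEPO" not in parts:
                continue
            i = parts.index("STEPO")
            try:
                step = int(parts[i + 1])
                t = float(parts[i + 2])
            except (IndexError, ValueError):
                continue
            steps[step] = t
    return steps

steps = read_steps()
typical = statistics.median(steps.values())
for step, t in sorted(steps.items()):
    if t > 2 * typical:
        tag = "output?" if step % 6 == 0 else ("coupling/radiation?" if step % 3 == 0 else "")
        print(f"step {step:4d}  {t:7.2f} s  ({t / typical:.1f}x median)  {tag}")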
Has anyone else experienced similar issues?
I'm using gribapi 1.28.0. Could switching to eccodes make a difference? Are there any tweaks that can be made to speed up the output?
Best regards
Joakim
Glenn Carver
Hi Joakim,
Operational IFS has a parallel I/O server code, where IFS essentially hands over the data to be GRIB encoded and output in parallel to the file system. The I/O server is a significant piece of code, which requires additional post-processing, and is not included with OpenIFS to keep things simpler.
That means OpenIFS does all its output through the master MPI task, which can introduce some overhead. I have not looked at this recently, but some time ago (with OpenIFS 38r1) I did some tests and reckoned the overhead for a "reasonable" parameter output set, with everything going through the master task, was about 10%. I could do a similar test again if you are interested.
For your case, there will of course be a difference in the two runs simply from the extra GRIB encoding the model has to do. This is before any I/O overhead costs. In the first run you are encoding 5x91x4 = 1,820 fields per model day. In the second, it's only 1x91x2 = 182, a factor of 10 difference. GRIB encoding does take some time.
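Just to make the bookkeeping explicit (and assuming your 744 steps are hourly, i.e. a 31-day month), here is the same arithmetic as a quick script:

# GRIB-encoding workload implied by the two output configurations above.
levels = 91
days = 31                          # NSTOP=744 hourly steps ~ one month

fields_lucia11 = 5 * levels * 4    # T,U,V,W,q on 91 levels, 6-hourly
fields_lucia12 = 1 * levels * 2    # T only on 91 levels, 12-hourly

print(fields_lucia11, fields_lucia11 * days)    # 1820 fields/day, 56420 per month
print(fields_lucia12, fields_lucia12 * days)    #  182 fields/day,  5642 per month
print(fields_lucia11 / fields_lucia12)          # factor of 10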
I don't think switching to eccodes will make much difference as you have the latest grib-api release, but you could try it. How did you compile grib-api? Is it built with optimization? If grib-api was compiled at -O0, for instance, that might have an impact on the encoding speed.
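If you want a rough feel for the raw encoding speed on your system, you could try something like the sketch below with the eccodes Python bindings (the grib-api bindings use slightly different names). "output.grb" is just a placeholder for any of your model output files:

# Rough GRIB encoding-speed check: re-encode the values of every message in
# an existing output file and time the packing step.
import time
import eccodes

def encode_rate(path):
    n, t_encode = 0, 0.0
    with open(path, "rb") as f:
        while True:
            gid = eccodes.codes_grib_new_from_file(f)
            if gid is None:
                break
            values = eccodes.codes_get_values(gid)
            t0 = time.perf_counter()
            eccodes.codes_set_values(gid, values)   # forces a re-pack of the field
            t_encode += time.perf_counter() - t0
            n += 1
            eccodes.codes_release(gid)
    return n, t_encode

n, t = encode_rate("output.grb")
print(f"{n} fields re-encoded in {t:.2f} s ({n / t:.1f} fields/s)")

If the rate looks very low compared with another machine or another grib-api build, that would point at the compilation rather than the I/O.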
The underlying filesystem also might be playing a role. It would be interesting to see the performance difference for 1, 2, 3, 4, 5, etc output variables to see if the time increases linearly with output size (and rate) or whether there is a jump at, say, 5 output variables because the filesystem response suddenly changes.
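Once you have run times for 1, 2, 3, 4, 5 output variables, a quick linear fit would show whether the cost per extra variable is roughly constant or jumps at some point. A sketch, with made-up timings just to show the idea:

# Check whether runtime grows linearly with the number of output variables.
import numpy as np

n_vars  = np.array([1, 2, 3, 4, 5])
runtime = np.array([210.0, 260.0, 310.0, 365.0, 590.0])   # placeholder seconds per month, not real data

slope, intercept = np.polyfit(n_vars, runtime, 1)
print(f"fitted cost per extra variable: {slope:.0f} s")
for n, t in zip(n_vars, runtime):
    fit = slope * n + intercept
    flag = "  <-- jump?" if abs(t - fit) > 0.15 * fit else ""
    print(f"{n} vars: {t:6.0f} s (fit {fit:6.0f} s){flag}")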
Another thing you could do is reduce the number of model (or pressure) levels you output on. The NAMFPC namelist lets you choose which levels are output.
Another option is to change the GRIB packing per field. Typically either 12- or 16-bit packing is used, if I remember right. This can be changed in the model, but I wouldn't recommend it without first testing what impact it might have on your diagnostics.
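You could test the effect offline before touching the model, e.g. by re-packing an existing field at different bit counts with the eccodes Python bindings (a sketch only; "output.grb" is a placeholder):

# Re-pack one field at different bitsPerValue and report the message size and
# the maximum packing error relative to the original values.
import numpy as np
import eccodes

with open("output.grb", "rb") as f:
    gid = eccodes.codes_grib_new_from_file(f)
    original = eccodes.codes_get_values(gid)
    for bits in (24, 16, 12, 10):
        eccodes.codes_set(gid, "bitsPerValue", bits)
        eccodes.codes_set_values(gid, original)      # re-pack at the new precision
        size = eccodes.codes_get_message_size(gid)
        err = np.max(np.abs(eccodes.codes_get_values(gid) - original))
        print(f"{bits:2d} bits: {size / 1024.0:8.1f} kB, max abs packing error {err:.3e}")
    eccodes.codes_release(gid)

That gives an idea of both the size saving and the precision loss before you change anything in the model.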
The DrHook tracing utility (see the OpenIFS documentation) can be used to generate a profile trace, so you could see where the time is being spent and compare the two runs.
As you know, there are plans to make the XIOS I/O server available with OpenIFS 43r3, but there is still considerable development needed before it's ready. When it is, you'll have the added benefit of direct netcdf output.
Cheers, Glenn
Glenn Carver
Hi Joakim,
Here's a link to a page that may be of interest: http://cms.ncas.ac.uk/wiki/Projects/OpenIFS-IO This was a project I carried out some time ago with people at the University of Reading. We took part of the operational IFS I/O library, called the Field Database (FDB) library, and added it to OpenIFS. FDB is not an I/O server, but it allows the model to write in parallel from each of its MPI tasks. Doing this complicates the model output, though, because each output parameter is now split across multiple files. A post-processing step is needed to reorganise all the GRIB records (a single model level is encoded into a single GRIB record).
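For illustration, that reorganisation amounts to something like the sketch below: collect the GRIB records written by the individual tasks and rewrite them into a single file ordered by parameter and level. It uses the eccodes Python bindings and hypothetical file names, not the actual FDB tooling we used:

# Merge per-task GRIB output into one file, sorted by (paramId, level).
import glob
import eccodes

records = []
for path in sorted(glob.glob("output_task*.grb")):      # one file per MPI task (hypothetical naming)
    with open(path, "rb") as f:
        while True:
            gid = eccodes.codes_grib_new_from_file(f)
            if gid is None:
                break
            key = (eccodes.codes_get(gid, "paramId"),
                   eccodes.codes_get(gid, "level"))
            records.append((key, eccodes.codes_get_message(gid)))   # raw coded message
            eccodes.codes_release(gid)

with open("merged.grb", "wb") as out:
    for _, msg in sorted(records, key=lambda r: r[0]):
        out.write(msg)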
You can see from the results that we were using the highest model resolution with 1-hourly output. As the OpenIFS I/O is serial, the percentage cost of the model output increases as more processors are used (the cost of the parallel code sections goes down by comparison). At T159 this will not be as severe, though as you show it's still noticeable.
Cheers, Glenn