I experienced some odd behaviour today with cy40r1 of OpenIFS. I am running an aqua-planet configuration with both Tq63 and Tq106 horizontal resolution. Both experiments worked fine. I then added the grib codes 121 and 122 (Maximum temperature at 2 metre in the last 6 hours and Minimum temperature at 2 metre in the last 6 hours) to the MFPPHY namelist option. This was the only change. With both model resolutions, the model ran successfully (and wrote output files) for 64 days and then crashed. In the NODE file there was this error
NSTEP = 3072 SCAN2M_HPOS P
FIELD IN BUFFER BUT PP NOT REQUESTED ...******
IO-STREAM SETUP - IOTYPE = 2 NUMIOPROCS = 1 CPATH =
ICMGGguiu+003072 MODE=w
and in the error file
13:44:02 STEP 3071 H=1535:30 +CPU= 0.077
GRIB_SET_INT 7 endStep 5529600 FAILED -25
GRIB_API ERROR MSG: Unable to set step
GRIB_SET_INT 7 endStep 5529600 FAILED -25
GRIB_API ERROR MSG: Unable to set step
MPL_ABORT: CALLED FROM PROCESSOR 3 THRD 1
MPL_ABORT: CALLED FROM PROCESSOR 4 THRD 1
MPL_ABORT: THRD 1 GRIB_SET_VALUE FAILED
MPL_ABORT: THRD 1 GRIB_SET_VALUE FAILED
I have no idea what is causing this. However it is not a big problem for me as I don't really need these two diagnostics. I am just writing this to let others know of this problem.
Victoria
7 Comments
Glenn Carver
Hi Victoria,
Thanks for noting this. It is a known issue though we don't have a fix yet. The problem is the 'endStep' has too high an integer value to be encoded into the grib file, so it's related to the length of the run.
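To make the link to run length concrete, here is a minimal sketch that relates the `endStep` value in the error message to the model step and elapsed time. It assumes that `endStep` is expressed in seconds and that the `H=` field on the `STEP` log line is elapsed hours:minutes; both are inferred from the logs in this thread, not taken from OpenIFS documentation. The function names are hypothetical.

```python
def timestep_seconds(step, hhmm):
    """Infer the model timestep from a 'STEP n H=hhh:mm' log line.

    Assumption: H= is the elapsed forecast time in hours:minutes at that step.
    """
    hours, minutes = hhmm.split(":")
    elapsed_s = int(hours) * 3600 + int(minutes) * 60
    return elapsed_s / step


def summarise(step, hhmm, end_step_s):
    """Relate the failing endStep (assumed to be in seconds) to step and days."""
    dt = timestep_seconds(step, hhmm)
    return {
        "timestep_s": dt,
        "failing_step": end_step_s / dt,
        "elapsed_days": end_step_s / 86400,
    }


# Values copied from the two error logs quoted in this thread.
print(summarise(3071, "1535:30", 5529600))  # cy40r1 aqua-planet run
print(summarise(791, "131:50", 475200))     # 43r3 T255L91 forecast
```

Under these assumptions the first failure corresponds to step 3072 with an 1800 s timestep (64 days into the run) and the second to step 792 with a 600 s timestep (5.5 days), which matches the day counts reported by the posters.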
When we have a fix for it, I'll post an update here.
Glenn
Jan Streffing
I just ran into the same error.
Thank you for providing the information about it here. That made it easy to guess that it's GRIB codes 201 and 202 for me.
Best regards,
Jan
Victoria Sinclair
I have just had the same problem again. Same time step (step 2047) but this time the cause of the problem was grib code 49 which is "10 metre wind gust since previous post-processing". Shortname 10fg.
This was with cy40r1. Is there a fix for this yet?
Victoria
Glenn Carver
Hi Victoria,
I never completely tracked down the problem but I think it's related to a grib packing problem with the fields that require a difference between successive timesteps, that happens on the longer runs. One option is to follow what EC-Earth do and restart the model and reset the date offset so the grib packing doesn't see such big numbers. At least, I think that's the problem.
Glenn
Etienne Tourigny
Strangely, I had the same error today with OpenIFS 43r3 using the T255L91 resolution and an 11-day forecast (no restarts).
I was able to run the forecast when removing 201 and 202 variables from the output namelist.
I think these variables are fairly standard and cannot be recovered from raw output, so it would be nice to fix it soon.
Please let me know if you need more information about the setup. Here is the output of the job runscript.
This happened on day 6 of the forecast. Maybe the initial condition file has something funny?
13:50:36 STEP 791 H= 131:50 +CPU= 0.336
GRIB_SET_INT 8 endStep 475200 FAILED -25
GRIB_API ERROR MSG: Unable to set step
MPL_ABORT: CALLED FROM PROCESSOR 17 THRD 1
Etienne Tourigny
Actually, I also had to disable the following variable (in addition to 201 and 202 already mentioned above). It now works fine!
Glenn Carver
Hi Etienne,
Yes, this is a known problem with the variables you've identified. It affects the accumulated variables which compute a difference between steps. This was originally a problem with OpenIFS 40 but also occurs with 43.
The IFS and OpenIFS code is identical in computing these fields. However, the big difference is the output. OpenIFS writes the fields through the master task only, using the 'older' I/O approach. The IFS writes out via a different method using its parallel I/O server to the Field Database (FDB). When I looked at this before, my suspicion was the different output code parts accounted for the behaviour. I've not had time to get back to look at this as more urgent things tend to take over. This isn't something I've got time to look at any time soon either unfortunately.
Glenn
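Until there is a fix, the workaround reported in this thread is to remove the affected codes from the requested output. A minimal sketch of that filtering step, assuming the requested codes are held in a plain Python list before the namelist is written; the helper name and the example list are hypothetical, and the set only contains codes that posters in this thread reported as triggering the failure, so it may not be complete.

```python
# GRIB codes reported in this thread as triggering the endStep failure.
AFFECTED_CODES = {49, 121, 122, 201, 202}


def drop_affected(codes, affected=AFFECTED_CODES):
    """Return the requested codes with the affected ones removed, order kept."""
    return [code for code in codes if code not in affected]


# Hypothetical request list, for illustration only.
requested = [165, 166, 167, 121, 122, 49]
print(drop_affected(requested))
```

Keeping the affected set as a default argument makes it easy to extend if further codes turn out to cause the same error.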