There are many reasons why the server may be terminated:
- The machine where the server is running crashes
- The machine is subject to maintenance and needs to be shut down and then restored, e.g. for an operating system upgrade
- A new version of ecflow needs to be installed
When this happens, any scheduled tasks that were due to run during the downtime are missed.
This has now been fixed in ecflow 5.
When the server is restarted, the checkpoint file is read. The server then determines how far the suite calendars lag behind the real clock.
If they are out of date by less than an hour, the suites' time attributes will catch up to real time.
Hence:
- tasks scheduled during the last hour should run
- expired auto-cancelled nodes will be removed
- expired auto-archived nodes will be archived
- expired late tasks will be flagged
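The catch-up rule above can be sketched in a few lines of plain Python. This is only an illustration of the described behaviour, not ecflow's implementation: the function name `tasks_to_catch_up`, the `schedule` dictionary and the task names are hypothetical, and the one-hour limit is taken directly from the text. The text does not say what happens when the lag is an hour or more, so the sketch simply returns nothing in that case.

```python
from datetime import datetime, timedelta

# Assumed threshold, taken from the "less than an hour" rule in the text.
CATCH_UP_LIMIT = timedelta(hours=1)


def tasks_to_catch_up(calendar_time, real_time, schedule):
    """Return the names of tasks whose scheduled time fell in the missed window.

    `schedule` maps a hypothetical task name to its scheduled datetime.
    An empty list is returned when the lag is negative or an hour or more.
    """
    lag = real_time - calendar_time
    if lag < timedelta(0) or lag >= CATCH_UP_LIMIT:
        return []
    return sorted(
        name for name, due in schedule.items()
        if calendar_time < due <= real_time
    )


# Hypothetical example: checkpoint calendar at 09:30, server restarted at 10:15.
checkpoint = datetime(2024, 1, 1, 9, 30)
restart = datetime(2024, 1, 1, 10, 15)
schedule = {
    "t1": datetime(2024, 1, 1, 9, 45),   # missed during the downtime
    "t2": datetime(2024, 1, 1, 10, 30),  # still in the future
    "t3": datetime(2024, 1, 1, 9, 0),    # before the checkpoint
}
print(tasks_to_catch_up(checkpoint, restart, schedule))  # ['t1']
```

Only `t1` falls between the checkpoint calendar time and the restart time, so it is the only task selected for catch-up in this example.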
However, there are a few limitations:
- When we have a time series, if the server is restored before the time series expires, then the job will run only once.
If the server is restored after the time series has expired, then the job will not run.
time 10:00 12:00 00:30 # if the server is restored after 12:00, the task will still not run
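The time series limitation can be illustrated with a small plain-Python model. It reflects one possible reading of the rule above (all slots missed during the downtime collapse into a single run, and none run once the series has expired) and ignores the one-hour catch-up limit. The function names `series_slots` and `catch_up_runs` are hypothetical and are not part of ecflow.

```python
from datetime import datetime, timedelta


def series_slots(start, finish, step):
    """Expand a 'time start finish step' series into its individual slots."""
    slots = []
    current = start
    while current <= finish:
        slots.append(current)
        current += step
    return slots


def catch_up_runs(start, finish, step, down_at, restored_at):
    """Number of catch-up runs for slots missed between down_at and restored_at."""
    missed = [
        slot for slot in series_slots(start, finish, step)
        if down_at < slot <= restored_at
    ]
    if not missed:
        return 0
    # Assumption: restoring after the last slot means the series has expired.
    if restored_at > finish:
        return 0
    # Assumption: however many slots were missed, the job runs only once.
    return 1


# time 10:00 12:00 00:30
start = datetime(2024, 1, 1, 10, 0)
finish = datetime(2024, 1, 1, 12, 0)
step = timedelta(minutes=30)

# Down 10:10 to 11:05: the 10:30 and 11:00 slots are missed, one run results.
print(catch_up_runs(start, finish, step,
                    datetime(2024, 1, 1, 10, 10),
                    datetime(2024, 1, 1, 11, 5)))   # 1

# Down 11:40 to 12:20: the 12:00 slot is missed and the series has expired.
print(catch_up_runs(start, finish, step,
                    datetime(2024, 1, 1, 11, 40),
                    datetime(2024, 1, 1, 12, 20)))  # 0
```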