Willem,
We think the problem is caused by a delay in the shared file system. When the job finishes, the heartbeat information has not yet been synced to the file system, so makeflow incorrectly concludes that the job was lost. For now, we think it should be safe for you to run with the --disable-heartbeat command-line option. (The heartbeat only comes into play when the batch system itself terminates the job, e.g., because of a timeout.)
Since your jobs are finishing correctly anyway, we think we can improve the heartbeat check to take this into account, but we need to run some tests first. In any case, you may need to run the workflows with --wait-for-files-upto=N (where N is a number of seconds, e.g., 60) to give the file system a chance to sync.
It may be worth running a test with --wait-for-files-upto=60 and without the --disable-heartbeat option. Note that this won't make a difference if the problem is that the heartbeat file is created but not updated in time, rather than not created at all.
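To make the distinction between "not created" and "not updated in time" concrete, here is a rough Python sketch. It is not makeflow's actual code: the function names wait_for_file and is_fresh, and the polling interval, are made up for illustration. The first function mimics the general idea of waiting up to N seconds for a file to appear; the second checks whether an existing file was modified recently enough.

```python
import os
import time


def wait_for_file(path, timeout_s, poll_s=0.1):
    """Poll until `path` exists or `timeout_s` seconds pass.

    Hypothetical helper: returns True if the file showed up in time.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


def is_fresh(path, max_age_s):
    """Return True if `path` was modified within the last `max_age_s` seconds.

    Hypothetical helper: a missing or unreadable file counts as not fresh.
    """
    try:
        age_s = time.time() - os.path.getmtime(path)
    except OSError:
        return False
    return age_s <= max_age_s
```

Waiting longer only helps the first check. A file that already exists but carries a stale modification time passes wait_for_file immediately and still fails is_fresh.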
Ben