On 31 Jul 2017, at 5:45 am, Manuel López-Ibáñez <manuel.lo...@manchester.ac.uk> wrote:
Hi Markus,
This is a segfault crash in R and the error is happening at the call to target-runner, but it is difficult to say what may be causing it. Is this the only output? In theory, irace cannot crash R, because we do not load any compiled code. Also, in theory, your target-runner cannot crash irace, much less crash R; you should instead get the usual error saying "this is not a bug in irace…".
Still, several things may cause R to crash:
* irace is installed (byte-compiled) using one version of R (in your submit node) and loaded with a (perhaps older) version of R (in your execution node).
* The cluster system (via system limits or cluster limits) is killing the R process (or some child of the R process) because either irace or your program is consuming too much memory, or too much disk space or spawning too many children or…
* A bug in R. Those exist, but I don't know about this one.
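The first two points can be checked from a shell. What follows is only a minimal sketch, assuming `Rscript` is on the PATH of both the submit node and the execution nodes; the function names are made up for illustration and are not part of irace. It prints the R version that is running, the R version recorded in the "Built" field of the installed irace package, and the resource limits visible to the process.

```shell
#!/bin/sh
# Sketch only: report what one node sees, so that the output from the submit
# node and from an execution node can be compared side by side.
# All function names here are hypothetical helpers, not part of irace.

# R version that Rscript reports on this node (for example 3.4.1).
running_r_version() {
    Rscript -e 'cat(as.character(getRversion()))' 2>/dev/null || echo unknown
}

# R version recorded in the "Built" field of the installed irace package.
irace_built_r_version() {
    Rscript -e 'cat(sub("^R ([0-9.]+);.*", "\\1", packageDescription("irace")$Built))' 2>/dev/null || echo unknown
}

# Print OK when both version strings are known and identical,
# UNKNOWN when either could not be determined, MISMATCH otherwise.
compare_versions() {
    if [ "$1" = "unknown" ] || [ "$2" = "unknown" ]; then
        echo "UNKNOWN"
    elif [ "$1" = "$2" ]; then
        echo "OK"
    else
        echo "MISMATCH"
    fi
}

# Print the node name, both R versions, the comparison and the limits.
node_report() {
    running=$(running_r_version)
    built=$(irace_built_r_version)
    echo "node:           $(uname -n)"
    echo "running R:      $running"
    echo "irace built by: $built"
    echo "versions:       $(compare_versions "$running" "$built")"
    echo "limits:"
    ulimit -a
}
```

Calling `node_report` once on the submit node and once from inside a SLURM job gives two reports to compare. If the cluster keeps accounting data, `sacct -j <jobid> --format=JobID,State,ExitCode,MaxRSS` may additionally show whether SLURM itself ended a job and how much memory it used.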
Does it ever happen when running outside SLURM?
Could you try to reproduce the crash when irace is running under valgrind? Just prepend "valgrind --error-exitcode=1 --log-file='irace-%p' --trace-children=yes" to your call to irace.
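As a rough sketch of what that could look like in a job script: the valgrind options are the ones quoted above, while the wrapper name `with_valgrind`, the `DRY_RUN` switch and the irace arguments in the usage line are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: prefix any command with the valgrind options suggested above.
# with_valgrind and DRY_RUN are hypothetical names, not part of irace or valgrind.
# With DRY_RUN=1 the full command is printed instead of being executed.
with_valgrind() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo valgrind --error-exitcode=1 "--log-file=irace-%p" --trace-children=yes "$@"
    else
        # %p is expanded by valgrind to the process ID, so each process
        # (including children, because of --trace-children=yes) gets its own log.
        valgrind --error-exitcode=1 "--log-file=irace-%p" --trace-children=yes "$@"
    fi
}
```

A call such as `with_valgrind irace --scenario scenario.txt` (the scenario file name is a placeholder) would then replace the plain irace call in the job script. Expect the run to be much slower under valgrind.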
Cheers,
Manuel.
Hi Manuel,
Thanks a lot for this.
On 31 Jul 2017, at 5:45 am, Manuel López-Ibáñez <manuel.lopez-ibanez@manchester.ac.uk> wrote:
> Hi Markus,
> This is a segfault crash in R and the error is happening at the call to target-runner, but it is difficult to say what may be causing it. Is this the only output? In theory, irace cannot crash R, because we do not load any compiled code. Also, in theory, your target-runner cannot crash irace, much less crash R; you should instead get the usual error saying "this is not a bug in irace…".
On another cluster, I am getting "this is not a bug in irace…" messages, and I am living with them, although I don’t like them. The segfault messages are new on this cluster.
> Still, several things may cause R to crash:
> * irace is installed (byte-compiled) using one version of R (in your submit node) and loaded with a (perhaps older) version of R (in your execution node).
I am asking my admins to update R now. As I said, this happens randomly. Maybe the configuration of the compute nodes is not consistent...
> * The cluster system (via system limits or cluster limits) is killing the R process (or some child of the R process) because either irace or your program is consuming too much memory, or too much disk space or spawning too many children or…
> * A bug in R. Those exist, but I don't know about this one.
> Does it ever happen when running outside SLURM?
I have not noticed it. I know that I should debug this on my laptop. There is a slight technical problem: the crash probability is about 1/3 in runs that consume a couple of CPU weeks, so it is a little tricky to debug this on a 2-core laptop.
> Could you try to reproduce the crash when irace is running under valgrind? Just prepend "valgrind --error-exitcode=1 --log-file='irace-%p' --trace-children=yes" to your call to irace.
I am asking my admins to install valgrind. Will let you know, but this will take a while…