Warn if no slaves appear!


chris....@gmail.com

Sep 30, 2015, 6:31:04 AM
to SMufin
Hi devs,

I was just asked to take a look at a Smufin process that had been running for 3 weeks without apparently achieving anything. It turns out it was launched with -np 1 but --cpus_per_node=96. Clearly the user expected this to use one *core* on the machine as the master and the other 95 as slaves, but in fact the master is always single-threaded, and only the slaves (ranks 1 and up, which did not exist in this run) would actually use that setting.

The user hadn't noticed the problem, however, because Smufin busy-waits while zero CPUs are free, rather than exiting promptly when NUM_NODES ends up zero. I suggest adding a check there.
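
To make the suggestion concrete, here is a rough sketch of the kind of check I mean. It is written in Python purely for illustration and is not Smufin's actual code: the function name check_worker_count is hypothetical, and I am assuming NUM_NODES is simply the number of MPI ranks minus the master.

```python
def check_worker_count(world_size: int) -> int:
    """Return the number of slave ranks, or raise if there are none.

    Hypothetical helper: `world_size` is the total number of MPI ranks
    (the value passed to -np). Assumption: rank 0 is the master and
    every other rank is a slave, so NUM_NODES = world_size - 1.
    """
    num_nodes = world_size - 1
    if num_nodes < 1:
        # Fail immediately instead of busy-waiting for slaves that
        # can never appear.
        raise ValueError(
            f"launched with {world_size} rank(s): no slave ranks available; "
            "rerun with -np 2 or more"
        )
    return num_nodes
```

The master would call this once at startup, before entering its scheduling loop, and exit with the error message if it raises. With -np 1 that turns a silent three-week spin into an immediate failure.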

Chris

montse

Oct 5, 2015, 6:07:15 AM
to SMufin
Hi Chris,

Thank you for your suggestion. We will take a look.

Regards,
SMuFin