--
You received this message because you are subscribed to the Google Groups "User Level Fault Mitigation" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ulfm+uns...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ulfm/76a14f57-7f8a-44a5-a5bb-e61d4df904a2%40googlegroups.com.
Hi Pedro,

Try setting the --enable-recovery flag on the external mpirun. It might be enough.

Another approach could be to `singularity run mpirun -np x executable` while at the same time using -mca orte_launch_agent=`singularity run orted`; the Singularity environment should then be inherited by the executable as well.
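
To make the two options concrete, here is a rough dry-run sketch that only prints the command lines instead of launching anything. The image name (ulfm.sif), the rank count and the application path are placeholders, not values from this thread. It also uses `singularity exec <image> <command>` rather than `singularity run`, which is an assumption about how the container is invoked, and the quoting of the launch agent value may need adjusting for your shell and Open MPI version.

```shell
#!/bin/sh
# Placeholders (assumptions, not from the thread): adjust to your setup.
IMAGE=ulfm.sif   # hypothetical container image with ULFM Open MPI installed
NP=4             # hypothetical number of ranks
APP=./my_app     # hypothetical MPI executable inside the image

# Option 1: host-side mpirun with recovery enabled; each rank runs in the container.
OPT1="mpirun --enable-recovery -np $NP singularity exec $IMAGE $APP"

# Option 2: mpirun runs inside the container, and the launch agent (orted)
# is also started through Singularity, so the ranks inherit the container environment.
OPT2="singularity exec $IMAGE mpirun -np $NP -mca orte_launch_agent \"singularity exec $IMAGE orted\" $APP"

# Dry run: print the commands so they can be reviewed before running them by hand.
echo "$OPT1"
echo "$OPT2"
```

Whether option 1 is sufficient depends on the host mpirun being a ULFM build; if the host has a stock Open MPI, the flag will probably not help and option 2 is the one to try.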
Best,
Aurelien
On Aug 27, 2019, at 14:13, Pedro Henrique Rosso <pedro...@gmail.com> wrote:
Hey Aurelien, thanks for the reply,

I just followed the Singularity 3.3 User Guide, which is similar to the one you posted but for the newest version of Singularity. I figured out that calling mpirun on the host to launch Singularity containers uses the host OpenMPI's process management, not ULFM's, so I can't use some ULFM features, such as not cleaning up the MPI job when a process dies.

Maybe I just have to try a new approach, something like launching instances of the container with ULFM and then launching my program inside the containers via passwordless ssh or something.

Sincerely,
Pedro.