Error when running OpenFOAM over Singularity using Slurm

ccvera

unread,
Feb 5, 2019, 11:30:02 AM2/5/19
to singularity
Dear all,

I'm experiencing some issues running OpenFOAM over singularity with slurm. 

I have several images based on Ubuntu with different versions of OpenMPI and PMIx, and I'm able to run OpenFOAM correctly without Slurm (directly on the node) using the following command:

$ mpirun -n 16 singularity exec -B /home ../../of6/openfoam6.x.img simpleFoam -parallel -case /home/carmen/test_singularity/OpenFOAM/pruebaOF6_16cores_SLURM/pruebaOF6_16cores

My problem comes when I run my program with Slurm. Whether I use salloc or submit a script with sbatch, I get the following error:

It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "(null)" (-43) instead of "Success" (0)

and

*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[cn3045:369] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

I know the host and the container must have the same OpenMPI version, and I have also tried other OpenMPI versions (2.X.X). In every case OpenFOAM works correctly on its own, but as soon as I run it under Slurm it shows these errors.

I have also tried running the program with srun using the --mpi=pmi2 option (among others), but I always hit the same problem.
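In case it helps, those srun attempts were of this form (an illustrative reconstruction; the image and case path are the same as in the mpirun command above):

```
srun --mpi=pmi2 -n 16 singularity exec -B /home ../../of6/openfoam6.x.img \
    simpleFoam -parallel -case /home/carmen/test_singularity/OpenFOAM/pruebaOF6_16cores_SLURM/pruebaOF6_16cores
```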

I use the following script to run OpenFOAM:
----------------------------
#!/bin/bash

#SBATCH -N 1
#SBATCH -p haswell 
#SBATCH -J test_OpenFOAM
#SBATCH --output="singularity.%j.o" 
#SBATCH --error="singularity.%j.e"

module load haswell/singularity_2.6.0
module load haswell/openmpi_3.1.2_gcc8.2.0_pmix

ulimit -s unlimited

mpirun -n 16 singularity exec ../../of6/openfoam6.x.img simpleFoam -parallel -case /home/software/test_singularity/OpenFOAM/pruebaOF6_16cores_SLURM/pruebaOF6_16cores
----------------------------

The versions that I'm using are:
Host: 
OS: CentOS7.5
OpenMPI: 3.1.2
PMIx: 2.1.4

Container:
OS: Ubuntu16.04
OpenMPI: 3.1.2
PMIx: 2.1.4

Could it be a SLURM configuration problem? Is there some SLURM limitation that affects OpenFOAM?

Some info about slurm:
# srun --version
slurm 18.08.3
# srun --mpi=list
srun: MPI types are...
srun: pmi2
srun: openmpi
srun: none

I'm a little bit lost with this issue :(
Can someone shed some light on this?

Thanks a lot in advance,
Carmen

Shenglong Wang

unread,
Feb 5, 2019, 11:37:08 AM2/5/19
to singu...@lbl.gov, Shenglong Wang
Can you try to unset all SLURM environment variables?

for e in $(env | egrep ^SLURM_ | cut -d= -f1); do unset $e; done

or

unset SLURM_NODELIST

But you’ll have to generate the host file manually.
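Putting the two suggestions together, a minimal sketch might look like the following. The node-list value is a made-up example, and the hand-rolled range expansion only handles a single simple cn[A-B] range; on a real cluster you would use `scontrol show hostnames "$SLURM_NODELIST"` instead:

```shell
# Example value; in a real job this is set by Slurm.
export SLURM_NODELIST="cn[3045-3046]"

# Build a host file for mpirun before the variables disappear.
# This toy expansion handles only one simple cn[A-B] range;
# `scontrol show hostnames` is the robust tool on a real cluster.
prefix=${SLURM_NODELIST%%\[*}
range=${SLURM_NODELIST#*\[}
range=${range%\]}
first=${range%-*}
last=${range#*-}
: > hostfile
i=$first
while [ "$i" -le "$last" ]; do
    echo "${prefix}${i}" >> hostfile
    i=$((i + 1))
done
cat hostfile

# Now clear every SLURM_* variable so mpirun stops trying to talk to Slurm.
for e in $(env | grep -E '^SLURM_' | cut -d= -f1); do unset "$e"; done
env | grep -E '^SLURM_' || echo "no SLURM_ variables left"
```

With the host file in place, `mpirun -hostfile hostfile -n 16 ...` can place the ranks itself instead of deferring to Slurm.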

Best,
Shenglong


Shenglong Wang

unread,
Feb 5, 2019, 11:56:27 AM2/5/19
to singu...@lbl.gov, Shenglong Wang
It seems that

unset SLURM_JOBID

is enough to fool mpiexec.

Shenglong

ccvera

unread,
Feb 6, 2019, 4:43:42 AM2/6/19
to singularity, wangs...@gmail.com
Thanks a lot for your quick reply :)

This solution doesn't work for me. I tried unsetting the SLURM environment variables (first SLURM_JOBID, then SLURM_NODELIST, and finally all of them, as you suggested) and I get the same MPI error.

Carmen

Fatih Ertinaz

unread,
Feb 6, 2019, 9:33:36 AM2/6/19
to singu...@lbl.gov, wangs...@gmail.com
Hello Carmen

To me this looks like an OpenMPI & Slurm issue rather than an OF & Slurm problem. 

A few things you can check:
- Try to execute simple jobs using Slurm, e.g. printing hostnames or MPI ping-pong tests.
- Do you know how OpenMPI is installed on the host? Maybe it was built with underlying InfiniBand libraries that you don't have in your container.

If the first check works with hostnames, I'd focus on the OpenMPI installation.
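A staged sanity check along those lines could look like the batch script below. This is only a sketch: the node/task counts are arbitrary, and the image path is copied from the earlier posts in this thread.

```shell
#!/bin/bash
#SBATCH -N 2
#SBATCH -n 4
#SBATCH -J sanity_check
#SBATCH --output="sanity.%j.o"
#SBATCH --error="sanity.%j.e"

# Step 1: plain Slurm launch -- no MPI, no container.
srun hostname

# Step 2: the same launch, but inside the container.
srun singularity exec ../../of6/openfoam6.x.img hostname

# Step 3: host OpenMPI under Slurm, still no container.
mpirun -n "$SLURM_NTASKS" hostname
```

If step 1 or step 3 already fails, the container is not the culprit.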

ccvera

unread,
Feb 8, 2019, 3:09:26 AM2/8/19
to singularity, wangs...@gmail.com
Hi!

I didn't mention it in my first post (sorry), but in case it helps: the problem appears only when I execute OF in parallel (using the -parallel option, which is what I need).

Regarding the options you mentioned, Fatih:
- I don't have problems executing simple jobs (and even some more complicated ones), e.g. printing variables and node information (all without Singularity). I also run basic Singularity programs; normally I use them to train CNNs (no MPI needed) and everything works fine.
- I have replicated the host's OpenMPI and PMIx installation in the container, so the versions are the same and the libraries were copied.

In the logs (slurmd, slurmctld, and the node logs) I don't see anything that sheds light on this.
I think you're right that it could be an OpenMPI problem. I tried again to execute a "hello world" in Singularity, and when requesting several nodes I hit the same problems :(

Should Slurm have a special configuration to run MPI programs in parallel with Singularity, apart from OpenMPI and PMIx? And should I include any other configuration in my container?

Thanks for your help.

Carmen.

Samy

unread,
Feb 8, 2019, 2:50:42 PM2/8/19
to singularity, wangs...@gmail.com
Can you try running the same command on a single node and see what happens? E.g.: mpirun -n 1 singularity exec .....
Also, if you have access to interactive nodes, it would be a good test to run OF with mpirun interactively on 2 or more nodes. It sounds to me like an issue running your OF on multiple nodes, not a Slurm problem.

Good luck,

Fatih Ertinaz

unread,
Feb 8, 2019, 11:19:16 PM2/8/19
to singu...@lbl.gov
OK, this is really interesting to me. As a summary, if I'm not mistaken:

- You can run parallel OF on a single node using OMPI through Singularity, without Slurm.
- You can run basic parallel MPI tasks using Slurm, with and without Singularity.
- You cannot run multi-node basic parallel jobs using Singularity -- I don't know whether you used Slurm there, so maybe it fails with both.

Should Slurm have a special configuration to run mpi programs in parallel with singularity, apart from OpenMPI and PMIx?

I don't think Slurm is the problematic part, because of item 2 above. Slurm certainly needs information about the compute nodes and the user account, but if that were the issue you wouldn't be able to run any tasks even on a single node. That said, I've never configured Slurm from scratch, so I'm not a Slurm expert.

Also, I don't think this is an OF issue, because of item 1. However, you might want to make sure OF is built with the "SYSTEMOPENMPI" option. But again, even though you were restricted to one node, you managed to run it without Slurm, so OF should be fine imho.

If you cannot run multi-node jobs, I'd say that's a clear indication of a potential OMPI problem. Check how OMPI was installed, which fabrics are used, etc. Additionally, check whether the Slurm flag was explicitly defined, sth. like "--with-slurm=/opt/slurm".

Moreover, can you give some information about the cluster you're working on? Is it a typical cluster with many users running their simulations? If so, Slurm and OMPI should be quite reliable. If it's a cloud cluster you're experimenting with, I bet it's OMPI :)

Hope this helps 

ccvera

unread,
Feb 13, 2019, 3:16:36 AM2/13/19
to singularity
I've been doing more tests and nothing works; I'm getting a little desperate with this problem... :(

Thanks, Samy, for your answer, but my problem appears on a single node as well as on 4 nodes (64 cores). Running:

singularity exec -B /home ../../of6/openfoam6.MPIx.img mpiexec -n 16 simpleFoam -case /home/carmen/test_singularity/OpenFOAM/pruebaOF5_16cores

works correctly on 16 cores, but I understand that I'm then running the container's OMPI. I want to launch it on 64 cores for performance, and for that (as I understand it) I should launch it like this (as I did with SGE):

mpirun -n $SLURM_NTASKS singularity exec ../../of6/openfoam6.x.img simpleFoam -parallel -case /home/carmen/test_singularity/OpenFOAM/pruebaOF5_16cores

And it doesn't work.

Fatih :)

- True.
- Not quite: I can run basic programs in parallel using Slurm without Singularity, but not with it.
- True. It only fails with Slurm and Singularity together.

I've tried so many combinations and cases that I'm sure I'm forgetting some of them.

Background:
- I was working with SGE and everything worked perfectly (the cluster I work on recently migrated to Slurm; it's a supercomputing center, not a small private cluster).
- This same container worked with up to 64 cores without problems.

Some tests:
- Connecting directly to a node via ssh, OF works correctly with 16 cores (with 32 too, but with only 16 physical cores that means oversubscription).
- Doing salloc -N1 -n16 (or salloc -N2 -n32) and then running the program fails.
- Executing my script (sbatch myscript.sh) fails, both with 16 cores and with 32.
- A simple "hello world" with Slurm doesn't work either, e.g. running:
mpirun -n $SLURM_NTASKS singularity exec -B /home container.img /home/carmen/test_singularity/mpi_slurm/hello_world

I attach the recipe of the last container that I created.

Thanx,

Carmen
----
container_OF.txt

Justin Cook

unread,
Feb 18, 2019, 11:33:03 AM2/18/19
to singularity
Carmen,

It looks like this has some history now. Can we move this to a GitHub issue?

Can you show us the definition of your hello world container that is not working? Let's start there and see where we get.

Thanks,

Justin

ccvera

unread,
Mar 4, 2019, 4:50:18 AM3/4/19
to singularity
Hello again,

After a long time, I think one possible cause is that OpenFOAM works by default with SGE and not with Slurm. Today OF was installed natively on the cluster, so I don't need to use a container anymore.

Anyway, I will keep testing to see whether or not that is the real reason for the failure with Slurm.

Thank you very much for your help!! :)

Carmen

victor sv

unread,
Mar 28, 2019, 3:02:28 PM3/28/19
to singu...@lbl.gov
Hi,

I haven't read the whole thread carefully, but (to me) the issue is already clear in the first mail:

 Some info about slurm:
# srun --mpi=list
srun: MPI types are...
srun: pmi2
srun: openmpi
srun: none
 
You describe how you match the OpenMPI/PMI(x) versions in the container and on the host to make it work. You do this because you need some level of compatibility between these libraries in order to perform process communication inside and outside the container.

If you use Slurm (srun) as the process manager, you remove the OpenMPI layer outside the container. This should be OK, but you still have to match the PMI(x) inside and outside the container. If the Slurm you are using supported PMIx (it seems it does not), then things could work with the proper versions inside and outside.

The main issue (I think) is that the host Slurm is linked with PMI2. As far as I know, PMI2 is vendor dependent: there is a common API, but no ABI compatibility between different PMI vendors (Slurm, OpenMPI, MPICH, etc.). To be more explicit, the combination of PMI2 (Slurm) outside the container and PMI2 (OpenMPI) inside the container does not work for parallel multi-node jobs.

In my mind the alternatives are:
  - Install exactly the same PMI2 version (which implies the same Slurm version) inside the container.
  - If the OpenMPI inside the container is linked against PMI2, bind-mount (and replace, or at least prepend in the library path) the PMI2 from the host inside the container. This solution is dirty, but could work.
  - Continue using OpenMPI's mpirun as the process manager.
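For the second alternative, here is a rough sketch of how one might inspect and then bind-mount the host PMI library. Every path, directory name, and the case location below are guesses that must be adapted to the actual installation:

```shell
# Which PMI library is Slurm's srun linked against on the host?
ldd "$(command -v srun)" | grep -i pmi

# Which PMI library does the container's OpenMPI use?
singularity exec openfoam6.x.img sh -c 'ldd "$(command -v mpirun)" | grep -i pmi'

# Bind-mount the host PMI2 into the container and prepend it to the
# library search path (/usr/lib64/slurm is only an example location):
singularity exec -B /usr/lib64/slurm:/host-pmi openfoam6.x.img \
    env LD_LIBRARY_PATH=/host-pmi:"$LD_LIBRARY_PATH" \
    mpirun -n 16 simpleFoam -parallel -case /path/to/case
```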

Hope it helps!

Best regards,
Víctor