Running OpenFOAM in parallel


Shervin Sammak

Apr 7, 2018, 11:56:29 PM
to singularity
Hi all,

I created an Ubuntu image with an OpenFOAM installation. Inside the container (via the run command), I can run OpenFOAM in parallel. From outside the container (via the exec command), running in parallel is not possible, although I can still run OpenFOAM in serial. I tried
$ mpirun -n 4 singularity exec of4.img simpleFoam -parallel
which errors out
/.singularity.d/actions/exec: 146: export: -parallel: bad variable name
/.singularity.d/actions/exec: 146: export: -parallel: bad variable name
/.singularity.d/actions/exec: 146: export: -parallel: bad variable name
/.singularity.d/actions/exec: 146: export: -parallel: bad variable name
and 
$ singularity exec of4.img mpirun -n 4 simpleFoam -parallel
which results in 
/.singularity.d/actions/exec: 146: export: -n: bad variable name

Although I put "echo '. /opt/openfoam4/etc/bashrc' >> $SINGULARITY_ENVIRONMENT" in my build recipe, this looks like an environment variable issue. Any help on this would be appreciated.

Jason Stover

Apr 8, 2018, 1:20:07 AM
to singu...@lbl.gov
Hi Shervin,

Try this in your def file: change the /bin/sh symlink to point at bash instead of dash.

So in %post have:

/bin/rm /bin/sh
/bin/ln -s /bin/bash /bin/sh

I'm betting dash doesn't have the '-n' option to export, which bash has. The 'exec' action script uses /bin/sh as its shell, so everything it sources needs to be POSIX. The OpenFOAM bashrc most definitely has bashisms in it.

-J
> --
> You received this message because you are subscribed to the Google Groups
> "singularity" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to singularity...@lbl.gov.

Shervin Sammak

Apr 8, 2018, 7:58:22 PM
to singu...@lbl.gov
Hi Jason,

Appreciate your help; that solved part of the problem. On the system (Ubuntu 14.04) where I built the container, I can run OpenFOAM in parallel via "mpirun -n 4 singularity exec of4.img simpleFoam -parallel". However, on another machine (RHEL 7) this gives me an error, because of an MPI incompatibility between the two systems. This actually confuses me: if I need to install the same Open MPI version on the RHEL 7 machine to run OpenFOAM in parallel (I did, and it works), what is the benefit of putting the software in a container in the first place?!

-----------------------------------------------------------
Shervin Sammak
Research Assistant Professor
Center for Research Computing
University of Pittsburgh
4420 Bayard St
Pittsburgh, PA 15213
E-mail: shervin...@pitt.edu
Website: www.crc.pitt.edu

~ You chase quality and quantity will chase you.




Alan Sill

Apr 8, 2018, 8:11:07 PM
to singu...@lbl.gov
Containers work by unsharing namespaces and by managing memory. They cannot alter the kernel itself, so software like MPI that depends on direct calls into kernel functionality must be implemented compatibly on both sides. That’s the price of (and much of the reason for) the improved speed and efficiency of containers compared to full virtualization.




Jason Stover

Apr 9, 2018, 1:32:08 AM
to singu...@lbl.gov
Hi Shervin,

Beyond what Alan has said: yes, the MPI needs to be "the same" on the
host as in the container. You're starting MPI on the host, but your
mpirun command launches singularity, which then uses that
host-initiated MPI session to do the MPI work of the program started
inside the container. For Open MPI, if the versions are different but
ABI compatible, it _should_ still work... though the possibility
remains that the two versions won't talk to each other correctly.

> what is the benefit of putting the software in a container in the first place

Say you have a program that requires an old library version that no
longer exists in a recent distribution. You could build a container
based on the old distribution and still run it on new systems, since,
as far as the program being executed can tell, it is running on the
older distribution. For serial/threaded programs, you shouldn't see an
issue no matter where you run them. You get more restrictions once you
need to talk between multiple hosts. If you're running an MPI job on a
single node, you could try starting MPI from within the container. So,
in your example, since you're running with '-n 4' (four processes),
you could try:

singularity exec of4.img mpiexec -n 4 simpleFoam -parallel

As long as the MPI installed in the container works on the host, that
should run and be portable. You can test by running hostname, or
similar, to see whether it would work. But once you go multi-node,
that _will not work_, since you will be *outside* the container when
mpiexec goes to start the binary on another host.

So in that case we need to start singularity through MPI itself, as
you have done, and that gives us the restriction that the host MPI and
the container MPI must be compatible. Saying "the same version" is an
easy way to cut off any possible issues, and it is usually pretty easy
to accomplish.
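To make the two modes concrete, the launch patterns look roughly like this (a sketch; the hostfile name and rank counts are only illustrative, and both commands assume Singularity plus a working MPI install):

```shell
# Single node: MPI started inside the container; a quick sanity check
# is to have every rank print its hostname first
singularity exec of4.img mpiexec -n 4 hostname

# Multiple nodes: MPI started on the host, one container instance per
# rank, so the host MPI and the container MPI must be compatible
mpirun -n 8 -hostfile hosts singularity exec of4.img simpleFoam -parallel
```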

Does that make sense?

-J

Shervin Sammak

Apr 9, 2018, 6:49:58 AM
to singu...@lbl.gov
Hi Jason,

It makes sense. Since I needed cross-node execution, I installed the same Open MPI on the host, and it is working very well. Thanks for your help on this.

Shervin

Priedhorsky, Reid

Apr 12, 2018, 5:13:44 PM
to 'Oliver Freyermuth' via singularity

> the MPI needs to be "the same" on the host as well as in the container

This is only true if you need the host MPI to do something. In this case, you’re using mpirun to start your MPI ranks. But, there are other ways to start your MPI ranks.

For example, if you have Slurm installed, you should be able to:

$ srun -n4 singularity exec ...

In this case you don’t need OpenMPI on the host at all.

What you DO need is for the MPI ranks to be able to find one another. If you start with mpirun, then mpirun starts daemons called orted (one per node, IIRC), and the ranks talking to those daemons is where the version dependency comes in.

Slurm does it by providing something called PMI to the ranks. They use this to find one another, no host MPI needed.
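Under Slurm, that might look like the following (a sketch; the exact --mpi plugin name, e.g. pmi2 or pmix, depends on how your Slurm and the container's Open MPI were built):

```shell
# No host MPI involved: Slurm hands each rank a PMI endpoint, and the
# Open MPI inside the container uses it to wire the ranks together
srun --mpi=pmi2 -n 4 singularity exec of4.img simpleFoam -parallel
```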

(This is how OpenMPI works; other MPI implementations may differ.)

HTH,
Reid

Gregory M. Kurtzer

Apr 12, 2018, 5:27:33 PM
to singularity
This is a fantastic point, thanks Reid!

Do you know if this requires PMIx in both the RM and the MPI, or does it work with previous PMI (without the 'x') support as well?

Greg




--
Gregory M. Kurtzer
CEO, Sylabs Inc.

victor sv

Apr 13, 2018, 7:12:21 AM
to singu...@lbl.gov
Hi,

some quick comments on this.

In my experience, compatibility between PMI on the host and in the container requires exactly the same PMI version. The issue here is ABI compatibility.

This is something the PMIx team is working on. Starting with PMIx 2.1.1, it seems there is some backward compatibility (in particular with the v2.0.3 and v1.2.5 PMIx releases), and they say they will provide forward compatibility as well. (I still have to verify this; no time so far.)

On the Open MPI side, they have also done some work in this direction to support PMIx cross-version compatibility. I think on the host side you should start with OpenMPI 3.x to get this working.

You can find more info in this thread: https://github.com/pmix/pmix/issues/556

Has anyone checked this? I will report back if I find the time to do it.

Hope it helps,
Víctor

