I have an image that I've been using successfully on a few different hosts. However, there is one host where I get MPI errors from within Singularity.
Note that other MPI applications work fine on this host outside of
Singularity, so it seems MPI is correctly configured on the host
itself.
I can run "mpirun -n 2 whoami" from inside the Singularity image, and it works as expected.
However, when I run any other application that uses MPI without mpirun (for example gmsh and fenics), I consistently get the following group of messages:
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:
Local host: myhost
Device name: i40iw0
Device vendor ID: 0x8086
Device vendor part ID: 14290
Default device parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.
NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
No OpenFabrics connection schemes reported that they were able to be
used on a specific port. As such, the openib BTL (OpenFabrics
support) will be disabled for this port.
Local host: myhost
Local device: i40iw0
Local port: 1
CPCs attempted: udcm
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A process has executed an operation involving a call to the
"fork()" system call to create a child process. Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your job may hang, crash, or produce silent
data corruption. The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.
The process that invoked fork was:
Local host: [[12513,1],0] (PID 203581)
If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
[myhost:203617] 1 more process has sent help message help-mpi-btl-openib.txt / no device params found
[myhost:203617] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[myhost:203617] 1 more process has sent help message help-mpi-btl-openib-cpc-base.txt / no cpcs for port
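For reference, the two MCA parameters named in the messages above can be set as environment variables. This is only a minimal sketch under two assumptions: that Open MPI picks up MCA parameters from variables named OMPI_MCA_<parameter>, and that the host environment is passed through into the container (which may not hold if the image is started with a cleaned environment). It would only silence the two warnings; it does nothing about the openib BTL being disabled for the port.

```shell
# Assumption: Open MPI reads MCA parameters from OMPI_MCA_<name> variables.
# Both parameter names are taken verbatim from the warning text above.
export OMPI_MCA_btl_openib_warn_no_device_params_found=0
export OMPI_MCA_mpi_warn_on_fork=0

# List what is set, so it can be compared with what the container sees.
env | grep '^OMPI_MCA_' | sort
```

Running the same env | grep line from a shell inside the image would show whether the variables actually reach the application.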
Here is one additional piece of information, which may or may not be relevant: to use MPI outside of Singularity on this host, an environment module must be loaded. Of course, from inside the image I can't load that external module. But the "mpirun -n 2 whoami" test above works anyway, so maybe the external module doesn't matter.
Any suggestions?
Thanks,
Tom Pace