Hi Matthias,
It looks like the Open MPI in the containers was not built with PMI1 or PMI2 support, so it's defaulting to using PMIx.
You are seeing this error message because the call to PMIx_Init within Open MPI 4.1.x's runtime system returned an error, namely that there was no PMIx server to connect to.
Not sure why the behavior would have changed between your SLURM variants.
If you run
srun --mpi=list
does it show a pmix option?
If not, you need to rebuild SLURM with the --with-pmix configure option. You may want to check which PMIx library is installed in the containers and, if possible, use that version of PMIx when rebuilding SLURM.
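In case it helps, here is a minimal sketch of that check as a shell snippet. The function name has_pmix_plugin is just something I made up for illustration, and the 2>&1 is only a precaution in case your srun prints the plugin list on stderr rather than stdout.

```shell
# Hypothetical helper: succeeds (exit 0) if the plugin list read from stdin
# contains a pmix entry, either plain "pmix" or a versioned one like "pmix_v4".
has_pmix_plugin() {
    grep -Eq '(^|[[:space:],:])pmix(_v[0-9]+)?([[:space:],]|$)'
}

# Only query SLURM when srun is actually on the PATH.
if command -v srun >/dev/null 2>&1; then
    if srun --mpi=list 2>&1 | has_pmix_plugin; then
        echo "pmix plugin available"
    else
        echo "no pmix plugin found; SLURM likely needs a rebuild with --with-pmix"
    fi
fi
```

Run it on a login node; it only inspects the list and does not launch a job.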
Howard
--
slurm-users mailing list -- slurm...@lists.schedmd.com
Hi Matthias,
If in fact you do need to build PMIx support into SLURM, remember to either use the --mpi=pmix option on the srun command line or set the SLURM_MPI_TYPE environment variable to pmix.
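As a rough sketch, the two ways of selecting the plugin look like this. The application name ./my_mpi_app and the task count are placeholders I picked for illustration, so the srun lines are left commented out.

```shell
# Option 1: select the PMIx plugin for a single launch.
# (./my_mpi_app and -n 4 are placeholders, not from your setup.)
# srun --mpi=pmix -n 4 ./my_mpi_app

# Option 2: set the default for the current shell or job script, so that
# later srun calls pick up pmix without the --mpi flag.
export SLURM_MPI_TYPE=pmix
# srun -n 4 ./my_mpi_app
echo "SLURM_MPI_TYPE is set to: $SLURM_MPI_TYPE"
```

The environment variable is handy in batch scripts where you do not want to touch every srun line.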
You can actually build multiple variants of the pmix plugin, each using a different version of PMIx, in case you need that.
Our admin has set this up for our SLURM 24.05.2:
hpp@foobar:~>srun --mpi=list
MPI plugin types are...
none
cray_shasta
pmi2
pmix
specific pmix plugin versions available: pmix_v4,pmix_v5
This is getting into details I'm not familiar with, though.
Howard