[slurm-users] --no-alloc breaks mpi?

O'Grady, Paul Christopher

Mar 8, 2021, 3:57:08 PM
to slurm...@lists.schedmd.com
Hi,

I’m having an issue with srun's --no-alloc flag and MPI, which I can reproduce with a fairly simple example. When I run a simple one-core MPI test program as “slurmUser” (the account that has the --no-alloc privilege), it succeeds:

srun -p psfehq -n 1 -o logs/test.log -w psana1507 python ~/ipsana/mpi_simpletest.py

However, when I add the --no-alloc flag, it fails in a way that appears to break MPI (see logfile output and other Slurm/MPI info below). It fails similarly on 2 cores.

srun --no-alloc -p psfehq -n 1 -o logs/test.log -w psana1507 python ~/ipsana/mpi_simpletest.py
srun: do not allocate resources
srun: error: psana1507: task 0: Exited with exit code 1

Would anyone have any suggestions for how I could make the “--no-alloc” flag work with MPI? Thanks!

chris

------------------------------------------------------------------------------------------------------

Logfile error with --no-alloc flag:

(ana-4.0.12) psanagpu105:batchtest_slurm$ more logs/test.log
--------------------------------------------------------------------------
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM support. This usually happens
when OMPI was not configured --with-slurm and we weren't able
to discover a SLURM installation in the usual places.

Please configure as appropriate and try again.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[psana1507:13884] Local abort before MPI_INIT completed completed successfully, 
but am not able to aggregate error messages, and not able to guarantee that all 
other processes were killed!
(ana-4.0.12) psanagpu105:batchtest_slurm$ 

System information:

(ana-4.0.12) psanagpu105:batchtest_slurm$ conda list | grep mpi
mpi                       1.0                     openmpi    conda-forge
mpi4py                    3.0.3            py27h9ab638b_1    conda-forge
openmpi                   4.1.0                h9b22176_1    conda-forge


(ana-4.0.12) psanagpu105:batchtest_slurm$ srun --mpi=list
srun: MPI types are...
srun: cray_shasta
srun: none
srun: pmi2
srun: pmix
srun: pmix_v3
(ana-4.0.12) psanagpu105:batchtest_slurm$ srun --version
slurm 20.11.3
(ana-4.0.12) psanagpu105:batchtest_slurm$ 



Pritchard Jr., Howard

Mar 8, 2021, 4:35:24 PM
to Slurm User Community List

Hi Chris,

 

What’s happening, I speculate (since I don’t have the permissions to use --no-alloc myself), is that SLURM_JOBID is not set but SLURM_NODELIST may be set, and that combination is confusing ORTE.

Could you list which SLURM environment variables are set in the shell in which you’re running the srun command?
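
For example, something along these lines should capture them (any equivalent way of dumping the environment is fine):

env | grep '^SLURM'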

 

Howard

O'Grady, Paul Christopher

Mar 8, 2021, 9:37:16 PM
to slurm...@lists.schedmd.com


On Mar 8, 2021, at 1:35 PM, slurm-use...@lists.schedmd.com wrote:

What’s happening, I speculate (since I don’t have the permissions to use --no-alloc myself), is that SLURM_JOBID is not set but SLURM_NODELIST may be set, and that combination is confusing ORTE.

Could you list which SLURM environment variables are set in the shell in which you’re running the srun command?

Howard,

I believe you are correct. Once I set SLURM_JOBID, ORTE starts functioning again with the --no-alloc option. Since you asked (and for completeness), I include below the list of environment variables that differed with and without --no-alloc, but my tests show that the job id seems to be the magic one, as you predicted.

I guess I will manufacture an artificial job id for our “--no-alloc” runs, but if anyone is aware of any dangers lurking in that approach, I would be interested to hear about them.
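
Concretely, what I have in mind is roughly the following (only lightly tested; the job id value is just a placeholder, and I haven’t verified whether srun itself needs the variable or only the launched tasks do):

export SLURM_JOBID=12345            # arbitrary placeholder id
export SLURM_JOB_ID=$SLURM_JOBID    # keep both spellings consistent
srun --no-alloc -p psfehq -n 1 -o logs/test.log -w psana1507 python ~/ipsana/mpi_simpletest.py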

Thanks for the guidance ... impressive that you could identify the issue so quickly!

chris

----------------------------------------------------------

SLURM_JOB_CPUS_PER_NODE=1
SLURM_JOB_ID=25300
SLURM_JOBID=25300
SLURM_JOB_NUM_NODES=1
SLURM_JOB_PARTITION=psfehq
SLURM_JOB_QOS=normal
SLURM_CPUS_ON_NODE=1
