I'm 90% sure this is an MVAPICH2 issue and not a Slurm issue; I'm not sure why the scheduler would be affecting MPI. I've had a bit of trouble with MVAPICH2 + MPI spawning recently. Things that have come up include using mpirun_rsh, mpiexec.hydra, and setting environment variables to ensure spawning is allowed. I haven't solved the problem yet (on Stampede), so I don't know what's going on at the moment.
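If it helps, the rough shape of what I've been trying is below. MV2_SUPPORT_DPM is the switch the MVAPICH2 user guide mentions for dynamic process management, but the variable name, which launcher actually works, and the file names (hosts.txt, master.py) are assumptions to check against your own MVAPICH2 build and documentation:

# Sketch only: enable MVAPICH2 dynamic process management (needed for MPI_Comm_spawn),
# then launch a single master process that spawns its own workers.
export MV2_SUPPORT_DPM=1

# Option 1: MVAPICH2's own launcher; environment variables can be passed inline.
mpirun_rsh -np 1 -hostfile hosts.txt MV2_SUPPORT_DPM=1 python master.py

# Option 2: the Hydra launcher shipped with MVAPICH2.
mpiexec.hydra -f hosts.txt -n 1 python master.py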
On Tue, Dec 20, 2016 at 3:31 AM, Conn O'Rourke <conn.o...@gmail.com> wrote:
Hi guys,
I wonder if anyone has any experience using MPI Spawn with the Slurm scheduler?
I'm having difficulty getting spawned processes to run, and I expect it comes down to some flag that needs to be passed to the task manager in the submission script, but I can't see what should do the trick.
If anyone has a sample submission script for running a code that uses MPI Spawn, I'd be grateful if you could share it (and explain the flags) with me.
Thanks,
Conn
(P.S. using Anaconda3 2.50 and MVAPICH2)
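For context, a minimal sketch of the kind of spawn-based script being discussed might look like the following; spawn_test.py, the hostfile name, and maxprocs=2 are illustrative assumptions, and the batch scripts later in the thread show how the launch line fits into a submission script:

# Write a tiny parent/child test: the parent is started as a single MPI process
# and spawns two workers running this same file with the extra argument "child".
cat > spawn_test.py <<'EOF'
import sys
from mpi4py import MPI

if len(sys.argv) > 1 and sys.argv[1] == "child":
    # Spawned worker: meet the parent at the barrier, then detach.
    parent = MPI.Comm.Get_parent()
    parent.Barrier()
    parent.Disconnect()
else:
    # Parent: spawn two workers, wait for them, then detach.
    workers = MPI.COMM_SELF.Spawn(sys.executable,
                                  args=["spawn_test.py", "child"],
                                  maxprocs=2)
    workers.Barrier()
    workers.Disconnect()
    print("spawn OK")
EOF

# Build a hostfile from the allocation and launch only the parent;
# the workers are created by Spawn, not by the launcher.
srun hostname -s | sort -u > slurm.hosts
mpiexec.hydra -f slurm.hosts -n 1 python spawn_test.py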
srun hostname -s | sort -u > slurm.hosts
mpiexec.hydra -f slurm.hosts -np 1 python3 ./script_here.py
Unfortunately, when I run over multiple nodes the code runs a lot more slowly than on a single node. It really shouldn't, given the nature of the code (master-slave with a queue of tasks sent out). I'm not sure what's causing the problem yet, but hopefully I'll figure it out soon enough. I'll let you know if I do.
As always, if there are any bright ideas feel free to let me know!
mpiexec.hydra -n 1 python ./script_here.py
#!/bin/bash
#SBATCH --account=****
#SBATCH --nodes=2
#SBATCH --ntasks=48
#SBATCH --cpus-per-task=1
#SBATCH --output=gprmax_mpi_cpu_2nodes-out.%j
#SBATCH --error=gprmax_mpi_cpu_2nodes-err.%j
#SBATCH --time=00:05:00
#SBATCH --partition=devel

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
module --force purge
module use /usr/local/software/jureca/OtherStages
module load Stages/2018b
module load Intel IntelMPI

cd /p/project/****/****/gprMax
source activate gprMax
# Method required for MPI WITH Spawn
srun hostname -s | sort -u > slurm.hosts
mpiexec.hydra -f slurm.hosts -n 1 python -m gprMax user_models/cylinder_Bscan_2D.in -n 47 -mpi 48
This question is off-topic for this list; please post it over on the Slurm-users mailing list:
https://slurm.schedmd.com/mail.html
Prentice