salloc --cpus-per-task=8 --ntasks=2 /opt/openmpi-1.4/bin/mpirun -n 2 -report-bindings /test.mpi
However, when mpirun tries to allocate the resources, it can't, maybe because Slurm is already consuming them?
--------------------------------------------------------------------------
All nodes which are allocated for this job are already filled.
--------------------------------------------------------------------------
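(As a possible alternative, and just a sketch with placeholder task counts and paths, assuming the executable is linked against the Slurm-aware MPICH2 rather than openmpi, one could skip mpirun and let Slurm launch the tasks directly:

  salloc --ntasks=2 --cpus-per-task=8 srun ./test.mpi

srun inherits the allocation from salloc, so it doesn't compete with it for slots the way a nested mpirun does.)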
Building an MPI-based executable requires specifying the value mpi for the -x10rt option on the x10c++ command line. In addition, you must also link with the pmi library, which is part of the Slurm installation. This linkage ensures that MPICH2-based executables (MPICH2 is the default MPI distribution available on Three Musketeers) can be launched directly with Slurm. If you compile on athos with the default x10c++ (v2.0.6) compiler, this is taken care of automatically. In all other cases (when you use your own x10c++ compiler on athos), specify the -post option with the value "# # -lpmi #" on the x10c++ command line.
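For example, a compile line along the lines the documentation describes (just a sketch; the source file and output names are made up, and I'm assuming x10c++ accepts -o for the output name) would be:

  x10c++ -x10rt mpi -post "# # -lpmi #" MyProgram.x10 -o MyProgram.mpi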
=============================
I'm definitely linking to the pmi lib, and it seems like everyone else is too, based on their compile traces.
john
On Sun, Dec 12, 2010 at 6:02 PM, Vijay Saraswat <vi...@saraswat.org> wrote:
The MPICH2 libs are:
[jmg2016@athos ~]$ rpm -qal mpich2 | grep .so
/usr/lib64/mpich2/lib/libfmpich.so.1
/usr/lib64/mpich2/lib/libfmpich.so.1.2
/usr/lib64/mpich2/lib/libmpich.so.1
/usr/lib64/mpich2/lib/libmpich.so.1.2
/usr/lib64/mpich2/lib/libmpichcxx.so.1
/usr/lib64/mpich2/lib/libmpichcxx.so.1.2
/usr/lib64/mpich2/lib/libmpichf90.so.1
/usr/lib64/mpich2/lib/libmpichf90.so.1.2
john
I run with "srun -N2 -n16 ./BodySystem.mpi -a 16" and get 16 copies of my program.
I think the problem is that:
- /opt/x10-2.1.0 has the wrong mpich2 configuration, so it cannot compile working versions (ldd always has mpichcxx.so unlinked), and
- /opt/x10 uses openmpi, which doesn't work with slurm.
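(For reference, the ldd check I mean, using my BodySystem.mpi binary just as an example:

  ldd ./BodySystem.mpi | grep mpich

For binaries built with /opt/x10-2.1.0, the libmpichcxx entry comes back as "not found", which is what I mean by unlinked.)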
So basically the only solution was to build the x10 dist ourselves or
use the one that you have provided.
Thanks,
john
Chat, John, et al: I was able to run the FRASimpleDist program (Shreedar's example) and get the same output simply by changing my path to X10_PATH=~vj/x1021/trunk/x10.dist/bin/
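(Concretely, assuming a bash shell and that "path" here just means my PATH, what I did was roughly:

  export X10_PATH=~vj/x1021/trunk/x10.dist/bin/
  export PATH=$X10_PATH:$PATH
)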
Vijay: Do you know what difference between the latest release and the SVN head would have caused things to behave differently?
I ran your version of my program a couple of times. While it generally runs fine, every once in a while it blows up with the error below. Is this MPI-related, and can it be ignored, or is it an issue in the program? Code attached.