Problems using MPI: program executed n times on n processes, each doing the same thing


Bastien Lauras

Apr 1, 2016, 4:34:57 PM
to deal.II User Group
Hi!

I had quite a few problems running step-17 and step-18 of the tutorial. I think I have installed everything needed (BLAS, LAPACK, MPI, PETSc, METIS). However, when I run the "make test" command after "cmake .." and "make install" in the deal.II build directory, I get the following error while running test 4 (mpi.debug):

[100%] Built target mpi.debug
mpi.debug: RUN failed. Output:
 Hi from 0/1
 Hi from 0/1
ERROR: process does not see nproc=2!
ERROR: process does not see nproc=2!
--------------------------------------------------------------------------
mpiexec noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------


mpi.debug: ******    RUN failed    *******

===============================    OUTPUT END   ===============================
Expected stage PASSED - aborting
CMake Error at /home/bastien/dealii-8.4.0/cmake/scripts/run_test.cmake:140 (MESSAGE):
  *** abort

Moreover, when I try to run step-17, the "cmake ." command runs without problems, as does "make", but when I type "mpirun -np 4 ./step-17", the code seems to be executed four times on four separate processes. I mean, every process works on every degree of freedom, and so on. In fact, I get the following output:

mpirun -np 4 ./step-17
Cycle 0:
Cycle 0:
   Number of active cells:       64
   Number of active cells:       64
Cycle 0:
   Number of degrees of freedom: 162 (by partition:   Number of degrees of freedom: 162 (by partition: 162)
 162)
   Number of active cells:       64
Cycle 0:
   Number of degrees of freedom: 162 (by partition: 162)
   Number of active cells:       64
   Number of degrees of freedom: 162 (by partition: 162)
   Solver converged in 9 iterations.
   Solver converged in 9 iterations.
Cycle 1:
Cycle 1:
   Solver converged in 9 iterations.
   Number of active cells:       124
 [.......]

This is not the output I should get: it seems that every process executes the whole code, n_mpi_processes is always 1, and this_mpi_process is always 0.
What can be the problem? Are both issues related?

Many thanks for your help.

Bastien

Bastien Lauras

Apr 1, 2016, 5:40:36 PM
to deal.II User Group
Moreover, if I display the number of threads I seem to have, by adding this line to my run method:

  void TopLevel<dim>::run ()
  {
    pcout << "Number of threads : " <<  MultithreadInfo::n_threads() <<  std::endl;  

I get the following answer (repeated as many times as the number of processes I wanted to run the program on):

Number of threads : 1 

So why is this value wrong? The same thing happens if I display the value of n_mpi_processes.

Many thanks in advance for your answer!

Timo Heister

Apr 2, 2016, 6:25:42 AM
to dea...@googlegroups.com
Do you have more than one MPI library installed? Can you post the
output of "which mpirun" and "which mpic++" (or whatever your compiler
is called)? Also give us the output of "grep MPI detailed.log" in your
deal.II build directory.
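The checks Timo suggests can be sketched as a small shell script (a sketch, not deal.II's own tooling; it is guarded with command -v so it also runs on machines where some of these tools are missing):

```shell
#!/bin/sh
# Show where each MPI tool on PATH comes from. A launcher from one
# implementation paired with compiler wrappers from another typically
# makes every process run as an independent one-rank job ("Hi from 0/1").
for tool in mpirun mpic++ mpicxx mpicc; do
  if command -v "$tool" >/dev/null 2>&1; then
    printf '%-7s -> %s\n' "$tool" "$(command -v "$tool")"
  else
    printf '%-7s -> not found\n' "$tool"
  fi
done
# The mpirun banner names the implementation (Open MPI vs. MPICH/Hydra):
command -v mpirun >/dev/null 2>&1 && mpirun --version 2>&1 | head -n 1 || true
```

Compare these paths with the MPI_CXX_COMPILER and MPI_LIBRARIES lines in detailed.log; they should all point into the same installation.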



--
Timo Heister
http://www.math.clemson.edu/~heister/

Bastien Lauras

Apr 4, 2016, 10:51:19 AM
to deal.II User Group
Hi, thanks for answering!

Here are the outputs:

bastien@PC-Bastien:~$ which mpirun
/usr/bin/mpirun
bastien@PC-Bastien:~$ which mpic++
/usr/bin/mpic++

bastien@PC-Bastien:~/build$ grep MPI detailed.log
#        CMAKE_CXX_COMPILER:     GNU 4.8.4 on platform Linux x86_64
#        CMAKE_C_COMPILER:       /usr/bin/cc
#        CMAKE_Fortran_COMPILER: /usr/bin/gfortran
#        DEAL_II_WITH_MPI set up with external dependencies
#            MPI_VERSION = 3.0
#            MPI_C_COMPILER = /usr/bin/mpicc
#            MPI_CXX_COMPILER = /usr/bin/mpicxx
#            MPI_Fortran_COMPILER = /usr/bin/mpif90
#            MPI_CXX_FLAGS = -fstack-protector
#            MPI_LINKER_FLAGS = -Wl,-Bsymbolic-functions  -Wl,-z,relro
#            MPI_INCLUDE_DIRS = /usr/include/mpich
#            MPI_USER_INCLUDE_DIRS = /usr/include/mpich
#            MPI_LIBRARIES = /usr/lib/x86_64-linux-gnu/libmpichcxx.so;/usr/lib/x86_64-linux-gnu/libmpichf90.so;/usr/lib/x86_64-linux-gnu/libmpich.so;/usr/lib/x86_64-linux-gnu/libopa.so;/usr/lib/x86_64-linux-gnu/libmpl.so;rt;/usr/lib/libcr.so;pthread

I probably have more than one MPI library installed; I had quite a hard time installing it.
Should I uninstall everything (including deal.II) and install it all again?

Many thanks,
Bastien

Bastien Lauras

Apr 4, 2016, 11:23:12 AM
to deal.II User Group

And here is what I have in my bin folder (screenshot not preserved):


Bruno Turcksin

Apr 4, 2016, 11:31:16 AM
to deal.II User Group
Bastien,

Can you try mpirun --version? It should say MPICH. However, I strongly advise you to have either Open MPI or MPICH installed, but not both. Having both installed will lead to hard-to-understand bugs.
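On Debian/Ubuntu-style systems, one way to see whether both implementations are installed is a sketch like the following (package names are assumptions; adapt to your distribution):

```shell
#!/bin/sh
# List any installed Open MPI / MPICH packages so one of them can be purged.
if command -v dpkg >/dev/null 2>&1; then
  dpkg -l 2>/dev/null | grep -Ei 'openmpi|mpich' || echo "no openmpi/mpich packages found"
else
  echo "dpkg not available; check with your system's package manager"
fi
# To then keep only MPICH, something along these lines (a sketch, run with care):
#   sudo apt-get remove --purge 'libopenmpi*' 'openmpi-*'
#   sudo apt-get install mpich libmpich-dev
```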

Best,

Bruno

Bastien Lauras

Apr 4, 2016, 12:19:34 PM
to deal.II User Group
Bruno,

The first time I ran mpirun --version, it said :
mpirun (Open MPI) 1.6.5


So I uninstalled openmpi and mpich, and then re-installed mpich. Now, mpirun --version gives me :

HYDRA build details:
    Version:                                 3.0.4
    Release Date:                            Wed Apr 24 10:08:10 CDT 2013
    CC:                              cc -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wl,-Bsymbolic-functions -Wl,-z,relro 
    CXX:                             c++ -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wl,-Bsymbolic-functions -Wl,-z,relro 
    F77:                             gfortran -g -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro 
    F90:                             gfortran  -Wl,-Bsymbolic-functions -Wl,-z,relro 
    Configure options:                       '--disable-option-checking' '--prefix=/usr' '--build=x86_64-linux-gnu' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--disable-dependency-tracking' '--enable-shared' '--enable-fc' '--disable-rpath' '--disable-wrapper-rpath' '--sysconfdir=/etc/mpich' '--libdir=/usr/lib/x86_64-linux-gnu' '--includedir=/usr/include/mpich' '--docdir=/usr/share/doc/mpich' '--with-hwloc-prefix=system' '--enable-checkpointing' '--with-hydra-ckpointlib=blcr' 'build_alias=x86_64-linux-gnu' 'MPICHLIB_CFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security' 'MPICHLIB_CXXFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security' 'MPICHLIB_FFLAGS=-g -O2' 'MPICHLIB_FCFLAGS=-g -O2' 'CFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -O2' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro ' 'CPPFLAGS=-D_FORTIFY_SOURCE=2 -I/build/buildd/mpich-3.0.4/src/mpl/include -I/build/buildd/mpich-3.0.4/src/mpl/include -I/build/buildd/mpich-3.0.4/src/openpa/src -I/build/buildd/mpich-3.0.4/src/openpa/src -I/build/buildd/mpich-3.0.4/src/mpi/romio/include' 'CXXFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -O2' 'F77=gfortran' 'FFLAGS=-g -O2 -g -O2 -O2' 'FC=gfortran' '--cache-file=/dev/null' '--srcdir=.' 'CC=cc' 'LIBS=-lrt -lcr -lpthread '
    Process Manager:                         pmi
    Launchers available:                     ssh rsh fork slurm ll lsf sge manual persist
    Topology libraries available:            hwloc
    Resource management kernels available:   user slurm ll lsf sge pbs cobalt
    Checkpointing libraries available:       blcr
    Demux engines available:                 poll select
 
I tried to run the step-17 tutorial and got the same output as before (e.g. "Number of degrees of freedom: 570 (by partition: 570)").
Should I recompile deal.II now?

Thanks for your help.

Bastien

Bruno Turcksin

Apr 4, 2016, 12:31:23 PM
to dea...@googlegroups.com
Bastien,

2016-04-04 12:19 GMT-04:00 Bastien Lauras <laur...@umn.edu>:
 
> I tried to run the step-17 tutorial, I had the same output than before (like : Number of degrees of freedom: 570 (by partition: 570) )
> Should I recompile Deal.II now?

Unfortunately yes. You will need to recompile every library that uses MPI (PETSc, p4est, etc.) to make sure they all use the same version of MPI. For deal.II, you will also need to remove the build directory and create a new one.
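The rebuild could be sketched roughly as follows (the source path and make flags are assumptions; the script only prints each step unless APPLY=1 is set, since the first step deletes the old build tree):

```shell
#!/bin/sh
# Rebuild deal.II from a clean build directory so the newly chosen MPI
# implementation is picked up. Prints each step; set APPLY=1 to execute.
DEAL_SRC=${DEAL_SRC:-$HOME/dealii-8.4.0}
run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}
run rm -rf "$DEAL_SRC/build"
run mkdir -p "$DEAL_SRC/build"
run sh -c "cd '$DEAL_SRC/build' && cmake .. && make -j4 install && make test"
```

PETSc, p4est, and any other MPI-dependent library would need the analogous clean reconfigure first, each pointing at the same mpicc/mpicxx.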

Best,

Bruno

Bastien Lauras

Apr 4, 2016, 12:38:49 PM
to deal.II User Group
Bruno,

Sure, I'll do it, even if it takes quite a while! Thank you for the help.
I'll keep in touch,

Bastien

Bastien Lauras

Apr 4, 2016, 2:50:00 PM
to deal.II User Group
Hi Bruno,

It worked, everything's now running on 8 processes!
Thank you so much for the help.
By the way, the fourth test run during the installation of deal.II passed this time.

Have a good afternoon,

Bastien

Bastien Lauras

Apr 4, 2016, 3:35:45 PM
to deal.II User Group
Hi,

I've found the root of the problem.
After doing everything you told me, ParaView wasn't working anymore on my computer.
So I uninstalled and reinstalled it, and then I had the same errors in step-17 as at the beginning!
I uninstalled ParaView again, and everything was working well once more.

Thus, to install deal.II with MPI alongside ParaView, one needs to remove the add-ons (screenshot of the add-on list not preserved):

I don't know which add-on wasn't working with deal.II, but it's not only the first one (I tried with the second and third ones checked, and it didn't work).


Thanks again for the help,


Bastien

Bastien Lauras

Apr 5, 2016, 12:18:43 AM
to deal.II User Group
In fact, I can't even install ParaView this way. I tried to uninstall everything, then install ParaView, PETSc with Open MPI (--download-openmpi=1) rather than MPICH, and then deal.II, but MPI no longer worked with deal.II.
So I uninstalled everything (once again), reinstalled PETSc against the MPICH already installed on my computer (from the software manager), and reinstalled deal.II. Everything now works, but if I want to use MPI I need to uninstall ParaView.
I think I should build ParaView manually, forcing it to use MPICH rather than Open MPI, but that's quite hard.

Wolfgang Bangerth

Apr 5, 2016, 7:36:39 AM
to dea...@googlegroups.com
On 04/04/2016 11:18 PM, Bastien Lauras wrote:
> In fact, I can't even install ParaView this way. I tried to uninstall
> everything, install ParaView, Petsc with openmpi (--download-openmpi=1) and
> not mpich, and then Deal.II, but mpi was not working anymore on Deal.II.
> So I uninstalled everything (once again), reinstalled petsc with mpich
> installed before on my computer (downloaded in the software manager), and
> reinstalled Deal.II. Everything is now working, but if I want to use MPI I
> need to uninstall ParaView.

Yes, using different MPI versions on the same system is asking for trouble.


> I think I should install ParaView manually forcing it to take MPICH and not
> openmpi but it's quite hard.

Just download the binary versions of ParaView from
http://www.paraview.org/
It will probably not be linked against MPI, but you would only need this if
you want to visualize in parallel, and that is only necessary for truly large
simulations.

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@math.tamu.edu
www: http://www.math.tamu.edu/~bangerth/

Bastien Lauras

Apr 5, 2016, 9:26:39 PM
to deal.II User Group
Hi,
Sure, I've installed ParaView this way (it's a bit harder than with the Software Manager!), and everything's now running well.
Thanks again for your help.

Bastien

Bastien Lauras

Apr 11, 2016, 11:16:31 AM
to Wolfgang Bangerth, dea...@googlegroups.com
Hi,

I think you are aware of it, but the deal.II website is no longer reachable because of an invalid security certificate:

dealii.org uses an invalid security certificate.
The certificate expired on 04/08/2016 07:54 AM.
The current time is 04/11/2016 10:13 AM.
Error code: SEC_ERROR_EXPIRED_CERTIFICATE

Hope you'll find a way to solve it!
Have a good day,

Bastien

Bruno Turcksin

Apr 11, 2016, 12:26:48 PM
to dea...@googlegroups.com
Bastien,

2016-04-11 11:16 GMT-04:00 Bastien Lauras <laur...@umn.edu>:
> I think you are aware of it, but the Deal.II website is now longer
> reachable, because of an invalid security certificate:
We know about it, thanks. What works for me is to go to http://dealii.org instead of https://dealii.org using a private window.

Best,

Bruno

Aquaman

Sep 23, 2018, 7:23:38 PM
to deal.II User Group
Bastien,

How can you uninstall MPI in a safe way? If I uninstall MPI, some related (important) packages will be broken.

Best,
Yaakov   