Dear all,
I'm trying to solve the thermal diffusion and Stokes flow problems simultaneously, similar to step-32. After assembling the matrices and right-hand sides, I open two threads to solve the two linear systems (the thermal diffusion system with a CG solver, the Stokes system with a GMRES solver). I ran into a problem when using parallel computing that combines distributed and shared memory.
The code looks like the following:
assemble_thermal_diffusion_system ();
assemble_stokes_flow_system ();

// Solve the thermal diffusion system on a background task while the
// Stokes system is solved on the current thread.
Threads::Task<void> task
  = Threads::new_task (&Problem<dim>::solve_thermal_diffusion, *this);
solve_stokes_flow ();
task.join ();
The program runs successfully with a single process and shows good speedup from the thread parallelism. Unfortunately, when I run it with several processes I get the following error. I have googled this error and cannot find any solution. Could someone help me?
Thanks in advance!
[warn] opal_libevent2021_event_base_loop: reentrant invocation. Only one event_base_loop can run on each event_base at once.
(the same warning is repeated many times)
Thank you very much.
Best,
Jack
I appreciate your responses very much.
The matrices and vectors are TrilinosWrappers objects, so the solvers use MPI as well. I am using OpenMPI 1.8.1; its release date is April 22, 2014, later than the post on GitHub. I initialize MPI as follows:
try
  {
    using namespace dealii;
    using namespace problem;

    // The third argument (numbers::invalid_unsigned_int) lets deal.II
    // pick the number of threads per MPI process automatically.
    Utilities::MPI::MPI_InitFinalize mpi_initialization (argc, argv,
                                                         numbers::invalid_unsigned_int);

    Problem<2> app;
    app.infofile = argv[1];
    app.run ();
  }
So it seems this error occurs because I call two functions that make MPI calls concurrently while solving the thermal diffusion and Stokes flow equations.
According to this post, https://github.com/open-mpi/ompi/issues/1081:

"It turns out that if one tries to use Open MPI with MPI_THREAD_MULTIPLE (at least master) and actually has multiple threads making concurrent MPI calls, one gets a flood of messages of the following ilk:

opal_libevent2022_event_base_loop: reentrant invocation. Only one event_base_loop can run on each event_base at once.

showing up in stderr. This needs to be fixed before release of v2.x"
Does this mean that I only need to update my OpenMPI to the latest version to avoid this problem, without changing my own code for initializing MPI and TBB?
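As a quick check of whether the installed MPI library supports concurrent calls from several threads at all, one could query the provided thread level at run time. Below is a minimal sketch using only standard MPI calls; the function name report_mpi_thread_level is just a placeholder of mine:

#include <mpi.h>
#include <iostream>

// Print the thread support level the MPI library actually provides.
// Concurrent MPI calls from several threads are only safe if this
// reports MPI_THREAD_MULTIPLE.
void report_mpi_thread_level ()
{
  int provided = 0;
  MPI_Query_thread (&provided);

  if (provided == MPI_THREAD_MULTIPLE)
    std::cout << "MPI provides MPI_THREAD_MULTIPLE" << std::endl;
  else
    std::cout << "MPI thread level is only " << provided
              << "; concurrent MPI calls from multiple threads are unsafe"
              << std::endl;
}

Calling this right after MPI has been initialized would show whether the OpenMPI build provides full thread support or not.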
Thanks for your kind help!
Regards,
Jack
Hi Prof. Wolfgang,
Thanks so much!
"The way to do this is to clone the MPI communicator you use for your overall problem once for each linear system."
Does that mean that, for my problem, I have to copy the vector and matrix of one linear system (either the thermal diffusion or the Stokes flow) into another vector and matrix that are built on a clone of the present communicator, before starting a thread to solve that linear system?
And then copy the solution of that linear system back into the target vector, which is built with the present MPI communicator used for assembling the systems?
By the way, is it possible to copy a TrilinosWrappers vector and matrix to other ones that are initialized with a different MPI communicator? Something along the lines of the sketch below is what I have in mind.
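Here is a minimal sketch for the vector part, assuming the same IndexSet (locally_owned_dofs) and communicator (mpi_communicator) that were used to assemble the system; these names and the helper function copy_to_cloned_communicator are placeholders of mine, not deal.II API:

#include <deal.II/base/index_set.h>
#include <deal.II/lac/trilinos_vector.h>
#include <mpi.h>

using namespace dealii;

// Build a second vector on a duplicated communicator and copy the
// locally owned values over, so that a solver running on its own thread
// does not share an MPI communicator with the other solver.
void copy_to_cloned_communicator (const IndexSet                      &locally_owned_dofs,
                                  const MPI_Comm                       mpi_communicator,
                                  const TrilinosWrappers::MPI::Vector &system_rhs,
                                  TrilinosWrappers::MPI::Vector       &cloned_rhs)
{
  // Clone the communicator; in real code it would have to be stored
  // and freed with MPI_Comm_free once it is no longer needed.
  MPI_Comm cloned_communicator;
  MPI_Comm_dup (mpi_communicator, &cloned_communicator);

  // Same partitioning as the original vector, different communicator.
  cloned_rhs.reinit (locally_owned_dofs, cloned_communicator);

  // Each process owns the same index range in both vectors, so the
  // locally owned entries can be copied one by one.
  for (types::global_dof_index j = 0; j < locally_owned_dofs.n_elements (); ++j)
    {
      const types::global_dof_index i = locally_owned_dofs.nth_index_in_set (j);
      cloned_rhs[i] = system_rhs[i];
    }
  cloned_rhs.compress (VectorOperation::insert);
}

For the matrix I suppose the sparsity pattern would also have to be rebuilt on the cloned communicator, which is part of what I am asking about.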
Thanks,
All best,
Jack
Hi Prof. Wolfgang,
Many thanks!
Originally, I suspected that solving the two linear systems simultaneously with two threads would reduce the run time. But it now seems that this idea increases the complexity of the code and of the communication between MPI communicators, and it is not guaranteed to decrease the computing time. I'm not going to pursue it at present. When I have time, I will test its feasibility and post the results here.
Thank you very much!
All my best wishes,
Jack
Many thanks for your encouragement.
I'm neither an expert on MPI nor on threads, and I underestimated the difficulty.
This may take a very long time for me to figure out.
Hopefully some experts on these parallel computing paradigms will look into this curious problem soon.
Thanks!