Segmentation fault in WorkStream::run() for colored iterators in parallel (MPI)

julius...@gmail.com

May 24, 2016, 1:18:17 PM
to deal.II User Group
Hey guys,

The current MeshWorker::loop() uses the WorkStream::run() overload whose copier runs in serial.
I tried the WorkStream::run() overload that takes colored iterators and therefore assembles in parallel, in the sense of a tbb::parallel_for().

My code runs without MPI, and also with an arbitrary number of MPI processes as long as each uses a single thread. If I use 2 MPI processes with 2 threads each, it crashes with a segmentation fault.
To construct the colored iterators I use make_graph_coloring() such that two cells sharing a face get different colors.
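Roughly, the coloring is set up along these lines (a simplified sketch rather than my exact code; get_conflict_indices is an illustrative helper that makes two cells conflict whenever they share a face):

#include <deal.II/base/graph_coloring.h>

typedef typename dealii::DoFHandler<2>::active_cell_iterator ITERATOR;

// Two cells conflict (and therefore get different colors) if they share a
// face, so the conflict indices of a cell are simply its face indices.
std::vector<dealii::types::global_dof_index>
get_conflict_indices(const ITERATOR &cell)
{
  std::vector<dealii::types::global_dof_index> indices;
  for (unsigned int f = 0; f < dealii::GeometryInfo<2>::faces_per_cell; ++f)
    indices.push_back(cell->face(f)->index());
  return indices;
}

// ... and later, once dof_handler is set up:
const std::vector<std::vector<ITERATOR> > colored_iterators =
  dealii::GraphColoring::make_graph_coloring(dof_handler.begin_active(),
                                             dof_handler.end(),
                                             &get_conflict_indices);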

The segmentation fault always appears while calling assemble() of ResidualSimple in simple.h.

Does anybody have a clue why I get a segmentation fault when adding local data to the global dst vector inside assemble(), even though I avoid write conflicts via the colored iterators?

The attached "error_message" contains the output my code writes for two MPI processes with two threads each.

Best,
Julius
error_message
main.cc

Wolfgang Bangerth

May 24, 2016, 4:23:57 PM
to dea...@googlegroups.com

Julius,

> The current MeshWorker::loop() uses the WorkStream::run() overload whose
> copier runs in serial. I tried the WorkStream::run() overload that takes
> colored iterators and therefore assembles in parallel, in the sense of a
> tbb::parallel_for().
>
> My code runs without MPI, and also with an arbitrary number of MPI processes
> as long as each uses a single thread. If I use 2 MPI processes with 2 threads
> each, it crashes with a segmentation fault.
> To construct the colored iterators I use make_graph_coloring() such that two
> cells sharing a face get different colors.
>
> The segmentation fault always appears while calling assemble() of
> ResidualSimple in simple.h.
>
> Does anybody have a clue why I get a segmentation fault when adding local
> data to the global dst vector inside assemble(), even though I avoid write
> conflicts via the colored iterators?

Nothing particularly obvious comes to mind, without having tried to debug this
in more detail.

Have you tried running the program under a debugger? Segmentation faults are
often relatively easy to debug because something is accessing an invalid
address, and the debugger can tell you what object this address corresponds
to, and how this address got into the current function. Try the approach
outlined here, for example:

https://github.com/dealii/dealii/wiki/Frequently-Asked-Questions#how-do-i-debug-mpi-programs

using the `mpirun -np 4 xterm -e gdb ./my_executable` approach.

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@math.tamu.edu
www: http://www.math.tamu.edu/~bangerth/

julius...@gmail.com

May 25, 2016, 8:42:41 AM
to deal.II User Group
Thanks Wolfgang for your kind and fast response.

I simplified my code and used gdb as you suggested.
The code now uses only one color, the idea being that the copier task is executed multithreaded.
I still get the same random crash when using two or more MPI processes with two or more threads each.
Is the implementation of the (colored) WorkStream::run() compatible with a hybrid-parallel code architecture?

Here is the simplified code (also attached):
#include <deal.II/fe/fe_dgq.h>
#include <deal.II/grid/grid_generator.h>
#include <deal.II/meshworker/loop.h>
#include <deal.II/meshworker/simple.h>
#include <deal.II/distributed/tria.h>
#include <deal.II/lac/trilinos_vector.h>

#include <vector>

int main(int argc, char **argv)
{
  // Allow up to two threads per MPI process.
  dealii::Utilities::MPI::MPI_InitFinalize mpi_initialization(argc, argv, 2);

  typedef dealii::MeshWorker::DoFInfo<2>            DOFINFO;
  typedef dealii::MeshWorker::IntegrationInfoBox<2> INFOBOX;
  typedef dealii::MeshWorker::Assembler::ResidualSimple<dealii::TrilinosWrappers::MPI::Vector> ASSEMBLER;
  typedef typename dealii::DoFHandler<2>::active_cell_iterator ITERATOR;

  dealii::parallel::distributed::Triangulation<2> tr(MPI_COMM_WORLD);
  const dealii::FE_DGQ<2> fe{1};
  dealii::DoFHandler<2>   dof_handler{tr};
  dealii::GridGenerator::hyper_cube(tr, 0., 1.);
  tr.refine_global(6);
  dof_handler.distribute_dofs(fe);

  // A single color containing all active cells.
  std::vector<std::vector<ITERATOR> > colored_iterators{1};
  for (auto cell = dof_handler.begin_active(); cell != dof_handler.end(); ++cell)
    colored_iterators[0].push_back(cell);

  const dealii::IndexSet locally_owned_dofs = dof_handler.locally_owned_dofs();
  dealii::TrilinosWrappers::MPI::Vector dst(locally_owned_dofs, MPI_COMM_WORLD);

  dealii::MeshWorker::Assembler::ResidualSimple<dealii::TrilinosWrappers::MPI::Vector> assembler;
  dealii::AnyData dst_data;
  dst_data.add<dealii::TrilinosWrappers::MPI::Vector *>(&dst, "dst");
  assembler.initialize(dst_data);

  DOFINFO dof_info(dof_handler);
  dealii::MeshWorker::DoFInfoBox<2, DOFINFO> dinfo_box(dof_info);
  assembler.initialize_info(dinfo_box.cell, false);
  for (unsigned int i = 0; i < dealii::GeometryInfo<2>::faces_per_cell; ++i)
    {
      assembler.initialize_info(dinfo_box.interior[i], true);
      assembler.initialize_info(dinfo_box.exterior[i], true);
    }

  dealii::WorkStream::run(colored_iterators,
                          dealii::std_cxx11::bind(&dealii::MeshWorker::cell_action<INFOBOX, DOFINFO, 2, 2, ITERATOR>,
                                                  dealii::std_cxx11::_1, dealii::std_cxx11::_3, dealii::std_cxx11::_2,
                                                  nullptr, nullptr, nullptr, dealii::MeshWorker::LoopControl{}),
                          dealii::std_cxx11::bind(&dealii::internal::assemble<2, DOFINFO, ASSEMBLER>,
                                                  dealii::std_cxx11::_1, &assembler),
                          INFOBOX{}, dinfo_box);
}
The gdb backtrace (also attached) shows a problem with a memcpy using SSSE3 in frame #0 (I checked my cpuinfo and my CPU does have SSSE3).

It is very difficult for me to pin down the issue just by looking at the backtrace, particularly because of its randomness.
I wonder whether anybody has had problems with colored iterators in a hybrid-parallel environment.

Thanks in advance, here is the backtrace:

#0  __memmove_ssse3_back ()
    at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1629
#1  0x00007ffff07358d1 in __copy_m<int> (__result=<optimized out>,
    __last=<optimized out>, __first=0x8f2424)
    at /usr/include/c++/4.8/bits/stl_algobase.h:372
#2  __copy_move_a<true, int*, int*> (__result=<optimized out>,
    __last=<optimized out>, __first=0x8f2424)
    at /usr/include/c++/4.8/bits/stl_algobase.h:390
#3  __copy_move_a2<true, int*, int*> (__result=<optimized out>,
    __last=<optimized out>, __first=0x8f2424)
    at /usr/include/c++/4.8/bits/stl_algobase.h:428
#4  copy<std::move_iterator<int*>, int*> (__result=<optimized out>,
    __last=..., __first=...) at /usr/include/c++/4.8/bits/stl_algobase.h:460
#5  __uninit_copy<std::move_iterator<int*>, int*> (__result=<optimized out>,
    __last=..., __first=...)
    at /usr/include/c++/4.8/bits/stl_uninitialized.h:93
#6  uninitialized_copy<std::move_iterator<int*>, int*> (
    __result=<optimized out>, __last=..., __first=...)
    at /usr/include/c++/4.8/bits/stl_uninitialized.h:117
#7  __uninitialized_copy_a<std::move_iterator<int*>, int*, int> (
    __result=<optimized out>, __last=..., __first=...)
    at /usr/include/c++/4.8/bits/stl_uninitialized.h:258
#8  __uninitialized_move_if_noexcept_a<int*, int*, std::allocator<int> > (
    __alloc=..., __result=<optimized out>, __last=<optimized out>,
    __first=0x8f2424) at /usr/include/c++/4.8/bits/stl_uninitialized.h:281
#9  std::vector<int, std::allocator<int> >::_M_insert_aux<int const&> (
    this=0x8ecb40, __position=0) at /usr/include/c++/4.8/bits/vector.tcc:369
#10 0x00007ffff0735a05 in std::vector<int, std::allocator<int> >::insert (
    this=this@entry=0x8ecb40, __position=..., __x=@0x7fffda9c85f8: 1)
    at /usr/include/c++/4.8/bits/vector.tcc:127
#11 0x00007ffff077ff62 in Epetra_FEVector::inputNonlocalValues<int> (
    this=this@entry=0x8eca40, GID=771, numValues=numValues@entry=1,
    values=values@entry=0x7fffda9c8680, suminto=suminto@entry=true,
    vectorIndex=vectorIndex@entry=0)
    at /export/home/jwitte/trilinos-12.6.1-Source/packages/epetra/src/Epetra_FEVector.cpp:427
#12 0x00007ffff077db5d in inputNonlocalValue<int> (vectorIndex=0,
    suminto=true, value=0, GID=<optimized out>, this=0x8eca40)
    at /export/home/jwitte/trilinos-12.6.1-Source/packages/epetra/src/Epetra_FEVector.cpp:371
#13 Epetra_FEVector::inputValues<int> (this=0x8eca40, numIDs=numIDs@entry=1,
    GIDs=GIDs@entry=0x7fffda9c870c, values=values@entry=0x7fffd0002b58,
    vectorIndex=vectorIndex@entry=0, suminto=true)
    at /export/home/jwitte/trilinos-12.6.1-Source/packages/epetra/src/Epetra_FEVector.cpp:296
#14 0x00007ffff077dcc5 in Epetra_FEVector::SumIntoGlobalValues (
    this=<optimized out>, numIDs=numIDs@entry=1,
    GIDs=GIDs@entry=0x7fffda9c870c, values=values@entry=0x7fffd0002b58,
    vectorIndex=vectorIndex@entry=0)
    at /export/home/jwitte/trilinos-12.6.1-Source/packages/epetra/src/Epetra_FEVector.cpp:168
#15 0x00000000004307be in dealii::TrilinosWrappers::VectorBase::add (
    this=this@entry=0x7fffffffb7d0, n_elements=4, indices=0x7fffd00029d0,
    values=0x7fffd0002b40)
    at /mnt/data/jwitte/dealii/install/include/deal.II/lac/trilinos_vector_base.h:1375
#16 0x0000000000430b28 in dealii::TrilinosWrappers::VectorBase::add (
    this=this@entry=0x7fffffffb7d0,
    indices=std::vector of length -1073741823, capacity -1073741823 = {...},
    values=...)
    at /mnt/data/jwitte/dealii/install/include/deal.II/lac/trilinos_vector_base.h:1343
#17 0x0000000000439a4a in dealii::MeshWorker::Assembler::ResidualSimple<dealii::TrilinosWrappers::MPI::Vector>::assemble<dealii::MeshWorker::DoFInfo<2, 2, double> > (this=this@entry=0x7fffffffb8c0, info=...)
    at /mnt/data/jwitte/dealii/install/include/deal.II/meshworker/simple.h:530
#18 0x0000000000439aac in assemble<dealii::MeshWorker::Assembler::ResidualSimple<dealii::TrilinosWrappers::MPI::Vector> > (assembler=..., this=0x7fffd0001030)
    at /mnt/data/jwitte/dealii/install/include/deal.II/meshworker/dof_info.h:458
#19 dealii::internal::assemble<2, dealii::MeshWorker::DoFInfo<2, 2, double>, dealii::MeshWorker::Assembler::ResidualSimple<dealii::TrilinosWrappers::MPI::Vector> > (dinfo=..., assembler=0x7fffffffb8c0)
    at /mnt/data/jwitte/dealii/install/include/deal.II/meshworker/loop.h:63
#20 0x0000000000437ed9 in __call<void, dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > const&, 0ul, 1ul> (
    __args=<unknown type in /export/home/jwitte/test-colored/build/myloop, CU 0x0, DIE 0xb1296>, this=0x8efd10) at /usr/include/c++/4.8/functional:1296
#21 operator()<const dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> >&, void> (this=0x8efd10)
    at /usr/include/c++/4.8/functional:1355
#22 std::_Function_handler<void (dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > const&), std::_Bind<void (*(std::_Placeholder<1>, dealii::MeshWorker::Assembler::ResidualSimple<dealii::TrilinosWrappers::MPI::Vector>*))(dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > const&, dealii::MeshWorker::Assembler::ResidualSimple<dealii::TrilinosWrappers::MPI::Vector>*)> >::_M_invoke(std::_Any_data const&, dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > const&) (
    __functor=..., __args#0=...) at /usr/include/c++/4.8/functional:2071
#23 0x0000000000433686 in std::function<void (dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > const&)>::operator()(dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > const&) const (
    this=this@entry=0x7fffffffb6c0, __args#0=...)
    at /usr/include/c++/4.8/functional:2471
#24 0x0000000000440db0 in dealii::WorkStream::internal::Implementation3::WorkerAndCopier<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, dealii::MeshWorker::IntegrationInfoBox<2, 2>, dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > >::operator() (
    this=0x7fffffffb640, range=...)
    at /mnt/data/jwitte/dealii/install/include/deal.II/base/work_stream.h:847
#25 0x000000000043c8a1 in operator()<tbb::blocked_range<__gnu_cxx::__normal_iterator<const dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >*, std::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > >&, void> (
    __object=..., this=0x7fffdea03d60) at /usr/include/c++/4.8/functional:588
#26 __call_c<void, tbb::blocked_range<__gnu_cxx::__normal_iterator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > const*, std::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > >&, 0ul, 1ul> (
    __args=<unknown type in /export/home/jwitte/test-colored/build/myloop, CU 0x0, DIE 0xb9fd1>, this=0x7fffdea03d60) at /usr/include/c++/4.8/functional:1305
#27 operator()<tbb::blocked_range<__gnu_cxx::__normal_iterator<const dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >*, std
::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > >&, void> (this=0x7fffdea03d60)
    at /usr/include/c++/4.8/functional:1369
#28 run_body (r=..., this=0x7fffdea03d40)
    at /usr/include/tbb/parallel_for.h:110
#29 tbb::interface6::internal::partition_type_base<tbb::interface6::internal::auto_partition_type>::execute<tbb::interface6::internal::start_for<tbb::blocked_range<__gnu_cxx::__normal_iterator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > const*, std::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > >, std::_Bind<std::_Mem_fn<void (dealii::WorkStream::internal::Implementation3::WorkerAndCopier<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, dealii::MeshWorker::IntegrationInfoBox<2, 2>, dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > >::*)(tbb::blocked_range<__gnu_cxx::__normal_iterator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > const*, std::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > > const&)> (std::reference_wrapper<dealii::WorkStream::internal::Implementation3::WorkerAndCopier<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, dealii::MeshWorker::I
ntegrationInfoBox<2, 2>, dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > > >, std::_Placeholder<1>)>, tbb::auto_partitioner const>, tbb::blocked_range<__gnu_cxx::__normal_iterator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > const*, std::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > > >(tbb::interface6::internal::start_for<tbb::blocked_range<__gnu_cxx::__normal_iterator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > const*, std::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > >, std::_Bind<std::_Mem_fn<void (dealii::WorkStream::internal::Implementation3::WorkerAndCopier<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, dealii::MeshWorker::IntegrationInfoBox<2, 2>, dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > >::*)(tbb::blocked_range<__gnu_cxx::__normal_iterator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > const*, std::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > > const&)> (std::reference_wrapper<dealii::WorkStream::internal::Implementation3::WorkerAndCopier<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, dealii::MeshWorker::IntegrationInfoBox<2, 2>, dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWor
ker::DoFInfo<2, 2, double> > > >, std::_Placeholder<1>)>, tbb::auto_partitioner const>&, tbb::blocked_range<__gnu_cxx::__normal_iterator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > const*, std::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > >&) (this=0x7fffdea03d78, start=...,
    range=...) at /usr/include/tbb/partitioner.h:277
#30 0x000000000043c98f in tbb::interface6::internal::start_for<tbb::blocked_range<__gnu_cxx::__normal_iterator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > const*, std::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > >, std::_Bind<std::_Mem_fn<void (dealii::WorkStream::internal::Implementation3::WorkerAndCopier<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, dealii::MeshWorker::IntegrationInfoBox<2, 2>, dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > >::*)(tbb::blocked_range<__gnu_cxx::__normal_iterator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > const*, std::vector<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, std::allocator<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> > > > > > const&)> (std::reference_wrapper<dealii::WorkStream::internal::Implementation3::WorkerAndCopier<dealii::TriaActiveIterator<dealii::DoFCellAccessor<dealii::DoFHandler<2, 2>, false> >, dealii::MeshWorker::Int
egrationInfoBox<2, 2>, dealii::MeshWorker::DoFInfoBox<2, dealii::MeshWorker::DoFInfo<2, 2, double> > > >, std::_Placeholder<1>)>, tbb::auto_partitioner const>::execute() (this=<optimized out>) at /usr/include/tbb/parallel_for.h:116
#31 0x00007ffff0a3cb3a in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all (this=0x7fffde9ffe00, parent=...,
    child=<optimized out>) at ../../src/tbb/custom_scheduler.h:455
#32 0x00007ffff0a38816 in tbb::internal::arena::process (this=0x7fffdea17300,
    s=...) at ../../src/tbb/arena.cpp:106
#33 0x00007ffff0a37f4b in tbb::internal::market::process (this=0x7fffdea17900,
    j=...) at ../../src/tbb/market.cpp:479
#34 0x00007ffff0a340ff in tbb::internal::rml::private_worker::run (
    this=0x7fffde839000) at ../../src/tbb/private_server.cpp:283
#35 0x00007ffff0a342f9 in tbb::internal::rml::private_worker::thread_routine (
    arg=<optimized out>) at ../../src/tbb/private_server.cpp:240
#36 0x00007fffe83d8182 in start_thread (arg=0x7fffda9c9700)
    at pthread_create.c:312
#37 0x00007fffefb4747d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

debug.txt
main.cc

Bruno Turcksin

May 25, 2016, 9:21:55 AM
to deal.II User Group
Julius,


On Wednesday, May 25, 2016 at 8:42:41 AM UTC-4, julius...@gmail.com wrote:
Here is the simplified code (also attached):
This code won't work: colored_iterators contains every active cell, including cells this process does not own. You need to check that the cells are locally owned.
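For the snippet you posted, that loop would look roughly like this (just a sketch):

for (auto cell = dof_handler.begin_active(); cell != dof_handler.end(); ++cell)
  if (cell->is_locally_owned())              // skip ghost and artificial cells
    colored_iterators[0].push_back(cell);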

Best,

Bruno

Martin Kronbichler

May 25, 2016, 9:30:04 AM
to dea...@googlegroups.com
Hi Julius,


Is the implementation of the (colored) WorkStream::run() compatible with a hybrid-parallel code architecture?

Besides checking the validity of the ranges (as Bruno points out), there is at least one potential problem visible in the stack trace: you cannot write into a TrilinosWrappers::MPI::Vector from multiple threads unless you take special care at construction. The way the vector works is that it accumulates non-local entries in a data structure somewhat similar to a std::map. That, of course, is not thread-safe. This also explains why the problem only appears sporadically: it happens when one thread changes the shared data structure while another is looking at it (or both try to modify it at the same time).

The solution to this problem is either to initialize the vector with the appropriate ghost entries (the locally relevant dofs), i.e., to use the reinit method/constructor that takes two index sets, an MPI_Comm, and a vector_writable flag, or to use a parallel::distributed::Vector that you initialize with the same two index sets, the locally owned dofs as the first argument and the locally relevant dofs as the second (i.e., the ghosts are the locally relevant minus the locally owned dofs). Trilinos matrices and preconditioners should know how to collaborate with these vectors.
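In code, the two variants would look roughly like this (a sketch using the names from your snippet; it additionally needs deal.II/dofs/dof_tools.h and, for the second variant, deal.II/lac/parallel_vector.h):

// Index set of ghosted (locally relevant) degrees of freedom.
dealii::IndexSet locally_relevant_dofs;
dealii::DoFTools::extract_locally_relevant_dofs(dof_handler, locally_relevant_dofs);

// Variant 1: Trilinos vector with ghost entries and vector_writable = true.
dealii::TrilinosWrappers::MPI::Vector dst;
dst.reinit(locally_owned_dofs, locally_relevant_dofs, MPI_COMM_WORLD,
           /*vector_writable=*/true);

// Variant 2: a parallel::distributed::Vector with the same two index sets.
dealii::parallel::distributed::Vector<double> dst2(locally_owned_dofs,
                                                   locally_relevant_dofs,
                                                   MPI_COMM_WORLD);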

There is potentially the same problem for Trilinos sparse matrices in case you start using them, but there it is simpler to get onto the thread-safe path (initialize them from a DynamicSparsityPattern and set 'exchange_data' to true).
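Roughly (again only a sketch; the sparsity pattern setup depends on your discretization):

dealii::DynamicSparsityPattern dsp(locally_relevant_dofs);
// ... fill dsp, e.g. via DoFTools::make_flux_sparsity_pattern(dof_handler, dsp) ...
dealii::TrilinosWrappers::SparseMatrix matrix;
matrix.reinit(locally_owned_dofs, dsp, MPI_COMM_WORLD,
              /*exchange_data=*/true);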


#10 0x00007ffff0735a05 in std::vector<int, std::allocator<int> >::insert (
    this=this@entry=0x8ecb40, __position=..., __x=@0x7fffda9c85f8: 1)
    at /usr/include/c++/4.8/bits/vector.tcc:127
#11 0x00007ffff077ff62 in Epetra_FEVector::inputNonlocalValues<int> (
    this=this@entry=0x8eca40, GID=771, numValues=numValues@entry=1,
    values=values@entry=0x7fffda9c8680, suminto=suminto@entry=true,
    vectorIndex=vectorIndex@entry=0)
    at /export/home/jwitte/trilinos-12.6.1-Source/packages/epetra/src/Epetra_FEVector.cpp:427


It is frame #11 that tells me that you end up in Epetra_FEVector::inputNonlocalValues, which is not a thread-safe method.

Best,
Martin

julius...@gmail.com

May 25, 2016, 3:16:06 PM
to deal.II User Group
Thanks a lot to both of you!

First, I modified my code so that the "colored iterators" (still a single color) contain only locally owned cells.
Unfortunately, this did not avoid the crash and I still got the same random error as before.

Next, I tested Martin's suggestion:
The solution to this problem is either to initialize the vector with the appropriate ghost entries (the locally relevant dofs), i.e., to use the reinit method/constructor that takes two index sets, an MPI_Comm, and a vector_writable flag, or to use a parallel::distributed::Vector that you initialize with the same two index sets, the locally owned dofs as the first argument and the locally relevant dofs as the second (i.e., the ghosts are the locally relevant minus the locally owned dofs). Trilinos matrices and preconditioners should know how to collaborate with these vectors.

And it works fine. To be clear, instead of using
  dealii::TrilinosWrappers::MPI::Vector dst(locally_owned_dofs, MPI_COMM_WORLD);
the correct way would be
  dealii::TrilinosWrappers::MPI::Vector dst;
  dst.reinit(locally_owned_dofs, locally_relevant_dofs, MPI_COMM_WORLD, true);

This approach also works fine in a much more sophisticated code of mine. And maybe just to note it,
it runs in both cases, using the whole range of iterators as well as a range consisting only of locally owned cells.
I believe this is because cell_action() checks whether the given cell is locally owned or not.

Again, thanks a lot for your kind responses and your help.
Best,
Julius

P.S. I will upload the corrected code after the holidays, as it is on my local machine at the university.