set_zero fails for multiple threads

Maxi Miller

Oct 10, 2017, 7:15:49 AM
to deal.II User Group

When trying to run my program on an i7 CPU (4 cores/8 threads), I get the following error when calling set_zero() from the ConstraintMatrix class:
--------------------------------------------------------
An error occurred in line <565> of file </home/roland/Downloads/dealii/source/lac/trilinos_vector.cc> in function
   
void dealii::TrilinosWrappers::MPI::Vector::compress(dealii::VectorOperation::values)
The violated condition was:
    result.max-result.min<1e-5
Additional information:  
   
Not all processors agree whether the last operation on this vector was an addition or a set operation. This will prevent the compress() operation from succeeding.
 
Stacktrace:
-----------
#0  /opt/dealII/lib/libdeal_II.g.so.9.0.0-pre: dealii::TrilinosWrappers::MPI::Vector::compress(dealii::VectorOperation::values)
#1  /opt/dealII/lib/libdeal_II.g.so.9.0.0-pre:  
#2  /opt/dealII/lib/libdeal_II.g.so.9.0.0-pre: void dealii::ConstraintMatrix::set_zero<dealii::TrilinosWrappers::MPI::Vector>(dealii::TrilinosWrappers::MPI::Vector&) const
#3  main: MinimalSurfaceProblem<2>::compute_residual(double, double, double)
#4  main: MinimalSurfaceProblem<2>::run()
#5  main: main
--------------------------------------------------------

The code where this error arises is:
LinearAlgebraTrilinos::MPI::Vector local_residual(dof_handler.locally_owned_dofs(), mpi_communicator);
//Code in between for setting values
hanging_node_constraints.set_zero(local_residual);

It only happens when running with more than 4 MPI processes, i.e. with mpirun -n 5 or larger. What is the problem here, and how can I fix it? The program runs (more or less) fine when compiled in release mode.
Thanks!

Wolfgang Bangerth

Oct 10, 2017, 9:18:32 AM
to dea...@googlegroups.com
On 10/10/2017 05:15 AM, 'Maxi Miller' via deal.II User Group wrote:
>
> When trying to run my program on an i7-CPU with 4/8 threads, I get the
> following error when calling set_zero() from the ConstraintMatrix-class:
> --------------------------------------------------------
> An error occurred in line <565> of file
> </home/roland/Downloads/dealii/source/lac/trilinos_vector.cc> in function
> void dealii::TrilinosWrappers::MPI::Vector::compress(dealii::VectorOperation::values)
> The violated condition was:
>     result.max-result.min<1e-5
> Additional information:
> Not all processors agree whether the last operation on this vector was an
> addition or a set operation. This will prevent the compress() operation from succeeding.
>
> Stacktrace:
> -----------
> #0 /opt/dealII/lib/libdeal_II.g.so.9.0.0-pre:
> dealii::TrilinosWrappers::MPI::Vector::compress(dealii::VectorOperation::values)
> #1 /opt/dealII/lib/libdeal_II.g.so.9.0.0-pre:
> #2 /opt/dealII/lib/libdeal_II.g.so.9.0.0-pre: void
> dealii::ConstraintMatrix::set_zero<dealii::TrilinosWrappers::MPI::Vector>(dealii::TrilinosWrappers::MPI::Vector&)
> const
> #3 main: MinimalSurfaceProblem<2>::compute_residual(double, double, double)
> #4 main: MinimalSurfaceProblem<2>::run()
> #5 main: main
> --------------------------------------------------------
>
> The code where this line arises is
> LinearAlgebraTrilinos::MPI::Vector local_residual(dof_handler.locally_owned_dofs(), mpi_communicator);
> //Code in between for setting values
> hanging_node_constraints.set_zero(local_residual);
>
>
> It only happens if running with more than 4 threads, i.e. with mpirun -n 5 or
> larger. What is the problem here, and how can I fix that?
That sounds like a bug somewhere. Can you create a small program that demonstrates this, and that you could share so that others can look at it? It doesn't have to do anything useful, just demonstrate the problem. For example, you could start from your existing program and strip the assembly and solver from it.

Best
W.



--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Timo Heister

Oct 10, 2017, 2:22:21 PM
to dea...@googlegroups.com
It might also be that you are forgetting to compress() your vector
after you assemble it. The error would then trigger on the next
operation, which is set_zero() in this case. What is the last
thing you did with this vector?
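
For illustration, a minimal sketch of the pattern I mean (the vector name, locally_owned_dofs, and mpi_communicator are placeholders here, not your actual code):

// A Trilinos vector that has just been filled with add-type accesses
// during assembly (e.g. via distribute_local_to_global):
dealii::TrilinosWrappers::MPI::Vector v(locally_owned_dofs, mpi_communicator);
// ... assembly adds entries into v on each MPI rank ...
// Without this collective call, some ranks are still in "add" mode when
// the next operation arrives; set_zero() then hits the
// "Not all processors agree ..." assertion in debug builds:
v.compress(dealii::VectorOperation::add);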



--
Timo Heister
http://www.math.clemson.edu/~heister/

Maxi Miller

Oct 10, 2017, 3:49:58 PM
to deal.II User Group
I am coming out of a loop where I am doing
hanging_node_constraints.distribute_local_to_global(cell_residual, local_dof_indices, local_residual);

so I was already wondering why I get that error.

Timo Heister

Oct 10, 2017, 4:54:46 PM
to dea...@googlegroups.com
On Tue, Oct 10, 2017 at 3:49 PM, 'Maxi Miller' via deal.II User Group
<dea...@googlegroups.com> wrote:
> I am coming out of a loop where I am doing
> hanging_node_constraints.distribute_local_to_global(cell_residual,
> local_dof_indices, local_residual);

Do you compress() your vectors (and matrices) after this? It needs to
happen on all processors!
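
To make this concrete, a sketch of what the fix might look like with the names from your messages (everything outside the two calls you quoted is assumed):

for (const auto &cell : dof_handler.active_cell_iterators())
  if (cell->is_locally_owned())
    {
      // ... fill cell_residual and local_dof_indices for this cell ...
      hanging_node_constraints.distribute_local_to_global(cell_residual,
                                                          local_dof_indices,
                                                          local_residual);
    }

// Collective call: every MPI rank must reach it, even ranks whose loop
// contributed nothing. It finishes the pending additions.
local_residual.compress(dealii::VectorOperation::add);

// Only now is a set-type operation safe:
hanging_node_constraints.set_zero(local_residual);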

Maxi Miller

Oct 11, 2017, 3:42:22 AM
to deal.II User Group
No, somehow I skipped that section and only remembered the last part ("All other operations like scaling or adding vectors, assignments, calls into deal.II (VectorTools, ConstraintMatrix, ...) or solvers do not require calls to compress().").
But do I need to call it again after calling .set_zero()? I am not sure whether that counts as "calls into deal.II" or as "setting individual values".

Timo Heister

Oct 11, 2017, 11:30:03 AM
to dea...@googlegroups.com
> But do I need to call it again after calling .set_zero()?

No. It falls under "all other operations".
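
In code, with the names from this thread (just a sketch of the documented rule):

local_residual.compress(dealii::VectorOperation::add); // once, after assembly
hanging_node_constraints.set_zero(local_residual);     // no compress() needed afterwards
const double norm = local_residual.l2_norm();          // safe to use the vector directly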