Convergence failure in step 0.

103 views
Skip to first unread message

yy.wayne

unread,
Feb 13, 2023, 9:38:28 AM2/13/23
to deal.II User Group
Hi,
I'm solving a multi-component problem in MatrixFree framework, preconditioned by geometric multigrid. The code runs good for several cases except one (with different geometry and boundary condition). The error is 
    "Iterative method reported convergence failure in step 0. 
     The residual in the last step was nan".
It's strange that the iteration breaks at step 0. If I try right preconditioning, it breaks at step 1 with nan. Besides, if there is only on multigrid level, the problem is solved correctly with 1 iteration, so matrices should be correct. The coarse solution is good enough for smoothing.

The question is, in which case may a CG or GMRES solver breaks at step 0 with nan?

yy.wayne

unread,
Feb 13, 2023, 8:58:19 PM2/13/23
to deal.II User Group
The exception aborts during the chebyshev smoothing

Snipaste_2023-02-14_09-54-42.png

Wolfgang Bangerth

unread,
Feb 13, 2023, 11:48:42 PM2/13/23
to dea...@googlegroups.com

> It's strange that the iteration breaks at step 0. If I try right
> preconditioning, it breaks at step 1 with nan. Besides, if there is only on
> multigrid level, the problem is solved correctly with 1 iteration, so matrices
> should be correct. The coarse solution is good enough for smoothing.
>
> The question is, in which case may a CG or GMRES solver breaks at step 0 with nan?

NaN errors, like segmentation faults, are relatively easy to find because you
know that every NaN you get is wrong. So you check that the matrix and right
hand side and solution vector you give to the solver don't have any NaNs. If
they don't, and you get NaNs out, then the problem likely lies with the
preconditioner, so you single step through the solver around the place where
the preconditioner is applied and check in which operation the NaN appears.

In your case, you seem to have already found out that it's the Chebyshev
preconditioner. The question is why. Is it because the matrix the
preconditioner uses has NaNs in it? Is it because the preconditioner makes an
assumption that it can divide by certain matrix entries, but they are zero?
Single stepping in a debugger will give you the answer :-)

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/


yy.wayne

unread,
Feb 14, 2023, 12:33:10 AM2/14/23
to deal.II User Group
Thank you Wolfgang. I didin't dive much into the Chebyshev preconditioner because I'm not familiar with it. 
I assume it's because the CG solver inside Chebyshev preconditioner didn't converge or what.
But as you stated, if NaN should only get from NaN elements and divided by zero, then I may find the error. 
I'll follow your debug suggestion.  

Best
Wayne

Martin Kronbichler

unread,
Feb 14, 2023, 2:33:18 AM2/14/23
to dea...@googlegroups.com

To extend over what Wolfgang said, the most likely causes for failure with the Chebyshev preconditioner are:

- the vector supplying the matrix diagonal (via the field AdditionalData::preconditioner) contains Inf or NaN entries, which in turn result from either a ill-formed differential operator or a bug in the code computing the diagonal, or

- a coarse level holding only constrained degrees of freedom, which are not treated properly so that the SolverCG making an eigenvalue estimate needs to work with an unsolvable linear system (right hand side not in the image of the matrix).

For the first case, you would check the diagonal on each level and the code leading to that, in the second the treatment of constraints.

Best,
Martin

--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en
---
You received this message because you are subscribed to the Google Groups "deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dealii+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dealii/a0326f84-a7b7-49ec-97b7-db1579574e87n%40googlegroups.com.

yy.wayne

unread,
Feb 14, 2023, 3:25:28 AM2/14/23
to deal.II User Group
Thanks for your reply Martin.

The reason my Chebyshev preconditioner breaks is the (inverse) diagonal vector I get from MatrixFree operator contains negative entries. 
In my corresponding matrix-based code with exactly same geometry and boundary condition, the matrix diagonal has no negaive entries.

I apply a "compute_nonzero_tangential_flux_constraints_on_level" function similar to the normal_flux one. The geometry of this problem is a shell. When that BC is applied on outer boundary there's no negative entries, but the error comes when applied on inner boundary. I'm still debugging to see where the error comes from.

I got a quesiton that do FEEvaluation auto deal with constraints? It seems a matrixfree_operator.vmult solution 
equals matrixbased_matrix.vmult + AffineConstraints.distribute.

Best
Wayne

Peter Munch

unread,
Feb 14, 2023, 3:50:28 AM2/14/23
to deal.II User Group
Hi Wayne,

it seems that you are using multigrid on a locally refined mesh. Have you tried to use globally refined meshes? This one does not have inner boundaries so that the issue should not occur. Does it?


As proof of concept, you could also try out the global-coarsening multigrid infrastructure (step-75); there, there are no internally boundaries per definition.

Peter

yy.wayne

unread,
Feb 14, 2023, 9:15:16 AM2/14/23
to deal.II User Group
Hi Peter,

I'm solving a globally refined problem. I'll describe my case more detailed.

The weak problem I'm solving is  <curl u, curl u> + <div u, div u> - k^2<u, u> = RHS, which is vector Maxwell wave equation (for nodal basis).
The geometry is a shell (file attached), on outer boundary absorption BC is applied, on inner boundary PEC BC ( nxu=0) is applied.
The trick here is PEC BC applied by VectorTools::compute_normal_flux_constraints.
When I set VectorTools::compute_normal_flux_constraints or compute_normal_flux_constraints_on_level 
on inner shell boundary, the diagonal entries have negative one. Otherwise it's all positive.
I believe it is these negative diagonal entries causing Chebyshev precondition failure.

Therefore I use a homogeneous operator with empty comstraints to assemble inverse diagonal. There is no negative entry and no iteration
stop at step 0 with NaN error. Though I address the error, I'm not sure if this method is right and got several questions:
1) Is it reasonable the diagonal computed by MatrixFree operator with compute_normal_flux() has negative entries? 
2) How to compare the matrix entries between MatrixFree operator and MatrixBased matrix when there's constraints(I've done homogeneous 
    constraints compare). In this case PEC is kind of a hanging constraints rather than Dirichlet constraints.
3) In MatrixFree method, feevaluation.distribute_local_to_global() include what constraints.distribute() does. So when I assemble the diagonal   
     vector, shall I disable the constraints.distribute() like function to get "real" diagonal entries? Because in MatrixBased system_matrix, all 
    diagonal entries are positive.
4) In the end, I think I misunderstand the role of AffineConstraints class. AffineConstraints class doesn't affect the system_matrix itself(I used 
    to see AffineConstraints as a smarter method to eliminate constraints during assemble rather than after assemble like 
    MatrixTools::apply_boundary_values does), it only reflect constraints when distribute? Then why we need constraints.distributed_local_to_global?
    If it does affect, why matrix's diagonal is positive but matrixfree computed has negative entries?

Best,
Wayne
Snipaste_2023-02-14_21-16-33.png
Snipaste_2023-02-14_21-16-57.png
Snipaste_2023-02-14_20-16-45.png

Wolfgang Bangerth

unread,
Feb 14, 2023, 11:39:41 AM2/14/23
to dea...@googlegroups.com
On 2/14/23 07:15, 'yy.wayne' via deal.II User Group wrote:
> The weak problem I'm solving is  <curl u, curl u> + <div u, div u> -
> k^2<u, u> = RHS, which is vector Maxwell wave equation (for nodal basis).
> [...]
> 1) Is it reasonable the diagonal computed by MatrixFree operator with
> compute_normal_flux() has negative entries?

The diagonal entries of the matrix, for this bilinear form, are
integrals of the form

A_ii = \int (curl phi_i)*(curl phi_i) + (div phi_i)*(div phi_i)
- k^2 phi_i * phi_i

In other words, A_ii is a sum of squares, but one of the squares is
negative. If k is small, the sum may still be positive, but for sure if
k is large, the diagonal entries of A will be negative.

yy.wayne

unread,
Feb 14, 2023, 9:04:33 PM2/14/23
to deal.II User Group
Hi Wolfgang,

Yes the diagonal could be negative when k rises. Will Chebyshev preconditioner smoother converge
when there are negative entries?

In my case k hasn't rise so large that diagonal entries is negative. The negative entries shows up only
when compute_normal_flux constraints is applied. I think AffineConstraints::distribute_local_to_global
is the cause since in matrix-based assemble, this function add a positive value on the constrained 
dofs while matrix-free does not. Therefore the diagonals are different between matrix-free and matrix-based.

Best,
Wayne

Wolfgang Bangerth

unread,
Feb 14, 2023, 11:17:44 PM2/14/23
to dea...@googlegroups.com
On 2/14/23 19:04, 'yy.wayne' via deal.II User Group wrote:
>
> Yes the diagonal could be negative when k rises. Will Chebyshev preconditioner
> smoother converge
> when there are negative entries?

I have no idea :-)


> In my case k hasn't rise so large that diagonal entries is negative. The
> negative entries shows up only
> when compute_normal_flux constraints is applied. I think
> AffineConstraints::distribute_local_to_global
> <https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.dealii.org%2Fcurrent%2Fdoxygen%2Fdeal.II%2FclassAffineConstraints.html%23a373fbdacd8c486e675b8d2bff8943192%3A~%3Atext%3DThus%2C%2520if%2520a%2520degree%2Cin%2520the%2520global%2520matrix.&data=05%7C01%7CWolfgang.Bangerth%40colostate.edu%7Cd2ecd8f014fc4c51b63308db0ef8fd28%7Cafb58802ff7a4bb1ab21367ff2ecfc8b%7C0%7C0%7C638120234786119642%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=c2TyCwiADwVWxKmtOEDrujHEkVC%2F8QPR5k7gIip3OoQ%3D&reserved=0>
> is the cause since in matrix-based assemble, this function add a positive
> value on the constrained
> dofs while matrix-free does not. Therefore the diagonals are different between
> matrix-free and matrix-based.

That's possible. I don't know enough about the matrix-free framework to tell
one way or the other, but maybe some of my colleagues here will know.

yy.wayne

unread,
Feb 14, 2023, 11:34:09 PM2/14/23
to deal.II User Group
I'll do some further test. Your advice on locating NaN error really helps me. Thank you Wolfgang :)

Best,
Wayne
Reply all
Reply to author
Forward
0 new messages