bug in program only with adaptive mesh refinement


Daniel Shapero

Feb 22, 2017, 7:49:08 PM
to deal.II User Group
Hi all -- I'm writing a library that involves solving a nonlinear elliptic PDE, which I'll write as f(u) = 0. There is an exact solution for this PDE for a certain simplified geometry. To test everything, I check that my numerical solution is tolerably close to the analytic solution, with both bilinear and biquadratic elements, and on a mesh that has been adaptively refined in part of the domain. This all works fine.

The PDE can be derived from an action principle, i.e. there's a convex functional P such that f = dP. I've decided to reimplement some of my code to explicitly use the action principle, so I wrote a routine to calculate P. The action is analytically computable for the exact solution I'm already using to test the code, so this is already one unit test. However, I decided to take it a step further, and check that the nonlinear differential operator f is the derivative of P, and that its linearization df is the Hessian operator of P. To check that, I take some vector field u, another vector field v, and check that

  P(u + h * v) = P(u) + h*f(u) . v + h^2 * (df(u)v) . v / 2 + O(h^3)

This test passes in the simple geometry with both bilinear and biquadratic elements, but *not* for an adaptively refined mesh. The errors in both the local linear approximation to the action functional and the local quadratic approximation do not go to 0 as the increment h is reduced. In addition, the value of the action is roughly the same for each h for both the adaptively refined and uniform meshes. The local quadratic approximation also converges as h goes to 0 on a realistic, complex geometry with real data, although we don't have an analytic solution for the problem there.
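
Concretely, the check looks roughly like this (a schematic sketch rather than my actual code; P, grad_dot, and hess_dot stand in for evaluating P(u), f(u) . v, and (df(u)v) . v, and V is any vector type with w.add(h, v) meaning w += h*v):

#include <cmath>
#include <iostream>

// Schematic sketch of the derivative check (not the actual library code).
template <typename Action, typename GradDot, typename HessDot, typename V>
void check_derivatives(const Action &P, const GradDot &grad_dot,
                       const HessDot &hess_dot, const V &u, const V &v)
{
  const double P0 = P(u);
  const double g  = grad_dot(u, v);   // f(u) . v
  const double H  = hess_dot(u, v);   // (df(u)v) . v
  for (unsigned int k = 0; k < 8; ++k)
  {
    const double h = std::pow(0.5, k);
    V w(u);
    w.add(h, v);                      // w = u + h * v
    const double e1 = P(w) - P0 - h * g;      // linear remainder
    const double e2 = e1 - 0.5 * h * h * H;   // quadratic remainder
    std::cout << h << "  " << std::abs(e1)    // should shrink like h^2
              << "  " << std::abs(e2)         // should shrink like h^3
              << std::endl;
  }
}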

I'm very confused by this result; since the solution of the PDE on an adaptively-refined mesh agrees with the analytic solution, it's kind of baffling that the local quadratic approximation to the action functional isn't working right in that case. The only other clue I have is that the error of the numerical solution against the analytic solution is actually somewhat worse on the adaptively-refined mesh than for the uniform mesh, but before I assumed this just had something to do with the linear solver.

Any ideas of what I might have missed? The code is available here if that helps.

Daniel

Wolfgang Bangerth

Feb 23, 2017, 12:12:15 AM
to dea...@googlegroups.com

Daniel,

> Hi all -- I'm writing a library that involves solving a nonlinear elliptic
> PDE, which I'll write as f(u) = 0. There is an exact solution for this PDE for
> a certain simplified geometry. To test everything, I check that my numerical
> solution is tolerably close to the analytic solution, with both bilinear and
> biquadratic elements, and on a mesh that has been adaptively refined in part
> of the domain. This all works fine.
>
> The PDE can be derived from an action principle, i.e. there's a convex
> functional P such that f = dP. I've decided to reimplement some of my code to
> explicitly use the action principle, so I wrote a routine to calculate P. The
> action is analytically computable for the exact solution I'm already using to
> test the code, so this is already one unit test. However, I decided to take it
> a step further, and check that the nonlinear differential operator f is the
> derivative of P, and that its linearization df is the Hessian operator of P.
> To check that, I take some vector field u, another vector field v, and check that
>
> P(u + h * v) = P(u) + h*f(u) . v + h^2 * (df(u)v) . v / 2 + O(h^3)
>
> [...]

I agree that it doesn't make very much sense, but that often points to a
misunderstanding rather than a bug. So let me poke:

* In the formula above, P(.) is a functional, I assume, i.e., it takes a
function and returns a number, right?
* If so, what exactly does
f(u) . v
actually mean? How do you compute this?
* Same for the second derivatives?


> This test passes in the simple geometry with both bilinear and biquadratic
> elements, but *not* for an adaptively refined mesh. The errors in both the
> local linear approximation to the action functional and the local quadratic
> approximation do not go to 0 as the increment h is reduced.

I assume you mean by "error" the size of the second and third term?


> In addition, the
> value of the action is roughly the same for each h for both the adaptively
> refined and uniform meshes.

I don't think I understand this statement.


> The only other clue I have is that the error of
> the numerical solution against the analytic solution is actually somewhat
> worse on the adaptively-refined mesh than for the uniform mesh, but before I
> assumed this just had something to do with the linear solver.

How do you measure "worse"? As the error as a function of the number of unknowns?

Best
W.

PS: I do like this approach to testing. It shows great maturity in designing
code and testing numerical methods to think this way. Well done!


--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/

Denis Davydov

Feb 23, 2017, 2:12:28 AM
to deal.II User Group
Hi Daniel,

The vectors you have in this equation P(u + h*v) = ... -- which of those have constraints distributed, and which have them zeroed?
If you assemble matrices with ConstraintMatrix::distribute_local_to_global(), the diagonal elements corresponding to constrained DoFs
will be dummy (positive) values, just to keep the linear system positive definite.
When you use such a solution further in matrix-vector products, you will most likely need to
set the constrained DoFs of the solution vector to zero, because your matrix was assembled with the constraints already distributed.
Say, in my case I need to compute the scalar (u, Ku), and that's where I would zero the constrained DoFs.
The same value could be obtained from (u', K'u'), where by ' I denote objects that don't take the constraints into account,
i.e. a solution vector with constraints distributed and a matrix assembled with empty constraints.
On the other hand, if you use this vector to evaluate the unknown field and update quadrature data, then you need to
distribute the constraints to make the field continuous.
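
For the (u, Ku) case, a minimal sketch of what I mean (names are made up; K is assumed to have been assembled with distribute_local_to_global()):

#include <deal.II/lac/constraint_matrix.h>
#include <deal.II/lac/sparse_matrix.h>
#include <deal.II/lac/vector.h>

// Sketch: the scalar (u, K u) when K was assembled with the constraints
// already distributed -- zero the constrained DoFs of u first.
double energy_product(const dealii::SparseMatrix<double> &K,
                      const dealii::ConstraintMatrix     &constraints,
                      const dealii::Vector<double>       &u)
{
  dealii::Vector<double> u0(u);
  constraints.set_zero(u0);            // zero out the constrained entries
  dealii::Vector<double> Ku(u0.size());
  K.vmult(Ku, u0);                     // Ku = K * u0
  return u0 * Ku;                      // the scalar (u, K u)
}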

Along the same lines: does your test work with non-homogeneous Dirichlet BCs? The discussion above applies there as well.

Regards,
Denis.

Daniel Shapero

Feb 23, 2017, 10:51:32 AM
to deal.II User Group, bang...@colostate.edu
Hi Wolfgang, thanks for looking!
 
> * In the formula above, P(.) is a functional, I assume, i.e., it takes a
> function and returns a number, right?
> * If so, what exactly does
>      f(u) . v
> actually mean? How do you compute this?
> * Same for the second derivatives?

Sorry if that was unclear -- if only we could write LaTeX in a newsgroup :) Yes, P is a functional, so P : H^1 to R; f : H^1 to (H^1)^*; and df : H^1 to the space of linear operators from H^1 to (H^1)^*. Maybe a better way to write it would be

P(u + h * v) = P(u) + h * <f(u), v> + h^2 * <df(u)v, v> / 2 + O(h^3)

where < * , * > is the duality pairing.

> I assume you mean by "error" the size of the second and third term?

Yes exactly, I checked that the errors in the linear approximation looked quadratic and that the errors in the quadratic approximation looked cubic as a function of h.
 
> > In addition, the
> > value of the action is roughly the same for each h for both the adaptively
> > refined and uniform meshes.
>
> I don't think I understand this statement.

By that, I meant that, if u and v are the velocity fields on the uniform mesh and u_a, v_a are the velocity fields on the adaptively-refined mesh, then P(u + h * v) = P(u_a + h * v_a) to within some negligible error.

> > The only other clue I have is that the error of
> > the numerical solution against the analytic solution is actually somewhat
> > worse on the adaptively-refined mesh than for the uniform mesh, but before I
> > assumed this just had something to do with the linear solver.
>
> How do you measure "worse"? As the error as a function of the number of unknowns?

If u_true is the projection of the exact solution onto the uniform mesh and u_a_true is the projection of the exact solution onto the adaptively-refined mesh, then |u - u_true| is less than |u_a - u_a_true|, where |*| is the L^2 norm of the functions (not the l^2 norm of the vector of coefficients). This is one of the weirder things to me; I would think that the error would be lower for the adaptive case since overall the cells are finer. But that could just be a quirk of my hand-rolled Newton solver.
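
For reference, the error is computed along these lines (a sketch with renamed objects, not the actual library code):

#include <deal.II/base/function.h>
#include <deal.II/base/quadrature_lib.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/lac/vector.h>
#include <deal.II/numerics/vector_tools.h>

// Sketch of the L^2 error: cellwise L^2 errors of the discrete solution
// against the analytic solution, combined into a single number.
template <int dim>
double compute_l2_error(const dealii::DoFHandler<dim> &dof_handler,
                        const dealii::Vector<double>  &solution,
                        const dealii::Function<dim>   &exact_solution,
                        const unsigned int             degree)
{
  dealii::Vector<double> cell_errors(
      dof_handler.get_triangulation().n_active_cells());
  dealii::VectorTools::integrate_difference(
      dof_handler, solution, exact_solution, cell_errors,
      dealii::QGauss<dim>(degree + 2), dealii::VectorTools::L2_norm);
  return cell_errors.l2_norm();   // l2 norm of the cellwise L2 norms
}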

Thanks for taking the time to check this out, I really appreciate it! Let me know if any more information would be helpful in diagnosing this,
Daniel

Daniel Shapero

Feb 23, 2017, 11:08:56 AM
to deal.II User Group
> The vectors you have in this equation P(u + h*v) = ... -- which of those have constraints distributed, and which have them zeroed?
> If you assemble matrices with ConstraintMatrix::distribute_local_to_global(), the diagonal elements corresponding to constrained DoFs
> ...
> Along the same lines: does your test work with non-homogeneous Dirichlet BCs? The discussion above applies there as well.
 
Hi Denis! This sounds like the kind of thing I would have overlooked. I assemble both the PDE operator f and its linearization using ConstraintMatrix::distribute_local_to_global. The vector field u has non-homogeneous Dirichlet boundary conditions. I took another vector field ub that's equal to u along the boundary, and set v = u - ub. So v should be 0 along the boundary, but I never explicitly enforced that using e.g. MatrixTools::apply_boundary_values or something.

Are you thinking that I should assemble without the constraints, and then only add them later in, say, the Newton solver?

Daniel

Denis Davydov

Feb 23, 2017, 11:22:23 AM
to dea...@googlegroups.com
You should NOT need to -- just be careful doing matrix-vector products.
And if you do enforce constraints, I would use ConstraintMatrix::distribute(), since you have extra (hanging node) constraints in the h-adaptive FEM case.
How you set up those constraints should be irrelevant; most likely you would call VectorTools::interpolate_boundary_values().
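
I.e., something along these lines -- a sketch with made-up names (dof_handler, boundary_values, and solution are whatever you have in your code):

#include <deal.II/base/function.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/dofs/dof_tools.h>
#include <deal.II/lac/constraint_matrix.h>
#include <deal.II/lac/vector.h>
#include <deal.II/numerics/vector_tools.h>

// Sketch: one ConstraintMatrix holding both hanging-node and Dirichlet
// constraints; distribute() then makes the field conforming/continuous.
template <int dim>
void apply_constraints(const dealii::DoFHandler<dim> &dof_handler,
                       const dealii::Function<dim>   &boundary_values,
                       dealii::Vector<double>        &solution)
{
  dealii::ConstraintMatrix constraints;
  dealii::DoFTools::make_hanging_node_constraints(dof_handler, constraints);
  dealii::VectorTools::interpolate_boundary_values(
      dof_handler, /* boundary_id = */ 0, boundary_values, constraints);
  constraints.close();
  constraints.distribute(solution);  // fix up the constrained DoFs in place
}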

Denis.

Wolfgang Bangerth

Feb 23, 2017, 2:02:01 PM
to Daniel Shapero, deal.II User Group

Hi Dan,

> > * In the formula above, P(.) is a functional, I assume, i.e., it takes a
> > function and returns a number, right?
> > * If so, what exactly does
> >      f(u) . v
> > actually mean? How do you compute this?
> > * Same for the second derivatives?
>
> Sorry if that was unclear -- if only we could write LaTeX in a newsgroup
> :) Yes, P is a functional, so P : H^1 to R; f : H^1 to (H^1)^*; and df :
> H^1 to the space of linear operators from H^1 to (H^1)^*. Maybe a
> better way to write it would be
>
> P(u + h * v) = P(u) + h * <f(u), v> + h^2 * <df(u)v, v> / 2 + O(h^3)
>
> where < * , * > is the duality pairing.

...which you compute via quadrature? Or do you compute a vector F that
corresponds to f(u) somehow?

If you go the route via vectors you have to pay attention to *what kind
of vector you have*, namely one that does or does not incorporate
constraints. Dealing with dual space vectors (e.g., rhs vectors) is
tricky because they obey different rules than primal space vectors
(e.g., solution vectors).


> I assume you mean by "error" the size of the second and third term?
>
>
> Yes exactly, I checked that the errors in the linear approximation
> looked quadratic and that the errors in the quadratic approximation
> looked cubic as a function of h.
>
>
> > > In addition, the
> > > value of the action is roughly the same for each h for both the
> > > adaptively refined and uniform meshes.
> >
> > I don't think I understand this statement.
>
> By that, I meant that, if u and v are the velocity fields on the uniform
> mesh and u_a, v_a are the velocity fields on the adaptively-refined
> mesh, then P(u + h * v) = P(u_a + h * v_a) to within some negligible error.

So u_a, v_a are the same as u, v in the pointwise sense, just
interpolated from a uniform mesh to an adaptively refined one obtained
from the uniform one?


> > > The only other clue I have is that the error of
> > > the numerical solution against the analytic solution is actually
> > > somewhat worse on the adaptively-refined mesh than for the uniform mesh,
> > > but before I assumed this just had something to do with the linear solver.
> >
> > How do you measure "worse"? As the error as a function of the number
> > of unknowns?
>
>
> If u_true is the projection of the exact solution onto the uniform mesh
> and u_a_true is the projection of the exact solution onto the
> adaptively-refined mesh, then |u - u_true| is less than |u_a -
> u_a_true|, where |*| is the L^2 norm of the functions (not the l^2 norm
> of the vector of coefficients). This is one of the weirder things to me;
> I would think that the error would be lower for the adaptive case since
> overall the cells are finer. But that could just be a quirk of my
> hand-rolled Newton solver.

Are you using the L2 projection? For example, for the Laplace equation,
I think you can only guarantee that the H1 seminorm error is smaller for
a finer mesh, but not necessarily the L2 error. It all depends on your
equation.

And how do the two meshes differ?

Best
W.

Daniel Shapero

Feb 23, 2017, 3:17:42 PM
to deal.II User Group, shapero...@gmail.com, bang...@colostate.edu
> ...which you compute via quadrature? Or do you compute a vector F that
> corresponds to f(u) somehow?
>
> If you go the route via vectors you have to pay attention to *what kind
> of vector you have*, namely one that does or does not incorporate
> constraints. Dealing with dual space vectors (e.g., rhs vectors) is
> tricky because they obey different rules than primal space vectors
> (e.g., solution vectors).
 
I'm assembling a vector F that represents f(u); the action of f(u) on a vector field v with FE expansion coefficients V can be represented as F*V. Of course, if you want to compute the L2 inner product of, say, u and v, that would be U^*MV, where M is the mass matrix. I actually made a wrapper class with an extra template parameter to keep track of whether a field is in H^1 or in (H^1)^*, in order to keep myself from making this mistake. Do you think there might be some additional subtlety regarding either boundary or hanging node constraints in this case?
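
In code, the two products I mean look roughly like this (just a sketch with illustrative names, not my actual wrapper class):

#include <deal.II/lac/sparse_matrix.h>
#include <deal.II/lac/vector.h>

// Sketch of the two pairings described above. F is the assembled (dual)
// vector for f(u); U, V are primal coefficient vectors; M is the mass matrix.
double duality_pairing(const dealii::Vector<double> &F,
                       const dealii::Vector<double> &V)
{
  return F * V;                        // <f(u), v>, no mass matrix involved
}

double l2_inner_product(const dealii::SparseMatrix<double> &M,
                        const dealii::Vector<double> &U,
                        const dealii::Vector<double> &V)
{
  dealii::Vector<double> MV(V.size());
  M.vmult(MV, V);
  return U * MV;                       // (u, v)_{L^2} = U^* M V
}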
 
> So u_a, v_a are the same as u, v in the pointwise sense, just
> interpolated from a uniform mesh to an adaptively refined one obtained
> from the uniform one?

Not quite -- I defined a dealii::TensorFunction object representing the exact solution, and a perturbation of it that matches it along the boundary; u and v are interpolated from the TensorFunction object to the uniform mesh, while u_a and v_a are interpolated from it to the non-uniform mesh.
 
> Are you using the L2 projection? For example, for the Laplace equation,
> I think you can only guarantee that the H1 seminorm error is smaller for
> a finer mesh, but not necessarily the L2 error. It all depends on your
> equation.


I did not know that! I shouldn't have said projection; really, I interpolated an analytic formula for the solution.
 
> And how do the two meshes differ?

The domain is a rectangle; the grid spacing is half as large in the right half of the mesh. 

Wolfgang Bangerth

Feb 23, 2017, 4:52:46 PM
to Daniel Shapero, deal.II User Group

> I'm assembling a vector F that represents f(u); the action of f(u) on a
> vector field v with FE expansion coefficients V can be represented as
> F*V.

OK, so F_i would then be (f(u),\varphi_i), right?

I think the question is whether you do or do not apply
ConstraintMatrix::condense to F. I never really know whether you do or
do not have to do that when you want to use F that way, so I typically
just assemble the (scalar number)
(f(u),v)
directly.
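
Schematically, assembling that scalar might look as follows, for a scalar-valued field and with the problem-specific integrand left as a placeholder -- just a sketch, none of these names are from your code. Note that no constraints enter anywhere: u and v are simply evaluated as finite element fields.

#include <deal.II/base/quadrature_lib.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/fe/fe_values.h>
#include <deal.II/lac/vector.h>
#include <vector>

// Sketch: assemble the scalar (f(u), v) directly by quadrature.
template <int dim, typename Integrand>
double assemble_pairing(const dealii::DoFHandler<dim> &dof_handler,
                        const dealii::Vector<double>  &u,
                        const dealii::Vector<double>  &v,
                        const Integrand &integrand) // integrand(u, grad u, v, grad v)
{
  const dealii::FiniteElement<dim> &fe = dof_handler.get_fe();
  const dealii::QGauss<dim> quad(fe.degree + 2);
  dealii::FEValues<dim> fe_values(fe, quad,
                                  dealii::update_values |
                                  dealii::update_gradients |
                                  dealii::update_JxW_values);
  std::vector<double> u_q(quad.size()), v_q(quad.size());
  std::vector<dealii::Tensor<1, dim>> du_q(quad.size()), dv_q(quad.size());

  double result = 0;
  for (const auto &cell : dof_handler.active_cell_iterators())
  {
    fe_values.reinit(cell);
    fe_values.get_function_values(u, u_q);       // u at quadrature points
    fe_values.get_function_gradients(u, du_q);
    fe_values.get_function_values(v, v_q);
    fe_values.get_function_gradients(v, dv_q);
    for (unsigned int q = 0; q < quad.size(); ++q)
      result += integrand(u_q[q], du_q[q], v_q[q], dv_q[q]) * fe_values.JxW(q);
  }
  return result;
}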


> > Are you using the L2 projection? For example, for the Laplace equation,
> > I think you can only guarantee that the H1 seminorm error is smaller for
> > a finer mesh, but not necessarily the L2 error. It all depends on your
> > equation.
>
> I did not know that! I shouldn't have said projection; really, I
> interpolated an analytic formula for the solution.

That, too, is not clear to me. We know that for elliptic problems, the
Galerkin approximation is the projection in the energy norm, and
consequently optimal in the energy norm (i.e., the one that minimizes
the energy norm error). But the interpolant is not optimal, either in
the L2 or energy norm, and so it's not clear to me that the interpolant
should be better (in either of these norms) on any one mesh compared to
another.


> The domain is a rectangle; the grid spacing is half as large in the
> right half of the mesh.

OK, so they're nested and the finite element space is strictly larger in
the adaptive mesh. Then indeed the energy norm error of the respective
solutions should be better, but we don't know whether that's true for
the L2 norm.

Daniel Shapero

Mar 5, 2017, 5:16:56 PM
to deal.II User Group, shapero...@gmail.com, bang...@colostate.edu
> OK, so F_i would then be (f(u),\varphi_i), right?
>
> I think the question is whether you do or do not apply
> ConstraintMatrix::condense to F. I never really know whether you do or
> do not have to do that when you want to use F that way, so I typically
> just assemble the (scalar number)
>    (f(u),v)
> directly.

Hi Wolfgang, a brief update -- before, I was using ConstraintMatrix::distribute_local_to_global in the assembly of both the derivative vector and the Hessian matrix. I can see how that would make the computed vector no longer represent the derivative of a functional, since you're changing entries just to make the hanging node constraints come out correctly. To see if this was the issue, I instead changed the code to add up contributions to the global vector/matrix directly, and to reconcile the constraints only in the nonlinear solver routine. All the other unit tests still pass, so this change didn't break the nonlinear solver for the unrefined case. Still, the errors in the local linear approximation to P aren't decreasing for the refined mesh.

I'll try your approach about assembling (f(u), v) directly and report back. If you have other suggestions, I'm all ears; this is a pretty serious roadblock for what I want to do next.

Wolfgang Bangerth

Mar 7, 2017, 7:52:30 PM
to Daniel Shapero, deal.II User Group

> Hi Wolfgang, a brief update -- before, I was using
> ConstraintMatrix::distribute_local_to_global in the assembly of both the
> derivative vector and the Hessian matrix. I can see how that would make
> the computed vector no longer represent the derivative of a functional,
> since you're changing entries more to make the hanging node constraints
> come out correctly. To see if this was the issue, I instead changed the
> code to add up contributions to the global vector/matrix directly, and
> only reconcile the constraints in the nonlinear solver routine. All the
> other unit tests still pass, so this change didn't break the nonlinear
> solver for the unrefined case. Still, the errors in the local linear
> approximation to P aren't decreasing for the refined mesh.
>
> I'll try your approach about assembling (f(u), v) directly and report
> back. If you have other suggestions, I'm all ears; this is a pretty
> serious roadblock for what I want to do next.

I imagine -- but you're doing the right thing trying to figure things out.

I don't have any other suggestions, except possibly to try to work out a
little example that you can compute by hand on a piece of paper. E.g.,
start with two cells, refine one, and prescribe a "current solution" u
that is convenient (e.g., constant, or linear). Then work out on a piece
of paper how everything ought to look, and compare what you get out of
your code. I've often used this as a last resort, because I know that
I'm going to spend an afternoon writing stuff on a piece of paper and
comparing with what I get, but it's also often helped me debug things
with which I've banged my head against the wall for too long already.
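
For what it's worth, the starting point for such an experiment might look like the following sketch (FE_Q(1) elements and a 2x1 rectangle assumed; none of this is specific to your code):

#include <deal.II/dofs/dof_handler.h>
#include <deal.II/dofs/dof_tools.h>
#include <deal.II/fe/fe_q.h>
#include <deal.II/grid/grid_generator.h>
#include <deal.II/grid/tria.h>
#include <deal.II/lac/constraint_matrix.h>
#include <iostream>
#include <vector>

// Sketch of the paper-and-pencil setup: two cells, one refined, so that the
// resulting hanging-node constraints can be printed and checked by hand.
int main()
{
  dealii::Triangulation<2> tria;
  dealii::GridGenerator::subdivided_hyper_rectangle(
      tria, std::vector<unsigned int>{2, 1},
      dealii::Point<2>(0, 0), dealii::Point<2>(2, 1));
  tria.begin_active()->set_refine_flag();    // refine only the first cell
  tria.execute_coarsening_and_refinement();

  dealii::FE_Q<2> fe(1);
  dealii::DoFHandler<2> dof_handler(tria);
  dof_handler.distribute_dofs(fe);

  dealii::ConstraintMatrix constraints;
  dealii::DoFTools::make_hanging_node_constraints(dof_handler, constraints);
  constraints.close();
  constraints.print(std::cout);              // which DoFs are tied, and how
}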