I noticed that switching to Q1-P0 significantly increases the cost of the linear solve. Here are some example iteration counts from step-56:
Q2-Q1:
Q1-P0:
I looked at a paper co-authored by Wolfgang (link here), and Fig. 8 suggests the FGMRES iteration count shouldn’t blow up this much. But clearly, the true bottleneck is the inner iterations for A.
My understanding is that using the upper block-triangular preconditioner with ILU for both the mass matrix and the A-matrix should still be reasonable when lowering the polynomial degree.
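For reference, here is the reasoning behind that expectation as I understand it. This is only a sketch: the block notation ($A$, $B$, $S$, $M_p$, $K$, $P$) is my own assumption about how the system is written, and it treats the idealized case where $A$ and the Schur complement are inverted exactly.

$$
K = \begin{pmatrix} A & B^T \\ B & 0 \end{pmatrix},
\qquad
P = \begin{pmatrix} A & B^T \\ 0 & S \end{pmatrix},
\qquad
S = -B A^{-1} B^T
$$

$$
P^{-1} = \begin{pmatrix} A^{-1} & -A^{-1} B^T S^{-1} \\ 0 & S^{-1} \end{pmatrix},
\qquad
K P^{-1} = \begin{pmatrix} I & 0 \\ B A^{-1} & I \end{pmatrix}
$$

Since $(K P^{-1} - I)^2 = 0$, a right-preconditioned Krylov method would converge in at most two iterations with exact inner solves. In practice $S^{-1}$ is replaced by an approximation based on the pressure mass matrix $M_p$, and $A^{-1}$ by an inner solve with ILU. So the outer FGMRES count should mainly reflect how good those two approximations are, while the inner iteration count for $A$ depends only on how well ILU preconditions $A$ itself.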
Does anyone (maybe the authors of the tutorials?) have an idea why the iteration counts increase so dramatically with the Q1-P0 element?
My goal is actually not to solve Stokes, but to develop a preconditioner for incompressible nonlinear elasticity, solving:
The discretized system looks structurally similar to Stokes, especially since the A-matrix (stiffness matrix) is symmetric and the B-matrix depends similarly on
. So the same upper block-triangular preconditioner should be applicable, perhaps with tweaks to the inner preconditioners.
I tried switching the inner preconditioner for the A-matrix (AMG with one cycle, CG with ILU as in step-56), but the huge number of inner iterations for A persists, even for the more stable Q2-P1 element and at small loadings. AMG did not notably reduce the iteration count either.
I can still solve moderate-sized problems with a direct solver, so invertibility is not the issue. I also double-checked that the matrices A and B are on similar scales by comparing their Frobenius norms, so scaling should not be the main problem either.
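To make the scaling check concrete, here is a minimal sketch of the comparison in plain Python. The helper `frobenius_norm` and the small blocks `A` and `B` are hypothetical and only for illustration; they are not my actual matrices. In deal.II itself I believe the corresponding call is `SparseMatrix::frobenius_norm()`.

```python
import math


def frobenius_norm(matrix):
    """Return sqrt(sum of squared entries) of a dense matrix stored as a list of rows."""
    return math.sqrt(sum(entry * entry for row in matrix for entry in row))


# Hypothetical toy blocks, NOT taken from an actual discretization.
A = [[4.0, -1.0],
     [-1.0, 4.0]]   # stands in for the stiffness block
B = [[1.0, -1.0]]   # stands in for the divergence-like block

norm_A = frobenius_norm(A)
norm_B = frobenius_norm(B)
ratio = norm_A / norm_B

# A ratio of order one means the two blocks have comparable overall magnitude.
print(f"||A||_F = {norm_A:.4f}, ||B||_F = {norm_B:.4f}, ratio = {ratio:.4f}")
```

Note that this only compares the overall magnitude of the two blocks. It says nothing about the conditioning of A, which is what the inner solver actually sees.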
Any recommendations or insights on what might need to be adapted in the preconditioner design when moving from Stokes to incompressible nonlinear elasticity?
Thanks in advance!