Better GMRES usage


Shohei Ogawa

Sep 28, 2018, 5:31:39 PM
to moose...@googlegroups.com
Hello,

I am solving an ion-migration problem with some surface reaction models. I am wondering how to use GMRES more effectively by changing the -ksp_gmres_restart option. I currently have -ksp_gmres_restart 300. Most of the linear solves take 600 to 700 iterations, so a few GMRES restarts happen in each linear solve. After each GMRES restart, it takes 30 to 100 iterations for the residual to return to a value similar to the one before the restart.

1. Memory usage of GMRES restart
If I increase the number of iterations before a restart by 1, is the increase in memory usage approximately [# of degrees of freedom] * 8 bytes? (I believe there are other things that take memory as well.)
For example, if I have 6 million nodes with three variables each and store up to 300 vectors (with -ksp_gmres_restart 300), the total memory usage for storing the Krylov-subspace vectors will be 6M * 3 * 8 * 300 / 1e9 = 43.2 GB.
So even though GMRES may converge better, increasing -ksp_gmres_restart can increase the memory usage significantly. Does this estimate look reasonable?

2. Can more frequent GMRES restart improve the convergence?
With restarts, GMRES throws away the existing Krylov-subspace vectors. Is it possible for GMRES to work better (converge faster) with frequent restarts, i.e., can too large a value of -ksp_gmres_restart degrade convergence?

My preconditioning parameters are as below.
 [smp_full_asm_ilu]
 type = SMP
 full = true
 solve_type = PJFNK
 petsc_options_iname = '-pc_type -pc_asm_overlap -sub_pc_type -sub_pc_factor_levels -ksp_type  -ksp_gmres_restart'
 petsc_options_value = 'asm 2 ilu 1 gmres 300'
 []

Thank you,
Shohei Ogawa

Derek Gaston

Sep 28, 2018, 5:58:02 PM
to MOOSE
First: you simply shouldn't be taking that many linear iterations!  You need a better preconditioner!  Optimally, linear iterations should be closer to 10-30; at the very least they should be less than 100.  Anything more than that, and you can almost always do better with a better preconditioner.

Can you tell us about your PDEs?  We might be able to suggest a better preconditioner.

As for your questions:

1.  Yes - with Lagrange shape functions that's pretty much right.

2.  This is very problem dependent.  For some problems restarting can be a good thing... for others it can mean complete death.  Maybe Jed Brown or Fande can tell us a bit more about how restarting interacts with convergence for different types of problems.  In my own work I almost always find restarting to be detrimental to the convergence rate, so I set the restart length above my maximum number of linear iterations so that a restart never happens.

Derek


Jed Brown

Sep 28, 2018, 11:55:21 PM
to Derek Gaston, MOOSE
Derek Gaston <frie...@gmail.com> writes:

> 2. This is very problem dependent. For some problems restart can be a
> good thing... for others it can mean complete death. Maybe Jed Brown or
> Fande can tell us a bit more about how restart interacts with convergence
> of different types of problems. In my own work I almost always find
> restart to be detrimental to convergence rate and therefore I set it above
> my max linear its so I never have it.

Lots of hard problems show that behavior. Some are only weakly
sensitive to restarts so the default of GMRES(30) is okay (and at least
it doesn't threaten to run out of memory). As usual with nonsymmetric
solvers, there are always counterexamples, such as when shorter restarts
yield arbitrarily faster convergence (Embree, 2003).

https://pdfs.semanticscholar.org/138b/9e1bc73a31c6c89b5b0d867956dbb31dff06.pdf

Shohei Ogawa

Oct 4, 2018, 5:30:06 PM
to moose...@googlegroups.com, frie...@gmail.com
Thank you so much for the replies and sorry for the delay.

My simulation solves the Poisson-Nernst-Planck equations. Basically, I solve for the concentration fields of two species (O2 and H+) and one electrostatic potential. The H+ species has diffusion and electromigration (advection) effects, while O2 only has a diffusion effect. As boundary conditions, I have a non-linear species consumption rate for both species concentrations and a rather high flux (surface charge) on the platinum metal surfaces for the electrostatic potential. I implemented the off-diagonal Jacobians, and they should be right according to the Jacobian debugger. The surface charge attracts H+ and causes a very steep concentration increase of H+. As one of the species is affected by the electrostatic potential, the problem is essentially a convection-diffusion problem. In addition, the diffusivities of the species vary by ~10 times across three different subdomains. The model equations are in the file at the URL below.

Also, I have to solve those equations on meshes generated from 3D microscopy images, which makes it much harder to control the mesh size. The excessive linear iterations could be due to insufficient element density, but I am not sure. In the past, I developed a marker to refine elements near certain sidesets to partially solve this issue, but the element density still might not be enough to resolve the steep concentration increase described above.

I read the paper Prof. Brown shared. With a smaller value for the GMRES restart parameter, the linear iterations seem to converge better, even though the linear solve still takes more than 1000 iterations in the last non-linear iterations... I am trying to do a more precise comparison.

I discussed solvers for this problem before, and it turned out that BoomerAMG doesn't work at all for my problem:
Solver setting for Poisson-Nernst-Planck (PNP) equations for electrochemical systems
<https://groups.google.com/forum/#!searchin/moose-users/PNP%7Csort:date/moose-users/O5dLCn-oowE/TNmOgtRQBgAJ>

Is ASM the only option for problems solved in parallel? Also, with ASM we can specify a sub-preconditioner; I am using ILU, but is there any better choice?

Thank you,
Shohei Ogawa



Derek Gaston

Oct 4, 2018, 6:19:25 PM
to ogawa...@gmail.com, MOOSE
Hmmm - Fande Kong has developed a new "hybrid" AMG method that uses strong block smoothers for convection diffusion style problems.  I wonder if it could help you here...

Derek

Jed Brown

Oct 4, 2018, 6:20:26 PM
to Shohei Ogawa, moose...@googlegroups.com, frie...@gmail.com
Shohei Ogawa <ogawa...@gmail.com> writes:

> I read the paper Prof. Brown shared. With a smaller number for
> the GMRES restart parameter, it seems that the linear iteration convergence
> is better, even though it still takes more than 1000 iterations in the last
> non-linear iterations... I am trying to do a more precise comparison.
>
> I discussed solvers for this problem before. And it turned out
> Boomeramg doesn't
> work at all in my problem.
> Solver setting for Poisson-Nernst-Planck (PNP) equations for
> electrochemical systems
> <https://groups.google.com/forum/#!searchin/moose-users/PNP%7Csort:date/moose-users/O5dLCn-oowE/TNmOgtRQBgAJ>

Just shooting from the hip, I would expect it's the asymmetric/transport
term and off-diagonal Laplacian in your H+ equation that is tripping up
strength measures for BoomerAMG. If you have all the Jacobian terms (or
approximations), you should be able to use a FieldSplit with AMG for the
Laplacian blocks and either that or ASM for the transport (depending on
regime).

> Is ASM only the option for problems to be solved in parallel? Also, with
> ASM, we can specify a sub preconditioner. I am using ILU but is there any
> better choice?

-sub_pc_type lu

would use a direct solve (instead of ILU) on each subdomain.
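For instance, dropped into the SMP block from earlier in the thread, it might look like the sketch below (the block name is arbitrary, and -sub_pc_factor_levels no longer applies once LU is used):

 [smp_full_asm_lu]
 type = SMP
 full = true
 solve_type = PJFNK
 petsc_options_iname = '-pc_type -pc_asm_overlap -sub_pc_type -ksp_type -ksp_gmres_restart'
 petsc_options_value = 'asm 2 lu gmres 300'
 []

A direct subdomain solve is typically more robust than ILU, at the cost of more memory and factorization time.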

Shohei Ogawa

Oct 4, 2018, 8:16:00 PM
to j...@jedbrown.org, moose...@googlegroups.com, Derek Gaston
Thank you so much for the suggestions.

Regarding the new "hybrid" AMG method Derek mentioned, which uses strong block smoothers for convection-diffusion style problems: is this already available in MOOSE?

> Just shooting from the hip, I would expect it's the asymmetric/transport
> term and off-diagonal Laplacian in your H+ equation that is tripping up
> strength measures for BoomerAMG.  If you have all the Jacobian terms (or
> approximations), you should be able to use a FieldSplit with AMG for the
> Laplacian blocks and either that or ASM for the transport (depending on
> regime).

Is there any good information resource about field split? I was not able to find documentation about it for MOOSE. I looked at the field split tests in MOOSE, but they didn't make much sense to me.

Thank you,
Shohei 

Fande Kong

Oct 4, 2018, 9:03:34 PM
to moose...@googlegroups.com, j...@jedbrown.org, Derek Gaston


On Oct 4, 2018, at 6:15 PM, Shohei Ogawa <ogawa...@gmail.com> wrote:

Thank you so much for the suggestions.

Regarding the new "hybrid" AMG method Derek mentioned, which uses strong block smoothers for convection-diffusion style problems: is this already available in MOOSE?

Not yet. It is tricky to build AMG for convection problems. We have made progress for some neutron NDA problems, but it is not done yet. I am actively working on it.



> Just shooting from the hip, I would expect it's the asymmetric/transport
> term and off-diagonal Laplacian in your H+ equation that is tripping up
> strength measures for BoomerAMG.  If you have all the Jacobian terms (or
> approximations), you should be able to use a FieldSplit with AMG for the
> Laplacian blocks and either that or ASM for the transport (depending on
> regime).

Is there any good information resource about field split? I was not able to find documentation about it for MOOSE. I looked at the field split tests in MOOSE, but they didn't make much sense to me.

Why do you say the examples don't make sense to you? We have a nice interface in MOOSE that lets users easily set up a field split preconditioner. I believe some users have successfully used this capability for their problems.

Fande 


Shohei Ogawa

Oct 4, 2018, 9:29:01 PM
to moose...@googlegroups.com, j...@jedbrown.org, frie...@gmail.com
On Thu, Oct 4, 2018 at 9:03 PM Fande Kong <fdko...@gmail.com> wrote:


On Oct 4, 2018, at 6:15 PM, Shohei Ogawa <ogawa...@gmail.com> wrote:

Thank you so much for the suggestions.

Regarding the new "hybrid" AMG method Derek mentioned, which uses strong block smoothers for convection-diffusion style problems: is this already available in MOOSE?

Not yet. It is tricky to build AMG for convection problems. We have made progress for some neutron NDA problems, but it is not done yet. I am actively working on it.

Thank you very much for the information. That problem sounds interesting.




> Just shooting from the hip, I would expect it's the asymmetric/transport
> term and off-diagonal Laplacian in your H+ equation that is tripping up
> strength measures for BoomerAMG.  If you have all the Jacobian terms (or
> approximations), you should be able to use a FieldSplit with AMG for the
> Laplacian blocks and either that or ASM for the transport (depending on
> regime).

Is there any good information resource about field split? I was not able to find documentation about it for MOOSE. I looked at the field split tests in MOOSE, but they didn't make much sense to me.

Why do you say the examples don't make sense to you? We have a nice interface in MOOSE that lets users easily set up a field split preconditioner. I believe some users have successfully used this capability for their problems.

Maybe my background knowledge about field split is not sufficient. I will learn about it.

Thank you,
Shohei Ogawa

Shohei Ogawa

Oct 10, 2018, 1:28:47 PM
to moose-users
I am learning how to use field split for my problem.

Meanwhile, I am wondering how different the field split and the physics-based preconditioner (PBP) are in MOOSE. It seems that both of them let us use different preconditioners for different variables, but the physics-based preconditioner solves the variables in a specified order.

Thank you,
Shohei
Fande Kong

Oct 10, 2018, 1:47:27 PM
to moose...@googlegroups.com
On Wed, Oct 10, 2018 at 11:28 AM Shohei Ogawa <ogawa...@gmail.com> wrote:
I am learning how to use field split for my problem.

Meanwhile, I am wondering how different the field split and the physics-based preconditioner (PBP) are in MOOSE. It seems that both of them let us use different preconditioners for different variables, but the physics-based preconditioner solves the variables in a specified order.

Field split is actually implemented in PETSc, and we provide only the interface.  FS has several ways to couple the variables (multiplicative, additive, Schur).  FS works on your original global matrix.  PBP uses multiplicative coupling by default, and it works directly on the individual block matrices.

You could choose the one that works better for you. 
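In the MOOSE FSP input the coupling is chosen per split with splitting_type. A minimal sketch, with made-up variable names u and v, might look like:

[fsp_example]
  type = FSP
  solve_type = PJFNK
  topsplit = 'all'
  [all]
    splitting = 'u v'                # the sub-splits defined below
    splitting_type = multiplicative  # or additive, or schur
  []
  [u]
    vars = 'u'
  []
  [v]
    vars = 'v'
  []
[]

The splitting_type line is the only thing that changes among the three coupling modes.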

Fande,

 


Shohei Ogawa

Oct 10, 2018, 2:03:36 PM
to moose...@googlegroups.com
Thank you for the explanation. It makes a lot of sense to me.

Shohei Ogawa


Shohei Ogawa

Oct 10, 2018, 4:47:46 PM
to moose-users
I am trying the PBP first. I have several questions.

1. What can be given to the 'preconditioner' parameter?
The doc string for the parameter says "TODO: docstring"...

2. So far I have been able to run PBP with 'preconditioner' set to ASM, AMG, ILU, or LU.
If AMG is set, is this HYPRE BoomerAMG? Can we set other parameters for AMG?
If ASM is set, how can we specify the sub_pc_type?

3. How do I find the right settings?
According to my equations, the O2 concentration is diffusion with reaction, and its coupling with the other two variables is probably the least tight. So should I solve for this variable first, and maybe the electrostatic potential next?

Thank you,
Shohei 



Fande Kong

Oct 10, 2018, 5:11:52 PM
to moose...@googlegroups.com
On Wed, Oct 10, 2018 at 2:47 PM Shohei Ogawa <ogawa...@gmail.com> wrote:
I am trying the PBP first. I have several questions.

1. What can be given to the 'preconditioner' parameter?
The doc string for the parameter says "TODO: docstring"...

2. So far I have been able to run PBP with 'preconditioner' set to ASM, AMG, ILU, or LU.
If AMG is set, is this HYPRE BoomerAMG? Can we set other parameters for AMG?
If ASM is set, how can we specify the sub_pc_type?

I have not done this myself so far, but you should be able to change the PC behavior using PETSc options as you do every day.

I believe BoomerAMG will be used by default. You may stay with the default settings first. In my experience, BoomerAMG works better.
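For example, standard hypre BoomerAMG options could be passed through the usual petsc_options_iname/petsc_options_value mechanism, something like the sketch below (the particular options and values are just placeholders to tune for your problem):

  petsc_options_iname = '-pc_hypre_boomeramg_strong_threshold -pc_hypre_boomeramg_coarsen_type'
  petsc_options_value = '0.7 HMIS'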

 

3. How do I find the right settings?
According to my equations, the O2 concentration is diffusion with reaction, and its coupling with the other two variables is probably the least tight. So should I solve for this variable first, and maybe the electrostatic potential next?


I would try the order like this: O2 concentration, electrostatic potential, H+ concentration.
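With the three variables presumably named something like c_O2, psi_electrolyte, and c_H_plus, a minimal PBP sketch of that ordering could look like the block below (the off-diagonal row/column pairs are only an illustration of which couplings to feed to PBP):

[pbp]
  type = PBP
  solve_order = 'c_O2 psi_electrolyte c_H_plus'
  preconditioner = 'AMG AMG ASM'
  off_diag_row = 'psi_electrolyte c_H_plus c_H_plus'
  off_diag_column = 'c_O2 c_O2 psi_electrolyte'
[]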

Fande,
 


Shohei Ogawa

Oct 10, 2018, 7:01:25 PM
to moose...@googlegroups.com
Thank you for the suggestion.

I think the other PETSc options work if specified in the regular way. I got different residual behavior with different values of -sub_pc_type with ASM.

Shohei Ogawa

Oct 17, 2018, 7:28:33 PM
to moose-users
It turned out that the field split preconditioner below works well. For my problem, where the advection-like effect on the H+ concentration is strong, Schur complement field splitting seems to be suitable.
[fsp_schur_full_selfp]
  type = FSP
  solve_type = "PJFNK"
  topsplit = 'separate_c_H_plus'
  [separate_c_H_plus]
    splitting = 'c_O2_and_psi_electrolyte c_H_plus'
    splitting_type  = schur
    petsc_options_iname = '-pc_fieldsplit_schur_fact_type -pc_fieldsplit_schur_precondition'
    petsc_options_value = 'full selfp'
  []
  [c_O2_and_psi_electrolyte]
    vars = 'c_O2 psi_electrolyte'
    petsc_options_iname = '-pc_type -pc_hypre_type -ksp_type'
    petsc_options_value = ' hypre    boomeramg      gmres'
  []
  [c_H_plus]
    vars = 'c_H_plus'
    petsc_options_iname = '-pc_type -sub_pc_type -ksp_type'
    petsc_options_value = '     asm          ilu  gmres'
  []
[]
For a small problem, it takes only a few linear iterations for each non-linear iteration! I am wondering whether what I expect is actually happening with these options. To make the solver scalable, I want to apply AMG to the c_O2 and psi_electrolyte variables; to the c_H_plus variable I am applying ASM with ILU as the sub PC. Is it doing what I expect?

For comparison, when I use ASM with the settings as below,
  [smp_full_asm_ilu]
    type = SMP
    full = true
    solve_type = PJFNK
    petsc_options_iname = '-pc_type -pc_asm_overlap -sub_pc_type -sub_pc_factor_levels -ksp_type  -ksp_gmres_restart'
    petsc_options_value = 'asm 1 ilu 1 gmres 100'
  []
it takes about 10 times more linear iterations (50 to 60) for each non-linear iteration. In terms of wall time, it takes roughly 3 times longer than with the field split. (Each linear iteration is much faster without the field split.)

Thank you,
Shohei


Shohei Ogawa

Oct 17, 2018, 7:37:14 PM
to moose-users
In addition to the previous post, I just noticed that BoomerAMG works for all the variables with the field split. It seems to be slightly faster (less wall time) than using ASM for the c_H_plus variable. Interestingly, without FS, BoomerAMG doesn't work at all for my problem.
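For reference, that amounts to swapping the options of the c_H_plus split in the block I posted earlier over to hypre as well, roughly like this (sketch):

  [c_H_plus]
    vars = 'c_H_plus'
    petsc_options_iname = '-pc_type -pc_hypre_type -ksp_type'
    petsc_options_value = 'hypre boomeramg gmres'
  []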

Fande Kong

Oct 17, 2018, 7:46:15 PM
to moose...@googlegroups.com
On Wed, Oct 17, 2018 at 5:37 PM Shohei Ogawa <ogawa...@gmail.com> wrote:
In addition to the previous post, I just noticed that BoomerAMG works for all the variables with the field split. It seems to be slightly faster (less wall time) than using ASM for the c_H_plus variable. Interestingly, without FS, BoomerAMG doesn't work at all for my problem.

Maybe because the convective term is known during the FS processing, so all the equations become diffusive.

I was wondering if the additive or multiplicative modes work for you.

Fande,
 


Shohei Ogawa

Oct 17, 2018, 8:10:42 PM
to moose...@googlegroups.com
That makes sense.

I tried both additive and multiplicative, but they were not as effective as the Schur option in terms of the number of iterations needed.

Shohei

Cody Permann

Oct 17, 2018, 8:58:57 PM
to moose...@googlegroups.com
Shohei, thanks for sharing. Looks like you are making good progress!

Shohei Ogawa

Oct 19, 2018, 2:59:34 PM
to moose...@googlegroups.com
Also, just for your information, I noticed that for my problem choosing the right line search algorithm is very important.
  • Giving "basic" or literally "none" to line_search results in very fast convergence: just several nonlinear iterations to a relative tolerance of 1e-8 (see the sketch after this list).
  • Giving "bt", "l2", or "default", or not setting "line_search = ..." at all, results in a very slow decrease of the non-linear residual, and I just canceled those simulations.
  • With "cp", the linear solve in the second non-linear iteration didn't converge at all.
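For reference, this is just the line_search parameter in my Executioner block, roughly like the sketch below (assuming a transient run; the rest of the block is omitted):

[Executioner]
  type = Transient
  solve_type = PJFNK
  line_search = 'none'   # or 'basic'; both avoid the bt/l2/cp line search heuristics
  nl_rel_tol = 1e-8
[]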
Thank you,
Shohei Ogawa


Fande Kong

Oct 19, 2018, 3:25:33 PM
to moose...@googlegroups.com
On Fri, Oct 19, 2018 at 12:59 PM Shohei Ogawa <ogawa...@gmail.com> wrote:
Also, just for your information, I noticed that for my problem choosing the right line search algorithm is very important.
  • Giving "basic" or literally "none" to line_search results in very fast convergence: just several nonlinear iterations to a relative tolerance of 1e-8.
This has been true for many applications. Good to know.

Thanks,

Fande,

 

Jed Brown

Oct 19, 2018, 3:26:19 PM
to Shohei Ogawa, moose...@googlegroups.com
Shohei Ogawa <ogawa...@gmail.com> writes:

> Also, Just for your information, I noticed for my problem choosing a
> correct line search algorithm is very important.
>
> - Giving "basic" or literally "none" to line_search results in very
> fast convergence. Within just several nonlinear iterations to a relative
> tolerance of 1e-8.
> - Giving "bt", "l2", or "default" or not having "line_search = ..."
> result in very slow decrease in the non-linear residual, and I just
> canceled the simulations.
> - With "cp", linear solve for second non-linear iteration didn't
> converge at all.

I haven't been following this thread, but is there a simple test problem
with this property, especially one that could be included in PETSc? As
usual, all bets are off for Newton globalization. The heuristics used
by these line searches make assumptions that are evidently not valid for
your problem, but there isn't a "smart" way to say when to remove those
assumptions. Trying to create such a thing could be a nice student
project.

Fande Kong

Oct 19, 2018, 3:35:17 PM
to moose...@googlegroups.com, ogawa...@gmail.com
Hi Shohei,

It would be great if you could share a simplified version (just a test) of your MOOSE-based code with us. I could rewrite it using PETSc. This may help the PETSc team gain more insight into the line search methods.

Fande,


Shohei Ogawa

Oct 19, 2018, 4:59:15 PM
to fdko...@gmail.com, moose...@googlegroups.com
That sounds like an interesting idea to me.

My code has gotten complicated recently and I am working on refactoring it. To simplify my model, maybe I should get rid of the complicated non-linear reaction part; just a constant reaction rate and the surface charge, which attracts species, might be enough. I believe the latter is what makes the problem difficult to solve. Also, I wonder whether the complexity of my problem somehow comes from the shape of the domain, so making the problem small and simple while keeping the same numerical behavior might be hard and take some time. I will let you know if I can come up with such a problem.

Thank you,
Shohei Ogawa

Alexander Lindsay

Oct 21, 2018, 11:19:53 AM
to moose...@googlegroups.com
Fande, we really do need to change the default for MOOSE.

Jed, we do have this example in PETSc...
https://bitbucket.org/petsc/petsc/pull-requests/983/create-example-illustrating-newton-stuck/diff

Jed Brown

Oct 21, 2018, 11:26:28 AM
to Alexander Lindsay, moose...@googlegroups.com
Alexander Lindsay <alexlin...@gmail.com> writes:

> Fande, we really do need to change the default for MOOSE.
>
> Jed, we do have this example in PETSc...
> https://bitbucket.org/petsc/petsc/pull-requests/983/create-example-illustrating-newton-stuck/diff

Yes (thanks), but the CP line search is the preferred method for this problem.

The example in the present thread is said to fail with all line searches
that have been tried.

Fande Kong

Oct 21, 2018, 1:02:32 PM
to moose...@googlegroups.com, Alexander Lindsay
Alex,

I totally agree with you that we should just use the "basic" line search method by default. Simply put, no line search is the best line search in most cases :). It is ironic.

Fande



Shohei Ogawa

Oct 23, 2018, 4:10:00 PM
to moose-users
FYI, I created some plots to show the residual behavior with different preconditioner settings.

With ASM + ILU, the non-linear residual plot is:

[asm_tmp_nl.png]

and the linear residual plot, made by plotting the linear residuals within each non-linear iteration, is:

[asm_tmp_lin.png]

Some spikes are due to restarts of GMRES.

With the Schur-type field split and BoomerAMG for all sub-problems, the non-linear and linear residual plots are as follows:

[fsp_schur_tmp_nl.png] [fsp_schur_tmp_lin.png]

With the field split, each linear solve finishes within 2 iterations, and I was able to solve the same problem with a tighter tolerance about 10 times faster (wall time). With the Schur-type field split each linear iteration takes much longer, so the speedup was only about 10x even though the total number of linear iterations was 100 times smaller. The number of nodes was ~5.3M for the problem I created the plots with. I saw a 3x speedup for a 20K-node problem I previously ran on my laptop. If I scale the problem up further, the speedup should become more significant, considering that I am using a multigrid preconditioner in addition to needing far fewer linear iterations thanks to the Schur-type field split.


Also, as I mentioned earlier in this thread, I tried the physics-based preconditioner with ASM for the H+ concentration and AMG for the O2 concentration and electric potential. This worked well for the smaller problem above, but it was not practical at all (very slow linear residual decrease) for the larger problem.


Thank you,

Shohei

John Peterson

Oct 23, 2018, 4:28:39 PM
to moose-users




Thank you for sharing your plots. If I understood correctly, the FieldSplit approach is anywhere from 3-10x faster than applying a "monolithic" (ASM + ILU) preconditioner for your particular problem? Do you have a plot of wall clock time or speedup vs. number of processors? I am curious about the scalability of FieldSplit with increasing numbers of procs, in addition to its performance for a single timestep.

Also, it may have been discussed previously in the thread, but is your problem non-dimensionalized or in some other way scaled so that each of the unknowns is comparable in size? It's interesting how much faster the "psi_electrolyte" residual decreases relative to the other unknowns, but that could simply be a scaling issue.

--
John

Fande Kong

Oct 23, 2018, 4:53:23 PM
to moose...@googlegroups.com
I agree with this. It is worthwhile to scale the variables to check whether it makes any difference.

Fande,
 


Shohei Ogawa

Oct 23, 2018, 5:18:23 PM
to moose...@googlegroups.com
Thank you for the replies.

As John mentioned,
>the FieldSplit approach is anywhere from 3-10x faster than applying a "monolithic" (ASM + ILU) preconditioner for your particular problem?
Yes, this is right according to what I observed. What I expect is that there will hopefully be more speedup as I increase the problem size, until the field split starts limiting the scaling.

>Do you have a plot wall clock time or speedup vs. number of processors?
No, I haven't done a scaling analysis on a cluster yet. That will probably become of interest after I move to a full-size problem on a supercomputer.

>is your problem non-dimensionalized or in some other way scaled so that each of the unknowns is comparable in size?
Yes, I scaled the variables so that the initial residuals for all the variables are on the order of 1. I changed the units of my problem at first, but there was still a lot of difference between the residuals, so I had to scale them with scaling factors.
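For reference, one way to do this in MOOSE is the scaling parameter on each variable; a rough sketch with placeholder factors (not my actual values) is:

[Variables]
  [c_O2]
    scaling = 1e0
  []
  [psi_electrolyte]
    scaling = 1e2
  []
  [c_H_plus]
    scaling = 1e3
  []
[]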

I am not sure if I can do some variable scaling analysis with the current dataset, but I will keep it in mind.

Thank you,
Shohei