Wrong Exit Solver Status in Flash at Higher Resolutions

90 views
Skip to first unread message

Piyush Sharda

unread,
Mar 12, 2019, 8:35:42 PM3/12/19
to KROMEusers
Dear Krome Users,

I am running Krome with my 3D MHD simulation of first stars in Flash. I am using the react_primordial_wD file to include Deuterium chemistry in my simulations. My lower resolution simulations (lrefine_max=10, 12 in Flash, with a sink density threshold of 2.3e-14 g/cm^3 and 6.4e-13 g/cm^3 respectively) just run fine and do not crash at all even after many thousand years of sink formation. However, the higher resolution runs with the same setup crash a few timesteps after the first sink is created. I am unable to figure out what can cause the crash only at higher resolutions (perhaps the cooling/heating at higher densities is getting unstable?). My cooling options are: H2, HD, CHEM, CIE, ATOMIC and my heating option is CHEM. I am using Ripamonti and Abel (2004) H2 opacities.

The error I am getting is:

ERROR: wrong solver exit status!
 istate:           1
 iter count:       10001
 max iter count:       10000
 SEE KROME_ERROR_REPORT file


As you can see, I have tried to increase the maximum number of iterations to 10000, but that does not help. Following is a part of the KROME_ERROR_REPORT from my lrefine_max=14 run (with a sink density threshold of 1.5e-11 g/cm^3):

 Species abundances
 **********************
    #                name         qty       dn/dt ninit
 **********************
    1    E                 0.206E+003 -0.466E-007  0.206E+003
    2    H-                0.142E-003  0.119E-008  0.142E-003
    3    D-                0.112E-006 -0.993E-008  0.116E-006
    4    H                 0.384E+013  0.254E+009  0.385E+013
    5    HE                0.619E+012 -0.842E-032  0.619E+012
    6    H2                0.115E+013 -0.254E+009  0.115E+013
    7    HD                0.359E+010  0.254E+009  0.349E+010
    8    D                 0.287E+010 -0.254E+009  0.298E+010
    9    H+                0.205E+003  0.958E-002  0.206E+003
   10    HE+               0.149E-022  0.842E-032  0.152E-022
   11    H2+               0.630E-003 -0.152E-006  0.628E-003
   12    D+                0.166E+000 -0.958E-002  0.169E+000
   13    HD+               0.810E-008 -0.598E-009  0.834E-008
   14    HE++              0.474E-048 -0.609E-057  0.476E-048
   15    CR                0.000E+000  0.000E+000  0.000E+000
   16    g                 0.000E+000  0.000E+000  0.000E+000
   17    Tgas              0.200E+004 -0.292E-008  0.200E+004
   18    dummy             0.000E+000  0.000E+000  0.000E+000

As I understand, wrong exit solver status with istate=1 is printed when inside the DLSODES function, ISTATE=-1 which implies 'excess work done, perhaps wrong MF'. In the krome_all.f90 file produced by the patch of krome to flash, MF=222 for internally generated Jacobian and sparsity. Should the MF value be something else for my higher resolution runs?

I tried increasing the relative tolerance (rtol) from its default value of 1d-4 to 1d-3, but that did not help either. The absolute tolerance (atol) is at 1d-20.

Please let me know if you can help me debug this crash, or need more information on the simulation.

Cheers,
Piyush

Daniel Seifried

unread,
Mar 13, 2019, 5:31:04 AM3/13/19
to Piyush Sharda, KROMEusers
Dear Piyush,

unfortunately it always gets tricky when all the different kind of physics including sinks are involved. You reach very high densities of up to 1e14, but I don’t know to which extend this could be a problem for the network itself.

In this context I have one question about how you choose your sink threshold density. According to Truelove 1998, the Jeans length corresponding to the maximum/sink threshold density should be resolved with at least 4 grid cells. I.e. rho_sink ~ 1/dx^2, where dx is the smallest cell size.

That means when going from level 10 to 12 and 14, each time the sink density should increase by a factor of 4^2 = 16. However, your increase is first a factor of 27 and then another factor of 23. Maybe the crash could be related to the fact that you introduce sinks only at too high densities. But that is so far only speculation.

Best,

Daniel


--
You received this message because you are subscribed to the Google Groups "KROMEusers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kromeusers+...@googlegroups.com.
To post to this group, send email to krome...@googlegroups.com.
Visit this group at https://groups.google.com/group/kromeusers.
To view this discussion on the web visit https://groups.google.com/d/msgid/kromeusers/b25cafaa-1e65-4749-aafb-a57f54fc343c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Christoph Federrath

unread,
Mar 13, 2019, 6:17:16 AM3/13/19
to Daniel Seifried, Piyush Sharda, KROMEusers

Hi Daniel,

the sink particle density threshold depends on the grid resolution *and* sound speed (or temperature) — see Eq. (32) in http://adsabs.harvard.edu/abs/2010ApJ...713..269F 

Your calculation below only applies to isothermal gas — which is not the case here.

Cheers,
Christoph

Stefano Bovino

unread,
Mar 13, 2019, 8:18:04 AM3/13/19
to Piyush Sharda, KROMEusers
Dear Piyush,

I’m also not sure this can depend on the sinks. What I see in your Krome_error_report is that you enter with a Tgas of 2e3 K and the temperature at the time of the error is still the same!
You have also a lot of H2. Can you give a try keeping rtol=1e-4 and atol = 1e-10?

Another thing you can try is to run a simple 0D model (the ones you find in Krome) with the same initial conditions and the same setup and catch the error in a simple computational environment.

Let me know if this helps. 

Cheers,

Stefano


Piyush Sharda

unread,
Mar 17, 2019, 1:47:21 AM3/17/19
to KROMEusers
Dear Stefano and Daniel,

Thank you for your valuable suggestions on this issue! After a few more runs at different resolutions, this is what I find:

1.) Increasing atol to 1e-10 and even 1e-5 still led to the high resolution runs crashing, although the crash occurs later and later after the sink is formed. The error at the time of the crash is same - 'wrong exit solver status'. These runs used react_primordial_wD, where Deuterium and its associated species/reactions are also included in the Chemistry.

2.) I did the same set of runs with the chemical network file react_primordial3 which is the most stable and double-checked, as mentioned in an earlier thread. Further, react_primordial3 has no species or reactions associated with Deuterium. None of the runs with this chemical network have failed. They are still running and have already passed the time by a marge margin, when the runs in step 1 crashed.

I tend to think that certain reaction(s) with Deuterium seem not to be converging at higher densities/temperatures. Do you suspect any issues with the react_primordial_wD network file? Additionally, in all the runs (step 1 and 2), once the sink particle is formed and starts accreting, the temperature in the cells which constitute the accretion shock around the sink rises to a few thousand Kelvin. The temperatures do not increase beyond 10000 K because ATOMIC cooling perhaps gets stronger. Can this temperature rise be an issue with certain chemical reactions in the react_primordial_wD file?

From the 0D test in krome (krome/tests/collapse) with these reaction networks, I find that neither of the runs crash, and run till they reach the maximum density allowed. Following is my initial setup for these runs:

Options.opt:
-n networks/react_primordial3
-cooling=H2,COMPTON,CHEM,ATOMIC,CIE
-heating=COMPRESS,CHEM
-H2opacity=RIPAMONTI
-gamma=FULL
-n networks/react_primordial3

test.f90:
  !INITIAL CONDITIONS
  krome_redshift = 30d0    !redshift
  ntot = 9086d0 !d0               !total density, cm-3
  Tgas = 260d0 !1d2               !temperature, K

  !initialize KROME (mandatory)
  call krome_init()

  !species default, cm-3
  x(:) = 1d-40

  !set individual species
  x(KROME_idx_H)         = 0.99918686d0*ntot
  x(KROME_idx_H2)        = 4.0849d-4*ntot
  x(KROME_idx_E)         = 8.80875d-7*ntot
  x(KROME_idx_Hk)        = 5.3591d-14*ntot
  !x(KROME_idx_Dk)        = 1.8817d-18*ntot
  x(KROME_idx_Hj)        = 8.80875d-7*ntot
  x(KROME_idx_HE)        = 0.083d0*ntot
  !x(KROME_idx_D)         = 2.59249d-5*ntot
  !x(KROME_idx_Dj)        = 1.8896d-11*ntot
  !x(KROME_idx_HDj)        = 3.0952d-20*ntot
  x(KROME_idx_HEj) = 2.5861d-44*ntot
  x(KROME_idx_H2j) = 5.6566d-16*ntot
  x(KROME_idx_HEjj) = 7.4453d-53*ntot
  !x(KROME_idx_HD)        = 7.4967d-8*ntot

[For react_primordial_wD, I also include cooling by HD and uncomment the D-based species in test.f90]


Please let me know what do you think about this analysis. 
Thank you for your help with this issue!

Kind Regards,
Piyush

Stefano Bovino

unread,
Mar 17, 2019, 7:54:50 AM3/17/19
to Piyush Sharda, KROMEusers
Hi Piyush,

Ok this is a good test. I think that if I have to bet the problem could come from the helium (new reactions) included in the network with deuterium, not necessarily the deuterium ones.

Can you give a try removing reaction 28, 29, and 30 which are mainly charge-exchange. In the past these produced some problems.

Then we can check if there is anything else related to deuterium, we should go step-by-step.

Cheers,

Stefano



Piyush Sharda

unread,
Mar 19, 2019, 7:53:04 AM3/19/19
to KROMEusers
Hi Stefano,

I followed your suggestion and removed reactions 28, 29 and 30 from react_primordial_wD and ran the same set of simulations at different resolutions. While the lower resolution run (res 12) runs fine and does not crash, the higher resolution runs (res 14 and 16) both crash because of the same error, after a few years of sink formation.

The KROME_ERROR_REPORT for res 16 run is:
    #                name         qty       dn/dt    ninit
 **********************
    1    E                 0.587E+003 -0.925E-005  0.587E+003
    2    H-                0.405E-003 -0.174E-004  0.373E-003
    3    D-                0.228E-004  0.261E-004  0.216E-004
    4    H                 0.694E+013 -0.425E+012  0.776E+013
    5    HE                0.149E+013  0.186E-034  0.149E+013
    6    H2                0.364E+013  0.425E+012  0.323E+013
    7    HD                0.770E+012 -0.425E+012  0.777E+012
    8    D                 0.369E+012  0.425E+012  0.363E+012
    9    H+                0.554E+003 -0.177E+002  0.556E+003
   10    HE+               0.265E-025 -0.186E-034  0.265E-025
   11    H2+               0.296E-002  0.824E-005  0.273E-002
   12    D+                0.332E+002  0.177E+002  0.313E+002
   13    HD+               0.159E-005 -0.527E-007  0.154E-005
   14    HE++              0.135E-060 -0.495E-069  0.135E-060

   15    CR                0.000E+000  0.000E+000  0.000E+000
   16    g                 0.000E+000  0.000E+000  0.000E+000
   17    Tgas              0.200E+004  0.748E-007  0.151E+004

   18    dummy             0.000E+000  0.000E+000  0.000E+000
 **********************

The KROME_ERROR_REPORT for res 14 run is:
    #                name         qty       dn/dt    ninit
 **********************
    1    E                 0.259E+003  0.133E-005  0.260E+003
    2    H-                0.176E-003  0.337E-006  0.181E-003
    3    D-                0.720E-005 -0.176E-005  0.498E-008
    4    H                 0.200E+013  0.183E+011  0.244E+013
    5    HE                0.164E+012  0.724E-025  0.164E+012
    6    H2                0.274E+012 -0.183E+011  0.135E+012
    7    HD                0.426E+011  0.183E+011  0.969E+007
    8    D                 0.775E+011 -0.183E+011  0.533E+008
    9    H+                0.249E+003  0.165E+001  0.260E+003
   10    HE+               0.233E-015 -0.724E-025  0.234E-015
   11    H2+               0.352E-003 -0.141E-004  0.414E-003
   12    D+                0.983E+001 -0.165E+001  0.554E-002
   13    HD+               0.493E-006 -0.592E-007  0.359E-009
   14    HE++              0.169E-048 -0.274E-057  0.171E-048

   15    CR                0.000E+000  0.000E+000  0.000E+000
   16    g                 0.000E+000  0.000E+000  0.000E+000
   17    Tgas              0.200E+004  0.662E-008  0.185E+003

   18    dummy             0.000E+000  0.000E+000  0.000E+000
 **********************

If you focus on the error report for res 14, you will find that the temperature increases from 185 K to 2000 K in a single step, and the abundances of D-, HD, D and D+ all increase by 3 orders of magnitude. While the final temperature in the res 16 run at the error is 500 K more than the initial temperature, there is no jump in the abundances of any species. Are these changes in temperature in a single timestep suspicious? Do you find anything conclusive from these runs?

Cheers,
Piyush

piyush sharda

unread,
Nov 26, 2019, 7:02:49 AM11/26/19
to KROMEusers, Christoph Federrath, Mark Krumholz
Dear Stefano, Daniel and Krome Users,

It has been a while since we discussed this issue, please accept my apologies for reopening this thread. I am glad to inform you that we have been able to crack down the cause of the crash discussed in this thread. It turns out that the reactions that caused non-convergence in the DLSODES solver used by Krome are reactions no. 33 and 34 in react_primordial_wD, taken from Glover et al. 2009: H2 + D ---> HD + H. We find that in "all" of our simulations that crash, the output temperature is "always" 2000 K and this is the reaction with the highest flux (defined as the product of number densities of all reactants and products with the rate coefficient). Hence, if this reaction does not converge, it is not a surprise to find that the simulation cannot continue further. As I remarked earlier, this crash does not occur with react_primordial3 that does not have Deuterium.

This reaction has been adopted in Krome from Appendix A, Table A.10 of Glover et al. 2009 (reaction IX17, appendix attached for reference). The original experimental data for this reaction comes from Mielke et al. 2003. Glover et al. introduced a temperature breakpoint in the rate coefficient formulation of this reaction at T = 2000 K; the rate coefficient as defined in Krome for T <= 2000 K is a fit by Glover et al. whereas that defined for T > 2000 K is the original fit done by Mielke et al. for their data that exists for 1100 - 2200 K. Thus, in reality, the equation that defines the rate coefficient for T > 2000 K is valid for 1100 < T < 2000 as well. Hence, there does not seem to be a reason to have a breakpoint in this reaction's rate coefficient formulation at T = 2000 K. While Glover's fit is needed for lower temperatures ( T < 1000 K) to avoid the very steep decline in the rate coefficient given by Mielke's fit, Mielke's fit can nicely describe the rate coefficient for all T > 1100 K. The reason the solver cannot converge near T = 2000 K is because of the discontinuity in the rate coefficient between the two parts at T = 2000 K. Attached figure explicitly shows this discontinuity at 2000 K in the rate coefficient equation. The solid black curve is Glover's fit, solid blue is Mielke's fit and vertical green line is the current position of the breakpoint. If the breakpoint is moved to 1167 K where the two curves have a real solution (vertical gray line), the solver is able to converge and the crash does not occur anymore in "any" of the simulations that crashed earlier. We have rigorously tested this hypothesis and find that just changing the breakpoint in this reaction to ensure continuity (and keeping everything else same) ensures the simulations can run well past the point of crash. Moreover, since Mileke's original fit is defined for 1100 K and above, moving the breakpoint does not lead to an erroneous calculation in the rate coefficient. Even though the difference in the rate coefficients at 2000 K for the two parts is tiny (of the order of 1e-11), the reaction flux is significant at high densities of H, H2 and HD, specially in the high resolution runs.

I hope I could convey the reason and our solution to avoid the crash. Please let me know what you think.


Regards,
Piyush

  


For more options, visit https://groups.google.com/d/optout.


--

Piyush Sharda, FRAS, PHF
PhD Student,
A1.05, EOS House,
Mt Stromlo Observatory, Research School of Astronomy and Astrophysics,
The Australian National University,
Cotter Road, Canberra, ACT 2612
krome_reactionh2d.pdf
AppendixA_Glover_2009.pdf

Stefano Bovino

unread,
Nov 26, 2019, 7:13:12 AM11/26/19
to piyush sharda, KROMEusers, Christoph Federrath, Mark Krumholz
Dear Piyush,

Thanks for digging into that. First of all, I have to admit that some of the networks coming with Krome have been tested only in specific cases, for instance the one with Deuterium is probably one of the less tested and we used mainly in simple 0d collapse problems where as you know temperature are always < 2000 K (Omukai-like).

In any case, what you said does indeed make sense to me, and it is reasonable to change that rate. I’ll probably ensure that the divergence behavior above 2000 K is real, but if you check the original paper it could be the case.
Concerning deuterium I think there have been recently also some updates to some of the reactions, but I’m not sure this is one of those.

I apologize for the time you had to spend on this, but as you know network is really the only thing Krome users should take care of and it is normally a big pain.

Thanks for spotting this. And thanks for using Krome. 

Cheers,
Stefano

To view this discussion on the web visit https://groups.google.com/d/msgid/kromeusers/CAAJjd5t0R%2Bw0ofPPYdMAmTZftQ0JW197daFEh60g%2BgDP58gKug%40mail.gmail.com.
<krome_reactionh2d.pdf><AppendixA_Glover_2009.pdf>

piyush sharda

unread,
Nov 26, 2019, 7:36:11 PM11/26/19
to Stefano Bovino, KROMEusers, Christoph Federrath, Mark Krumholz
Dear Stefano,

Thank you for your assessment of the fix. Krome has proved to be an integral package for our work on star formation simulations and we are committed to help you make it even more robust so that it can be used with various different applications. I hope discussions on this thread will be of benefit to other users interested in using Deuterium chemistry in their research.

Cheers,
Piyush


--

PhD Student,
Reply all
Reply to author
Forward
0 new messages