Debugging NaNs in solution


Ian Bolliger

Apr 21, 2024, 8:49:23 PM
to claw-...@googlegroups.com
I’m running a tropical cyclone simulation through Geoclaw and getting an error that looks like this:

```
SOLUTION ERROR --- ABORTING CALCULATION
At ichecknan = 1
mx,my,t: 44 30 -323815.21902547235
m,i,j: 2 23 26
q(m,i,j) = NaN
```

This occurs in the middle of the run, after 8 frames have been output (~2 days). I’ve checked the topography files and the storm file and nothing jumps out. I’m wondering if anyone has seen something like this before and/or has tips on how to go about debugging this.

Kyle Mandli

Apr 22, 2024, 9:22:16 AM
to claw-...@googlegroups.com
Hi Ian,

This error comes from an internal check for NaNs, which is sometimes hit before the NaNs are seen anywhere else, in the output for instance. The best way to debug without a fancy debugger is to control the output of the steps more tightly: run up to the problematic point, then start outputting at every time step and see whether anything stands out. This can be done by restarting from a checkpoint with output_style = 3 (output every fixed number of steps), by listing the output times explicitly with output_style = 2, or by some combination of the two.
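
A minimal sketch of what those two output options might look like in a `setrun.py`. The helper names `dense_output_times` and `configure_debug_output`, the 60-second window, and the `total_steps` value are all hypothetical, and a `SimpleNamespace` stands in for `rundata.clawdata` so the snippet runs without Clawpack installed. The attribute names follow the usual Clawpack `setrun.py` conventions as I understand them; check them against the version in use.

```python
from types import SimpleNamespace


def dense_output_times(t_fail, window=60.0, dt_out=1.0):
    """Return ascending output times covering the `window` seconds before t_fail."""
    n = int(window / dt_out)
    # Stop one interval short of t_fail so the last frame precedes the abort.
    return [t_fail - k * dt_out for k in range(n, 0, -1)]


def configure_debug_output(clawdata, t_fail, every_step=False):
    """Set output options on a clawdata-like object (hypothetical helper)."""
    if every_step:
        # Intended for a run restarted from a checkpoint close to the failure.
        clawdata.output_style = 3          # output every N time steps
        clawdata.output_step_interval = 1  # N = 1: write every step
        clawdata.total_steps = 50          # placeholder number of steps
        clawdata.output_t0 = True
    else:
        clawdata.output_style = 2          # explicit list of output times
        clawdata.output_times = dense_output_times(t_fail)
    return clawdata


# Time taken from the error message above (t = -323815.219...).
clawdata = configure_debug_output(SimpleNamespace(), t_fail=-323815.219)
print(clawdata.output_times[0], clawdata.output_times[-1])
```

In a real run the same assignments would be made on `rundata.clawdata`. Writing every step only makes sense close to the failure time, which is why the checkpoint and restart settings in `setrun.py` are worth combining with `output_style = 3`.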

Kyle

berger Marsha

Apr 22, 2024, 10:17:56 AM
to claw-...@googlegroups.com
What does it look like right before the NaN? Is it blowing up? Is it at a boundary, or right near topography? Send a picture if you can.
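
Those questions can also be checked numerically rather than by eye. The following is only a sketch: `summarize_field` is a hypothetical helper, and it assumes one component of the solution on one patch has already been loaded into a plain 2D list of floats (how the frame is read is left out).

```python
import math


def summarize_field(q):
    """Return the largest finite magnitude and a list of non-finite cells.

    Each entry of the list is (i, j, on_edge), where on_edge says whether
    the cell sits on the boundary of the array.
    """
    max_abs = 0.0
    bad_cells = []
    ny = len(q)
    nx = len(q[0])
    for j, row in enumerate(q):
        for i, value in enumerate(row):
            if math.isfinite(value):
                max_abs = max(max_abs, abs(value))
            else:
                on_edge = i in (0, nx - 1) or j in (0, ny - 1)
                bad_cells.append((i, j, on_edge))
    return max_abs, bad_cells


# Tiny made-up field with one NaN in the interior.
field = [
    [0.1, 0.2, 0.1],
    [0.2, float("nan"), 0.3],
    [0.1, 0.2, 0.1],
]
max_abs, bad_cells = summarize_field(field)
print(max_abs, bad_cells)
```

A maximum that grows rapidly from frame to frame suggests a blow-up, while an empty list of bad cells right up to the abort suggests the NaN appears suddenly rather than developing over time.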

Marsha

Ian Bolliger

Apr 23, 2024, 1:33:05 AM
to claw-...@googlegroups.com
Here’s the full (periodic) domain at the last output point before the error gets raised. Nothing jumps out. I’ll follow Kyle’s suggestions and try to output more steps right up to the point at which the NaNs occur. I’m just using one very low-res global DEM right now as a test, so it shouldn’t be at the boundary of any topography.

PastedGraphic-1.png

Ian Bolliger

Apr 23, 2024, 1:33:05 AM
to claw-...@googlegroups.com
Thanks Kyle! Will take a look and see what I can find

Ian Bolliger

Apr 24, 2024, 3:25:37 PM
to claw-...@googlegroups.com
I followed Kyle’s suggestion and plotted the output from 1 second before the NaN error is raised. Nothing here looks abnormal to me. Very bizarre. I’ll keep debugging, but if you have any thoughts or suggestions, let me know! This is a very weak storm, if that helps at all. I’m also attaching the topography I’m using for this test, which is GEBCO 2023 downscaled to 1 degree. Also, for what it’s worth, I’ve been experimenting with this very low-resolution 1 degree grid for a periodic domain, and then allowing higher refinement within an area surrounding CONUS. This error occurs before any refinement, which is constrained by the `regions` configuration to start only when the storm gets close.
PastedGraphic-1.png

frame0002fig1004.png, frame0002fig1002.png, frame0002fig1001.png

berger Marsha

Apr 26, 2024, 12:03:31 PM
to claw-...@googlegroups.com
Have you tried running on one level, to see if it still blows up, or any other simplifications?

If all else fails I can take a look at it, if you can find a way to get me the files.

Marsha


Ian Bolliger

Apr 30, 2024, 7:50:42 PM
to claw-...@googlegroups.com
I finally solved it! I also learned how to use gdb in the process, so it was a useful exercise. It turns out the issue occurred because my storm input file happened to have a time point at which the storm center aligned perfectly with the center of a grid cell. When this happens, L623 of `model_storm_module.f90` has a divide-by-zero issue because `r` is 0. I think this yields an infinite pressure, which then eventually causes NaNs. With some debug flags added at compile time, the run now throws an error at that line rather than writing a NaN. I’ll file a PR to handle this case.
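
To illustrate the failure mode and one possible guard, here is a small Python sketch. It is not the code at L623 of `model_storm_module.f90`: the Holland-style pressure profile, the function names, the `r_floor` tolerance, and the parameter values are all assumptions chosen for illustration. Python raises `ZeroDivisionError` on a float division by zero, whereas compiled Fortran without floating-point trapping typically carries on with Inf or NaN, which is why the problem only surfaced later in the NaN check.

```python
import math


def unguarded_pressure(r, p_center, p_ambient, r_max_wind, b=1.5):
    """Holland-style pressure with no protection against r == 0."""
    return p_center + (p_ambient - p_center) * math.exp(-((r_max_wind / r) ** b))


def holland_pressure(r, p_center, p_ambient, r_max_wind, b=1.5, r_floor=1.0):
    """Same profile, guarded near the storm center.

    r_floor (meters) is a hypothetical tolerance. As r -> 0 the exponential
    term tends to 0, so the limiting value is the central pressure.
    """
    if r < r_floor:
        return p_center
    return unguarded_pressure(r, p_center, p_ambient, r_max_wind, b)


# Made-up storm parameters: pressures in Pa, radius of maximum winds in m.
p_c, p_n, rmw = 95000.0, 101300.0, 40000.0

try:
    unguarded_pressure(0.0, p_c, p_n, rmw)
except ZeroDivisionError:
    print("unguarded profile fails when the storm center sits on a cell center")

print(holland_pressure(0.0, p_c, p_n, rmw))
```

Returning the limiting value at the center is one option; clamping `r` to a small positive minimum before evaluating the profile is another. Which is appropriate depends on the expression actually used at that line.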

berger Marsha

May 1, 2024, 6:16:42 PM
to claw-...@googlegroups.com
Good work. (I still recommend my favorite debugger, though: TotalView, $550/year.)

Marsha
