Strange floating point error


Corbin Foucart

Dec 4, 2020, 1:44:21 PM
to deal.II User Group

I'm attempting to compile and run deal.II step-35 on a CentOS cluster node (everything works fine on my desktop Ubuntu machine). The program runs fine at coarse refinement levels, but as I refine further, I get a floating point exception at run time (compilation succeeds without errors):

ibfdr-compute-0-0(step-35)% ./step-35.debug
Number of refines = 4
Number of active cells: 15360
dim (X_h) = 278592
dim (M_h) = 62144
Re        = 500

Floating point exception

Running under GDB, I get the debug information in the attached screenshot when the program crashes. It looks like deal.II is looking for a Fortran file at a location which doesn't exist on my cluster (/home/amd/loanamd29).

Does anyone have any idea
  1. Why this exception occurs only at certain refinement levels?
  2. Whether this is a problem with my deal.II build? I haven't encountered this problem on any of the other steps I've run.
Thank you!
Corbin

Screenshot from 2020-12-04 13-37-52.png

Wolfgang Bangerth

Dec 4, 2020, 5:15:05 PM
to dea...@googlegroups.com
On 12/4/20 11:44 AM, Corbin Foucart wrote:
>
> Does anyone have any idea
>
> 1. Why this exception occurs only at certain refinement levels?

Floating point exceptions happen when you try to do arithmetic on numbers
called "signaling NaNs", which are used as "invalid values". Because the
processor aborts the program when they are used, they're a good way to make
sure that uninitialized values are never used.

Some compilers/operating systems support them, and others don't, so they are
not available on every system. That might explain the difference between the
two systems you are using.

Where the exception actually happens in the code is not overly relevant
-- the question is why you give that function uninitialized memory. So in your
case, run the program in a debugger like you're already doing. Look at the
backtrace and go to the last function you were in inside your own program --
then inspect the values of all of the arguments you pass to the function being
called in that place and try to understand why one of them contains
NaNs.

Best
W.

--
------------------------------------------------------------------------
Wolfgang Bangerth email: bang...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/
