MPI job exited with code: 174

Nadeem Malik

Mar 29, 2023, 8:36:24 PM
to Nek5000
Hi Neks,

I am getting unexpected errors when running my code. Can anyone help? Thanks.

I get a severe error when I run the code. It ran perfectly well on a 64x64 grid with polynomial order 14; now I want to run on a 128x128 grid with p-order 8, which sounds straightforward. I am running on 2 nodes with 160 ranks. I have updated SIZE accordingly and recompiled the code, and I have reduced the time step as well. Samples of the error messages are below.

In the *.e log files I get:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
libifcoremt.so.5 00002B55146673BC for__signal_handl Unknown Unknown
libpthread-2.17.s 00002B55141D5630 Unknown Unknown Unknown
nek5000 00000000005F11B7 Unknown Unknown Unknown
nek5000 0000000000407C64 Unknown Unknown Unknown

Stack trace terminated abnormally.
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
libifcoremt.so.5 00002B7F845DE3BC for__signal_handl Unknown Unknown
libpthread-2.17.s 00002B7F8414C630 Unknown Unknown Unknown
nek5000 00000000005F11B7 Unknown Unknown Unknown
nek5000 0000000000407C64 Unknown Unknown Unknown

......

And in the *.o logfile, I get the error:
set initial conditions
nekuic (1) for ifld 1
call nekuic for vel
xyz min -3.1416 -3.1416 0.0000
uvwpt min 0.0000 0.0000 0.0000 0.0000 0.0000
PS min 0.0000 0.99000E+22
xyz max 3.1416 3.1416 0.0000
uvwpt max 0.0000 0.0000 0.0000 0.0000 0.0000
PS max 0.0000 -0.99000E+22

done :: set initial conditions

call userchk
0 Error Hmholtz psi 1900 3.5523E-03 5.1324E+00 1.0000E-09
0 Error Hmholtz psi 1900 3.5523E-03 5.1324E+00 1.0000E-09
0 0.000000E+00 t Time
7 0.000000E+00 0.000000E+00 8.566837E-05 Div er
7 2.615536E+00 2.615536E+00 0.000000E+00 0.000000E+00 u,V err

0 0.0000E+00 Write checkpoint
FILE:
/scratch/06396/tg856952/TR_2D_01/voruniform0.f00001

0 0.0000E+00 done :: Write checkpoint
file size = 20. MB
avg data-throughput = 437.6MB/s
io-nodes = 160
div: davg: 0.0000E+00 0.0000E+00 0.0000E+00
0 0.00000E+00 0.00000E+00 0.00000E+00 Infinity Infinity cdiv
0 0 -2.403E+00 2.403E+00 0.000E+00 0.000E+00 Infinity 0.000E+00 divmnmx
TACC: MPI job exited with code: 174
TACC: Shutdown complete. Exiting.

Nadeem Malik

Mar 30, 2023, 2:20:21 PM
to Nek5000
This problem is resolved.

Hashnayne Ahmed

May 25, 2023, 9:36:45 PM
to Nek5000
Hi Nadeem,

I am facing the same issue: severe (174), segmentation fault occurred. Could you please share what causes this error and how you solved it?
I assume it is something in the SIZE file. Thanks.
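
For reference, here is my rough guess at the SIZE entries that would have to change when going from a 64x64 mesh at order 14 to a 128x128 mesh at p-order 8 on up to 160 ranks. The values below are only assumptions worked out from the figures quoted above, not the actual file from that run:

c     Illustrative SIZE fragment (guessed values, not the real file).
c     In Nek5000, lx1 = polynomial order + 1; if "p-order 8" was meant
c     as lx1 = 8, use 8 below instead of 9.
      parameter (ldim=2)          ! 2D run
      parameter (lx1=9)           ! velocity GLL points per direction
      parameter (lxd=14)          ! dealiasing points, roughly 3/2*lx1
      parameter (lx2=lx1-2)       ! pressure points (lx1 or lx1-2)

      parameter (lelg=16384)      ! max global elements, must cover 128*128
      parameter (lpmin=2)         ! smallest rank count the build may use
      parameter (lpmax=160)       ! largest rank count (160 ranks were used)
      parameter (ldimt=1)         ! temperature + passive scalars

c     Older SIZE files also set lelt (max elements per rank) by hand;
c     it must be at least lelg divided by the smallest rank count used
c     (16384/160 is about 103 elements per rank at 160 ranks).

If lelg (or lelt, where it is set by hand) is still sized for the old 4096-element mesh, the per-rank arrays overflow and the run dies with exactly this kind of SIGSEGV, so that is what I would check first.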

Hashnayne