segmentation fault

Nikita

Jul 22, 2011, 5:13:53 AM
to cp2k
Dear CP2K users and developers,

I have recently compiled CP2K on our university cluster. But the
calculation failed with the following message:
********************************************
--------------------------
OPTIMIZATION STEP: 2
--------------------------
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine  Line     Source
cp2k.popt          0000000001195583  Unknown  Unknown  Unknown
cp2k.popt          000000000104A51A  Unknown  Unknown  Unknown
cp2k.popt          00000000004CEDA1  Unknown  Unknown  Unknown
cp2k.popt          000000000040E631  Unknown  Unknown  Unknown
cp2k.popt          00000000004096BE  Unknown  Unknown  Unknown
cp2k.popt          000000000040868C  Unknown  Unknown  Unknown
libc.so.6          00007FED46734CF4  Unknown  Unknown  Unknown
cp2k.popt          0000000000408599  Unknown  Unknown  Unknown
************************************************

Could you suggest any solutions to my problem? By the way, here is my
arch file:

**********************************
INTEL_MKL = /opt/intel/mkl/10.2.4.032
INTEL_INC = $(INTEL_MKL)/include/fftw
INTEL_LIB = $(INTEL_MKL)/lib/em64t

CC = mpicc
CPP =
FC = mpif90
LD = mpif90
AR = /usr/bin/ar -r
DFLAGS = -D__INTEL -D__FFTSG -D__parallel -D__SCALAPACK -D__BLACS \
-D__FFTW3 -D__LIBINT -D__HAS_NO_ISO_C_BINDING
CPPFLAGS = -C -traditional $(DFLAGS) -I$(INTEL_INC)
FCFLAGS = $(DFLAGS) -I$(INTEL_INC) \
-O2 -xW -funroll-loops -fpp -free -heap-arrays 64
LDFLAGS = $(FCFLAGS) -I$(INTEL_INC) -i-static
LIBS = -L$(INTEL_LIB) -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 \
-lmkl_intel_lp64 -lmkl_sequential -lmkl_core \
/home/vakula/CP2K_NEW/fftw3_compiled/lib/libfftw3.a \
/home/vakula/CP2K_NEW/cp2k/tools/hfx_tools/libint_tools/libint_cpp_wrapper.o \
/home/vakula/CP2K_NEW/libint_compiled/lib/libderiv.a \
/home/vakula/CP2K_NEW/libint_compiled/lib/libint.a \
-lstdc++


OBJECTS_ARCHITECTURE = machine_intel.o
******************************

Thank you in advance,
Nikita Vakula
PhD student, Moscow State University

Jörg Saßmannshausen

Jul 22, 2011, 5:36:35 AM
to cp...@googlegroups.com
Hi Nikita,

the Intel compiler is notorious for causing segfaults.

Have you tried:

$ ulimit -s unlimited

That might cure the problem.
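
A minimal sketch of checking and lifting the limit, assuming a POSIX-style shell such as bash. The function name `raise_stack_limit` is made up for illustration. The setting only applies to the shell it runs in and to processes started from it, so it belongs in the script that launches the run rather than in the build session.

```shell
# Print the soft stack limit, try to lift it, and print it again.
# Run this in the same shell (or batch script) that starts the job.
raise_stack_limit() {
    echo "stack limit before: $(ulimit -s)"
    if ulimit -s unlimited 2>/dev/null; then
        echo "stack limit after:  $(ulimit -s)"
    else
        # The soft limit cannot exceed the hard limit set by the administrator.
        echo "could not raise the stack limit; the hard limit is $(ulimit -Hs)"
        return 1
    fi
}

raise_stack_limit || true
```

On a multi-node MPI job the ranks on remote nodes may not inherit the limit from the login shell, so it may also have to be set in the batch script or a shell start-up file, depending on how the cluster launches processes.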

Regards

Jörg

--
*************************************************************
Jörg Saßmannshausen
University College London
Department of Chemistry
Gordon Street
London
WC1H 0AJ

email: j.sassma...@ucl.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

Nikita

Jul 22, 2011, 6:06:51 AM
to cp2k
Hi Jörg,

thanks for your suggestion! I had already tried this before
compilation, but the error occurred all the same.
Are there any other tricks to overcome this obstacle?

Thank you in advance,
Nikita


Jörg Saßmannshausen

Jul 22, 2011, 6:56:18 AM
to cp...@googlegroups.com
Dear Nikita,

did you also set it before you started the run?

Unfortunately, it has been some time since I last compiled CP2K, so I don't
know off the top of my head. The stack-size limit is usually the culprit here.

All the best from a sunny London

Jörg


Peter Mamonov

Jul 22, 2011, 7:37:15 AM
to cp...@googlegroups.com
Hi Nikita!

Correct me if I'm wrong, but as far as I can see you are trying to build
CP2K for SKIF `Chebyshev`. If so, you can try this (working) arch file for
Chebyshev (see below). Also be sure to switch to Intel's compiler
using the command:

mpi-selector --set intel_mpi_intel64-4.0.0.025

and run your task with the `-as intel` option:

mpirun -as intel -blah -blah blah

Best regards,
Peter
--

# Chebyshev ARCH file

MKL_ROOT = /opt/intel/mkl/10.2.4.032
MKL_LIB = $(MKL_ROOT)/lib/em64t
MKL_INCLUDE = $(MKL_ROOT)/include

CC = cc


CPP =
FC = mpif90
LD = mpif90

AR = ar -r

DFLAGS = -D__INTEL -D__FFTSG -D__FFTMKL -D__parallel -D__BLACS -D__SCALAPACK -D__HAS_NO_ISO_C_BINDING

CPPFLAGS = -traditional -C $(DFLAGS) -I$(MKL_INCLUDE)/include

FCFLAGS = -fc=ifort $(DFLAGS) -I$(MKL_INCLUDE) -O2 -xHost -heap-arrays 64 -fpp -free

LDFLAGS = $(FCFLAGS) -L$(MKL_LIB)

LIBS = -Wl,--start-group \
$(MKL_LIB)/libmkl_scalapack_lp64.a \
$(MKL_LIB)/libmkl_blacs_intelmpi_lp64.a \
$(MKL_LIB)/libmkl_intel_lp64.a \
$(MKL_LIB)/libmkl_sequential.a \
$(MKL_LIB)/libmkl_core.a \
-static-mpi \
-Wl,--end-group

OBJECTS_ARCHITECTURE = machine_intel.o

# End of Chebyshev ARCH file
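
Before building with an arch file like this one, it may be worth confirming that the static MKL archives it names are actually present. The sketch below is only an illustration: the function name `check_mkl_libs` is made up, and the default directory is simply the `MKL_LIB` value from the arch file, which may differ on other machines.

```shell
# Report which of the static MKL archives named in the arch file are absent.
# The default directory is the MKL_LIB value from the Chebyshev arch file.
# The return status is the number of missing archives (0 means all found).
check_mkl_libs() {
    dir="${1:-/opt/intel/mkl/10.2.4.032/lib/em64t}"
    missing=0
    for name in mkl_scalapack_lp64 mkl_blacs_intelmpi_lp64 \
                mkl_intel_lp64 mkl_sequential mkl_core; do
        if [ ! -f "$dir/lib$name.a" ]; then
            echo "missing: $dir/lib$name.a"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Example: check_mkl_libs            (uses the default directory)
#          check_mkl_libs /some/other/mkl/lib/em64t
```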


Nikita

Jul 22, 2011, 9:01:06 AM
to cp2k
Hi Peter!

Thank you very much for the ARCH file! Yes, I'm trying to build it on
SKIF Chebyshev. But with your arch file I got these warnings:

********************
ifort: command line warning #10156: ignoring option '-static'; no argument required
make[1]: warning: Clock skew detected. Your build may be incomplete.
make[1]: Leaving directory `/home/vakula/CP2K_NEW/cp2k/obj/skif_chebyshev/popt'
make: warning: Clock skew detected. Your build may be incomplete.
********************
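
The clock-skew warnings usually mean that some files carry modification times ahead of the build host's clock, which can happen when the sources live on a network filesystem shared between machines whose clocks disagree. A rough way to list such files, assuming standard `find`, `mktemp` and `touch`; the function name `future_files` is made up for illustration:

```shell
# List files under a directory whose modification time is later than "now"
# on this machine; these are the files make complains about.
future_files() {
    ref=$(mktemp) || return 1      # fresh file stamped with the local time
    find "${1:-.}" -type f -newer "$ref"
    rm -f "$ref"
}

# Example: future_files /home/vakula/CP2K_NEW/cp2k
```

If it prints anything, running `touch` on those files (or rebuilding from clean once the clocks agree) is the usual remedy. Whether the skew has anything to do with the segfault is a separate question.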

Nevertheless, I tried to run my calculation with the resulting
executable (built with the Chebyshev arch file) and got the same problem:

********************
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine  Line     Source
cp2k.popt          00000000011ADFE0  Unknown  Unknown  Unknown
cp2k.popt          00000000010637CB  Unknown  Unknown  Unknown
cp2k.popt          00000000004EA0F1  Unknown  Unknown  Unknown
cp2k.popt          000000000042FAE1  Unknown  Unknown  Unknown
cp2k.popt          000000000042AB6E  Unknown  Unknown  Unknown
cp2k.popt          0000000000429B3C  Unknown  Unknown  Unknown
libc.so.6          00007FF4AAB0FCF4  Unknown  Unknown  Unknown
cp2k.popt          0000000000429A49  Unknown  Unknown  Unknown
********************

plus a new one:

*******************
[55:node-62-01] unexpected disconnect completion event from [7:node-41-03]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 55
[61:node-41-05] unexpected disconnect completion event from [7:node-41-03]
[62:node-41-05] unexpected disconnect completion event from [7:node-41-03]
[57:node-41-05] unexpected disconnect completion eventrank 4 in job 1 t60-2.parallel.ru_42663 caused collective abort of all ranks
exit status of rank 4: killed by signal 9
from [7:node-41-03]
[58:node-41-05] unexpected disconnect completion event from [7:node-41-03]
[56:node-41-05] unexpected disconnect completion event from [7:node-41-03]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 56
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 57
[60:node-41-05] unexpected disconnect completion event from [7:node-41-03]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
[63:node-41-05] unexpected disconnect completion event from [7:node-41-03]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 63
[51:node-62-01] unexpected disconnect completion event from [7:node-41-03]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 51
[52:node-62-01] unexpected disconnect completion event from [7:node-41-03]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 58
[59:node-41-05] unexpected disconnect completion event from [7:node-41-03]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 59
internal ABORT - process 60
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 61
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 62
[50:node-62-01] unexpected disconnect completion event from [7:node-41-03]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 50
[1:node-41-03] unexpected disconnect completion event from [33:node-62-07]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 1
[0:node-41-03] unexpected disconnect completion event from [33:node-62-07]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 0
[2:node-41-03] unexpected disconnect completion event from [33:node-62-07]
[3:node-41-03] unexpected disconnect completion event from [33:node-62-07]
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 3
Assertion failed in file ../../dapl_module_util.c at line 1593: 0
internal ABORT - process 2
rank 1 in job 1 t60-2.parallel.ru_42663 caused collective abort of all ranks
exit status of rank 1: killed by signal 9
rank 0 in job 1 t60-2.parallel.ru_42663 caused collective abort of all ranks
exit status of rank 0: killed by signal 9
******************************
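
A log like this is easier to read once the peers are tallied. The sketch below counts how often each `[rank:node]` is named as the side that disconnected; the function name `disconnect_peers` is made up, and it assumes the job output has been saved to a file. In the log above most ranks name `[7:node-41-03]`, so these aborts are possibly just a consequence of one rank dying first (for instance from the segfault) rather than a separate problem.

```shell
# Count how often each "[rank:node]" is named as the peer that disconnected.
# Lines are joined first because the log may have been wrapped in transit.
# Usage: disconnect_peers < job_output.txt
disconnect_peers() {
    tr '\n' ' ' |
        grep -o 'from \[[0-9]*:[^]]*\]' |
        sort | uniq -c | sort -rn
}
```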
Any suggestions? Do you use CP2K on SKIF yourself, and does it work
without any problems?

Thank you in advance,
Nikita

Peter Mamonov

Jul 22, 2011, 9:35:29 AM
to cp...@googlegroups.com
You can try adding the '-g' option to the compiler and linker options to
get subroutine names in the backtrace output (those lines `cp2k.popt
00000000011ADFE0 Unknown Unknown`). This will probably
shed light on the source of the problem.
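
If the runtime still prints `Unknown` after a rebuild with '-g', the addresses in the PC column can be looked up by hand with `addr2line` from GNU binutils. This is only a sketch: the two function names are made up for illustration, it assumes the traceback was saved to a text file, and it assumes the binary is linked so that the printed addresses map directly onto it. Intel Fortran also has a `-traceback` compile option that is meant to make the runtime print names and line numbers itself.

```shell
# Pull the program-counter column out of an ifort traceback for one image.
# Usage: traceback_addresses IMAGE_NAME < traceback.txt
traceback_addresses() {
    awk -v img="$1" '$1 == img && $2 ~ /^[0-9A-Fa-f]+$/ { print "0x" $2 }'
}

# Map each address to a function name and file:line.
# Only informative if the binary was compiled and linked with -g.
# Usage: resolve_traceback /path/to/cp2k.popt < traceback.txt
resolve_traceback() {
    traceback_addresses "$(basename "$1")" | xargs addr2line -f -C -e "$1"
}
```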

You can also try a working binary that I use for calculations (CP2K
version 2.2.134 (Development Version)). Copy it from
/home/mamonov/cp2k/cp2k/exe/SKIF-intel/cp2k.impi while logged in to a
cluster frontend node.

Peter

Nikita

Jul 26, 2011, 2:07:14 PM
to cp2k
Hi Peter!

I'm sorry for the delay in getting back to you! Thank you
very much for your reply! I've added the traceback and debug options to
the compiler flags and the problem has disappeared (the "-g" option didn't
give any result). Can you tell me how I can get your executable?
Thank you in advance,
Nikita
