I get a segfault with a psmp build of cp2k trunk. The corresponding
popt works. The problem occurs even when run as a single process and
with OMP_NUM_THREADS=1. This is what it looks like to gdb:
==========
...
Spin 1
Number of electrons: 17
Number of occupied orbitals: 17
Number of molecular orbitals: 17
Spin 2
Number of electrons: 16
Number of occupied orbitals: 16
Number of molecular orbitals: 16
Number of orbital functions: 169
Number of independent orbital functions: 169
Extrapolation method: initial_guess
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffef887700 (LWP 23493)]
__libc_free (mem=0x2020202000000001) at malloc.c:3709
3709 malloc.c: No such file or directory.
in malloc.c
(gdb) backtrace
#0 __libc_free (mem=0x2020202000000001) at malloc.c:3709
#1 0x000000000257a6ec in for_deallocate ()
#2 0x00000000016d413b in QS_COLLOCATE_DENSITY::L_qs_collocate_density_mp_calculate_rho_elec__917__par_region0_2_110 ()
at /home/andy/build/cp2k/cp2k/makefiles/../src/qs_collocate_density.F:1163
#3 0x0000000002634763 in L_kmp_invoke_pass_parms ()
#4 0x00007fffffff53fc in ?? ()
#5 0x00007fffffff53c4 in ?? ()
#6 0x00007fffffff5380 in ?? ()
...
==========
The CPU is a Core i7. The arch file and the input that triggers the
segfault (almost immediately after start) are here:
http://marge.uochb.cas.cz/~marsalek/tmp/W3-H3O-min-global.tar.gz
http://marge.uochb.cas.cz/~marsalek/tmp/Linux-x86-64-intel.psmp
The versions used for the build are:
cp2k trunk checked out today
OpenMPI 1.5
Intel Compiler 11.1.073 and corresponding MKL
I understand that this might be difficult or impossible to reproduce,
but I would be grateful for any suggestions on how to resolve
this.
Thanks,
Ondrej
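For reference, a backtrace like the one above can also be captured non-interactively. The following is a minimal sketch, assuming gdb is installed. The function name all_thread_backtrace and the GDB override variable are hypothetical helpers introduced here for illustration, and the paths in the usage comment are placeholders rather than the actual locations from this report.

```shell
#!/bin/sh
# Hypothetical helper: run a program under gdb in batch mode and, once the
# program stops (for example on SIGSEGV), print a backtrace of every thread.
# GDB is an assumed override variable for illustration; it defaults to "gdb".
all_thread_backtrace() {
    "${GDB:-gdb}" -batch \
        -ex run \
        -ex "thread apply all backtrace" \
        --args "$@"
}

# Usage (placeholder paths):
#   all_thread_backtrace ./cp2k.psmp W3-H3O.inp
```

Printing all threads rather than only the current one can help here, since the crash is reported from inside an OpenMP parallel region.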
It was my impression that the problem you mention is
sufficiently resolved by the compile flag that puts arrays on the heap
rather than the stack. I tried your suggestion anyway, but
unfortunately it did not change anything about my original problem.
Ondrej
This is a long shot, but have you tried MKL_NUM_THREADS=1 in addition to
OMP_NUM_THREADS?
Also try a different OpenMPI version, or simply a different MPI
implementation.
regards,
Alin
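One way to try this for a single run, without changing the login environment, is sketched below. This is only an illustration: run_single_threaded is a hypothetical helper name, and the executable and input names in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: run one command with both the OpenMP and the MKL
# thread counts set to 1 for that invocation only. The calling shell's own
# environment is left unchanged.
run_single_threaded() {
    env OMP_NUM_THREADS=1 MKL_NUM_THREADS=1 "$@"
}

# Usage (placeholder names):
#   run_single_threaded ./cp2k.psmp W3-H3O.inp
```

Passing the variables through env keeps the setting local to the command, so a later run is not silently affected by a leftover export.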
Alin Marin ELENA
Advanced Molecular Simulation Research Laboratory
School of Physics, University College Dublin
I am pretty sure that the MKL number of threads is not an issue
because I have the sequential version of MKL in my arch file. I've
tried anyway, just to be sure, but indeed got the same result.
> Also try a different OpenMPI version, or simply a different MPI
> implementation.
I would really rather stay with OpenMPI. It does not seem to be MPI
related, as it happens even in a single-process run, but I will also
try an ssmp build just to be sure.
Ondrej
http://marge.uochb.cas.cz/~marsalek/tmp/Linux-x86-64-intel.psmp
And the problem persists as described, regardless of the number of
OpenMP threads.
Any ideas how to get this working?
Thanks,
Ondrej
First, I would change the compiler to rule out a bug in the Intel compiler.
Then I would reduce the optimisation flags.
Alin
Thanks,
Ondrej
PS:
The threads can also be seen in gdb:
Starting program:
/home/andy/build/cp2k/cp2k/exe/Linux-x86-64-intel/cp2k.ssmp W3-H3O.inp
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff7fd3700 (LWP 32192)]
**** **** ****** ** PROGRAM STARTED AT 2011-01-11 14:38:08.422
***** ** *** *** ** PROGRAM STARTED ON cassandra
** **** ****** PROGRAM STARTED BY andy
***** ** ** ** ** PROGRAM PROCESS ID 32189
**** ** ******* ** PROGRAM STARTED IN /home/andy/W3-H3O-min-global
CP2K| version string: CP2K version 2.2.94 (Development Version)
CP2K| is freely available from http://cp2k.berlios.de/
CP2K| Program compiled at Tue Jan 11 14:00:00 CET 2011
CP2K| Program compiled on cassandra
CP2K| Program compiled for Linux-x86-64-intel
CP2K| Last CVS entry
CP2K| Input file name W3-H3O.inp
[New Thread 0x7ffff4fe4700 (LWP 32193)]
[New Thread 0x7ffff4be3700 (LWP 32194)]
[New Thread 0x7ffff47e2700 (LWP 32195)]
[New Thread 0x7fffeffff700 (LWP 32196)]
[New Thread 0x7fffefbfe700 (LWP 32197)]
[New Thread 0x7fffef7fd700 (LWP 32198)]
[New Thread 0x7fffef3fc700 (LWP 32199)]
This is a long shot, but I think the problem lies here:
nthread = 1
!$ nthread = omp_get_max_threads()
That will give you the maximum number of threads available on the system.
Use omp_get_num_threads() instead.
regards,
Alin
I had a look at it.
I compiled the cp2k 2_1 branch on my machine with the Intel compiler 11.1.073
and with the GNU compilers.
With Intel it crashes.
With the GNU compilers it works.
Why is a puzzle.
To make life simpler, I removed MKL and the Intel FFTW, compiled against plain
LAPACK/BLAS/FFTW3 instead,
and added debug flags and extra checks.
Here is the error I get:
forrtl: severe (408): fort: (2): Subscript #3 of the array R has value 64 which is greater than the upper bound of 63
Image PC Routine Line Source
cp2k.ssmp 00000000073BEFAD Unknown Unknown Unknown
cp2k.ssmp 00000000073BDAB5 Unknown Unknown Unknown
cp2k.ssmp 0000000007359200 Unknown Unknown Unknown
cp2k.ssmp 000000000730845A Unknown Unknown Unknown
cp2k.ssmp 0000000007308852 Unknown Unknown Unknown
cp2k.ssmp 0000000001C96A54 realspace_grid_ty 1993 realspace_grid_types.F
libiomp5.so 00007F277C231793 Unknown Unknown Unknown
and here is my arch file
alin@baphomet:~/lavello/cp2k/makefiles> cat ../arch/Linux-baphomet.ssmp
# by default some intel compilers put temporaries on the stack
# this might lead to segmentation faults if the stack limit is set too low
# stack limits can be increased by sysadmins or e.g. with ulimit -s 256000
# furthermore new ifort (10.0?) compilers support the option
# -heap-arrays 64
# add this to the compilation flags if the other options do not work
# The following settings worked for:
# - AMD64 Opteron
# - SUSE Linux Enterprise Server 10.0 (x86_64)
# - Intel(R) Fortran Compiler for Intel(R) EM64T-based applications, Version 10.0
# - AMD acml library version 3.6.0
# - MPICH2-1.0.5p4
# - FFTW 3.1.2
#
CC = icc
CPP =
FC = ifort -FR -openmp -O0 -g -heap-arrays -check all -debug all -traceback
LD = ifort -FR -openmp -O0 -g -heap-arrays -check all -debug all -traceback
AR = ar -r
DFLAGS = -D__INTEL -D__FFTSG -D__FFTW3
CPPFLAGS = -C -traditional $(DFLAGS) -I$(INTEL_INC)
FCFLAGS = $(DFLAGS) -I$(INTEL_INC)
LDFLAGS = $(FCFLAGS)
LIBS = -llapack -lblas -lfftw3
OBJECTS_ARCHITECTURE = machine_intel.o
Alin
I had a look at the cp2k branch too.
With the Intel compiler, as before, I still get a segfault, but
the error is different on one thread:
----------------------------------- OT ---------------------------------------
Step Update method Time Convergence Total energy Change
------------------------------------------------------------------------------
forrtl: warning (402): fort: (1): In call to DBCSR_MULT_M_E_E, an array temporary was created for argument #4
forrtl: warning (402): fort: (1): In call to DBCSR_MULT_M_E_E, an array temporary was created for argument #4
forrtl: severe (408): fort: (2): Subscript #1 of the array RIGHT_COL_MAP has value 2 which is greater than the upper bound of 1
Image PC Routine Line Source
cp2k.ssmp 0000000006F671BD Unknown Unknown Unknown
cp2k.ssmp 0000000006F65CC5 Unknown Unknown Unknown
cp2k.ssmp 0000000006F01410 Unknown Unknown Unknown
cp2k.ssmp 0000000006EB066A Unknown Unknown Unknown
cp2k.ssmp 0000000006EB0A62 Unknown Unknown Unknown
cp2k.ssmp 0000000006B1D64F dbcsr_internal_op 1624 dbcsr_internal_operations.F
cp2k.ssmp 0000000006AF631C dbcsr_internal_op 1155 dbcsr_internal_operations.F
libiomp5.so 00007FDFC2E67793 Unknown Unknown Unknown
This time even the GNU version crashes.
Strangely, it seems to run with 1 thread.
With more than 1 thread it crashes in different regions.
regards,
Alin
On Tue, 2011-01-11 at 14:49 +0100, Ondrej Marsalek wrote:
> Additional confusion: the problem occurs within an 'IF (nthread > 1)',
> even if I set OMP_NUM_THREADS=1. When I print nthread in that place, I
> always get '8'. Is cp2k supposed to honor the value of the env
> variable? If not, what is the proper way to set the number of threads?
The OMP_NUM_THREADS environment variable is not interpreted by CP2K but
by the threading library that implements OpenMP.
This variable is obtained by a call to OMP_GET_MAX_THREADS. Its return
value of 8 in your case seems to conflict with the specified behavior,
which is to return the number of threads to be used in (following) OMP
PARALLEL sections. These continue to use only one thread.
Cheers,
Urban.
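Since the OpenMP runtime reads OMP_NUM_THREADS from the process environment, a quick sanity check is to look at what a child process actually inherits. The sketch below is only an illustration: show_thread_env is a hypothetical helper name, and the loose match on "thread" is an assumption chosen so that misspelled variable names also show up.

```shell
#!/bin/sh
# Hypothetical helper: list every environment variable whose name or value
# mentions "thread", exactly as a child process would inherit it.
# A variable that was assigned in the shell but never exported does not
# appear, and a misspelled name shows up under its misspelled form.
show_thread_env() {
    env | grep -i 'thread' || echo "no thread-related variable is exported"
}

# Usage:
#   export OMP_NUM_THREADS=1
#   show_thread_env
```

If the expected line is missing from the output, the program started from that shell will not see the setting either.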
On Tue, 2011-01-11 at 19:45 +0000, Alin Marin Elena wrote:
> Hi Ondrej,
>
> Had a look at the cp2k branch too...
>
> with the intel compiler... the same as previous I still get a seg fault but
> the error is different on one thread...
> [...]
> this time even the gnu version crashes....
> strange it runs with 1 thread
> seems to run...
> with more than 1 crashes in different regions...
>
> regards,
> Alin
I just committed a bugfix for this to the CVS trunk. The bug affected
only the trunk and not the 2.1 branch.
Best regards,
Urban
On Wed, Jan 12, 2011 at 09:15, Urban Borštnik <urban.b...@gmail.com> wrote:
> Hello,
>
> On Tue, 2011-01-11 at 14:49 +0100, Ondrej Marsalek wrote:
>> Additional confusion: the problem occurs within an 'IF (nthread > 1)',
>> even if I set OMP_NUM_THREADS=1. When I print nthread in that place, I
>> always get '8'. Is cp2k supposed to honor the value of the env
>> variable? If not, what is the proper way to set the number of threads?
>
> The OMP_NUM_THREADS environment variable is not interpreted by CP2K but
> by the threading library that implements OpenMP.
>
> This variable is obtained by a call to OMP_GET_MAX_THREADS. Its return
> value of 8 in your case seems to conflict with the specified behavior,
> which is to return the number of threads to be used in (following) OMP
> PARALLEL sections. These continue to use only one thread.
Thanks for the info, it is what I expected but not what I was getting.
Turns out it was a silly typo on my part, sorry about that. Now I get the
expected behavior with respect to the number of threads, but still get
the original bug. I will prepare a simpler test case and post it
separately.
Ondrej