status of hybrid OpenMP+MPI version


Axel

May 8, 2011, 7:20:32 PM
to cp...@googlegroups.com
hi everybody,

i would very much appreciate it if somebody could comment 
on the status of the hybrid OpenMP+MPI version of cp2k.

i am compiling cp2k on RHEL 6.0 using either gfortran 4.4.4
as shipped with RHEL or intel fortran 11.1.072.

the 2.1 branch only compiles with gfortran. intel fortran abends
in the function make_threads in the file src/lib/dbcsr_methods.F,
claiming that FORALL is not compatible with !$OMP SINGLE.
gfortran does not complain, but the resulting binary gives
bogus results when used with more than one thread.

on the development branch, both gfortran and intel finish the 
compile successfully. however, only the serial version compiled
with gfortran seems to work correctly (perhaps by accident?),
while the parallel variant either hangs or dies in some mpi call.

before i start digging deeper into this, e.g. to see which individual
regtests fail and whether the code or the compiler is to blame, 
can somebody here clue me in on the circumstances
(arch, code version, compiler/OS versions) under which the OpenMP variant
and - more importantly - the hybrid MPI+OpenMP build 
are supposed to work?

thanks in advance,
     axel.


Urban Borštnik

May 9, 2011, 10:59:49 AM
to cp...@googlegroups.com
Hi Axel,

I successfully compile and run the development branch of CP2K in hybrid
MPI+OpenMP with gfortran (4.5.[02]) and MPICH2 (1.3.1). About 100 of
the regtests give varying results and most of these seem to be within
numerical noise.

I use the following defines and compiler options:

DFLAGS = -D__GFORTRAN -D__FFTSG -D__parallel -D__SCALAPACK -D__BLACS -D__USE_CP2K_TRACE -D__HAS_smm_dnn -D__HAS_NO_OMP_3 -D__LIBINT
FCFLAGS = -g -fopenmp -O3 -ffast-math -funroll-loops -ftree-vectorize -march=native -ffree-form -fcray-pointer $(DFLAGS) -I$(GFORTRAN_INC)

Also, avoid some threaded libraries such as GotoBLAS.
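
As a quick way to follow this advice, the threaded BLAS can also be pinned to a single thread via environment variables before launching CP2K (a sketch; these are the standard variable names for GotoBLAS and MKL, adjust for your library):

```shell
# Keep the linked BLAS single-threaded so that only CP2K's own
# OpenMP regions spawn threads:
export GOTO_NUM_THREADS=1   # GotoBLAS
export MKL_NUM_THREADS=1    # MKL, if linked with the threaded interface
```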

Cheers,
Urban

> --
> You received this message because you are subscribed to the Google
> Groups "cp2k" group.
> To post to this group, send email to cp...@googlegroups.com.
> To unsubscribe from this group, send email to cp2k
> +unsub...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/cp2k?hl=en.


Axel

May 9, 2011, 12:34:05 PM
to cp...@googlegroups.com
hi urban,

hmmm... interesting.
was this the very latest cvs code?
it almost looked to me as if there were some 
mpi calls within a multi-threaded region that 
were messing things up...

also, can you comment on what these two defines are for?
 -D__USE_CP2K_TRACE -D__HAS_smm_dnn 

this one is a bit of a surprise, since gfortran 4.3.x already claims OpenMP 3 compliance.
-D__HAS_NO_OMP_3

as far as threaded libraries go, what about MKL?
the last thing i remember is that it would not go multi-threaded
if it is called from within an OpenMP parallel region,
but that may be wishful thinking...

in any case, i'll give it a try.

thanks,
     axel.

Urban Borštnik

May 9, 2011, 4:50:22 PM
to cp...@googlegroups.com
Hi Axel,

On Mon, 2011-05-09 at 09:34 -0700, Axel wrote:
> [...]


> hmmm... interesting.
> was this the very latest cvs code?

Yes, this was today's CVS.

> it almost looked to me as if there were some
> mpi calls within a multi-threaded region that
> were messing things up...

I believe we don't (or shouldn't) use such calls. Also, you can
explicitly make CP2K call mpi_init_thread() instead of mpi_init()
(change .TRUE. to .FALSE. on line 479 of src/message_passing.F)--the
current default is a workaround for de facto MPI threading behavior but
is technically wrong. Also, I just committed a patch to declare the use
of MPI funneled mode instead of general mode.
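
For reference, the logic Urban describes amounts to something like the following (a sketch only, not a verbatim copy of src/message_passing.F; the error handling is schematic):

```fortran
! Sketch: requesting funneled MPI thread support and checking
! what the library actually provides.
INTEGER :: provided, ierr
CALL mpi_init_thread(MPI_THREAD_FUNNELED, provided, ierr)
IF (provided < MPI_THREAD_FUNNELED) THEN
   ! the MPI library was built without (sufficient) thread support;
   ! this is where an "inadequate level of thread support" abort
   ! would be raised
END IF
```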

> also, can you comment on what these two defines are for?
> -D__USE_CP2K_TRACE -D__HAS_smm_dnn

__USE_CP2K_TRACE uses CP2K's stack trace infrastructure on failed
assertions in the DBCSR sparse matrix (sub)library. To use it you must
also add the timings_mp.o module to the LIBS variable (used to get
around this and some other circular dependencies). While using it or
not should not affect behavior, there are OpenMP statements in stack
tracing and timing.

__HAS_smm_dnn is for using the Small Matrix Multiply library (see
tools/build_libsmm). It is not threaded so skipping it should have no
effect on the program's behavior.

> this one is a bit of a surprise, since already gfortran 4.3.x claims
> to be OpenMP 3 compliant.
> -D__HAS_NO_OMP_3

I used gfortran 4.3 until recently (where psmp also worked last time I
checked) and didn't update the defines...

> as far as threaded libraries go. what about MKL?
> last thing a remember is that it would not go multi-threaded
> if it is called from within an OpenMP multi-threaded region,
> but that may be wishful thinking...

I haven't tried MKL recently so I can't say. However, to rule out any
library issues, I would try standard netlib BLAS and Lapack...

Cheers,
Urban.

>
> in any case, i'll give it a try.
>
>
> thanks,
> axel.
>

Axel

May 9, 2011, 6:54:59 PM
to cp...@googlegroups.com


On Monday, May 9, 2011 4:50:22 PM UTC-4, Urban wrote:
Hi Axel,

On Mon, 2011-05-09 at 09:34 -0700, Axel wrote:
> [...]
> hmmm... interesting.
> was this the very latest cvs code?

Yes, this was today's CVS.


ok. i updated and tried again, but no luck so far.

> it almost looked to me as if there were some
> mpi calls within a multi-threaded region that
> were messing things up...

I believe we don't (or shouldn't) use such calls.  Also, you can
explicitly make CP2K call mpi_init_thread() instead of mpi_init()
(change .TRUE. to .FALSE. on line 479 of src/message_passing.F)--the
current default is a workaround  for de facto MPI threading behavior but
is technically wrong.  Also, I just committed a patch to declare the use
of MPI funneled mode instead of general mode.

i tried that, but now cp2k quits right away with a confusing error:

 CP2K|  MPI error 0 in Inadequate level of thread support is provided. : MPI_SUCCESS: no errors
 CP2K| Abnormal program termination, stopped by process number 0
 CP2K|  MPI error 0 in Inadequate level of thread support is provided. : MPI_SUCCESS: no errors
 CP2K| Abnormal program termination, stopped by process number 1
 CP2K|  MPI error 0 in Inadequate level of thread support is provided. : MPI_SUCCESS: no errors
 CP2K| Abnormal program termination, stopped by process number 2
 CP2K|  MPI error 0 in Inadequate level of thread support is provided. : MPI_SUCCESS: no errors
 CP2K| Abnormal program termination, stopped by process number 3
 CP2K|  MPI error 0 in Inadequate level of thread support is provided. : MPI_SUCCESS: no errors
 CP2K| Abnormal program termination, stopped by process number 4
 CP2K|  MPI error 0 in Inadequate level of thread support is provided. : MPI_SUCCESS: no errors
 CP2K| Abnormal program termination, stopped by process number 5
 CP2K|  MPI error 0 in Inadequate level of thread support is provided. : MPI_SUCCESS: no errors
 CP2K| Abnormal program termination, stopped by process number 6
 CP2K|  MPI error 0 in Inadequate level of thread support is provided. : MPI_SUCCESS: no errors
 CP2K| Abnormal program termination, stopped by process number 7

> also, can you comment on what these two defines are for?
>  -D__USE_CP2K_TRACE -D__HAS_smm_dnn

__USE_CP2K_TRACE uses CP2K's stack trace infrastructure on failed
assertions in the DBCSR sparse matrix (sub)library.  To use it you must
also add the timings_mp.o module to the LIBS variable (used to get
around this and some other circular dependencies).  While using it or
not should not affect behavior, there are OpenMP statements in stack
tracing and timing.

i have so far tried without.

__HAS_smm_dnn is for using the Small Matrix Multiply library (see
tools/build_libsmm).  It is not threaded so skipping it should have no
effect on the program's behavior.

ok. thanks for the info. this is a fairly new feature, right?
 

> this one is a bit of a surprise, since already gfortran 4.3.x claims
> to be OpenMP 3 compliant.
> -D__HAS_NO_OMP_3

I used gfortran 4.3 until recently (where psmp also worked last time I
checked) and didn't update the defines...

> as far as threaded libraries go. what about MKL?
> last thing a remember is that it would not go multi-threaded
> if it is called from within an OpenMP multi-threaded region,
> but that may be wishful thinking...

I haven't tried MKL recently so I can't say.  However, to rule out any
library issues, I would try standard netlib BLAS and Lapack...

well, in my experience those can get miscompiled just as easily,
so they are not necessarily safer than precompiled packages like MKL.

for good measure, i am linking with the serial version
now, but that didn't change the behavior.

it _does_ work fine in the sopt/popt binaries.

just to make sure we're on the same page: the inputs that
are giving me problems are the water benchmark inputs
in tests/QS/benchmark/.

the cp2k.?smp binaries work fine for OMP_NUM_THREADS=1,
but as soon as i enable more than one thread with cp2k.psmp
and multiple MPI tasks, i get NaNs or random energies.
cp2k.ssmp appears to be working for both intel 11.1 and gfortran 4.4.4
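
In other words, the failing and working launches look roughly like this (illustrative commands only; input taken from tests/QS/benchmark as above):

```shell
export OMP_NUM_THREADS=1
mpirun -np 2 ./cp2k.psmp H2O-32.inp   # works
export OMP_NUM_THREADS=6
mpirun -np 2 ./cp2k.psmp H2O-32.inp   # NaNs / random energies
```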

thanks,
    axel.

Axel

May 9, 2011, 10:27:21 PM
to cp...@googlegroups.com

I believe we don't (or shouldn't) use such calls.  Also, you can
explicitly make CP2K call mpi_init_thread() instead of mpi_init()
(change .TRUE. to .FALSE. on line 479 of src/message_passing.F)--the
current default is a workaround  for de facto MPI threading behavior but
is technically wrong.  Also, I just committed a patch to declare the use
of MPI funneled mode instead of general mode.

i tried that, but now cp2k quits right away with a confusing error:

 CP2K|  MPI error 0 in Inadequate level of thread support is provided. : MPI_SUCCESS: no errors
 CP2K| Abnormal program termination, stopped by process number 0

ok. i figured that one out by myself: i needed to compile a new MPI 
version with MPI thread support enabled (it wasn't by default).
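
For anyone else hitting the same abort, rebuilding OpenMPI with thread support looks roughly like this (a sketch; prefix and compiler choices are site-specific, and the flag is the one used by the 1.4.x series):

```shell
./configure --prefix=$HOME/openmpi-mt \
            --enable-mpi-threads        # off by default in OpenMPI 1.4.x
make -j4 && make install
# verify the result:
ompi_info | grep -i 'thread support'
```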

> also, can you comment on what these two defines are for?
>  -D__USE_CP2K_TRACE -D__HAS_smm_dnn

__USE_CP2K_TRACE uses CP2K's stack trace infrastructure on failed
assertions in the DBCSR sparse matrix (sub)library.  To use it you must
also add the timings_mp.o module to the LIBS variable (used to get
around this and some other circular dependencies).  While using it or
not should not affect behavior, there are OpenMP statements in stack
tracing and timing.

i have so far tried without.

i've tried with -D__USE_CP2K_TRACE and without. no difference.

> this one is a bit of a surprise, since already gfortran 4.3.x claims
> to be OpenMP 3 compliant.
> -D__HAS_NO_OMP_3

I used gfortran 4.3 until recently (where psmp also worked last time I
checked) and didn't update the defines...

and -D__HAS_NO_OMP_3 does seem to help with cp2k.ssmp

for good measure, i am linking with the serial version
now, but that didn't change the behavior.

yep. serial or threaded MKL doesn't make a difference.

i am now down to two cases:
cp2k.ssmp works fine with the development code as long
as -D__HAS_NO_OMP_3 is set. the same is true for cp2k.psmp
as long as i use only 1 MPI task. cp2k.psmp is also working
fine if i use only 1 OpenMP thread and multiple MPI tasks, but
using multiple MPI tasks _and_ multiple OpenMP threads gives me random
numbers. for example, the H2O-32.inp benchmark with 2 MPI tasks and 6 OpenMP threads gives:

  ----------------------------------- OT ---------------------------------------

  Step     Update method      Time    Convergence         Total energy    Change
  ------------------------------------------------------------------------------
     1 OT DIIS     0.15E+00    1.0  1942.49460508 -33766612.6904597133 -3.38E+07
     2 OT DIIS     0.15E+00    1.0   605.50197323 ******************** -2.64E+09
     3 OT DIIS     0.15E+00    1.1   160.86073279   4380968.2349330271  2.68E+09
     4 OT DIIS     0.15E+00    1.1   164.34481270    865533.1852639467 -3.52E+06
     5 OT DIIS     0.15E+00    1.1   428.43636015 -30478863.0777326487 -3.13E+07
     6 OT DIIS     0.15E+00    1.1  1094.51147521   3880172.3745214189  3.44E+07
     7 OT DIIS     0.15E+00    1.1    33.84945332 ******************** -1.37E+08
     8 OT DIIS     0.15E+00    1.1   164.71992666  -2837998.3761878917  1.30E+08
     9 OT SD       0.15E+00    1.1  2844.75780601 ******************** -1.30E+08
    10 OT SD       0.15E+00    1.1  3605.58561902   2964610.0015976029  1.36E+08
    11 OT DIIS     0.15E+00    1.1   209.53192021    796475.0800868375 -2.17E+06
    12 OT DIIS     0.15E+00    1.1   380.38070104 -26949187.3137775287 -2.77E+07

with 12 MPI tasks and 1 OpenMP thread the same binary gives the proper behavior:
  ----------------------------------- OT ---------------------------------------

  Step     Update method      Time    Convergence         Total energy    Change
  ------------------------------------------------------------------------------
     1 OT DIIS     0.15E+00    0.6     0.01909729      -529.3015728815 -5.29E+02
     2 OT DIIS     0.15E+00    0.8     0.01238424      -536.3043281413 -7.00E+00
     3 OT DIIS     0.15E+00    0.8     0.00874280      -540.9240856869 -4.62E+00
     4 OT DIIS     0.15E+00    0.8     0.00605219      -544.3264898596 -3.40E+00
     5 OT DIIS     0.15E+00    0.8     0.00458609      -546.1883072280 -1.86E+00
     6 OT DIIS     0.15E+00    0.8     0.00351827      -547.5582206572 -1.37E+00
     7 OT DIIS     0.15E+00    0.8     0.00273439      -548.5013690323 -9.43E-01
     8 OT DIIS     0.15E+00    0.8     0.00208753      -549.1681124465 -6.67E-01
     9 OT DIIS     0.15E+00    0.8     0.00176820      -549.4777508042 -3.10E-01
    10 OT DIIS     0.15E+00    0.8     0.00140542      -549.7607744914 -2.83E-01
    11 OT DIIS     0.15E+00    0.8     0.00119366      -549.9316698866 -1.71E-01
    12 OT DIIS     0.15E+00    0.8     0.00102681      -550.0679169205 -1.36E-01


could it be that there is some code path where some initializations that 
are required for "clean" OpenMP behavior are not executed if the MPI
rank /= 0?

this seems to be quite a generic problem and it puzzles me
why it may work on one installation and not another.

the only other significant difference i see relative to your setup
is that you are using MPICH2 and i am using OpenMPI (1.4.3), but
if that turns out to make the difference, google will probably have to 
rate this group adults-only after my next post.

thanks for your help,
     axel.


Urban Borštnik

May 11, 2011, 4:21:19 PM
to cp...@googlegroups.com
Hi Axel,

> [...]


> the only other significant difference that i see to your setup
> is that you are using MPICH2 and i use OpenMPI (1.4.3), but
> if that would make a difference, google will probably request
> that this group will become adult only after my next post.

Actually, I also get crashes with OpenMPI--but not with MPICH2. I
recompiled OpenMPI 1.4.3 with MPI user thread support, and I recompiled
BLACS and ScaLAPACK with it. As soon as I use more than 1 thread (only
tried popt) it crashes in a BLACS call.

I'll also give the OpenMPI 1.5 development branch a try.

Regards,
Urban


shoutian sun

May 11, 2011, 11:05:03 PM
to cp...@googlegroups.com
Hi,
1. For OpenMPI (1.4.2):
./configure --prefix=.../openmpi142 --enable-mpi-threads --enable-shared --with-threads=posix --enable-mpi-f90 CC=cc cpp=cpp CXX=c++ FC=ifort

2. for cp2k.popt:
INTEL_INC=/opt/intel/Compiler/11.1/072/mkl/include
FFTW3_INC=.../fftw322/include

CC       = cc
CPP      =
FC       = mpif90 -FR
#FC       = mpif90
LD = mpif90 -i_dynamic -openmp
AR       = ar -r
#DFLAGS   = -D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK -D__FFTW3
DFLAGS = -D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK -D__FFTW3 -D__LIBINT
CPPFLAGS = -C -traditional $(DFLAGS) -I$(INTEL_INC)
FCFLAGS = $(DFLAGS) -I$(INTEL_INC) -I$(FFTW3_INC) -O2 -xW -heap-arrays 64 -funroll-loops -fpp -free
FCFLAGS2 = $(DFLAGS) -I$(INTEL_INC) -I$(FFTW3_INC) -O1 -xW -heap-arrays 64 -funroll-loops -fpp -free
LDFLAGS = $(FCFLAGS) -I$(INTEL_INC) -L/opt/intel/mkl/10.1.0.015/lib/em64t
#
LIBS = -L/opt/intel/mkl/10.1.0.015/lib/em64t -lmkl_scalapack -lmkl_em64t -lmkl_blacs_openmpi_lp64 -lguide -lpthread -lstdc++\
.../fftw322/lib/libfftw3.a \
.../libint114/lib/libderiv.a \
.../libint114/lib/libint.a

OBJECTS_ARCHITECTURE = machine_intel.o


graphcon.o: graphcon.F
        $(FC) -c $(FCFLAGS2) $<
----------------------------------------------------------------
"..." stands for your own installation path.


Best regards,

Axel

May 12, 2011, 9:50:53 AM
to cp...@googlegroups.com
hi urban,

> Actually, I also get crashes with OpenMPI--but not with MPICH2.  I
> recompiled OpenMPI 1.4.3 with MPI user thread support, and I recompiled
> BLACS and ScaLAPACK with it.  As soon as I use more than 1 thread (only
> tried popt) it crashes in a BLACS call.

i think that one went away after defining -D__HAS_NO_OMP_3

> I'll also give the OpenMPI 1.5 development branch a try.


hmm... i have this nagging feeling that there may be a race condition somewhere
and that it just doesn't show up with MPICH.

experimenting some more, i see that the "degree of badness" in the total
energies, for example, is rather low with only 2 or 3 threads but gets large
with 6 or more.

on the other hand, i am _very_ impressed by how well the threading parallelization
scales (with the .ssmp binary), even to a very large thread count (i have a few
48-core machines to test on).

i hope there is an answer somewhere as to why hybrid parallelization crashes
with OpenMPI...

cheers,
    axel.


Axel

May 12, 2011, 10:00:05 AM
to cp...@googlegroups.com


On Wednesday, May 11, 2011 11:05:03 PM UTC-4, Ross, Sun wrote:
Hi,
1. For openmp (1.4.2):
./configure --prefix=.../openmpi142 --enable-mpi-threads --enable-shared --with-threads=posix --enable-mpi-f90 CC=cc cpp=cpp CXX=c++ FC=ifort

there is no need to hook ifort permanently into the OpenMPI installation.
the more elegant way is to symlink opal_wrapper to, e.g., mpiifort
and then add a file mpiifort-wrapper-data.txt to $OMPI_HOME/share/openmpi:
project=Open MPI
project_short=OMPI
version=1.4.3
language=Fortran 77
compiler_env=F77
compiler_flags_env=FFLAGS
compiler=ifort
extra_includes=
preprocessor_flags=
compiler_flags=
linker_flags= -static-intel -threads
libs=-lmpi_f77 -lmpi -lopen-rte -lopen-pal   -ldl   -Wl,--export-dynamic -lnsl -lutil -lm -ldl 
required_file=
includedir=${includedir}
libdir=${libdir}

 
2. for cp2k.popt:

we're after getting cp2k.psmp working here, not cp2k.popt.
 
INTEL_INC=/opt/intel/Compiler/11.1/072/mkl/include
FFTW3_INC=.../fftw322/include

CC       = cc
CPP      =
FC       = mpif90 -FR
#FC       = mpif90
LD = mpif90 -i_dynamic -openmp
AR       = ar -r
#DFLAGS   = -D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK -D__FFTW3
DFLAGS = -D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK -D__FFTW3 -D__LIBINT
CPPFLAGS = -C -traditional $(DFLAGS) -I$(INTEL_INC)
FCFLAGS = $(DFLAGS) -I$(INTEL_INC) -I$(FFTW3_INC) -O2 -xW -heap-arrays 64 -funroll-loops -fpp -free
FCFLAGS2 = $(DFLAGS) -I$(INTEL_INC) -I$(FFTW3_INC) -O1 -xW -heap-arrays 64 -funroll-loops -fpp -free
LDFLAGS = $(FCFLAGS) -I$(INTEL_INC) -L/opt/intel/mkl/10.1.0.015/lib/em64t
#
LIBS = -L/opt/intel/mkl/10.1.0.015/lib/em64t -lmkl_scalapack -lmkl_em64t -lmkl_blacs_openmpi_lp64 -lguide -lpthread -lstdc++\

this may cause some issues unless you set OMP_NUM_THREADS=1 by default. an IMO better
solution is to link all intel libraries statically (so you don't have to mess with LD_LIBRARY_PATH
after the compile) and use the sequential interface, e.g.:

LDFLAGS  = $(FCFLAGS) -static-intel
LIBS     = -L/opt/intel/Compiler/11.1/072/mkl/lib/em64t -Wl,--start-group,-Bstatic \
-lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 \
-lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group,-Bdynamic \
-lfftw3

this also works nicely for gfortran:
LDFLAGS  = $(FCFLAGS) 
LIBS     = -L/opt/intel/Compiler/11.1/072/mkl/lib/em64t -Wl,--start-group,-Bstatic \
-lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 \
-lmkl_gf_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group,-Bdynamic \
-lfftw3

the resulting executables work very well on our clusters. with and without
thread support in OpenMPI (n.b.: one of the really nice things about OpenMPI
is that i can swap the two MPI compiles without having to relink cp2k).

now if only the cp2k.psmp binary would work, too. i would be a very happy
camper and my colleagues would have no more excuses not to run cp2k jobs fast.

cheers,
    axel.

Jörg Saßmannshausen

May 12, 2011, 11:30:09 AM
to cp...@googlegroups.com
Dear all,

There is another way of doing it as well:
I have added these lines here:
compiler_args=-intel


project=Open MPI
project_short=OMPI
version=1.4.3
language=Fortran 77
compiler_env=F77
compiler_flags_env=FFLAGS
compiler=ifort
extra_includes=
preprocessor_flags=

compiler_flags=-pthread
linker_flags=


libs=-lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl -lgfortran


required_file=
includedir=${includedir}
libdir=${libdir}

to
$OMPI_HOME/share/openmpi/mpif77-wrapper-data.txt
(and similar for mpif90-wrapper-data.txt). All I need to do now is:
mpif77 -intel
and it calls the ifort compiler in my PATH and also links against
libgfortran (which is handy if you are using libraries built with gfortran
instead of ifort). It is probably similar to what Axel suggested.
You get some warnings but it is working (at least for me).


My two pennies from a sunny London

Jörg

--
*************************************************************
Jörg Saßmannshausen
University College London
Department of Chemistry
Gordon Street
London
WC1H 0AJ

email: j.sassma...@ucl.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

Axel

May 12, 2011, 12:02:16 PM
to cp...@googlegroups.com
joerg,

mixing fortran codes compiled with different fortran compilers is
a bad idea in my opinion, especially if they use the same name
mangling scheme (at least for fortran 77 style calls). you may be
in for some very nasty surprises.

> instead of ifort). It is probably similar to what Axel suggested.
> You get some warning but it is working (at least for me).

the warning is because intel fortran does not support the -pthread flag.

cheers,
     axel.

Axel

May 17, 2011, 12:44:02 PM
to cp...@googlegroups.com

perhaps the following helps to track down where things go wrong.

i just updated cp2k to the latest cvs and ran the H2O-32.inp benchmark
with 2 MPI tasks. the output is identical between OMP_NUM_THREADS=1 and
OMP_NUM_THREADS=2 until i reach this point:


with one thread:
  ----------------------------------- OT ---------------------------------------

  Step     Update method      Time    Convergence         Total energy    Change
  ------------------------------------------------------------------------------

  Electronic density on regular grids:       -255.9999972509        0.0000027491
  Core density on regular grids:              255.9999999997       -0.0000000003
  Total charge density on r-space grids:        0.0000027487
  Total charge density g-space grids:           0.0000027487


  Core Hamiltonian energy:                                        424.9326981392
  Hartree energy:                                                 577.5530255740
  Exchange-correlation energy:                                   -129.1348019384
  Coulomb (electron-electron) energy:                             261.5717237448
        Maximum deviation from MO S-orthonormality                    0.4229E-12
        Minimum/Maximum MO magnitude              0.3562E+00          0.3521E+01
     1 OT DIIS     0.15E+00    1.4     0.01909729      -529.3015728815 -5.29E+02


with two threads:

  ----------------------------------- OT ---------------------------------------

  Step     Update method      Time    Convergence         Total energy    Change
  ------------------------------------------------------------------------------

  Electronic density on regular grids:       -140.0786039192      115.9213960808
  Core density on regular grids:              255.9999999997       -0.0000000003
  Total charge density on r-space grids:      115.9213960805
  Total charge density g-space grids:         -82.9977475763


  Core Hamiltonian energy:                                        424.9326981392
  Hartree energy:                                                1368.0021602823
  Exchange-correlation energy:                                   -391.6132000948
  Coulomb (electron-electron) energy:                             197.3468594016
        Maximum deviation from MO S-orthonormality                    0.4229E-12
        Minimum/Maximum MO magnitude              0.3562E+00          0.3521E+01
     1 OT DIIS     0.15E+00    1.0     0.10264462        -1.3308363296 -1.33E+00


the funny thing is that 140.0786039192 + 115.9213960808 = 256.0000000000,
so it looks as if some information is accumulated in the wrong way.
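
a classic way to produce exactly this signature is an OpenMP accumulation without a reduction, where each thread's partial sum is fine by itself but the combination goes wrong. this is only a hypothetical illustration of the suspected class of bug, not actual cp2k code:

```fortran
! Hypothetical sketch: partial sums lost in a data race.
total = 0.0_dp
!$OMP PARALLEL DO                 ! missing REDUCTION(+:total)
DO i = 1, n
   total = total + rho(i)         ! racy update: one thread's contribution
END DO                            ! can overwrite the other's
!$OMP END PARALLEL DO
! with 2 threads, the two partial sums add up to the right answer
! (e.g. 140.08 + 115.92 = 256.00), but only one of them may survive
```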

can anybody lend me a hand in locating where this could happen?

as a reminder: this only goes wrong when using more than one thread _and_
more than one MPI task.

thanks,
    axel.





Matt W

May 18, 2011, 4:23:26 AM
to cp...@googlegroups.com
Hi Axel,

setting DISTRIBUTION_TYPE REPLICATED

in

CP2K_INPUT / FORCE_EVAL / DFT / MGRID / RS_GRID

should take the code through some different routines in summing the density. Seeing if it gives similar results might help localize the issue.
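
In input-file form, this corresponds to adding the following (a sketch of the section path given above; merge it into the existing FORCE_EVAL section of the benchmark input):

```
&FORCE_EVAL
  &DFT
    &MGRID
      &RS_GRID
        DISTRIBUTION_TYPE REPLICATED
      &END RS_GRID
    &END MGRID
  &END DFT
&END FORCE_EVAL
```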

Matt