Running CDFT Tutorial Calculation on Cluster

Brian Day

Jul 20, 2018, 11:24:22 AM
to cp2k
Hi all,

I am trying to run the following CDFT tutorial (https://www.cp2k.org/howto:cdft) for water dimers on my university's cluster and am getting an error I do not understand. For clarity, the issue is not with submitting the file, but with CP2K itself. I have also successfully completed the tutorial locally, which is another reason I am confused by the error.

My process for running this on the cluster was as follows:
     - Duplicate the energy.bash script and modify it to generate a series of input files (one for each CP2K run)
     - Modify the slurm file to submit each input file individually (i.e., rather than running them with a single bash script as when running locally, I wanted to run each input with its own submission to the cluster)
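For context, the generation step amounted to something like the sketch below (this is not the attached cdft-buildfiles.bash; the template name, the @NAME@ placeholder, and the run list are all illustrative):

```shell
#!/bin/bash
# Hypothetical sketch of generating one CP2K input per run from a shared
# template. All file and placeholder names here are invented for illustration.
cat > energy_run.inp.template <<'EOF'
&GLOBAL
  PROJECT @NAME@
  RUN_TYPE ENERGY
&END GLOBAL
EOF

# One input file per run; each gets its own slurm submission.
for name in standard cte_noadj; do
  sed "s/@NAME@/${name}/" energy_run.inp.template > "energy_run_${name}.inp"
done
```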

I have attached the relevant files.

I am able to run the first calculation (energy_run_standard.inp) successfully, but when I try to run the second one (energy_run_cte_noadj.inp) I get the following error:

 Possible matches for unknown keyword


 MAX_LS


   keyword MAX_LS in section %__ROOT__%FORCE_EVAL%DFT%SCF%OUTER_SCF%CDFT_OPT score:  44

   keyword MAX_LS in section %__ROOT__%FORCE_EVAL%DFT%QS%CDFT%OUTER_SCF%CDFT_OPT score:  44

   keyword MAX_LS in section %__ROOT__%FORCE_EVAL%DFT%XAS%SCF%OUTER_SCF%CDFT_OPT score:  44

   keyword MAX_SCF in section %__ROOT__%FORCE_EVAL%DFT%SCF%OUTER_SCF score:  13

   keyword MAX_SCF in section %__ROOT__%FORCE_EVAL%DFT%QS%CDFT%OUTER_SCF score:  13


 *******************************************************************************

 *   ___                                                                       *

 *  /   \                                                                      *

 * [ABORT]                                                                     *

 *  \___/          found an unknown keyword MAX_LS in section OUTER_SCF        *

 *    |                                                                        *

 *  O/|                                                                        *

 * /| |                                                                        *

 * / \                                               input/input_parsing.F:246 *

 *******************************************************************************



 ===== Routine Calling Stack =====


            7 section_vals_parse

            6 section_vals_parse

            5 section_vals_parse

            4 section_vals_parse

            3 section_vals_parse

            2 section_vals_parse

            1 read_input

application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0


I checked my input files, and MAX_LS is never called in either of those sections.

Any help in understanding this would be greatly appreciated!

Regards,
     Brian
cdft-buildfiles.bash
cp2k-v2.slurm
dft-common-params.inc
energy_run_cte_nosizeadj.inp
energy_run_cte_nosizeadj.out
energy_run_standard.inp
energy_run_standard.out
energy_run.inp
subsys.inc

Nico Holmberg

Jul 21, 2018, 4:39:50 AM
to cp2k
Hi,

My bad. I made some changes to the CDFT input structure between CP2K versions 5.1 and 6.1 and forgot to update the tutorial. You probably have different versions locally and on your cluster. All CDFT-related keywords were moved from the OUTER_SCF section into a new CDFT_OPT subsection. If you move the relevant keywords in the file becke_qs.inc, the calculation should run without issue.
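Schematically, the change looks like this (MAX_LS is the keyword from your error message; the value is illustrative, and any other optimizer keywords move the same way):

```
! CP2K 5.1 and earlier: CDFT optimizer keywords sat directly in OUTER_SCF
&OUTER_SCF
  MAX_LS 5        ! illustrative value
&END OUTER_SCF

! CP2K 6.1: the same keywords belong in the CDFT_OPT subsection
&OUTER_SCF
  &CDFT_OPT
    MAX_LS 5
  &END CDFT_OPT
&END OUTER_SCF
```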

I'll update the tutorial on Monday and upload new files for CP2K 6.1.


BR,

Nico

Brian Day

Jul 23, 2018, 10:25:18 AM
to cp...@googlegroups.com
Thank you! 

-Brian

Nico Holmberg

Jul 24, 2018, 1:54:52 AM
to cp2k
Hi,

Just as a heads up. I've updated the tutorial and uploaded new versions of the example input files. Changes between CP2K versions 5.1 and 6.1 are indicated.

Let me know if you encounter any further issues with CDFT calculations.


BR,

Nico


Brian Day

Jul 24, 2018, 2:33:30 PM
to cp...@googlegroups.com
Hi Nico, 

Two follow up questions:

1. When I went to run the fragment-based CDFT calculations, I got the following error:

 Reading the cube file:


 water-dimer-frag-a-pbe-energy-ELECTRON_DENSITY-1_0.cube



 Reading the cube file:


 water-dimer-frag-b-pbe-energy-ELECTRON_DENSITY-1_0.cube



 *******************************************************************************

 *   ___                                                                       *

 *  /   \                                                                      *

 * [ABORT]                                                                     *

 *  \___/        The number of electrons in the reference and interacting      *

 *    |       configurations does not match. Check your fragment cube files.   *

 *  O/|                                                                        *

 * /| |                                                                        *

 * / \                                                   qs_cdft_methods.F:958 *

 *******************************************************************************


Visually inspecting the files, there didn't seem to be any issue with either (I've attached each of them). But it may be related to the second issue I ran into.

2. When I tried to run each of the fragments for the charge transfer energy, the calculation portion ran fine, but I got a SIGSEGV error (forrtl: severe (174): SIGSEGV, segmentation fault occurred) when CP2K tried to write the electron density cube files. It would simply output a blank file and then break. I was able to work around this by passing in a blank file of the appropriate name (giving me the files above), but it seemed like an odd error. I'm not sure whether this has something to do with the way CP2K 6.1 is compiled on our cluster, or whether it is an artifact of the way the code was written. I have a ticket regarding this issue open with the computing center here, and can update if they let me know anything useful.

I also can recreate the error and send the full output file if that would be of any benefit to you.

Thanks and Regards,
     Brian
water-dimer-frag-b-pbe-energy-ELECTRON_DENSITY-1_0.cube
water-dimer-frag-a-pbe-energy-ELECTRON_DENSITY-1_0.cube

Brian Day

Jul 25, 2018, 3:59:27 PM
to cp2k
Hi Nico,

I was able to finish the tutorial by generating those electron density cube files in a separate step with an older version of CP2K and then using them to complete the tutorial. For whatever reason, v6.1 seems to have issues with generating these cube files. If this is an error you are familiar with and you have any compiling advice I can pass on to our computing center, that would be great, as they seem unsure of how to handle it. Regardless, thanks again for your help, and I appreciate your updating the tutorial for the new version.

Best,
     Brian

Nico Holmberg

Jul 27, 2018, 6:30:10 AM
to cp2k
Hi Brian,

The cube files you attached are quite a bit smaller than the ones I get when I run the tutorial (12 vs. 25 MB), which is likely the cause of the error message you received in question 1. For some reason, likely related to the error you described in question 2, the cube file is only partly written to disk. I don't understand what you mean by "It would simply output a blank file, and then break. I was able to fix this by passing in the blank file of the appropriate name (giving me the files above), but it seemed like an odd error."
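One quick way to see whether a cube file is complete is to compare the number of values in the data section against the grid size promised by the header (a Gaussian cube header carries the atom count on line 3 and the three voxel counts on lines 4-6). A minimal sketch, assuming a density cube with a positive atom count (orbital cubes carry one extra header line); the helper name check_cube is invented here:

```shell
# Count the values in the volumetric data section of a cube file and compare
# against nx*ny*nz from the header. A truncated file from a crashed run will
# report fewer values than expected.
check_cube() {
  awk 'NR==3 {na=$1} NR==4 {nx=$1} NR==5 {ny=$1} NR==6 {nz=$1}
       NR>6+na {n+=NF}
       END {printf "expected %d values, found %d\n", nx*ny*nz, n+0}' "$1"
}
# e.g.: check_cube water-dimer-frag-a-pbe-energy-ELECTRON_DENSITY-1_0.cube
```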

In any case, an MPI parallelized cube file writer was introduced in CP2K 6.1. It has been tested on different platforms and compilers (including Intel ifort), though not with this particular input file, but it is possible that some bug still remains. In order to debug this behavior, it would be helpful if you could post more details about how the CP2K 6.1 binary was compiled and how you ran the calculation. In particular, the following would be helpful
  1. Compiling environment including version numbers
  2. Number of MPI processes used in the calculation. Does the error related to printing out the cube file disappear if you use just 1 MPI process?
  3. Detailed output of the crashing simulation with debugging symbols turned on in the binary (your computing center might not provide one by default)
I have different Intel compiler versions available to me on a cluster so I can try to reproduce the error you're seeing if you can figure out the compiler version.


BR,

Nico

Brian Day

Jul 30, 2018, 4:00:41 PM
to cp2k
Hi Nico,

I use the following Intel compiler modules: module load intel/2017.3.196 intel-mpi/2017.3.196 cp2k/6.1

Summarized below are the conditions I used and the results I got:
nodes = 1, tasks = 14, executable = cp2k.popt -i *.inp -o *.out ---> Cube files generated without issue
nodes = 1, tasks = 14, executable = mpirun -np $SLURM_NTASKS cp2k.popt -i *.inp -o *.out ---> 

forrtl: severe (174): SIGSEGV, segmentation fault occurred

Image              PC                Routine            Line        Source

cp2k.popt          000000000D730E14  Unknown               Unknown  Unknown

libpthread-2.17.s  00002ABEB44735E0  Unknown               Unknown  Unknown

libc-2.17.so   00002ABEB61E74DC  cfree                 Unknown  Unknown

cp2k.popt          000000000D767FA8  Unknown               Unknown  Unknown

cp2k.popt          000000000109F105  qs_dispersion_typ         135  qs_dispersion_types.F

cp2k.popt          0000000000B88BFF  qs_environment_ty        1476  qs_environment_types.F

cp2k.popt          0000000000B13020  force_env_types_m         232  force_env_types.F

cp2k.popt          0000000000EE66E0  f77_interface_mp_         335  f77_interface.F

cp2k.popt          000000000043BEF8  cp2k_runs_mp_run_         405  cp2k_runs.F

cp2k.popt          0000000000432814  MAIN__                    281  cp2k.F

cp2k.popt          000000000043151E  Unknown               Unknown  Unknown

libc-2.17.so   00002ABEB6188C05  __libc_start_main     Unknown  Unknown

cp2k.popt          0000000000431429  Unknown               Unknown  Unknown

Note that I had to run these simulations on a different cluster here, as the one I was using previously requires the submission file to declare a minimum of 2 nodes. I can talk to our computing center and see if they can test this themselves. 

To (hopefully) clarify my earlier message, when I ran with 2 nodes, the last line in the cp2k output file would be:

 The sum of alpha and beta density is written in cube file format to the file:


 /scratch/slurm-1244788/water-dimer-frag-b-pbe-energy-ELECTRON_DENSITY-1_0.cube

and the electron density file would appear in the submission directory, but it would be empty. If I re-ran the simulation and passed this blank file to the cluster which I was running on, it would run successfully, but when trying to open the file in another program such as Avogadro, or using it in a subsequent simulation, it would not work. Maybe this is because it is only partially writing as you pointed out. 

I will try to update this with the compiling information and a detailed debugging output. (Sorry if any of the above does not make sense; I am still fairly new to computational work.)

Thanks again.

-Brian

Nico Holmberg

Jul 31, 2018, 2:35:45 AM
to cp2k
Hi Brian,

Thanks for the information. Just to confirm, if you run
ifort --version

does the command return ifort (IFORT) 17.0.4 ? I need to use a different machine than normally to compile with that version of the Intel compiler, so I need a while to familiarize myself with the proper build process on that machine. Hopefully I'll find the time later this week.

By the way, the error message you posted is quite cryptic and does not point to anything related to writing the cube file. Any chance you could post the full output log for the crashing simulation? Are you able to reproduce the crash if you decrease the number of MPI tasks from 14 to, say, 2 or 4?


BR,

Nico

Brian Day

Aug 9, 2018, 12:49:45 PM
to cp2k
Hi Nico,

Sorry for the long-delayed reply; I had forgotten to check this thread for some time!

ifort --version returns: ifort (IFORT) 17.0.04 20170411.
Additionally, I get the same error message when I reduce the number of MPI tasks to 4 (2 per node, 2 nodes).

Best,
     Brian

Brian Day

Aug 9, 2018, 12:51:08 PM
to cp2k
Actually, the error message is slightly different, see below:

forrtl: severe (174): SIGSEGV, segmentation fault occurred

Image              PC                Routine            Line        Source

cp2k.popt          000000000D730E14  Unknown               Unknown  Unknown

libpthread-2.17.s  00002ABEBC47F5E0  Unknown               Unknown  Unknown

libmpi.so.12   00002ABEBD7DA1BA  PMPI_File_write_a     Unknown  Unknown

libmpifort.so.12.  00002ABEBCF2F1AE  pmpi_file_write_a     Unknown  Unknown

cp2k.popt          0000000002F17BCC  message_passing_m        3315  message_passing.F

cp2k.popt          0000000002A33AFC  realspace_grid_cu         698  realspace_grid_cube.F

cp2k.popt          0000000002A31F4D  realspace_grid_cu         211  realspace_grid_cube.F

cp2k.popt          0000000000A06F9D  cp_realspace_grid          64  cp_realspace_grid_cube.F

cp2k.popt          0000000000A9F32B  qs_scf_post_gpw_m        2651  qs_scf_post_gpw.F

cp2k.popt          0000000000A883B1  qs_scf_post_gpw_m        2001  qs_scf_post_gpw.F

cp2k.popt          0000000000EB6610  qs_scf_post_scf_m          70  qs_scf_post_scf.F

cp2k.popt          00000000017A267F  qs_scf_mp_scf_            285  qs_scf.F

cp2k.popt          0000000000BA7709  qs_energy_mp_qs_e          86  qs_energy.F

cp2k.popt          0000000000C52681  qs_force_mp_qs_ca         115  qs_force.F

cp2k.popt          000000000096F4AA  force_env_methods         242  force_env_methods.F

cp2k.popt          000000000043BCAC  cp2k_runs_mp_run_         323  cp2k_runs.F

cp2k.popt          0000000000432814  MAIN__                    281  cp2k.F

cp2k.popt          000000000043151E  Unknown               Unknown  Unknown

libc-2.17.so   00002ABEBE194C05  __libc_start_main     Unknown  Unknown

cp2k.popt          0000000000431429  Unknown               Unknown  Unknown


Thanks again for all your help so far!

-Brian

Nico Holmberg

Sep 11, 2018, 3:43:57 PM
to cp2k
Hi Brian,

Sorry for the long delay in replying, I had a couple of tight deadlines that required my full attention. 

I compiled CP2K using version 17.0.4 20170411 of the Intel Fortran compiler, Intel MPI, and MKL. You can find my arch file below. I ran the tutorial files with 1, 2, and 24 MPI processes and did not encounter any issues.

Looking at the stack trace you included in your last post, it seems that the calculation is crashing somewhere inside the MPI I/O routine that CP2K is calling. This looks like a library issue to me. Are you able to provide any more information about how your binary has been compiled?

By the way, if you have access to the latest development version of CP2K (dated yesterday), you can disable MPI I/O to force CP2K to use the serial versions of the cube writer/reader. This will bypass the crash without fixing the underlying issue. See the discussion in this post for more information.

# Bare bones arch file for building CP2K with the Intel compilation suite
# Tested with ifort (IFORT) + Intel MPI + MKL version 17.0.4 20170411 

# Build tools
CC       = icc
CPP      =
FC       = mpiifort
LD       = mpiifort
AR       = ar -r

# Flags and libraries
CPPFLAGS =

DFLAGS   = -D__BLACS -D__INTEL -D__MKL -D__FFTW3 -D__parallel -D__SCALAPACK  \
           -D__HAS_NO_SHARED_GLIBC

CFLAGS   = $(DFLAGS)

FCFLAGS  = $(DFLAGS) -O2 -g -traceback -fp-model precise -fp-model source -free  \
           -I$(MKLROOT)/include -I$(MKLROOT)/include/fftw

LDFLAGS  = $(FCFLAGS)

LDFLAGS_C = $(FCFLAGS) -nofor_main

LIBS     = -Wl,--start-group \
           $(MKLROOT)/lib/intel64/libmkl_scalapack_lp64.a \
           $(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
           $(MKLROOT)/lib/intel64/libmkl_sequential.a \
           $(MKLROOT)/lib/intel64/libmkl_core.a \
           $(MKLROOT)/lib/intel64/libmkl_blacs_intelmpi_lp64.a \
           -Wl,--end-group \
           -lpthread -lm -ldl

# Required due to memory leak that occurs if high optimisations are used
mp2_optimize_ri_basis.o: mp2_optimize_ri_basis.F
	$(FC) -c $(subst O2,O0,$(FCFLAGS)) $<
