Job setup: how to run multiple single point calculations efficiently?

Sam Niblett

Apr 6, 2020, 12:54:36 PM
to cp2k
Dear all,

I need to perform a reasonably large number of energy+force single point calculations for distinct configurations of a single molecular system (~1000-10000 in the first instance but lots more later on). I'm using DFT with a rev-PBE functional, running a pre-compiled CP2K module on a large supercomputer. I'm on a fairly tight budget of core hours, so minimising the runtime is my main concern.

Is there a keyword that will instruct CP2K to perform the same operation on a series of different starting configurations (preferably read from a single xyz file, for example)? Something along the lines of LAMMPS' rerun command.



I have looked through the CP2K_INPUT documentation and the best option I could find is to use FARMING to perform a separate ENERGY_FORCE job on each starting configuration. This works, but it is undesirable for three reasons:

1) It requires creating a separate input folder for each configuration (not a big problem, but it's annoying and inelegant given that all the input except the system coordinates is identical for every job)

2) This method appears to reallocate and reinitialise the functionals and system details for every set of input data, which is unnecessary overhead given that those details are the same each time.

3) My starting configurations are similar enough that the converged electron density of one should be a good starting point for the next. By analogy with AIMD calculations I have run on the same system, I estimate that using this information could decrease the cost of the calculation by up to 80%. But FARMING doesn't know that, so each calculation starts completely from scratch.

The result is that FARMING only gives a small speedup compared to running separate CP2K jobs for each input point. Does anyone know of a better (i.e. more efficient) way to set up these calculations?
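The per-configuration setup described in point 1 can at least be scripted. The following is a minimal sketch, not a tested workflow: it splits a multi-frame xyz file into single frames and builds a FARMING section with one JOB per directory. The function names, the directory names and the file name `job.inp` are hypothetical, and the keywords used (`NGROUPS`, `DIRECTORY`, `INPUT_FILE_NAME`) should be checked against the CP2K_INPUT manual for the installed version.

```python
def split_xyz(text):
    """Split multi-frame XYZ text into a list of single-frame strings.

    Each frame is: atom-count line, comment line, then one line per atom.
    """
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        natoms = int(lines[i].split()[0])
        frame = lines[i:i + natoms + 2]
        if len(frame) != natoms + 2:
            raise ValueError(f"truncated frame starting at line {i + 1}")
        frames.append("\n".join(frame) + "\n")
        i += natoms + 2
    return frames


def farming_section(job_dirs, ngroups, input_name="job.inp"):
    """Build a &FARMING block with one &JOB per directory.

    Keyword names are assumed from the CP2K manual; verify for your version.
    """
    out = ["&FARMING", f"  NGROUPS {ngroups}"]
    for directory in job_dirs:
        out += [
            "  &JOB",
            f"    DIRECTORY {directory}",
            f"    INPUT_FILE_NAME {input_name}",
            "  &END JOB",
        ]
    out.append("&END FARMING")
    return "\n".join(out) + "\n"
```

Each frame returned by `split_xyz` would be written into its own directory next to a copy of the shared input file. This removes the manual work of point 1 but does nothing about points 2 and 3, since every job still initialises from scratch.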



The only other thing I could think of would be using BAND with 0 optimisation steps and K_SPRING 0, treating each configuration of my input as a separate replica. I don't know if that would give me the information I want but if it did then it would fix at least points 1 and 2 of my list above. If anyone has tried something like that before, please let me know how you got on.

I'm hoping that there's a straightforward way to perform this task and I just haven't found it in the documentation. Please point me to the relevant page if so.

Thanks, and best wishes,

Sam

Ali Sufali

Apr 6, 2020, 1:28:25 PM
to cp...@googlegroups.com
You can use farming. It is part of CP2K and lets you reduce the MPI startup overhead (the time the software needs to initialize) while running multiple single point calculations. For instructions on how to use farming, there is a video on YouTube on the ughent channel.
If you search for "cp2k tutorial" on YouTube you will find the videos; they are in a playlist. The 7th video explains farming, and I think it's a good starting point.

hut...@chem.uzh.ch

Apr 6, 2020, 1:33:18 PM
to cp...@googlegroups.com
Hi

you are looking for

CP2K_INPUT / MOTION / MD
ENSEMBLE REFTRAJ

to rerun a pre-calculated set of molecular coordinates.
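As a rough illustration of what such an input could look like, here is a small sketch that generates a MOTION section for a REFTRAJ run. It is an assumption-laden example rather than a verified input: the function name and the file name `configs.xyz` are hypothetical, and the REFTRAJ keywords shown (`TRAJ_FILE_NAME`, `EVAL_ENERGY_FORCES`, `EVAL_FORCES`) are the names I believe were used in CP2K releases around the time of this thread. Other releases may spell them differently, so check the CP2K_INPUT manual for your version. The run also needs `RUN_TYPE MD` in the GLOBAL section.

```python
def reftraj_motion_section(traj_file, n_frames):
    """Return a CP2K &MOTION block that replays n_frames from traj_file.

    Keyword names are assumptions to be checked against the manual
    for the CP2K version in use.
    """
    return f"""&MOTION
  &MD
    ENSEMBLE REFTRAJ
    STEPS {n_frames}
    &REFTRAJ
      TRAJ_FILE_NAME {traj_file}
      EVAL_ENERGY_FORCES .TRUE.
      EVAL_FORCES .TRUE.
    &END REFTRAJ
  &END MD
  &PRINT
    &FORCES
    &END FORCES
  &END PRINT
&END MOTION
"""


if __name__ == "__main__":
    # Hypothetical file name and frame count, for illustration only.
    print(reftraj_motion_section("configs.xyz", 1000))
```

Because all snapshots are handled within one run, the setup cost is paid once. Whether the wavefunction from the previous snapshot is reused as the initial guess depends on the `EXTRAPOLATION` setting in the QS section, which is worth checking in the manual before relying on it.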

regards

Juerg Hutter
--------------------------------------------------------------
Juerg Hutter Phone : ++41 44 635 4491
Institut für Chemie C FAX : ++41 44 635 6838
Universität Zürich E-mail: hut...@chem.uzh.ch
Winterthurerstrasse 190
CH-8057 Zürich, Switzerland
---------------------------------------------------------------

Sam Niblett

Apr 6, 2020, 5:29:30 PM
to cp2k
Perfect, that is indeed exactly what I was after!

Thanks, and best wishes,

Sam
