CP2K input for Tamkin

325 views
Skip to first unread message

Tobias Kraemer

unread,
Nov 3, 2015, 4:19:50 PM11/3/15
to TAMkin
Dear all,

I have started using the TAMkin software ((latest stable release 1.0.5) recently, in order to calculate thermochemistry data (in particular Gibbs Free Energies) from CP2K output. To begin with I have looked at some easy examples such as H2O molecule in order to familiarise myself with the protocols. I have used output from both Gaussian03 and CP2K (2.5.1). For the G03 output of a water molecule in gas phase I find pretty good agreement between TAMkin and Gaussian thermochemistry, and I attribute the small differences (~0.001 kJ/mol or J/mol*K depending on the quantity) to the use of perhaps slightly different conversion factors (ideally I would expect that there aren't any differences at all). I have attached my input file for the run, but I think this is fairly straightforward. 

Now, with the CP2K output I have more trouble, and this is just meant to be the easy case. I struggle to make this work, and some of the output puzzles me. I am listing these problems/questions in the following. I have optimised the structure, performed a vibrational analysis and did a energy+force calculation as described in the manual (inputs and outputs are attached).

(1) with nma = NMA(molecule, ConstrainExt()) tamkin complains about the rmsd of the gradient, so I reduced it to 0.01, which then works. How much error is introduced by relaxing the threshold? I would assume that the CP2K-optimized structure is perfectly fine to use, since I have used already somewhat tighter than default values (MAX_FORCE 1*10-4 Eh*bohr-1). 

(2) the TAMkin output from this run reports somewhat different wave numbers for the 3 normal modes than in the cp2k file.
    CP2K: 1604.378487   3657.348123          3760.361618 cm-1
   TAMkin  56.4    90.2  1604.3  3657.3  3760.3 cm-1 plus one imaginary frequency of i10.1 cm-1

I don't know where these extra numbers come from, but I certainly don't think this is right. 

(3) When I include ExtRot() in pf = PartFun(nma, [ExtTrans(), ExtRot()]) I get this error:
ValueError: rotational: There is no external rotation in periodic systems.

I have done the calculation with the molecule-in-a-box approach, hence the system shouldn't be periodical. 

(4) ultimately I am after some bigger systems (molecular crystals of an organometallic species, ~600 atoms in unit cell), for which I want to calculate a reaction profile and report Free Gibbs Energies. The reaction involves some species present in the crystal (a rhodium complex) and hydrogen gas. Initially free, hydrogen then binds to the Rh centre and everything from there is standard organometallic chemistry. So I would need to know how to treat this system in the appropriate manner with CP2K/Tamkin. Basically I have located most of the minima and transition states along the pathway, but what is needed at this stage are Free Energies for all species which I hope to obtain from TAMkin. 

If anyone could give me some advice how I can generate the correct CP2K output / TAMkin input for a solid state system (periodic) and Hydrogen gas I would be very grateful. Perhaps someone can comment on some of the special settings needed for CP2K, e.g. I run the CP2K frequencies with FULLY PERIODIC FALSE for non-periodic systems.  

I am happy to provide more information if needed, I think at this stage I would need some better idea how to get started, since even such a simple system as water causes some problems. In the meantime I will continue to try for myself, but I want to avoid introducing some bad errors.

Thanks for your help

Tobias

tamkin_g03_water.txt
h2o_sp.inp
h2o_opt.inp
h2o_freq.out
h2o_freq.inp
h2o_opt.out
h2o_sp.out

Toon Verstraelen

unread,
Nov 7, 2015, 11:46:45 AM11/7/15
to tam...@googlegroups.com
Hi Tobias,

Thanks for your interest in TAMkin. I'm going to reply below between
your questions and comments to keep things overseeable.

On 11/03/2015 04:19 PM, Tobias Kraemer wrote:
> Dear all,
>
> I have started using the TAMkin software ((latest stable release
> 1.0.5) recently, in order to calculate thermochemistry data (in
> particular Gibbs Free Energies) from CP2K output. To begin with I have
> looked at some easy examples such as H2O molecule in order to
> familiarise myself with the protocols. I have used output from both
> Gaussian03 and CP2K (2.5.1). For the G03 output of a water molecule in
> gas phase I find pretty good agreement between TAMkin and Gaussian
> thermochemistry, and I attribute the small differences (~0.001 kJ/mol
> or J/mol*K depending on the quantity) to the use of perhaps slightly
> different conversion factors (ideally I would expect that there aren't
> any differences at all). I have attached my input file for the run,
> but I think this is fairly straightforward.

Well, this is exact reproduction of output from other codes is not as
easy as it seems. Physical constants and masses of elements are updated
every now and then to reflect results from recent more accurate
measurements. Different programs take these constants from different
sources with variable quality and age. Some codes even refuse to keep
their numbers up-to-date with latest publications just to maintain
backward compatibility. To solve this issue to some extent in TAMkin, we
try to read in atomic masses from the output files where possible. Even
that is not perfect as most codes don't bother to print them out in
sufficient precision.

> Now, with the CP2K output I have more trouble, and this is just meant
> to be the easy case. I struggle to make this work, and some of the
> output puzzles me. I am listing these problems/questions in the
> following. I have optimised the structure, performed a vibrational
> analysis and did a energy+force calculation as described in the manual
> (inputs and outputs are attached).

In general, vibrational calculations from periodic calculations (except
for the Crystal program) are always a little problematic. Nearly all
periodic codes suffer from the eggbox effect, which also results in
noise in the Hessian that is hard to mitigate. Also its impact on the
final results depend somewhat on how the vibrational analysis is done
exactly.

>
> (1) with nma = NMA(molecule, ConstrainExt()) tamkin complains about
> the rmsd of the gradient, so I reduced it to 0.01, which then works.
> How much error is introduced by relaxing the threshold? I would assume
> that the CP2K-optimized structure is perfectly fine to use, since I
> have used already somewhat tighter than default values (MAX_FORCE
> 1*10-4 Eh*bohr-1).

The threshold is just introduced for a sanity check, it does not really
affect the end result. This sanity check is introduced because the
removal of external degrees of freedom from a not-very-equilibrium
Hessian is not reliable. It is just a reminder that you have to do a
better job on the geometry optimization to get decent results. So, try
to further refine the geometry.

>
> (2) the TAMkin output from this run reports somewhat different wave
> numbers for the 3 normal modes than in the cp2k file.
> CP2K: 1604.378487 3657.348123 3760.361618 cm-1
> TAMkin 56.4 90.2 1604.3 3657.3 3760.3 cm-1 plus one
> imaginary frequency of i10.1 cm-1
>
> I don't know where these extra numbers come from, but I certainly
> don't think this is right.

CP2K removed also the rotational degrees of freedom from the Hessian,
while the TAMkin result still includes them. Note that these three extra
numbers should be zero. They aren't due to a mixture of probably the
following three reasons:

1) eggbox effect.
2) incomplete geometry optimization.
3) interactions between periodic images or box boundaries. (Your box is
big enough, so this is probably a minor effect.)

>
> (3) When I include ExtRot() in pf = PartFun(nma, [ExtTrans(),
> ExtRot()]) I get this error:
> ValueError: rotational: There is no external rotation in periodic systems.
>
> I have done the calculation with the molecule-in-a-box approach, hence
> the system shouldn't be periodical.
The current TAMkin version assumes that the systems read from CP2K
output are always periodic. You can manually override this, as you can
see in the code:

https://github.com/molmod/tamkin/blob/master/tamkin/io/cp2k.py#L49

It is possible to deduce from the CP2K output whether the system is
periodic. We should improve TAMkin to actually use that information.
Anyway, if you set is_periodic=True, you can remove the rotational
degrees of freedom from the Hessian in TAMkin. Still, you should also
get reasonable results without it (i.e. with 6 almost zero frequencies
for the external degrees of freedom). If not, the geometry and Hessian
are not of sufficient quality.

> (4) ultimately I am after some bigger systems (molecular crystals of
> an organometallic species, ~600 atoms in unit cell), for which I want
> to calculate a reaction profile and report Free Gibbs Energies. The
> reaction involves some species present in the crystal (a rhodium
> complex) and hydrogen gas. Initially free, hydrogen then binds to the
> Rh centre and everything from there is standard organometallic
> chemistry. So I would need to know how to treat this system in the
> appropriate manner with CP2K/Tamkin. Basically I have located most of
> the minima and transition states along the pathway, but what is needed
> at this stage are Free Energies for all species which I hope to obtain
> from TAMkin.

First, you have to double check if the harmonic approximation is
sensible for this system. Then make sure you have very good geometries
and also make sure your SCF convergence is good enough for both the
geometry and the Hessian calculations. A mediocre SCF convergence can
severely affect the results. The Hessian in CP2K is computed with finite
differences, which magnifies the errors due to a poorly converged SCF.
First, forces are more sensitive to SCF convergence than the energy.
Second, for the computation of the Hessian, the difference between two
very similar forces is computed and divided by a small number. You
really need to know and control these errors before doing production
calculations. Similarly, you should also be careful with the eggbox problem.

>
> If anyone could give me some advice how I can generate the correct
> CP2K output / TAMkin input for a solid state system (periodic) and
> Hydrogen gas I would be very grateful. Perhaps someone can comment on
> some of the special settings needed for CP2K, e.g. I run the CP2K
> frequencies with FULLY PERIODIC FALSE for non-periodic systems.

For isolated systems, it should be set to True in principle. However, it
does not affect the Hessian that is computed in CP2K and read by TAMkin.
So this setting does actually not affect results obtained with TAMkin
afterwards.

My main advice is to control numerical errors in CP2K. Make sure you
understand all the flags set in the input file. If some aspects are
unclear, please ask, or consider visiting our group in Ghent. I don't
have template input files for CP2K that will guarantee reliable results.
(I'm not even sure if such general templates can be made. The ideal
settings may be system dependent.)

>
> I am happy to provide more information if needed, I think at this
> stage I would need some better idea how to get started, since even
> such a simple system as water causes some problems. In the meantime I
> will continue to try for myself, but I want to avoid introducing some
> bad errors.
>

OK. Try to use my advise above for the water test case and see how much
that helps. If you have remaining problems, either for water or for your
system of interest, feel free to get back in touch. It would be nice if
you could share your tuned input for the water molecule.

Best Regards,

Toon

Tobias Kraemer

unread,
Nov 12, 2015, 9:26:22 AM11/12/15
to TAMkin
Dear Toon,


thank you very much for your detailed reply, very insightful. Here are a few direct comments in response, I will need to look into your suggestions over the next litle while.

(i) From what you've said about the vibrational analysis, I understand that TAMkin uses the Hessian (which it extracts from the CP2K output) to generate its own set of wavenumbers (matrix diagonalisation). 
    This would then account for the relatively small differences seen for some of the wavenumbers reported in the output. For example for H2O (now calculated fully periodically within a 20x20x20 Ang supercell inCP2K), the vibrational
    modes are i47.7461, 41.6999, 104.3933, 1603.5224, 3658.3990, 3761.1037 cm-1. Of these the first three should correspond to rotations, since the FULLY_PERIODIC TRUE should "avoid to clean the rotations from the Hessian"   
    For comparison TAMkin reports i44.9, 42.9, 107.7, 1603.6, 3658.4, 3761.1 cm-1. So the latter three which represent the three vibrational modes of water, are the same, whilst the smaller ones are slightly different, but this is fine.      

(ii) As far as the geometry convergence is concerned, I should then try to set tighter convergence criteria for optimisation of water.  

(iii) with regards to you comment on testing the goodness of geometry optimisations in general within CP2K, I have done some substantial testing on the larger systems we are interested in in order to derive a reasonable set of cutoffs and parameters to generate (hopefully) meaningful results.I am using EPS_SCF 10E-7 for geometry optimisations in combination with MAX_FORCE 10E-4 (somewhat tighter than default), and for vibrational analysis I am using even tighter SCF critiera (10E-8). This yields reasonable minima and TS, and I don't 
experience a lot of noise in these calculations any more. I had great difficulties with this when I started, finding a huge number (>100) of imaginary modes for the optimised structures. In terms of the stress tensors, these also look sufficiently small.
I have also used TAMkin for one of these structures for testing, and here it seems the gradients are well converged, since no error occurs with the default settings in ConstrainExt(). So this looks promising at least.   



Thanks for now, I'll be in touch.

Tobias 

Toon Verstraelen

unread,
Nov 16, 2015, 4:35:06 AM11/16/15
to tam...@googlegroups.com
Hi Tobias,

That looks good. You're test on the water molecule is quite reassuring. Let me
know if there is anything else I can do.

Best Regards,

Toon
> --
> You received this message because you are subscribed to the Google Groups "TAMkin"
> group.
> To unsubscribe from this group and stop receiving emails from it, send an email to
> tamkin+un...@googlegroups.com <mailto:tamkin+un...@googlegroups.com>.
> To post to this group, send email to tam...@googlegroups.com
> <mailto:tam...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tamkin/3bf1e52d-d222-495c-bf8d-26f48be8c6e5%40googlegroups.com
> <https://groups.google.com/d/msgid/tamkin/3bf1e52d-d222-495c-bf8d-26f48be8c6e5%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.


--
Prof. Dr. ir. Toon Verstraelen
Center for Molecular Modeling
Ghent University
Technologiepark 903,
B9052 Zwijnaarde
Belgium
Tel: +32 9 264 65 56
GSM: +32 471 66 68 83
E-mail: Toon.Ver...@UGent.be
http://molmod.UGent.be/
http://molmod.UGent.be/software/

Tobias Kraemer

unread,
Dec 1, 2015, 7:26:01 AM12/1/15
to TAMkin
Hello,


this is me again with another question, after playing around with TAMkin for a little while now. I wonder if it is possible to set a threshold for vibrational modes to be used in the partition function. My system contains quite a large number (~75) normal modes with wavenumbers <100 cm-1 (an arbitrary cutoff). Of course it is well known that the entropic contribution due to these soft modes can introduce large errors into (Gibbs) Free Energies. A crude approximation is to not include these at all. Another question regards the
PVHA approach. Is it possible to perform a PVHA using a fully calculated Hessian from a CP2K vibrational analysis? I have opted for a full vibrational analysis since this seems to be the only way to produce an output file for visualisation of the desired modes.
It is however possible, in principle, to freeze selected atoms during the job. Nonetheless, I was thinking now that I have the full Hessian anyways, could I define a molecular segment within the structure, for which to perform nma and obtain thermochemical data
for this part (I guess this is similar to PVHA). 

I am not sure if the numbers I am producing with TAMkin are reliable. I have seen for selected (intramolecular) reactions (in fact rearrangements) that the energy increases by ~10 kcal mol-1 when going from electronic energies (DeltaE) to Free Energies (DeltaG). In the particular case this appears to be unreasonable, and I am thinking about possible causes and solutions to this problem (or at least some other options to test this). I will need to think a bit more about how to best describe the system in terms of its thermochemistry, but so far I would believe that it is a solid state system with no translational/rotational contributions to the pf.   

Thanks 

Tobias




On Tuesday, November 3, 2015 at 9:19:50 PM UTC, Tobias Kraemer wrote:

Toon Verstraelen

unread,
Dec 1, 2015, 11:08:00 AM12/1/15
to tam...@googlegroups.com
On 12/01/2015 01:26 PM, Tobias Kraemer wrote:
> Hello,
>
>
> this is me again with another question, after playing around with
> TAMkin for a little while now. I wonder if it is possible to set a
> threshold for vibrational modes to be used in the partition function.
> My system contains quite a large number (~75) normal modes with
> wavenumbers <100 cm-1 (an arbitrary cutoff). Of course it is well
> known that the entropic contribution due to these soft modes can
> introduce large errors into (Gibbs) Free Energies. A crude
> approximation is to not include these at all. Another question regards the
> PVHA approach. Is it possible to perform a PVHA using a fully
> calculated Hessian from a CP2K vibrational analysis? I have opted for
> a full vibrational analysis since this seems to be the only way to
> produce an output file for visualisation of the desired modes.
> It is however possible, in principle, to freeze selected atoms during
> the job. Nonetheless, I was thinking now that I have the full Hessian
> anyways, could I define a molecular segment within the structure, for
> which to perform nma and obtain thermochemical data
> for this part (I guess this is similar to PVHA).

Hi Tobias,

You are right about the low frequencies. These low-frequency modes are
usually very anharmonic and hence their treatment as harmonic
oscillators is indeed debatable. Also the exact values of the low
frequencies can be very sensitive to the local minimum one happens to
end up with. I'm personally not a proponent of just leaving out all
frequencies below a cutoff as it can lead to the following artifact:
when partition functions are computed for two states, they may not have
the same number of modes below the cutoff. That would result in a
different number of degrees of freedom, which would result in a
meaningless free-energy difference. There are more advanced schemes of
"curing" the low frequencies but a physically sound model is clearly
preferable.

PHVA, as you suggest, is the simplest option that eliminates many
non-local low modes in the environment of a chemical reaction. The
assumption made, when using PHVA, is that all the eliminated degrees of
freedom are transferable between two states of interest. Depending on
the case, that may or may not be a reasonable approximation. For atoms
far away from the active site, that is usually True. As long as you
account for the complete interaction energy between the active site and
its environment, which you do in a periodic calculation, you should be fine.

PHVA will certainly work in TAMkin, even if the complete Hessian is
computed with CP2K.

> I am not sure if the numbers I am producing with TAMkin are reliable.
> I have seen for selected (intramolecular) reactions (in fact
> rearrangements) that the energy increases by ~10 kcal mol-1 when going
> from electronic energies (DeltaE) to Free Energies (DeltaG). In the
> particular case this appears to be unreasonable, and I am thinking
> about possible causes and solutions to this problem (or at least some
> other options to test this). I will need to think a bit more about how
> to best describe the system in terms of its thermochemistry, but so
> far I would believe that it is a solid state system with no
> translational/rotational contributions to the pf.

It may actually be a reasonable difference. It all depends on the size
of the molecule and the structure its normal modes. For example, if both
states differ a lot in flexibility, there can be a serious entropic
penalty. If you have a suspicious case, please post it on the mailing
list, so we can try to get a better understanding.

Best Regards,

Toon

Tobias Kraemer

unread,
Dec 2, 2015, 7:18:55 AM12/2/15
to TAMkin
Hi Toon,


thanks for your reply. Just trying to figure out how to set up a PHVA routine in TAMkin. Here is the script that I currently use for running these calculations:
Since the list of fixed atoms is quite large, I wanted to provide an external file with the information. What format would I need to provide here, simply 
[0, 1, .....N]. Can one provide ranges as well? Is it possible to provide a list of non-fixed atoms, which might be easier in this case?

from tamkin import *

molecule = load_molecule_cp2k("force.out", "freq.out")
#nma = NMA(molecule, ConstrainExt(gradient_threshold=0.01))
fixed = load_indices("/home/kraemer/testlab/tamkin/fixedatoms.txt")
nma = NMA(molecule, PHVA(fixed))
pf = PartFun(nma)
#pf = PartFun(nma, [ExtTrans()])
# Write some general information about the molecule
# and the partition function to a file.
pf.write_to_file("partfun.txt")

# Write an extensive overview of the thermodynamic properties to a file:
ta = ThermoAnalysis(pf, [298.15])
ta.write_to_file("thermo.csv")


Thanks

Tobias


On Tuesday, November 3, 2015 at 9:19:50 PM UTC, Tobias Kraemer wrote:

Toon Verstraelen

unread,
Dec 2, 2015, 10:41:54 AM12/2/15
to tam...@googlegroups.com
Hi,

The external file is a good idea. You can also use it to load the free
atoms and deduce the fixed atoms with some Python:

free = load_indices("/home/kraemer/testlab/tamkin/fixedatoms.txt")
fixed = [iatom for iatom in xrange(mol.size) if iatom not in free]

(I didn't test it so be careful.) Ranges are not supported yet but its a
good idea, not too difficult to add. (Hang on...)

Best Regards,

Toon
> --
> You received this message because you are subscribed to the Google
> Groups "TAMkin" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to tamkin+un...@googlegroups.com
> <mailto:tamkin+un...@googlegroups.com>.
> To post to this group, send email to tam...@googlegroups.com
> <mailto:tam...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tamkin/1eda88b1-a220-4816-aa1a-450148a3838c%40googlegroups.com
> <https://groups.google.com/d/msgid/tamkin/1eda88b1-a220-4816-aa1a-450148a3838c%40googlegroups.com?utm_medium=email&utm_source=footer>.

Toon Verstraelen

unread,
Dec 2, 2015, 11:03:56 AM12/2/15
to tam...@googlegroups.com
Hi Tobias,

Can you check out the latest development version and give the new
load_indices a try? It can be found here:

https://github.com/molmod/tamkin

If you're not familiar with git, just click on the "Download ZIP"
button. The new functionality for ranges is documented here:

https://github.com/molmod/tamkin/blob/master/tamkin/io/internal.py#L204

Cheers,

Toon

On 12/02/2015 01:18 PM, Tobias Kraemer wrote:
Reply all
Reply to author
Forward
0 new messages