Hydration free energy of amino acid analogs: methane

Andrey Frolov

unread,

Dec 8, 2013, 1:06:17 PM12/8/13

to ermod...@googlegroups.com

Dear ERmod users,

I am trying to reproduce the results of Karino, Fedorov, Matubayasi, Chem Phys Lett 2010, 496, 351. I used the same simulation setup as in this Ref., but longer simulations (5 ns for solution and solvent systems, 10 ns for solute in vacuum) and . "slvfe" program gives me the following hydration free energy of Methane in TIP3P water: 2.84+/-0.01 kcal/mole. I add the long-range correction for the LJ term: -0.386 kcal/mole (as given in the Supporting information of Shirts, Pande, JCP 2005, 122, 134508).

The resulting methane solvation free energy is 2.45 +/- 0.01 kcal/mole, which does not match the value (1.9 +/- 0.1) given in the paper of Karino et al.

Could you please give help me to find out what could be the source of the discrepancy? Do I miss something?

I also performed the BAR free energy calculates for methane in TIP3P following Shirts, Pande, JCP 2005 and got the same result as they did. Therefore, the MD simulation setup itself seems to be OK.

I used:

- ermod-0.3.alpha8

- gromacs-4.6.1

but some time ago I did identical simulation with earlier versions of ermod and gromacs - the results are almost the same.

parameters_er file for solution system:

&ene_param

boxshp = 1,

cltype = 2,

cmbrule = 1,

elecut = 7.5,

engdiv = 10,

estype = 2,

inptemp = 298,

ljformat = 2,

ljswitch = 1,

lwljcut = 7,

ms1max = 32,

ms2max = 32,

ms3max = 32,

screen = 0.461189,

skpcnf = 1,

slttype = 1,

splodr = 6,

upljcut = 7.5,

/

&hist

eclbin=5.0e-2, ecfbin=2.0e-3, ec0bin=2.0e-4, finfac=10.0e0,

ecdmin=-20.000000, ecfmns=-0.20e0, ecdcen=0.0e0, eccore=20.0e0,

ecdmax=1.0e11, pecore=200

/

parameters_er file for solvent system:

&ene_param

boxshp = 1,

cltype = 2,

cmbrule = 1,

elecut = 7.5,

engdiv = 5,

estype = 2,

inptemp = 298,

ljformat = 2,

ljswitch = 1,

lwljcut = 7,

maxins = 1000,

ms1max = 32,

ms2max = 32,

ms3max = 32,

screen = 0.461189,

skpcnf = 100,

slttype = 3,

splodr = 6,

upljcut = 7.5,

/

&hist

eclbin=5.0e-2, ecfbin=2.0e-3, ec0bin=2.0e-4, finfac=10.0e0,

ecdmin=-20.000000, ecfmns=-0.20e0, ecdcen=0.0e0, eccore=20.0e0,

ecdmax=1.0e11, pecore=200

/

MDinfo file for soln folder (Mind! I used the input files as given in the supporting info for Shirts, Pande 2005, therefore I had only 896 water molecules):

25000 2

1 896

5 3

MDinfo file for refs folder:

25000 1

896

3

Thank you very much.

With kindest regards,

Andrey

nobuyuki

unread,

Dec 8, 2013, 9:59:46 PM12/8/13

to ermod...@googlegroups.com

Dear Andrey;

Thank you for the note.

First, you used OPLS-AA force field, right?

And how were the results for other molecules such as propane, butane and ethanol?

I forgot the details, but I think that we found that methane is a bit special in its isolated MD.

When the LINCS algorithm is used for a too symmetric molecule, there arises too much distortion in the molecular structure and the results can be strange.

Can you check how the methane molecule look like in your MD?

Best,

Nobuyuki

Andrey Frolov

unread,

Dec 9, 2013, 1:44:54 PM12/9/13

to ermod...@googlegroups.com

Dear Nobuyuki,

Thank you very much for the reply.

>> First, you used OPLS-AA force field, right?

Yes, OPLS-AA FF. I use the Gromacs topology files as given in Supporting information to Shirts, Pande, JCP 2005.

>> And how were the results for other molecules such as propane, butane and ethanol?

Here I compare two sets of simulations:

- short simulations: 100 ps for solution (10 fs coordinates sampling time), 10 ps for solvent (100 fs sampling), 10 ns for solute (100 fs sampling) number of insertions for each solvent configuration: 1000 - overall, the same setup as in Karino et al.

- long simulations: 5 ns for solution (200 fs sampling), 5 ns for solvent (20 ps sampling [in fact it is 200 fs but i use skpcnf = 100 ermod option]), 10 ns for solute (100 fs sampling) number of insertions for each solvent configuration: 1000

The slvfe output (including the Long-Range LJ correction), error is 2*sigma here:

Ala (methane)

2.4583 +/- 0.0101 (long simulations)

2.3768 +/- 0.0625 (short simulations)

1.9 +/- 0.02 (Karino et al.)

As (acetamide)

-8.7989 +/- 0.0327 (long simulations)

-8.7276 +/- 0.2463 (short simulations)

-9.2 +/- 0.02 (Karino et al.)

Ile (n-butane)

3.7025 +/- 0.0176 (long simulations)

4.0883 +/- 0.0953 (short simulations)

3.0 +/- 0.02 (Karino et al.)

It seems that the result is sensitive to the simulation length.

>> I forgot the details, but I think that we found that methane is a bit special in its isolated MD.

When the LINCS algorithm is used for a too symmetric molecule, there arises too much distortion in the molecular structure and the results can be strange. Can you check how the methane molecule look like in your MD?

I checked the simulations of methane in vacuum and it looks OK to me: no much of distortion from the initial structure. I think that the strange behavior in you case can be explained by the lack of energy exchange between atoms in the molecule (in the solution there is always a heat exchange between the solute and the neighboring solvent molecules, but in vacuum it is not the case). I think, that the Nose-Hoover thermostat (which was used in Karino et al.) in this case might lead to instability. Now there is a warning in Gromacs that for such simulation it is better to use a stochastic integrator ("sd" in gromacs notation, Langevin thermostat). The same story with the BAR simulations, when I used Nose-Hoover thermostat for nearly-decoupled state simulations I occasionally got LINCS crashes.

I use sd integrator + thermostat with tau-t = 2 ps for all simulations.

I also use 4 digit precision for my XTC files, maybe this could be source of differences? I will try to check it.

Moreover, I found the output of the "slvfe" to be quite strange. For long simulations I have identical cumulative solvation energy and solvation free energy average values for all blocks, BUT cumulative errors are NOT zero (which I would expect in this case)? Could it be an indication that something is going wrong?

Here is the example output of "slvfe":

bash-3.2$ cat ../ala/ermod/slvfe.log

Number of the 1-th solvent = 896

Self-energy of the solute = -0.0001 kcal/mol

cumulative average & 95% error for solvation energy

1 -2.8941

2 -2.8941 0.0177

3 -2.8941 0.0180

4 -2.8941 0.0129

5 -2.8941 0.0171

6 -2.8941 0.0180

7 -2.8941 0.0154

8 -2.8941 0.0135

9 -2.8941 0.0120

10 -2.8941 0.0121

group solvation free energy error difference

1 2.82589 0.01104 -0.00087

2 2.81819 0.01048 -0.00857

3 2.82676 0.01009 0.00000

4 2.84026 0.01006 0.01350

5 2.81486 0.01027 -0.01190

6 2.87281 0.00991 0.04605

7 2.80653 0.01112 -0.02023

8 2.79893 0.01034 -0.02783

9 2.92415 0.00887 0.09739

10 2.82960 0.01027 0.00284

12 2.97310 0.00809 0.14634

14 2.77163 0.01131 -0.05513

16 2.93561 0.00993 0.10885

18 3.06915 0.01093 0.24239

20 3.01108 0.00882 0.18432

group Estimated free energy (kcal/mol)

1 2.8440 2.8038 2.8048 2.8313 2.8601

2.8161 2.8210 2.8156 2.8296 2.8326

2 2.8348 2.7961 2.8006 2.8206 2.8521

2.8100 2.8121 2.8090 2.8235 2.8231

3 2.8443 2.8063 2.8077 2.8277 2.8583

2.8183 2.8217 2.8195 2.8333 2.8306

4 2.8579 2.8200 2.8248 2.8410 2.8733

2.8299 2.8374 2.8312 2.8422 2.8448

5 2.8283 2.7931 2.7967 2.8143 2.8497

2.8066 2.8134 2.8076 2.8176 2.8213

6 2.8880 2.8551 2.8559 2.8700 2.9066

2.8628 2.8717 2.8658 2.8707 2.8815

7 2.8227 2.7848 2.7862 2.8044 2.8436

2.7979 2.8056 2.7968 2.8078 2.8155

8 2.8167 2.7776 2.7837 2.8010 2.8311

2.7860 2.7956 2.7885 2.8026 2.8065

9 2.9356 2.9095 2.9071 2.9173 2.9533

2.9157 2.9295 2.9210 2.9185 2.9340

10 2.8476 2.8072 2.8158 2.8327 2.8603

2.8131 2.8274 2.8220 2.8321 2.8379

12 2.9857 2.9573 2.9608 2.9729 2.9978

2.9603 2.9763 2.9688 2.9686 2.9825

14 2.7916 2.7490 2.7550 2.7751 2.8057

2.7566 2.7665 2.7593 2.7762 2.7813

16 2.9474 2.9217 2.9193 2.9269 2.9692

2.9248 2.9407 2.9284 2.9287 2.9490

18 3.0753 3.0585 3.0538 3.0487 3.0993

3.0622 3.0860 3.0695 3.0511 3.0870

20 3.0290 2.9913 3.0096 3.0123 3.0297

2.9895 3.0111 3.0045 3.0094 3.0244

cumulative average & 95% error for solvation free energy

1 2.8443

2 2.8443 0.0380

3 2.8443 0.0249

4 2.8443 0.0181

5 2.8443 0.0203

6 2.8443 0.0170

7 2.8443 0.0144

8 2.8443 0.0126

9 2.8443 0.0112

10 2.8443 0.0101

nobuyuki

unread,

Dec 9, 2013, 9:08:57 PM12/9/13

to ermod...@googlegroups.com

Dear Andrey

It seems that the result is sensitive to the simulation length.

This is also different from what we saw before.

When the soln MD is 100 ps and the refs MD is a few tens of ps,

the results for amino-acid analogs should be all right.

This was true not only for OPLS-AA but also for AMBER.

I also use 4 digit precision for my XTC files, maybe this could be source of differences?

I have never thought of it.

As far as the MD trajectory is well handled by vmdplugins and successfully transported to ermod's internal routines, there should not be any problem.

But please check what will happen if you change the precision of the xtc file.

Moreover, I found the output of the "slvfe" to be quite strange. For long simulations I have identical cumulative solvation energy and solvation free energy average values for all blocks, BUT cumulative errors are NOT zero (which I would expect in this case)? Could it be an indication that something is going wrong?

This is indeed a strange behavior.

For example, the cumulative average of the solvation free energy is simply constructed from the block average at group = 3 (in default).

In other words, the ten values of

> 3 2.8443 2.8063 2.8077 2.8277 2.8583

> 2.8183 2.8217 2.8195 2.8333 2.8306

are made into cumulative average, so that the cumulative average should look like;

1 2.8443

2 the value of (2.8443 + 2.8063)/2

3 the value of (2.8443 + 2.8063 + 2.8077)/3

...

Similarly, the ten values of the solvation energies in aveuv.tt are converted in the cumulative average of the solvation energy shown as an output of slvfe.

BTW, the solvation energy (not the free energy) in Karino et al is:

-2.81 (ala), -27.29 (Asn), -7.69 (Ile), probably WITHOUT the LJ long-range correction.

Do they agree with what you got?

According to your last observation, something is funny at the moment

Best,

Nobuyuki

Andrey Frolov

unread,

Dec 10, 2013, 2:17:26 AM12/10/13

to ermod...@googlegroups.com

Dear Nobuyuki,

The solvation energy (not the free energy) without LJ LRC for my simulations match the data of Karino et al.:

Ala
-2.8941 +/- 0.0121 (long simulations)
-2.8984 +/- 0.0370 (short simulations)
-2.81              (Karino et al.)

Asn
-27.0860 +/- 0.0739 (long simulations)
-26.9098 +/- 0.4030 (short simulations)
-27.29              (Karino et al.)

Ile
-7.8235 +/- 0.0217 (long simulations)
-7.6405 +/- 0.0896 (short simulations)
-7.69              (Karino et al.)

Please, note that the strange behavior of cumulative averages I have only for long simulations, they and the ermod analysis are done on a different computer than the short simulations. I seems that slvfe in this case just takes the first value in the group 3, and puts it instead of the average. However, the error is still calculated normally. I will try to dig into the code to find out what could be wrong.

Since the solvation energy matches for all three cases, I think that the source of error might be in calculating the solvent correlation matrix and energy distribution at zero coupling. Or alternatively in the slvfe program.

Could you please be so kind to have a look on an example of my simulations? I will send to you a separate e-mail with all the simulation data and ermod analysis data for ile_side_chain system.

Andrey Frolov

unread,

Dec 10, 2013, 5:23:03 AM12/10/13

to ermod...@googlegroups.com

Dear Nobuyuki,

It seems that the problem with identical cumulative average values for all blocks is maybe due to (for some reason) badly compiled "slvfe". I repeated computations with the same data but "slvfe" compile on a different machine, and the cumulative averages are OK.

Here is the data for ile_side_chain (here without Long-Range Correction for LJ):

Output of "slvfe" on remote computer cluster:

cumulative average & 95% error for solvation free energy

1     4.9585
2     4.9585     0.0394
3     4.9585     0.0397
4     4.9585     0.0330
5     4.9585     0.0277
6     4.9585     0.0226
7     4.9585     0.0192
8     4.9585     0.0222
9     4.9585     0.0197
10     4.9585     0.0176

Output of "slvfe" on my local machine:

cumulative average & 95% error for solvation free energy

1     4.9585
2     4.9782     0.0394
3     4.9619     0.0397
4     4.9532     0.0330
5     4.9585     0.0277
6     4.9591     0.0226
7     4.9580     0.0192
8     4.9654     0.0222
9     4.9663     0.0197
10     4.9659     0.0176

In both cases I use ermod-0.3.alpha8. I will try to recompile it and then will let you know if it fixes the problem.

With kind regards,
Andrey

Andrey Frolov

unread,

Dec 10, 2013, 8:34:52 AM12/10/13

to ermod...@googlegroups.com

Dear Nobuyuki,

The precision of the XTC files affects a little the result, but not drastically. Still the difference between the solvation free energies is larger than the sum of uncertainties of both. However, the solution energy match up to second digit after point. It seems that the solvation energy is insensitive much less sensitive to input parameters.

Overall, I think that different XTC precision is not the main reason of differences with data of Karino et al..

slvfe data for ile_side_chain (no LRC for LJ):

XTC precision 4 digits:
    -7.6413 +/- 0.0897 (solvation free energy)
     5.1451 +/- 0.0960 (solvation energy)

XTC precision 3 digits:
    -7.6405 +/- 0.0896 (solvation free energy)
     4.9023   +/- 0.0916 (solvation energy)

With kind regards,
Andrey

Andrey Frolov

unread,

Dec 10, 2013, 9:16:19 AM12/10/13

to ermod...@googlegroups.com

Dear Nobuyuki,

Here the comparison of the results "short simulations” (see the definition in the previous e-mail) with DispCorr = EnerPres and DispCorr = No, as you suggested to do:

Ile (n-butane) solvation free energy (with LRC for LJ):
3.8891 +/- 0.0953 (DispCorr = EnerPres)
3.4947 +/- 0.1313 (DispCorr = No)

3.0 +/- 0.02 (Karino et al.)

Ile (n-butane) solvation energy (without LRC for LJ):
-7.6413 +/- 0.0897 (DispCorr = EnerPres)
-7.7832 +/- 0.1313 (DispCorr = No)
-7.69 (Karino et al.)

Change of the “DispCorr” to “No” significantly affects the solvation free energy, making the discrepancy between my simulations and data of Karino et al. However, the difference is still present.

I think that it would be reasonable if I try to repeat exactly your simulation setup on my computer and then change the MDP options one by one to see what causes the differences in the resulting free energies. Therefore, I would like to kindly ask you to share the input files used for Karino et al. publication for one system (mdp, gro, top files), if it is possible?

PS: please, mind that the values for the free energies which I reported before for ala, asn and ile side chains for the "short" simulations were calculated using 4 digit XTC precision for solute trajectory, and (by a mistake) only 3 digit precision for solution and solvent trajectories. This explains the slight difference between the present and previous data: here I use 4 digit precision for all simulations.

Thank you very much.
With kind regards,
Andrey

Andrey Frolov

unread,

Dec 10, 2013, 10:03:44 AM12/10/13

to ermod...@googlegroups.com

Dear Nobuyuki,

Just for the information
I have recompiled ermod on the supercomputer with the following option and the problem with identical cumulative averages for all blocks disappeared:
"
module load library/openmpi/1.6.4-intel
### Copile ermod
export CC=icc
export CXX=icc
export CCFLAGS=" -O -fPIC "
export CXXFLAGS=" -O -fPIC "
export F77=mpif90
export FC=mpif90
export MPIFC=mpif90
./configure --prefix=`pwd`/EXE --with-mkl
"

The previous version was:
"
module load compiler/intel/12.1
module load library/intelmpi/4.1.0.024
export CC=icc
export CXX=icc
export CCFLAGS=" -O -fPIC "
export CXXFLAGS=" -O -fPIC "
./configure --prefix=`pwd`/EXE --with-mkl
"

Kind regards,
Andrey

Shun Sakuraba

unread,

Dec 11, 2013, 4:28:27 AM12/11/13

to ermod...@googlegroups.com

Hi,

Thank you for reporting the problem. Could you send us "config.log" file, in the case of the *previous* version?
That file is automatically created after you execute ./configure (no need to run "make").

Thanks in advance for your cooperation!

--
Shun SAKURABA, Ph.D.
Postdoc @ Molecular Modeling & Simulation Group, Japan Atomic Energy Agency

Andrey Frolov

unread,

Dec 11, 2013, 5:09:51 AM12/11/13

to ermod...@googlegroups.com

Dear Shun,

Please find attached the config.log for the problematic case.

Kind regards,
Andrey

config.log

Shun Sakuraba

unread,

Dec 11, 2013, 5:21:15 AM12/11/13

to ermod...@googlegroups.com

Thank you very much.
I will track this issue (fixed cumulative average) down with Ticket 30. ( https://sourceforge.net/p/ermod/tickets/30/ )

Andrey Frolov

unread,

Jan 17, 2014, 8:28:43 AM1/17/14

to ermod...@googlegroups.com

Dear Ermod-users,

Here, please, see the analysis of the effect of the combination of two sets of cut-off's and DispCorr=No / EnerPres on density. It means that for the conventional (long) cutoff scheme the mean-field like correction to energy and pressure becomes not that important as in the case of short cutoffs.

Gromacs notations (nm units)

1) Short Cutoff setup
rlist                    = 0.9
vdw-type                 = Switch
rvdw-switch              = 0.7
rvdw                     = 0.75

Here, the system is ile_side_chain analog in water, 100 ps simulation time (1 solute / ~ 892 water molecules)

The densities in kg/m^3 for the two cases:
983.094   (DispCorr = EnerPres)
969.357   (DispCorr = No)

2) Long Cutoff setup:
rlist                    = 1.4
vdw-type                 = Switch
rvdw-switch              = 1.0
rvdw                     = 1.2

Here, the system is pure water, 1 ns simulation time (892 water molecules)

The densities in kg/m^3 for the two cases:
985.235 (DispCorr = EnerPres)
982.175 (DispCorr = No)

Andrey

Reply all

Reply to author

Forward