METADYNAMICS USING PCA-COMPONENTS AS CV--format for reference and eigenvectors data files

richa.s...@gmail.com

unread,

Oct 24, 2013, 2:50:15 AM10/24/13

to plumed...@googlegroups.com

Hello Users,

I am trying to run a metadynamics simulation using the components obtained by PCA as my CV.

I issue the following commands:
grompp_d -f grompp.mdp -c md_adp_nowater.gro -p topol.top
mdrun_d -plumed plumed

This is my input file:
PRINT W_STRIDE 1000
HILLS W_STRIDE 1000 HEIGHT 0.4
PCA FRAME ref.dat EIGENVEC evec1.dat SIGMA 0.1
ENDMETA

Problem is with he format of ref.dat file that I'm using. I get the following error:
PLUMED ERROR : PCA CV: error reading FRAME file [ref.dat]: expecting 4 columns: atomid x y z
Aborting

This is my ref.dat file:
1 0.470 2.301 0.827
5 0.439 2.440 0.864
11 0.573 2.499 0.907
13 0.582 2.632 0.904
15 0.695 2.723 0.900
21 0.638 0.110 0.931

Is there any specific format to follow when writing a reference file or the eigenvectors coordinate file?
Do we need to give any specific no. of spaces between the 4 columns?

--Regards--
Richa

bipin singh

unread,

Oct 24, 2013, 3:00:55 AM10/24/13

to plumed...@googlegroups.com

The format of ref.dat seems correct. The error is might be due to mismatch between the number of entries in your evec1.dat and ref.dat files, check whether they contain same atom number and identity.

--
You received this message because you are subscribed to the Google Groups "PLUMED users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to plumed-users...@googlegroups.com.
To post to this group, send email to plumed...@googlegroups.com.
Visit this group at http://groups.google.com/group/plumed-users.
For more options, visit https://groups.google.com/groups/opt_out.

--
-----------------------
Thanks and Regards,
Bipin Singh

richa.s...@gmail.com

unread,

Oct 24, 2013, 6:06:56 AM10/24/13

to plumed...@googlegroups.com

Thanks Bipin for replying.

my evec1.dat file is as following:
1 -0.096 0.184 -0.062
5 -0.127 0.323 -0.025
11 0.007 0.382 0.018
13 0.016 0.515 0.015
15 0.129 0.606 0.011
21 0.072 -2.007 0.042

It has the same no. of atoms and coordinates! I am not able to figure out the problem.

Ludovico Sutto

unread,

Oct 24, 2013, 6:30:00 AM10/24/13

to plumed...@googlegroups.com

Hi,

try to remove any empty lines you might have at the end of the evec1.dat file.

Cheers,

Ludovico

richa.s...@gmail.com

unread,

Oct 25, 2013, 2:52:21 AM10/25/13

to plumed...@googlegroups.com

Thanks Ludovico.

The starting stucture that I m using has protein in water which is equilibrated. I tried metadynamics over this system using the following input files:

Plumed input file:
HILLS HEIGHT 0.4 W_STRIDE 1000
WELLTEMPERED SIMTEMP 300 BIASFACTOR 15

PCA FRAME ref.dat EIGENVEC evec1.dat SIGMA 0.1
ENDMETA

Now the issue is:
Plumed error : unknown option with HILLS keword
Aborting run
Plumed error : unknown option with HILLS keword
Aborting run

Then I tried removing water from the starting structure and topology, using the first eigenvector as CV, simulaion started without errors.

What could be the reason for such an error when I'm trying to run a metadynamics simulation of protein in water?

The mdp file that I'm using is:
;    Input file
;
define              =
; integrator
integrator          = md
nsteps              = 25000000
dt                  = 0.002
;
; removing CM translation and rotation
comm_mode           = Angular
nstcomm             = 1000
;
; output control
nstlog                   = 1000
nstenergy                = 1000
nstxout                  = 0
nstvout                  = 0
nstfout                  = 0
; group definition
nstxtcout                = 1000
xtc-precision            = 1000
xtc-grps                 = protein
;
; neighbour searching
nstlist             = 0
ns_type             = simple
pbc                 = no
rlist               = 0.0
;
; electrostatic
rcoulomb            = 0.0
coulombtype         = Cut-off
epsilon_r           = 80.0
;
; vdw
vdw-type            = Cut-off
rvdw                = 0.0
;
; constraints
constraints              = all-bonds
constraint-algorithm     = lincs
lincs_iter               = 4
;
; temperature
Tcoupl              = v-rescale
tc_grps             = system
tau_t               = 0.1
ref_t               = 300
;
; pression
Pcoupl              = no
;
; initial velocities
gen_vel             = yes
gen_temp            = 300
gen_seed            = 454

Ludovico Sutto

unread,

Oct 25, 2013, 6:28:54 AM10/25/13

to plumed...@googlegroups.com

Hi,

your inputs seem fine and I don't see any reason why it doesn't work with water.

The thing that I would try to debug is to remove the SIGMA 0.1 (or add NOHILLS CV 1) to disable the hills deposition and see if it runs. I would also try with another simple CV (e.g DISTANCE) to check if the problem is the CV or something else in your input.

Finally, remember to use gmx double precision and the ALIGN_ATOMS keyword (check the manual for the syntax) with the set of atoms you use in your PCA CV as specified in the manual.

Cheers,

Ludovico

richa.s...@gmail.com

unread,

Oct 28, 2013, 8:45:33 AM10/28/13

to plumed...@googlegroups.com

Thanks! Ludovico.

I tried using NOHILLS option. Also the distance as CV, but still I get the same errors. I use gmx double precision.

Actually the problem seems to be with the internal code of the PLUMED 1.3.

I tried to run a metadynamics simulation on a system containing aniline di peptide in water. The system contains 2043 atoms of water and 23 atoms of aniline di peptide. When I try to run this simulation using previously mentioned inputs, I get the following error:

Back Off! I just backed up ener.edr to ./#ener.edr.1#
starting mdrun 'Protein in water'
25 steps, 0.1 ps.
*** glibc detected *** /home/user1/gromacs-4.6.1-dp-plumed/bin/mdrun_d: malloc(): memory corruption: 0x00007f5a140a8a00 ***
!!!!! PLUMED ERROR: Unknown option for HILLS keyword
!!!!! ABORTING RUN
!!!!! PLUMED ERROR: Unknown option for HILLS keyword
!!!!! ABORTING RUN
*** glibc detected *** /home/user1/gromacs-4.6.1-dp-plumed/bin/mdrun_d: malloc(): memory corruption: 0x00007f5a1c0c6710 ***
*** glibc detected *** /home/user1/gromacs-4.6.1-dp-plumed/bin/mdrun_d: malloc(): memory corruption: 0x00007f5a0c0a2a90 ***
./build_and_run.sh: line 14: 23395 Segmentation fault (core dumped) /home/user1/gromacs-4.6.1-dp-plumed/bin/mdrun_d -s md_nohills_pcacv.tpr -plumed wt_plumed

Althogh I get different errors each time I run the same simulation. But seeing the malloc error it seems to me that only a specific no of arrays have been defined in the internal code of the PLUMED 1.3. This could be one reason the code is not able to read the total no. of atoms in my system and hence giving a malloc error.

Keeping that into mind I tried another metadynamics run with only 2 water molecules and the simulation started without errors. Then I also tried with 120 molecules of water and again the simulation started with no errors. This indicates that the array already defined in the code is for a limited no of atoms.

What could be a solution to that?

Thanks and regards
Richa

Ludovico Sutto

unread,

Oct 28, 2013, 10:01:07 AM10/28/13

to plumed...@googlegroups.com

Hi Richa,

looking at the plumed code, the error that you get "Unknown option for HILLS keyword" is raised when the line containing HILLS has some unknown keyword (unsurprisingly), so double check that in your plumed input file the syntax is correct.

Regarding the number of atoms, there are hard-coded variables specifying the size of arrays, namely:

common_files/metadyn.h:#define MAXATOMS_PATH 900

common_files/metadyn.h:#define MAXATOMS_RMSD 900

common_files/metadyn.h:#define MAXATOM_GROUP 30

But from your input you are well below these numbers so it shouldn't be a problem.

Cheers,

Ludovico

richa.s...@gmail.com

unread,

Oct 28, 2013, 10:18:47 AM10/28/13

to plumed...@googlegroups.com

But the same input file works for 120 water molecules! and when trying for 681(2043 atoms) water molecules it gives errors!!

and from lines of code (that I can interpret) it seems that the maximum no. of atoms is 900 and my system contains 2066 atoms.

Carlo Camilloni

unread,

Oct 28, 2013, 10:20:55 AM10/28/13

to plumed...@googlegroups.com

so try to change the number of maxatoms and recompile the code.

Message has been deleted

Gaurav Goel

unread,

Oct 28, 2013, 12:25:03 PM10/28/13

to plumed...@googlegroups.com

Typically all explicit solvent biomolecular systems are bigger than 900 atoms. Is there a particular reason for hard coding a small array size besides efficiency concerns?

Thanks,
G

Carlo Camilloni

unread,

Oct 28, 2013, 12:30:23 PM10/28/13

to plumed...@googlegroups.com

you are right, but usually one doesn’t not take into account the solvent for example. From a practical point of view in plumed1 the structure for PATH cvs and PCA (the only ones for which that hardcoded value is used) use a lot of memory so it is not possible to use to many atoms. Furthermore also the performances will decrease dramatically. So it is better to look more into the details of one own system in oder to reduce the needed atoms.

C

Message has been deleted

Carlo Camilloni

unread,

Oct 29, 2013, 9:26:43 AM10/29/13

to plumed...@googlegroups.com

Dear Gaurav,

if you have only 40 lines in both the reference and the eigenvector files then in principle MAXATOMS_RMSD is not the problem, as I wrote before this keyword in not related to the maximum number of atoms of the system but to the maximum number of atoms acceptable for a RMSD/PATH/PCA like collective variable. So if the code works after increasing the number of MAXATOMS this should mean, at least in principle, that in your reference file or in your eigenvector file or somewhere else you are trying to use a CV which includes all the atoms of the systems and not only the 40 C-alphas.

Could you please send a tpr, the plumed.dat and the ref and eigenvect files? I would like to reproduce the error because if it is a bug it will be good to fix it.

Best,

Carlo

On 29 Oct 2013, at 14:00, Gaurav Goel <gaurav...@gmail.com> wrote:

Dear Carlo, Thank you for your time.
I looked into the PLUMED code to get better understanding, but some questions remain.
Our system has a protein molecule (insulin) solvated in water. We are performing PCA on 40 c-alphas of insulin. So there are corresponding 40 entries in the reference structure file and the eigenvector file. However, we also have approximately 2000 water molecules as part of our md system and the .tpr file is based on the full configuration file (1 protein + 2000 water).

We do not receive error on increasing MAXATOMS_PATH, MAXATOMS_RMSD, etc. in correspondence to total number of atoms in the protein+water system. However, our CV is based only on 40 c-alpha atoms. Is their an easy way to separately hardcode two "MAXATOMS" values-- one for CV atoms and one for total atoms in the system? Or, you will not recommend using this code for a large explicit solvent system even when taking care that total number of atoms in the CV list is small.

Thanks,
G

Ludovico

unread,

Oct 31, 2013, 9:50:09 AM10/31/13

to plumed...@googlegroups.com, richa.s...@gmail.com

Hi Richa,

leaving off the MAXATOMS variable which, as Carlo correctly suggested, is probably not the problem, you could try to start your simulation from a different structure than the one you align your system to, if that was the case!

Indeed, sometimes it can happen that aligning the structure on itself leads to numerical instabilities which bring the simulation to crash.

Hope it helps.

Cheers,

Ludovico

richa.s...@gmail.com

unread,

Nov 6, 2013, 6:18:41 AM11/6/13

to plumed...@googlegroups.com

Hi Carlo,

Sorry for the delay... Our system contains aniline di peptide in water and the files that you asked for are attached with this mail.

description of files==>
evec1.dat --> eigenvector file
npt_wat.gro --> gro file of aniline di peptide in water (681 water molecules)
ref.dat --> reference structure file (structure after production run)
topol.top --> topology file
wt_plumed.dat --> plumed input file
md_hills_pcacv.tpr --> the tpr file generated after grompp
md.mdp --> the mdp file for generating a .tpr file

commands given:
grompp_d -f md.mdp -c npt_wat.gro -p topol.top -o md_hills_pcacv.tpr
mdrun_d -s md_hills_pcacv.tpr -plumed wt_plumed.dat

evec1.dat

npt_wat.gro

ref.dat

topol.top

wt_plumed.dat

md_hills_pcacv.tpr

md.mdp

richa.s...@gmail.com

unread,

Nov 6, 2013, 6:23:58 AM11/6/13

to plumed...@googlegroups.com, richa.s...@gmail.com

Hi Ludovico,

The starting structure for my simulation with 'plumed' is the structure that I got after npt equilibration when I ran simple MD production run in GROMACS.
The structure that I used to generate the eigenvectors was the one that I got after the completion of simple MD production run. Also I used this structure to generate the ref.dat file.

regards
Richa

Ryoji Takahashi

unread,

Oct 27, 2015, 5:03:42 AM10/27/15

to PLUMED users, richa.s...@gmail.com

Hi,

I am learning how to use path collective variables in gromacs. These attached files should be useful for me. However, I am not able to download them.