Path CV woes

391 views
Skip to first unread message

João Henriques

unread,
Nov 10, 2017, 8:05:37 AM11/10/17
to plumed...@googlegroups.com
Dear users and developers,

After a few months of trial and error (and no success), I finally gave up using "conventional" CVs with my system and decided to move on to path CVs. I read the original path CV paper [Branduardi, Gervasio and Parrinello (2007) J. Chem. Phys. 126, 054103] along with several papers of its application to cyclin dependent kinases (CDKs) and small drug candidates by Gervasio & Co. (because my system also contains a CDK, despite being different other aspects that I do not wish to dwell upon), and everything seemed to be more or less clear to me.

In short, I ran a short plain MetaD simulation with a contact map of interest and used the first unbinding event to define states A and B, along with 3 intermediates, for a total N = 5. I used the driver to calculate the optimal RMSD of the unbinding process and chose equally distant frames within RMSD space. I dumped the frames containing only the atoms of interest and, from the Belfast tutorial on adaptive variables, I figured out how to estimate my lambda value.

As for sigma, there's where I started to get in trouble. I failed to understand where I can get a good estimate of both sigmas, one for S and another for Z. Thus, I decided to give them both the same value, which was taken as the standard deviation of the RMSD (of this subset of atoms of interest) from a long unbiased MD simulation of this system. I should have probably used adaptive sigmas...

Anyway, I started the simulation with the following plumed code:

"""
p: PATH REFERENCE=frames.pdb LAMBDA=15 TYPE=OPTIMAL

# every 1 ps
m: METAD ARG=p.spath,p.zpath HEIGHT=0.5 SIGMA=0.015,0.015 PACE=500 FILE=HILLS

# every 10 ps
PRINT ARG=p.spath,p.zpath,m.bias STRIDE=5000 FILE=COLVAR
"""

The simulation ran for about 10 ns before it crashed with the classic:

"""
X particles communicated to PME rank Y are more than 2/3 times the cut-off
"""

However, this is not what worried me the most. What really worried me was that after a quick plot of S vs. Z, I noticed that Z takes on negative values. From Branduardi's paper mentioned above, Z(R), as defined in Eq. 9, should not take negative values assuming lambda is positive. Right? If so, does anyone have any idea of what is going on? Any input would be really appreciated.

Best regards,
J

João Henriques

unread,
Nov 10, 2017, 8:12:19 AM11/10/17
to plumed...@googlegroups.com
Quick note. The sign of Z(R) is actually independent of lambda and should always be positive. If not, I blame Wolfram Alpha and not my absolutely atrocious mathematics background :)

J

Giovanni Bussi

unread,
Nov 10, 2017, 10:58:42 AM11/10/17
to plumed...@googlegroups.com
Hi,

actually Z can be negative.

If you have more than one of the images that is close enough to the simulated structure, then the argument of the logarithm becomes >1 and Z becomes negative

Giovanni


--
You received this message because you are subscribed to the Google Groups "PLUMED users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to plumed-users+unsubscribe@googlegroups.com.
To post to this group, send email to plumed...@googlegroups.com.
Visit this group at https://groups.google.com/group/plumed-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/plumed-users/CAHv45qMkqNuX4FVUWhtFS7r9TbPVJr25KbcNFCbZw3EzHq9iXw%40mail.gmail.com.

For more options, visit https://groups.google.com/d/optout.

Davide Branduardi

unread,
Nov 10, 2017, 11:09:46 AM11/10/17
to plumed...@googlegroups.com
Hi 

assuming the same value for sigma is generally not right. You're using a pretty small at least for S and I worry that's connected with the instability you have.
I would recommend you using a fraction of the image spacing. 
If you have two images on average distant 0.5 Ang then I'd use sigma for Z=0.2*0.2 . For S I'd use a similar argument. If you move of 1 unit in S this will be like moving 0.5 RMSD units.
therefore 0.2 units is about 0.4 S units. Quite different from your initial value. 
Also, for deposition rate I do start with 0.1 kcal/mol/ps . This is very system dependent but this puts you in a regime where you can't screw your dynamics.
Another thing you may want to see, by plotting s and z, is that they are continuous and you do not observe  jumps in discrete values of S. I thin you might be already in that case since you are measuring negative values of Z but you may have surprises if your frames are not well equally spaced. 
Cheers

Davide
 

******************************************************
Davide Branduardi
davide.b...@gmail.com
Web:
https://sites.google.com/site/davidebranduardi
******************************************************

--
You received this message because you are subscribed to the Google Groups "PLUMED users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to plumed-users+unsubscribe@googlegroups.com.
To post to this group, send email to plumed...@googlegroups.com.
Visit this group at https://groups.google.com/group/plumed-users.

João Henriques

unread,
Nov 10, 2017, 12:05:33 PM11/10/17
to plumed...@googlegroups.com
@Giovanni: You're absolutely on point, I knew I was botching this up somehow. I see now why it can assume negative values, thank you.

@Davide: Thank you very much for all those expert tips. Very, very helpful. I will retry the simulation with more sensible sigmas and hopefully it will no longer crash. By the way, I do seem to be experiencing those jumps you mention (see attachment), and this is something I also had already noticed but did not comment on my previous email.

Once more, thank you very much Giovanni and Davide for your input, it was really appreciated!

Have a nice weekend,
J



quick_plot.png

Davide Branduardi

unread,
Nov 10, 2017, 1:34:00 PM11/10/17
to plumed...@googlegroups.com
Yeah 

few comments looking to your plot
the lambda might be too small: that's somethign I can say from the curvy corners and from the fact that you never reach values close to s=1 and S=5. 
In general, when I have a well parametrized path, i get something that is, say, 1.001 as minimum value (or even less). And that I believe is not because your barrier is too large, that looks really like an effect of your lambda.
Try not to go to very high Z values. With experience we learnt that values beyond z=4 (i.e. RMSD 2Ang from the path) are far to degenerate to provide a meaningful free energy landscape.
Good luck!

D

******************************************************
Davide Branduardi
davide.b...@gmail.com
Web:
https://sites.google.com/site/davidebranduardi
******************************************************

João Henriques

unread,
Nov 10, 2017, 4:39:10 PM11/10/17
to plumed...@googlegroups.com
Thank you Davide, these are great tips and advice you're sharing. It's not only very valuable to me, it will also surely help other rookies venturing into this path (terrible pun, but I couldn't help it).

Regarding the high Z values, I see what you mean about degeneracy. I will definitely keep it in mind and will eventually use upper walls at a later stage. For the time being, I'm mostly interested in understanding all the intricacies of the method and getting a simulation to run without crashing. Hopefully, these first trial runs will be able to produce relatively meaningful data, that can help me find a better path which I can then refine for use in proper production runs.

Best regards,
J

João Henriques

unread,
Nov 13, 2017, 8:38:07 AM11/13/17
to plumed...@googlegroups.com
Dear Davide,

Just two additional questions on the topic.

[1] As stated before, my simulation contains 5 nodes, which are 0.2 nm apart in RMSD space (using the "optimal" algorithm). Using the example you provided in your email as inspiration, I deduced that my sigmas should be 0.4 for S and 0.08 nm for Z. Is there any special reason for selecting such factor, i.e., 0.4, other than personal experience from trial and error?

[2] As for lambda, you later mention that based on the curvy shape of the plot near the start and end nodes, my lambda is too short. However, using the rule of thumb in the Belfast tutorial on adaptive variables, lambda should actually be even smaller (2.3/0.2 ≈ 11.5 nm^-1) than the one I used (15 nm^-1). Do you have any comments on this particular subject? What is the rationale behind formula used as rule of thumb? I have tried using lambda = 50 and it does seem to work better...

Thank you for your attention,
Best regards,
J


Davide Branduardi

unread,
Nov 14, 2017, 2:50:40 AM11/14/17
to plumed...@googlegroups.com
Hi Joao

On Mon, Nov 13, 2017 at 1:38 PM, João Henriques <joao.m.a....@gmail.com> wrote:
Dear Davide,

Just two additional questions on the topic.

[1] As stated before, my simulation contains 5 nodes, which are 0.2 nm apart in RMSD space (using the "optimal" algorithm). Using the example you provided in your email as inspiration, I deduced that my sigmas should be 0.4 for S and 0.08 nm for Z. Is there any special reason for selecting such factor, i.e., 0.4, other than personal experience from trial and error?
Talking about the ratio between the two one simple consideration is that you are using RMSD as metrics in both S and Z so the underlying assumption is that they diffuse with the same velocity in that space. translating this into CV terms I came with that number. Other than that, the precise value comes from the fact that in structured protein RMSD stays below 2Ang. This is to say that around 2 Ang your RMSD is almost blind so it makes sense to choose something smaller than that.
Using something very small (i.e. 0.001 for Z) would probably hurt at high value of Z where geometric diffusion is high. In that space you will add "needle-like" hills which will interfere with your integrator stability. So I find that values of 10^-1 are generally well behaved and sorry I was talking about Ang, so all those number should be converted into nm in your case (sorry the code I am working with recently uses Ang units).  

[2] As for lambda, you later mention that based on the curvy shape of the plot near the start and end nodes, my lambda is too short. However, using the rule of thumb in the Belfast tutorial on adaptive variables, lambda should actually be even smaller (2.3/0.2 ≈ 11.5 nm^-1) than the one I used (15 nm^-1). Do you have any comments on this particular subject? What is the rationale behind formula used as rule of thumb? I have tried using lambda = 50 and it does seem to work better...
 
wait. You used 2.3/RMSD. The formula should be 2.3/(MSD)= 2.3/(0.2*0.2)=57.5 .

Cheers

D

João Henriques

unread,
Nov 14, 2017, 4:26:30 AM11/14/17
to plumed...@googlegroups.com
Hi Davide,

I somehow put it in my mind that |Xi-Xi+1| = RMSD because I'm using RMSD instead of MSD. In the tutorial, it is clearly stated that |Xi-Xi+1| = d(X,Xi)^2 where d(X,Xi) is the RMSD, but I still went ahead my stubbornness in the name of some nonsensical unit "coherence" in Eq. 9. It's complete nonsense, there's no point in even trying to explain my faulty rationale. Alas, I see my mistake now, and it perfectly explains the better behavior of my simulation using lambda = 50.

Once more, thank you very much Davide, your input is greatly appreciated.

Best regards,
J

Reply all
Reply to author
Forward
0 new messages