Compute the MSD of water using MDAnalysis-2.00-dev0

530 views
Skip to first unread message

Alex CHOU

unread,
Mar 18, 2021, 5:01:47 AM3/18/21
to MDnalysis discussion
Dear all developers 

I am trying to compute the msd of water in a cubic box and I found the results of MDAnalysis are totally different from the msd generated by gmx msd. Can anyone help me figure out if something is wrong with my code. I have attached the code and results, please have a look. 
import MDAnalysis as mda
import MDAnalysis.analysis.msd as msd
import matplotlib.pyplot as plt
import numpy as np

u = mda.Universe('npt0.tpr', 'npt0_mol.xtc')
MSD = msd.EinsteinMSD(u, select='all', msd_type='xyz', fft=True)
MSD.run()

nframes = MSD.n_frames
timestep = 10 # this needs to be the actual time between frames
lagtimes = np.arange(nframes)*timestep # make the lag-time axis

#Plot
plt.xlabel('Time (ps)')
plt.ylabel('MSD')
plt.title('MSD')
plt.plot(lagtimes,MSD.timeseries)
plt.show()

Regards,
Jun

Hugo Macdermott-Opeskin

unread,
Mar 18, 2021, 5:07:32 AM3/18/21
to mdnalysis-...@googlegroups.com
Hi Alex. 
GROMACS and MDA do use slightly different algorithms. However the results should still be very comparable. 

Would you be able to plot the two MSDs on the same axis so we can compare directly ?
Cheers 
Hugo 

--
You received this message because you are subscribed to the Google Groups "MDnalysis discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mdnalysis-discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mdnalysis-discussion/1a66f5c0-8af2-463b-8bad-c897da5a5f24n%40googlegroups.com.
--
Hugo MacDermott-Opeskin
PhD Candidate, RSC ANU

Alex CHOU

unread,
Mar 18, 2021, 5:12:04 AM3/18/21
to MDnalysis discussion
Thanks Hugo,  the attachments of last email are missing, I have attached here, please have a look.
gmx_msd.PNG
mdanalysis.PNG

Hugo Macdermott-Opeskin

unread,
Mar 18, 2021, 5:17:10 AM3/18/21
to mdnalysis-...@googlegroups.com
Would you be able to share your trajectory and TPR  also please ? 

Alex CHOU

unread,
Mar 18, 2021, 5:24:10 AM3/18/21
to mdnalysis-...@googlegroups.com
Sure, please check the attachment.

You received this message because you are subscribed to a topic in the Google Groups "MDnalysis discussion" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/mdnalysis-discussion/llGYzS_c8pM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to mdnalysis-discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mdnalysis-discussion/CAOeAkd%3Da8M-V3%3DQvkk7JifnShLe2sT1OU1zMA7j0%2BkyL%3D9ZG%2Bw%40mail.gmail.com.
npt0.tpr
npt0_mol.xtc

Hugo Macdermott-Opeskin

unread,
Mar 18, 2021, 6:44:48 AM3/18/21
to MDnalysis discussion
Hey Alex, 
Could I also please have the trajectory before you apply any pbc corrections to it, i.e pre `gmx trjconv`
I think  your problem is related to the way the coordinates have been wrapped but I will investigate.
Cheers

Alex CHOU

unread,
Mar 18, 2021, 7:49:05 AM3/18/21
to mdnalysis-...@googlegroups.com
I have attached it, please have a look. Thanks.

Regards,

npt0.xtc

Hugo Macdermott-Opeskin

unread,
Mar 18, 2021, 7:54:09 PM3/18/21
to mdnalysis-...@googlegroups.com
Hi Alex,

You are not doing anything wrong at all! 
The behaviour you see is to do with the coordinate wrapping, I'll explain how.

GROMACS keeps all of its coordinates inside the primary box, this means that for a cubic box the maximum distance you can ever be apart in the primary box is  3**(1/2)*boxlength.
This is why the MSD is plateauing very quickly, as all the coordinates are now on average  3**(1/2)*boxlength away from where they started, which occurs quickly in a small box with fast diffusion (eg a small box of water).
We should have caught this earlier so not at all your fault.

What gmx msd does is unwrap the coordinates back to where they should be, in the adjacent boxes. We currently do not have an on the fly transform to do this, but are looking to develop one. 
Thanks for catching this. I will add the requisite warnings to the docs. Stick to gmx msd for now.

Let me know if you have any questions

Cheers

Hugo



Alex CHOU

unread,
Mar 18, 2021, 8:40:24 PM3/18/21
to mdnalysis-...@googlegroups.com
Hi Hugo, 

Thanks for your information. When I try to unwrap the trajectory using 'gmx trjconv -pbc nojump', then use the same code and get comparable results with the gmx msd. Thanks for your help.

Regards,


Hugo Macdermott-Opeskin

unread,
Mar 18, 2021, 9:10:48 PM3/18/21
to mdnalysis-...@googlegroups.com
Okay awesome. 
That’s good news that no jump can restore the coordinates. We lack the capability to do that ourselves. 
Cheers


Alex CHOU

unread,
Mar 21, 2021, 10:09:32 PM3/21/21
to MDnalysis discussion
Dear Hugo, 

I have another question related the MSD computation. For the current stable version of MDAnalysis, there is an example code used for computing MSD of water around a protein. However, to my knowledge, we cannot compute this since the number of water molecules changes with time. Is there anything wrong with that code, or I just misunderstand. Look for your answers, thanks.

import MDAnalysis
from MDAnalysis.analysis.waterdynamics import MeanSquareDisplacement as MSD

u = MDAnalysis.Universe(pdb, trajectory)
select = "byres name OH2 and cyzone 11.0 4.0 -8.0 protein"
MSD_analysis = MSD(universe, select, 0, 1000, 20)
MSD_analysis.run()
#now we print data ready to graph. The graph
#represents MSD vs t
time = 0
for msd in MSD_analysis.timeseries:
      print("{time} {msd}".format(time=time, msd=msd))
      time += 1

#Plot
plt.xlabel('time')
plt.ylabel('MSD')
plt.title('MSD')
plt.plot(range(0,time),MSD_analysis.timeseries)
plt.show()

Jun

Hugo Macdermott-Opeskin

unread,
Mar 21, 2021, 10:28:33 PM3/21/21
to mdnalysis-...@googlegroups.com
Hi Alex.
Another very relevant perceptive question.
You can technically compute the MSD for a mobile selection, by averaging only the lagtimes for particles that are inside the selection at any given time, using a list of lists structure. 
You are possibly confused by two separate implementations of MSD, one in waterdynamics and one in EinsteinMSD, something that we should probably clean up.

However
I personally do not use this kind of analysis as the MSD is not well defined at long lag times (I'm not sure how the waterdynamics MSD code handles this). 
Additionally, you need careful analysis to understand how quickly your particles enter and leave the selection zone so as to understand what segments of the MSD have adequate sampling and can be used for diffusion coefficient calculation. 

Hope this helps.

Alex CHOU

unread,
Mar 21, 2021, 10:48:30 PM3/21/21
to mdnalysis-...@googlegroups.com
Hi Hugo,

Thanks for your quick response. You are right. I just want to compute the msd of interracial water in a water slab, using the same method described in Pu Liu's paper (108, 6595-6602). Specifically, the msd of water in the interfacial zone will be computed using EinsteinMSD and also consider the survival probability of water in that zone. I have noticed that the example includes the method and code for both parameters, can we do it using MDAnalysis?

I also wrote a code based on the example to analyze the distribution of hydrogen bonds of water slab along the Z axis, and I got the average hbond number per water is 2.5, which is much lower than published paper (3.5). Can you please also have a look, thanks.

Best wishes,
Jun

HBonds.py

Hugo Macdermott-Opeskin

unread,
Mar 21, 2021, 11:07:45 PM3/21/21
to mdnalysis-...@googlegroups.com
Hi Alex,

You cannot use EinsteinMSD with an updating atom group, so your proposed code does not seem viable.
You would have to use the water dynamics code or your own version.
Could you also provide an actual link to the paper you are talking about? I'm sure it can be done with some combination of MDA and your own code. 

The references contained in this page provide a balanced treatment of the issue of diffusion at interfaces, including some of the controversy and issues that are highlighted in this paper.

I'm not sure where the discrepancy comes wrt hydrogen bonds, however temperature seems like it may be important.

In general it is outside the scope of the mailing list to provide specific directions for your analysis or code review for your code.
You will have to investigate the correctness of your code yourself. Good luck! Hopefully this has provided you with some guidance.


Alex CHOU

unread,
Mar 22, 2021, 9:44:13 AM3/22/21
to MDnalysis discussion
Dear Hugo, 

Thanks for your kind reply. I have already obtain a good result of the msd of water at the interface. but when I reading the code for the survival probability,  I cannot understand the meaning of tau and tau_max: 
1. we have already  input the start and stop, why we still need a tau_max? is this value only used for optimization the efficiency?

2. We can get the probability for each tau, does the value represent the likelihood of find particles in the zone for each tau. 

3. What is the unit for the output tau, picosecond right?

Looking forward to your reply, thanks.

Paul Smith

unread,
Mar 24, 2021, 12:37:31 PM3/24/21
to MDnalysis discussion
Hi Alex,

The problem with your code is that you consider a slab of water 5 Angstrom thick, calculate the hbonds in this slab only then move onto the next one. This means that you are ignoring the hbonds formed by water molecules in neighbouring slabs. What you need to do is find all hydrogen bonds in the system, then count the number in each 5 Angstrom slab. You can check out an example of how to do this in this notebook: https://p-j-smith.github.io/UserGuide/examples/analysis/hydrogen_bonds/hbonds.html.

Hope this helps!

Cheers,
Paul

Reply all
Reply to author
Forward
0 new messages