On 21 Mar, 2012, at 15:26, jmborr wrote:
> I want to calculate the distances between atoms in two atomGroup
> objects, the number of atoms need not be the same.
> I could not find a dedicated function. Is my only chance to resort to
> cast the coordinates to ndarray and use numpy?
Probably the easiest is to use MDAnalysis.analysis.distances.distance_array on the coordinates of two AtomGroups, eg
g1 = universe.selectAtoms("(resname ARG or resname LYS) and not backbone")
g2 = universe.selectAtoms("(resname ASP or resname GLU) and not backbone")
dist = MDAnalysis.analysis.distances.distance_array(g1.coordinates(), g2.coordinates())
This will give you the "matrix" of distances.
You can also look at MDAnalysis.analysis.contacs for some contact analysis classes.
Oliver
--
Oliver Beckstein * orbe...@gmx.net
skype: orbeckst * orbe...@gmail.com
> Thank you Oliver, this will help a lot :)
That's good :-)
>
> At the risk of being a spammer, I have another question.
This is totally on-topic so I am certainly not going to ban you from the mailing list for asking sensible questions!
> Would the statement universe.selectAtoms("protein around 3.5 ans resname WAT")
> provide a different set of atoms as I iterate in a trajectory?
> Something like this:
>
> for ts in universe.trajectory:
> atoms=universe.selectAtoms("protein around 3.5 AND resname WAT")
Yes, typically atoms will change in size between timesteps. When I do this I typically add a line such as
print "Solvation shell with %d waters" % atoms.numberOfResidues()
Also note that you can get the water molecules with
atoms.residues
Note that the distance selection can be made a little bit faster by not using the selection parser but explicitly constructing the KD-Tree, as described under http://packages.python.org/MDAnalysis/documentation_pages/KDTree/NeighborSearch.html
Cheers,
Oliver
--
> Sweet! I wanted to find pairs of solvent-protein atoms in contact. The
> KDE-tree will help me reduce the number of distance computation:
That should work but depending on system sizes you can probably make it a little bit faster (check with timeit, %time/%timeit in IPython or the python profilers), see comments below.
>
> water = universe.selectAtoms('name OH2')
> protein = universe.selectAtoms('protein and not H*')
>
> for frame in ts.trajectory:
> ns_w = NeighborSearch.AtomNeighborSearch(water)
> ns_p = NeighborSearch.AtomNeighborSearch(protein)
> closeWAT = ns_w.search_list(protein,4.6)
> closePROT=ns_p.search_list(water,4.6)
Better:
closePROT=ns_p.search_list(closeWAT, 4.6)
(only uses the waters that we know are close)
>
> distance_array( frame[ closeWAT.indices() ],frame[ closePROT.indices()] )
Use coordinates() directly:
distance_array( closeWAT.coordinates(), closePROT.coordinates() )
However, I quickly tested this with the GRO/XTC pair in the test files and found that it is faster (~x 2 [1]) not to build a KD-tree for the protein but directly calculate the distance array:
for frame in ts.trajectory:
ns_w = NeighborSearch.AtomNeighborSearch(water)
closeWAT = ns_w.search_list(protein,4.6)
d = distance_array( closeWAT.coordinates(), protein.coordinates() )
You can then get a list of distances (e.g. for a histogram) as d[d < 4.6] or the total number as np.sum(d < 4.6). Getting individual pairs can be done with np.where(d<4.6).
Oliver
[1] For reference (in IPython):
import MDAnalysis;
from MDAnalysis.tests.datafiles import *
import MDAnalysis.KDTree.NeighborSearch
u = MDAnalysis.Universe(GRO,XTC)
protein = u.selectAtoms("protein and not name H*")
water = u.selectAtoms("resname SOL and name OW")
# the following would be done for each time step
ns_w = MDAnalysis.KDTree.NeighborSearch.AtomNeighborSearch(water)
shell = ns_w.search_list(protein, 4.6)
# %timeit is IPython's built-in access to the timeit benchmarking module
# (ran on a Macbook Pro, 2.66 GHz Core Duo))
%timeit -n 10 -r 3 d = distance_array(protein.coordinates(), shell.coordinates())
#10 loops, best of 3: 19 ms per loop
%timeit -n 10 -r 3 ns_p = MDAnalysis.KDTree.NeighborSearch.AtomNeighborSearch(protein); close_p = ns_p.search_list(shell, 4.6)
#10 loops, best of 3: 44.4 ms per loop