# How to measure intermolecular atom distances

34 views

### Yaoting Zhang

Mar 8, 2023, 4:46:24 PM3/8/23
to freud-users
Hi,

I am wondering if Freud could be used to find the min distance between molecules using Freud.
Let's say I have one thousand XX molecules, and each XX has three atoms, and I want to find each XX's closest neighbours by measuring the intermolecular atom distance. I have to sort by intermolecular atomic distance important because my actual system is very flat and all have different sizes so I cannot use centre-of-mass to estimate neighbours.

I tried to make Freud.locality into a 3d array in the form of
points=[[[3,4,5], [5,5,4],[5,5,5]],[[1,2,3],[4,5,5],[1,1,1]],...[[2,2,2],[5,6,7],[6,7,9]]]
aq = freud.locality.AABBQuery(box, points)
but it seems it only takes the 2d array.

Thank you

### Tommy Waltmann

Mar 9, 2023, 12:51:20 PM3/9/23
to freud-users
Hi,

Based on the explanation given, I don't fully understand why using a single center-of-mass point to represent the XX molecules is not possible. It would be an efficient way to represent such small molecules, and the distance between molecules and neighbors could be easily queried using freud's query API. A flag could even be set so the query results come back sorted by distance.

However, assuming this is not possible, I think the best way to achieve your goal is to do a neighbor query for each pair of molecules. This method will be significantly slower than just using a single center-of-mass distance to represent each molecule, but you'll get back the pairwise atom distances between each pair of molecules, sorted by distance. Then that could be used to quantify distance between XX molecules.

One last potential option is to do a single neighbor query with all the atoms as points, and then do some post-processing on the resultant neighborlist with prior knowledge of which point indices are in which XX molecule.

Give those options some thought and let me know which one sounds the best for your situation.

-Tommy

### Yaoting Zhang

Mar 9, 2023, 1:49:29 PM3/9/23
to freud-users
Hi Tommy,

Based on the explanation given, I don't fully understand why using a single center-of-mass point to represent the XX molecules is not possible. It would be an efficient way to represent such small molecules, and the distance between molecules and neighbors could be easily queried using freud's query API. A flag could even be set so the query results come back sorted by distance.
In my actual system, I have multiple flat different disk-like molecules with different numbers of atoms. The size of the disk range is large (10-100 Angstroms), and because of the flat structure, COM is not very applicable in my case.

I think the best way to achieve your goal is to do a neighbor query for each pair of molecules. This method will be significantly slower than just using a single center-of-mass distance to represent each molecule, but you'll get back the pairwise atom distances between each pair of molecules, sorted by distance. Then that could be used to quantify distance between XX molecules.
I agree this is the way to go. I am unsure how Freud can do that because it seems all the examples use 2D arrays (points/atoms and coordinates), but for molecules, I will need a 3D array (molecule, atoms/points, and coordinates). When I tried a 3D array for freud.locality.AABBQuery I got the dimension error.
Some suggestions will be very helpful.

Yaoting

--
You received this message because you are subscribed to a topic in the Google Groups "freud-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/freud-users/gBgqxdbF7Eg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to freud-users...@googlegroups.com.

Mar 9, 2023, 2:00:50 PM3/9/23
to freud-users
Another approach would be to compute the nearest 3 neighbors between every atom and every other atom. It is probable that 2 of the nearest neighbors are atoms in the same molecule, but the third must be in another molecule. If you can determine which points belong to each molecule, you can create a neighborlist and filter it. See this example: https://gist.github.com/bdice/010a97668fbe8dd3e6551f93224a0198

I see your actual data has a varying number of atoms in each molecule -- you can modify the query_args accordingly. freud's neighbor finding is quite efficient so I would focus on generating a neighbor query that is sure to contain your desired neighbors (at least one neighbor in another molecule), and then filter it after the fact.

### Tommy Waltmann

Mar 9, 2023, 3:25:31 PM3/9/23
to freud-users
Yaoting,

To achieve your goal, you'll need to use explicit python loops over molecules instead of trying to give AABBQuery a 3-dimensional array of data. Something vaguely like the algorithm below:

for each pair of molecules M1, M2 in your system:
1) nlist = freud.AABBQuery(box, points_M1).query(points_M2).toNeighborList()
2) using the nlist, compute the distance metric between M1 and M2
3) Do something useful with the computed distance

### Yaoting Zhang

Mar 21, 2023, 11:23:15 AM3/21/23
to freud-users
Thank you so much for your help Tommy

This is what I end up using
Let me know if I can optimize it somehow

let's say I have n molecules in the form of mol1, mol2 in the system
system=[mol1 mol2, mol3... moln], where each mol is the XYZ coordinates of all the atoms in molecule.

test_real=[]
for count1, value1 in enumerate (system):
for count2, value2 in enumerate(system[1:], start=1):
if count1!=count2 and count1<count2: #ensures that we don't loop molecule itself and no repeats as 1-2 and 2-1 are the same
complete=[]
# this nlist finds the shortest distance between atoms of each molecule
nlist = freud.AABBQuery(box, value1).query(value2, {'mode': 'nearest','num_neighbors':1}).toNeighborList()
print("molecule", count1, count2)
#filter through this neighbour nlist to obtain the shortest distance between molecules
for (i,j), distance in zip(nlist, nlist.distances):
temp=([count1,count2, distance])
complete.append(temp)
complete2=np.array(complete)
column=2 #selecting the distance column
# save only the shortest distance between molecules
shortest=complete2[complete2[:, column].argmin()]
print(shortest)
test_real=np.append(test_real, shortest, axis=0)
this should print out something like this
molecule 0 1 [0. 1. 2.55192995] molecule 0 2 [0. 2. 3.35122228] molecule 1 2 [1. 2. 1.08505833]
....

The test_real needs to be separated because test_real is  in this form
array([0. , 1. , 2.55192995, 0. , 2. ,
3.35122228, 1. , 2. , 1.08505833])

test_real
new=np.array_split(test_real, (test_real.shape[0] + 2) // 3)
new2=np.array(new)
new2

array([[0. , 1. , 2.55192995], [0. , 2. , 3.35122228], [1. , 2. , 1.08505833]])

The new2 array can now be looped to create precursors for the intermolecular neighbour list

query_indices=[]
point_indices=[]
distances=[]
cutoff=10 #this is an optional filter otherwise, your intermolecular list might be too large
for i in new2:
if i[2]<cutoff:
query_indices.append(i[0])
point_indices.append(i[1])
distances.append(i[2])

The intermolecular neighbour list is created by using the function below

n=len(query_indices) # n is the same for points and query
nlist_mol=freud.locality.NeighborList.from_arrays(n, n, query_indices, point_indices, distances)

The nlist_mol is the intermolecular neighbour list.

It was a fun deep dive into Freud.