Hello everyone!
I have some simulations of a molecule in water: 1 molecule, 10 molecules, 20 molecules and 30 molecules of sorbitol in water. I wanted to know how many molecules of water are in the first solvation shell of each one of these molecules.
My first thought was to compute the RDF of the molecule in water. So, I tried (example for the system containing 1 molecule):
# define parameters
r_min = 0.0 # minimum distance (in Angstroms)
r_max = 10.0 # maximum distance (in Angstroms)
nbins = 100 # number of bins for RDF
sorbitol_selection = u_sorbitol.select_atoms(f"resname B00")
water_selection = u_sorbitol.select_atoms("resname SOL")
# calculate RDF
rdf_sorbitol = InterRDF(sorbitol_selection,
water_selection,
nbins=nbins,
range=(r_min, r_max),
periodic=True # account for periodic boundary conditions
)
rdf_sorbitol.run()
Then I used a function to calculate the coordination number and got a value of 0.25, which I find strange. I also found the RDF plot strange, in comparison with what I can find in the bibliography (I've attached the plots).
The function I used to calculate the coordination number is:
# coordination number calculation
def calc_coord_number(r_polyol, g_r_polyol, r_min, r_max, rho):
# interpolate g(r) function
g_interpolated = np.interp
# define the function to integrate
def integrand(r):
return g_interpolated(r, r_polyol, g_r_polyol) * r**2
# perform numerical integration
result, _ = integrate.quad(integrand, r_min, r_max)
# calculate CN(r)
CN_r = 4 * np.pi * rho * result
return CN_r
I then said to myself that maybe the value is small because RDF is usually computed regarding particles and not whole molecules. My first question would be if it's possible with MDAnalysis to compute the coordination number with RDF for whole molecules.
I then tried a different approach, specifying a solvation shell size:
import MDAnalysis.lib.distances as distances
cutoff = 3.2
water = u_sorbitol_1.select_atoms("resname SOL")
sorbitol = u_sorbitol_1.select_atoms("resname B00")
N_atoms = water.n_atoms
data = []
for ts in u_sorbitol_1.trajectory:
r = distances.self_distance_array(water.positions, box=ts.dimensions)
in_shell = np.logical_and(r > 0.5, r < cutoff)
Ncoord = np.sum(in_shell) / N_atoms
data.append(Ncoord)
data = np.array(data)
time = np.arange(len(data))
plt.plot(time, data)
plt.xlabel("Time (Frame)")
plt.ylabel("Coordination Number")
plt.show()
I actually saw this code in this forum (thanks, btw!). I think I don't understand it completely yet. If it's calculating the coordination number of water molecules around sorbitol, why doesn't it take the sorbitol atoms under consideration? I had previously tried to get the water molecules within a cutoff throughout the simulation, but used a different approach (sphzone):
solvation_shell_sorbitol_1 = u_sorbitol_1.select_atoms("resname SOL and sphzone 3.2 (resname B00)", updating=True)
Then I iterated over the trajectory to get for how long each molecule of water stays in the solvation shell. My question would be: what's the difference between the distances.self_distance_array method and the sphzone method? I mean, to get the coordination number, can I use sphzone too?
By the way, I'm still having problem defining the size of the solvation shell. If I take under consideration the RDF calculation for the whole molecule in water (figure attached), it would be around 2. But, again, I'm not sure the calculation of the RDF for a whole molecule is correct with MDAnalysis. I used another method too: I calculated the RDF for each atom of sorbitol, got the first local minima for each calculation and then averaged. I got 3.2 approximately, that's why I'm using it. 2 seems a little small to me. I would appreciate if you have any comments regarding it :)
Sorry if the post got too long! Thank you very much, any help is appreciated.
Best,
Maria