Coordination number of a molecule

69 views
Skip to first unread message

Maria Thereza

unread,
Nov 24, 2023, 8:53:34 AM11/24/23
to MDnalysis discussion
Hello everyone!

I have some simulations of a molecule in water: 1 molecule, 10 molecules, 20 molecules and 30 molecules of sorbitol in water. I wanted to know how many molecules of water are in the first solvation shell of each one of these molecules.

My first thought was to compute the RDF of the molecule in water. So, I tried (example for the system containing 1 molecule):

# define parameters
r_min = 0.0  # minimum distance (in Angstroms)
r_max = 10.0  # maximum distance (in Angstroms)
nbins = 100  # number of bins for RDF

sorbitol_selection = u_sorbitol.select_atoms(f"resname B00")
water_selection = u_sorbitol.select_atoms("resname SOL")

# calculate RDF
rdf_sorbitol = InterRDF(sorbitol_selection,
                            water_selection,
                            nbins=nbins,
                            range=(r_min, r_max),
                            periodic=True  # account for periodic boundary conditions
                            )
rdf_sorbitol.run()

Then I used a function to calculate the coordination number and got a value of 0.25, which I find strange. I also found the RDF plot strange, in comparison with what I can find in the bibliography (I've attached the plots).
The function I used to calculate the coordination number is:

# coordination number calculation
def calc_coord_number(r_polyol, g_r_polyol, r_min, r_max, rho):
    # interpolate g(r) function
    g_interpolated = np.interp

    # define the function to integrate
    def integrand(r):
        return g_interpolated(r, r_polyol, g_r_polyol) * r**2

    # perform numerical integration
    result, _ = integrate.quad(integrand, r_min, r_max)

    # calculate CN(r)
    CN_r = 4 * np.pi * rho * result

    return CN_r

I then said to myself that maybe the value is small because RDF is usually computed regarding particles and not whole molecules. My first question would be if it's possible with MDAnalysis to compute the coordination number with RDF for whole molecules.

I then tried a different approach, specifying a solvation shell size:

import MDAnalysis.lib.distances as distances

cutoff = 3.2
water = u_sorbitol_1.select_atoms("resname SOL")
sorbitol = u_sorbitol_1.select_atoms("resname B00")
N_atoms = water.n_atoms
data = []

for ts in u_sorbitol_1.trajectory:
    r = distances.self_distance_array(water.positions, box=ts.dimensions)
    in_shell = np.logical_and(r > 0.5, r < cutoff)
    Ncoord = np.sum(in_shell) / N_atoms
    data.append(Ncoord)

data = np.array(data)
time = np.arange(len(data))
plt.plot(time, data)
plt.xlabel("Time (Frame)")
plt.ylabel("Coordination Number")
plt.show()

I actually saw this code in this forum (thanks, btw!). I think I don't understand it completely yet. If it's calculating the coordination number of water molecules around sorbitol, why doesn't it take the sorbitol atoms under consideration? I had previously tried to get the water molecules within a cutoff throughout the simulation, but used a different approach (sphzone):

solvation_shell_sorbitol_1 = u_sorbitol_1.select_atoms("resname SOL and sphzone 3.2 (resname B00)", updating=True)

Then I iterated over the trajectory to get for how long each molecule of water stays in the solvation shell. My question would be: what's the difference between the distances.self_distance_array method and the sphzone method? I mean, to get the coordination number, can I use sphzone too?

By the way, I'm still having problem defining the size of the solvation shell. If I take under consideration the RDF calculation for the whole molecule in water (figure attached), it would be around 2. But, again, I'm not sure the calculation of the RDF for a whole molecule is correct with MDAnalysis. I used another method too: I calculated the RDF for each atom of sorbitol, got the first local minima for each calculation and then averaged. I got 3.2 approximately, that's why I'm using it. 2 seems a little small to me. I would appreciate if you have any comments regarding it :)

Sorry if the post got too long! Thank you very much, any help is appreciated.
Best,
Maria

rdf.png

Hugo Macdermott-Opeskin

unread,
Nov 28, 2023, 3:00:17 AM11/28/23
to MDnalysis discussion
Hi Maria, 

I am not sure about the specific issues to do with your implementation of CNs, I would point you to the amazing solvation-analysis MDAKit that implements a lot of what you are looking for (https://github.com/MDAnalysis/solvation-analysis). See some of the great tutorials within. Perhaps you could give that a try for your purposes and see if you are still getting unexpected results. If you are we can work to get to the bottom of it. 

Cheers

Hugo

Maria Thereza

unread,
Nov 28, 2023, 4:24:43 AM11/28/23
to MDnalysis discussion
Hi Hugo,

Thanks a lot, I'll look into it!

Best,
Maria

Maria Thereza

unread,
Nov 30, 2023, 8:59:43 AM11/30/23
to MDnalysis discussion
Hello Hugo,

First of all thank you very much for your advice on using Solvation Analysis! I've been playing with it ever since.

I think I came across a problem that I could not solve though. As I mentionned, I'm performing calculations with a multi-atom solute. So I tried:

solute = Solute.from_atoms(sorbitol, {'sorbitol': sorbitol, 'water': water}, solute_name='sorbitol')
solute.run()

But in my case I had: 
AssertionError: Solute could not identify a solvation radius for water sorbitol. Please manually enter missing radii by editing the radii dict and rerun the analysis.

But I actually wanted to know the solvation radius. For now I calculated the RDF for the sorbitol molecule in the water solvent (with MDAnalysis) and took the first local minima as solvation radius, but isn't there a way to do it automatically with Solvation Analysis?

Also, I couldn't plot the RDF for this system with Solvation Analysis. I tried:

solute.plot_solvation_radius('sorbitol', 'water')
plt.show()

But got:

--------------------------------------------------------------------------- KeyError Traceback (most recent call last) Cell In[14], line 1 ----> 1 solute.plot_solvation_radius('sorbitol', 'water') 2 plt.show() File ~/anaconda3/lib/python3.11/site-packages/solvation_analysis/solute.py:672, in Solute.plot_solvation_radius(self, solute_name, solvent_name) 670 solute_name = self.solute_name 671 assert self.has_run, "Solute.run() must be called first." --> 672 bins, data = self.rdf_data[solute_name][solvent_name] 673 radius = self.atom_solutes[solute_name].radii[solvent_name] 674 fig, ax = self._plot_solvation_radius(bins, data, radius) KeyError: 'sorbitol'

I'd appreciate your thoughts on the matter. Thank you in advance! :-)
Best,
Maria

Reply all
Reply to author
Forward
0 new messages