Dear all,
A few months ago I asked for help in detecting surface residues of a protein in a MD simulation (I can not access the theme anymore).
I am looking for a tool that will give me a number of certain residues (Argenine, Glutamate...) on the protein surface.
I got very helpful piece of code (which I slightly adapted):
##################################################################
import MDAnalysis as mda
import numpy as np
import matplotlib.pyplot as plt
plt.matplotlib.style.use("ggplot")
import pandas as pd
from MDAnalysis.lib import distances
import os, sys
TPR = "md.tpr"
XTC = "traj_160-163.xtc"
u = mda.Universe(TPR, XTC)
acidics = u.select_atoms("resname ASP GLU and not name H*")
water = u.select_atoms("resname SOL and name OW")
dmax = 3
def get_exposed_residues(atoms, water, dmax=3.5):
"""Find all residues for which atoms are within dmax of water."""
dij = distances.distance_array(atoms.positions, water.positions,
box=atoms.universe.trajectory.ts.dimensions)
exposed_atoms = np.any(dij <= dmax, axis=1)
return atoms[exposed_atoms].residues
results = np.zeros((u.trajectory.n_frames, 2)) # (time, N_exposed)
for i, ts in enumerate(u.trajectory):
exposed_residues = get_exposed_residues(acidics, water, dmax=dmax)
results[i, :] = (ts.time, exposed_residues.n_residues)
np.savetxt('exposed_ASP_GLU_D_3.xvg', results, delimiter=' ')
#################################################################
However, problem with this is, that during the simulations (sometimes even in a crystal structure), there are a few water molecules "inside" the protein, leading to a great number of false positives.
Is there another way to do this - some criteria for distance of residue to other residues? Or simply to count distance to many (10-50) water molecules?
As my python knowledge is limited, I would be very grateful for any help.
Thank you in advance,
Milos