detecting surface residues

ivanovic...@gmail.com

Dec 3, 2018, 10:46:32 AM
to MDnalysis discussion
Dear all,

A few months ago I asked for help detecting the surface residues of a protein in an MD simulation (I can no longer access that thread).
I am looking for a tool that will give me the number of residues of a certain type (arginine, glutamate, ...) on the protein surface.

I got a very helpful piece of code (which I slightly adapted):

##################################################################
import MDAnalysis as mda
import numpy as np
from MDAnalysis.lib import distances

TPR = "md.tpr"
XTC = "traj_160-163.xtc"
u = mda.Universe(TPR, XTC)

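# Heavy atoms of the acidic residues, and the water oxygens
acidics = u.select_atoms("resname ASP GLU and not name H*")
water = u.select_atoms("resname SOL and name OW")

dmax = 3  # contact cutoff in Angstrom; overrides the 3.5 default below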


def get_exposed_residues(atoms, water, dmax=3.5):
    """Find all residues for which atoms are within dmax of water."""

    dij = distances.distance_array(atoms.positions, water.positions,
                                   box=atoms.universe.trajectory.ts.dimensions)
    exposed_atoms = np.any(dij <= dmax, axis=1)
    return atoms[exposed_atoms].residues

results = np.zeros((u.trajectory.n_frames, 2))   # (time, N_exposed)
for i, ts in enumerate(u.trajectory):
    exposed_residues = get_exposed_residues(acidics, water, dmax=dmax)
    results[i, :] = (ts.time, exposed_residues.n_residues)

np.savetxt('exposed_ASP_GLU_D_3.xvg', results, delimiter=' ')
#################################################################


However, the problem is that during the simulations (and sometimes even in the crystal structure) there are a few water molecules "inside" the protein, which leads to a large number of false positives.
Is there another way to do this, for example a criterion based on the distance of a residue to other residues? Or should I simply require contact with many (10-50) water molecules?
As my Python knowledge is limited, I would be very grateful for any help.

Thank you in advance,
Milos

orbe...@gmail.com

Dec 10, 2018, 3:53:30 PM
to MDnalysis discussion
Hi Milos,

I don't think that there is a simple solution. If your protein is roughly spherical then you could exclude water molecules that are within a certain radius from the center.
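A minimal sketch of the radius idea, assuming plain coordinate arrays and a hypothetical cutoff `r_core` that you would have to choose for your protein. It ignores periodic boundaries, so the protein should be whole and centered in the box first.

```python
import numpy as np


def waters_outside_radius(protein_xyz, water_xyz, r_core):
    """Boolean mask of waters farther than r_core from the protein's geometric center.

    r_core is a hypothetical, user-chosen radius (same length unit as the coordinates).
    """
    center = protein_xyz.mean(axis=0)
    dist = np.linalg.norm(water_xyz - center, axis=1)
    return dist > r_core
```

With MDAnalysis the arrays would come from `protein.positions` and `water.positions`, and `water[mask]` would then be the group passed on to the contact search.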

A better approach would be to compute the convex hull of the protein and then check whether a water is inside the hull. A long time ago I tried to implement this in our "hop" package (see the function hop.qhull.ConvexHull.point_inside()), but I doubt that this code works out of the box – you're welcome to try, but you're on your own making it work. The code is not supported in any form. You are also welcome to try and use the code for your own analysis algorithm (it's GPL). Perhaps you can make use of scipy.spatial.ConvexHull. That's where I would start.
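As a starting point with scipy.spatial.ConvexHull, here is a minimal sketch of a point-in-hull test. It is not the hop implementation; it relies on the `equations` attribute of the hull (outward facet normals and offsets), and the function name and `tol` parameter are made up for illustration.

```python
import numpy as np
from scipy.spatial import ConvexHull


def inside_hull(hull_points, query_points, tol=1e-9):
    """Return True for every query point that lies inside the convex hull of hull_points."""
    hull = ConvexHull(hull_points)
    normals = hull.equations[:, :-1]   # outward unit normals, one per facet
    offsets = hull.equations[:, -1]
    # signed distance of every query point to every facet plane (<= 0 means inner side)
    signed = query_points @ normals.T + offsets
    return np.all(signed <= tol, axis=1)
```

Waters flagged `True` could be dropped before the distance search. Note that a convex hull also spans surface grooves and clefts, so waters sitting in a concave pocket would be treated as "inside" as well.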

The bottom line is that you will have to come up with your own algorithm for how to solve this problem and then you need to implement it. We're happy to help with questions regarding implementation and MDAnalysis, but ultimately, you'll have to do the work.

Oliver

ivanovic...@gmail.com

Dec 11, 2018, 6:25:58 AM
to MDnalysis discussion
Hi Oliver,

Thank you so much for the answer. I naively believed that there was an easy way to detect surface residues, at least from a simulation snapshot (PDB file). Since this is not the case, I would be happy to try your suggestions.

Cheers,
milos

Oliver Beckstein

Dec 11, 2018, 1:40:53 PM
to mdnalysis-...@googlegroups.com
On Dec 11, 2018, at 4:25 AM, ivanovic...@gmail.com wrote:

Thank you so much for the answer. I naively believed that there is an easy way to detect surface residues, at least from a simulation snapshot (pdb file).

Maybe there is, but I am not aware of a canonical way to define them. What does the literature say?

Since this is not the case, I would be happy to try your suggestions.

You could also use the ConvexHull to find all the residues that are close to the surface. 
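One way to read "close to the surface" is the depth of each atom below the nearest hull facet. The sketch below assumes that reading; `cutoff` and both function names are hypothetical, and `resids` is simply an array with one residue id per atom.

```python
import numpy as np
from scipy.spatial import ConvexHull


def depth_below_hull(positions):
    """Distance of each point to the nearest facet plane of the points' own convex hull."""
    hull = ConvexHull(positions)
    signed = positions @ hull.equations[:, :-1].T + hull.equations[:, -1]
    # all points are inside or on the hull, so the largest (least negative)
    # signed distance belongs to the closest facet
    return -signed.max(axis=1)


def near_surface_residues(positions, resids, cutoff):
    """Residue ids that have at least one atom within cutoff of the hull surface."""
    depth = depth_below_hull(positions)
    return np.unique(resids[depth <= cutoff])
```

Because the hull is convex, residues lining a deep cleft can end up with a large depth even though they are solvent accessible, so the cutoff needs checking against a structure you know well.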

Or consider enveloping the protein in a sphere and then call those residues surface residues that can be connected to the surface of the sphere by a radial ray, without intersecting any other residues. 
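A rough sketch of the ray idea, under several assumptions that are not part of the suggestion itself: rays start at the geometric center, every atom is a sphere of the same hypothetical `atom_radius`, and directions are sampled on a Fibonacci lattice. It works per atom; residues would follow from the flagged atoms.

```python
import numpy as np


def fibonacci_sphere(n):
    """Return n roughly evenly spread unit vectors."""
    k = np.arange(n) + 0.5
    polar = np.arccos(1.0 - 2.0 * k / n)
    azimuth = np.pi * (1.0 + 5.0 ** 0.5) * k
    return np.column_stack((np.sin(polar) * np.cos(azimuth),
                            np.sin(polar) * np.sin(azimuth),
                            np.cos(polar)))


def ray_exposed_atoms(positions, n_rays=2000, atom_radius=2.0):
    """Flag atoms that are the first one met by an inward radial ray."""
    rel = positions - positions.mean(axis=0)
    rays = fibonacci_sphere(n_rays)
    along = rel @ rays.T                                  # projection, (n_atoms, n_rays)
    perp2 = np.sum(rel ** 2, axis=1)[:, None] - along ** 2
    half_chord2 = atom_radius ** 2 - perp2                # >= 0 where the ray cuts the sphere
    exit_t = along + np.sqrt(np.clip(half_chord2, 0.0, None))
    hit = (half_chord2 >= 0.0) & (exit_t > 0.0)
    exit_t = np.where(hit, exit_t, -np.inf)
    outermost = exit_t.argmax(axis=0)                     # atom met first from outside
    exposed = np.zeros(len(positions), dtype=bool)
    exposed[outermost[hit.any(axis=0)]] = True
    return exposed
```

With an MDAnalysis AtomGroup `atoms`, `atoms[ray_exposed_atoms(atoms.positions)].residues` would give the corresponding residues. The result depends on the number of rays and on the radius, so both need testing.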

Or generate a molecular surface (e.g., using MSMS) and then use that information to define surface residues.

My point is that you first need a definition of what you really mean by “surface residue” and then you can think about how to classify them. I’d definitely look into the literature and see what other people have been using, especially if you want to do research that is building upon prior work.

Oliver



ivanovic...@gmail.com

Dec 13, 2018, 12:29:26 PM
to MDnalysis discussion
Hi Oliver,

Thank you again.
"Surface" would mean solvent-exposed, let's say in contact with water atoms in 80% of the frames. There is no consensus in the literature (and the phenomenon that I want to study has not been studied with MD so far), but I find this assumption very reasonable. The problem, as I wrote before, is to avoid counting the few water atoms that are inside the protein. I will start digging through the suggestions and also through the PyMOL documentation: since PyMOL offers an option to colour the surface of the protein, maybe there is something useful there.
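The 80% criterion could be sketched as follows, assuming the per-frame contacts have already been collected in a boolean array `exposed` of shape (n_frames, n_residues). The array layout and the function name are assumptions made for this example.

```python
import numpy as np


def persistently_exposed(exposed, min_fraction=0.8):
    """Return (mask, occupancy) for residues in water contact in enough frames.

    exposed: boolean array of shape (n_frames, n_residues).
    """
    occupancy = exposed.mean(axis=0)        # fraction of frames with a contact
    return occupancy >= min_fraction, occupancy
```

Inside the trajectory loop, one row per frame could be filled with something like `np.isin(acidics.residues.resindices, exposed_residues.resindices)`, which I have not tested against the script above.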

Cheers!

Oliver Beckstein

Dec 13, 2018, 12:47:25 PM
to mdnalysis-...@googlegroups.com
You could also look through the electrostatics literature (e.g., Bertrand García-Moreno E.’s work). They typically discuss “buried” vs “surface exposed” residues.

The PROPKA software for heuristic pKa prediction classifies residues as buried or surface. I’d read their papers and look at their code https://github.com/jensengroup/propka-3.1 (citations on the page). It will almost certainly be easier than reverse-engineering PyMOL’s surface generation.

(Instead of full-blown molecular surface I would use ConvexHull – seems much simpler.)
