H-bond interaction questions


Tim Dudgeon

Oct 9, 2020, 10:42:12 AM
to Open Drug Discovery Toolkit Community
Hi,
I have a question/observation about how the H-bond interactions are calculated.

Where a H-bond donor atom has two (or more?) protonation states, are the interactions generated for the actual state of the molecule, or are they based on the potential for a H-bond? For example, an N in an aromatic ring in its neutral, non-protonated form is not a H-bond donor, but if it is protonated ([nH+]) it can be a donor. It seems that a H-bond is predicted if the molecule is in either form. I can see that this might be useful in some cases, but in others you might want predictions only for the exact molecular form supplied.

Thanks
Tim

Maciek Wójcikowski

Oct 11, 2020, 5:07:57 AM
to Tim Dudgeon, Open Drug Discovery Toolkit Community
Hi Tim,

Computing H-bonds is a two-step process. First it is decided which atoms are acceptors and donors, and then they are cross-checked for interaction. I found a possible cause: a donor SMARTS pattern that also captures possible tautomers: https://github.com/oddt/oddt/blob/master/oddt/toolkits/ob.py#L516-L517
I also double-checked RDKit's BaseFeatures.fdef, which was the source of this SMARTS, and it's the same there: https://github.com/rdkit/rdkit/blob/master/Data/BaseFeatures.fdef#L12
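To make the two steps concrete, here is a minimal, toolkit-free sketch. It is not ODDT's implementation: the toy atoms, the `hbond_pairs` helper and the 3.5 Å cutoff are all assumptions chosen for illustration. It shows why an atom flagged as a donor in step 1 ends up in a predicted H-bond in step 2, whether or not it actually carries a hydrogen.

```python
from math import dist

# Hypothetical toy atoms: (name, xyz, isdonor, isacceptor).
ligand = [
    ("N1", (0.0, 0.0, 0.0), True, True),    # aromatic n, flagged as donor in step 1
    ("C2", (1.4, 0.0, 0.0), False, False),
]
protein = [
    ("O", (0.0, 2.9, 0.0), False, True),    # carbonyl-like oxygen, acceptor
    ("CA", (5.0, 5.0, 5.0), False, False),
]

def hbond_pairs(mol_a, mol_b, cutoff=3.5):
    """Step 2: pair each donor in mol_a with each acceptor in mol_b within cutoff."""
    donors = [a for a in mol_a if a[2]]       # uses the flags assigned in step 1
    acceptors = [b for b in mol_b if b[3]]
    return [(d[0], a[0]) for d in donors for a in acceptors
            if dist(d[1], a[1]) <= cutoff]

pairs = hbond_pairs(ligand, protein)
print(pairs)
```

Because step 2 only looks at the flags and the distance, changing the donor flag in step 1 is enough to change the outcome.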

Unfortunately there is no way to redefine the SMARTS patterns yourself. To quickly check whether that addresses your tautomer case, you can comment out those lines.
To address the issue more properly, you can override the "isdonor" property by setting it to False in `mol.atom_dict` for [n;H0] atoms.
----
Pozdrawiam,  |  Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl



Maciek Wójcikowski

Oct 11, 2020, 7:04:43 AM
to Tim Dudgeon, Open Drug Discovery Toolkit Community
Here is a code snippet that modifies the atom_dict and should fit your needs:
import oddt
mol = oddt.toolkit.readstring('smi', 'n1c[nH]cc1')
print(f'Before:\t{mol.atom_dict["isdonor"]}')
# atom_dict is a write-protected array, so unlock it before editing
mol.atom_dict.setflags(write=True)
for (idx,) in oddt.toolkit.Smarts('[n;H0]').findall(mol):
    # OB has 1-based index
    mol.atom_dict['isdonor'][idx-1] = False
mol.atom_dict.setflags(write=False)
print(f'After:\t{mol.atom_dict["isdonor"]}')

Tim Dudgeon

Oct 12, 2020, 6:27:49 AM
to Maciek Wójcikowski, Open Drug Discovery Toolkit Community
Hi Maciek

Thanks for that information and suggestion.
Yes, the case I found did correspond to that tautomer.
If I understand correctly, your fix finds all aromatic nitrogens with zero hydrogens and sets their 'isdonor' property to False.
That seems to work for me (except that I'm using RDKit, which uses 0-based indexing).
Will you be applying a change to the SMARTS patterns in the ODDT codebase?
Tim
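
The indexing difference mentioned above can be isolated in a small helper. This is only a sketch on plain Python lists, not on a real `atom_dict`: the `clear_donor_flags` name, the `index_base` parameter and the example donor flags are hypothetical.

```python
def clear_donor_flags(isdonor, matches, index_base):
    """Return a copy of isdonor with the matched atoms set to False.

    index_base is 1 when match indices are 1-based (as in the Open Babel
    snippet earlier in the thread) and 0 when they are 0-based (as
    reported here for RDKit).
    """
    flags = list(isdonor)
    for (idx,) in matches:
        flags[idx - index_base] = False
    return flags

# Imidazole 'n1c[nH]cc1', atom order n, c, [nH], c, c.
# Assumed starting flags: both nitrogens marked as donors.
before = [True, False, True, False, False]
ob_style = clear_donor_flags(before, [(1,)], index_base=1)   # 1-based match
rdk_style = clear_donor_flags(before, [(0,)], index_base=0)  # 0-based match
print(ob_style, rdk_style)
```

Both calls clear the flag on the same atom, the nitrogen without a hydrogen, and leave the [nH] nitrogen as a donor.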

Tim Dudgeon

Oct 12, 2020, 6:57:25 AM
to Maciek Wójcikowski, Open Drug Discovery Toolkit Community
Hi Maciek

I think a similar problem is seen with salt bridge interactions. The same aromatic nitrogen with no hydrogens is being predicted as participating in a salt bridge.
For the form where that nitrogen is protonated ([nH+]) that would be valid, but for the non-protonated state it is not.

Tim

Maciek Wójcikowski

Oct 12, 2020, 3:53:37 PM
to Tim Dudgeon, Open Drug Discovery Toolkit Community
Hi Tim,

Thanks for the additional feedback. Indeed, the SMARTS patterns are not as robust as one might hope, so I will have to review them more closely. It has been over 3 years since they were introduced.

Regarding the tautomeric aromatic nitrogen, I'm not 100% convinced that it's a change we'd like to make. I will make sure that such exceptions are better documented. Would you find such an approach acceptable? Or would you prefer an additional toggle for [*H0] atoms? Any suggestions are welcome!

I find the results from salt bridges to be a bit faulty, so a change will have to be made there.


Tim Dudgeon

Oct 13, 2020, 3:19:21 AM
to Maciek Wójcikowski, Open Drug Discovery Toolkit Community
Hi Maciek,
I can see two different cases:
1. interactions determined based on the exact molecular representation e.g. aromatic nitrogen without hydrogen is not a donor.
2. interactions determined based on potential tautomers and protomers e.g. aromatic nitrogen without hydrogen is a donor if a tautomeric form or charge state can result in a hydrogen being present.

I can see both being valid cases, so possibly there could be an additional switch for this.
Presumably the first case is relatively straightforward, and the second more difficult?
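
The two cases can be sketched as a single switch. Everything here is hypothetical: `AromaticN`, its `can_gain_h` field and the `exact` flag are illustrative names, not part of ODDT, and deciding `can_gain_h` (the hard part of case 2) is simply assumed to be known.

```python
from dataclasses import dataclass

@dataclass
class AromaticN:
    name: str
    h_count: int       # hydrogens explicitly present on this nitrogen
    can_gain_h: bool   # assumed known: a tautomer/protomer could place an H here

def is_donor(atom, exact=True):
    """Case 1 (exact=True): donor only if an H is actually present.
    Case 2 (exact=False): also a donor if another form could carry an H."""
    if atom.h_count > 0:
        return True
    return (not exact) and atom.can_gain_h

ring = [AromaticN("N1", 0, True), AromaticN("N3", 1, True)]
exact_flags = [is_donor(a, exact=True) for a in ring]
implicit_flags = [is_donor(a, exact=False) for a in ring]
print(exact_flags, implicit_flags)
```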

Tim

Tim Dudgeon

Oct 21, 2020, 6:08:09 AM
to Maciek Wójcikowski, Open Drug Discovery Toolkit Community
On further thought, the nc[n;H1] pattern is probably there primarily to handle HIS residues in the protein, as it allows H-bond interactions to be predicted even if the HIS is in the wrong tautomeric state (HID vs HIE).
Also, the quirk in salt bridge interactions that I mentioned is probably similar in nature (allowing HIS/HID/HIE to be treated as HIP).
But the knock-on consequence is that this also affects ligands, which is not always what is expected.
Maybe we need to be able to configure proteins and ligands differently, as ligands may be more precisely specified than the protein (though not always).
Tim

Maciek Wójcikowski

Oct 22, 2020, 3:20:27 PM
to Tim Dudgeon, Open Drug Discovery Toolkit Community
We already have the Molecule.protein property so that proteins get their residues parsed, so what you propose would be feasible. However, I would avoid having two separate patterns by default: either we allow for implicit tautomeric structures or we don't. My preference is the more cheminformatics-style approach of having some implicit information (and Hs) in the interactions by default.

In contrast, exact protonation states would require H coordinates, which could be taken further to include their angles in the calculations. Let's call it the comp chemist approach.
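
As a rough illustration of what using explicit H coordinates could look like, the sketch below adds a donor-hydrogen-acceptor angle test on top of a distance test. It is not ODDT code: the function names and the 3.5 Å / 120° thresholds are assumptions picked for the example.

```python
from math import acos, degrees, dist

def dha_angle(donor, hydrogen, acceptor):
    """Angle D-H...A in degrees, measured at the hydrogen."""
    v1 = [d - h for d, h in zip(donor, hydrogen)]
    v2 = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos = dot / (dist(donor, hydrogen) * dist(acceptor, hydrogen))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return degrees(acos(max(-1.0, min(1.0, cos))))

def is_hbond(donor, hydrogen, acceptor, max_dist=3.5, min_angle=120.0):
    """True if D...A is close enough and the H points towards the acceptor."""
    return (dist(donor, acceptor) <= max_dist
            and dha_angle(donor, hydrogen, acceptor) >= min_angle)

n = (0.0, 0.0, 0.0)
o = (2.9, 0.0, 0.0)
h_towards = (1.0, 0.0, 0.0)    # H sits between N and O
h_away = (-1.0, 0.0, 0.0)      # H points away from O

towards = is_hbond(n, h_towards, o)
away = is_hbond(n, h_away, o)
print(towards, away)
```

With the same N...O distance, only the geometry where the hydrogen points at the acceptor passes, which is information the implicit-H approach does not have.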

