exception on canonicalSmiles method

Karen Karapetyan

Mar 8, 2013, 6:52:21 AM
to indig...@googlegroups.com
Hi,

The attached molecule looks like an "innocent" aromatic structure, but it and many others I have from the DrugBank collection throw this exception:
element: can not calculate implicit hydrogens on aromatic N, charge 0, degree 2, 0 radical electrons

Indigo i = new Indigo();
IndigoObject input_obj = i.loadMolecule(mol_input);
string input_smiles = input_obj.canonicalSmiles();

   at com.ggasoftware.indigo.Indigo._handleError(SByte* message, Indigo self)
   at IndigoLib_indigo_dll.indigoCanonicalSmiles(Int32 )
   at com.ggasoftware.indigo.IndigoObject.canonicalSmiles()

Thanks,
Ken
selectedRecords (1).sdf

Mikhail Rybalkin

Mar 21, 2013, 11:00:06 AM
to indig...@googlegroups.com
Hi Ken,

Sorry for this period of silence from our team.

Your "innocent" structure doesn't store hydrogens, and there are two different dearomatizations: ClC1=NC(Br)=NC2N=CNC=21 and ClC1=NC(Br)=NC2NC=NC=21. These structures are tautomers, and canonical SMILES has to write hydrogens explicitly to distinguish them. You can call input_obj.dearomatize() to get one dearomatization, but it will not return both of them. If you want, I can add a method that counts the number of different dearomatizations in terms of hydrogen positions. Internally, in the C++ code, we can enumerate all the aromatic forms, but I'm not sure it is reasonable to develop a complex API for such enumeration.

Best regards,
Mikhail
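
A minimal sketch of the workaround described above (pick one dearomatization, then ask for canonical SMILES). It assumes the Python binding of Indigo rather than the .NET one used in the original post; loadMolecule, dearomatize and canonicalSmiles are the same calls named in this thread, while the helper name canonical_smiles_via_kekule is made up for illustration. The session object is passed in, so the function itself does not import anything.

```python
def canonical_smiles_via_kekule(indigo, structure_text):
    """Load a structure, dearomatize it in place, return canonical SMILES.

    `indigo` is assumed to be an Indigo session object, e.g. created with
    `Indigo()` from the `indigo` Python package (assumption, not shown here).
    """
    mol = indigo.loadMolecule(structure_text)
    # Pick one Kekule form so that every ring nitrogen gets a defined
    # hydrogen count before the SMILES is written.
    mol.dearomatize()
    return mol.canonicalSmiles()
```

Which of the two tautomers comes back depends on the form dearomatize() happens to choose, so the result is only comparable across records that were all processed the same way.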

Karen Karapetyan

Mar 21, 2013, 2:03:04 PM
to indig...@googlegroups.com
Hi, Mikhail,

One dearomatization is enough. I always forget that to get truly canonical Indigo SMILES I need to consistently do either aromatization or dearomatization first.

Thanks,
Ken

Mikhail Rybalkin

Mar 21, 2013, 4:44:30 PM
to indig...@googlegroups.com
I think in one of the next versions (probably Indigo 1.2) we will do aromatization automatically in the .canonicalSmiles() method. For now I do not want to break compatibility.

Karen Karapetyan

Apr 2, 2013, 3:25:10 PM
to indig...@googlegroups.com
>If you want I can add a method that counts number of different dearomatizations in terms of hydrogens positions. 
>Internally in C++ code we can enumerate all the aromatic forms, but I'm not sure if it is reasonable to develop complex API for such enumeration.

Actually, having a count of dearomatizations could be very handy for chemical validation, since more than one dearomatization introduces ambiguity into the molfile.
It appears that, for example, the DrugBank data set, which is pretty well known, has a bunch of records whose molfiles contain ambiguous aromatized fragments.

Mikhail Rybalkin

Apr 4, 2013, 8:14:32 AM
to indig...@googlegroups.com
The number of dearomatizations can be too large: if you have N dearomatizations in one molecule fragment and M dearomatizations in another, you get N*M dearomatizations for the whole molecule, and this number can be very large.
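
To make the N*M growth concrete, here is a small arithmetic sketch. It does not use Indigo at all; the function name and the per-fragment counts are made up for illustration, and it assumes the fragments are independent of each other.

```python
from math import prod

def total_dearomatizations(per_fragment_counts):
    """Multiply the dearomatization counts of independent fragments."""
    return prod(per_fragment_counts)

# Ten independent fragments with two Kekule forms each.
print(total_dearomatizations([2] * 10))  # 1024
```

Even with only two forms per fragment the total doubles with every extra fragment, which is why a plain count or a full enumeration does not scale well.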

I'm thinking about another, simpler solution: an additional option to check whether the dearomatization is ambiguous; in that case Indigo will throw an exception in the IndigoObject.dearomatize method. What do you think?

Karen Karapetyan

Apr 4, 2013, 9:13:24 AM
to indig...@googlegroups.com
We need to think of a way for users to avoid calling dearomatize twice when they want to learn whether the dearomatization is ambiguous but also want to get back a single dearomatized Indigo object.

Throwing an exception would force users to put try/catch around every dearomatization just in case it is ambiguous. Maybe instead of throwing an exception, do something like this:

bool isAmbiguous;
indigo.dearomatize(out isAmbiguous);

prototype:
IndigoObject indigo.dearomatize(out bool isAmbiguous);

And we can keep the old prototype as well
IndigoObject indigo.dearomatize();

Mikhail Rybalkin

Apr 19, 2013, 4:19:11 PM
to indig...@googlegroups.com
Hi Karen,

I added an option "unique-dearomatization" in Indigo 1.1.10. If it is set to true, Indigo will throw an exception when the dearomatization is ambiguous. With your approach of an additional isAmbiguous argument, it is not clear whether we want to find any dearomatization and report that it is ambiguous, or whether we do not want to dearomatize at all if it is ambiguous. So for simplicity I decided to add an option.
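
A rough sketch of how the option might be used, again assuming the Python binding. The option name "unique-dearomatization" comes from this post and setOption is the usual Indigo call for options; the helper name dearomatize_strict is hypothetical, and so is the assumption that the option defaults to false and can simply be switched back afterwards.

```python
def dearomatize_strict(indigo, mol):
    """Dearomatize `mol` in place, failing if the result is ambiguous.

    `indigo` is assumed to be the Indigo session that created `mol`.
    """
    indigo.setOption("unique-dearomatization", True)
    try:
        # With the option on, an ambiguous structure is expected to make
        # this call raise instead of silently picking one form.
        mol.dearomatize()
    finally:
        # Assumption: false is the default, so restore it for later calls.
        indigo.setOption("unique-dearomatization", False)
```

Any exception from dearomatize() propagates to the caller unchanged, so the caller still has to decide what to do with an ambiguous record.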

What is not good about the current approach is that we can distinguish different exceptions in the code only by their text message. Maybe we will add an exception hierarchy, which is the common approach to distinguishing exceptions: something like AmbiguousDearomatizationException, InvalidStereocentersException, etc.

Best regards,
Mikhail
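
For illustration only, a sketch of what such an exception hierarchy could look like from the caller's side. Nothing here is an existing Indigo API: the two subclass names are the ones floated in the post above, and the base class name IndigoBaseException and the helper describe_failure are made up.

```python
class IndigoBaseException(Exception):
    """Hypothetical root of the hierarchy (the name is an assumption)."""

class AmbiguousDearomatizationException(IndigoBaseException):
    """More than one dearomatization exists for the structure."""

class InvalidStereocentersException(IndigoBaseException):
    """The structure has stereocenters that cannot be interpreted."""

def describe_failure(action):
    """Run `action` and report the kind of failure by exception type."""
    try:
        action()
    except AmbiguousDearomatizationException:
        return "ambiguous dearomatization"
    except InvalidStereocentersException:
        return "invalid stereocenters"
    except IndigoBaseException:
        return "other library error"
    return "ok"
```

The point of the design is that callers branch on the exception type and never need to parse the message text.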