Ignore invalid valences

Casey Wood

Jan 20, 2012, 9:45:47 AM
to indigo-general
Is it possible to instruct Indigo to ignore invalid valences?

For example, iterating through the atoms of the molecule derived from
the smiles string "CN(C)(C)C" generates the following error:

"com.ggasoftware.indigo.IndigoException: element: bad valence on N
having 4 drawn bonds, charge 0, and 0 radical electrons"

I appreciate that this is an informative message, and that the
nitrogen should carry a charge, but there are instances where I need
to consider a molecule "as is" regardless of its structural validity.
That is, I'm not looking to have this error handled differently (i.e.
via a try/catch); I simply would like the option of not having these
sorts of errors generated at all. I don't see anything in the API
options that would indicate that this is possible, but perhaps there
is some undocumented functionality?

Andrew Dalke

May 5, 2012, 7:27:36 PM
to indigo-...@googlegroups.com
I just came across the same issue. In researching it, I found Casey
Wood's previous post about it, which appears to be unresolved.

I am using Indigo to process records from ChEMBL. I found already
that Indigo doesn't accept the bad stereochemistry in the ChEMBL 13
data set, so I used RDKit to convert the SDF files into SMILES, then
use Indigo to process those SMILES.

I found two ways where Indigo could not parse RDKit's generated SMILES.
One is that RDKit supports aromatic Te as a SMILES extension, and
RDKit perceives [te] in 18 structures out of 1,000,000+.

The other way is in the two structures CHEMBL1616388 and CHEMBL1357894.

The SMILES are I1c2ccccc2c3ccccc13 and Cl.I1c2ccccc2c3ccccc13 respectively.

Indigo raises an exception in mol.aromatize() saying

>>> mol = indigo.loadMolecule("I1c2ccccc2c3ccccc13")
>>> mol.aromatize()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "indigo core", line 3, in indigoAromatize_wrapper
  File "/Users/dalke/ftps/indigo-python-1.1-rc-universal/indigo.py", line 1071, in _checkResult
    raise IndigoException(Indigo._lib.indigoGetLastError())
indigo.IndigoException: 'element: bad valence on I having 2 drawn bonds, charge 0, and 0 radical electrons'

I don't know where those structures come from, and I don't know how
iodine is supposed to have a valence of 2. I can only say that the
ChEMBL data set contains a couple of SMILES strings which Indigo cannot
handle.

If you all think this is a structure error, and that the chemistry is truly
impossible, then I'll notify ChEMBL about it. I don't know enough chemistry
to be able to be certain.

In any case, large data sets contain strange chemistries, and it would
be nice if there was some way to remove the safeties, as it were, and
allow bad chemistry to go through.
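
Until such a switch exists, one workaround for batch jobs is to catch the exception per record and set the failing structures aside for later inspection. This is only a minimal sketch: the helper name `partition_records` is hypothetical, and the loader is passed in as a callable so that nothing here depends on a particular toolkit.

```python
def partition_records(records, load):
    """Split (id, smiles) pairs into parsed molecules and rejects.

    `load` is any callable that turns a SMILES string into a molecule
    and raises on failure. Which toolkit call sits behind it is an
    assumption left to the caller.
    """
    parsed = []    # (id, molecule) for records that loaded cleanly
    rejected = []  # (id, smiles, error message) for records that failed
    for record_id, smiles in records:
        try:
            parsed.append((record_id, load(smiles)))
        except Exception as err:  # broad on purpose; narrow it in real use
            rejected.append((record_id, smiles, str(err)))
    return parsed, rejected
```

Because the error above is raised by mol.aromatize() rather than by loadMolecule(), the callable passed as `load` would need to do both steps (load, then aromatize) for the bad-valence records to end up in the rejected list.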

Cheers,


Andrew
da...@dalkescientific.com


Mikhail Rybalkin

May 14, 2012, 3:12:47 AM
to indigo-...@googlegroups.com
Hello Andrew and Casey,

Casey, for some reason we missed your question; sorry for the long delay in replying!

We will try to find a solution for iterating over atoms, and I will post a message about it here.

Andrew, thank you for all your comments. Here are the answers:

> I found already that Indigo doesn't accept the bad stereochemistry in the
> ChEMBL 13 data set, so I used RDKit to convert the SDF files into SMILES,
> then use Indigo to process those SMILES.

You can set the "ignore-stereochemistry-errors" option to "true", and Indigo will skip such invalid stereocenters.
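
In the Python binding, options are set on the Indigo session before loading. A minimal sketch follows. The helper `load_ignoring_stereo_errors` is hypothetical, and the session is passed in as an argument so that the only Indigo calls assumed are `setOption` and `loadMolecule`.

```python
def load_ignoring_stereo_errors(session, data):
    """Load a structure while skipping invalid stereocenters.

    `session` is expected to be an indigo.Indigo() instance; `data` is
    the structure text (for example the contents of a molfile).
    """
    # Option name as given in this reply; a Python bool is assumed to
    # be accepted for the "true" value.
    session.setOption("ignore-stereochemistry-errors", True)
    return session.loadMolecule(data)
```

Typical use would be `from indigo import Indigo` followed by `mol = load_ignoring_stereo_errors(Indigo(), molfile_text)`. The option is set on the session, so it should also apply to structures loaded later through the same session.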

> I found two ways where Indigo could not parse RDKit's generated SMILES.
> One is that RDKit supports aromatic Te as a SMILES extension, and
> RDKit perceives [te] in 18 structures out of 1,000,000+.

Yes, we did not support [te] in SMILES, because the SMILES specification does not allow it. But I found that loading such structures is not a problem, and this functionality will be available in the next release.

> I don't know where those structures come from, and I don't know how
> iodine is supposed to have a valence of 2. I can only say that the
> ChEMBL data set contains a couple of SMILES strings which Indigo cannot
> handle.

Iodine needs a positive charge to have two single bonds. This is a correct structure: http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=3101
But maybe all the charges were discarded in your set.

Best regards,
Mikhail

Mikhail Rybalkin

May 18, 2012, 2:30:36 AM
to indigo-...@googlegroups.com
Hello Andrew,

We have just released Indigo 1.1-rc3, where [te] is accepted within SMILES: http://ggasoftware.com/download/indigo_next

The issue with iterating over atoms with invalid valences is still open.

Best regards,
Mikhail

Mikhail Rybalkin

Dec 24, 2012, 5:10:15 PM
to indigo-...@googlegroups.com
Hello Andrew,

We have finally fixed the issue with aromatizing these structures. You can try the latest release of Indigo.

Best regards,
Mikhail

lixi...@gmail.com

Jun 3, 2014, 7:47:07 AM
to indigo-...@googlegroups.com
Has anything happened with "Ignore invalid valences"?