> I have uploaded the bug fixes and they will be available in the next
> release.
Thanks for the quick fix. I also forgot to mention that setting the
algorithm to ignore_charges and ignore_valencies gives a significant
speed-up. This somewhat surprised me, as I expected these options to
mean that more potential mappings would need to be evaluated, given
the greater flexibility in matching.
On Sep 28, 10:48 am, Savelyev Alexander <asavel...@ggasoftware.com> wrote:
> No, it is not a bug. The point of the algorithm is to map as many
> atoms as possible in a reaction. Also, reactant molecules are given
> some preference. I suppose that your task is to determine the unused
> reactants in a reaction. Unfortunately, the current algorithm cannot
> do that at the moment.
> But I can propose a simple algorithm that can be built on top of the
> "automap" method. You can remove one reactant at a time in a loop
> and, for each iteration, look at the maximum number of mapped atoms in
> the product part of the reaction. If the maximum is still reached
> after removing one (or several) reactants, then these reactants are
> not used in the reaction. The algorithm was devised in a minute, but
> it can be improved.
Unfortunately, not all reactants necessarily form part of the product,
e.g. a reactant may be responsible for a bond order change or the
removal of a group.
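
For reference, the leave-one-out idea quoted above could be sketched
roughly as follows. This is only an illustration under assumptions: it
does not call Indigo at all. `find_unused_reactants` and
`toy_count_mapped` are hypothetical names, and `count_mapped` stands in
for whatever routine reports the number of mapped product atoms after
running automap. The caveat above still applies: a reactant that only
changes a bond order or removes a group would be flagged as unused.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def find_unused_reactants(
    reactants: Sequence[str],
    products: Sequence[str],
    count_mapped: Callable[[Sequence[str], Sequence[str]], int],
    max_removed: int = 1,
) -> List[Tuple[int, ...]]:
    """Return index tuples of reactants that can be removed without
    lowering the number of mapped product atoms."""
    # Number of product atoms mapped when every reactant is present.
    baseline = count_mapped(reactants, products)
    unused = []
    for k in range(1, max_removed + 1):
        for removed in combinations(range(len(reactants)), k):
            kept = [r for i, r in enumerate(reactants) if i not in removed]
            # Never evaluate a reaction with no reactants left.
            if kept and count_mapped(kept, products) >= baseline:
                unused.append(removed)
    return unused


def toy_count_mapped(reactants: Sequence[str], products: Sequence[str]) -> int:
    """Hypothetical stand-in for an automap call: counts product atoms
    (single-letter symbols) that some reactant could supply."""
    have = Counter("".join(reactants))
    need = Counter("".join(products))
    return sum(min(have[atom], n) for atom, n in need.items())


# "S" contributes no atoms to the product, so only index 2 is reported.
print(find_unused_reactants(["CCO", "CCN", "S"], ["CCOCCN"], toy_count_mapped))
```

In real use `count_mapped` would have to build the reduced reaction,
run the mapper and count mapped product atoms, so the cost is one
mapping run per candidate subset.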
I am quite happy with the performance of the algorithm as it is now. I
do not believe there are any significant classes of organic reaction,
besides perhaps the formation of acid chlorides using thionyl chloride,
that are not covered, and the occurrence of false mappings is still
sufficiently low.
> If a fragment consists of more than 3 atoms it can be multiplied. It
> is not a good solution to multiply small fragments, since that can
> corrupt a mapping. But we can consider the feature request and, say,
> create an option where the user can set the number of atoms [or set
> the list of atoms allowed for multiplying].
I think that in general this is a reasonable heuristic. As an
exception, though, I would argue that if the EXACT molecule is present
on both sides of the reaction, then mapping is appropriate. This would
be appropriate for salts such as [ClH]. Performing a graph identity
check on molecules with 3 atoms or fewer should be computationally
cheap (e.g. by comparing canonical SMILES), so it's something that
could be considered.
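
To illustrate why the check is cheap: with at most 3 atoms there are at
most 6 atom orderings to try, so even brute force is trivial. The
sketch below is an assumption-laden illustration, not Indigo code. The
molecule representation (a list of atom labels plus a dict of bonds
keyed by sorted index pairs) and the function name
`identical_small_molecules` are hypothetical. A real implementation
would more likely compare canonical SMILES from the toolkit.

```python
from itertools import permutations
from typing import Dict, List, Tuple

# Hypothetical representation: (atom labels, {(i, j) with i < j: bond order})
Molecule = Tuple[List[str], Dict[Tuple[int, int], int]]


def identical_small_molecules(a: Molecule, b: Molecule, max_atoms: int = 3) -> bool:
    """Brute-force graph identity for molecules with at most max_atoms atoms."""
    atoms_a, bonds_a = a
    atoms_b, bonds_b = b
    n = len(atoms_a)
    if n != len(atoms_b) or n > max_atoms:
        return False
    # Cheap rejections: same multiset of atoms and same number of bonds.
    if sorted(atoms_a) != sorted(atoms_b) or len(bonds_a) != len(bonds_b):
        return False
    for perm in permutations(range(n)):
        # perm[i] is the atom of b that atom i of a is matched to.
        if any(atoms_a[i] != atoms_b[perm[i]] for i in range(n)):
            continue
        if all(
            bonds_b.get(tuple(sorted((perm[i], perm[j])))) == order
            for (i, j), order in bonds_a.items()
        ):
            return True
    return False


hcl_left = (["Cl", "H"], {(0, 1): 1})
hcl_right = (["H", "Cl"], {(0, 1): 1})
print(identical_small_molecules(hcl_left, hcl_right))  # True
```

Atom labels here are plain strings, so charges or isotopes would need
to be encoded in the label (e.g. "Na+") for the comparison to respect
them.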
Just to check one more thing. Is:
http://wwmm.ch.cam.ac.uk/~dl387/indigoMappingProblems/odduseofthesamereactanttwice.png
intended behaviour? The actual reaction has an extra reactant.
The cyclic reactant appears to have been multiplied, but different
parts of the multiplied reactant are used. I'm sure there are cases
where a reactant performs multiple roles in a reaction, but I'm still a
bit dubious about this due to its potential for false positives
(although I suppose that with stricter matching criteria such false
mappings would be less common).