In the current implementation, the SearchSim function calls the
internal _SearchSim function, which in turn returns a (virtual)
table with a single column, 'id', containing the IDs of the
structures that satisfy the similarity search criteria. A typical
usage of this function is shown below:
select * from dbo.Structure where ID in (
    select * from bingo.SearchSim('Structure',
        'C/C=C(/Cl)C(C)C1CCN(C[C@H]2ON=C3c4cc5OCCSc5cc4SC[C@H]23)CC1C',
        'Tanimoto', 0.95, null));
The problem with this approach is that it provides no way to report
the similarity scores of the structures found, or to sort the query
results by similarity score. It would make sense to provide an
additional table-valued function, called for example
SearchSimReportScore, that returns a table with both the id and
the score (float), so that a statement similar to the one shown below
could be executed:
select dbo.Structure.ID, dbo.Structure.Smiles, ss.Score
from dbo.Structure
join bingo.SearchSimReportScore('Structure',
        'C/C=C(/Cl)C(C)C1CCN(C[C@H]2ON=C3c4cc5OCCSc5cc4SC[C@H]23)CC1C',
        'Tanimoto', 0.95, null) ss
    on dbo.Structure.ID = ss.id
order by ss.Score desc;
The above query would return a table with IDs, structures and
similarity scores (with respect to the specified query molecule),
sorted in descending order of score, so that the molecules most
similar to the query molecule appear at the top of the result set.
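
To make the intended result shape concrete, here is a minimal T-SQL sketch that runs without Bingo. The table variable @ss is only a stand-in for the rows the proposed function (SearchSimReportScore above) might return, and @Structure stands in for dbo.Structure. The IDs, SMILES strings and scores are invented purely for illustration.

```sql
-- Hypothetical stand-in for dbo.Structure (sample rows are made up).
DECLARE @Structure TABLE (ID INT PRIMARY KEY, Smiles VARCHAR(200));
INSERT INTO @Structure (ID, Smiles) VALUES
    (1, 'CCO'),
    (2, 'CCN'),
    (3, 'CCCl');

-- Hypothetical stand-in for the proposed function's result:
-- one row per matching structure, with its similarity score.
DECLARE @ss TABLE (id INT PRIMARY KEY, Score FLOAT);
INSERT INTO @ss (id, Score) VALUES
    (1, 0.96),
    (3, 0.99);

-- Join the scores back to the structures and sort best match first.
-- Structure 2 has no row in @ss, so the inner join leaves it out.
SELECT s.ID, s.Smiles, ss.Score
FROM @Structure AS s
JOIN @ss AS ss ON s.ID = ss.id
ORDER BY ss.Score DESC;
```

With a score column available, restricting the output to the N best matches would only need a TOP (N) clause on the outer select.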
The same applies to other search functions that compute the search
property on the fly, such as bingo.SearchMolecularWeight.