bingo.SearchSim table-valued function in SQL Server

24 views
Skip to first unread message

Dmitrii

unread,
Jul 27, 2010, 4:50:43 PM7/27/10
to indigo-bugs
In the current implementation, the SearchSim function calls the
internal _SearchSim function, which, in turn, returns a (virtual)
table with one column called 'id', in which the id's of the structures
satisfying the similarity search criteria are returned. A typical
usage of this function is shown below:

select * from dbo.Structure where ID in (select * from
bingo.SearchSim('Structure', 'C/C=C(/
Cl)C(C)C1CCN(C[C@H]2ON=C3c4cc5OCCSc5cc4SC[C@H]23)CC1C', 'Tanimoto',
0.95, null));

The problem with this approach is that it provides no way to report
the similarity scores of the found structures, or sort the query
results by the similarity score. It would make sense to provide an
additional table-valued function called, for example,
SearchSimReportScore, that will return a table with both the id, and
the score (float), so a statement similar to the one shown below could
be executed:

select dbo.Structure.ID, dbo.Structure.Smiles, ss.Score from
dbo.Structure join bingo.SearchSim('Structure', 'C/C=C(/
Cl)C(C)C1CCN(C[C@H]2ON=C3c4cc5OCCSc5cc4SC[C@H]23)CC1C', 'Tanimoto',
0.95, null) ss on dbo.Structure.ID = ss.id order by ss.Score desc;

The above query would return a table with id's, structures and
similarity scores (with respect to the specified query molecule)
sorted in the descending order, so the molecules most similar to the
target one appear on top of the result set.

The same applies to other search functions that compute the search
property on-the-fly, such as bingo.SearchMolecularWeight.

Dmitry Pavlov

unread,
Jul 28, 2010, 12:11:32 PM7/28/10
to indig...@googlegroups.com
Hello Dmitrii,

We agree that it is good to have the similarity/molecular weight
value returned by the TVF which searches the table by the
corresponding criterion. Such feature is likely to be added
to Bingo next month.

This week, as a temporary solution, we will add Bingo.Sim()
function for directly computing the similarity of two given
molecules, so one could write queries like:

SELECT Bingo.Sim(smiles, $query, 'Tanimoto'), t.id from dbo.Structure,
bingo.SearchSim('Structure', $query, 'Tanimoto') t WHERE
t.id = dbo.Structure.id

Too bad this function is not already there. It will be useful
anyway, and it is present in Bingo for Oracle.

With best regards,

Dmitry

ksasi...@gmail.com

unread,
Feb 21, 2018, 5:03:19 AM2/21/18
to indigo-bugs
Hello Dmitry Pavlov,

Was this update also made for bingo version 1.8.0beta.r1-gef02bfd win64 for postgres? If so, could you please let me know what function returns the value of Tanimoto so that I can sort search results according to the value?

Thank you.

Regards,
Sasirekha

ksasi...@gmail.com

unread,
Feb 21, 2018, 5:20:05 AM2/21/18
to indigo-bugs
UPDATE: I found it on pgAdmin.

It is: getsimilarity($molecule, $queryString, $option) where option is one of 'Tanimoto', 'Tversky', 'Euclid-sub'.

Thanks anyway!

Regards,
Sasirekha
Reply all
Reply to author
Forward
0 new messages