Possibility of bias with PLECs in DUD-E


mate...@gmail.com

Jun 20, 2019, 5:46:09 PM
to Open Drug Discovery Toolkit Community
Hello,

I am concerned about the possibility of bias caused by molecular properties or 2D topology when using PLEC fingerprints with the DUD-E set.

This concern was raised after reading the following paper:

The performance of the ML models was largely unchanged after all protein information was removed from their structure-based descriptor, revealing an implicit bias.

Has this possibility been explored or addressed with regard to the PLEC fingerprint?

If not, is there a way to replicate this sort of experiment, i.e., to remove the protein from the PLEC and see whether ML performance changes?

Thank you in advance,
Mateo Vacacela

Maciek Wójcikowski

Jun 29, 2019, 6:59:25 AM
to mate...@gmail.com, Open Drug Discovery Toolkit Community
Hi,

This is a very interesting problem. PLEC does not include any ligand-only bits, so if you remove the protein entirely, the fingerprint will be all zeros. We have not done any dedicated research in this area, but I have seen others compare the similarity of the complexes to look for such a correlation, and it showed no problems.

If you would like to verify it yourself, I recommend introducing random rotations of the ligands in the protein binding sites and checking whether a model trained on those poses can still predict with high accuracy (using proper cross-validation).
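
A minimal sketch of the rotation step, assuming the ligand's atomic coordinates are available as an (N, 3) NumPy array. The helper names `random_rotation_matrix` and `rotate_about_centroid` are hypothetical and not part of ODDT. The sketch only produces the rotated coordinates; how they are written back into an ODDT molecule before regenerating the PLEC fingerprints is left out, since it depends on the toolkit backend in use.

```python
import numpy as np


def random_rotation_matrix(rng):
    """Draw a uniformly random 3D rotation via a random unit quaternion."""
    q = rng.normal(size=4)
    q /= np.linalg.norm(q)
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ])


def rotate_about_centroid(coords, rng):
    """Rigidly rotate ligand coordinates about their own centroid.

    The ligand stays centered in the binding site, but its orientation
    relative to the protein is randomized.
    """
    coords = np.asarray(coords, dtype=float)
    centroid = coords.mean(axis=0)
    rotation = random_rotation_matrix(rng)
    return (coords - centroid) @ rotation.T + centroid


# Toy coordinates standing in for a ligand (illustrative values only).
rng = np.random.default_rng(0)
ligand_coords = np.array([
    [0.0, 0.0, 0.0],
    [1.5, 0.0, 0.0],
    [1.5, 1.2, 0.0],
    [0.0, 1.2, 0.9],
])
rotated_coords = rotate_about_centroid(ligand_coords, rng)
```

Because the rotation is rigid, the ligand's internal geometry and 2D topology are untouched while its contacts with the protein change. If a model trained on fingerprints from the rotated poses scores about as well as one trained on the original poses, that would suggest it is relying on ligand-derived signal rather than on the interactions themselves.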

Let me know what you think, or if you have any other questions.
----
Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl


--
You received this message because you are subscribed to the Google Groups "Open Drug Discovery Toolkit Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to oddt+uns...@googlegroups.com.
To post to this group, send email to od...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/oddt/69fc4a81-983d-45e9-a1c8-4857102c0e52%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

mate...@gmail.com

Jul 4, 2019, 1:35:56 PM
to Open Drug Discovery Toolkit Community
Thank you. I will go ahead with rotating the ligands into non-ideal poses and see how this affects the ML model's predictive ability.

