Run an oddt job in anaconda


Shirin Jamshidi

Jan 24, 2020, 5:44:01 AM
to od...@googlegroups.com
Dear oddt developers/users;

I am very new to this field and am trying to use oddt to run a job for my target protein docked with a series of ligands, and then analyse the results by machine learning. I have installed Anaconda and the oddt module; however, I am not sure where exactly I need to introduce the target protein, ligands and docked poses, where I need to use the library and the v2007 database that we need to upload, and where we should use the default database and information in the oddt toolkit, which seems to be v2016. I can't work these out from reading the book chapter or the published papers on this interesting area of computer science. I would appreciate it very much if you could correct me by editing my commands below, please.

PS
The only command that shows an error is the last one.

Commands in anaconda/python:
import oddt
import pandas as pd
data = pd.read_csv(oddt.__path__[0] + "/scoring/functions/RFScore/rfscore_descs_v1.csv")
training_data = data[data['2016_refined'] & ~data['2016_core']]
features = training_data.iloc[:, -36:].values
activity = training_data['act'].values
target = next(oddt.toolkit.readfile('pdb', 'E:/oddt_machineLearning/PDBbind/PDBID/carLDM40.pdb'))
target.target = True
ligands = list(oddt.toolkit.readfile('mol2', 'E:/oddt_machineLearning/PDBbind/PDBID/allLigs_210.mol2'))
activities = pd.read_csv('E:/oddt_machineLearning/PDBbind/PDBID/carLDM40_scores.csv')
from oddt.datasets import pdbbind
dataset = pdbbind('E:/oddt_machineLearning/PDBbind/v2007/',version=2007,default_set='refined')
activity = dataset.activities
from oddt.scoring.functions import rfscore
desc_gen = rfscore(version=1).descriptor_generator
features = desc_gen.build(ligands, target)
from oddt.scoring.models.regressors import randomforest
model = randomforest(n_estimators=500)
model.fit(features, activities)
testing_data = data[data['2016_core']]
testing_features = testing_data.iloc[:, -36:].values
testing_activity = testing_data['act'].values
model.score(testing_features, testing_activity)
from oddt.scoring import scorer
scoring_function = scorer(model, desc_gen, score_title='my_custom_score')
target = next(oddt.toolkit.readfile('pdb', 'E:/oddt_machineLearning/PDBbind/PDBID/carLDM40.pdb'))
docked_poses = list(oddt.toolkit.readfile('sdf', 'E:/oddt_machineLearning/PDBbind/PDBID/carLDM40_dockedligs.sdf'))
scoring_function.set_protein(target)
scores = scoring_function.predict(docked_poses)
scoring_function.save('my_sf.pkl')
oddt_cli –score_file = my_sf.pkl ('sdf', 'E:/oddt_machineLearning/PDBbind/project/carLDM40_dockedligs.sdf') –protein ('pdb', 'E:/oddt_machineLearning/PDBbind/project/carLDM40.pdb') -o ('csv', 'E:/oddt_machineLearning/PDBbind/project/scores.csv')


Best,
Shirin

Maciek Wójcikowski

Feb 7, 2020, 4:22:51 PM
to Shirin Jamshidi, Open Drug Discovery Toolkit Community
Hi Shirin,

What you want to do is score the ligands, not just train the scoring function, since one protein model is probably not enough to train anything meaningful.
I can recommend checking out the ODDT notebooks on how to use oddt to screen a library of ligands: https://github.com/oddt/jcheminf

If you want to retrain an RF-Score-VS-style scoring function, see https://github.com/oddt/rfscorevs or the chapter you mentioned.

Your code looks fairly OK; the last command is probably intended to be run in a CLI/Bash shell. You can also use the pkl file in an ODDT pipeline (check out the notebooks at the first link on how to do that).
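
For illustration, the last command could be reshaped roughly as follows. This is only a sketch under assumptions: the flag names (--score_file, --receptor, -i, -o, -O) are taken from memory of the oddt_cli help text and should be checked with "oddt_cli --help" on the installed version, CSV output support is likewise assumed, and build_oddt_cli_command is a hypothetical helper that is not part of ODDT. Two things in the posted line stand out regardless: it uses en-dashes (–) where a shell expects hyphens, and it passes Python-style tuples where a command line expects plain paths. The snippet only assembles and prints the command; it does not run it.

```python
import shlex


def build_oddt_cli_command(ligands, receptor, score_file, out_file,
                           in_format="sdf", out_format="csv"):
    """Assemble an oddt_cli argument list (hypothetical helper; nothing is executed).

    Flag names are assumptions to verify with `oddt_cli --help`.
    """
    return [
        "oddt_cli", ligands,         # input file with the docked poses
        "-i", in_format,             # input format
        "--score_file", score_file,  # pickled scoring function (my_sf.pkl)
        "--receptor", receptor,      # protein structure used for scoring
        "-o", out_format,            # output format
        "-O", out_file,              # output file
    ]


# Paths copied from the original command.
base = "E:/oddt_machineLearning/PDBbind/project/"
cmd = build_oddt_cli_command(
    ligands=base + "carLDM40_dockedligs.sdf",
    receptor=base + "carLDM40.pdb",
    score_file="my_sf.pkl",
    out_file=base + "scores.csv",
)

# Print a single line that can be pasted into a terminal (not the Python prompt).
print(shlex.join(cmd))
```

The printed line is meant to be pasted into the Anaconda Prompt or a Bash shell. Keeping the arguments in a list also makes it straightforward to hand them to subprocess.run later, if running from Python is preferred.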

Best,
Maciek


----
Pozdrawiam,  |  Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl

