Running with selene_cli

Zofie Lin

Jun 3, 2020, 2:06:34 PM
to Selene (sequence-based deep learning package)
Hi
I installed selene_sdk with conda and set up a conda environment using selene-cpu.yml.
However, I still couldn't run the script with selene_cli.py. It always fails with the error "ModuleNotFoundError: No module named 'selene_sdk.sequences._sequence'", and I don't know how to deal with it.
So I ran the training with the code below (similar to the training in case 1):
from selene_sdk.utils import load_path
from selene_sdk.utils import parse_configs_and_run

configs = load_path("/path/selene/manuscript/case1/test/train_and_eval.yml")
parse_configs_and_run(configs, lr=0.01)
It runs without any error messages.
However, training_outputs is missing roc_curves.svg, precision_recall_curves.svg, test_predictions.npz, test_targets.npz, test_data.bed and test_performance.txt. Half of the results are missing!
It only contains these outputs:
best_model.pth.tar
selene_sdk.train_model.log
selene_sdk.train_model.validation.txt
validate_data.bed
checkpoint.pth.tar
selene_sdk.train_model.train.txt
train_data.bed
Could you please help me figure out what the problem is?
Many thanks.



Jian Zhou

Jun 3, 2020, 2:48:43 PM
to Selene (sequence-based deep learning package)
Hi 

Running selene_cli.py uses your local copy of Selene (that is in the same directory as selene_cli.py) while importing selene_sdk uses the installed version. So it looks like your local version has not compiled the Cython component.

So to solve this, try running

python setup.py build_ext --inplace


in the same directory as selene_cli.py.
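
One way to see which copy of Selene Python picks up, and whether its compiled extension is present, is a small check like the one below. This is only a sketch: `locate_module` is a helper name made up for this example (it is not part of selene_sdk), and the only Selene names it uses are the two module paths that appear in the error message above. Save it next to selene_cli.py and run it from that directory so that the local copy is the one found first.

```python
import importlib.util


def locate_module(name):
    """Return the file a module would be loaded from, or None if it cannot be found.

    `locate_module` is a hypothetical helper for this example, not a Selene API.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or fails to import,
        # for example because a compiled extension has not been built.
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Module names taken from the ModuleNotFoundError quoted in the thread.
    for name in ("selene_sdk", "selene_sdk.sequences._sequence"):
        print(name, "->", locate_module(name) or "NOT FOUND")
```

If the first line prints a path inside your cloned repository and the second prints NOT FOUND, that is consistent with the local copy not having its Cython component built yet.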

To get the other output files you described, you need to run Selene with a configuration file that includes `EvaluateModel`. Selene has an example of this in case 2, `selene/manuscript/case2/2_model_comparison/evaluate.yml`, and you can also refer to the CLI docs (https://selene.flatironinstitute.org/overview/cli.html).

Hope this helps!
Jian

Kathy Chen

Jun 3, 2020, 4:05:59 PM
to Selene (sequence-based deep learning package)
Thanks Jian for responding to this!

I looked into it a little more, and I think the version of Selene that matches the publication's case studies used to evaluate automatically with just the `train_model` config initialization. Looks like we completely removed that part of the code in newer versions... I'll open a GitHub issue about this, sorry about that!!

Zofie Lin

Jun 4, 2020, 4:58:00 AM
to Selene (sequence-based deep learning package)
Thanks to Jian and Kathy,

Another issue: in the 'sampler: !obj:selene_sdk.samplers.IntervalsSampler' section, I used
save_datasets: [train, test, validate]
but it doesn't seem to have saved the 'test' dataset.

The case 2 config `selene/manuscript/case2/2_model_comparison/evaluate.yml`, which includes `EvaluateModel`, requires a "trained_model_path", so I can't evaluate the model in the same run as training and instead need to set up a new configuration file after training. That is much less convenient. It would be better if evaluation could be combined with training.

Thanks again, very helpful!

Kathy Chen

Jun 4, 2020, 9:32:25 AM
to Selene (sequence-based deep learning package)
Yes, due to the bug it never makes it to evaluation, so the test set is not saved. Remember that these samplers only load data on the fly. I agree that it's nice to train and evaluate all at once, so we plan to address this bug soon. For now, you will probably need to use a new config file if you don't want to wait for that fix to be pushed out.

You can save the test set BEFORE training if you use the parameter `load_test_set` (http://selene.flatironinstitute.org/overview/cli.html#general-configurations) in your config file. It will take time up front to sample the test set, so the overall job will take longer.

Kathy Chen

Jul 1, 2020, 4:27:42 PM
to Selene (sequence-based deep learning package)
Hi Zofie,

This is a rather late update on the train and evaluate issue, but upon closer look (& running that case1 YML config myself) it seems like the latest version of Selene does still have the train and automatic evaluation functionality - at least, I'm able to get all evaluation outputs after training with the master branch of Selene. I wonder if there was some other issue or if this is a bug from an earlier version of Selene you were using? Anyway - totally happy to work with you to resolve this issue if you're still interested (or if you encounter it again at any point), just let me know.
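
A quick way to confirm whether a finished run produced the evaluation outputs is to check the output directory for the files named earlier in this thread. The snippet below is just a sketch: `missing_evaluation_outputs` is a made-up helper for this example (not a Selene function), and the file list is copied from the first message rather than from Selene's source, so it may not match every version.

```python
from pathlib import Path

# Evaluation file names as listed in the first message of this thread.
EVALUATION_OUTPUTS = [
    "roc_curves.svg",
    "precision_recall_curves.svg",
    "test_predictions.npz",
    "test_targets.npz",
    "test_data.bed",
    "test_performance.txt",
]


def missing_evaluation_outputs(output_dir):
    """Return the names from EVALUATION_OUTPUTS that are not files in output_dir.

    Hypothetical helper for illustration; a nonexistent directory reports
    every file as missing.
    """
    output_dir = Path(output_dir)
    return [name for name in EVALUATION_OUTPUTS if not (output_dir / name).is_file()]
```

For example, `missing_evaluation_outputs("training_outputs")` returns an empty list when every file is present, and the names of the absent files otherwise.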

Thanks!
Kathy

Zofie Lin

Jul 1, 2020, 11:29:38 PM
to Selene (sequence-based deep learning package)
Hi Kathy,

Actually, I don't know which version of Selene I have. I installed it with conda on June 2, so I guess it was the latest version at that time.
When training and evaluating, I just followed the YML "selene/manuscript/case1/1_train_and_evaluate/train_and_eval.yml", and the result didn't include the evaluation outputs. Then, following Jian's suggestion, I ran another YML with "evaluate_model: !obj:selene_sdk.EvaluateModel {}" and got the evaluation outputs.
So I don't know what the problem is. Could you share the YML you tested with so I can give it a try?

Thanks!
Zofie

Kathy Chen

Jul 2, 2020, 9:08:58 AM
to Selene (sequence-based deep learning package)
Hi Zofie,


I used the same YAML file (I think the tabix file `.bed.gz` name was wrong, so I updated that to `GATA1_proery_bm.bed.gz`; will need to fix in master at some point). How long did you run it for? And did you confirm that your job completed (as opposed to expiring or failing early) and still didn't give the eval outputs? You can try running a test case with the YAML file edited so that `max_steps` is set to something super low (I also reduced `n_validation_samples` and `n_test_samples` to make it run faster) and see if it evaluates at the end.
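
If you would rather not edit the YAML by hand, a shortened copy of the config can be produced with a few lines of Python. The sketch below works on the config as plain text, which sidesteps the `!obj:` tags that a generic YAML loader would not understand on its own. `override_int_settings` is a made-up helper for this example, and the numbers in the sample fragment are placeholders rather than the values in the real case1 file.

```python
import re


def override_int_settings(config_text, overrides):
    """Replace `key: <integer>` entries in YAML-style text with new integers.

    Hypothetical helper for illustration. Raises KeyError if a key is absent,
    so a misspelled setting does not pass silently.
    """
    for key, value in overrides.items():
        pattern = re.compile(rf"(\b{re.escape(key)}\s*:\s*)\d+")
        config_text, count = pattern.subn(rf"\g<1>{int(value)}", config_text)
        if count == 0:
            raise KeyError(f"{key!r} not found in config text")
    return config_text


# Illustrative fragment only; the values here are placeholders.
example = """
    max_steps: 8000,
    n_validation_samples: 32000,
    n_test_samples: 120000,
"""

quick = override_int_settings(
    example,
    {"max_steps": 10, "n_validation_samples": 64, "n_test_samples": 64},
)
print(quick)
```

To use it on a real file, read the YAML into a string, pass it through the function, and write the result to a new file so the original config stays untouched.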

Also, if I recall correctly, you had a previous question about whether you were running the local copy or the conda-installed copy. Was your GitHub repository (local copy) also downloaded in June?

Thanks!
Kathy