Sei, Selene-SDK, RandomPositionsSampler

goutha...@gmail.com

May 16, 2022, 2:42:15 AM
to Selene (sequence-based deep learning package)
Hi All,

I am trying to understand the samplers. I am training a Sei model on enhancers and promoters from a particular tissue, so I have two "profiles" (n_genomic_features: 2). For now I have left the default sampler, RandomPositionsSampler, in place, but the average_precision is 0.004.

Ideally I should be using the genomic regions from my dataset (enhancer and promoter sequences) for training and evaluation. I am confused about the use cases for RandomPositionsSampler. If a user wants to train a model on a specific set of sequences, the training and evaluation sets should be taken from the provided peaks themselves, right? Sorry if I completely missed something.

In my case, how can I provide appropriate sequences for training and evaluation (i.e., from my set of enhancer peaks)? Should I be using IntervalsSampler with the same set of peaks I am using as targets? Is it possible to specify different files for different features (e.g., enhancers and promoters separately)?


Thanks,
Goutham A
Research Collaborator, Imperial College London

chen.ka...@gmail.com

Aug 19, 2022, 3:25:24 PM
to Selene (sequence-based deep learning package)
Hi Goutham,

So sorry for the late response - next time don't hesitate to follow up, as I often miss these emails!

If you want to train on a specific set of regions/sequences, then IntervalsSampler would be the right place to look. If you are trying to use different sampling regions for different features, I'd probably recommend training two separate models (otherwise you would need to write some kind of custom sampler for it).
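
For the single-model route, one way to prepare the inputs is sketched below. It is only an illustration under a few assumptions: the enhancer and promoter peaks live in separate tab-separated BED files, the sampler's intervals file wants plain chrom/start/end lines, and the targets file carries the feature name in a fourth column. The helper names (`read_bed`, `write_sampler_inputs`) and the file paths are made up for this example and are not part of Selene.

```python
def read_bed(path):
    """Return (chrom, start, end) tuples from a tab-separated BED-like file."""
    regions = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and common BED header/comment lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split("\t")[:3]
            regions.append((chrom, int(start), int(end)))
    return regions


def write_sampler_inputs(feature_beds, intervals_out, targets_out):
    """Merge per-feature BED files into one intervals file and one targets file.

    feature_beds maps a feature name (e.g. "enhancer") to its BED path.
    Returns (number of unique intervals, number of labelled target rows).
    """
    labelled = []
    for feature, bed_path in feature_beds.items():
        for chrom, start, end in read_bed(bed_path):
            labelled.append((chrom, start, end, feature))
    labelled.sort()  # by chrom, then start, then end, then feature name

    # Intervals file: each unique region once, chrom/start/end only.
    intervals = sorted({(c, s, e) for c, s, e, _ in labelled})
    with open(intervals_out, "w") as out:
        for chrom, start, end in intervals:
            out.write(f"{chrom}\t{start}\t{end}\n")

    # Targets file: one row per (region, feature) pair, feature in column 4.
    with open(targets_out, "w") as out:
        for chrom, start, end, feature in labelled:
            out.write(f"{chrom}\t{start}\t{end}\t{feature}\n")

    return len(intervals), len(labelled)
```

Calling `write_sampler_inputs({"enhancer": "enhancers.bed", "promoter": "promoters.bed"}, "intervals.bed", "targets.bed")` would write both files; the first would then go to the sampler's `intervals_path` and the second, after bgzip compression and tabix indexing (which, as far as I know, Selene expects for target files), to `target_path`. Please check the column layout against the Selene documentation before relying on it.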

Let me know if you still have questions on this (sorry again this is so late!!)
Kathy