Varying the number of enriched domains detected


Toby Hocking

Jul 31, 2014, 10:07:40 AM
to rseg-s...@googlegroups.com
I got rseg-diff to work on one of my data sets! Now I am interested in computing 100 different RSEG models, each with a different number of enriched domains. For example, my data set had 11651 enriched domains, but I would also like to compute models with, say, 5000 and 20000 enriched domains. Can you please tell me which input parameter controls the number of enriched domains? What is the range of acceptable values? When I increase the parameter, does the number of detected domains increase or decrease?

Or should I just take the 11651 enriched domains that it detects by default and sort that file by the 6th column (the sum of posterior scores of all bins in the domain)?

Thanks in advance for your help.

Song, Qiang

Aug 1, 2014, 1:10:04 AM
to RSEG Users
Hi Toby,

You are right. I would also suggest sorting the enriched domains by the 6th column and choosing different cutoffs to keep the most significant ones.

Best,
Song Qiang
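
For readers who want to try the sort-and-cutoff approach suggested above, here is a minimal sketch in Python. It assumes the domain file is tab-separated and that the 6th column holds a numeric score, as described earlier in this thread; the function name `top_domains`, the example rows, and the values in the other columns are made up for illustration and are not real RSEG output.

```python
import csv
import io

def top_domains(domain_text, n, score_col=5):
    """Return the n rows with the highest value in the score column.

    score_col is a 0-based index, so 5 means the 6th column.
    """
    rows = [r for r in csv.reader(io.StringIO(domain_text), delimiter="\t") if r]
    # Sort by score, highest first, then keep the first n rows.
    rows.sort(key=lambda r: float(r[score_col]), reverse=True)
    return rows[:n]

# Hypothetical three-domain file; only the 6th column matters here.
example = (
    "chr1\t0\t1000\ta\t0\t12.5\n"
    "chr1\t5000\t9000\tb\t0\t3.0\n"
    "chr2\t200\t700\tc\t0\t40.0\n"
)

for row in top_domains(example, 2):
    print("\t".join(row))
```

Calling `top_domains` with a range of values for `n` gives a family of nested domain sets, but it can never return more domains than the file already contains.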

 



Toby Hocking

Aug 18, 2014, 11:14:30 AM
to rseg-s...@googlegroups.com
RSeg is not giving me enough significant regions. Even if I take all the regions in the output file, there are still many peaks which are not detected (low true positive rate).

Do you have any advice about how to increase the number of peaks that RSeg will detect?

Song, Qiang

Aug 18, 2014, 12:10:59 PM
to RSEG Users
Hi Toby,

You could try tuning the -posterior-cutoff parameter. This parameter sets the posterior cutoff that a bin must reach to be considered foreground. The default value is 0.95. Try a less stringent cutoff, such as 0.9.

I also noticed that you used "peaks" in your description. By default, rseg is optimized to identify broad domains. If you are looking for "peaks", please try a small bin size, such as 50bp or 100bp.

Hopefully this is helpful.

Best,
Song Qiang
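
To illustrate why a less stringent posterior cutoff marks more bins as foreground, here is a toy sketch in Python. It is not RSEG's implementation: the per-bin posterior values are invented, the `>=` comparison is an assumption, and treating each run of consecutive foreground bins as one domain is a simplification made only for this example. The function name `foreground_runs` is hypothetical.

```python
from itertools import groupby

def foreground_runs(posteriors, cutoff):
    """Return (start, end) bin-index pairs for runs of bins at or above cutoff.

    end is exclusive, so (1, 3) covers bins 1 and 2.
    """
    runs = []
    index = 0
    for is_foreground, group in groupby(posteriors, key=lambda p: p >= cutoff):
        length = len(list(group))
        if is_foreground:
            runs.append((index, index + length))
        index += length
    return runs

# Invented posterior values for nine consecutive bins.
posteriors = [0.10, 0.97, 0.99, 0.92, 0.96, 0.20, 0.91, 0.93, 0.05]

for cutoff in (0.95, 0.90):
    print(cutoff, foreground_runs(posteriors, cutoff))
```

In this toy data, lowering the cutoff from 0.95 to 0.90 raises the number of foreground bins from three to six, but the number of runs stays at two because two neighbouring runs merge while a new one appears. A lower cutoff therefore covers more of the genome without necessarily producing a larger domain count.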


