Option -domain-size

Dario Beraldi

Apr 10, 2012, 5:57:36 PM
to rseg-s...@googlegroups.com
Hello,

From the help of rseg/rseg-diff I found that there is an option called -domain-size which is not mentioned in the rseg manual. The help description says that -domain-size is the "Expected size of domain (Default 20000)". Would it be possible to provide a bit more explanation about this option?

The reason I'm interested in -domain-size is that the expected size of my enriched blocks could be much less than 20000bp, so I'm trying to reduce -domain-size to something between 500bp and 5000bp, hoping to get more accurate and more realistic enriched regions. Is this a sensible approach to follow? Any guidance would be much appreciated...

All the best

Dario

Song, Qiang

Apr 12, 2012, 12:43:46 AM
to rseg-s...@googlegroups.com
Hi Dario,

The option -domain-size is only used to initialize the transition probabilities of the HMM.
Suppose the expected domain size is D and the estimated bin size is B; then the self-transition
probability of the foreground state is initialized to (1 - B/D).
After that, -domain-size is no longer used.
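
To make the arithmetic concrete, here is a small Python sketch of that initialization. It is only an illustration, not rseg's actual code: the function names and the 100bp bin size are made up for the example, and it assumes B < D. It also uses the standard fact that, with self-transition probability p, the expected stay in a state is 1/(1 - p) bins, so the initial value corresponds to an expected domain of D bp.

```python
def initial_self_transition(domain_size_bp: float, bin_size_bp: float) -> float:
    """Initial foreground self-transition probability, 1 - B/D.

    Hypothetical helper for illustration; not taken from the rseg source.
    """
    if bin_size_bp <= 0 or domain_size_bp <= bin_size_bp:
        raise ValueError("need 0 < bin size < domain size")
    return 1.0 - bin_size_bp / domain_size_bp


def expected_domain_length_bp(p_stay: float, bin_size_bp: float) -> float:
    """Expected run length in bp: 1 / (1 - p_stay) bins, times the bin size."""
    return bin_size_bp / (1.0 - p_stay)


if __name__ == "__main__":
    bin_size = 100  # assumed bin size in bp; rseg estimates this from the data
    for domain_size in (20000, 5000, 500):
        p = initial_self_transition(domain_size, bin_size)
        length = expected_domain_length_bp(p, bin_size)
        print(f"D={domain_size} bp  p_init={p:.4f}  expected length={length:.0f} bp")
```

A smaller D simply starts the foreground state with a lower probability of staying put; training then re-estimates that probability from the data.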

So by setting an appropriate expected domain size, we may get better initial parameters
for the HMM, which may speed up the training procedure. However, the effect of the initial
values should be overwhelmed by the real data as training iterates.

Hope this is helpful.

Regards,
Song Qiang

dario

Apr 12, 2012, 3:35:51 AM
to RSEG Users
Once again, thanks for the explanation.

Dario
