Option -domain-size

Dario Beraldi

Apr 10, 2012, 5:57:36 PM

to rseg-s...@googlegroups.com

Hello,

From the help of rseg/rseg-diff I found that there is an option called -domain-size which is not mentioned in the rseg manual. The help description says that -domain-size is the "Expected size of domain (Default 20000)". Would it be possible to provide a bit more explanation about this option?

The reason I'm interested in -domain-size is that the expected size of my enriched blocks could be much less than 20000 bp, so I'm trying to reduce -domain-size to something between 5000 bp and 500 bp, hoping to get more accurate and more realistic enriched regions. Is this a sensible approach to follow? Any guidance would be much appreciated...

All the best

Dario

Song, Qiang

Apr 12, 2012, 12:43:46 AM

to rseg-s...@googlegroups.com

Hi Dario,

The option -domain-size is only used to initialize the transition probability of the HMM.

Suppose that the expected size of a domain is D and the estimated bin size is B; then the self-transition probability of the foreground state is initialized to (1 - B/D). Thereafter the option -domain-size is no longer used.

So by setting an appropriate value for the expected domain size, we may get better initial parameters for the HMM, which may speed up the training procedure. However, the effect of these initial values should be overwhelmed by the real data as we iterate more times.
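
As a rough illustration of the initialization described above, here is a minimal Python sketch. It is not RSEG's actual code: the function names, the bin size of 1000 bp, and the choice to reject B > D are all assumptions made for the example. The second function uses the standard fact that a state with self-transition probability p has an expected run length of 1/(1 - p) bins, so the initial value (1 - B/D) corresponds to an expected domain of D bp.

```python
def initial_self_transition(domain_size: float, bin_size: float) -> float:
    """Initial foreground self-transition probability, 1 - B/D.

    domain_size (D) and bin_size (B) are both in base pairs.
    """
    if domain_size <= 0 or bin_size <= 0:
        raise ValueError("domain_size and bin_size must be positive")
    if bin_size > domain_size:
        # Assumption of this sketch: 1 - B/D would be negative here, which
        # is not a valid probability. How RSEG treats this case is not
        # stated in the thread.
        raise ValueError("bin_size must not exceed domain_size")
    return 1.0 - bin_size / domain_size


def expected_domain_length(self_transition: float, bin_size: float) -> float:
    """Expected run length in bp for a state with geometric duration."""
    return bin_size / (1.0 - self_transition)


# Hypothetical bin size; RSEG estimates the real value from the data.
BIN_SIZE = 1000

for domain_size in (20000, 5000, 2000):
    p = initial_self_transition(domain_size, BIN_SIZE)
    length = expected_domain_length(p, BIN_SIZE)
    print(f"D={domain_size} bp: initial self-transition {p:.3f}, "
          f"expected domain length {length:.0f} bp")
```

A smaller D gives a smaller starting self-transition probability, that is, shorter domains in the initial model. Since this is only a starting point for training, the final estimates are expected to be driven mainly by the data.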

Hope this is helpful.

Regards,

Song Qiang

dario

Apr 12, 2012, 3:35:51 AM

to RSEG Users

Once again, thanks for the explanation.

Dario

