questions on liftover parameters when lifting regions between species

Tzachi Hagai

Nov 29, 2016, 6:51:42 PM
to gen...@soe.ucsc.edu
Hello,

I have data on regulatory regions from human (ChIP-seq peaks that vary in length from ~150 to ~30,000 bp), and I would like to look at the corresponding regions in primates and rodents.

I was wondering what exactly the parameters minMatch, minSizeQ, and minSizeT mean, and how changing them affects the results when moving between genomes of different species.

With respect to minSizeQ and minSizeT: do I understand correctly that I need to set them to the length of the shortest regulatory region in my human data (~150 bp)?
What are the default values of these parameters? (I guess they cannot really be zero since that will match them to anything?)

What would be an "appropriate" value of minMatch when looking at syntenic regions of closely related mammals with good-quality genomes (such as human and mouse)? I saw an example from ENCODE (https://github.com/ENCODE-DCC/kentUtils/tree/master/src/hg/liftOver) where they set the threshold to 0.01, but isn't this too low?


Thank you very much
Tzachi

Luvina Guruvadoo

Dec 12, 2016, 6:01:55 PM
to Tzachi Hagai, gen...@soe.ucsc.edu
Hello Tzachi,

Thank you for your question. One of our engineers recommends using the following settings:

minSizeQ=150
minSizeT=150
minMatch=0.1
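
To make the minMatch value concrete: the liftOver help text describes minMatch as the minimum ratio of bases that must remap (default 0.95). The Python sketch below only illustrates that ratio check and is not liftOver's actual code; the function name and the example numbers are hypothetical.

```python
def passes_min_match(region_len, remapped_bases, min_match=0.1):
    """Return True if the remapped fraction of a region reaches min_match.

    Illustration only: region_len is the length of the input region in bp,
    remapped_bases is how many of those bases land in the other genome.
    """
    if region_len <= 0:
        raise ValueError("region_len must be positive")
    return remapped_bases / region_len >= min_match


# A 150 bp peak with 20 remapped bases clears minMatch=0.1 ...
print(passes_min_match(150, 20))          # True
# ... but is far below liftOver's default of 0.95.
print(passes_min_match(150, 20, 0.95))    # False
# A 30,000 bp peak needs about 3,000 remapped bases at minMatch=0.1.
print(passes_min_match(30000, 2000))      # False
```

With a low minMatch, a long peak can be reported as lifted even when only a small part of it aligns, so it may be worth checking the lengths of the output regions against the input.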

Specifying minSizeQ and minSizeT causes liftOver to skip very small chains, so fewer low-quality short alignments are output and the run is a little faster.
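
As a rough sketch of how these settings could be passed on the command line, the Python snippet below assembles a liftOver call of the form `liftOver oldFile map.chain newFile unMapped` with the three options above. The file names are placeholders, the helper name is hypothetical, and the chain file is only an assumed example for human to mouse. The liftOver help text ties the chain-size options to `-multiple`, so it is worth running `liftOver` with no arguments to check the option descriptions for your version.

```python
def build_liftover_cmd(in_bed, chain, out_bed, unmapped_bed,
                       min_match=0.1, min_size_q=150, min_size_t=150,
                       multiple=False):
    """Build the argument list for a liftOver run (does not execute it)."""
    cmd = [
        "liftOver",
        f"-minMatch={min_match}",
        f"-minSizeQ={min_size_q}",
        f"-minSizeT={min_size_t}",
    ]
    if multiple:
        # Allow one input region to map to several output regions.
        cmd.append("-multiple")
    # Positional arguments: input BED, chain file, lifted output, unmapped output.
    cmd += [in_bed, chain, out_bed, unmapped_bed]
    return cmd


# Placeholder file names; substitute your own peaks and chain file.
cmd = build_liftover_cmd(
    "human_peaks.bed",
    "hg38ToMm10.over.chain.gz",
    "mouse_peaks.bed",
    "unmapped.bed",
)
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once the liftOver binary is on your PATH and the input files exist.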

If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Regards,
Luvina

--
Luvina Guruvadoo
UCSC Genome Browser

http://genome.ucsc.edu



