question

huang...@sjtu.edu.cn

Aug 14, 2015, 11:38:14 AM
to gen...@soe.ucsc.edu
Hello:
This is Wenze Huang. Thanks for replying. We still have some questions about the net format for a net track. I have looked at your help documentation and the track description page. Do you mean that the fill is the highest-scoring chain in the pairwise alignment? I do not understand what the gap means. Longer chains get higher scores, so how can I evaluate the similarity of the chains between the two species?
Thanks!
Wenze Huang

Matthew Speir

Aug 17, 2015, 6:46:06 PM
to huang...@sjtu.edu.cn, gen...@soe.ucsc.edu
Hi Wenze,

Thank you for your questions about the "Net" file format. Answers to
your questions are inline below:
> Do you mean that the fill is the highest-scoring chain in the pairwise alignment? I do not understand what the gap means.
In a net file, "fill" lines specify an aligning region. "Gap" lines
follow a "fill" line. All of the "gap" lines that follow a "fill" line
indicate alignment gaps in that preceding "fill" line.
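
A minimal sketch of how the fill/gap relationship can be read from a net
file, assuming the layout described on the "net" format help page: each
"fill" or "gap" line starts with the fields tStart, tSize, qName, qStrand,
qStart, qSize, optionally followed by key/value pairs, and the number of
leading spaces gives the nesting level. The function name parse_net and the
sample data are made up for illustration.

```python
# Invented sample: one fill on chrA with two gaps nested under it.
SAMPLE_NET = """\
net chrA 10000
 fill 100 300 chrB + 200 280 id 1 score 1000 ali 170
  gap 150 100 chrB + 250 80
  gap 310 30 chrB + 390 30
"""


def parse_net(lines):
    """Return top-level records; nested records are stored under 'children'."""
    top = []    # records with no parent
    stack = []  # (depth, record) pairs for the currently open ancestors
    for raw in lines:
        tokens = raw.split()
        if not tokens or tokens[0] not in ("fill", "gap"):
            continue  # skip blank lines, comments, and "net <chrom> <size>" lines
        depth = len(raw) - len(raw.lstrip(" "))  # indentation = nesting level
        rec = {
            "kind": tokens[0],
            "tStart": int(tokens[1]),
            "tSize": int(tokens[2]),
            "qName": tokens[3],
            "qStrand": tokens[4],
            "qStart": int(tokens[5]),
            "qSize": int(tokens[6]),
            # Remaining tokens are key/value pairs such as "id 1 score 1000".
            "attrs": dict(zip(tokens[7::2], tokens[8::2])),
            "children": [],
        }
        # Close any records that are at the same level or deeper.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if stack:
            stack[-1][1]["children"].append(rec)
        else:
            top.append(rec)
        stack.append((depth, rec))
    return top


fills = parse_net(SAMPLE_NET.splitlines())
for fill in fills:
    gaps = [c for c in fill["children"] if c["kind"] == "gap"]
    print(fill["tStart"], fill["tSize"], "gaps:", len(gaps))
```

In a real net file a "gap" record may itself have "fill" children, which is
where alignments from other chains fill in the gap; the same nesting logic
handles that case.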

In addition to the "net" format help page and the track description
page mentioned by my colleague Luvina, you may find the following
GenomeWiki page useful:
http://genomewiki.ucsc.edu/index.php/Chains_Nets. The sections "Basic
definitions" and "Nets in a nutshell" might be of particular interest to
you.
> Longer chains get higher scores, so how can I evaluate the similarity of the chains between the two species?
Could you give a little more detail about what you are looking for here? Are
you looking for something like percent identity?

Here is some background on how chains are scored from one of our engineers:
> The chains use a scoring scheme that is tolerant of long gaps because
> that is what enables the discovery of long-range similarities between
> diverged species. Long gaps might be explained by genomic
> rearrangements or mobile element insertions, for example. To get a
> high score, many bases must be aligned. Therefore, a long chain that
> joins many aligning blocks covering many bases will score much higher
> than a long chain that joins only two aligning blocks with not so many
> bases, and higher than a shorter chain that doesn't cover as many
> aligning bases.
>
> The possible explanation of genomic rearrangements is supported when a
> long gap in a chain is filled in by an alignment from some other
> region of the other genome; that is the purpose of the nets.
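
To make the "many bases must be aligned" point concrete, here is a rough
sketch that counts the aligned bases in a single chain, assuming the
standard chain format (a header line "chain score tName tSize tStrand
tStart tEnd qName qSize qStrand qStart qEnd id", then block lines "size dt
dq", with a final line holding only "size"). The function name
chain_summary and the sample chain are invented. Note that this measures
how much of the chain's span is covered by aligning blocks; it is not
percent identity, which would require the base-level alignment.

```python
# Invented sample: three aligning blocks (50, 60, 60 bases) joined by two gaps.
SAMPLE_CHAIN = """\
chain 1000 chrA 10000 + 100 400 chrB 9000 + 200 480 1
50 100 80
60 30 30
60
"""


def chain_summary(lines):
    """Summarize one chain: its score, aligned bases, and target-span coverage."""
    header = lines[0].split()
    if header[0] != "chain":
        raise ValueError("expected a chain header line")
    score = int(header[1])
    t_start, t_end = int(header[5]), int(header[6])

    aligned = 0
    for line in lines[1:]:
        fields = line.split()
        if not fields:
            break  # a blank line ends the chain
        aligned += int(fields[0])  # first field is the ungapped block size

    t_span = t_end - t_start
    return {
        "score": score,
        "aligned": aligned,
        "t_span": t_span,
        "aligned_fraction": aligned / t_span,
    }


summary = chain_summary(SAMPLE_CHAIN.splitlines())
print(summary)
```

Two chains with the same span can differ widely in aligned bases, which is
why a chain made of many aligning blocks scores higher than one of similar
length with only a few.
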
I hope this is helpful. If you have any further questions, please reply
to gen...@soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group