Queries regarding axtChain and netToAxt

Rohit Kolora

Nov 12, 2015, 12:21:45 PM

to gen...@soe.ucsc.edu

Dear UCSC developers,

I have a few questions regarding the UCSC tools for chain building and
netting from axt alignments.

1) Blocks after duplicate removal
"12 blocks after duplicate removal" - When axtChain runs, it removes
duplicates. What kind of duplicates does this refer to? Are they query-wise
duplicates, i.e. hits covered by the same query at multiple places, or
target-related, i.e. a region of the target that is covered by multiple
queries?
How is a region extended if it is covered on both strands, on one strand by
a gap and on the other by an exact hit? Are the two reported as separate
chains?

2) noSplit option in netToAxt
"Don't split chain when there is an insertion of another chain" - Does
this mean that a chain between a particular target sequence and one query
is split at the point where that target region also has a hit with another
query?
If so, is there an option to net this region but still keep the information
about these overlaps?

--
Regards,
Rohit

Hiram Clawson

Nov 12, 2015, 5:12:50 PM

to Rohit Kolora, gen...@soe.ucsc.edu

Good Afternoon Rohit:

The duplicate removal collapses blocks that begin at the same start
position in both query and target. Here is the code:

for (b = *pBoxList; b != NULL; b = next)
    {
    next = b->next;
    if (last != NULL && b->qStart == last->qStart && b->tStart == last->tStart)
        {
        /* Fold this block into previous one. */
        if (last->qEnd < b->qEnd) last->qEnd = b->qEnd;
        if (last->tEnd < b->tEnd) last->tEnd = b->tEnd;
        freeMem(b);
        }
    else
        {
        slAddHead(&newList, b);
        last = b;
        }
    }

These are the gapless chain links; they carry no strand information.
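
To make the folding rule concrete, here is a minimal, self-contained sketch of the same idea on a plain array. It is an illustration, not the axtChain source: struct box, boxCmpBoth and collapseSameStart are hypothetical names, and the sort step is an assumption added so that blocks sharing both start positions sit next to each other, which the loop needs in order to catch them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one gapless block: half-open ranges on
 * target (t) and query (q). Not the struct used by axtChain itself. */
struct box {
    int tStart, tEnd;
    int qStart, qEnd;
};

/* Order by tStart, then qStart, so same-start blocks become neighbours. */
static int boxCmpBoth(const void *va, const void *vb)
{
    const struct box *a = va, *b = vb;
    if (a->tStart != b->tStart)
        return (a->tStart < b->tStart) ? -1 : 1;
    if (a->qStart != b->qStart)
        return (a->qStart < b->qStart) ? -1 : 1;
    return 0;
}

/* Fold blocks that share both qStart and tStart into one block, keeping
 * the larger end on each axis. Works in place on the array and returns
 * the number of blocks left at the front of it. */
size_t collapseSameStart(struct box *boxes, size_t count)
{
    size_t kept = 0, i;

    assert(boxes != NULL || count == 0);
    if (count == 0)
        return 0;
    qsort(boxes, count, sizeof(boxes[0]), boxCmpBoth);
    for (i = 0; i < count; ++i) {
        struct box *last = (kept > 0) ? &boxes[kept - 1] : NULL;
        if (last != NULL && boxes[i].qStart == last->qStart
                && boxes[i].tStart == last->tStart) {
            /* Same start on both axes: extend the kept block. */
            if (last->qEnd < boxes[i].qEnd) last->qEnd = boxes[i].qEnd;
            if (last->tEnd < boxes[i].tEnd) last->tEnd = boxes[i].tEnd;
        } else {
            boxes[kept++] = boxes[i];
        }
    }
    return kept;
}
```

Given four blocks of which two start at target 100 and query 10, a call to collapseSameStart leaves three blocks, and the merged block ends at the larger of the two ends on each axis. Blocks that share only one of the two start positions are left alone.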

By default, netToAxt splits a chain wherever a gap contains an inserted
chain and writes the pieces as separate blocks in the axt output. With
-noSplit, it writes out axt blocks with those insert gaps included.
All your overlap information remains within your original chains. You can view
your chains as a 'bed' type of custom track:
chainToPsl yourFile.chain target.sizes query.sizes target.2bit query.2bit stdout \
  | pslToBed stdin yourChains.bed

Viewing these unadorned chains in the browser will illustrate how they may
overlap and what the gaps look like. When in doubt about the results of these
commands, view them in the genome browser to see the complete picture.

See also: http://genomewiki.ucsc.edu/index.php/Chains_Nets
and http://www.pnas.org/content/100/20/11484.full

--Hiram
