Good Afternoon Rohit:
Duplicate removal collapses blocks that begin at the same start
position in both query and target. Here is the code:
    for (b = *pBoxList; b != NULL; b = next)
        {
        next = b->next;
        if (last != NULL && b->qStart == last->qStart && b->tStart == last->tStart)
            {
            /* Fold this block into previous one. */
            if (last->qEnd < b->qEnd) last->qEnd = b->qEnd;
            if (last->tEnd < b->tEnd) last->tEnd = b->tEnd;
            freeMem(b);
            }
        else
            {
            slAddHead(&newList, b);
            last = b;
            }
        }
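If it helps to experiment with this outside the source tree, here is a minimal standalone sketch of the same folding logic. It is not the library code: struct block, blockNew and collapseSameStart are hypothetical stand-ins, plain malloc/free replace the library memory helpers, and it assumes the list has already been sorted so that blocks sharing both start positions sit next to each other (the fragment above does not show that step).

```c
#include <stdlib.h>

/* Hypothetical stand-in for the chain block record; only the four
 * coordinates used by the folding loop are modeled. */
struct block
    {
    struct block *next;
    int tStart, tEnd;   /* target range */
    int qStart, qEnd;   /* query range */
    };

static struct block *blockNew(int tStart, int tEnd, int qStart, int qEnd,
    struct block *next)
/* Allocate a block and link it in front of next. */
{
struct block *b = malloc(sizeof(*b));
if (b == NULL)
    abort();
b->tStart = tStart;
b->tEnd = tEnd;
b->qStart = qStart;
b->qEnd = qEnd;
b->next = next;
return b;
}

static void collapseSameStart(struct block **pList)
/* Fold neighboring blocks that share both start positions into one,
 * keeping the larger end on each axis.  Assumes the list is sorted so
 * that such blocks are adjacent. */
{
struct block *newList = NULL, *reversed = NULL, *b, *next, *last = NULL;
for (b = *pList; b != NULL; b = next)
    {
    next = b->next;
    if (last != NULL && b->qStart == last->qStart && b->tStart == last->tStart)
        {
        /* Same start on both axes: extend the kept block, drop this one. */
        if (last->qEnd < b->qEnd) last->qEnd = b->qEnd;
        if (last->tEnd < b->tEnd) last->tEnd = b->tEnd;
        free(b);
        }
    else
        {
        /* Push on the head of the new list (this builds it reversed). */
        b->next = newList;
        newList = b;
        last = b;
        }
    }
/* Restore the original order. */
for (b = newList; b != NULL; b = next)
    {
    next = b->next;
    b->next = reversed;
    reversed = b;
    }
*pList = reversed;
}
```

For example, two blocks that both start at target 100 and query 10, with target ends 150 and 180, come out as a single block ending at 180, while a block starting anywhere else is left untouched.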
These are the gapless chain links; they carry no strand information.
The split output breaks the chains at gaps that contain inserts and writes the
pieces as separate blocks in the axt output. The noSplit output writes axt blocks
with the insert gaps included.
All your overlap information remains within your original chains. You can view
your chains as a 'bed' type of custom track:
    chainToPsl yourFile.chain target.sizes query.sizes target.2bit query.2bit stdout \
      | pslToBed stdin yourChains.bed
Viewing these unadorned chains in the browser will illustrate how they may
overlap and what the gaps look like. When in doubt about the results of these
commands, view them in the genome browser to see the complete picture.
See also:
http://genomewiki.ucsc.edu/index.php/Chains_Nets
and
http://www.pnas.org/content/100/20/11484.full
--Hiram