Hello Jan,
While the final decision about whether this region represents a
biological duplication event or an assembly artifact is up to you, I can
point you to some additional evidence contained in our browser tracks
that you may find helpful.
Variation and Repeats: Segmental Dups
Duplications of >1000 Bases of Non-RepeatMasked Sequence
With the browser region centered at position chr5:68,865,473-68,887,387,
set this track to full. The duplication between the two regions that you
mention is annotated here. If the position is changed to
chr5:70,403,678-70,425,578, the reverse duplication is also annotated.
Position: chr5:68865474-68944233 <-> Other Position:
chr5:70346809-70425579 (-)
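
If it helps to compare the two annotated copies numerically, here is a minimal Python sketch that parses the position strings quoted above and reports their sizes and how much of each browser window they cover. It assumes the coordinates are 1-based and inclusive, as they are normally displayed in the browser. The names Region, parse_position and overlap_length are made up for this illustration and are not part of any UCSC tool.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # assumed 1-based, inclusive (as displayed in the browser)
    end: int    # assumed 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def parse_position(text: str) -> Region:
    """Parse a 'chrom:start-end' string; commas inside the numbers are allowed."""
    match = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", text.strip())
    if match is None:
        raise ValueError(f"not a chrom:start-end position: {text!r}")
    chrom, start, end = match.groups()
    return Region(chrom, int(start.replace(",", "")), int(end.replace(",", "")))


def overlap_length(a: Region, b: Region) -> int:
    """Number of bases shared by two regions (0 if they do not overlap)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


# The two ends of the duplication as listed on the Segmental Dups details page.
dup_a = parse_position("chr5:68865474-68944233")
dup_b = parse_position("chr5:70346809-70425579")

# The two browser windows mentioned in this message.
window_a = parse_position("chr5:68,865,473-68,887,387")
window_b = parse_position("chr5:70,403,678-70,425,578")

print("copy A length:", dup_a.length)
print("copy B length:", dup_b.length)
print("length difference:", abs(dup_a.length - dup_b.length))
print("window A bases inside copy A:", overlap_length(window_a, dup_a))
print("window B bases inside copy B:", overlap_length(window_b, dup_b))
```

With these coordinates the two copies come out at 78,760 and 78,771 bases, a difference of 11 bases, which is consistent with the duplication being inexact. Because the other position is on the (-) strand, the start of the first copy corresponds to the end of the second, which matches where your two regions fall.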
Genes and Gene Prediction Tracks: Yale Pseudo
This track shows identified pseudogenes as recorded in the Yale
Pseudogene Database.
With the browser region centered at position chr5:70,403,678-70,425,578,
this track contains no data. This tells me this group has not yet
identified the region as a pseudogene.
Genes and Gene Prediction Tracks: RefSeq Genes
The RefSeq Genes track shows known protein-coding genes taken from the
NCBI mRNA reference sequences collection (RefSeq).
With the browser region centered at position chr5:70,403,678-70,425,578,
this track contains no data. This suggests that the other position
(chr5:68,865,473-68,887,387) is the better match to the RefSeq mRNA and
that the two regions differ in the coding region.
Please see each track's description page for methods used to generate
the data sets. From this limited analysis, it appears that the region is
a known inexact duplication but that the cause is undetermined.
Exploring other annotation tracks may also be useful; these are just
examples of where I would start.
Hope this helps. Please let us know if we can be of additional assistance,
Jennifer Jackson
UCSC Genome Bioinformatics Group