Hi,
Is there any information on how the segmental duplication (segdup) annotations were prepared for GRCh38/hg38? Judging by the presence of short (<1 kb) intervals, they appear to have been produced by a liftOver of the GRCh37/hg19 co-ordinates rather than by a re-analysis; is that correct?
Kind regards,
Luke
--
Dear Luvina,
Thank you for your reply.
I had read the documentation page you linked, but the information it contains appears to be incorrect. For example, it states that the segmental duplications should be at least 1 kb in size, but 23 intervals are shorter than 1 kb, the shortest being 381 bp. Do you know why this is?
Kind regards,
Luke
--
Hello, Luke.
While there are several items in this track whose genomic size is less than 1 kb, the items to which they align are always greater than 1 kb in size. You can confirm this by opening the details page for any of these items and looking at the “Aligned Bases” field. For example, the item whose length is 381 bp is named chr2:87419158 and is located at chr2:87,412,595-87,412,975. The item to which it aligns is 2,286 bases in size. These items were left in the data set at the request of the data provider.
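The size check described above can be reproduced with a short script. What follows is only a minimal sketch in Python, not the method used to build the track: the helper names (parse_position, position_length, short_items) are hypothetical, and it assumes positions are written in the browser's 1-based, inclusive "chrom:start-end" form. The single data row is the example quoted above; the aligned size is copied from the message, not computed.

```python
def parse_position(position):
    """Split a browser position such as 'chr2:87,412,595-87,412,975'
    into (chrom, start, end). Coordinates are assumed to be 1-based, inclusive."""
    chrom, span = position.split(":")
    start_text, end_text = span.replace(",", "").split("-")
    return chrom, int(start_text), int(end_text)


def position_length(position):
    """Genomic size of a 1-based, inclusive position string."""
    _, start, end = parse_position(position)
    return end - start + 1


def short_items(items, min_size=1000):
    """Keep the (position, aligned_bases) pairs whose genomic size is below min_size."""
    return [(pos, aligned) for pos, aligned in items
            if position_length(pos) < min_size]


# The one item quoted in the reply above, with its reported aligned size.
items = [("chr2:87,412,595-87,412,975", 2286)]

for pos, aligned in short_items(items):
    print(pos, "genomic size:", position_length(pos), "aligned bases:", aligned)
```

If the intervals are taken from a BED file or a table dump instead, the start co-ordinate is 0-based and the size is simply end minus start, without the +1.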
Please contact us again at gen...@soe.ucsc.edu if you have any further questions. All messages sent to that address are archived on a publicly accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
---
Steve Heitner
UCSC Genome Bioinformatics Group
--