Hi, I have a question about UCSC liftover.


주혜연

Jul 16, 2021, 12:40:46 PM
to gen...@soe.ucsc.edu

Hi, I'm using UCSC LiftOver to convert coordinates from hg19 to hg38, and I got a result that I don't understand.


  • Feb. 2009 (GRCh37/hg19) → Dec. 2013 (GRCh38/hg38)
    • chr1:120904787 → chr1:143905854
  • Dec. 2013 (GRCh38/hg38) → Feb. 2009 (GRCh37/hg19)
    • chr1:143905854 → chr1:149400430

(I didn't check "Allow multiple output regions".)


I expected the round trip to return the original position, so chr1:120904787 and chr1:149400430 should be the same, not different.

Also, I checked the browser, and the mapping chr1:120904787 (hg19) → chr1:143905854 (hg38) does not look correct.



I got the same result when I downloaded the chain file (hg19ToHg38.over.chain.gz) and used PyLiftover.
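
For reference, the round trip described above could be reproduced with something like the following sketch. It assumes the third-party pyliftover package, whose `LiftOver(from_db, to_db)` constructor downloads the matching chain file from UCSC and whose `convert_coordinate` method takes 0-based positions. The helper names `round_trip` and `demo_with_pyliftover` are only illustrative and are not part of any library.

```python
def round_trip(forward, backward, chrom, pos):
    """Lift (chrom, pos) with `forward`, then lift the result with `backward`.

    Both converters only need a pyliftover-style convert_coordinate(chrom, pos)
    method that returns a list of (chrom, pos, strand, score) tuples, or
    None / an empty list when nothing maps. Positions are 0-based.
    Returns (lifted, returned); either element is None if that step failed.
    """
    hits = forward.convert_coordinate(chrom, pos)
    if not hits:
        return None, None
    lifted = (hits[0][0], hits[0][1])  # keep only the first hit
    back = backward.convert_coordinate(*lifted)
    if not back:
        return lifted, None
    return lifted, (back[0][0], back[0][1])


def demo_with_pyliftover():
    """Run the check from this thread (needs pyliftover and network access)."""
    from pyliftover import LiftOver  # pip install pyliftover

    hg19_to_hg38 = LiftOver("hg19", "hg38")
    hg38_to_hg19 = LiftOver("hg38", "hg19")
    # Browser position chr1:120904787 is 1-based, so subtract 1 for pyliftover.
    lifted, returned = round_trip(hg19_to_hg38, hg38_to_hg19, "chr1", 120904787 - 1)
    print("hg19 -> hg38:", lifted)
    print("hg38 -> hg19:", returned)
```

Calling `demo_with_pyliftover()` prints both steps, so the forward and reverse positions can be compared directly. Remember to add 1 to the printed positions before comparing them with browser coordinates.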


If you know the reason, please reply.


Thanks.







--
Hyeyeon Ju
Bioinformatics and Cancer Genomics Lab
(22012) Room 309, Building 29, 119 Academy-ro (Songdo-dong), Yeonsu-gu, Incheon
Tel: 032-832-4652
Phone: 010-2024-9234


Matthew Speir

Jul 23, 2021, 11:47:43 AM
to 주혜연, UCSC Genome Browser Discussion List
Hello, Hyeyeon Ju.

Thank you for your question about UCSC LiftOver.

You have found one of the cases where our LiftOver chains give the wrong answer about how a region maps between two assemblies. This usually happens when the contig used to build the assembly has been partially or fully replaced in the new assembly. That appears to be the case here, as you can see from the red line in the "Contigs Dropped or Changed from GRCh37(hg19) to GRCh38(hg38)" track in this session: http://genome.ucsc.edu/s/mspeir/hg19_chr1_120904787. In addition, this position falls within a segmental duplication that is more than 99% similar to another region of the genome. My guess is that this other region is the one our LiftOver tool matches your position to in the hg38 assembly.

A little bit more detail from one of our engineers about how we create our LiftOver chain files:

Our liftOver chains are taken from the nets, and the nets are single-coverage on the target genome (currently the "from" genome, although there are cases in which it might make sense to go the other way).
 
When a region of hg19 maps well to two different regions of hg38, only one of those regions is kept in the net & liftOver chains, and conversely, when a region of hg38 maps well to two different regions of hg19, only one can be kept. So we expect there to be some regions that don't map symmetrically because the single-coverage restriction means they can't.
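
The single-coverage restriction can be illustrated with a toy model. This is only a sketch: the coordinates, the `Block` type, and the two block lists are made up for illustration and are not taken from the real chain files. Two duplicated "old" regions both map to the same "new" region, but the reverse direction can keep only one of them, so a position from the other copy does not come back to where it started.

```python
from typing import List, NamedTuple, Optional


class Block(NamedTuple):
    """One ungapped alignment block: source [src_start, src_end) -> dst_start."""
    src_start: int
    src_end: int
    dst_start: int


def lift(blocks: List[Block], pos: int) -> Optional[int]:
    """Map pos through the first block that contains it, or return None."""
    for b in blocks:
        if b.src_start <= pos < b.src_end:
            return b.dst_start + (pos - b.src_start)
    return None


def is_single_coverage(blocks: List[Block]) -> bool:
    """True if no source base is covered by more than one block."""
    ordered = sorted(blocks)
    return all(a.src_end <= b.src_start for a, b in zip(ordered, ordered[1:]))


# Hypothetical old -> new blocks: two near-identical old copies (100-109 and
# 200-209) both map to the one new region 500-509. Each old base still has
# exactly one destination, so the set is single-coverage on the old assembly.
OLD_TO_NEW = [Block(100, 110, 500), Block(200, 210, 500)]

# Hypothetical new -> old blocks: single coverage on the new assembly means
# region 500-509 can point back to only one of the two old copies.
NEW_TO_OLD = [Block(500, 510, 200)]

start = 104
forward = lift(OLD_TO_NEW, start)
back = lift(NEW_TO_OLD, forward)
print(start, "->", forward, "->", back)  # prints: 104 -> 504 -> 204
```

Both block lists pass `is_single_coverage`, yet the round trip from 104 ends at 204. This is the same kind of asymmetry as in the chr1 example above.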
 
I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Training videos & resources: http://genome.ucsc.edu/training/index.html

Want to share the Browser with colleagues? Host a workshop: http://bit.ly/ucscTraining
---

Matthew Speir

UCSC Cell Browser, Quality Assurance and Data Wrangler

Human Cell Atlas, User Experience Researcher

UCSC Genome Browser, User Support

UC Santa Cruz Genomics Institute

Revealing life’s code.


