UCSC liftOver hg19tohg38

Zhang, Zhenyu [BSD] - HGD

Mar 2, 2015, 12:20:18 PM
to gen...@soe.ucsc.edu
Hi,

This is Zhenyu, a bioinformatician at the University of Chicago. I am planning to lift over some old NGS variant-calling data using the UCSC liftOver chain files. The problem is that my old data is on GRCh37 and the new reference is GRCh38. I couldn’t find a corresponding chain file at UCSC except this one:
ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/liftOver/hg19ToHg38.over.chain.gz

My understanding is that GRCh38 is exactly the same as hg38. However, there are some differences between hg19 and GRCh37. I have the following concerns:
1. UCSC coordinates start from “0”, while others start from “1”.
2. The mitochondrial chromosome sequences differ between the two versions.

Do you know whether these differences will affect my liftOver results if I use the hg19ToHg38 chain file? Or is there a tool I can use to generate my own chain file from two different reference assemblies? Thanks a lot for your help.

Best,
Zhenyu Zhang

Steve Heitner

Mar 3, 2015, 12:25:42 PM
to Zhang, Zhenyu [BSD] - HGD, gen...@soe.ucsc.edu
Hello, Zhenyu.

For your first question concerning 0-based versus 1-based start coordinates (see also https://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1), it should not matter where you got your coordinates from. The liftOver utility just converts hg19 coordinates into their hg38 equivalent. If you enter 0-based coordinates into the liftOver utility, you will get 0-based results. If you enter 1-based coordinates into the liftOver utility, you will get 1-based results.
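
Keeping the convention consistent is easier if the conversion is done explicitly before and after lifting. The following is a minimal sketch, assuming the input consists of 1-based single-base positions (as in VCF) that are lifted as 0-based, half-open BED intervals; the function names `pos_to_bed` and `bed_to_pos` are hypothetical helpers and not part of any UCSC tool.

```python
def pos_to_bed(chrom, pos_1based):
    """Turn a 1-based position into a 0-based, half-open BED interval."""
    if pos_1based < 1:
        raise ValueError("1-based positions start at 1")
    # BED start is 0-based and inclusive; BED end is exclusive.
    return (chrom, pos_1based - 1, pos_1based)


def bed_to_pos(chrom, start, end):
    """Turn a single-base BED interval back into a 1-based position."""
    if end - start != 1:
        raise ValueError("expected a single-base interval")
    # For a single-base interval, the exclusive end equals the 1-based position.
    return (chrom, end)


# Round trip: the same convention comes out as went in.
interval = pos_to_bed("chr7", 55249071)
position = bed_to_pos(*interval)
```

Applying `pos_to_bed` before the lift and `bed_to_pos` after it leaves the data in the convention it started in.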

Regarding your second question about the differences between mitochondrial sequences, it depends on where you obtained your mitochondrial sequence. Our hg19 assembly used an older version of the mitochondrial sequence (see also http://genome.ucsc.edu/cgi-bin/hgGateway?db=hg38, Assembly Details/GRCh38 Highlights/Mitochondrial genome). If you use the UCSC hg19 mitochondrial sequence, then the hg19 to hg38 liftOver will give you the proper hg38 coordinates. If you use the official GRCh37 mitochondrial sequence, then there is no need to lift - you already have the hg38 coordinates.
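
In practice this means mitochondrial records can be separated out before the lift. The following is a rough sketch under stated assumptions: records are simple `(chrom, pos)` tuples, the mitochondrial chromosome may be named `MT`, `M`, `chrM`, or `chrMT` in the input, and the caller knows which mitochondrial sequence the data was called against. The function name `split_for_liftover` and the `mito_is_grch37` flag are hypothetical.

```python
# Assumed set of names under which the mitochondrial chromosome may appear.
MITO_NAMES = {"MT", "M", "chrM", "chrMT"}


def split_for_liftover(records, mito_is_grch37):
    """Split records into those needing a lift and those already in hg38 coordinates.

    mito_is_grch37: True if the data used the official GRCh37 mitochondrial
    sequence, False if it used the older UCSC hg19 chrM.
    """
    to_lift, passthrough = [], []
    for chrom, pos in records:
        if chrom in MITO_NAMES and mito_is_grch37:
            # Coordinates already match hg38; only the name is normalized.
            passthrough.append(("chrM", pos))
        else:
            to_lift.append((chrom, pos))
    return to_lift, passthrough


records = [("1", 1000), ("MT", 3243), ("X", 5)]
to_lift, passthrough = split_for_liftover(records, mito_is_grch37=True)
```

Records in `to_lift` would then go through liftOver with the hg19ToHg38 chain file, and those in `passthrough` would be merged back in unchanged.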

Please contact us again at gen...@soe.ucsc.edu if you have any further questions. Questions sent to that address will be archived in a publicly-accessible forum for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.

---
Steve Heitner
UCSC Genome Bioinformatics Group