I am trying to convert a BED file from hg19 to hg38. I have 3 fields in addition to the normal BED fields. My file looks like this:
chr11 189744 189744 40 6 9
chr11 189760 189760 45 9 11
chr11 189879 189879 33 2 4
chr11 189880 189880 18 3 13
chr11 189937 189937 42 3 4
chr11 189941 189941 44 4 5
chr11 192064 192064 22 4 14
chr11 192097 192097 17 4 19
chr11 192140 192140 13 3 20
chr11 192142 192142 4 1 21
I need to run liftOver in the command line because the file is too large to upload on the genome browser site. I believe the correct command is:
/shared/bin/UCSC_Tools/liftOver EOC12.bed hg19ToHg38.over.chain.gz test.txt unmapped.test.txt -bedPlus=3 -tab
However, the result is all unmapped sequences. I know there is nothing wrong with the coordinates because I have pasted some into the web browser and they convert fine. I have also tried converting this into “position” based coordinates and this works with the liftOver command line. But, I need to be able to keep those last 3 columns with my data so I think I have to make this work in the bed format. I’ve tried converting tabs to spaces, removing “chr” and nothing seems to work.
Annie Shaw Research Associate Basic Science
e sha...@mwri.magee.edu o 412.641.5794 c 412.400.1164
a 204 Craft Avenue, B430N Pittsburgh, PA 15213
Hello Annie,
Thank you for using the Genome Browser and for your question about liftOver returning only unmapped records.
The reason it did not work is that your file uses 1-based coordinates (e.g. chr1 300 300), while BED format and most of our tools require 0-based start coordinates (e.g. chr1 299 300). You can see this in your dataset because the start and end positions are the same, which in BED format describes a zero-length region.
In BED format the start is 0-based and the end is not included, so the first base of chr1 is written as chr1 0 1. If you subtract one from each start coordinate you shared with us, each position lifts over with the default settings. Alternatively, if you convert your positions from BED format to the 1-based position format (as in chrN:start-end), the file also lifts over.
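To make the relationship between the two formats concrete, here is a minimal Python sketch. The function names are made up for illustration and are not part of any UCSC tool; the only rule it encodes is the one described above (the start differs by one, the end is unchanged).

```python
def position_to_bed(chrom, start, end):
    """Convert a 1-based position (chrN:start-end) to 0-based BED fields."""
    return (chrom, start - 1, end)


def bed_to_position(chrom, start, end):
    """Convert 0-based BED fields to a 1-based position string."""
    return f"{chrom}:{start + 1}-{end}"


# First record from the example file, read as a 1-based single-base position.
print(position_to_bed("chr11", 189744, 189744))  # ('chr11', 189743, 189744)
print(bed_to_position("chr11", 189743, 189744))  # chr11:189744-189744
```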
You should be able to automatically subtract one from each start position with the following command:
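The command itself does not appear in this copy of the message. As a stand-in, and not necessarily what was originally sent, here is a small Python sketch that does the same job. It assumes the input is whitespace-separated with no spaces inside any field, that every data row has 1-based starts, and it writes tab-separated output to match the -tab option used above. The function name and the file names in the usage note are hypothetical.

```python
import sys


def shift_start(line):
    """Return one BED-like row with the start (column 2) reduced by one.

    All other columns, including the extra ones, are passed through
    unchanged and re-joined with tabs.
    """
    fields = line.split()
    fields[1] = str(int(fields[1]) - 1)
    return "\t".join(fields)


def convert(in_path, out_path):
    """Rewrite a whole file, copying blank, comment, and header lines as-is."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if not line.strip() or line.startswith(("#", "track", "browser")):
                dst.write(line)
            else:
                dst.write(shift_start(line) + "\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    convert(sys.argv[1], sys.argv[2])
```

Saved under a name of your choice (say shift_start.py), it could be run as "python shift_start.py EOC12.bed EOC12.0based.bed", after which the same liftOver command can be pointed at the new file. It is worth checking a few converted rows by hand first, since running it on a file that is already 0-based would shift the starts a second time.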
For more information about 0-based vs. position format, please see the following blog post (especially the liftOver example):
http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/
I hope this was helpful. If you have any more questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are publicly archived. If your question includes sensitive data, please send it instead to genom...@soe.ucsc.edu.
All the best,
Daniel Schmelter
UCSC Genome Browser