liftOver command line not working


Shaw, Patricia

Jan 23, 2020, 1:35:34 PM
to gen...@soe.ucsc.edu

I am trying to convert a BED file from hg19 to hg38. I have 3 fields in addition to the standard BED fields. My file looks like this:

chr11   189744  189744  40      6       9
chr11   189760  189760  45      9       11
chr11   189879  189879  33      2       4
chr11   189880  189880  18      3       13
chr11   189937  189937  42      3       4
chr11   189941  189941  44      4       5
chr11   192064  192064  22      4       14
chr11   192097  192097  17      4       19
chr11   192140  192140  13      3       20
chr11   192142  192142  4       1       21

I need to run liftOver from the command line because the file is too large to upload to the Genome Browser site. I believe the correct command is:

/shared/bin/UCSC_Tools/liftOver EOC12.bed hg19ToHg38.over.chain.gz test.txt unmapped.test.txt -bedPlus=3 -tab

However, every row ends up in the unmapped file. I know there is nothing wrong with the coordinates because I have pasted some into the web version and they convert fine. I have also tried converting the file to “position”-style coordinates, and that works with command-line liftOver. But I need to keep those last 3 columns with my data, so I think I have to make this work in BED format. I’ve tried converting tabs to spaces and removing “chr”, and nothing seems to work.

Annie Shaw  Research Associate Basic Science
Magee-Womens
Research Institute & Foundation

Daniel Schmelter

Jan 23, 2020, 6:31:16 PM
to Shaw, Patricia, gen...@soe.ucsc.edu

Hello Annie,

Thank you for using the Genome Browser and for your question about liftOver not working from the command line.

The reason it did not work is that your file uses 1-based coordinates (e.g. chr1 300 300), while BED files and most of our tools require zero-based, half-open coordinates (e.g. chr1 299 300). You can see this in your dataset because the start and end positions are the same.
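As a quick check for this pattern, here is a minimal sketch that counts rows whose start is not smaller than their end. The file name check.bed and its two rows are made up for illustration; the first row uses the 1-based style from the question, the second is already zero-based.

```shell
# Hypothetical two-row test file: row 1 has start == end, row 2 is zero-based.
printf 'chr11\t189744\t189744\t40\t6\t9\nchr11\t189759\t189760\t45\t9\t11\n' > check.bed

# Count rows where column 2 (start) is >= column 3 (end).
# "n + 0" makes awk print 0 instead of an empty line when nothing matches.
awk '$2 >= $3 { n++ } END { print n + 0 }' check.bed
# prints 1
```

A non-zero count on a real file suggests the starts still need to be shifted before running liftOver.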

In BED format the start coordinate is 0-based (the first base of chr1 is position 0) and the end coordinate is not included in the interval. If you subtract one from each start coordinate you shared with us, each position lifts over with the default settings. Alternatively, if you convert your positions from BED format to the 1-based position format (as in chrN:start-end), the file also lifts over.

You should be able to automatically subtract one from each start position with the following command, which also keeps the output tab-separated so that it still works with the -tab option:

awk 'BEGIN {OFS="\t"} {print $1, $2 - 1, $3, $4, $5, $6}' oldFile > newFile
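To sanity-check the conversion before running it on the full file, here is a small sketch using the first two rows from the question. The file names sample.bed and sample.0based.bed are placeholders, and the rows are assumed to be whitespace-separated with exactly six columns.

```shell
# Two rows copied from the question (1-based, start == end), written tab-separated.
printf 'chr11\t189744\t189744\t40\t6\t9\nchr11\t189760\t189760\t45\t9\t11\n' > sample.bed

# Subtract 1 from the start column. Input is split on any whitespace (awk default);
# assigning to $2 rebuilds the row with OFS, so the output is tab-separated.
awk 'BEGIN { OFS = "\t" } { $2 = $2 - 1; print }' sample.bed > sample.0based.bed

cat sample.0based.bed
# chr11	189743	189744	40	6	9
# chr11	189759	189760	45	9	11
```

The converted file can then be passed to the liftOver command from the original message in place of EOC12.bed, with the same -bedPlus=3 -tab options.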

For more information about 0-based vs. position format, please see the following blog post (especially the liftOver example):

http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/

I hope this was helpful. If you have any more questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are publicly archived. If your question includes sensitive data, please send it instead to genom...@soe.ucsc.edu.

All the best,

Daniel Schmelter
UCSC Genome Browser
