Liftover for vcf.gz file


Khadija Sana

Dec 9, 2022, 12:34:17 PM
to genome...@soe.ucsc.edu
Dear Team,
I am trying to run liftOver on a cleaned vcf.gz file with the chain file hg18ToHg19.over.chain.gz, using this command:

liftOver xyz.vcf.gz.clean.vcf  hg18ToHg19.over.chain.gz xyz.vcf.gz.clean.hg19lift.chrbed xyz.vcf.gz.clean.hg19liftt.unmapped.curbed

but I am getting the error:
Reading liftover chains
Mapping coordinates
invalid unsigned integer: "."

Here is what my VCF file looks like:
##fileformat=VCFv4.2
##fileDate=20221205
##source=PLINKv1.90
##contig=<ID=1,length=247177331>
##contig=<ID=10,length=135284979>
##contig=<ID=11,length=134445627>
##contig=<ID=12,length=132288870>
##contig=<ID=13,length=114108296>
##contig=<ID=14,length=106358709>
##contig=<ID=15,length=100216155>
##contig=<ID=16,length=88677424>
##contig=<ID=17,length=78653170>
##contig=<ID=18,length=76116153>
##contig=<ID=19,length=63779292>
##contig=<ID=2,length=242692821>
##contig=<ID=20,length=62382908>
##contig=<ID=21,length=46919232>
##contig=<ID=22,length=49524957>
##contig=<ID=3,length=199322660>
##contig=<ID=4,length=191173723>
##contig=<ID=5,length=180642522>
##contig=<ID=6,length=170761396>
##contig=<ID=7,length=158812248>
##contig=<ID=8,length=146264219>
##contig=<ID=9,length=140147761>
##contig=<ID=M,length=16394>
##contig=<ID=X,length=154582607>
##contig=<ID=XY,length=154871187>
##contig=<ID=Y,length=27108407>
##INFO=<ID=PR,Number=0,Type=Flag,Description="Provisional reference allele, may not be based on real reference genome">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  MES.1106        MES.1108        MES.1109        MES.1110        MES.1111        MES.1113        MES.1114        MES.1115        MES.1116        X1AL    X2AL    X4AL    X5AL    X7AL    X9AL    X11AL   X13AL   X15AL   X16AL   X17AL   X18AL   X19AL   X20AL   X21AL   X22AL   X23AL   X24AL   X25AL   X26AL   X27AL   X28AL   X29AL   X30AL   X31AL   X32AL   X33AL   X34AL   X35AL   X36AL   X37AL   X38AL   X39AL   X40AL   X41AL   X42AL   X43AL   X44AL   X45AL   X46AL   X47AL   X48AL   X49AL   X50AL
   X51AL   X52AL   X53AL   X54AL   X55AL   X56AL   X57AL   X58AL   X59AL   X60AL   X61AL   X62AL   X63AL   X64AL   X65AL   X66AL   X67AL   X68AL   X69AL   X70AL   X71AL   X72AL   X73AL   X74AL   X75AL   X76AL   X77AL
1       742429  .       A       G       .       .       PR      GT      0/0     0/0     0/0     0/1     0/0     0/0     0/1     0/0     0/0     0/0     0/1     0/0     0/1     0/0     0/1
     0/0     0/0     0/0     0/1     0/1     0/1     0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/1     0/0     0/0     0/0     0/1     0/0     0/0     0/0     0/0     0/0     0/0     0/1     0/0     0/0     0/1     0/0     0/1     0/0     0/0     0/1     0/0     0/1     0/0     0/0     0/0     0/0     0/0     0/1     0/0     0/0     0/1     0/1     0/0     1/1
     0/1     0/1     0/0     0/0     0/0     0/1     0/0     0/0     1/1     0/0     0/1     0/1     0/0     0/0     0/0     0/0     0/1     0/0

Please help me out.

Best regards,
Khadija

Luis Nassar

Dec 14, 2022, 9:28:03 PM
to Khadija Sana, genome...@soe.ucsc.edu

Hello, Khadija.

Thank you for your interest in the Genome Browser.

By default, the liftOver utility requires the files to be in BED format (http://www.genome.ucsc.edu/FAQ/FAQformat.html#format1). It also supports GFF, but not VCF.

Additionally, the chromosome names are expected to be in 'UCSC standard', meaning 'chr1' instead of '1'.

We do offer some command-line utilities to facilitate these conversions, e.g. for Linux:

vcfToBed: https://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/vcfToBed
chromToUcsc (converts chromosome names such as '1' to 'chr1'): https://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/chromToUcsc
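
For illustration only, the two conversions described above (VCF records to BED intervals, and '1' to 'chr1') can be sketched in a few lines of Python. This is not how vcfToBed or chromToUcsc are implemented; the function names `ucsc_chrom` and `vcf_line_to_bed` are hypothetical, and mapping PLINK's 'XY' contig to 'chrX' is an assumption that should be checked against your data. The sketch also suggests why the error appears: when a VCF line is read as BED, the third column (the ID, here ".") is most likely parsed as the end coordinate, which must be an integer.

```python
def ucsc_chrom(name):
    """Map a PLINK-style contig name ('1', 'X', 'M') to UCSC style ('chr1')."""
    if name.startswith("chr"):
        return name
    # Assumption: PLINK's 'XY' (pseudo-autosomal region) is placed on chrX,
    # and both 'M' and 'MT' mean the mitochondrial genome.
    special = {"M": "chrM", "MT": "chrM", "XY": "chrX"}
    return special.get(name, "chr" + name)


def vcf_line_to_bed(line):
    """Return a 4-column BED line for one VCF record, or None for headers."""
    if line.startswith("#") or not line.strip():
        return None
    chrom, pos, vid, ref = line.split()[:4]
    start = int(pos) - 1      # VCF positions are 1-based, BED starts are 0-based
    end = start + len(ref)    # BED end is exclusive and covers the REF allele
    name = vid if vid != "." else f"{chrom}:{pos}"  # keep a key to rejoin later
    return "\t".join([ucsc_chrom(chrom), str(start), str(end), name])


def vcf_to_bed(vcf_path, bed_path):
    """Write one BED line per VCF record (expects an uncompressed VCF)."""
    with open(vcf_path) as vcf, open(bed_path, "w") as bed:
        for line in vcf:
            bed_line = vcf_line_to_bed(line)
            if bed_line is not None:
                bed.write(bed_line + "\n")
```

The resulting BED file can then be given to liftOver in place of the VCF, with the same argument order as in the original command (input file, chain file, output file, unmapped file). Note that this only lifts the coordinates; the genotype columns stay in the VCF and would need to be rejoined using the name column.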

However, another option would be CrossMap, which can perform better for SNPs and supports the VCF format. You can learn more about the program from their website, http://crossmap.sourceforge.net/. Unfortunately, we do not maintain CrossMap, so I cannot provide guidance on how to use the tool if you have any questions. They do have an active community on Biostars, though (https://www.biostars.org/).

Also, it appears your VCF may have been created from a PLINK file. You may also find the following Biostars question helpful: https://www.biostars.org/p/252938/

I hope this is helpful. Please include gen...@soe.ucsc.edu in any replies to ensure visibility by the team. All messages sent to that address are archived on our public forum. If your question includes sensitive information, you may send it instead to genom...@soe.ucsc.edu.

Lou Nassar
UCSC Genomics Institute

