Converting VCF to Hapmap [Query]


Pranavathiyani G

Feb 4, 2025, 8:43:53 AM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
Hello TASSEL community,

I am new to the software and have a question about converting VCF to HapMap format. In the converted HapMap file, the sample's allele is the same as the reference allele at every site, although it is different in my VCF. What am I doing wrong, and how can I fix this?

Please see below my sample data (input and output) for reference.

Thank you.

example.vcf (sample)
##fileformat=VCFv4.2
##FILTER=<ID=MIN_DP,Description="Minimum DP of 2">
##FILTER=<ID=MIN_FRS,Description="Minimum FRS of 0.9">
##FILTER=<ID=MIN_GCP,Description="Minimum GT_CONF_PERCENTILE of 0.25">
##FILTER=<ID=mask-compass,Description="Overlaps a user-input mask">
##FORMAT=<ID=COV,Number=R,Type=Integer,Description="Number of reads on ref and alt alleles">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="total read depth from gramtools">
##FORMAT=<ID=DPF,Number=1,Type=Float,Description="Depth Fraction, defined as DP divided by mean depth">
##FORMAT=<ID=FRS,Number=1,Type=Float,Description="Fraction of reads that support the genotype call">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GT_CONF,Number=1,Type=Float,Description="Genotype confidence. Difference in log likelihood of most likely and next most likely genotype">
##FORMAT=<ID=GT_CONF_PERCENTILE,Number=1,Type=Float,Description="Percentile of GT_CONF">
##contig=<ID=NC_000962.3,length=4411532>
##fileDate=2020-07-15
##minosMeanReadDepth=245.926
##source=minos, version 0.10.0
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1
NC_000962.3 11 . A C . MIN_GCP . GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE 0/0:15:0.061:15,0:1.0:393.82:0.0
NC_000962.3 22 . G C . MIN_GCP . GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE 0/0:29:0.1179:29,0:1.0:514.54:0.0
NC_000962.3 29 . C G . MIN_GCP . GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE 0/0:42:0.1708:42,0:1.0:620.38:0.0
NC_000962.3 31 . A G . MIN_GCP . GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE 0/0:46:0.187:46,0:1.0:652.08:0.0
NC_000962.3 64 . G C . MIN_GCP . GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE 0/0:101:0.4107:101,0:1.0:1061.26:0.0
NC_000962.3 67 . G A . MIN_GCP . GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE 0/0:107:0.4351:107,0:1.0:1103.68:0.0
NC_000962.3 69 . C T . MIN_GCP . GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE 0/0:110:0.4473:110,0:1.0:1124.76:0.01
NC_000962.3 71 . C T . MIN_GCP . GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE 0/0:114:0.4636:114,0:1.0:1152.75:0.01
NC_000962.3 76 . G C . MIN_GCP . GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE 0/0:125:0.5083:125,0:1.0:1229.01:0.03



Command I used for running TASSEL: perl run_pipeline.pl -vcf example.vcf -export example_HMP.txt -exportType HapMap

HapMap output (sample)
rs# alleles chrom pos strand assembly# center protLSID assayLSID panelLSID QCcode sample1
SNC_000962.3_11 A NC_000962.3 11 + NA NA NA NA NA NA A
SNC_000962.3_22 G NC_000962.3 22 + NA NA NA NA NA NA G
SNC_000962.3_29 C NC_000962.3 29 + NA NA NA NA NA NA C
SNC_000962.3_31 A NC_000962.3 31 + NA NA NA NA NA NA A
SNC_000962.3_64 G NC_000962.3 64 + NA NA NA NA NA NA G
SNC_000962.3_67 G NC_000962.3 67 + NA NA NA NA NA NA G
SNC_000962.3_69 C NC_000962.3 69 + NA NA NA NA NA NA C
SNC_000962.3_71 C NC_000962.3 71 + NA NA NA NA NA NA C
SNC_000962.3_76 G NC_000962.3 76 + NA NA NA NA NA NA G
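
To cross-check the output by hand, the GT field of each record can be decoded into allele letters. The sketch below is only an illustration, not part of TASSEL: it assumes the standard VCF convention that allele index 0 is REF and 1 is the first ALT, and that GT is the first key in the FORMAT column (as it is in the records above). The function name `decode_gt` is hypothetical.

```python
def decode_gt(ref, alt, sample_field):
    """Translate a VCF sample field such as '0/0:15:...' into allele letters.

    Assumes GT is the first colon-separated entry of the sample field.
    """
    alleles = [ref] + alt.split(",")         # index 0 = REF, 1.. = ALT alleles
    gt = sample_field.split(":")[0]          # e.g. '0/0'
    calls = gt.replace("|", "/").split("/")  # treat phased and unphased alike
    # '.' marks a missing call; report it as 'N'
    return [alleles[int(c)] if c != "." else "N" for c in calls]


# First record of example.vcf: REF=A, ALT=C, sample field starts with 0/0
print(decode_gt("A", "C", "0/0:15:0.061:15,0:1.0:393.82:0.0"))  # ['A', 'A']
```

Under this convention a GT of 0/0 decodes to the REF allele on both copies, while 1/1 would decode to the ALT allele, so comparing the GT column (rather than the ALT column) against the HapMap genotype may help narrow down where the mismatch comes from.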
