Hello TASSEL community, I am new to the software and have a question about converting VCF to HapMap format. In the converted HapMap file, the sample has the same allele as the reference, although it is different in my VCF. What am I doing wrong, and how can I fix it?
My sample data (input and output) is below for reference.
Thank you.
example.vcf (sample)
##fileformat=VCFv4.2
##FILTER=<ID=MIN_DP,Description="Minimum DP of 2">
##FILTER=<ID=MIN_FRS,Description="Minimum FRS of 0.9">
##FILTER=<ID=MIN_GCP,Description="Minimum GT_CONF_PERCENTILE of 0.25">
##FILTER=<ID=mask-compass,Description="Overlaps a user-input mask">
##FORMAT=<ID=COV,Number=R,Type=Integer,Description="Number of reads on ref and alt alleles">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="total read depth from gramtools">
##FORMAT=<ID=DPF,Number=1,Type=Float,Description="Depth Fraction, defined as DP divided by mean depth">
##FORMAT=<ID=FRS,Number=1,Type=Float,Description="Fraction of reads that support the genotype call">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GT_CONF,Number=1,Type=Float,Description="Genotype confidence. Difference in log likelihood of most likely and next most likely genotype">
##FORMAT=<ID=GT_CONF_PERCENTILE,Number=1,Type=Float,Description="Percentile of GT_CONF">
##contig=<ID=NC_000962.3,length=4411532>
##fileDate=2020-07-15
##minosMeanReadDepth=245.926
##source=minos, version 0.10.0
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	sample1
NC_000962.3	11	.	A	C	.	MIN_GCP	.	GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE	0/0:15:0.061:15,0:1.0:393.82:0.0
NC_000962.3	22	.	G	C	.	MIN_GCP	.	GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE	0/0:29:0.1179:29,0:1.0:514.54:0.0
NC_000962.3	29	.	C	G	.	MIN_GCP	.	GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE	0/0:42:0.1708:42,0:1.0:620.38:0.0
NC_000962.3	31	.	A	G	.	MIN_GCP	.	GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE	0/0:46:0.187:46,0:1.0:652.08:0.0
NC_000962.3	64	.	G	C	.	MIN_GCP	.	GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE	0/0:101:0.4107:101,0:1.0:1061.26:0.0
NC_000962.3	67	.	G	A	.	MIN_GCP	.	GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE	0/0:107:0.4351:107,0:1.0:1103.68:0.0
NC_000962.3	69	.	C	T	.	MIN_GCP	.	GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE	0/0:110:0.4473:110,0:1.0:1124.76:0.01
NC_000962.3	71	.	C	T	.	MIN_GCP	.	GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE	0/0:114:0.4636:114,0:1.0:1152.75:0.01
NC_000962.3	76	.	G	C	.	MIN_GCP	.	GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE	0/0:125:0.5083:125,0:1.0:1229.01:0.03
Command I used for running TASSEL:
perl run_pipeline.pl -vcf example.vcf -export example_HMP.txt -exportType HapMap
example_HMP.txt (output)
rs#	alleles	chrom	pos	strand	assembly#	center	protLSID	assayLSID	panelLSID	QCcode	sample1
SNC_000962.3_11	A	NC_000962.3	11	+	NA	NA	NA	NA	NA	NA	A
SNC_000962.3_22	G	NC_000962.3	22	+	NA	NA	NA	NA	NA	NA	G
SNC_000962.3_29	C	NC_000962.3	29	+	NA	NA	NA	NA	NA	NA	C
SNC_000962.3_31	A	NC_000962.3	31	+	NA	NA	NA	NA	NA	NA	A
SNC_000962.3_64	G	NC_000962.3	64	+	NA	NA	NA	NA	NA	NA	G
SNC_000962.3_67	G	NC_000962.3	67	+	NA	NA	NA	NA	NA	NA	G
SNC_000962.3_69	C	NC_000962.3	69	+	NA	NA	NA	NA	NA	NA	C
SNC_000962.3_71	C	NC_000962.3	71	+	NA	NA	NA	NA	NA	NA	C
SNC_000962.3_76	G	NC_000962.3	76	+	NA	NA	NA	NA	NA	NA	G
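
For cross-checking the two files, here is a minimal Python sketch of how the GT field in each VCF row maps to alleles. It assumes the standard VCF convention that GT index 0 is the REF allele and 1, 2, ... are the ALT alleles in order. The helper name decode_gt is hypothetical and is not part of TASSEL; it only illustrates the lookup and says nothing about how TASSEL does the conversion internally.

```python
# Sketch only: decode the GT sub-field of a VCF sample column into alleles.
# Assumption: GT index 0 = REF, 1.. = ALT alleles (standard VCF convention).

FORMAT = "GT:DP:DPF:COV:FRS:GT_CONF:GT_CONF_PERCENTILE"


def decode_gt(ref, alt, fmt, sample_field):
    """Return the called alleles for one sample, e.g. ['A', 'A']."""
    alleles = [ref] + alt.split(",")          # index 0 is REF, 1.. are ALTs
    keys = fmt.split(":")
    values = sample_field.split(":")
    gt = values[keys.index("GT")]             # e.g. "0/0"
    calls = gt.replace("|", "/").split("/")   # treat phased and unphased alike
    # "." marks a missing call; report it as N
    return ["N" if c == "." else alleles[int(c)] for c in calls]


# First three rows of example.vcf: (POS, REF, ALT, sample1 column)
rows = [
    (11, "A", "C", "0/0:15:0.061:15,0:1.0:393.82:0.0"),
    (22, "G", "C", "0/0:29:0.1179:29,0:1.0:514.54:0.0"),
    (29, "C", "G", "0/0:42:0.1708:42,0:1.0:620.38:0.0"),
]

for pos, ref, alt, sample in rows:
    print(pos, "/".join(decode_gt(ref, alt, FORMAT, sample)))
```

Under that assumption, the three rows decode to A/A, G/G and C/C, which can be compared with the sample1 column of the HapMap output at positions 11, 22 and 29.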