Hi rtg-tools team,
I ran vcfeval using the exact same VCF as both the truth set and the called set. The VCF has 141,820 PASS records, but vcfeval reports 141,815 TP and 5 private variants, while still reporting 100% precision and sensitivity. Looking at those 5 private variants, they appear to be exactly the same in both inputs, but each one might be on the same haplotype as another variant in the input VCF.
For example:
bcftools view $vcf.vcf.gz -H -r chr1:218721137 -f PASS
chr1 218721133 . ATATATACGTATATATATATATATATATACG A . PASS DP=133;MQ=130.07;FractionInformativeReads=0.902;SoftClipRatio=0.63 GT:SQ:AD:AF:F1R2:F2R1:DP:SB:MB 0/0:0:25,0:0:15,0:10,0:25:.:. 0/1:21.37:82,12:0.1277:39,4:43,8:94:35,47,7,5:40,42,7,5
chr1 218721137 . ATACGTATATATATATATATATATACG A . PASS DP=127;MQ=127.28;FractionInformativeReads=0.897;SoftClipRatio=0.85 GT:SQ:AD:AF:F1R2:F2R1:DP:SB:MB 0/0:0:25,0:0:15,0:10,0:25:.:. 0/1:20.9:76,12:0.1364:34,5:42,7:88:37,39,5,7:41,35,4,8
bcftools view $vcf.vcf.gz -H -r chr1:43744897
chr1 43744895 . GCC G . PASS DP=173;MQ=208.17;FractionInformativeReads=0.994;SoftClipRatio=0.8 GT:SQ:AD:AF:F1R2:F2R1:DP:SB:MB 0/0:0:44,0:0:21,0:23,0:44:.:. 0/1:19.78:121,5:0.0397:65,3:56,2:126:50,71,2,3:72,49,2,3
chr1 43744897 . C A . PASS DP=170;MQ=207.43;FractionInformativeReads=0.988;SoftClipRatio=0.4 GT:SQ:AD:AF:F1R2:F2R1:DP:SB:MB 0/0:0:43,0:0:20,0:23,0:43:.:. 0/1:19.76:119,5:0.0403:65,1:54,4:124:49,70,1,4:69,50,3,2
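To make the relationship between the paired records explicit, here is a minimal Python sketch (not part of rtg-tools or bcftools) that computes the reference span of each record as POS to POS + len(REF) - 1 and checks whether the two records in each pair overlap. The helper names `ref_span` and `spans_overlap` are hypothetical, and the POS/REF values are copied from the output above.

```python
# Hypothetical helpers for illustration; not part of rtg-tools or bcftools.

def ref_span(pos, ref):
    """Return the 1-based inclusive reference interval covered by a VCF record."""
    return pos, pos + len(ref) - 1


def spans_overlap(a, b):
    """True if two inclusive intervals share at least one reference base."""
    return a[0] <= b[1] and b[0] <= a[1]


# (POS, REF) pairs copied from the bcftools output above.
pairs = {
    "chr1:218721133 / chr1:218721137": (
        (218721133, "ATATATACGTATATATATATATATATATACG"),
        (218721137, "ATACGTATATATATATATATATATACG"),
    ),
    "chr1:43744895 / chr1:43744897": (
        (43744895, "GCC"),
        (43744897, "C"),
    ),
}

overlaps = {
    name: spans_overlap(ref_span(*first), ref_span(*second))
    for name, (first, second) in pairs.items()
}

for name, hit in overlaps.items():
    print(name, "overlap" if hit else "no overlap")
```

Both pairs overlap on the reference, which is consistent with the idea that the two records interact during haplotype matching. It does not by itself show why vcfeval leaves one of them unmatched.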
Is this behaviour because rtg-tools is haplotype-based?
Is this the correct behaviour?
I would expect that if there are private variants (FP, FN), precision and recall should not be 100%. Is that right?
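As a sanity check on that expectation, here is a short Python sketch of what precision and sensitivity would be if the 5 private variants were counted as FP and FN. That they are counted this way is an assumption; the TP count is the one quoted above.

```python
tp = 141_815  # TP reported by vcfeval
fp = 5        # assumption: the 5 private variants counted as FP
fn = 5        # assumption: the same 5 counted as FN on the truth side

precision = tp / (tp + fp)
sensitivity = tp / (tp + fn)

print(f"precision   = {precision:.6f}")
print(f"sensitivity = {sensitivity:.6f}")
print(f"precision rounded to 4 decimals = {precision:.4f}")
```

Under that assumption both values come out at about 0.99996, which becomes 1.0000 when rounded to four decimal places. So a displayed 100% would not necessarily rule out a handful of unmatched variants, depending on how the summary is rounded.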
Best,
Alexandra