Hello!
I am using Strelka as a variant caller for INDELs. I would like to obtain the number of reads supporting the REF and the ALT allele in each sample of my tumor/normal pairs.
I have found in the documentation:
tier1RefCounts = First comma-delimited value from FORMAT/TAR
tier1AltCounts = First comma-delimited value from FORMAT/TIR
Somatic allele frequency is $tier1AltCounts / ($tier1AltCounts + $tier1RefCounts)
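Here is a minimal sketch of how I am reading that formula, in case I am misapplying it. It assumes the first sample column is the normal and the second is the tumor, and that fields are whitespace-separated; the function names are my own and not part of Strelka.

```python
def tier1_counts(format_field, sample_field):
    """Return (tier1RefCounts, tier1AltCounts) for one sample column."""
    # Pair each FORMAT key with the matching value of this sample.
    values = dict(zip(format_field.split(":"), sample_field.split(":")))
    ref = int(values["TAR"].split(",")[0])  # first value of FORMAT/TAR
    alt = int(values["TIR"].split(",")[0])  # first value of FORMAT/TIR
    return ref, alt


def allele_frequency(ref, alt):
    """tier1AltCounts / (tier1AltCounts + tier1RefCounts); None if no reads."""
    total = ref + alt
    return alt / total if total > 0 else None


def indel_counts(vcf_line):
    """Return {sample: (ref, alt, allele frequency)} for one indel record."""
    cols = vcf_line.split()
    fmt = cols[8]
    result = {}
    # Assumption: column 10 is the normal sample, column 11 the tumor sample.
    for name, sample in (("normal", cols[9]), ("tumor", cols[10])):
        ref, alt = tier1_counts(fmt, sample)
        result[name] = (ref, alt, allele_frequency(ref, alt))
    return result


# The chr14:71513930 record from my VCF.
EXAMPLE = (
    "chr14 71513930 . TG T . PASS "
    "IC=0;IHP=2;NT=ref;QSI=40;QSI_NT=39;RC=1;RU=G;SGT=ref->hom;SOMATIC;TQSI=1;TQSI_NT=1 "
    "DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 "
    "5:5:5,5:0,0:0,0:5.94:0.00:0.00 "
    "18:18:0,0:18,18:0,0:20.5:0.00:0.00"
)

print(indel_counts(EXAMPLE))
```

For that record this gives 5 REF and 0 ALT reads in the normal (frequency 0.0) and 0 REF and 18 ALT reads in the tumor (frequency 1.0).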
However, in my output the normal sample always has TIR=0 and the tumor sample always has TAR=0. Here are the first lines of my VCF:
chr4 119176585 . CTG C . PASS IC=0;IHP=2;NT=ref;QSI=41;QSI_NT=40;RC=1;RU=TG;SGT=ref->hom;SOMATIC;TQSI=1;TQSI_NT=1 DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 3:3:3,3:0,0:0,0:4.68:0.00:0.00 29:29:0,0:27,27:2,2:29.52:0.35:0.00
chr12 30887814 . ACACACACT A . PASS IC=0;IHP=2;NT=ref;QSI=41;QSI_NT=41;RC=1;RU=CACACACT;SGT=ref->hom;SOMATIC;TQSI=1;TQSI_NT=1 DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 6:6:7,8:0,0:1,0:10.03:0.21:0.00 17:17:0,0:16,16:2,2:22.46:0.19:0.00
chr14 22694510 . CTGTGTCCGTG C . PASS IC=0;IHP=2;NT=ref;QSI=40;QSI_NT=40;RC=1;RU=TGTGTCCGTG;SGT=ref->hom;SOMATIC;TQSI=1;TQSI_NT=1 DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 4:4:4,4:0,0:0,0:3.87:0.00:0.00 19:19:0,0:19,20:0,0:19.6:0.00:0.00
chr14 71513930 . TG T . PASS IC=0;IHP=2;NT=ref;QSI=40;QSI_NT=39;RC=1;RU=G;SGT=ref->hom;SOMATIC;TQSI=1;TQSI_NT=1 DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 5:5:5,5:0,0:0,0:5.94:0.00:0.00 18:18:0,0:18,18:0,0:20.5:0.00:0.00
chr15 41246237 . A ATC . PASS IC=1;IHP=2;NT=ref;QSI=40;QSI_NT=40;RC=0;RU=TC;SGT=ref->het;SOMATIC;TQSI=1;TQSI_NT=1 DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 5:5:5,5:0,0:0,0:6.65:0.00:0.00 26:26:0,0:24,24:2,2:25.5:0.00:0.00
chr16 24988884 . C CCA . PASS IC=2;IHP=6;NT=ref;QSI=35;QSI_NT=35;RC=1;RU=CA;SGT=ref->hom;SOMATIC;TQSI=1;TQSI_NT=1 DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 4:4:4,4:0,0:0,0:5.11:0.00:0.00 20:20:0,0:18,19:2,1:19.55:0.00:0.00
chr16 74566092 . ATAT A . PASS IC=0;IHP=12;NT=ref;OVERLAP;QSI=32;QSI_NT=32;RC=1;RU=TAT;SGT=ref->hom;SOMATIC;TQSI=1;TQSI_NT=1 DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 0:0:2,3:0,0:2,1:3.75:0.37:0.00 23:23:0,0:15,19:8,4:17.63:0.93:0.00
chr17 1933225 . T TG . PASS IC=4;IHP=4;NT=ref;QSI=40;QSI_NT=40;RC=3;RU=G;SGT=ref->hom;SOMATIC;TQSI=1;TQSI_NT=1 DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 8:8:7,8:0,0:1,0:8.34:0.26:0.00 16:16:0,0:16,16:0,0:17.01:0.00:0.00
chr17 41566783 . CTCT C . PASS IC=1;IHP=6;NT=ref;QSI=41;QSI_NT=41;RC=2;RU=TCT;SGT=ref->hom;SOMATIC;TQSI=1;TQSI_NT=1 DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 9:9:9,9:0,0:0,0:7.33:0.00:0.00 14:14:0,0:14,14:0,0:14.43:0.00:0.00
chr19 3192358 . TGGGCTG T . PASS IC=3;IHP=4;NT=ref;QSI=44;QSI_NT=44;RC=4;RU=GGGCTG;SGT=ref->hom;SOMATIC;TQSI=1;TQSI_NT=1 DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50 6:6:6,6:0,0:0,0:7.5:0.00:0.00 20:20:0,0:18,20:2,0:22.22:0.41:0.00
chr19 49519785 . CT C . PASS IC=0;IHP=4;NT=ref;Q
Should I use TOR as the ALT count for the normal sample and as the REF count for the tumor sample? Also, in some cases TAR is higher than the total DP of the normal sample: for the second indel in the VCF, DP is 6 but the first TAR value is 7. Why?
How can I obtain the REF and ALT counts for both the tumor and the normal sample of each pair? With these counts I could calculate the allele frequencies.
Thank you in advance for your help!
Best