This question was posted 2 years ago by another user, but there was no response:
https://groups.google.com/forum/#!topic/tassel/tMlOV440Zm8
I am using the GATK pipeline to call SNPs, and I end up with a VCFv4.2 file. Before I can load it into Tassel it needs to be sorted, so I use the Sort Genotype File plugin. The plugin sorts the file correctly, but it also completely alters the data in the sample columns. For example, it turns this:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TC002 TC002-2
KB222877.1 1258 . G A 3254.20 . AC=4;AF=1.00;AN=4;DP=85;ExcessHet=3.0103;FS=0.000;MLEAC=4;MLEAF=1.00;MQ=42.00;QD=30.89;SOR=0.716 GT:AD:DP:GQ:PL 1/1:0,50:50:99:1912,149,0 1/1:0,35:35:99:1369,105,0
KB222877.1 1528 . A G 3415.20 . AC=4;AF=1.00;AN=4;DP=90;ExcessHet=3.0103;FS=0.000;MLEAC=4;MLEAF=1.00;MQ=41.98;QD=28.85;SOR=0.738 GT:AD:DP:GQ:PL 1/1:0,43:43:99:1661,129,0 1/1:0,47:47:99:1781,140,0
KB222877.1 2973 . A C 3037.20 . AC=4;AF=1.00;AN=4;DP=79;ExcessHet=3.0103;FS=0.000;MLEAC=4;MLEAF=1.00;MQ=42.00;QD=26.67;SOR=0.941 GT:AD:DP:GQ:PL 1/1:0,40:40:99:1557,120,0 1/1:0,39:39:99:1507,117,0
into this:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TC002 TC002-2
KB222877.1 1258 SKB222877.1_1258 G A . PASS AC=4;AF=1.00;AN=4;DP=85;ExcessHet=3.0103;FS=0.000;MLEAC=4;MLEAF=1.00;MQ=42.00;QD=30.89;SOR=0.716;DP=114 GT:AD:DP:GQ:PL 1/1:53,0:53:99:0,159,255 1/1:45,0:45:99:0,135,255
KB222877.1 1528 SKB222877.1_1528 A G . PASS AC=4;AF=1.00;AN=4;DP=90;ExcessHet=3.0103;FS=0.000;MLEAC=4;MLEAF=1.00;MQ=41.98;QD=28.85;SOR=0.738;DP=100 GT:AD:DP:GQ:PL 1/1:53,0:53:99:0,159,255 1/1:47,0:47:99:0,141,255
KB222877.1 2973 SKB222877.1_2973 A C . PASS AC=4;AF=1.00;AN=4;DP=79;ExcessHet=3.0103;FS=0.000;MLEAC=4;MLEAF=1.00;MQ=42.00;QD=26.67;SOR=0.941;DP=74 GT:AD:DP:GQ:PL 1/1:15,0:15:99:0,45,255 1/1:19,0:19:99:0,57,255
The input is VCFv4.2 and Tassel outputs VCFv4.0. I don't know whether this is part of the problem.
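In case it helps anyone reproduce this, here is a rough Python sketch of how the sample columns of the two files could be compared site by site. The function names (sample_columns, changed_sites) are made up for illustration, and it assumes whitespace-separated fields and at most one record per CHROM/POS pair. It is only a quick check, not a proper VCF parser.

```python
def sample_columns(vcf_lines):
    """Map (CHROM, POS) to the list of per-sample strings on each data line.

    Assumption: fields are separated by whitespace and the first nine
    columns are the fixed VCF columns (CHROM .. FORMAT).
    """
    records = {}
    for line in vcf_lines:
        # Skip blank lines and header lines (## meta lines and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        records[(fields[0], int(fields[1]))] = fields[9:]
    return records


def changed_sites(before_lines, after_lines):
    """Return the sorted sites whose sample columns differ between the inputs."""
    before = sample_columns(before_lines)
    after = sample_columns(after_lines)
    shared = before.keys() & after.keys()
    return sorted(site for site in shared if before[site] != after[site])


# First record from the example above, before and after sorting.
before = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TC002 TC002-2",
    "KB222877.1 1258 . G A 3254.20 . AC=4;AF=1.00;AN=4;DP=85 "
    "GT:AD:DP:GQ:PL 1/1:0,50:50:99:1912,149,0 1/1:0,35:35:99:1369,105,0",
]
after = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TC002 TC002-2",
    "KB222877.1 1258 SKB222877.1_1258 G A . PASS AC=4;AF=1.00;AN=4;DP=85;DP=114 "
    "GT:AD:DP:GQ:PL 1/1:53,0:53:99:0,159,255 1/1:45,0:45:99:0,135,255",
]

for chrom, pos in changed_sites(before, after):
    print(chrom, pos)
```

With real files, the two lists could be replaced by open file handles, since the functions only iterate over lines.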
Thanks for any help,
Alex