Maj/Min vs Ref/Alt - Can I make sure my VCF comes out of production with Ref/Alt instead of Maj/Min coded?

Zachary Winn

Feb 16, 2024, 10:15:36 AM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
Hi TASSEL Team!

I know that this question has been asked before (https://groups.google.com/g/tassel/c/olBzvxcEIZk/m/KVCMIDfeAgAJ?pli=1), but I am currently working with multiple VCFs, and I need a TASSEL-generated GBS VCF that is coded as Ref/Alt rather than Maj/Min. I read through that conversation but did not find an answer to my question.

Essentially, the header of my VCF generated by TASSEL says the following:

##Tassel=<ID=GenotypeTable,Version=5,Description="Reference allele is not known. The major allele was used as reference allele">

But I need the VCF to be coded as Ref/Alt instead of Maj/Min. Is there any way for TASSEL to produce a VCF whose alleles are assigned from a reference genome rather than by major/minor allele?

Thanks,
Zachary Winn, PhD
Postdoctoral Research Scholar 
USDA-ARS Eastern Regional Small Grains Genotyping Lab

Zachary Winn

Feb 21, 2024, 12:51:40 PM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
Hi again,

I did a little digging in the ProductionSNPCallerPluginV2 documentation for flags available in the standalone pipeline. It appears there is no flag to provide a reference genome when calling SNPs. Does this mean that the VCF will always be written with Major/Minor alleles instead of Reference/Alternate when it is produced from a .db file created by TASSEL?

Thanks,
Zach

Lynn Carol Johnson

Feb 21, 2024, 1:30:30 PM
to tas...@googlegroups.com

Yes, that is correct. You will get Major/Minor rather than Ref/Alt when referencing the .db created by the GBS pipeline.

We are not currently doing updates to the GBSv2 pipeline.  However, the code is open source and we welcome others to supply updates if they choose.

 


Zachary Winn

Feb 21, 2024, 1:49:03 PM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
Good to know. Thank you, Lynn. I will look for alternative solutions to my issue. You may now close the issue!

Have a good day,
Zach

Zachary Winn

Mar 11, 2024, 12:12:31 PM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
Just as a minor update to this issue for people reading this in the future...

I wrote a little program in Bash to handle this issue. Essentially, it looks up each SNP position listed in a VCF in a reference genome sequence (in this case I was using IWGSC RefSeq v2.1) and checks the VCF's "reference allele" against the base found at that position. If the two do not match, the program attempts to correct the VCF's REF/ALT so that they agree with the reference genome sequence. You can pull the program from my GitHub here.
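For anyone who wants to see the idea in code, here is a minimal Python sketch of the check described above. It is not the Bash program mentioned in this post; the function names (`flip_gt`, `fix_record`) and the `reference` dictionary (chromosome name mapped to its sequence string) are made up for illustration, and loading the FASTA is left out. It assumes biallelic, diploid SNP calls and only touches the `GT`, `AD`, and `PL` sample fields; allele-specific INFO values (for example `AF`) would need similar handling and are left unchanged here. Sites where neither allele matches the genome are flagged rather than guessed at.

```python
def flip_gt(gt):
    """Swap allele indices 0 and 1 in a biallelic GT string such as '0/1'."""
    swap = {"0": "1", "1": "0"}
    return "".join(swap.get(ch, ch) for ch in gt)


def fix_record(line, reference):
    """Check one tab-separated VCF data line against the reference genome.

    `reference` is a hypothetical dict: chromosome name -> sequence string.
    Returns (line, status), where status is one of
    'ok', 'swapped', 'mismatch', or 'skipped'.
    """
    cols = line.rstrip("\n").split("\t")
    chrom, pos = cols[0], int(cols[1])
    ref, alt = cols[3].upper(), cols[4].upper()

    # Only biallelic single-base SNPs are handled in this sketch.
    if len(ref) != 1 or len(alt) != 1 or alt == ".":
        return "\t".join(cols), "skipped"

    genome_base = reference[chrom][pos - 1].upper()  # VCF POS is 1-based
    if genome_base == ref:
        return "\t".join(cols), "ok"
    if genome_base != alt:
        # Neither allele matches the genome; leave the record for manual review.
        return "\t".join(cols), "mismatch"

    # The genome base is the current ALT, so swap REF and ALT.
    cols[3], cols[4] = alt, ref
    if len(cols) > 9:
        keys = cols[8].split(":")
        for i in range(9, len(cols)):
            vals = cols[i].split(":")
            for j, key in enumerate(keys[:len(vals)]):
                if key == "GT":
                    vals[j] = flip_gt(vals[j])
                elif key in ("AD", "PL"):
                    # Biallelic diploid: reversing the list reorders the values.
                    vals[j] = ",".join(reversed(vals[j].split(",")))
            cols[i] = ":".join(vals)
    return "\t".join(cols), "swapped"


# Toy example: position 2 of the made-up chromosome "1A" is C in the genome,
# but the record lists G as REF and C as ALT.
reference = {"1A": "ACGT"}
line = "1A\t2\t.\tG\tC\t.\tPASS\t.\tGT:AD:PL\t0/1:5,3:10,0,20\t1/1:0,4:40,12,0"
new_line, status = fix_record(line, reference)
print(status)
print(new_line)
```

After a swap like this, the `##Tassel=` header line stating that the major allele was used as the reference would no longer describe the file, so it is probably worth editing or removing it as well.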

Thanks,
Zach

Lynn Carol Johnson

Mar 11, 2024, 12:46:07 PM
to tas...@googlegroups.com

Zachary – thank you for sharing this!

 
