VCF to structure exportation

Alex Sanchez

Jul 11, 2022, 4:55:15 PM
to Stacks
Hello everyone,

I am currently working on SNP data pre-processing, and I am designing a program that exports VCF files in Structure format. I am using Stacks v2.59 to validate the program, but the translation from bases (A, C, G, T) to numbers (1, 2, 3, 4) is not clear.

I concluded that Stacks translates bases to numbers using the following dictionary:
A:1, T:2, G:3, C:4

However, I found the following situation:
[Attached image: stacks_vcf_stru_diagram.png]
It seems that Stacks uses the same dictionary at the beginning, but after some samples, both T and C are translated as 2.

I would appreciate some help from your team.

Greetings,
Alex Sanchez

Catchen, Julian

Jul 11, 2022, 5:43:48 PM
to stacks...@googlegroups.com

Hi Alex,

 

Structure files are encoded with the following dictionary (from the source code):

nuc_map['A'] = "1";
nuc_map['C'] = "2";
nuc_map['G'] = "3";
nuc_map['T'] = "4";
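
As a minimal sketch of how a map like this could be applied when writing Structure output (this is not the actual Stacks export code; the helpers `encode_allele` and `encode_genotype` are hypothetical names introduced only for illustration, and only the four `nuc_map` entries come from the source quoted above):

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Nucleotide-to-code map, with the same four entries quoted above.
static const std::map<char, std::string> nuc_map = {
    {'A', "1"}, {'C', "2"}, {'G', "3"}, {'T', "4"}
};

// Hypothetical helper (not a Stacks function): encode a single allele.
// Anything other than A, C, G, or T is rejected, because how missing
// data is written is not covered by the snippet above.
std::string encode_allele(char base) {
    auto it = nuc_map.find(base);
    if (it == nuc_map.end())
        throw std::invalid_argument(std::string("unexpected base: ") + base);
    return it->second;
}

// Hypothetical helper: encode a diploid genotype given as two bases,
// keeping the alleles in the order they were passed in.
std::pair<std::string, std::string> encode_genotype(char a1, char a2) {
    return {encode_allele(a1), encode_allele(a2)};
}
```

Under this map, `encode_genotype('T', 'C')` yields "4" and "2", and `encode_genotype('C', 'T')` yields "2" and "4", so the order in which the two alleles are passed determines the order of the codes.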

 

julian

 

Alex Sanchez

Jul 12, 2022, 9:18:09 AM
to Stacks
Then why, in this case, are the alleles T/C in the first SNP translated as 2/4? Using the dictionary you mention, they would be 4/2.