Re: [structure-group] Best way to enter SNP data in HapMap format

Vikram Chhatre

Jan 17, 2013, 12:39:52 AM
to structure-software
Hi Ivan,

Structure works with allele frequencies, so it is advisable to convert SNP data to a numerical format as follows:

A=1, T=2, G=3, C=4

The letter-to-number mapping is arbitrary; any consistent coding works. The genotypes would then look like this:

Locus#1:  AT TT AA AT AT AT
Structure: 1 2 2 2 1 1 1 2 1 2 1 2

Locus#2: CG CG CG GG CC GG
Structure: 4 3 4 3 4 3 3 3 4 4 3 3 
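
The two examples above can be reproduced with a few lines of Python. This is only a minimal sketch: the names `ALLELE_CODES`, `MISSING` and `encode_locus` are made up for illustration, and the `-9` placeholder for unrecognized alleles is an assumption, not something stated in this thread.

```python
# Letter-to-number mapping from the post above; the choice is arbitrary.
ALLELE_CODES = {"A": 1, "T": 2, "G": 3, "C": 4}
MISSING = -9  # assumption: placeholder for anything that is not A/T/G/C


def encode_locus(genotypes):
    """Turn two-letter genotypes into a flat list of allele numbers."""
    encoded = []
    for genotype in genotypes:
        for allele in genotype.upper():
            encoded.append(ALLELE_CODES.get(allele, MISSING))
    return encoded


print(*encode_locus("AT TT AA AT AT AT".split()))  # 1 2 2 2 1 1 1 2 1 2 1 2
print(*encode_locus("CG CG CG GG CC GG".split()))  # 4 3 4 3 4 3 3 3 4 4 3 3
```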

If you have HapMap-formatted data, you can first convert it to two-letter genotypes (e.g. R becomes AG) and then convert those to the numerical format.
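
The first of those two steps might be sketched like this in Python, using the single-letter codes listed in the question quoted below. The names `IUPAC_TO_GENOTYPE` and `expand_hapmap` are hypothetical, and mapping any other symbol to `NN` is an assumption about how missing calls should be handled.

```python
# Single-letter HapMap codes and the two-letter genotypes they stand for,
# taken from the table in the quoted question.
IUPAC_TO_GENOTYPE = {
    "A": "AA", "C": "CC", "G": "GG", "T": "TT",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
}
MISSING_GENOTYPE = "NN"  # assumption: stand-in for any code not in the table


def expand_hapmap(codes):
    """Convert single-letter HapMap codes to two-letter genotypes."""
    return [IUPAC_TO_GENOTYPE.get(code.upper(), MISSING_GENOTYPE) for code in codes]


print(expand_hapmap(["W", "T", "A", "S", "N"]))
# ['AT', 'TT', 'AA', 'CG', 'NN']
```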

HTH
V



On Wed, Jan 16, 2013 at 7:57 PM, (Iván Darío Barrero Farfán) <idba...@gmail.com> wrote:
My SNP data is in hapmap format

I was wondering what would be the best way to enter this data in structure?

AA=1
CC=2
GG=3
TT=4
AG=5 (R in hapmap code)
CT=6 (Y in hapmap code)
CG=7 (S in hapmap code)
AT=8 (W in hapmap code)
GT=9 (K in hapmap code)
AC=10 (M in hapmap code)

This is the hapmap coding:
Genotype  AA  CC  GG  TT  AG  CT  CG  AT  GT  AC
Code      A   C   G   T   R   Y   S   W   K   M

Does the approach above sound reasonable?

Thanks

--
You received this message because you are subscribed to the Google Groups "structure-software" group.
To view this discussion on the web visit https://groups.google.com/d/msg/structure-software/-/oY9qa3hPMfwJ.
To post to this group, send email to structure...@googlegroups.com.
To unsubscribe from this group, send email to structure-softw...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/structure-software?hl=en.


IDBarrero

Jan 17, 2013, 1:12:56 PM
to structure...@googlegroups.com
Vikram

Is there a more efficient way to get the data into Structure than editing it by hand (which in my case is going to take forever)?

I have taxa in rows.

My columns are the SNP markers, but they are coded as AA, GG, CC, TT or heterozygous.

Since Structure requires a format like this:

Taxa1  1 2 1 2
Taxa1  1 2 1 2

Is there a better way to do this?

thanks

Vikram Chhatre

Jan 17, 2013, 1:24:54 PM
to structure...@googlegroups.com
It's very easy. You can do global replacements on the genotype lines. Do not use Excel. I always recommend using the vim text editor.
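
If a script is preferred over in-editor replacements, the same substitution could be sketched in Python. This is a scripted alternative, not the vim approach itself. It assumes each input line is a taxon name followed by whitespace-separated two-letter genotypes, that the output should have two rows per individual as in the example in the question, and that `-9` marks missing data; `to_structure_rows` is a hypothetical helper name.

```python
# Same arbitrary mapping used earlier in this thread.
ALLELE_CODES = {"A": 1, "T": 2, "G": 3, "C": 4}
MISSING = -9  # assumption: code written for anything that is not A/T/G/C


def to_structure_rows(line):
    """Split one 'name GT GT ...' line into two rows, one allele per locus each."""
    name, *genotypes = line.split()
    rows = ([], [])
    for genotype in genotypes:
        # Anything that is not exactly two characters is treated as missing.
        pair = genotype.upper() if len(genotype) == 2 else "??"
        rows[0].append(ALLELE_CODES.get(pair[0], MISSING))
        rows[1].append(ALLELE_CODES.get(pair[1], MISSING))
    return [" ".join([name, *map(str, row)]) for row in rows]


for row in to_structure_rows("Taxa1 AT AT CG"):
    print(row)
# Taxa1 1 1 4
# Taxa1 2 2 3
```

Looping this over every line of the genotype file would produce the whole input table without any hand editing.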

---------------------------------------
Vikram Chhatre
Graduate Program in Genetics
Texas A&M University

This message was sent from a cellular device. It may contain typos and other errors.

