Alleles in ped file

Daisuke Kasuga

Jan 26, 2015, 2:45:11 AM
to plink2...@googlegroups.com
Hi,

I have a question about alleles in a ped file.

I want to split a large ped file by chromosome in order to run imputation on each chromosome separately.
So I used the following shell command:

plink \
    --noweb \
    --file MyGwasData \
    --chr ${chr} \
    --recode \
    --out MyGwasData_chr-${chr}

* "MyGwasData" is the prefix of the ped and map files.
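
To run the command above for every chromosome, the argument list can be generated programmatically. This is only a minimal sketch: the helper names `plink_split_command` and `all_split_commands` are hypothetical, the range 1-22 (autosomes only) is an assumption, and the flags are just the ones already used in the command above.

```python
from typing import Iterable, List


def plink_split_command(prefix: str, chrom: int) -> List[str]:
    """Build the argument list for extracting one chromosome.

    Uses only the flags from the command in this post.
    """
    return [
        "plink",
        "--noweb",
        "--file", prefix,
        "--chr", str(chrom),
        "--recode",
        "--out", f"{prefix}_chr-{chrom}",
    ]


def all_split_commands(prefix: str,
                       chroms: Iterable[int] = range(1, 23)) -> List[List[str]]:
    """Build one command per chromosome (assumed here: autosomes 1-22)."""
    return [plink_split_command(prefix, c) for c in chroms]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in all_split_commands("MyGwasData"):
        print(" ".join(cmd))
```

Each argument list could then be passed to `subprocess.run(cmd, check=True)` if PLINK is installed and on the PATH.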

After that, I compared MyGwasData.ped with MyGwasData_chr-${chr}.ped,
and I noticed that allele 1 and allele 2 had been swapped in some SNPs.

For example, here is a ped file with 4 individuals and 4 SNPs.

MyGwasData.ped
0 ID1 0 0 0 -9 C C C C A A T T
0 ID2 0 0 0 -9 C T C C A A T T
0 ID3 0 0 0 -9 C C C T A A T A
0 ID4 0 0 0 -9 C C C C A G T T

* Because I used the annotation of an Affymetrix SNP array, alleles are always listed in alphabetical order.
("A C", "A G", "A T", "C G", "C T" and "G T" do exist, but "C A", "G A", "T A", "G C", "T C" and "T G" do not exist.)

MyGwasData_chr-${chr}.ped
0 ID1 0 0 0 -9 C C C C A A T T
0 ID2 0 0 0 -9 C T C C A A T T
0 ID3 0 0 0 -9 C C T C A A T A
0 ID4 0 0 0 -9 C C C C G A T T

In this example,
the SNP2 of ID3 has been changed from "C T" to "T C" and
the SNP3 of ID4 has been changed from "A G" to "G A".

On the other hand,
the SNP2 of ID2 remained "C T" and
the SNP4 of ID3 remained "T A".

Do you know how this conversion happened?

At first, I thought heterozygous genotypes in reverse alphabetical order were automatically changed to alphabetical order,
but it seems that some heterozygous genotypes were not changed.

I also thought that this conversion might be based on strand ("+" and "-") information,
but a ped and map file set does not contain any strand information.

Does this conversion affect downstream analysis?
(In other words, do "A B" and "B A" for an individual mean the same thing in a ped file?)

Thanks,

Christopher Chang

Jan 26, 2015, 2:58:56 AM
to plink2...@googlegroups.com
This does not affect downstream calculations; allele order in a .ped heterozygous genotype does not matter, and PLINK does not keep track of it (that's why they were swapped sometimes).
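
One way to check that only the allele order changed is to compare each genotype as an unordered pair. The following is a rough sketch, not part of PLINK: the function names `parse_ped_line` and `compare_ped` are hypothetical, and it assumes both inputs list the same individuals and the same SNPs in the same order (as in the four-SNP example in this thread; for a real per-chromosome file the SNP columns would first have to be matched through the .map files).

```python
from typing import Iterable, List, Tuple

Genotype = Tuple[str, str]
Diff = Tuple[str, int, str, str]  # (individual ID, SNP number, before, after)


def parse_ped_line(line: str) -> Tuple[str, List[Genotype]]:
    """Split one .ped line into (individual ID, list of allele pairs)."""
    fields = line.split()
    # The first six columns are family ID, individual ID, paternal ID,
    # maternal ID, sex and phenotype; the remaining columns are alleles.
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("odd number of allele columns")
    pairs = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return fields[1], pairs


def compare_ped(before: Iterable[str],
                after: Iterable[str]) -> Tuple[List[Diff], List[Diff]]:
    """Return (swapped, changed) genotype differences.

    'swapped' holds genotypes with the same two alleles in a different
    order; 'changed' holds genotypes whose alleles really differ.
    """
    swapped: List[Diff] = []
    changed: List[Diff] = []
    for line_b, line_a in zip(before, after):
        id_b, pairs_b = parse_ped_line(line_b)
        id_a, pairs_a = parse_ped_line(line_a)
        if id_b != id_a or len(pairs_b) != len(pairs_a):
            raise ValueError("files are not aligned")
        for snp, (gb, ga) in enumerate(zip(pairs_b, pairs_a), start=1):
            if gb == ga:
                continue
            record = (id_b, snp, " ".join(gb), " ".join(ga))
            # Same alleles regardless of order means only the order changed.
            (swapped if sorted(gb) == sorted(ga) else changed).append(record)
    return swapped, changed


# Example data copied from the first post in this thread.
BEFORE = [
    "0 ID1 0 0 0 -9 C C C C A A T T",
    "0 ID2 0 0 0 -9 C T C C A A T T",
    "0 ID3 0 0 0 -9 C C C T A A T A",
    "0 ID4 0 0 0 -9 C C C C A G T T",
]
AFTER = [
    "0 ID1 0 0 0 -9 C C C C A A T T",
    "0 ID2 0 0 0 -9 C T C C A A T T",
    "0 ID3 0 0 0 -9 C C T C A A T A",
    "0 ID4 0 0 0 -9 C C C C G A T T",
]

if __name__ == "__main__":
    swapped, changed = compare_ped(BEFORE, AFTER)
    print("order swapped:", swapped)
    print("alleles changed:", changed)
```

On the example data this reports two order swaps (SNP2 of ID3 and SNP3 of ID4) and no genotype whose alleles actually changed.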

Daisuke Kasuga

Jan 26, 2015, 3:31:54 AM
to plink2...@googlegroups.com
Dear Christopher

Thank you for the quick reply.
OK, so I do not need to worry about the order of the alleles within a SNP genotype.

And I'm sorry for also sending this question to OXST...@JISCMAIL.AC.UK.

Best regards,
Daisuke



