plink2 export tped 12 not working as expected


Esther Mezey

Mar 7, 2023, 7:25:12 PM
to plink2-users
Hi,

When I take a VCF file, convert it to plink, and then export it as a tped, the modifier '12' does not work as expected. I thought this modifier causes ALT1 alleles to be coded as '1' and REF alleles as '2'. However, it appears to code REF alleles as '1' and ALT1 alleles as '2'.

Sample commands and data follow:

Commands
plink2 --vcf ${vcf} --make-pgen --out ${pfile_prefix}
plink2 --pfile ${pfile_prefix} --export tped 12 --output-missing-genotype 0 --out ${tped_prefix}


VCF data
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  S1      S2      S3      S4
1       1691    .       C       T       .       PASS    AC=1;AN=8       GT      0|0     0|0     0|0     1|0
1       1800    .       G       A       .       PASS    AC=7;AN=8       GT      1|1     1|1     1|1     1|0


TPED data (note that a VCF entry of 0 is 1 in the tped)

1       .       0       1691    1       1       1       1       1       1       2       1
1       .       0       1800    2       2       2       2       2       2       2       1


Christopher Chang

Mar 7, 2023, 8:01:34 PM
to plink2-users
Oops, yes, this was flipped in plink2 (to open up the possibility of encoding ALT2 as '3', ALT3 as '4', etc.), but I forgot to update the documentation.  The website has been updated, and the command-line help will be updated in the next build.
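
For readers following along, the coding described above can be sketched in a few lines of Python. This is only an illustration of the mapping as stated in this thread (REF as '1', ALT1 as '2', with higher ALT indices following on), not plink2's actual implementation. The function name `gt_to_tped12` and the missing-genotype default of '0' (chosen to match --output-missing-genotype 0 in the commands above) are assumptions made for the example.

```python
def gt_to_tped12(gt, missing="0"):
    """Map one VCF GT string (e.g. '1|0') to a list of '12'-style allele codes.

    Hypothetical helper for illustration; not part of plink2.
    """
    alleles = gt.replace("/", "|").split("|")
    # VCF allele index k (0 = REF, 1 = ALT1, ...) becomes code k + 1;
    # '.' marks a missing call and is written as the `missing` code.
    return [missing if a == "." else str(int(a) + 1) for a in alleles]


# The two VCF records from the original post (samples S1..S4).
row_1691 = ["0|0", "0|0", "0|0", "1|0"]
row_1800 = ["1|1", "1|1", "1|1", "1|0"]

for row in (row_1691, row_1800):
    codes = [code for gt in row for code in gt_to_tped12(gt)]
    print(" ".join(codes))
```

Under these assumptions, the two printed lines match the genotype columns of the TPED output shown in the first message.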

Esther Mezey

Apr 14, 2023, 9:40:21 AM
to plink2-users
Hi Chris,

Thanks for the quick response on this. I have a quick related question: 

I've been exporting genotype data using "plink2  --export tped 12 --output-missing-genotype 0" and running a mixed model with EMMAX, which (I believe) generates its own additive coding for the tped data where 2=homozygous ALT, 0=homozygous REF, and 1=heterozygote.

I'm now running GWAS analysis using other tools, and would like to export genotype data using the same additive coding (2=homozygous ALT, 0=homozygous REF, and 1=heterozygote). Is this possible with plink2? 

When I run "plink2 --export A", I believe the resulting additive coding is of the reverse form: 0=homozygous ALT, 2=homozygous REF, and 1=heterozygote (which is the opposite of what I want).

The reason I ask is that I'm working with very large files (tens of millions of SNPs), and I'm hoping not to have to write an ad hoc program to reverse the additive coding in order to generate something that is consistent with my previous output (generated with "plink2 --export tped 12 --output-missing-genotype 0" and EMMAX's tped data encoding).

Thanks very much for your input,

Esther

Christopher Chang

Apr 14, 2023, 12:57:59 PM
to plink2-users
A simple way to do this is to add --ref-allele (or --alt1-allele) to your "--export tped" command, pointing it at your .bim/.pvar file and making it flip your alleles.
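
If re-exporting is not convenient, the reversal itself is small: for biallelic hard calls, swapping which allele is counted turns a count x into 2 - x. A minimal Python sketch follows, assuming whitespace-delimited additive-coded text with six leading non-genotype columns and 'NA' for missing calls. The function name `flip_additive_line` and the `n_meta` parameter are hypothetical, and the column layout should be checked against the actual file header before use.

```python
# Swap the counted allele for biallelic hard calls: 0 <-> 2, 1 stays 1.
FLIP = {"0": "2", "1": "1", "2": "0"}


def flip_additive_line(line, n_meta=6):
    """Reverse additive coding on one data line.

    Assumption: the first `n_meta` fields are sample metadata and every
    remaining field is a genotype count. Anything that is not 0/1/2
    (for example 'NA') is passed through unchanged.
    """
    fields = line.split()
    meta, genos = fields[:n_meta], fields[n_meta:]
    return "\t".join(meta + [FLIP.get(g, g) for g in genos])


# Made-up example line: six metadata fields, then four genotype columns.
example = "F1\tS1\t0\t0\t1\t-9\t0\t1\t2\tNA"
print(flip_additive_line(example))
```

Because each line is handled independently, the function can be applied while streaming through a large file (skipping the header line) without loading it into memory.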