GWAS using PCA as covariates and permutation


Melanie Montes

Feb 23, 2017, 10:51:09 AM
to plink2-users
Hi,
I'm trying to run a GWAS for the first time, so I'm afraid I have some rather simplistic questions about the different outputs. I am working with a population that has very high genotypic diversity, but is still clonal (significant LD), so I thought I should run a PCA and include the first 2 PCs as covariates, to account for any effect of population structure. My phenotypic variable is quantitative. 

This is what my command looks like:

plink --file full.nr --pheno pheno.nr.txt --aec --assoc perm --covar nr.eigenvec --covar-name PC1, PC2 --linear perm --out filtered2 --allow-no-sex  --pfilter 0.01


A couple of things I was wondering about. When I just look at the qassoc files, the permutation file has more significant SNPs than the non-permuted one. Isn't that a bit strange? Shouldn't the permutation lead to stricter significance values?

kxb901@galaxy:~/space/RAD/red_ref_10$ wc -l filtered2.qassoc
421 filtered2.qassoc
kxb901@galaxy:~/space/RAD/red_ref_10$ wc -l filtered2.qassoc.perm
566 filtered2.qassoc.perm


If I understand correctly, the .linear files contain the same association test, but with the covariates included? I.e. these are the ones I should actually look at for the "most correct" results? They also list many more SNPs than the plain association test, which I again find strange... isn't the point of adding these population-structure covariates to eliminate false positives?


Sorry for the naive questions, this is a very new field for me! 

Thanks for any insights you can give,

Melanie 

Christopher Chang

Feb 23, 2017, 12:18:58 PM
to plink2-users
1. When you want to take covariates into account, you shouldn't use --assoc at all.  (This flag will probably be removed from plink 2.0.)
2. "--linear perm" does *not* make a multiple-testing correction which would reduce the number of significant results; it merely removes a possible source of bias in the uncorrected p-values.  It sounds like you want something like "--linear mperm=10000" instead.
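Putting both points together, the original command might be reworked roughly as follows. This is only a sketch: the file names are copied from the command earlier in the thread, --assoc and --pfilter are dropped, and the hide-covar modifier is an addition not mentioned above (it keeps the covariate rows out of the main report, so each variant contributes only its own test line). The script prints the command and only runs it if plink and the input files are actually present.

```shell
#!/bin/sh
# Sketch: covariate-adjusted linear regression with max(T) permutation.
# All file names below are the ones used earlier in this thread.
set -- \
  --file full.nr \
  --pheno pheno.nr.txt \
  --aec \
  --allow-no-sex \
  --covar nr.eigenvec \
  --covar-name PC1,PC2 \
  --linear mperm=10000 hide-covar \
  --out filtered2

# Show the full command line that would be executed.
echo "plink $*"

# Run only when plink is on PATH and the .ped input exists.
if command -v plink >/dev/null 2>&1 && [ -f full.nr.ped ]; then
  plink "$@"
fi
```

With PLINK 1.9 the permutation results should end up in filtered2.assoc.linear.mperm, where EMP1 is the pointwise empirical p-value and EMP2 is the one corrected for all tests via max(T).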

Melanie Montes

Feb 23, 2017, 12:32:29 PM
to plink2-users
Ah OK, makes a lot more sense :) 
Can you use the --adjust flag together with --linear if you wanted an adjustment for multiple-testing? 
Thanks! 

Christopher Chang

Feb 23, 2017, 12:44:02 PM
to plink2-users
The --adjust correction is overly conservative since it does not account for linkage disequilibrium.  It is much quicker to compute, though.
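For the quick option, a sketch of combining --adjust with --linear could look like this. It assumes the same input files as the original command; the output prefix filtered2_adjust is a made-up name for illustration, and hide-covar is again an optional addition. As before, the script prints the command and runs it only if plink and the input files are present.

```shell
#!/bin/sh
# Sketch: linear regression with PCs as covariates plus --adjust.
# Input names come from this thread; the --out prefix is hypothetical.
set -- \
  --file full.nr \
  --pheno pheno.nr.txt \
  --aec \
  --allow-no-sex \
  --covar nr.eigenvec \
  --covar-name PC1,PC2 \
  --linear hide-covar \
  --adjust \
  --out filtered2_adjust

# Show the full command line that would be executed.
echo "plink $*"

# Run only when plink is on PATH and the .ped input exists.
if command -v plink >/dev/null 2>&1 && [ -f full.nr.ped ]; then
  plink "$@"
fi
```

In PLINK 1.9 this should write an additional .assoc.linear.adjusted file with the unadjusted p-value next to Bonferroni, Holm, Sidak and FDR-style corrections, none of which account for LD between markers.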