Logistic regression with covariates in plink2 vs. plink1


Mats Nagel

Nov 22, 2016, 10:58:09 AM
to plink2-users
Hi everybody,

I ran a GWAS (logistic regression) on two binary phenotypes using the same genotype data, with both dichotomous and continuous covariates in the model. I found that plink2 returned NA for some SNPs for one phenotype but valid test results for the other (and vice versa). As a result, the number of valid test results differs considerably between the two phenotypes.

After trying out different things, I ran the same analyses in plink1. Oddly, plink1 did NOT produce NA where plink2 did! Attached you'll find the results for two phenotypes (A/B) and three SNPs, run in both versions of plink (i.e. excerpts from the .assoc.logistic files).

Can anyone tell me what's going on here?
I know the implementation of logistic regression (with covariates) probably changed in plink2. However, I would like to know whether, and how, this may have caused the issue.

Many thanks in advance,


Mats


P.S. Due to privacy concerns I cannot attach the raw data (and genomic positions in the attached results files are fictitious). However, if it helps I'm happy to run additional analyses to clarify this problem.
attachment_MNagel.pdf

Christopher Chang

Nov 22, 2016, 4:23:44 PM
to plink2-users
Try adding --ci 0.95 to your plink 1.07 run. My guess is that, for most of the cases where plink 1.9 reports "NA" and plink1 does not, the standard errors/confidence intervals will be huge, because at least one of the coefficients is trying to diverge to infinity.

(I am currently working on adding Firth logistic regression to plink 2.0 to address this issue.)
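
To illustrate what "a coefficient trying to diverge to infinity" looks like, here is a minimal sketch in plain Python. It is not plink's implementation: `fit_logistic` is a hypothetical helper that runs Newton-Raphson on a single 0/1 predictor plus an intercept, and both datasets are made up. In the "separated" data every carrier is a case, so no finite maximum-likelihood estimate exists; the estimate and its standard error just keep growing with the iteration count.

```python
import math

def fit_logistic(x, y, n_iter):
    """Newton-Raphson for logit P(y=1) = b0 + b1*x.

    Returns (b0, b1, se_b1) after n_iter iterations. Illustrative only.
    """
    def score_and_information(b0, b1):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * xi     # gradient w.r.t. slope
            h00 += w                # Fisher information entries
            h01 += w * xi
            h11 += w * xi * xi
        return g0, g1, h00, h01, h11

    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0, g1, h00, h01, h11 = score_and_information(b0, b1)
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det

    # Standard error of b1 from the inverse information at the final estimate.
    _, _, h00, h01, h11 = score_and_information(b0, b1)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, se_b1

# Made-up data: 70 non-carriers (40 controls, 30 cases) and 5 carriers.
x_sep = [0] * 70 + [1] * 5
y_sep = [0] * 40 + [1] * 30 + [1] * 5          # all carriers are cases
x_ok = [0] * 70 + [1] * 5
y_ok = [0] * 40 + [1] * 30 + [0] * 2 + [1] * 3  # carriers in both groups

for n_iter in (8, 25):
    print("separated", n_iter, fit_logistic(x_sep, y_sep, n_iter))
    print("ordinary ", n_iter, fit_logistic(x_ok, y_ok, n_iter))
```

In the ordinary dataset the slope settles at the log odds ratio and extra iterations change nothing; in the separated dataset the reported numbers depend entirely on when the fitting loop stops, which is why one program may print a large estimate with a huge standard error while another reports NA.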

Mats Nagel

Nov 23, 2016, 4:11:19 AM
to plink2-users
Dear Christopher,

Thanks for the quick response!
I have repeated the analysis in plink1 with the --ci 0.95 flag (results attached).
It does seem that what you describe may be the case for SNP 3. However, I would then expect plink2 to give NA for this SNP for both phenotypes, instead of just for phenotype B.
In addition, SNP 2 also has NA for phenotype B when using plink2. For this SNP I find no strange standard errors when running plink1 with --ci 0.95.

Again, many thanks for helping me out.

Kind regards,


Mats
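
On the expectation that SNP 3 should give NA for both phenotypes: separation is a property of the genotype and phenotype together, so the same SNP can be affected for one phenotype and not the other. The sketch below shows one simple check under made-up data; `empty_cells` is a hypothetical helper, not a plink feature, and it only looks at the genotype-by-phenotype table. With covariates in the model, separation can also arise from combinations of predictors that such a table will not reveal, so an empty result here does not rule it out.

```python
from collections import Counter

def empty_cells(genotypes, phenotype):
    """List (genotype, status) pairs with zero samples, over observed genotypes.

    status is 0 for control and 1 for case. Illustrative only.
    """
    counts = Counter(zip(genotypes, phenotype))
    return [
        (g, s)
        for g in sorted(set(genotypes))
        for s in (0, 1)
        if counts[(g, s)] == 0
    ]

# Made-up example: same genotypes, two different binary phenotypes.
genotypes = [0] * 8 + [1] * 4
pheno_a = [0, 0, 0, 0, 1, 1, 1, 1] + [0, 1, 1, 1]  # carriers in both groups
pheno_b = [0, 0, 0, 0, 1, 1, 1, 1] + [1, 1, 1, 1]  # every carrier is a case

print("phenotype A:", empty_cells(genotypes, pheno_a))
print("phenotype B:", empty_cells(genotypes, pheno_b))
```

An empty list means every observed genotype class contains both cases and controls; a non-empty list flags a class where one outcome never occurs, which is the situation in which a genotype coefficient can diverge.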


On Tuesday, 22 November 2016 at 22:23:44 UTC+1, Christopher Chang wrote:
attachment2_MNagel.pdf