".glm.firth.hybrid" output file in plink2 does not give correct CASE_ALLELE_CT values


metzge...@gmx.de

Mar 17, 2022, 5:07:32 AM
to plink2-users

Hi, I have been running a GWAS with PLINK2 and got an output file with the extension .glm.logistic.hybrid. Everything worked fine until I realized that the output values do not match my phenotype values.

For example, each sample has a binary phenotype (0 or 1): I have 347 samples with 0 (controls) and 194 with 1 (cases).

The output from PLINK2 gives me within the column CTRL_ALLELE_CT (specified by plink2 index as: Control allele observation count): 598

And in the column: CASE_ALLELE_CT (specified by plink2 index as: Case allele observation count): 304

Obviously these values in the ".glm.firth.hybrid" file do not match the case and control counts in my phenotype file. How can this be? What exactly is PLINK2 measuring here?

Christopher Chang

Mar 17, 2022, 11:33:28 AM
to plink2-users
Three possibilities:
1. Missing covariate values.  Any sample with a missing covariate value is excluded from the analysis.
2. Single-sample categories.  If there's a covariate that's zero-valued for all but 1 sample, there is no point including that 1 sample or the covariate in the regression, and plink2 automatically excludes the sample-covariate pair.
3. Missing genotypes.
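
The first two cases can be checked directly in the covariate table before running the regression. The following is only a rough sketch, not plink2's own logic: the column names (IID, AGE, BATCH_C), the example rows, and the set of tokens treated as missing are all assumptions made for illustration.

```python
# Tokens treated as "missing" here are an assumption; adjust to your file.
MISSING_TOKENS = {"", "NA", "nan", "NaN"}


def samples_with_missing_covariates(rows, covariate_names):
    """Return IDs of samples that have at least one missing covariate value."""
    return [
        row["IID"]
        for row in rows
        if any(row.get(name, "") in MISSING_TOKENS for name in covariate_names)
    ]


def single_sample_covariates(rows, covariate_names):
    """Map each covariate that is nonzero for exactly one sample to that sample's ID."""
    result = {}
    for name in covariate_names:
        nonzero_ids = [
            row["IID"]
            for row in rows
            if row.get(name, "") not in MISSING_TOKENS and float(row[name]) != 0.0
        ]
        if len(nonzero_ids) == 1:
            result[name] = nonzero_ids[0]
    return result


# Hypothetical covariate rows, already parsed from a covariate file.
rows = [
    {"IID": "s1", "AGE": "34", "BATCH_C": "0"},
    {"IID": "s2", "AGE": "NA", "BATCH_C": "0"},
    {"IID": "s3", "AGE": "51", "BATCH_C": "1"},
    {"IID": "s4", "AGE": "47", "BATCH_C": "0"},
]

print(samples_with_missing_covariates(rows, ["AGE", "BATCH_C"]))
print(single_sample_covariates(rows, ["AGE", "BATCH_C"]))
```

In this toy table, s2 has a missing AGE value (case 1), and BATCH_C is nonzero only for s3 (case 2).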

You can use --glm's "pheno-ids" modifier to see what samples remain after the first two considerations.
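
As a quick sanity check on the numbers in the question: the two columns count allele observations, not samples. Assuming a diploid autosomal variant, where each sample with a non-missing genotype contributes two allele observations, the reported counts can be converted back to sample counts. This is a sketch of the arithmetic only; the function name is made up for illustration.

```python
def samples_from_allele_count(allele_ct, ploidy=2):
    """Convert an allele observation count to a sample count.

    Assumes every contributing sample has the same ploidy (2 for a
    diploid autosomal variant), which is an assumption of this sketch.
    """
    if allele_ct % ploidy != 0:
        raise ValueError("allele count is not a multiple of the ploidy")
    return allele_ct // ploidy


# Values reported in the question.
ctrl_used = samples_from_allele_count(598)  # controls contributing to this variant
case_used = samples_from_allele_count(304)  # cases contributing to this variant

# Phenotype file totals from the question: 347 controls, 194 cases.
ctrl_dropped = 347 - ctrl_used
case_dropped = 194 - case_used

print(ctrl_used, case_used, ctrl_dropped, case_dropped)
```

Under that assumption, 299 controls and 152 cases contributed to this variant, so 48 controls and 42 cases were dropped for one of the three reasons above.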