--glm genotypic/hethom on chromosome X in females

freeseek

Sep 15, 2021, 12:57:17 PM

to plink2-users

When I run:

plink2 --glm genotypic

on chromosome X data in a dataset where only females have phenotypes, I get the error:

Error: No variants remaining for --glm ('dominant', 'recessive', 'genotypic',
and 'hethom' only operate on diploid data).

I understand that --glm genotypic should not be used for males on chromosome X, as there are no heterozygous genotypes for males. However, my dataset only has phenotypes for females, so this error check does not apply to my case. Is there a way to turn this behavior off?

On a slightly related note, I am trying to test whether the heterozygous genotype is a risk factor relative to the homozygous genotypes, regardless of which homozygous genotype it is compared against.

The genotypic modifier tests for ADD/DOMDEV at once and the hethom modifier tests for HET/HOM at once (and I assume DOMDEV and HET are the same term, but with different names). Is there a way to only test for DOMDEV/HET?

Giulio

Christopher Chang

Sep 15, 2021, 1:06:08 PM

to plink2-users

1. I'll look into special-casing all-female chrX analysis this week. In the meantime, a temporary hack is to convert to numeric chromosome codes (--output-chr 26), then run --glm while pretending you're analyzing a species with more chromosomes (e.g. "--chr-set 30").

2. You can just look at the DOMDEV/HET p-values?

freeseek

Sep 15, 2021, 3:16:43 PM

to plink2-users

Thank you Chris! Very much appreciated.

For the DOMDEV/HET p-values, would I get the same results from the 2-degree-of-freedom model as from a simple regression with just the DOMDEV term?

As an example, here are three different results (using version 5 Sep 2021):

--glm allow-no-covars no-firth

#CHROM  POS       ID  REF  ALT  A1  FIRTH?  TEST  OBS_CT  OR       LOG(OR)_SE  Z_STAT   P          ERRCODE
22      70444587  .   G    T    T   Y       ADD   22      27.4152  1.61471     2.05058  0.0403074  .

--glm genotypic allow-no-covars no-firth

#CHROM  POS       ID  REF  ALT  A1  FIRTH?  TEST      OBS_CT  OR       LOG(OR)_SE  Z_OR_F_STAT  P         ERRCODE
22      70444587  .   G    T    T   Y       ADD       22      2.4721   1.00611     0.899575     0.368346  .
22      70444587  .   G    T    T   Y       DOMDEV    22      11.3713  1.81354     1.34053      0.180075  .
22      70444587  .   G    T    T   Y       GENO_2DF  22      NA       NA          2.24169      0.129999  .

--glm hethom allow-no-covars no-firth

#CHROM  POS       ID  REF  ALT  A1  FIRTH?  TEST      OBS_CT  OR       LOG(OR)_SE  Z_OR_F_STAT  P          ERRCODE
22      70444587  .   G    T    T   Y       HOM       22      6.11388  2.01247     0.899673     0.368294   .
22      70444587  .   G    T    T   Y       HET       22      28.1112  1.65094     2.02076      0.0433044  .
22      70444587  .   G    T    T   Y       GENO_2DF  22      NA       NA          2.24172      0.129996   .

For example, the OR and p-value for ADD are not the same in the first two models, and DOMDEV and HET in the last two models do not have the same p-values (though GENO_2DF has essentially the same p-value in both). Should I expect them to be the same?

Christopher Chang

Sep 15, 2021, 3:31:25 PM

to plink2-users

No, you shouldn't expect these p-values to be the same, for the same reason that you don't expect additive-effect p-values to be the same when you add/remove/change covariates.
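
One way to see this with the numbers posted above: assuming the codings ADD = 0..1..2, DOMDEV = HET = 0..1..0 and HOM = 0..0..1, we have ADD = HET + 2*HOM, so the genotypic and hethom fits are two parametrizations of the same 2-df model. The DOMDEV coefficient measures the heterozygote's deviation from additivity, while the HET coefficient measures heterozygote versus reference homozygote, so they answer different questions and get different p-values. The sketch below only re-does that arithmetic on the reported odds ratios; the variable names are illustrative and not part of plink2.

```python
import math

# Odds ratios copied from the --glm genotypic and --glm hethom outputs above.
or_add, or_domdev = 2.4721, 11.3713   # genotypic model
or_hom, or_het = 6.11388, 28.1112     # hethom model

# Assumed codings (g = number of copies of the tested allele):
#   ADD = 0,1,2   DOMDEV = HET = 0,1,0   HOM = 0,0,1
# Since ADD = HET + 2*HOM, the genotypic linear predictor
#   a*ADD + d*DOMDEV
# can be rewritten as
#   (a + d)*HET + (2*a)*HOM
a = math.log(or_add)      # log odds ratio for ADD
d = math.log(or_domdev)   # log odds ratio for DOMDEV

implied_or_het = math.exp(a + d)
implied_or_hom = math.exp(2 * a)

print(f"HET OR: reported {or_het:.4f}, implied by genotypic fit {implied_or_het:.4f}")
print(f"HOM OR: reported {or_hom:.4f}, implied by genotypic fit {implied_or_hom:.4f}")
```

The implied and reported values agree up to small numerical differences between the two fits, which is also why GENO_2DF is nearly identical in both outputs.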

freeseek

Sep 15, 2021, 3:37:28 PM

to plink2-users

So, going back to the original question related to the DOMDEV/HET term, is it not possible with plink2 to test only for the 0..1..0 variable (DOMDEV/HET) without testing at the same time either 0..1..2 (ADD) or 0..0..1 (HOM)? Something like a --glm "hetonly" modifier (in addition to the dominant and recessive modifiers)?
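
To make the codings in this question concrete, here is a minimal sketch of the genotype-derived predictors each model would use, assuming the 0..1..2 / 0..1..0 / 0..0..1 codings named above. The dictionary and function names are hypothetical, and "hetonly" here is just the single-term model being asked for, not a description of how plink2 implements anything.

```python
# g = number of copies of the tested allele carried by a sample (0, 1 or 2).
TERMS = {
    "ADD":    lambda g: g,             # 0..1..2
    "DOMDEV": lambda g: int(g == 1),   # 0..1..0
    "HET":    lambda g: int(g == 1),   # 0..1..0 (same column as DOMDEV)
    "HOM":    lambda g: int(g == 2),   # 0..0..1
}

# Which genotype terms enter each model (intercept and covariates omitted).
MODELS = {
    "additive":  ["ADD"],
    "genotypic": ["ADD", "DOMDEV"],
    "hethom":    ["HOM", "HET"],
    "hetonly":   ["HET"],              # heterozygous vs. either homozygous genotype
}

def design_row(g, model):
    """Return the genotype-derived predictor values for one sample."""
    return [TERMS[name](g) for name in MODELS[model]]

for model in MODELS:
    print(model, [design_row(g, model) for g in (0, 1, 2)])
```

With only the 0..1..0 column in the model, both homozygous genotypes share the baseline, which is the comparison described in the original question.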

Christopher Chang

Sep 15, 2021, 3:44:55 PM

to plink2-users

Hmm, "--logistic genotypic --parameters 2" works in plink 1.9, but it's currently blocked by plink 2.0. I'll try to fix this.

Christopher Chang

Sep 15, 2021, 3:50:37 PM

to plink2-users

(I guess it is blocked because the ordinary genotypic/hethom joint test has to be disabled in this special case. plink 1.x hardcodes this special case, but a 'hetonly' modifier is cleaner so I will go that route instead.)

freeseek

Sep 15, 2021, 11:49:49 PM

to plink2-users

Indeed, I could verify that:

1) plink1.9 --logistic hethom --parameters 2

2) plink1.9 --logistic genotypic --parameters 2

produce the same results (except that the "DOMDEV" label is replaced by "HET"; I find "HET" slightly more intuitive).

Thank you so much for implementing this in plink2.0. I am not sure what I would have done otherwise. To give you context, we are trying to extend analyses from Loh et al. 2020 to see if some common variants increase the chance of CN-LOHs when the genotype is heterozygous. As an example, for TCL1A variant rs2887399 heterozygous genotypes seem to increase the likelihood of 14q CN-LOH events and a "hetonly" test might be best suited to discover this effect.

Christopher Chang

Sep 18, 2021, 4:10:55 PM

to plink2-users

Build with 'hetonly' mode and all-female-chrX special-casing is posted; let me know if you run into any problems with it.

freeseek

Sep 20, 2021, 3:59:42 PM

to plink2-users

I have noticed that, if phenotypes are missing for males, plink2 still yields the error:

Error: No variants remaining for --glm ('dominant', 'recessive', 'genotypic',
and 'hethom' only operate on diploid data).

But if I use the --keep-females flag, I instead get the message:

--glm: Including chrX, despite presence of a diploid-only modifier
('dominant', 'recessive', 'hetonly', 'genotypic', 'hethom'), since all samples
are female.

And the analysis completes correctly. That seems fine to me. You could almost suggest using --keep-females (or --remove-males) to get the analysis running.

However, there is still one need that might go unfulfilled. I may also need to run this analysis for males on the PAR1/PAR2 regions, as this seems relevant for male individuals with evidence of mosaic loss-of-Y. Is it possible to use the hetonly modifier for this case?

Christopher Chang

Sep 20, 2021, 4:01:29 PM

to plink2-users

plink2 supports PAR1/PAR2 chromosome codes, and these are treated as diploid.

freeseek

Sep 20, 2021, 4:05:52 PM

to plink2-users

I just tried the following command:

plink2 --keep-males --glm allow-no-covars hetonly --chr X --from-kb 0 --to-kb 1000 ...

And I get the error:

Error: No variants remaining for --glm ('dominant', 'recessive', 'genotypic',
and 'hethom' only operate on diploid data).

I do not get the error without the hetonly modifier. This is the beginning of the .pvar file:

##contig=<ID=chrX,length=156040895>
##FPLOIDY=2

#CHROM POS ID REF ALT

X    13189    .    G    A
X    13205    .    T    C
X    13221    .    T    C
X    13223    .    G    C
X    13230    .    G    T
X    13236    .    G    C
X    13242    .    C    T

...

This shows that there should be PAR1/PAR2 variants to test.

Christopher Chang

Sep 20, 2021, 4:09:02 PM

to plink2-users

You need to use --split-par to tell plink2 where the PARs are located.
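
As a rough illustration of what that relabeling amounts to, the sketch below classifies a chrX position as PAR1, PAR2 or non-PAR X. The two hg38 boundary values are assumptions that do not appear in this thread (they are the commonly cited GRCh38 PAR coordinates), and the function name is hypothetical; check the plink2 --split-par documentation before relying on them.

```python
# Assumed GRCh38/hg38 boundaries (not stated in this thread; verify before use).
PAR1_LAST_BP = 2_781_479       # last base-pair position treated as PAR1
PAR2_FIRST_BP = 155_701_383    # first base-pair position treated as PAR2

def chrx_region(pos):
    """Label a chrX base-pair position as 'PAR1', 'PAR2' or 'X' (non-PAR)."""
    if pos <= PAR1_LAST_BP:
        return "PAR1"
    if pos >= PAR2_FIRST_BP:
        return "PAR2"
    return "X"

# Positions taken from the .pvar excerpt in this thread.
for pos in (13189, 13205, 13242):
    print(pos, chrx_region(pos))
```

Under these assumed boundaries the variants listed in the .pvar excerpt all fall in PAR1, so they would be treated as diploid once the PARs are split off.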

freeseek

Sep 20, 2021, 4:11:22 PM

to plink2-users

I see, it works with the options:

--chr PAR1,PAR2 --split-par hg38

Got it now. Sorry for asking so many questions!

freeseek

Sep 20, 2021, 6:07:35 PM

to plink2-users

I might have uncovered a bug here.

Generate a toy dataset:

echo "1 1000000000 0 0 1 1 0 0 A A
1 1000000001 0 0 1 2 C C A G
1 1000000002 0 0 1 1 0 0 A A
1 1000000003 0 0 1 2 C C A G" > toy.ped

Then:

echo -e "X\trs0\t0\t1000\nX\trs10\t0\t1001" > toy.map
plink1.9 --file toy --make-bed --out toy
plink2 --bfile toy --glm allow-no-covars hetonly --split-par hg38
...

--glm logistic-Firth hybrid regression on phenotype 'PHENO1': done.
Results written to plink2.PHENO1.glm.logistic.hybrid .

While, as expected:

echo -e "X\trs0\t0\t3001000\nX\trs10\t0\t3001001" > toy.map
plink1.9 --file toy --make-bed --out toy
plink2 --bfile toy --glm allow-no-covars hetonly --split-par hg38

...

Error: No variants remaining for --glm ('dominant', 'recessive', 'genotypic',
and 'hethom' only operate on diploid data).

(though notice the error message is missing the "hetonly" case)

But the following:

echo -e "X\trs0\t0\t1000\nX\trs10\t0\t3001001" > toy.map
plink1.9 --file toy --make-bed --out toy
plink2 --bfile toy --glm allow-no-covars hetonly --split-par hg38

...

--glm logistic-Firth hybrid regression on phenotype 'PHENO1': 0%Floating point exception (core dumped)

crashes with a floating point exception.

Christopher Chang

Sep 20, 2021, 9:48:58 PM

to plink2-users

Oops. Thanks for reporting this; bugfix is posted.
