X and Y chromosome handling in --glm and --export


Ika

Dec 9, 2021, 12:23:49 PM
to plink2-users
Hey,
I'm performing some genome-wide association studies in an all-male study population and would like to include the X and Y chromosomes.

My data is from the UK Biobank and I've been processing it in PLINK1.9.

A few questions arose:
1) One of the tests I'm performing is a --glm in PLINK. Does this handle X and Y chromosomes correctly, or do I need to take extra measures?

2) I perform some other association tests outside PLINK. I wanted to get a better idea of how the hemizygosity is encoded in my data, so I performed an --export A in PLINK1.9.

I was a bit confused by the output, though. In the PAR, things look as expected: 0/1/2 encoding with 2 being observed much less frequently than 1 across variants and individuals.
However, when I look at the non-PAR, I see three kinds of variants: some with only 0/2, some with 0/1/2 where 2 is much more frequent than 1, and some with 0/1/2 where 1 is more frequent than 2. I expected to see only 0/2 in the non-PAR. Can sense be made of these, or is --export not supported in hemizygous cases?

If so, can I still assume that the PLINK binary format correctly represents the hemizygosity?

Christopher Chang

Dec 9, 2021, 8:37:56 PM
to plink2-users
1. The main question here is whether you want male genotypes to be treated as 0/1 or 0/2 in the regression; see the --xchr-model flag.  Normally, I'd recommend 0/2, which corresponds to the PLINK 2.0 default, but NOT PLINK 1.9's.  (PLINK 2.0 has several other major advantages over PLINK 1.9 when it comes to processing UK Biobank data, too.)

2. The raw encoding uses 0/2 for hemizygous genotypes.  However, if you have poorly-imputed variants where there are some dosages close to 0.5, you'll see 1s.
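
To make point 1 concrete, here is a minimal sketch in plain Python of how the 0/1 versus 0/2 choice for male hemizygous genotypes affects a regression slope. It is not PLINK's implementation: `ols_slope` is a hypothetical helper, the data are made up, and it fits a simple one-predictor regression, whereas --glm fits a fuller model with covariates.

```python
def ols_slope(x, y):
    """Slope of a simple least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Toy data (invented): alt-allele carrier status for eight males
# at a single non-PAR X variant, plus a quantitative phenotype.
carrier = [0, 0, 0, 1, 1, 0, 1, 0]
phenotype = [1.0, 1.2, 0.9, 2.1, 1.8, 1.1, 2.2, 1.0]

coded_01 = carrier                   # males on a 0..1 scale
coded_02 = [2 * c for c in carrier]  # males on a 0..2 scale

beta_01 = ols_slope(coded_01, phenotype)
beta_02 = ols_slope(coded_02, phenotype)

# Doubling the predictor halves the slope: beta_02 == beta_01 / 2.
print(beta_01, beta_02)
```

In an all-male sample the two codings only rescale the predictor, so the effect size changes by a factor of two while the test statistic stays the same. The choice matters more when males and females are analyzed together, because it then determines how a male hemizygous genotype is weighted relative to a female homozygous one.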

Ika

Dec 10, 2021, 3:49:43 AM
to plink2-users
Thank you for your answers and the recommendation!

One thing, though: my data are not imputed (I use exome sequencing data). Is there another explanation for the exports I'm seeing?

Christopher Chang

Dec 10, 2021, 4:02:48 AM
to plink2-users
Sex-unaware genotype calling is another possibility.  The --set-hh-missing flag can be used to convert these calls to "missing".
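
If it helps to see how widespread these calls are before deciding, the following is a rough sketch for tallying genotype codes among males in an exported .raw file. It assumes the usual layout of six leading columns (FID IID PAT MAT SEX PHENOTYPE) followed by one column per variant with values 0/1/2/NA, and that the file has already been restricted to non-PAR X variants. The function names and the example rows are hypothetical, and `mask_het_haploid` only mimics the idea of setting heterozygous haploid calls to missing on exported data; it is not a substitute for --set-hh-missing.

```python
from collections import Counter

LEADING_COLUMNS = 6  # assumed layout: FID IID PAT MAT SEX PHENOTYPE

def male_code_counts(raw_lines):
    """Count genotype codes per variant among males (SEX == '1')."""
    header = raw_lines[0].split()
    variants = header[LEADING_COLUMNS:]
    counts = {v: Counter() for v in variants}
    for line in raw_lines[1:]:
        fields = line.split()
        if fields[4] != "1":  # SEX is the fifth column; 1 = male
            continue
        for v, code in zip(variants, fields[LEADING_COLUMNS:]):
            counts[v][code] += 1
    return counts

def mask_het_haploid(codes):
    """Replace heterozygous calls ('1') with 'NA' for one male's non-PAR codes."""
    return ["NA" if c == "1" else c for c in codes]

# Invented example rows; variant names are placeholders.
example = [
    "FID IID PAT MAT SEX PHENOTYPE varA_T varB_G",
    "1 s1 0 0 1 -9 0 2",
    "2 s2 0 0 1 -9 2 1",
    "3 s3 0 0 1 -9 0 1",
    "4 s4 0 0 1 -9 NA 0",
]

counts = male_code_counts(example)
for variant, c in counts.items():
    print(variant, dict(c), "het-haploid calls:", c["1"])
```

Variants where a large share of males carry code 1 are the ones most likely to reflect sex-unaware calling rather than occasional genotyping error.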