plink --score-list on sex chromosomes


Caterina

Jul 3, 2025, 8:10:45 AM
to plink2-users
I'm trying to compute the sum of rare variants on chrY for several genes. However, several entries come out as 0.5. Why am I getting this instead of 1? I am using the no-mean-imputation modifier.
Here is an example of my scorefiles:

variantID ALT UTY
24:13249860:G:C C 1
24:13249882:C:T T 1
24:13251046:G:C C 1
24:13251051:T:A A 1
24:13251053:C:A A 1
24:13251076:C:A A 1
24:13251105:C:T T 1
24:13251112:A:G G 1
24:13251123:T:C C 1


And the log file:

PLINK v2.0.0-a.6.9LM 64-bit Intel (29 Jan 2025) cog-genomics.org/plink/2.0/
(C) 2005-2025 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to additive_burden.CHR_24.log.
Options in effect:
--bfile /mnt/project/Bulk/Exome sequences/Population level exome OQFE variants, PLINK format - final release//ukb23158_cY_b0_v1
--no-psam-pheno
--out additive_burden.CHR_24
--score-list /mnt/project/burden_per_gene/data/02.score_files.missense/scorefile_list.CHR_24.txt no-mean-imputation header-read cols=+scoresums,-scoreavgs
Start time: Fri May 9 00:33:52 2025
15614 MiB RAM detected, ~14370 available; reserving 7807 MiB for main
workspace.
Using up to 8 compute threads.
469835 samples (254616 females, 215156 males, 63 ambiguous; 469835 founders)
loaded from /mnt/project/Bulk/Exome sequences/Population level exome OQFE
variants, PLINK format - final release//ukb23158_cY_b0_v1.fam.
11316 variants loaded from /mnt/project/Bulk/Exome sequences/Population level
exome OQFE variants, PLINK format - final release//ukb23158_cY_b0_v1.bim.
Note: No phenotype data present.
--score-list file 1/34: 13 variants processed.
--score-list file 2/34: 303 variants processed.
--score-list file 3/34: 7 variants processed.
--score-list file 4/34: 16 variants processed.
--score-list file 5/34: 1 variant processed.
--score-list file 6/34: 3 variants processed.
--score-list file 7/34: 158 variants processed.
--score-list file 8/34: 576 variants processed.
--score-list file 9/34: 7 variants processed.
--score-list file 10/34: 1 variant processed.
--score-list file 11/34: 48 variants processed.
--score-list file 12/34: 4 variants processed.
--score-list file 13/34: 70 variants processed.
--score-list file 14/34: 201 variants processed.
--score-list file 15/34: 33 variants processed.
--score-list file 16/34: 78 variants processed.
--score-list file 17/34: 41 variants processed.
--score-list file 18/34: 309 variants processed.
--score-list file 19/34: 19 variants processed.
--score-list file 20/34: 31 variants processed.
--score-list file 21/34: 14 variants processed.
--score-list file 22/34: 90 variants processed.
--score-list file 23/34: 73 variants processed.
--score-list file 24/34: 2 variants processed.
--score-list file 25/34: 3 variants processed.
--score-list file 26/34: 31 variants processed.
--score-list file 27/34: 12 variants processed.
--score-list file 28/34: 35 variants processed.
--score-list file 29/34: 67 variants processed.
--score-list file 30/34: 383 variants processed.
--score-list file 31/34: 31 variants processed.
--score-list file 32/34: 11 variants processed.
--score-list file 33/34: 206 variants processed.
--score-list file 34/34: 62 variants processed.
--score-list: Results written to additive_burden.CHR_24.sscore .

Caterina

Jul 3, 2025, 8:46:59 AM
to plink2-users
I'm using UK Biobank exome data in case that's relevant. 

Chris Chang

Jul 3, 2025, 9:56:04 AM
to Caterina, plink2-users
What happens if you preprocess the chrY .bed file with --set-invalid-haploid-missing?


Emma

Jul 3, 2025, 8:18:23 PM
to Chris Chang, plink2-users
The half-integer counts (0.5, 1.5, 3.5, etc.) disappear when I preprocess as you suggested.

I exported a .ped file for 3 individuals and a specific variant:

1 1	0	0	1	-9	C	C
2 2	0	0	1	-9	T	C
3 3	0	0	1	-9	T	C

It seems there are indeed haploid calls. But in that case, would my previous command line count the "C/C" person as SUM = 1 or 2? And for chrX, would this be a problem for men there too? Thanks!

Emma

Jul 4, 2025, 5:32:44 AM
to Chris Chang, plink2-users
Sorry for the poorly written sentence. I meant that when looking at a specific variant (24:18550184), which is not in a PAR, I got diploid calls. I did not expect that, and I'm not sure why it happens, not only for this variant but for several others.
Also, for the haploid call (C/C), how does PLINK2 know to compute SUM=1 instead of SUM=2 when using --score?
Does this happen on chrX too?
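
For anyone wanting to check their own export for this, here is a minimal Python sketch that lists which variants in a .ped row carry two different alleles. It assumes the usual .ped layout (six leading columns, then one allele pair per variant) and "0" as the missing-allele code; `heterozygous_calls` is a hypothetical helper name, not part of PLINK.

```python
def heterozygous_calls(ped_line):
    """Return 0-based variant indexes whose two alleles differ.

    Assumes columns 1-6 are FID, IID, PAT, MAT, SEX, PHENO and that
    "0" marks a missing allele (both are assumptions of this sketch).
    """
    fields = ped_line.split()
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("expected an even number of allele columns")
    het = []
    for i in range(0, len(alleles), 2):
        a1, a2 = alleles[i], alleles[i + 1]
        if "0" in (a1, a2):
            continue  # skip missing calls
        if a1 != a2:
            het.append(i // 2)
    return het


# The three rows from the .ped excerpt in this thread.
rows = [
    "1 1 0 0 1 -9 C C",
    "2 2 0 0 1 -9 T C",
    "3 3 0 0 1 -9 T C",
]
for row in rows:
    print(row.split()[1], heterozygous_calls(row))
```

On a chromosome that should be haploid, any index reported here is a call that cannot be a single haploid genotype.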

Christopher Chang

Jul 6, 2025, 11:40:15 PM
to plink2-users
From the --score documentation:
"By default, G contains basic allelic dosages (0..2 on diploid chromosomes, 0..1 on haploid, male chrX encoding controlled by --xchr-model)."

PLINK 2 dosages are on a 0..2 scale on regular diploid chromosomes, and 0..1 on regular haploid chromosomes. However, chrX doesn't fit neatly in either of those categories.  --xchr-model lets you control its encoding in several contexts (--glm, --condition[-list], --score[-list], --variant-score).

The following three modes are supported:

  • 0. Skip chrX. (This no longer causes other haploid chromosomes to be skipped.)
  • 1. Male dosages are on a 0..1 scale on chrX, while females are 0..2. This was the PLINK 1.x default.
  • 2. Males and females are both on a 0..2 scale on chrX.  This is the PLINK 2 default.
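
As a rough illustration of the three modes above, the dosage scale per sample could be sketched in Python as follows. This is not PLINK's code: `dosage_range`, its arguments, and the chromosome labels are hypothetical names chosen for the example, and only the mode semantics come from the documentation quoted above.

```python
from typing import Optional, Tuple


def dosage_range(chrom: str, is_male: bool, xchr_model: int = 2) -> Optional[Tuple[int, int]]:
    """Return the (min, max) allelic dosage scale, or None if the chromosome is skipped.

    chrom is one of "autosome", "X", or "haploid" (hypothetical labels).
    xchr_model defaults to 2, the PLINK 2 default described above.
    """
    if xchr_model not in (0, 1, 2):
        raise ValueError("xchr_model must be 0, 1, or 2")
    if chrom == "autosome":
        return (0, 2)
    if chrom == "haploid":
        # Mode 0 only skips chrX; other haploid chromosomes are kept.
        return (0, 1)
    if chrom == "X":
        if xchr_model == 0:
            return None  # chrX is skipped entirely
        if xchr_model == 1 and is_male:
            return (0, 1)  # PLINK 1.x-style male encoding
        return (0, 2)  # females in mode 1, everyone in mode 2
    raise ValueError(f"unknown chromosome label: {chrom}")


print(dosage_range("X", is_male=True, xchr_model=1))
print(dosage_range("X", is_male=True, xchr_model=2))
print(dosage_range("haploid", is_male=True, xchr_model=0))
```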

Caterina

Jul 8, 2025, 9:51:57 AM
to plink2-users
What about chrY? Does PLINK2 --score assume a 0..1 encoding? In that case, for a genotype like C/C (which is really a single haploid call, but for some reason shows up as C C in the .ped file), should I get SUM=1 when using PLINK2 --score?

I'm trying to understand why PLINK2 was giving me half-integer counts for people with diploid calls like T/C. I guess that only makes sense if PLINK2 understands that C/C on chrY is just one call (unlike on autosomes, where a diploid call like that gives SUM=2).

I guess it's better to explain myself with an example. A .ped file like this:


1 1 0 0 1 -9 C C 
2 2 0 0 1 -9 T C 
3 3 0 0 1 -9 T C

Counting allele C with weight = 1 would give me SUM=2 for person 1 on an autosome, but SUM=1 for person 1 on chrY. How does PLINK2 --score differentiate between the two types of call (an autosomal call and a chrY call)?

Chris Chang

Jul 9, 2025, 10:44:57 PM
to Caterina, plink2-users
PLINK recognizes that chrX, chrY, and chrM are special, and handles them differently from the autosomes when appropriate.
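
To make the arithmetic in the question concrete, here is a small Python sketch of how the three .ped rows from this thread would score for allele C with weight 1 on a 0..2 scale versus a 0..1 scale. It assumes a heterozygous call on a haploid chromosome is carried as a dosage of 0.5, which would be consistent with the half-integer sums reported earlier in the thread but is not stated in it. `score_sum` and `max_dosage` are hypothetical names, and this is not PLINK's implementation.

```python
def score_sum(genotypes, allele, weight=1.0, max_dosage=2):
    """Sum weight * dosage over (a1, a2) allele pairs for one sample.

    max_dosage=2 mimics a diploid 0..2 scale, max_dosage=1 a haploid 0..1 scale.
    Assumption of this sketch: the dosage is the allele count rescaled to that
    range, so a heterozygous call on the haploid scale becomes 0.5.
    """
    total = 0.0
    for a1, a2 in genotypes:
        copies = (a1 == allele) + (a2 == allele)  # 0, 1, or 2 copies
        total += weight * copies * max_dosage / 2
    return total


# Sample ID -> call at the single variant in the .ped example.
ped_calls = {"1": ("C", "C"), "2": ("T", "C"), "3": ("T", "C")}
for sample, call in ped_calls.items():
    diploid = score_sum([call], "C", max_dosage=2)
    haploid = score_sum([call], "C", max_dosage=1)
    print(sample, diploid, haploid)
```

Under these assumptions person 1 gets 2.0 on the diploid scale and 1.0 on the haploid scale, while persons 2 and 3 get 1.0 and 0.5.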
