symmetric square matrix for LD statistics including Dprime

Kruthika Iyer

Jun 27, 2024, 6:41:12 PM
to plink2-users

Hi!

I am currently working on generating an LD statistic report in a symmetric square matrix format. 

While the option '--r2 square' efficiently calculates statistics for all the variants present in the bfile and outputs a square matrix, there isn't a corresponding option for calculating Dprime statistics.

To work around this limitation, I opted for using a list format with the following parameters:

--bfile  
--ld-snp-list snp_list.txt 
--ld-window 9999999 
--ld-window-r2 0 
--out  
--r2 dprime yes-really

The snp_list.txt file contains SNP IDs extracted from the .bim file specified with the --bfile flag. 

Despite explicitly specifying the SNP list and setting a low threshold of 0, I've observed that some variants are omitted from the output, i.e., their pairwise LD calculation was never performed against any other SNP.

I'm currently investigating the reasons behind this behavior and seeking clarity on why certain variants are not included in the output despite being specified in the SNP list.

Thanks

Chris Chang

Jun 27, 2024, 8:43:51 PM
to Kruthika Iyer, plink2-users
plink 1.x --r2 normally uses the intersection of the --ld-window and --ld-window-kb settings.  You correctly disabled the “--ld-window 10” default setting, but you also need to disable the “--ld-window-kb 1000” default.

(Also, as a practical matter, you probably want to use plink 2.0 --r2-phased for this, so you can remove output columns you don’t need from the gigantic output file.)

Chris Chang

Jun 27, 2024, 8:54:03 PM
to Kruthika Iyer, plink2-users
Forgot to mention, if you literally want all pairs including the interchromosomal ones, the ‘inter-chr’ modifier is necessary (and removes the need for --ld-window / --ld-window-kb).

Kruthika Iyer

Jun 28, 2024, 12:02:26 PM
to plink2-users
Hi Chris!

I tried running this both ways:

--bfile
--ld-snp-list list.txt
--ld-window 9999999
--ld-window-kb 1000000
--ld-window-r2 0
--out
--r2

and

--bfile
--inter-chr
--ld-snp-list list.txt
--ld-window-r2 0
--out
--r2

The result is the same as before: some SNPs are still missing from the output.
In this particular example, my .bim file has 146 unique SNPs. 
When I run --r2 square, I get a symmetric square matrix of 146x146 r2 associations.
However, when I use either set of those aforementioned parameters, I get a list of 19881 pairs, and when I distill the SNPs from the SNP_A and SNP_B columns of the .ld file, I only get 141 unique SNPs.
I know which 5 SNPs are not included in the pairwise calculation, but I do not understand why. 
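
The comparison described above (IDs in the .bim file versus IDs appearing in the SNP_A and SNP_B columns of the .ld file) can be sketched in Python as follows. This is a minimal sketch, not part of plink: it assumes the .ld file is whitespace-delimited with a header row containing SNP_A and SNP_B, and that the variant ID is the second column of the .bim file. The function names are hypothetical.

```python
def read_bim_ids(bim_path):
    """Return variant IDs (second column) from a .bim file, in file order."""
    ids = []
    with open(bim_path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) >= 2:
                ids.append(fields[1])
    return ids


def read_ld_ids(ld_path):
    """Return the set of IDs seen in the SNP_A or SNP_B column of a .ld file."""
    seen = set()
    with open(ld_path) as fh:
        header = fh.readline().split()
        # Assumption: the header names these two columns exactly like this.
        col_a = header.index("SNP_A")
        col_b = header.index("SNP_B")
        for line in fh:
            fields = line.split()
            if len(fields) > max(col_a, col_b):
                seen.add(fields[col_a])
                seen.add(fields[col_b])
    return seen


def missing_from_ld(bim_path, ld_path):
    """List .bim variant IDs that never appear in any reported pair."""
    seen = read_ld_ids(ld_path)
    return [variant for variant in read_bim_ids(bim_path) if variant not in seen]
```

Calling missing_from_ld with the paths of the .bim and .ld files returns the omitted IDs in .bim order, which makes it easy to look those variants up afterwards.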


Best,
Kruthika

Kruthika Iyer

Jun 28, 2024, 12:33:38 PM
to plink2-users
Also, is there an easier way of getting the Dprime statistic in a symmetric square format than the approach I was taking?
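
For reference, the list-to-matrix step of the current approach can be sketched in Python as below. This is only a post-processing sketch under stated assumptions, not a plink feature: it assumes the list-format .ld file is whitespace-delimited with a header row, that the Dprime column is named DP (pass a different name via the stat argument if not), and that the caller supplies the variant IDs in the desired row/column order. The function name is hypothetical.

```python
import math


def ld_list_to_square(ld_path, ids, stat="DP"):
    """Pivot pairwise LD records into a symmetric square matrix.

    ids  : variant IDs in the desired row/column order (e.g. from the .bim).
    stat : name of the statistic column; "DP" is an assumption about the header.
    Cells for pairs that are absent from the file stay NaN.
    """
    index = {variant: i for i, variant in enumerate(ids)}
    n = len(ids)
    matrix = [[math.nan] * n for _ in range(n)]
    with open(ld_path) as fh:
        header = fh.readline().split()
        col_a = header.index("SNP_A")
        col_b = header.index("SNP_B")
        col_s = header.index(stat)
        for line in fh:
            fields = line.split()
            if len(fields) <= max(col_a, col_b, col_s):
                continue  # skip blank or truncated lines
            a, b = fields[col_a], fields[col_b]
            if a in index and b in index:
                value = float(fields[col_s])
                # Fill both triangles so the result is symmetric.
                matrix[index[a]][index[b]] = value
                matrix[index[b]][index[a]] = value
    return matrix
```

Rows or columns that remain entirely NaN correspond to variants that never appear in any reported pair, so the same matrix also exposes the omitted SNPs.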


Best,
Kruthika

Chris Chang

Jun 28, 2024, 12:57:23 PM
to Kruthika Iyer, plink2-users
Do you know what their allele frequencies are?
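
One way to answer this is to produce an allele frequency report (plink 1.x --freq) and look up the omitted variants in it. As background, r2 is a squared correlation, so it is undefined for a variant with no variation in the sample; whether that explains the 5 missing SNPs here is not established in this thread. The Python sketch below flags variants whose minor allele frequency is zero or not numeric. It assumes a whitespace-delimited report with a header row containing SNP and MAF columns; the function name is hypothetical.

```python
import math


def zero_or_missing_maf(frq_path):
    """Return IDs whose MAF is 0, NaN, or not numeric (e.g. "NA")."""
    flagged = []
    with open(frq_path) as fh:
        header = fh.readline().split()
        # Assumption: the report names these columns SNP and MAF.
        col_id = header.index("SNP")
        col_maf = header.index("MAF")
        for line in fh:
            fields = line.split()
            if len(fields) <= max(col_id, col_maf):
                continue  # skip blank or truncated lines
            try:
                maf = float(fields[col_maf])
            except ValueError:
                maf = math.nan  # non-numeric entry such as "NA"
            if math.isnan(maf) or maf == 0.0:
                flagged.append(fields[col_id])
    return flagged
```

Comparing the returned IDs with the list of omitted SNPs shows whether the two sets coincide.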
