Restricting computation of LD matrix to a set of SNPs

Jarkko Toivonen

Jul 30, 2021, 10:13:49 AM
to plink2-users
What is the correct way to compute R2 for all pairs in a list of SNPs?

I first tried the command
plink --bfile input --r2 --ld-snps chr6_26022804_C_T,chr6_26024958_C_T
but apparently this computes LD between each SNP in the list and the SNPs in its neighbourhood.

Then I tried using the --extract switch in the following way:
plink --bfile input --r2 inter-chr --ld-window-r2 0.0 --extract snips.txt
This will create the following output:
 CHR_A         BP_A               SNP_A  CHR_B         BP_B               SNP_B           R2
     6     26022804   chr6_26022804_C_T      6     26024958   chr6_26024958_C_T     0.237953

But the command
plink --bfile input --ld chr6_26022804_C_T chr6_26024958_C_T
outputs
R-sq = 0.22459

Why the difference?

Christopher Chang

Jul 30, 2021, 11:03:40 AM
to plink2-users
plink 1.x has two classes of r^2 computations, allele-based and haplotype-based.  The haplotype-based calculation is slower without being much more accurate (partly because phasing software didn't really exist back when plink 1.07 was written, so the plink 1.x file format can't directly store haplotype information and instead plink has to guess), so it's used by --ld but usually not --r2.  If you add --r2's "dprime" modifier, one side effect is that the r^2 computation also becomes haplotype-based.
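
To make the distinction concrete, here is a rough Python sketch of the two kinds of r^2, computed from a 3x3 table of genotype counts for a pair of biallelic variants. It is not plink's code, and plink's numerical procedure may differ: the first function assumes the non-haplotype r^2 is the squared Pearson correlation of 0/1/2 allele counts, and the second uses a simple EM iteration as a stand-in for the "guess" of haplotype frequencies from unphased data. The function names and the table layout are made up for this illustration.

```python
def genotype_r2(counts):
    """Squared Pearson correlation of 0/1/2 allele counts.

    counts[i][j] = number of samples carrying i copies of allele A at the
    first variant and j copies of allele B at the second (hypothetical layout).
    """
    n = sum(map(sum, counts))
    cells = [(i, j, counts[i][j]) for i in range(3) for j in range(3)]
    mean_x = sum(i * c for i, _, c in cells) / n
    mean_y = sum(j * c for _, j, c in cells) / n
    cov = sum((i - mean_x) * (j - mean_y) * c for i, j, c in cells) / n
    var_x = sum((i - mean_x) ** 2 * c for i, _, c in cells) / n
    var_y = sum((j - mean_y) ** 2 * c for _, j, c in cells) / n
    return cov * cov / (var_x * var_y)


def haplotype_r2(counts, iterations=200):
    """r^2 from haplotype frequencies estimated by EM on unphased genotypes.

    Only double heterozygotes (counts[1][1]) have ambiguous phase; x is the
    estimated fraction of them that carry the AB/ab haplotype pair.
    """
    n = sum(map(sum, counts))
    (c00, c01, c02), (c10, c11, c12), (c20, c21, c22) = counts
    # Haplotype counts that are unambiguous given the genotypes.
    n_AB = 2 * c22 + c21 + c12
    n_Ab = 2 * c20 + c21 + c10
    n_aB = 2 * c02 + c12 + c01
    n_ab = 2 * c00 + c10 + c01

    def freqs(x):
        total = 2 * n  # two haplotypes per sample
        return ((n_AB + x * c11) / total,
                (n_Ab + (1 - x) * c11) / total,
                (n_aB + (1 - x) * c11) / total,
                (n_ab + x * c11) / total)

    x = 0.5  # start by splitting double heterozygotes evenly
    for _ in range(iterations):
        f_AB, f_Ab, f_aB, f_ab = freqs(x)
        denom = f_AB * f_ab + f_Ab * f_aB
        if denom == 0:
            break
        x = f_AB * f_ab / denom

    f_AB, f_Ab, f_aB, f_ab = freqs(x)
    p_A = f_AB + f_Ab
    p_B = f_AB + f_aB
    d = f_AB - p_A * p_B
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))


if __name__ == "__main__":
    # Made-up genotype counts, purely for illustration.
    table = [[4, 2, 0],
             [2, 4, 2],
             [0, 2, 4]]
    print("allele-count r^2:", genotype_r2(table))
    print("haplotype r^2:   ", haplotype_r2(table))
```

On a table with many double heterozygotes the two numbers generally do not agree, which is the same kind of gap as between the --r2 and --ld outputs above.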

Jarkko Toivonen

Jul 30, 2021, 12:39:41 PM
to plink2-users
Thanks a lot for the explanation!

jasdeep kaur

Aug 18, 2021, 12:14:19 PM
to plink2-users
Hi Chris,

I have phased and imputed genotype data. Would you suggest using plink 1.7 for haplotype analysis?

Thanks
jasdeep
