LD pruning parameters

Diana J.

Feb 26, 2020, 12:14:00 PM
to plink2-users
Hello all,

When pruning SNPs in LD, how do you choose the window size, the number of SNPs to shift (step size), and the VIF threshold? I've scoured the literature, and the choices all seem subjective/arbitrary.

Two papers that are highly related to my project use --indep 100 25 100 and --indep 50 5 10, and the numbers of SNPs they are working with are similar.

Is there a principled method for setting the parameters according to your dataset?

Best,
Diana

Christopher Chang

Feb 26, 2020, 2:27:35 PM
to plink2-users
It depends on the expectations of your downstream computations.

For many purposes, it's sufficient to prune obvious short-range LD and do an imperfect job of it; a window size of 50-100 is good enough for that, and is computationally very cheap.

When you have stronger requirements, you can translate that into an appropriate r^2 threshold, and then specify a kilobase value for the --indep-pairwise window size that's large enough to capture most variant pairs expected to correlate that highly, along with a step size of 1.  (E.g. "--indep-pairwise 500kb 1 <r^2 threshold>")
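
A side note on relating the VIF argument of --indep to an r^2 threshold: per the PLINK documentation, VIF is 1/(1 - R^2), where R^2 is the multiple correlation coefficient from regressing one variant on the others in the window. That is not the same quantity as the pairwise r^2 used by --indep-pairwise, so the conversion below is only a rough guide. A minimal Python sketch (the function names are hypothetical helpers, not part of PLINK):

```python
def vif_to_r2(vif: float) -> float:
    """Multiple-correlation R^2 implied by a VIF cutoff, using VIF = 1 / (1 - R^2)."""
    if vif < 1.0:
        raise ValueError("VIF is at least 1 by definition")
    return 1.0 - 1.0 / vif


def r2_to_vif(r2: float) -> float:
    """VIF cutoff implied by a multiple-correlation R^2 in [0, 1)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2)


if __name__ == "__main__":
    # VIF cutoffs taken from the two --indep settings quoted in this thread.
    for vif in (10, 100):
        print(f"VIF {vif} -> R^2 {vif_to_r2(vif):.2f}")
```

Under that definition, the VIF cutoffs of 10 and 100 in the two --indep settings quoted in this thread correspond to multiple R^2 values of 0.9 and 0.99 respectively.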

Diana Jeong

Feb 28, 2020, 4:53:05 PM
to plink2-users
Thank you, Christopher. If I want to remove SNPs in high LD and then calculate inbreeding coefficients for each individual after LD pruning (working with about 30K SNPs after basic QC), is there a way to determine the parameters? I am using the --indep flag rather than --indep-pairwise.