To my understanding, pruning is the process of removing nearby variants that are
highly correlated in the population (in linkage disequilibrium) and retaining only the one with the highest minor allele frequency in each such cluster (window). The r2 metric indicates how correlated the variants are during the removal process. According to the PLINK documentation,
"Its third parameter is a pairwise r2 threshold: at each step, pairs of variants in the current window with squared correlation greater than the threshold are noted, and variants are greedily pruned from the window until no such pairs remain."
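To make my reading of that sentence concrete, here is a minimal Python sketch of the greedy step for a single window. It is not PLINK's actual implementation: the function names (`r_squared`, `maf`, `greedy_prune`), the toy genotypes, the use of Pearson correlation on allele counts, and the rule of dropping the lower-MAF variant of the first flagged pair are all my own assumptions for illustration.

```python
from itertools import combinations
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two vectors of allele counts (0/1/2)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # monomorphic variant: correlation undefined, treated as 0 here
    return cov * cov / (vx * vy)


def maf(x):
    """Minor allele frequency from diploid allele counts."""
    p = sum(x) / (2 * len(x))
    return min(p, 1 - p)


def greedy_prune(genotypes, threshold):
    """Drop variants until no pair in the window has r2 > threshold.

    genotypes: dict mapping a variant ID to its list of allele counts.
    Assumption: of the first flagged pair, the lower-MAF variant is dropped
    (ties drop the second variant of the pair).
    """
    kept = dict(genotypes)
    while True:
        flagged = [
            (a, b)
            for a, b in combinations(kept, 2)
            if r_squared(kept[a], kept[b]) > threshold
        ]
        if not flagged:
            return sorted(kept)
        a, b = flagged[0]
        drop = a if maf(kept[a]) < maf(kept[b]) else b
        del kept[drop]


# Hypothetical toy window: snp1 and snp2 are strongly correlated, snp3 is not.
example = {
    "snp1": [0, 1, 2, 0, 1, 2],
    "snp2": [0, 1, 2, 0, 1, 1],
    "snp3": [2, 0, 1, 1, 0, 2],
}

print(greedy_prune(example, 0.2))  # ['snp1', 'snp3']
```

Here the `threshold` argument plays the role of the third parameter in the quoted sentence: a variant is dropped only when it belongs to a pair whose r2 exceeds it.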
So I think the process removes variants with r2 greater than a threshold. However, many publications in prominent journals state that they prune variants at r2 smaller than a threshold. For example,
"Linkage Disequilibrium (LD) pruning with r2 < 0.2 was done using PLINK [82] software to obtain a set of unrelated SNPs to evaluate the phylogenetic relationship and principal component analysis"
https://www.mdpi.com/2223-7747/10/5/998/htm
Have I misunderstood anything?
Wang