Thanks, James. I didn't know about that new option in smartpca. I
haven't been using smartpca lately because it is kind of awkward.
Instead, I just do it by hand using Octave's eigs() function. I'll bet R
can do that, too. I use plink2 to decide which subjects are sufficiently
distantly related to form the set for eigenvector computation, then I
project their relatives onto those vectors. That has been a lot easier
than using smartpca. But now that it has new features, maybe I have to
return to it.
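
For anyone who wants to try the same thing outside Octave, here is a
rough sketch in Python/NumPy of the general idea (eigenvectors from the
unrelated set, relatives projected afterwards). It is not my actual
script: the function name, the simulated genotypes, and the
sqrt(2p(1-p)) scaling are assumptions for illustration, and it assumes
complete 0/1/2 genotypes at polymorphic SNPs.

```python
import numpy as np

def pcs_with_projection(G_unrel, G_rel, k=2):
    """Top-k PCs from unrelated subjects, with relatives projected on.

    G_unrel, G_rel: (subjects x SNPs) arrays of 0/1/2 allele counts,
    no missing values (an assumption of this sketch).
    """
    # Center and scale every SNP using the unrelated set only.
    mean = G_unrel.mean(axis=0)
    p = mean / 2.0
    scale = np.sqrt(2.0 * p * (1.0 - p))
    Z = (G_unrel - mean) / scale
    m = Z.shape[1]

    # Subject-by-subject covariance and its leading eigenvectors.
    K = Z @ Z.T / m
    vals, vecs = np.linalg.eigh(K)               # ascending order
    vals, vecs = vals[::-1][:k], vecs[:, ::-1][:, :k]

    # Unit-length SNP loadings implied by the subject eigenvectors.
    loadings = Z.T @ vecs / np.sqrt(vals * m)

    # Scores for the unrelated set, then the relatives on the same axes.
    scores_unrel = Z @ loadings
    scores_rel = ((G_rel - mean) / scale) @ loadings
    return scores_unrel, scores_rel, loadings

# Simulated placeholder data: 50 "unrelated" and 10 "related" subjects.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.2, 0.8, size=200)
G_unrel = rng.binomial(2, freqs, size=(50, 200)).astype(float)
G_rel = rng.binomial(2, freqs, size=(10, 200)).astype(float)
scores_unrel, scores_rel, loadings = pcs_with_projection(G_unrel, G_rel)
```

The relatives never enter the eigen-decomposition, so family structure
cannot drive the leading vectors; they are only scaled with the
unrelated set's means and frequencies and then multiplied by the
loadings.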
Mike
On Wed, 19 Mar 2014, James Lee wrote:
> I think lack of LD pruning would matter if some of the statistically
> significant PCs turned out to reflect local LD rather than global
> structure induced by geography or lab artifacts. You probably would not
> want to partial out PCs that merely reflected LD.
>
> It takes a large sample size and marker number to pick up such
> artifactual PCs. You can diagnose their presence by inspecting the "SNP
> loadings" and seeing if the biggest ones are all adjacent in the SNP
> panel. If no such PCs seem to be present, I don't see any harm in using
> the entire panel of SNPs.
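
A rough sketch of that loadings check in Python/NumPy, assuming the
loadings for one PC are stored in panel order (sorted by chromosome and
position). The function name and the simulated loadings are made up for
illustration, and the index span ignores chromosome boundaries.

```python
import numpy as np

def top_loading_span(loadings, n_top=20):
    """Panel indices of the n_top largest |loadings| and their index span."""
    idx = np.sort(np.argsort(np.abs(loadings))[-n_top:])
    return idx, int(idx[-1] - idx[0])

rng = np.random.default_rng(1)
n_snps = 5000

# A PC reflecting genome-wide structure: large loadings scattered around.
diffuse = rng.normal(size=n_snps)

# A PC reflecting local LD: one block of 20 adjacent SNPs dominates.
local = rng.normal(scale=0.1, size=n_snps)
local[1000:1020] += 5.0

_, span_diffuse = top_loading_span(diffuse)
idx_local, span_local = top_loading_span(local)
print("diffuse span:", span_diffuse, "local span:", span_local)
```

A span close to n_top means the biggest loadings sit next to each other
in the panel, which is the pattern described above for an LD-driven PC.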
>
> If you do see some relatively prominent LD-based PCs, you can rerun
> smartpca with an option that allows pruning or downweighting of SNPs by
> LD. In fact, some of smartpca's developers have proposed using this
> option to address the same problem raised by Speed et al.:
>
> http://dx.plos.org/10.1371/journal.pgen.1003993
>
> On Mar 19, 2014, at 2:16 PM, Mike Miller <mbmi...@gmail.com> wrote:
>