Hi Christopher,
I'm sorry to bang on about this again, but I believe that the documentation for plink --pca is wrong in its explanation of what is being shown in the plink.eigenvec table.
It says here (https://www.cog-genomics.org/plink/2.0/formats#eigenvec) that in the plink.eigenvec table: "The first columns contain the sample ID, and the rest are principal component *weights*". (The documentation also refers to a “sample weight” in the discussion of the allele-wts modifier.)
This is incorrect. The values for each PC reported in the standard plink.eigenvec table are normalised PC *scores* for each sample (i.e. the sample's value of the linear combination of SNPs, with SNPs as the variables), not PC weights.
I ran a test to confirm this; if you are interested, I can share the R markdown file. Using dummy data, I ran --pca in plink and then performed PCA on the same data in R, taking care with row and column orientation. In R I performed PCA twice: once with SNPs as variables and once with samples as variables. The plot of the sample scores from the SNPs-as-variables analysis (which yields SNP weights and sample scores) clearly looks more similar to the plot of the plink.eigenvec values than the plot of the sample weights from the samples-as-variables analysis does.
(This is a relief as this is how this data is commonly interpreted.)
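The comparison described above can be sketched in a few lines of Python with NumPy. This is not the R test itself and not PLINK's implementation. The dummy genotypes and all variable names are illustrative, and it assumes that the eigenvec columns behave like unit-length eigenvectors of a sample-by-sample relationship matrix. For simplicity each SNP is only mean-centred, not variance-standardised.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_snps, k = 6, 40, 2

# Dummy genotypes: rows are samples, columns are SNPs (allele counts 0/1/2).
G = rng.integers(0, 3, size=(n_samples, n_snps)).astype(float)

# SNPs as variables: centre each SNP (column) across samples.
X = G - G.mean(axis=0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)
snp_weights = Vt[:k].T                      # (n_snps, k): one weight per SNP
sample_scores = X @ snp_weights             # (n_samples, k): one score per sample
normalised_scores = sample_scores / S[:k]   # rescale each PC to unit length

# Top-k eigenvectors of the sample-by-sample matrix X X^T / n_snps
# (assumed here to stand in for the relationship matrix).
K = X @ X.T / n_snps
_, evecs = np.linalg.eigh(K)                # eigenvalues come back ascending
top_eigenvectors = evecs[:, ::-1][:, :k]

# Samples as variables: centre each sample across SNPs instead.
Y = G.T - G.T.mean(axis=0)
_, _, Wt = np.linalg.svd(Y, full_matrices=False)
sample_weights = Wt[:k].T                   # (n_samples, k): one weight per sample


def same_up_to_sign(a, b):
    """Compare columns of a and b, allowing each column an arbitrary sign flip."""
    signs = np.sign(np.sum(a * b, axis=0))
    return bool(np.allclose(a * signs, b, atol=1e-8))


print("normalised scores match eigenvectors:",
      same_up_to_sign(normalised_scores, top_eigenvectors))
print("sample weights match eigenvectors:",
      same_up_to_sign(sample_weights, top_eigenvectors))
```

Under these assumptions the normalised sample scores coincide with the eigenvectors up to sign, because both are the left singular vectors of the centred genotype matrix. The sample weights from the samples-as-variables analysis use a different centring and so need not match.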
May I suggest a simple update of the documentation to state clearly that plink.eigenvec shows the score for each sample, and to correct the error in this discussion. I have found that I'm not the only person who's been confused here.
Best wishes and thanks for your ongoing support to this excellent software.