Projection (Out-of-sample) accuracy

Louis H

Oct 19, 2025, 11:32:23 AM
to flashpca-users
Hi, 

Does anyone know what formula flashpca uses for out-of-sample projection? I get very low accuracy. This is my protocol:
- Take the 1000G project (2504 individuals) and perform PCA on all of them.
- Single out 1 individual and perform PCA on the 2503 remaining individuals.
- Check that the two PC spaces match (they do, perfectly).
- Project the singled-out individual onto the 2503-individual reference PCA.
- Compare its position when it was included in the PCA to its position when it was projected.
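
To make the protocol concrete, here is a minimal NumPy sketch of the same leave-one-out comparison. It assumes plain SVD-based PCA on per-variant standardized genotypes, with projection done by multiplying the standardized sample by the reference loadings. I don't know whether this matches what flashpca does internally, and the function names (`pca_fit`, `project`, `leave_one_out_shift`) are my own, not flashpca's.

```python
import numpy as np

def pca_fit(X, k):
    """SVD-based PCA on a (samples x variants) matrix.

    Assumption: each variant is centered and scaled by its own
    mean and standard deviation (not necessarily flashpca's scaling).
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                     # avoid dividing by zero on monomorphic variants
    Z = (X - mean) / std
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:k].T                     # (variants, k)
    scores = U[:, :k] * S[:k]               # (samples, k)
    return mean, std, loadings, scores

def project(x, mean, std, loadings):
    """Project one sample using the reference mean, std, and loadings."""
    return ((x - mean) / std) @ loadings

def leave_one_out_shift(X, idx, k=2):
    """Return (position when included, position when projected) for sample idx."""
    _, _, V_full, scores_full = pca_fit(X, k)
    ref = np.delete(X, idx, axis=0)
    mean, std, V_ref, _ = pca_fit(ref, k)
    # PCs are only defined up to sign, so align the reference axes to the full PCA.
    signs = np.sign(np.sum(V_full * V_ref, axis=0))
    signs[signs == 0] = 1.0
    projected = project(X[idx], mean, std, V_ref) * signs
    return scores_full[idx], projected
```

Calling `leave_one_out_shift(genotypes, idx)` on a samples-by-variants genotype matrix returns the two positions I am comparing in the plot, so the size of the shift can be checked independently of flashpca.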

This gives me two plots:
[2504 indiv. full PCA] + [2503 ref. PCA + 1 indiv. projection]
Because the effect of 1 individual out of 2504 is negligible, the reference points line up perfectly, and I can overlay the two plots to compare the positions of the singled-out individual (before and after).

I have attached the plot. Regular points are in black; the individual (NA12762) is in red when it was part of the reference and in cyan when it was projected.

The difference looks significant to me for a modern, high-quality individual (1000G on GRCh38).

Attachment: oosflashpcaissue.png