How is row.w used in PCA?


Mahmood Naderan

Apr 22, 2021, 7:11:08 AM
to factomin...@googlegroups.com
Hi
I have a question about how row.w is applied in the PCA algorithm used in FactoMineR. I ran a test and the result confused me.
Consider the following data:

> mydata1
         V1    V2    V3
P1.K1 218.0 30.00 10.00
P2.K1  21.8  2.30  1.50
P2.K2  26.4 28.16 14.96
P2.K3   0.1  0.24  0.28
> mydata2
          V1    V2   V3
P1.K1 109.00 15.00 5.00
P2.K1  10.90  1.15 0.75
P2.K2  13.20 14.08 7.48
P2.K3   0.05  0.12 0.14
> res.pca <- PCA(mydata1)
> res.pca <- PCA(mydata2)


Since the values in mydata2 are half of those in mydata1, and the individuals all have the same weight, the PCA results are identical. That is fine.
Now consider the following case, where I used a different mydata1 together with a weight vector:
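
The invariance described above can be checked without FactoMineR. The following is a minimal base-R sketch, assuming the default behaviour of PCA() (variables centered and scaled to unit variance); it uses scale() and prcomp() as stand-ins, so it illustrates the idea rather than reproducing FactoMineR's own computation.

```r
# First mydata1 from the post, and mydata2 = mydata1 / 2
mydata1 <- matrix(c(218.0, 30.00, 10.00,
                     21.8,  2.30,  1.50,
                     26.4, 28.16, 14.96,
                      0.1,  0.24,  0.28),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(c("P1.K1", "P2.K1", "P2.K2", "P2.K3"),
                                  c("V1", "V2", "V3")))
mydata2 <- mydata1 / 2

# Standardizing removes any global scale factor, so both tables
# give the same standardized values (scale() divides by the n-1
# standard deviation; the constant factor does not matter here).
same_standardized <- isTRUE(all.equal(scale(mydata1), scale(mydata2),
                                      check.attributes = FALSE))

# Consequently the component standard deviations are the same too.
sdev1 <- prcomp(mydata1, scale. = TRUE)$sdev
sdev2 <- prcomp(mydata2, scale. = TRUE)$sdev
same_sdev <- isTRUE(all.equal(sdev1, sdev2))

print(c(same_standardized = same_standardized, same_sdev = same_sdev))
```

Both checks compare the two data sets only after standardization, which is why dividing every value by 2 leaves the result unchanged.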

> mydata1
       V1 V2 V3
P1.K1 218 30 10
P2.K1 218 23 15
P2.K2  30 32 17
P2.K3   5 12 14
> res.pca <- PCA(mydata1, row.w=c(1,0.1,0.88,0.02))
> res.pca$call$row.w
[1] 0.50 0.05 0.44 0.01

So I expect the algorithm to use a weight of 0.5 for row P1.K1, 0.05 for P2.K1, and so on.
Then I applied the weights manually, multiplying each row by its weight, to create mydata4.

> mydata4
          V1    V2   V3
P1.K1 109.00 15.00 5.00
P2.K1  10.90  1.15 0.75
P2.K2  13.20 14.08 7.48
P2.K3   0.05  0.12 0.14

For example, the value of V1 for P2.K3 is 0.05, which is 5*0.01 from the previous case.
I think it is now safe to use a uniform weight vector for mydata4 because I have already applied the weight.
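
For reference, the construction of mydata4 described above can be reproduced in base R as follows. This is only a sketch of the arithmetic in the post; the variable names w_raw and w are illustrative, and the normalization of the weights to sum to 1 is inferred from the res.pca$call$row.w output shown earlier.

```r
# Second mydata1 from the post (the one used with row.w)
mydata1 <- matrix(c(218, 30, 10,
                    218, 23, 15,
                     30, 32, 17,
                      5, 12, 14),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(c("P1.K1", "P2.K1", "P2.K2", "P2.K3"),
                                  c("V1", "V2", "V3")))

w_raw <- c(1, 0.1, 0.88, 0.02)   # weights passed as row.w
w <- w_raw / sum(w_raw)          # normalized weights: 0.50 0.05 0.44 0.01

# A length-4 vector is recycled down the columns of a 4-row matrix,
# so row i of mydata1 is multiplied by w[i].
mydata4 <- mydata1 * w
print(mydata4)
```

This only rescales the rows of the table. It does not by itself say anything about how PCA uses the weights.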

> res.pca <- PCA(mydata4)

But the graph for mydata4 is different from the one for mydata1 with weights.
Can someone explain what causes this difference?


Regards,
Mahmood



Francois Husson

Apr 24, 2021, 2:33:30 AM
to factomin...@googlegroups.com
Hi,

The weights in PCA can be understood as follows: a PCA in which an individual has a weight of 2 gives the same dimensions as a PCA in which that individual is duplicated. But this is not the same as multiplying all the data by 2.
You should see the videos available here to better understand PCA: https://husson.github.io/MOOC.html#AnaDoGB
And if you want to understand the program, you can see the function and the lines of code in R.
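
The duplication interpretation can be illustrated with a small base-R sketch. The helper weighted_pca_eig() below is hypothetical and is not taken from the FactoMineR source; it assumes the usual definition of a weighted, standardized PCA (weighted means, weighted standard deviations, eigenvalues of the weighted correlation matrix).

```r
# Hypothetical helper: eigenvalues of a weighted, standardized PCA
weighted_pca_eig <- function(X, w) {
  w   <- w / sum(w)                  # normalize weights to sum to 1
  ctr <- colSums(X * w)              # weighted column means
  Xc  <- sweep(X, 2, ctr)            # center each column
  s   <- sqrt(colSums(Xc^2 * w))     # weighted standard deviations
  Z   <- sweep(Xc, 2, s, "/")        # standardize each column
  C   <- t(Z) %*% (Z * w)            # weighted correlation matrix
  eigen(C, symmetric = TRUE)$values
}

X <- matrix(c(218, 30, 10,
              218, 23, 15,
               30, 32, 17,
                5, 12, 14),
            nrow = 4, byrow = TRUE)

# Weight 2 on the first individual versus duplicating that individual
eig_weight <- weighted_pca_eig(X, c(2, 1, 1, 1))
eig_dup    <- weighted_pca_eig(rbind(X[1, , drop = FALSE], X), rep(1, 5))

# Case from the question: weights in the PCA versus rows multiplied
# by the weights and then analysed with uniform weights
w_norm   <- c(1, 0.1, 0.88, 0.02) / 2
eig_w    <- weighted_pca_eig(X, w_norm)
eig_mult <- weighted_pca_eig(X * w_norm, rep(1, 4))

print(isTRUE(all.equal(eig_weight, eig_dup)))  # weighting matches duplication
print(isTRUE(all.equal(eig_w, eig_mult)))      # multiplying rows does not match
```

In this sketch the weights enter the means, the standard deviations and the cross-products, whereas multiplying a row by its weight changes the coordinates of that individual. These are different operations, which is consistent with the two graphs in the question being different.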

Best
FH
--
Francois Husson
Department Statistics & Computer science
L'Institut Agro - AGROCAMPUS OUEST
65 rue de St-Brieuc - 35042 RENNES
Tel: +33 2 23 48 58 86
https://husson.github.io