Weighted PCA


Mahmood Naderan

Apr 5, 2021, 4:36:06 AM
to factomin...@googlegroups.com
Hi
I couldn't find an option in the PCA/CA/FAMD functions for weighting individual data points.
Suppose that for a variable (V1), the value for the first individual (X) is 86%, the value for the second individual (Y) is 3%, and so on. I would like to assign a weight vector such as (0.55, 0.12, ...) to the individuals. Is there a way to do that?

Regards,
Mahmood



Meinhard H. Schroeder

Apr 5, 2021, 5:46:51 AM
to factomin...@googlegroups.com

You can use the row.w argument. In the example below there are 23 individuals, each with its own weight.

res <- PCA(Dataset.PCA, scale.unit = TRUE, ncp = 5, quali.sup = 94, graph = FALSE,
           row.w = c(0.033433509, 0.034560481, 0.023290759, 0.025169046, 0.016904583,
                     0.013899324, 0.196093163, 0.063110443, 0.046957175, 0.043576258,
                     0.043200601, 0.03756574, 0.16904583, 0.053719008, 0.030052592,
                     0.019158527, 0.017655898, 0.00864012, 0.033433509, 0.040570999,
                     0.024793388, 0.007513148, 0.017655898))

 

From the PCA help page: row.w is an optional vector of row weights (by default, a vector of 1s for uniform row weights); the weights are given only for the active individuals.

--
You are receiving this message because you are subscribed to the Google Groups "FactoMineR users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to factominer-use...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/factominer-users/CADa2P2UK03GHSX2%3DT8k9V5sDVDZUKN_zC0SJv22u6HwiEsJ_9w%40mail.gmail.com.

Mahmood Naderan

Apr 17, 2021, 5:14:27 PM
to factomin...@googlegroups.com
Hi again,
I found that the weights I assigned differ from the ones actually used in the analysis.
Consider the following commands:

> mydata
       V1 V2 V3
P1.K1 218 30 10
P2.K1 218 23 15
P2.K2  30 32 17
P2.K3   5 12 14
> res.pca <- PCA(mydata, row.w=c(1,0.1,0.88,0.02))

My intent is that P1.K1 has weight 1, because P1 has just one kernel (K1), while P2 has three kernels whose weights are 0.1, 0.88 and 0.02. However, after running the PCA, the stored weights are:

> res.pca$call$row.w
[1] 0.50 0.05 0.44 0.01

I don't know how these weights are calculated, but it seems that P1.K1 and P2.K1 have equal weights, which is incorrect.

Meinhard H. Schroeder

Apr 18, 2021, 6:36:47 AM
to factomin...@googlegroups.com

The weights are correct: 0.50+0.05+0.44+0.01 = 1

Sum of weights should be 1.

 

For example:

49% men -> weight = 0.49

51% women -> weight = 0.51

 

 

It seems that the program rescales the weights so that they sum to 1. The same result is obtained with weights that sum to 200:

 

> res.pca <- PCA(mydata, row.w=c(100,10,88,2))
> res.pca$call$row.w
[1] 0.50 0.05 0.44 0.01
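The rescaling described above can be reproduced by hand in base R. This is only a sketch of the arithmetic, assuming the weights are divided by their total; `normalize_weights` is a hypothetical helper written for this illustration, not a FactoMineR function.

```r
# Weights as passed in the two PCA() calls quoted in this thread.
w_small <- c(1, 0.1, 0.88, 0.02)
w_large <- c(100, 10, 88, 2)

# Hypothetical helper (not part of FactoMineR): divide each weight by the
# total so that the weights sum to 1.
normalize_weights <- function(w) w / sum(w)

n_small <- normalize_weights(w_small)
n_large <- normalize_weights(w_large)

# Both inputs give the same rescaled weights.
print(n_small)
print(n_large)

# Ratios between individuals are unchanged by the rescaling:
# each weight relative to the first one equals the original ratio.
print(n_small / n_small[1])
```

Since only the ratios between weights matter under this rescaling, c(1, 0.1, 0.88, 0.02) and c(100, 10, 88, 2) describe the same weighting.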

 


Mahmood Naderan

Apr 20, 2021, 6:57:32 AM
to factomin...@googlegroups.com
Hello,
I tried to use "row.w", but I am not sure whether I used it correctly.

> mydata
       V1 V2 V3   WG CTG1 CTG2
P1.K1 218 30 10 1.00    C    L
P2.K1 218 23 15 0.10    C    B
P2.K2  30 32 17 0.88    M    B
P2.K3   5 12 14 0.02    C    B
> res.famd1 <- FAMD(mydata)
> res.famd2 <- FAMD(mydata, row.w=mydata$WG)


The data consist of two programs. P1 has one kernel, so the weight of that kernel is 1. P2 contains three kernels with weights (0.1, 0.88, 0.02). So the sum of the WG column is not necessarily 1.

The problem is that with both commands I see WG in the graph of variables, so I doubt whether I have used the weights correctly.
Any thoughts?


Regards,
Mahmood
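One possible reason WG shows up in the graph of variables is that it is still a column of the data frame passed to FAMD, so it is analysed as an ordinary quantitative variable in addition to being passed as row.w. A minimal sketch, assuming that is the cause: keep the weights in a separate vector and drop the WG column before the call. The names `weights` and `active` are introduced here for illustration, and the FAMD call only runs if FactoMineR is installed.

```r
# Rebuild the example data from the post above.
mydata <- data.frame(
  V1   = c(218, 218, 30, 5),
  V2   = c(30, 23, 32, 12),
  V3   = c(10, 15, 17, 14),
  WG   = c(1.00, 0.10, 0.88, 0.02),
  CTG1 = c("C", "C", "M", "C"),
  CTG2 = c("L", "B", "B", "B"),
  row.names = c("P1.K1", "P2.K1", "P2.K2", "P2.K3"),
  stringsAsFactors = TRUE
)

# Keep the weights aside, then remove WG so it is not treated as a variable.
weights <- mydata$WG
active  <- mydata[, setdiff(names(mydata), "WG")]

# Run the weighted analysis only when FactoMineR is available.
if (requireNamespace("FactoMineR", quietly = TRUE)) {
  res.famd <- FactoMineR::FAMD(active, row.w = weights, graph = FALSE)
}
```

With this layout the weights influence the analysis only through row.w, and the variables graph should contain V1, V2, V3, CTG1 and CTG2 only.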





Mahmood Naderan

Apr 20, 2021, 6:57:32 AM
to factomin...@googlegroups.com
Yes, I know that the weights finally sum to 1. However, as you said, the program has rescaled the weights, which will bias the results.

What I am looking for is a PCA on grouped data. In the previous example, the baseline PCA considers 4 individuals, but I would like to treat them as two groups: one group with one individual and one group with three individuals.

Is there any chance to do that with FactoMineR?

Regards, 
Mahmood
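As a rough check on the grouping concern, the effect of the rescaling on each program can be computed in base R. This sketch does not run a grouped analysis; it only assumes the weights are divided by their total and that the program name is the part of the row name before the dot. The variable names are introduced here for illustration.

```r
# Original weights, named by individual (program.kernel).
w <- c(P1.K1 = 1, P2.K1 = 0.1, P2.K2 = 0.88, P2.K3 = 0.02)

# Program label: everything before the first dot in the name.
program <- sub("\\..*$", "", names(w))

# Rescale so that the weights sum to 1.
w_norm <- w / sum(w)

# Total rescaled weight carried by each program.
group_total <- tapply(w_norm, program, sum)
print(group_total)

# Share of each kernel within its own program after rescaling.
within_share <- w_norm / as.numeric(group_total[program])
print(within_share)
```

Under these assumptions each program carries half of the total weight, and the kernel shares within P2 remain 0.1, 0.88 and 0.02, so the rescaling keeps the two programs balanced against each other.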



