Specifying columns for PCA


Mahmood Naderan

Feb 22, 2021, 12:23:14 PM
to factomin...@googlegroups.com
Hi,
I use the following code and it works properly:

> library(FactoMineR)
> mydata <- read.csv('test.csv', header=T,row.names=1)
> head(mydata)
     V1  V2   V3  V4
P1 73.6 0.7 74.6 3.1
P2 75.2 0.7 75.8 2.8
P3  6.5 0.0  7.3 2.5
P4 41.4 0.3 39.2 8.9
P5  5.4 0.1 18.2 1.1
P6 18.8 0.3 30.3 7.3
> pca.res<-PCA(mydata,graph=F,scale.unit=T)
>  plot(pca.res)


I would like to know how I can specify a custom set of columns instead of all of V1 to V4.
For example, I want to use V1, V3 and V4 in one run, and then V1, V2 and V4 in a further run.

Regards,
Mahmood



lepape...@neuf.fr

Feb 22, 2021, 2:00:52 PM
to factomin...@googlegroups.com

Index the data frame with the numbers of the columns you want:

# Without V3

> pca.res <- PCA(mydata[, c(1, 2, 4)], graph = FALSE, scale.unit = TRUE)

Gilles

 

From: factomin...@googlegroups.com <factomin...@googlegroups.com> On behalf of Mahmood Naderan
Sent: Monday, February 22, 2021 18:23
To: factomin...@googlegroups.com
Subject: Specifying columns for PCA

--
You are receiving this message because you are subscribed to the Google Groups "FactoMineR users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to factominer-use...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/factominer-users/CADa2P2Ub8i20f2hjoacPwGH%3D_wkLaGQnWEkdGPDjsk3xjSnjLQ%40mail.gmail.com.
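
A side note on the answer above: the same subsetting can be done with column names instead of positions, which keeps working if the column order in the CSV changes. The sketch below is only an illustration. The data frame is retyped from the first message rather than read from test.csv, the names run1, run2 and run3 are made up for this example, and PCA() is only called if FactoMineR is installed.

```r
# Example data retyped from the first message (stands in for test.csv).
mydata <- data.frame(
  V1 = c(73.6, 75.2, 6.5, 41.4, 5.4, 18.8),
  V2 = c(0.7, 0.7, 0.0, 0.3, 0.1, 0.3),
  V3 = c(74.6, 75.8, 7.3, 39.2, 18.2, 30.3),
  V4 = c(3.1, 2.8, 2.5, 8.9, 1.1, 7.3),
  row.names = paste0("P", 1:6)
)

# Select the columns for each run by name.
run1 <- mydata[, c("V1", "V3", "V4")]
run2 <- mydata[, c("V1", "V2", "V4")]

# Alternative: keep everything except one named column.
run3 <- mydata[, setdiff(names(mydata), "V3")]

# Run the PCA on each subset (skipped if FactoMineR is not installed).
if (requireNamespace("FactoMineR", quietly = TRUE)) {
  pca1 <- FactoMineR::PCA(run1, graph = FALSE, scale.unit = TRUE)
  pca2 <- FactoMineR::PCA(run2, graph = FALSE, scale.unit = TRUE)
}
```

Here run2 and run3 hold the same three columns, so either form can be used for the "without V3" run.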

Mahmood Naderan

Feb 22, 2021, 5:13:32 PM
to factomin...@googlegroups.com
Thanks, that works for that data. However, after I changed the structure of the data, I could not apply the same idea:

> library(FactoMineR)
> mydata <- read.csv('test.csv', header=T,row.names=1)
> mydata$CTG <- as.factor(mydata$CTG)
> head(mydata)
     V1  V2   V3  V4 CTG
P1 73.6 0.7 74.6 3.1   1
P2 75.2 0.7 75.8 2.8   1
P3  6.5 0.0  7.3 2.5   2
P4 41.4 0.3 39.2 8.9   2
P5  5.4 0.1 18.2 1.1   2
P6 18.8 0.3 30.3 7.3   3
> pca.res<-PCA(mydata[,c(1,2,4)],graph=T,scale.unit=T,quali.sup=5)
Error in `[.data.frame`(Xtot, , quali.sup, drop = FALSE) :
  undefined columns selected


With c(1,2,4) I expect V1, V2 and V4 to be used as active variables, with the fifth column (CTG) as a supplementary qualitative variable.

Regards,
Mahmood
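
The error message above is consistent with quali.sup being read as a column position inside the data frame that is passed to PCA(), not inside the original mydata. After mydata[, c(1,2,4)] only three columns remain and CTG is no longer among them, so position 5 does not exist. The base-R sketch below reproduces the same kind of failure without FactoMineR. The data are retyped from the post, and sub and err are names invented for the illustration.

```r
# Example data retyped from the post, with CTG stored as a factor.
mydata <- data.frame(
  V1 = c(73.6, 75.2, 6.5, 41.4, 5.4, 18.8),
  V2 = c(0.7, 0.7, 0.0, 0.3, 0.1, 0.3),
  V3 = c(74.6, 75.8, 7.3, 39.2, 18.2, 30.3),
  V4 = c(3.1, 2.8, 2.5, 8.9, 1.1, 7.3),
  CTG = factor(c(1, 1, 2, 2, 2, 3)),
  row.names = paste0("P", 1:6)
)

# The subset handed to PCA() in the failing call.
sub <- mydata[, c(1, 2, 4)]
ncol(sub)             # 3
"CTG" %in% names(sub) # FALSE: the qualitative column was dropped

# Asking for column 5 of a three-column data frame raises an error;
# tryCatch() captures its message instead of stopping the script.
err <- tryCatch(
  sub[, 5, drop = FALSE],
  error = function(e) conditionMessage(e)
)
print(err)
```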





Mahmood Naderan

Mar 7, 2021, 4:14:47 AM
to factomin...@googlegroups.com
Hi again,
The previous problem still exists. When I use quali.sup, I am not able to specify which columns are used in the PCA. How can I fix that?


> library(FactoMineR)
> mydata <- read.csv('test.csv', header=T,row.names=1)
> mydata
     V1  V2   V3  V4 CTG
P1 73.6 0.7 74.6 3.1  A1
P2 75.2 0.7 75.8 2.8  B1
P3  6.5 0.0  7.3 2.5  B1
P4 41.4 0.3 39.2 8.9  C1
P5  5.4 0.1 18.2 1.1  A1
P6 18.8 0.3 30.3 7.3  C1
> res.pca <- PCA(mydata[,c(1,2,4)])
> res.pca <- PCA(mydata[,c(1,2,4)],quali.sup=5)

Error in `[.data.frame`(Xtot, , quali.sup, drop = FALSE) :
  undefined columns selected


Regards,
Mahmood




Mahmood Naderan

Mar 7, 2021, 7:26:17 AM
to factomin...@googlegroups.com
Unfortunately that doesn't work either.

> mydata
     V1  V2   V3  V4 CTG
P1 73.6 0.7 74.6 3.1  A1
P2 75.2 0.7 75.8 2.8  B1
P3  6.5 0.0  7.3 2.5  B1
P4 41.4 0.3 39.2 8.9  C1
P5  5.4 0.1 18.2 1.1  A1
P6 18.8 0.3 30.3 7.3  C1
> res.pca <- PCA(mydata[,c(1,2,4,5)],quali.sup=5)
Error in PCA(mydata[, c(1, 2, 4, 5)], quali.sup = 5) :
The following variables are not quantitative:  CTG

Regards,
Mahmood





On Sun, Mar 7, 2021 at 1:23 PM J.C. Deroubaix <jc.der...@skynet.be> wrote:
Try res.pca <- PCA(mydata[,c(1,2,4,5)],quali.sup=5)
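
A possible reading of the second error: in mydata[, c(1,2,4,5)] the CTG column is the fourth column, not the fifth, so quali.sup=5 points at nothing and CTG is treated as an active variable. The sketch below assumes quali.sup is a position within the subset and looks that position up by name. The data are retyped from the post with CTG as a factor, and sub and ctg_pos are names chosen for the illustration.

```r
# Example data retyped from the post, with CTG stored as a factor.
mydata <- data.frame(
  V1 = c(73.6, 75.2, 6.5, 41.4, 5.4, 18.8),
  V2 = c(0.7, 0.7, 0.0, 0.3, 0.1, 0.3),
  V3 = c(74.6, 75.8, 7.3, 39.2, 18.2, 30.3),
  V4 = c(3.1, 2.8, 2.5, 8.9, 1.1, 7.3),
  CTG = factor(c("A1", "B1", "B1", "C1", "A1", "C1")),
  row.names = paste0("P", 1:6)
)

# Keep V1, V2, V4 and CTG; V3 is left out entirely.
sub <- mydata[, c(1, 2, 4, 5)]
names(sub)  # "V1" "V2" "V4" "CTG"

# Position of CTG inside the subset, found by name rather than hard-coded.
ctg_pos <- which(names(sub) == "CTG")

# Run the PCA with CTG as supplementary (skipped if FactoMineR is not installed).
if (requireNamespace("FactoMineR", quietly = TRUE)) {
  res.pca <- FactoMineR::PCA(sub, quali.sup = ctg_pos, graph = FALSE)
}
```

With this subset ctg_pos is 4, so the call is equivalent to PCA(mydata[,c(1,2,4,5)], quali.sup=4).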

J.C. Deroubaix

Mar 7, 2021, 8:24:16 AM
to factomin...@googlegroups.com
Try
 res.pca <- PCA(mydata[,c(1,2,3,4,5)],quanti.sup=3,quali.sup=5)

or 
res.pca <- PCA(mydata,quanti.sup=3,quali.sup=5)
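
For context on this suggestion: in FactoMineR, supplementary variables are projected onto the result but are not used to build the dimensions. One way to check this on the example data is to compare the eigenvalues of the quanti.sup run with a run where V3 is dropped altogether. The sketch below is a hedged illustration, not output from the thread. The data are retyped from the post, with_sup, dropped and same_axes are invented names, and the comparison only runs if FactoMineR is installed.

```r
# Example data retyped from the post, with CTG stored as a factor.
mydata <- data.frame(
  V1 = c(73.6, 75.2, 6.5, 41.4, 5.4, 18.8),
  V2 = c(0.7, 0.7, 0.0, 0.3, 0.1, 0.3),
  V3 = c(74.6, 75.8, 7.3, 39.2, 18.2, 30.3),
  V4 = c(3.1, 2.8, 2.5, 8.9, 1.1, 7.3),
  CTG = factor(c("A1", "B1", "B1", "C1", "A1", "C1")),
  row.names = paste0("P", 1:6)
)

if (requireNamespace("FactoMineR", quietly = TRUE)) {
  # V3 kept in the data but declared supplementary.
  with_sup <- FactoMineR::PCA(mydata, quanti.sup = 3, quali.sup = 5,
                              graph = FALSE)
  # V3 removed before the call; CTG is then column 4 of the subset.
  dropped <- FactoMineR::PCA(mydata[, c(1, 2, 4, 5)], quali.sup = 4,
                             graph = FALSE)
  # If V3 does not influence the axes, both eigenvalue tables agree.
  same_axes <- isTRUE(all.equal(with_sup$eig, dropped$eig))
  print(same_axes)
}
```

If the two tables agree, the only difference between the runs is that V3 also appears in the output as a projected supplementary variable.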

Mahmood Naderan

Mar 7, 2021, 9:04:35 AM
to factomin...@googlegroups.com
The commands you mentioned include the 3rd column in the PCA calculation.
My problem is that I want to exclude the 3rd column while using columns 1, 2 and 4 as numeric variables and column 5 as a qualitative variable.

If there is no way to do that, I will have to create another CSV file without the 3rd column and rerun the analysis. That is easy for this test file, but for my original dataset I would have to create multiple CSV files, each excluding different columns. It would be great if PCA() had some facility to include or exclude specific quantitative and qualitative variables.

Regards,
Mahmood
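
One way to avoid writing a separate CSV file per run is to build each subset in memory and work out the quali.sup position from the column names. The helper below, pca_columns(), is hypothetical and not part of FactoMineR. It is a minimal sketch that assumes quali.sup is a position within the data frame passed to PCA(), and it uses example data retyped from the post.

```r
# Example data retyped from the post, with CTG stored as a factor.
mydata <- data.frame(
  V1 = c(73.6, 75.2, 6.5, 41.4, 5.4, 18.8),
  V2 = c(0.7, 0.7, 0.0, 0.3, 0.1, 0.3),
  V3 = c(74.6, 75.8, 7.3, 39.2, 18.2, 30.3),
  V4 = c(3.1, 2.8, 2.5, 8.9, 1.1, 7.3),
  CTG = factor(c("A1", "B1", "B1", "C1", "A1", "C1")),
  row.names = paste0("P", 1:6)
)

# Hypothetical helper: subset by name and return the matching
# quali.sup positions inside that subset (NULL if there are none).
pca_columns <- function(data, quanti, quali = character(0)) {
  sub <- data[, c(quanti, quali), drop = FALSE]
  pos <- if (length(quali) > 0) match(quali, names(sub)) else NULL
  list(data = sub, quali.sup = pos)
}

# Several runs, each with a different set of active columns.
column_sets <- list(c("V1", "V2", "V4"), c("V1", "V3", "V4"))
inputs <- lapply(column_sets, function(q) pca_columns(mydata, q, "CTG"))

# One PCA per column set (skipped if FactoMineR is not installed).
if (requireNamespace("FactoMineR", quietly = TRUE)) {
  results <- lapply(inputs, function(x) {
    FactoMineR::PCA(x$data, quali.sup = x$quali.sup, graph = FALSE)
  })
}
```

The same list of column sets can be extended for the full dataset, so the original file is read once and never has to be edited.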




