Using MCA with a test dataset


Mahmood Naderan

Feb 26, 2021, 6:53:07 AM
to factomin...@googlegroups.com
Hi,
I would like to test MCA with the following data. As you can see, the fifth column is a categorical variable.

> library(FactoMineR)
> mydata <- read.csv('test.csv', header=T,row.names=1)
> head(mydata)
     V1  V2   V3  V4 CTG
P1 73.6 0.7 74.6 3.1   A
P2 75.2 0.7 75.8 2.8   B
P3  6.5 0.0  7.3 2.5   B
P4 41.4 0.3 39.2 8.9   C
P5  5.4 0.1 18.2 1.1   A
P6 18.8 0.3 30.3 7.3   C

However, the following commands fail:

> res.mca = MCA(mydata, quanti.sup=c(1,2,3,4), quali.sup=5)
Error in which(unlist(lapply(listModa, is.numeric))) :
  argument to 'which' is not logical
> res.mca = MCA(mydata, quali.sup=5)
Error in which(unlist(lapply(listModa, is.numeric))) :
  argument to 'which' is not logical


The only example I can find is the tea one, and I don't know how to fix this.
Any suggestions?

Regards,
Mahmood



J.C. Deroubaix

Feb 26, 2021, 8:01:07 AM
to factomin...@googlegroups.com


As you can read in the documentation below, multiple correspondence analysis needs CATEGORICAL variables as active variables. In your example, you have 4 quantitative variables. For your data, you have to use PCA, not CA or MCA.

Usage

MCA(X, ncp = 5, ind.sup = NULL, quanti.sup = NULL, 
    quali.sup = NULL, excl=NULL, graph = TRUE, 
	level.ventil = 0, axes = c(1,2), row.w = NULL, 
	method="Indicator", na.method="NA", tab.disj=NULL)

Arguments

X

a data frame with n rows (individuals) and p columns (categorical variables)
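
As a rough sketch of the PCA route for this dataset: the four quantitative columns stay active and CTG is declared as a supplementary qualitative variable. The data frame below is retyped from the head(mydata) output in the question, so it is an assumption that the full test.csv looks the same.

```r
library(FactoMineR)

# Toy data retyped from the head(mydata) output in the question
mydata <- data.frame(
  V1  = c(73.6, 75.2, 6.5, 41.4, 5.4, 18.8),
  V2  = c(0.7, 0.7, 0.0, 0.3, 0.1, 0.3),
  V3  = c(74.6, 75.8, 7.3, 39.2, 18.2, 30.3),
  V4  = c(3.1, 2.8, 2.5, 8.9, 1.1, 7.3),
  CTG = factor(c("A", "B", "B", "C", "A", "C")),
  row.names = paste0("P", 1:6)
)

# V1..V4 are the active variables; CTG (column 5) is only projected
# onto the axes and does not influence their construction
res.pca <- PCA(mydata, quali.sup = 5, graph = FALSE)

# Coordinates of the categories A, B and C on the principal axes
res.pca$quali.sup$coord
```

The categories then appear on the individuals map as supplementary points, which is one way to compare A, B and C without running an MCA.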


Best regards, 
Jean-Claude Deroubaix


--
You are receiving this message because you are subscribed to the Google Groups "FactoMineR users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to factominer-use...@googlegroups.com.
This discussion can be read on the web at https://groups.google.com/d/msgid/factominer-users/CADa2P2XaJEAS_Ff39Py4MR6q2WxJ6SSi3SQNJ4zuQCPewfdUdg%40mail.gmail.com.

Mahmood Naderan

Feb 26, 2021, 8:10:38 AM
to factomin...@googlegroups.com
I am shaping the data so that I can use MCA, so please correct me if I am wrong.
Each row (individual) has some continuous columns (variables), V1~V4, and one non-continuous column, CTG.

In the example, it is stated that
 >>First, we want to see the relationship between variables and the associations between categories. Two categories are close to each other if they are often taken together.<<

So, I expect that "relationship between variables" is a PCA on variables V1~V4 and "associations between categories" means taking CTG into account in the analysis. Isn't that right?
Some individuals are in category A, while others are in B and C. So, I want to check the relationship between categories.


Regards,
Mahmood





J.C. Deroubaix

Feb 26, 2021, 10:12:04 AM
to factomin...@googlegroups.com

On Feb 26, 2021, at 14:10, Mahmood Naderan <mahmo...@gmail.com> wrote:

So, I expect that "relationship between variables" is a PCA on variables V1~V4 and "associations between categories" means taking CTG into account in the analysis. Isn't that right?
Some individuals are in category A, while others are in B and C. So, I want to check the relationship between categories.



OK

Mahmood Naderan

Feb 26, 2021, 11:12:50 AM
to factomin...@googlegroups.com
Thanks for the confirmation.
Then, do you know why that error is shown?  :)


> head(mydata)
     V1  V2   V3  V4 CTG
P1 73.6 0.7 74.6 3.1   A
P2 75.2 0.7 75.8 2.8   B
P3  6.5 0.0  7.3 2.5   B
P4 41.4 0.3 39.2 8.9   C
P5  5.4 0.1 18.2 1.1   A
P6 18.8 0.3 30.3 7.3   C
> res.mca = MCA(mydata, quanti.sup=c(1,2,3,4), quali.sup=5)
Error in which(unlist(lapply(listModa, is.numeric))) :
  argument to 'which' is not logical
> res.mca = MCA(mydata, quali.sup=5)
Error in which(unlist(lapply(listModa, is.numeric))) :
  argument to 'which' is not logical

How can I specify the continuous variables with quanti.sup and categories with quali.sup?

Regards,
Mahmood






J.C. Deroubaix

Feb 26, 2021, 1:28:09 PM
to factomin...@googlegroups.com
Replace MCA with PCA.

Mahmood Naderan

Feb 26, 2021, 2:14:46 PM
to factomin...@googlegroups.com
That is getting complicated...
If I use PCA, then I am not able to include the categories. Is that right? 

What should I do with the test data in order to use MCA? I am trying to create a data file similar to the tea example.

Mahmood Naderan

Feb 26, 2021, 3:14:53 PM
to factomin...@googlegroups.com
I tried to specify V1~V4 as quanti.sup, as shown below, but I still get the same error.


> mydata <- read.csv('test.csv', header=T,row.names=1)
> res.mca = MCA(mydata, quanti.sup=1:4, quali.sup=5)

Error in which(unlist(lapply(listModa, is.numeric))) :
  argument to 'which' is not logical
> head(mydata)
     V1  V2   V3  V4 CTG
P1 73.6 0.7 74.6 3.1   A
P2 75.2 0.7 75.8 2.8   B
P3  6.5 0.0  7.3 2.5   B
P4 41.4 0.3 39.2 8.9   C
P5  5.4 0.1 18.2 1.1   A
P6 18.8 0.3 30.3 7.3   C


I really wonder what my data file should look like in order to use MCA.

Regards,
Mahmood




Mahmood Naderan

Feb 27, 2021, 4:39:51 PM
to factomin...@googlegroups.com
Is there any example other than tea for using MCA?

Regards,
Mahmood




Meinhard H. Schroeder

Feb 28, 2021, 3:01:34 AM
to factomin...@googlegroups.com

Here is an example:

 

http://www.sthda.com/english/articles/22-principal-component-methods-videos/71-mca-in-r-using-factominer-quick-scripts-and-videos/

 

Please note: multiple correspondence analysis (MCA) is a data analysis technique for nominal categorical data. Your dataset has only one categorical variable.
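
A quick way to check this before calling MCA is to count the categorical columns. This is only a sketch in base R, with the data frame retyped from the head(mydata) output earlier in the thread (an assumption about the real file):

```r
# Toy data retyped from the head(mydata) output earlier in the thread
mydata <- data.frame(
  V1  = c(73.6, 75.2, 6.5, 41.4, 5.4, 18.8),
  V2  = c(0.7, 0.7, 0.0, 0.3, 0.1, 0.3),
  V3  = c(74.6, 75.8, 7.3, 39.2, 18.2, 30.3),
  V4  = c(3.1, 2.8, 2.5, 8.9, 1.1, 7.3),
  CTG = factor(c("A", "B", "B", "C", "A", "C")),
  row.names = paste0("P", 1:6)
)

# TRUE for columns stored as factor or character, i.e. candidates
# for active variables in an MCA
is_categorical <- sapply(mydata, function(col) is.factor(col) || is.character(col))

names(mydata)[is_categorical]  # names of the categorical columns
sum(is_categorical)            # how many there are
```

Here only CTG qualifies, which is why there is nothing for MCA to relate it to.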

Womble

Feb 28, 2021, 5:29:57 AM
to FactoMineR users
A good archaeological example was provided by Engelstad (1988) in the volume Multivariate Archaeology, edited by Madsen. It was discussed by Baxter (1994) in his volume Exploratory Multivariate Analysis in Archaeology. I have written a guide on how to perform the analyses in Baxter (1994) in R, and provided the data in an Excel spreadsheet. See my Academia pages: https://www.academia.edu/43034760/A_companion_to_Exploratory_Multivariate_Analysis_in_Archaeology_using_R and https://www.academia.edu/43034824/Baxter_data

Mahmood Naderan

Feb 28, 2021, 7:28:54 AM
to factomin...@googlegroups.com
Hi
@Meinhard: Where can I find the source file of the poison dataset so I can see its structure?
When I check ~/R/x86_64-pc-linux-gnu-library/3.6/FactoMineR/data
I see two non-text files: poison.rda and poison.text.rda.




Regards,
Mahmood




Mahmood Naderan

Feb 28, 2021, 7:54:51 AM
to factomin...@googlegroups.com
Hi Womble,
From a quick look at the links you mentioned, it seems that MCA needs at least two categorical columns.
In the tea example, there is one numerical variable (age) and the rest are categorical. So, I tried a data file with one numerical column and two categorical columns, as shown below, but as you can see I still get an error.

> library(FactoMineR)
> mydata <- read.csv('test1.csv', header=T,row.names=1)
> head(mydata)
     V1 CTG_1 CTG_2
P1 73.6    A1    X2
P2 75.2    B1    X2
P3  6.5    B1    Z2
P4 41.4    C1    Y2
P5  5.4    A1    Y2
P6 18.8    C1    Z2
> res.mca = MCA(mydata, quanti.sup=1, quali.sup=2:3)

Error in which(unlist(lapply(listModa, is.numeric))) :
  argument to 'which' is not logical

May I know what is the difference between this test data and the one in the tea example?
Why does the tea example work but my test data doesn't work?

> data(tea)
> res.mca=MCA(tea,quanti.sup=19,quali.sup=20:36)

Regards,
Mahmood




Kris Lockyear

Feb 28, 2021, 8:01:05 AM
to factomin...@googlegroups.com
You have defined all three variables as supplementary data, so there is no data left to analyse.
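
To make this concrete, here is a small base R sketch. The helper active_columns() is hypothetical (it is not part of FactoMineR); it just lists the columns that are left as active once the supplementary ones are removed. The data frame is retyped from the test1.csv output above.

```r
# Hypothetical helper, not a FactoMineR function: returns the indices of
# the columns that are NOT declared supplementary, i.e. the active ones
active_columns <- function(data, quanti.sup = NULL, quali.sup = NULL) {
  setdiff(seq_len(ncol(data)), c(quanti.sup, quali.sup))
}

# Toy data retyped from the head(mydata) output for test1.csv
test1 <- data.frame(
  V1    = c(73.6, 75.2, 6.5, 41.4, 5.4, 18.8),
  CTG_1 = factor(c("A1", "B1", "B1", "C1", "A1", "C1")),
  CTG_2 = factor(c("X2", "X2", "Z2", "Y2", "Y2", "Z2")),
  row.names = paste0("P", 1:6)
)

# Same arguments as the failing MCA call: nothing is left as active
active_columns(test1, quanti.sup = 1, quali.sup = 2:3)
```

The same computation on the tea call (quanti.sup=19, quali.sup=20:36 on 36 columns) leaves columns 1:18 as active, which is why that example runs.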

K.




Mahmood Naderan

Feb 28, 2021, 8:36:50 AM
to factomin...@googlegroups.com
Kris,
Do you mean that in the tea example, columns 1:18 are also used while 19 and 20:36 are explicitly specified as quanti.sup and quali.sup?

> data(tea)
> res.mca=MCA(tea,quanti.sup=19,quali.sup=20:36)

Then, I have another question. Must the data being analyzed (columns 1:18) be numerical, or categorical?
Based on the tea example, I expect that it must be qualitative only.


Regards,
Mahmood




J.C. Deroubaix

Feb 28, 2021, 11:40:56 AM
to factomin...@googlegroups.com

You need 3 or more categorical columns, and you must declare your quantitative variables as supplementary.
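
A minimal sketch of a data layout that follows this advice: three categorical columns left as active and the single quantitative column declared with quanti.sup. The first three columns are retyped from the test1.csv output above; the CTG_3 column and its values are made up purely for illustration.

```r
library(FactoMineR)

# V1, CTG_1 and CTG_2 come from the test1.csv output in the thread;
# CTG_3 is a hypothetical third categorical column added for illustration
mydata <- data.frame(
  V1    = c(73.6, 75.2, 6.5, 41.4, 5.4, 18.8),
  CTG_1 = factor(c("A1", "B1", "B1", "C1", "A1", "C1")),
  CTG_2 = factor(c("X2", "X2", "Z2", "Y2", "Y2", "Z2")),
  CTG_3 = factor(c("high", "high", "low", "low", "low", "high")),
  row.names = paste0("P", 1:6)
)

# Columns 2:4 stay active (no quali.sup); V1 is supplementary
res.mca <- MCA(mydata, quanti.sup = 1, graph = FALSE)

# Coordinates of the active categories on the MCA dimensions
res.mca$var$coord
```

With only six individuals the result has no statistical value; the point is only the shape of the call, in particular that the categorical columns are not listed in quali.sup.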



Meinhard

Feb 28, 2021, 11:42:48 AM
to FactoMineR users
library(FactoMineR)
data("poison")
View(poison)
head(poison, n = 55)
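
If a compact view of the structure or a plain text copy is more convenient, something along these lines should also work. It is only a sketch; the output path in the temporary directory is an arbitrary choice.

```r
library(FactoMineR)

# The .rda files are binary; data() loads them into the session
data(poison)

# One line per column: type and first values
str(poison)

# Class of each column (numeric for the quantitative ones, factor otherwise)
sapply(poison, class)

# Export to a text file that can be opened in any editor;
# the location (a temporary directory) is just an example
out <- file.path(tempdir(), "poison.csv")
write.csv(poison, out)
```

The exported CSV can then serve as a template for laying out your own file.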

J.C. Deroubaix

Feb 28, 2021, 11:48:36 AM
to factomin...@googlegroups.com
I think that what you need is a good course on MCA, CA and PCA. You seem to need some basic information about multidimensional analysis. Please read and study this:

Regards
J-C Deroubaix


Mahmood Naderan

Feb 28, 2021, 12:36:31 PM
to factomin...@googlegroups.com
OK I understand that.
Thank you.


Regards,
Mahmood




