K means partitioning prior to HCPC


Wenyuan Yin

Mar 27, 2017, 12:58:06 AM
to FactoMineR users


Hi all,

I am new to the FactoMineR package, so forgive me if some of my questions are ambiguous. I have a dataset with about 56 variables of mixed types (categorical and continuous). I learned that MFA and FAMD can be used before the HCPC function for dimension reduction, so I ran both and used their results as input to HCPC. I got results but ran into some questions. Since about 38 variables remained, each with multiple values, running HCPC without any preliminary partitioning gave me 4 major clusters, each containing many variable values, which made the clusters difficult to interpret. I then learned that I can run a K-means partition before the actual hierarchical clustering by setting the kk argument to a number. I applied that and got the chart below. It makes the hierarchy much clearer, but is there any way to get a description of sub-clusters 1-25, e.g. the variables' contributions to each sub-cluster? The final result I got still only describes the relationship between the individuals/variables and the 4 major clusters.

I researched online, and it seems the cutree function could give views of different branches of the whole tree while keeping the hierarchy unchanged, but I am not sure how to apply it here.
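For concreteness, here is a rough sketch of what I have in mind (untested on my real data; I use a toy tree built with base R, and `res.hcpc` / `mydata` below are placeholders for my own objects — if I read the HCPC result structure correctly, the hclust tree should be in `res.hcpc$call$t$tree`):

```r
# Toy example: build a hierarchical tree and cut it at a finer level
# to recover sub-cluster memberships (base R only).
set.seed(1)
toy  <- data.frame(x = rnorm(50), y = rnorm(50))
tree <- hclust(dist(toy), method = "ward.D2")

# Cut at k = 5 sub-clusters here; k = 25 on my real tree.
sub <- cutree(tree, k = 5)
table(sub)  # sizes of the sub-clusters

# On the real data, I would then try to describe the sub-clusters, e.g.:
# library(FactoMineR)
# dat  <- cbind.data.frame(mydata, sub.cluster = factor(sub))
# desc <- catdes(dat, num.var = ncol(dat))  # variables characterizing each sub-cluster
```

Is something along these lines the right approach, or is there a built-in way in HCPC?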

Let me know if you need more clarification from me.

[attached chart: HCPC dendrogram with the 25 K-means sub-clusters]

Thank you so much for your help in advance!

Best,

Wenyuan

François Husson

Apr 27, 2017, 3:53:10 AM
to FactoMineR users
In fact, the partition should be approximately the same with or without K-means before the hierarchical clustering. K-means is used before hierarchical clustering when there are many individuals and the hierarchical tree cannot be built on the whole dataset.
So consider your first analysis. And if many variables describe each cluster, it is not really a problem, because the variables are sorted from the most important to the least important.
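To make the comparison concrete, a sketch of the two calls (assuming FactoMineR is installed and `res.famd` stands for your FAMD result; this is an illustration, not tested on your data):

```r
library(FactoMineR)

## Without K-means preprocessing (kk = Inf is the default):
res.full <- HCPC(res.famd, nb.clust = 4, kk = Inf, graph = FALSE)

## With K-means down to 100 centers before building the hierarchy:
res.kk <- HCPC(res.famd, nb.clust = 4, kk = 100, graph = FALSE)

## The two 4-cluster partitions should largely agree:
table(res.full$data.clust$clust, res.kk$data.clust$clust)

## Cluster descriptions, variables sorted from most to least important:
res.full$desc.var
```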
FH

Wenyuan Yin

May 2, 2017, 12:27:34 AM
to factomin...@googlegroups.com
Thank you so much for your explanation!

Wenyuan


