inertia gain


Rachael

Apr 24, 2013, 1:06:35 PM
to factomin...@googlegroups.com
Hi,

I am using PCA followed by HCPC to cluster 57 individuals into behavioral groups. 

res.hcpc=HCPC(res.pca, nb.clust=-1,method="ward", metric="euclidean",consol=FALSE)

At the moment I am getting only three clusters using the inertia gain criterion. However, it looks like there is further inertia to be gained by adding another cluster (see attached). I guess the program isn't including that additional cluster because there is a decrease in inertia gain followed by an increase?

I haven't been able to find much information about using inertia gain to delineate clusters. Would it be valid to include the additional cluster using this criterion? And how well supported by the data would it be? (Or what do I need to look at to find out?)
 
Thanks for your help,
Rachael
Rplot04.pdf

François Husson

Apr 24, 2013, 2:02:36 PM
to factomin...@googlegroups.com
In the HCPC function, you can use the argument nb.clust to set the number of clusters you want. With nb.clust=-1, the tree is cut automatically at the level suggested by the algorithm, but you can set nb.clust=4 to get 4 clusters.
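As a minimal sketch of the two calls side by side (assuming the `decathlon` data set shipped with FactoMineR as a stand-in for the 57 individuals, and the default Ward and Euclidean settings), the automatic cut and a forced four-cluster cut can be cross-tabulated to see which cluster gets split:

```r
library(FactoMineR)

# Stand-in data: the decathlon example shipped with FactoMineR
# (an assumption; replace with your own data).
data(decathlon)
res.pca <- PCA(decathlon[, 1:10], graph = FALSE)

# Cut chosen automatically by the inertia gain criterion.
res.auto <- HCPC(res.pca, nb.clust = -1, consol = FALSE, graph = FALSE)

# Same tree, cut forced to four clusters.
res.four <- HCPC(res.pca, nb.clust = 4, consol = FALSE, graph = FALSE)

# Cross-tabulate the two partitions to see how the clusters relate.
tab <- table(auto = res.auto$data.clust$clust,
             four = res.four$data.clust$clust)
tab
```

Because both partitions come from the same tree, each cluster of the finer cut should sit entirely inside one cluster of the coarser cut.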

Best
FH 

Rachael

Apr 24, 2013, 2:47:29 PM
to factomin...@googlegroups.com

Hi François 
Thanks for your fast reply. However, I am interested in choosing clusters based on a predefined rationale, in a data-supported manner, rather than just picking a number.

Maybe I need to clarify my questions about the inertia gain criterion.

Is it a good method to use if I want to find data-supported clusters?

If I understand things correctly, the inertia gain criterion picks the number of clusters at which the gain from the previous split is large compared with the gain from the next split.
However, in my data this doesn't translate into a flat line following this break. Instead it looks like there is little inertia to be gained going from 3 to 4 clusters, but more going from 4 to 5.
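To look at the gains directly, here is a rough sketch in base R. It assumes simulated coordinates (`coords`, hypothetical) in place of the PCA scores, and it uses `hclust` with `"ward.D2"` rather than HCPC itself, so it illustrates the idea of inertia gain and is not necessarily the exact rule HCPC applies:

```r
set.seed(1)
# Hypothetical stand-in for the PCA coordinates: 57 individuals, 2 dimensions.
coords <- rbind(
  matrix(rnorm(40, mean = 0), ncol = 2),
  matrix(rnorm(40, mean = 4), ncol = 2),
  matrix(rnorm(34, mean = 8), ncol = 2)
)

# Ward hierarchical clustering on Euclidean distances.
tree <- hclust(dist(coords), method = "ward.D2")

# Within-cluster inertia of a partition: sum of squared distances of the
# points to their cluster centroid, divided by the number of points.
within_inertia <- function(x, cl) {
  sum(sapply(split(as.data.frame(x), cl), function(g) {
    sum(scale(g, center = TRUE, scale = FALSE)^2)
  })) / nrow(x)
}

ks <- 1:8
within <- sapply(ks, function(k) within_inertia(coords, cutree(tree, k)))

# Gain in between-cluster inertia when going from k to k + 1 clusters.
gain <- -diff(within)
names(gain) <- paste(ks[-length(ks)], ks[-1], sep = "->")
round(gain, 3)
```

Plotting `gain` with `barplot(gain)` gives a picture comparable to the inertia gain bar plot, which makes it easier to check whether the step from 4 to 5 clusters really gains more than the step from 3 to 4.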

If I identify these breaks where there is inertia gained, and then choose to include all these clusters in my final model, how well supported by the data are these clusters?  
Are there any values that I can test?  

Thanks again for your help.
Regards,
Rachael


Julie Josse

May 1, 2013, 1:20:53 PM
to factomin...@googlegroups.com
Hi Rachael,


On 24/04/2013 20:47, Rachael wrote:

Hi François 
Thanks for your fast reply. However, I am interested in choosing clusters based on a predefined rationale, in a data-supported manner, rather than just picking a number.

Maybe I need to clarify my questions about the inertia gain criterion.
Two answers:
1) You can try different numbers of clusters: the one suggested by the criterion and the one you consider interesting, for instance given the shape of the tree. You can then analyse and compare the clusterings resulting from these different cuts. It is an empirical way to proceed, but often useful.
2) You can have a look at the literature on selecting the number of clusters: there are many criteria, some model-based, others more empirical, such as the gain in inertia. The problem is closely related to that of selecting the number of principal components (where the inertia gain criterion amounts to looking at the barplot of eigenvalues).
For the moment, we do not have any statistical test for the suggested cut.
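One possible way to compare several cuts empirically is sketched below. It assumes simulated coordinates (`coords`, hypothetical) and uses the average silhouette width from the `cluster` package as the comparison measure. Silhouette is just one of the many criteria in the literature and is not something HCPC computes, so treat this as an illustration rather than a recommended procedure:

```r
library(cluster)  # recommended package shipped with R; provides silhouette()

set.seed(1)
# Hypothetical stand-in for the PCA coordinates: 57 individuals, 2 dimensions.
coords <- rbind(
  matrix(rnorm(40, mean = 0), ncol = 2),
  matrix(rnorm(40, mean = 4), ncol = 2),
  matrix(rnorm(34, mean = 8), ncol = 2)
)

d <- dist(coords)
tree <- hclust(d, method = "ward.D2")

# Summarise one cut of the tree: smallest cluster size and
# average silhouette width of the resulting partition.
compare_cut <- function(k) {
  cl <- cutree(tree, k)
  sil <- silhouette(cl, d)
  data.frame(
    k = k,
    min_size = min(table(cl)),
    mean_silhouette = mean(sil[, "sil_width"])
  )
}

# Compare the cuts with 3, 4 and 5 clusters.
cuts <- do.call(rbind, lapply(3:5, compare_cut))
cuts
```

A cut that produces a very small cluster or a clearly lower average silhouette width is a hint, not a test, that the extra cluster is weakly supported.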

Best regards,
Julie
 


