Bug when plotting Cluster Dendrogram?


Gabri M

Jun 26, 2017, 10:25:31 AM
to FactoMineR users
Hi,

I am running an HCPC on the results of a PCA computed on a dataset of 351 observations, with 7 quantitative active variables and some supplementary qualitative variables. I have noticed that what is plotted, either directly by HCPC or by calling plot() afterwards, does not agree with what is stored in the data.frame "res.hcpc$data.clust".

As you can see from the following snippet of the dendrogram, obtained with the command

plot(res.hcpc, choice = "tree", cex = 0.5)

[image: dendrogram snippet]

this is the composition of the 6th cluster.
But if I look up these rows in the data.frame to see which cluster they are assigned to,

res.hcpc$data.clust[c(166,184,157,236,106,202,287,137,183,91,131,196,237,142,175,77,241,180,52,119,335,164,292,245,242,169,33,36,126,87,99,81,270,333,249,250,228,275,248,198,16,140,290),"clust"]

this is the result

[1] 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[19] 4 2 4 2 2 2 2 2 2 2 2 4 2 2 2 2 2 2
[37] 2 2 2 2 2 2 2
Levels: 1 2 3 4 5 6 7 8

None of the rows listed belongs to cluster 6: two are in cluster 1, three in cluster 4, and the others in cluster 2.

What's the problem? Is it a bug? I think (and hope) that the correct information is in the HCPC output object; could you confirm this?

I tried updating the FactoMineR package and removing the row names (initially each row had a unique alphanumeric name), but with no success.

Any help would be greatly appreciated.

François Husson

Jun 27, 2017, 2:48:34 AM
to FactoMineR users
Hi,
No, it is not a bug. There is a consolidation step after the clustering: the tree is cut, and then the consolidation makes the clusters more homogeneous. The consolidation improves the clusters, but the link with the tree is lost. So if you want the tree and the clusters to agree, you have to use the argument consol=FALSE in the HCPC function.
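To make the idea concrete, here is a minimal base-R sketch of "cut the tree, then consolidate". It is not the HCPC source code: it assumes the consolidation is a k-means run started from the centres of the tree-based clusters, as the HCPC documentation describes, and it uses the iris data, Ward linkage and k = 3 purely for illustration.

```r
# Sketch only: plain hclust + kmeans, not what HCPC runs internally.
X <- scale(iris[, 1:4])                      # standardised quantitative variables
k <- 3                                       # number of clusters (chosen for the example)

tree <- hclust(dist(X), method = "ward.D2")  # Ward hierarchical clustering
cut_clust <- cutree(tree, k = k)             # partition read directly off the tree

# Centroids of the tree-based clusters are used as k-means starting centres
centres <- apply(X, 2, function(col) tapply(col, cut_clust, mean))
consol_clust <- kmeans(X, centers = centres)$cluster

# Off-diagonal counts are individuals that k-means moved to another cluster
moved <- table(tree = cut_clust, consolidated = consol_clust)
print(moved)
```

Any individual counted off the diagonal sits in one branch of the tree but ends up in a different cluster after consolidation, which is the kind of mismatch described in the question.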
FH

Gabri M

Jun 28, 2017, 6:44:20 AM
to FactoMineR users
Thank you very much for your fast and very useful answer. 

Isn't there ANY other method to plot the tree after consolidation? Or any other idea for how to plot this information?

Thanks again

Olly Bones

Aug 10, 2017, 8:17:25 AM
to FactoMineR users
Hi François

Thanks for this clarification. Could you expand upon what you mean by making the clusters more homogeneous--what is the actual process that is performed?

Best
Olly

François Husson

Aug 18, 2017, 10:52:37 AM
to FactoMineR users
You can see Section 3.3 from this technical report:
FH

Gabri M

Jan 29, 2018, 4:42:41 AM
to FactoMineR users
Hello,

has anyone managed to plot a dendrogram from a consolidated partitioning?

Thanks in advance
Gabriele

François Husson

Jan 29, 2018, 6:56:11 AM
to FactoMineR users
If you plot a dendrogram from a consolidated partition, some individuals in the same group will not be in the same part of the tree. The consolidation means that the hierarchy is lost.
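One way to check this numerically is to read the cluster labels in the left-to-right leaf order of the tree and count the contiguous blocks. The sketch below is base R only and rests on the same assumptions as before (iris data, Ward linkage, k = 3, consolidation imitated by k-means started from the tree-based centres); it is not taken from HCPC.

```r
# Sketch only: compare label contiguity along the dendrogram leaves.
X <- scale(iris[, 1:4])
k <- 3
tree <- hclust(dist(X), method = "ward.D2")
cut_clust <- cutree(tree, k = k)

# Imitate consolidation: k-means started from the tree-based centroids
centres <- apply(X, 2, function(col) tapply(col, cut_clust, mean))
consol_clust <- kmeans(X, centers = centres)$cluster

leaf_order <- tree$order   # left-to-right order of the leaves in the plot

# Number of contiguous blocks of identical labels along the leaves
runs_cut <- length(rle(unname(cut_clust[leaf_order]))$lengths)
runs_consol <- length(rle(unname(consol_clust[leaf_order]))$lengths)

cat("blocks from cutting the tree:", runs_cut, "\n")
cat("blocks after consolidation:  ", runs_consol, "\n")
```

Cutting the tree always gives exactly k blocks, because each cluster is a whole branch. After consolidation the count can only stay the same or grow, and every extra block is a group of individuals sitting in a branch that belongs to another cluster.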
FH

Gabri M

Jan 29, 2018, 8:03:58 AM
to FactoMineR users
I finally managed to obtain what I wanted using the dendextend package.

Here's a small reproducible example:

library(FactoMineR)  # PCA() and HCPC()
library(dendextend)
library(dplyr)

# Subsampling Iris Dataset
subsample <- c(118, 42,107,136,135,5,116,120,
               123,138,15,147,142,87,101,46,
               16,114,88,33,91,25,58,62,85,115,
               110,44,140,126,32,99,43,10,119,
               71,80,17,74,61,97,125,109,60,
               29,19,77,45,54,132)
small_iris <- iris[subsample,]

# PCA followed by HCPC, consol = TRUE
res.hcpc <- 
  PCA(small_iris, quali.sup = 5, graph = FALSE) %>% 
  HCPC(consol = TRUE, graph = FALSE)

# Get consolidated clusters, in the order of the dendrogram leaves
leaf_order <- res.hcpc$call$t$tree %>% as.dendrogram() %>% order.dendrogram()
clusts <- res.hcpc$call$X[leaf_order, "clust"]

# Use dendextend to colour the leaf edges by the "clust" categorical variable
res.hcpc$call$t$tree %>% 
  as.dendrogram() %>%
  assign_values_to_leaves_edgePar(value = clusts, edgePar = "col") %>% 
  hang.dendrogram(hang_height = 0.01) %>% # Only to see colors better
  plot(horiz = TRUE)


In the attached file you can see the result.
As you can see, some items have moved from one cluster to another after consolidation (109, 138, 77, etc.).

Hope this helps.
Thanks François for your invaluable work.

All the best,
Gabriele
Rplot01.png

François Husson

Feb 10, 2018, 4:21:13 AM
to FactoMineR users
Some points have moved because the clusters are consolidated. The clusters are therefore more homogeneous, but the hierarchical tree no longer coincides with them.
If you want the tree and the clusters to coincide, you can use consol=FALSE in the HCPC function.
FH
