Visualize full dendrogram


Carmen Torres López

Nov 11, 2015, 1:36:11 PM
to s-space-re...@googlegroups.com
Hello,

I'm using the HierarchicalAgglomerativeClustering class from the S-Space
package, and I get good clusters from the buildDendogram method at a
merge step that I specify. I'm also trying to visualize the full
dendrogram. Is this possible? I'm working with S-Space 2.0.4. I'd
appreciate any clue on how to achieve this.

best regards,
Carmen

David Jurgens

Nov 11, 2015, 2:14:42 PM
to s-space-re...@googlegroups.com
Hi Carmen,

  Great to hear you got the 2.0.4 code working.  S-Space doesn't have any visualization capabilities built in (the library is just for algorithms), but I think it should be pretty easy to export the results.  I'm not sure what you have used for visualizing, but maybe the easiest solution is to dump the output of the clustering into a JSON object and use d3 to visualize it.  That library has great support for both radial and dendrogram layouts.  An example of the JSON format they use is here, and you can safely ignore the size field; all you need is for each node in the dendrogram to be a JSON object with a "name" field, with all of its children held in a JSON array.

  Hopefully that won't be too difficult to visualize but feel free to respond with questions.  Also, I'd love to see how the results look!

  Thanks,
  David
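
For reference, the JSON shape described above can be sketched in plain Java without any library. This is only a minimal illustration: DendroNode and its methods are hypothetical names, not part of S-Space, and the "children" key is the one d3's hierarchy layouts read by default.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node; the class and field names are illustrative, not S-Space API.
class DendroNode {
    final String name;
    final List<DendroNode> children = new ArrayList<>();

    DendroNode(String name) {
        this.name = name;
    }

    // Adds a child and returns this node so calls can be chained.
    DendroNode add(DendroNode child) {
        children.add(child);
        return this;
    }

    // Serializes as {"name": ..., "children": [...]}; leaves omit "children".
    String toJson() {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"name\":\"").append(escape(name)).append('"');
        if (!children.isEmpty()) {
            sb.append(",\"children\":[");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append(children.get(i).toJson());
            }
            sb.append(']');
        }
        return sb.append('}').toString();
    }

    // Minimal escaping: backslashes and double quotes only.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

A JSON library such as Gson would handle escaping more completely; the hand-written version here only keeps the sketch self-contained.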



carmentor...@gmail.com

Nov 12, 2015, 11:10:07 PM
to Semantic Space Research - Development
Hi David,

Thank you so much for your message. I tried your solution and got a pretty dendrogram :) I've attached an image.
First, I created a method that converts the list of Merge objects returned by the HAC algorithm into an object holding the full hierarchy (parents and children).
To write the clustering results to a JSON file I used https://github.com/google/gson
I was reading the d3 documentation, but it seems it can't show the similarity values in the diagram?
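
The conversion from a merge list to a full hierarchy mentioned above could look roughly like the sketch below. It does not use the S-Space classes: MergeStep is a hypothetical stand-in for a merge record, and the sketch assumes that cluster i initially holds only row i and that each merge absorbs one cluster into another that keeps its id. Each internal node stores the merge similarity, so it could be written out as an extra JSON field and drawn as a label by the rendering code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one merge record: cluster `merged` is absorbed into `remaining`.
class MergeStep {
    final int remaining;
    final int merged;
    final double similarity;

    MergeStep(int remaining, int merged, double similarity) {
        this.remaining = remaining;
        this.merged = merged;
        this.similarity = similarity;
    }
}

// Hypothetical tree node; similarity is NaN for leaves and for a synthetic root.
class TreeNode {
    final String name;
    final double similarity;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name, double similarity) {
        this.name = name;
        this.similarity = similarity;
    }
}

class MergeTreeBuilder {
    // Replays the merges in order, building one parent node per merge.
    // Assumption: cluster ids are row indices in the range 0..numRows-1.
    static TreeNode build(List<MergeStep> merges, int numRows) {
        Map<Integer, TreeNode> current = new HashMap<>();
        for (int i = 0; i < numRows; i++) {
            current.put(i, new TreeNode("doc" + i, Double.NaN));
        }
        int step = 0;
        for (MergeStep m : merges) {
            TreeNode parent = new TreeNode("merge" + step++, m.similarity);
            parent.children.add(current.get(m.remaining));
            parent.children.add(current.remove(m.merged));
            // The surviving cluster id now refers to the merged subtree.
            current.put(m.remaining, parent);
        }
        if (current.size() == 1) {
            return current.values().iterator().next();
        }
        // Partial merge list: hang the remaining subtrees under a synthetic root.
        TreeNode root = new TreeNode("root", Double.NaN);
        root.children.addAll(current.values());
        return root;
    }
}
```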

Right now I'm studying how to choose a better cut point in the dendrogram for all the documents in my collection, because some of the clusters I get are better than others. I've read that researchers don't agree on one specific technique for this: some cut the dendrogram at a given similarity level, while others, as in my case, choose a number of merge steps depending on the size of the texts. Could you recommend another way of cutting the dendrogram to get the best clusters?
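
To make the two cutting strategies above concrete, here is a rough sketch that replays a merge list and stops either after a fixed number of merge steps or at the first merge whose similarity falls below a threshold. MergeStep and DendrogramCut are hypothetical names, not S-Space classes. The sketch assumes cluster i initially holds only row i and that the merges are listed in order of non-increasing similarity, which may not hold for every linkage.

```java
import java.util.List;

// Hypothetical stand-in for one merge record: cluster `merged` is absorbed into `remaining`.
class MergeStep {
    final int remaining;
    final int merged;
    final double similarity;

    MergeStep(int remaining, int merged, double similarity) {
        this.remaining = remaining;
        this.merged = merged;
        this.similarity = similarity;
    }
}

class DendrogramCut {
    // Returns a cluster id per row after applying at most maxSteps merges,
    // stopping early at the first merge with similarity below minSimilarity.
    static int[] cut(List<MergeStep> merges, int numRows,
                     int maxSteps, double minSimilarity) {
        int[] assignment = new int[numRows];
        for (int i = 0; i < numRows; i++) {
            assignment[i] = i; // every row starts in its own cluster
        }
        int applied = 0;
        for (MergeStep m : merges) {
            if (applied >= maxSteps || m.similarity < minSimilarity) {
                break;
            }
            // Move every row of the absorbed cluster into the surviving one.
            for (int i = 0; i < numRows; i++) {
                if (assignment[i] == m.merged) {
                    assignment[i] = m.remaining;
                }
            }
            applied++;
        }
        return assignment;
    }
}
```

Passing Integer.MAX_VALUE as maxSteps gives a pure similarity cut, and passing Double.NEGATIVE_INFINITY as minSimilarity gives a pure merge-step cut, so both strategies can be compared on the same merge list.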

best regards,
Carmen
Doc1.pdf

carmentor...@gmail.com

Nov 25, 2015, 4:35:01 PM
to Semantic Space Research - Development, carmentor...@gmail.com

Hi,

I just tested this method in the HierarchicalAgglomerativeClustering class:

clusterRows(Matrix m, double clusterSimilarityThreshold, ClusterLinkage linkage, SimType similarityFunction)

It does not return the full dendrogram; instead, it gives you a cluster partition for the clusterSimilarityThreshold you specify. I think it's a good alternative way to choose the cut point, because as the tree is built the threshold stops the clustering process once the similarity between clusters falls below it.
Thanks for developing this method.

regards,
Carmen                               