Dendrograms / tree models
Andrie de Vries  
From: Andrie de Vries <apdevr...@gmail.com>
Date: Wed, 12 May 2010 09:49:55 -0700 (PDT)
Local: Wed, May 12 2010 12:49 pm
Subject: Dendrograms / tree models
Hi all

As a newbie on this mailing list, let me first congratulate Hadley and
the ggplot community on creating a truly wonderful piece of graphics
software.

As part of my work I need to create compelling graphic displays of
dendrograms, such as the output produced by tree() and hclust().  Much
to my surprise, a Google search for "ggplot dendrogram" reveals nothing
useful.

So I decided to write some code to plot regression/classification
trees, i.e. the output of tree() in library(tree).

I would be interested to know whether there are better ways of doing
this.  In particular, I created a fortify method for objects of class
tree.  This creates the data frame for plotting the lines.  But I also
need two additional data frames, one for the text labels, and one for
the value labels.  Is there a sensible way of merging the data frames
in such a way that a subset gets used by geom_segment, and another
subset by geom_text?

Andrie

# Plot a tree object (from the tree package) in ggplot2

# Build the line segments of the tree as a data frame for geom_segment().
fortify.tree <- function(model, data, ...){
  require(tree)
  # tree:::treeco is internal to the tree package; it returns the plot
  # coordinates of each node
  xy <- tree:::treeco(model)
  n <- model$frame$n

  # Lines adapted from tree:::treepl
  x <- xy$x
  y <- xy$y
  node <- as.numeric(row.names(model$frame))
  parent <- match(node %/% 2, node)

  # Vertical segment from each node up to its parent's height, and a
  # horizontal segment from the parent across to the node
  linev <- data.frame(x=x, y=y, xend=x, yend=y[parent], n=n)
  lineh <- data.frame(x=x[parent], y=y[parent], xend=x, yend=y[parent], n=n)

  # Drop the root row, which has no parent
  rbind(linev[-1,], lineh[-1,])
}

# Labels for the split nodes: the name of the splitting variable.
label.tree <- function(model, ...){
  require(tree)
  # tree:::treeco is internal to the tree package; it returns the plot
  # coordinates of each node
  xy <- tree:::treeco(model)
  label <- model$frame$var

  # Keep only the internal (non-leaf) nodes
  data <- data.frame(x=xy$x, y=xy$y, label=label)
  data[data$label != "<leaf>",]
}

# Labels for the leaf nodes: the fitted value, rounded to 2 decimals.
label.tree.leaf <- function(model, ...){
  require(tree)
  # tree:::treeco is internal to the tree package; it returns the plot
  # coordinates of each node
  xy <- tree:::treeco(model)
  label <- model$frame$var
  yval  <- model$frame$yval

  # Keep only the leaves and replace "<leaf>" with the fitted value
  data <- data.frame(x=xy$x, y=xy$y, label=label, yval=yval)
  data <- data[data$label == "<leaf>",]
  data$label <- round(data$yval, 2)
  data
}

################
# Example code #
################

library(ggplot2)
library(tree)

data(cpus, package="MASS")
cpus.ltr <- tree(log10(perf) ~ syct+mmin+mmax+cach+chmin+chmax, cpus)

# ggplot() calls fortify.tree() to turn the tree into segment data
p <- ggplot(data=cpus.ltr)
p <- p + geom_segment(aes(x=x, y=y, xend=xend, yend=yend, size=n),
                      colour="blue", alpha=0.5)
p <- p + scale_size("n", range=c(0, 3))
p <- p + geom_text(data=label.tree(cpus.ltr),
                   aes(x=x, y=y, label=label), vjust=-0.5, size=4)
p <- p + geom_text(data=label.tree.leaf(cpus.ltr),
                   aes(x=x, y=y, label=label), vjust=0.5, size=3)

# Blank theme: no grid lines, axes, or legend
# (theme() and element_blank() are the ggplot2 >= 0.9.2 names)
theme_null <- theme(panel.grid.major = element_blank(),
                    panel.grid.minor = element_blank(),
                    axis.text.x = element_blank(),
                    axis.text.y = element_blank(),
                    axis.ticks = element_blank(),
                    axis.title.x = element_blank(),
                    axis.title.y = element_blank(),
                    legend.position = "none")

p <- p + theme_null
print(p)
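
A note on the parent lookup used in fortify.tree(): it assumes the row names of model$frame are node numbers in which node k has children 2k and 2k + 1, so integer division by 2 recovers the parent. A minimal base-R sketch, using a made-up `node` vector (hypothetical, not taken from cpus.ltr):

```r
# Hypothetical node numbers for a small tree, in the order they might
# appear as row names of a tree frame (root = 1, children of k = 2k, 2k + 1).
node <- c(1, 2, 4, 5, 3, 6, 7)

# Row index of each node's parent; the root has no parent, so it gets NA.
parent <- match(node %/% 2, node)

# Row index of each node's sibling: odd nodes pair with node - 1,
# even nodes with node + 1.
sibling <- match(ifelse(node %% 2, node - 1, node + 1), node)

# Show the lookup results as node numbers rather than row indices.
lookup <- data.frame(node = node,
                     parent_node = node[parent],
                     sibling_node = node[sibling])
print(lookup)
```

The root's NA parent is why fortify.tree() drops the first row of each segment data frame before binding them.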

--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: http://gist.github.com/270442

To post: email ggplot2@googlegroups.com
To unsubscribe: email ggplot2+unsubscribe@googlegroups.com
More options: http://groups.google.com/group/ggplot2


 
Hadley Wickham  
From: Hadley Wickham <had...@rice.edu>
Date: Sun, 16 May 2010 12:46:08 -0500
Local: Sun, May 16 2010 1:46 pm
Subject: Re: Dendrograms / tree models

> As a newbie on this mailing list, let me first congratulate Hadley and
> the ggplot community on creating a truly wonderful piece of graphics
> software.

Thanks!

> As part of my work I need to create compelling graphic displays of
> dendrograms, such as the output produced by tree() and hclust().  Much
> to my surprise, a Google search for "ggplot dendrogram" reveals nothing
> useful.

> So I decided to write some code to plot regression/classification
> trees, i.e. the output of tree() in library(tree).

I had some code for doing this too, but I think yours does a better
job - thanks!  Do you mind if I include it in the next version of
ggplot2?

> I would be interested to know whether there are better ways of doing
> this.  In particular, I created a fortify method for objects of class
> tree.  This creates the data frame for plotting the lines.  But I also
> need two additional data frames, one for the text labels, and one for
> the value labels.  Is there a sensible way of merging the data frames
> in such a way that a subset gets used by geom_segment, and another
> subset by geom_text?

Not that I'm aware of.  I'm in the middle of some major rewriting of
ggplot2, but when I come back to smaller problems this summer, I'll
have a think about it.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
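
On the question of feeding one merged data frame to several layers, one possible workaround is sketched below. It is an assumption about how this could be done, not a ggplot2 feature: stack the data frames into one, tag each row with a `type` column, and hand each layer its own subset through the layer's `data` argument. The `stack_frames()` helper, the `type` column, and the toy coordinates and labels are all hypothetical.

```r
# Stack data frames that have different columns, padding the gaps with NA,
# and record each row's origin in a `type` column.
# stack_frames() is a hypothetical helper, not part of ggplot2 or tree.
stack_frames <- function(...) {
  frames <- list(...)
  cols <- unique(unlist(lapply(frames, names)))
  padded <- Map(function(df, type) {
    for (col in setdiff(cols, names(df))) df[[col]] <- NA
    df$type <- type
    df[c("type", cols)]
  }, frames, names(frames))
  out <- do.call(rbind, unname(padded))
  rownames(out) <- NULL
  out
}

# Toy stand-ins for the output of fortify.tree(), label.tree() and
# label.tree.leaf(); leaf values are stored as text so that they can
# share the label column with the split names.
segments <- data.frame(x = c(1, 3, 2, 2), y = c(1, 1, 2, 2),
                       xend = c(1, 3, 1, 3), yend = c(2, 2, 2, 2))
splits <- data.frame(x = 2, y = 2, label = "mmax",
                     stringsAsFactors = FALSE)
leaves <- data.frame(x = c(1, 3), y = c(1, 1), label = c("1.75", "2.32"),
                     stringsAsFactors = FALSE)

tree_data <- stack_frames(segment = segments, split = splits, leaf = leaves)

# Each layer receives only the rows of its own type.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  only <- function(type) tree_data[tree_data$type == type, ]
  p <- ggplot(mapping = aes(x = x, y = y)) +
    geom_segment(data = only("segment"), aes(xend = xend, yend = yend)) +
    geom_text(data = only("split"), aes(label = label), vjust = -0.5) +
    geom_text(data = only("leaf"), aes(label = label), vjust = 1.5)
  print(p)
}
```

The cost is a wide data frame full of NA padding, so passing a separate data frame to each layer, as in the original post, remains the more direct route.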
