How to remove duplicated legends

Martin

Nov 22, 2010, 2:51:16 AM
to ggplot2
Hi there,

I'm trying to use ggplot2 to draw a heat map. The code is as follows:

df <- data.frame(x = sample(LETTERS[1:10], 100, replace = TRUE),
                 y = sample(LETTERS[1:10], 100, replace = TRUE),
                 z = runif(100, 1, 10)) # generate data sample
p <- ggplot(data = df, aes(x = x, y = y)) +
  stat_sum(aes(fill = ..prop.., group = x), geom = 'tile') # draw with default settings
p + scale_fill_gradient(low = 'white', high = 'steelblue',
                        formatter = 'percent') # change color and format

The problem is that there are two legends. The top one is the one I want,
with the correct color and format. How can I remove the bottom one?

Thanks.

smu

Nov 22, 2010, 5:33:18 AM
to Martin, ggplot2
Hi Martin,

try this:

ggplot(data = df, aes(x, y)) +
  stat_sum(aes(fill = ..prop.., group = x), geom = 'tile', legend = NA) +
  scale_fill_gradient(low = 'white', high = 'steelblue')


regards,
Stefan

Dennis Murphy

Nov 22, 2010, 9:45:30 AM
to Martin, ggplot2
Hi:

Stefan is right as far as it goes, but it doesn't really get to Martin's labeling problem. Consider the following, using qplot() (the same happens in ggplot() ):

# (1)
qplot(x, y, data = df, stat = 'sum', group = x, fill = ..prop.., geom = 'tile')
# Replicate Stefan's suggestion: [Plot 1 in attachment]
qplot(x, y, data = df, stat = 'sum', group = x, fill = ..prop.., geom = 'tile') +
     scale_fill_gradient(low='white', high='steelblue')

# (2)
# Try to change the title on the legend:  [Plot 2 in attachment]
qplot(x, y, data = df, stat = 'sum', group = x, fill = ..prop.., geom = 'tile') +
     scale_fill_gradient('Percent')
# Add title change to Stefan's suggestion: [Plot 3 in attachment]
qplot(x, y, data = df, stat = 'sum', group = x, fill = ..prop.., geom = 'tile') +
     scale_fill_gradient('Percent', low='white', high='steelblue')

I don't think that should happen. Of course, if the simple title change doesn't work, then neither does the formatter = 'percent' argument, which seemed pretty reasonable to me and which doesn't appear to be the immediate source of the problem. I also tried [Plot 4 in attachment]

qplot(x, y, data = df, stat = 'sum', group = x, fill = ..prop.., geom = 'tile') +
    scale_fill_gradient(low='white', high='steelblue',
            breaks = seq(0.1, 0.5, by = 0.1),
            labels = c('10%', '20%', '30%', '40%', '50%'))

which puts the same title and breaks as the original legend, but it still prints two. It seems to me, at least on the surface, this has something to do with stat_sum() or the interaction of stat_sum() with geom_tile() [more likely].

Martin's code looks like a variation of the last example in the on-line help page of scale_gradient() except that he used stat_sum() to generate the constructed variable ..prop..  instead of using one of the geoms. It's entirely possible I'm not thinking outside the box enough to see the solution, though, and if so, I'd be happy to be edified.

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252  
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                         
[5] LC_TIME=English_United States.1252   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  grid      methods 
[8] base    

other attached packages:
[1] gridExtra_0.7   sos_1.3-0       brew_1.0-4      lattice_0.19-13
[5] ggplot2_0.8.8   proto_0.3-8     reshape_0.8.3   plyr_1.2.1    

loaded via a namespace (and not attached):
[1] digest_0.4.2


Cheers,
Dennis



--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: http://gist.github.com/270442

To post: email ggp...@googlegroups.com
To unsubscribe: email ggplot2+u...@googlegroups.com
More options: http://groups.google.com/group/ggplot2

tileplots.pdf

Joshua Wiley

Nov 22, 2010, 10:23:13 AM
to Dennis Murphy, Martin, ggplot2, Hadley Wickham
Hi,

ggplot2 tries to keep legends as concise as possible, but when the information differs, there must be separate legends.  Two scales are involved here: scale_fill_gradient() and scale_size().  When the title of scale_fill_gradient() is changed, the title of scale_size() must be changed to match before the legends can be collapsed.  There is a second issue: the label formatting of the scale_fill_gradient() legend was changed, so it must be changed on scale_size() as well.  However, scale_size() lacks a formatter argument.  Here is what the code would need to look like:

ggplot(data = df, aes(x = x, y = y)) +
 stat_sum(aes(fill = ..prop.., group = x), geom = 'tile') +
 scale_fill_gradient('Percent', low='white', high='steelblue', formatter = 'percent') +
 scale_size('Percent', formatter = 'percent')

and here is a patch to scale_size() to make the above code actually work.

Cheers,

Josh

@Hadley, all I did was add a formatter = identity argument to the function and formatter = formatter to the super$new call.


ScaleSizeContinuous <- proto(ScaleContinuous, expr={
  doc <- TRUE
  common <- NULL
  .input <- .output <- "size"
  aliases <- c("scale_area", "scale_size")
 
  new <- function(., name=NULL, limits=NULL, breaks=NULL, labels=NULL, formatter = identity, trans = NULL, to = c(1, 6), legend = TRUE) {
   
    b_and_l <- check_breaks_and_labels(breaks, labels)
   
    .super$new(., name=name, limits=limits, breaks=b_and_l$breaks, labels=b_and_l$labels, formatter = formatter, trans=trans, variable = "size", to = to, legend = legend)
  }
 
  map <- function(., values) {
    rescale(values, .$to, .$input_set())
  }
  output_breaks <- function(.) .$map(.$input_breaks())
 
  objname <- "size_continuous"
  desc <- "Size scale for continuous variable"
  seealso <- list(
    "scale_manual" = "for sizing discrete variables"
  )
  desc_params <- list(
    "to" = "a numeric vector of length 2 that specifies the minimum and maximum size of the plotting symbol after transformation."
  )
 
  icon <- function(.) {
    pos <- c(0.15, 0.3, 0.5, 0.75)
    circleGrob(pos, pos, r=(c(0.1, 0.2, 0.3, 0.4)/2.5), gp=gpar(fill="grey50", col=NA))
  }
 
  examples <- function(.) {
    (p <- qplot(mpg, cyl, data=mtcars, size=cyl))
    p + scale_size("cylinders")
    p + scale_size("number\nof\ncylinders")
   
    p + scale_size(to = c(0, 10))
    p + scale_size(to = c(1, 2))

    # Map area, instead of width/radius
    # Perceptually, this is a little better
    p + scale_area()
    p + scale_area(to = c(1, 25))
   
    # Also works with factors, but not a terribly good
    # idea, unless your factor is ordered, as in this example
    qplot(mpg, cyl, data=mtcars, size=factor(cyl))
   
    # To control the size mapping for discrete variable, use
    # scale_size_manual:
    last_plot() + scale_size_manual(values=c(2,4,6))
   
  }
 
})
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

Feng Zhang

Nov 22, 2010, 11:46:38 PM
to Joshua Wiley, Dennis Murphy, ggplot2, Hadley Wickham
Hi Josh,
 
Your code actually works. Thanks.
 
However I'm still confused how you managed to do this. As far as I can tell, you overwrite scale_size(). But in my original code, I didn't call scale_size(), at least not explicitly.
 
Martin

Joshua Wiley

Nov 23, 2010, 12:06:17 AM
to Feng Zhang, ggplot2
On Mon, Nov 22, 2010 at 8:46 PM, Feng Zhang <fzha...@gmail.com> wrote:
> Hi Josh,
>
> Your code actually works.

lol, is my code normally that bad? ;)

> Thanks.
>
> However I'm still confused how you managed to do this. As far as I can tell,

Redundant information in legends is collapsed. ggplot calls many
scales with standard sets of default names, but when you change the
title of one scale (in this case: scale_fill_gradient("Percent")), the
legends are no longer redundant---they have different titles.
Further, you changed the legend labels by using the formatter argument
which means if you want a single legend, you must change all of the
scales to the same format also.

> you overwrite scale_size(). But in my original code, I didn't call

I do not overwrite scale_size (well, my patch gets added to the global
environment which is higher on the search path and pre-empts the
default, but those details aren't important), I just adjust the call
to scale_size() with custom arguments (i.e., the title to match the
scale_fill_gradient() and the formatter)

> scale_size(), at least not explicitly.

It is true that you did not call it explicitly; ggplot2 added it
behind the scenes. The sensible defaults are often fine,
but once you start customizing, you have to go all the way to each and
every scale (or you get the essentially redundant though technically
different legends you saw). If you really want to learn more (and
have not already), I highly recommend reading the ggplot2 book. It is
well written and it covers some of the structure/theory behind ggplot2
which will help make this sort of thing make sense. If you already
have the book, trundle off to page 110 (section 6.5) paying special
note to the final paragraph of page 111.
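
The collapsing rule can be seen in a smaller case. This is only a
minimal sketch, assuming a recent ggplot2 release and the built-in
mtcars data; it is not the heat map from this thread. Colour and shape
are mapped to the same variable, so their legends share a title and
labels until one of the scales is renamed.

```r
library(ggplot2)

# Colour and shape both map factor(cyl): same default title, same labels.
base <- ggplot(mtcars, aes(wt, mpg,
                           colour = factor(cyl), shape = factor(cyl))) +
  geom_point()

# Drawn as-is, the two legends are collapsed into one.
one_legend <- base

# Renaming only the shape scale makes the titles differ,
# so the legends are no longer redundant and both are drawn.
two_legends <- base + scale_shape(name = "Cylinders")

# Giving both scales the same title lets them collapse again.
merged_again <- base +
  scale_colour_discrete(name = "Cylinders") +
  scale_shape(name = "Cylinders")
```

Printing the three objects in turn should show one, two, and one
legend respectively.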

Cheers,

Josh

Kohske Takahashi

Nov 23, 2010, 9:23:21 AM
to Joshua Wiley, Feng Zhang, ggplot2
Hi. Josh's explanation will help in understanding legends.
Here is another workaround, without hacking ScaleSizeContinuous.

a)

ggplot(data = df, aes(x = x, y = y)) +
  stat_sum(aes(fill = ..prop.., group = x), geom = 'tile') +
  scale_fill_gradient('Percent', low = 'white', high = 'steelblue',
                      formatter = 'percent') +
  scale_size(legend = FALSE)

b)

ggplot(data = df, aes(x = x, y = y)) +
  stat_sum(aes(size = NULL, fill = ..prop.., group = x), geom = 'tile') +
  scale_fill_gradient('Percent', low = 'white', high = 'steelblue',
                      formatter = 'percent')

Probably b) is more efficient.
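
Both workarounds rely on arguments (formatter, legend) from the ggplot2
release used in this thread. On a current release the same idea might
look like the sketch below. It assumes ggplot2 3.3.4 or later plus the
scales package, and it swaps in after_stat(prop) for ..prop..,
labels = scales::percent for formatter = 'percent', and
guides(size = "none") for scale_size(legend = FALSE). None of these
substitutions come from the thread, and depending on the version the
size legend may not be drawn for tiles in the first place, in which
case the guides() call is simply harmless.

```r
library(ggplot2)

set.seed(1)  # only so the sample data is reproducible
df <- data.frame(
  x = sample(LETTERS[1:10], 100, replace = TRUE),
  y = sample(LETTERS[1:10], 100, replace = TRUE)
)

p <- ggplot(df, aes(x = x, y = y)) +
  # Fill each tile by the proportion computed by stat_sum() within each x.
  stat_sum(aes(fill = after_stat(prop), group = x), geom = "tile") +
  # One fill legend, titled "Percent", with labels formatted as percentages.
  scale_fill_gradient("Percent", low = "white", high = "steelblue",
                      labels = scales::percent) +
  # Suppress any legend for the size aesthetic that stat_sum() maps by default.
  guides(size = "none")

p
```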

--
Kohske Takahashi <takahash...@gmail.com>

Research Center for Advanced Science and Technology,
The University of  Tokyo, Japan.
http://www.fennel.rcast.u-tokyo.ac.jp/profilee_ktakahashi.html

Dennis Murphy

Nov 23, 2010, 9:32:53 AM
to Kohske Takahashi, ggplot2
Very nice, Kohske. Thanks!

Best regards,
Dennis

Feng Zhang

Nov 24, 2010, 8:49:07 PM
to Kohske Takahashi, Joshua Wiley, ggplot2
Excellent point, Kohske!
 
Thanks for all the input. The problem is solved, and I now know more about ggplot2.
 
Regards,
 
Martin