Setting legend scale & custom tick labels


Nicholas Riley

Jan 22, 2009, 2:52:41 PM
to ggplot2
ggplot2 is plotting package #5 or so for me (though the first R-based
one I've tried) and so far I'm finding it so much easier than
everything else I can't believe it. Thanks!

I've got a couple of questions though. I'm generating a stacked
histogram as follows:

ggplot(pystone, aes(x=ml*16,fill=factor(level))) + geom_bar()

I am able to customize the x and y axis labels:

ggplot(pystone, aes(x=ml*16,fill=factor(level))) + geom_bar() +
scale_x_log2('bytes referenced') + scale_y_continuous('transactions')

but nothing I can do seems to set the legend label, which is just
factor(level). I tried various scale_colour_* stuff and
labs(colour='...') but neither worked.

Also, the legend has 1 at the top and 9 at the bottom, whereas the
graph has 1 at the bottom and 9 at the top; is it possible to reverse
the legend so it matches the graph?

The tick labels are 2^10, 2^11, 2^12, etc... I'd rather them show up
as 1K, 2K, etc. since they represent memory use. Is there a way to
provide some kind of custom formatter function for the tick labels?

Thanks.

hadley wickham

Jan 22, 2009, 4:23:45 PM
to Nicholas Riley, ggplot2
On Thu, Jan 22, 2009 at 1:52 PM, Nicholas Riley <nri...@gmail.com> wrote:
>
> ggplot2 is plotting package #5 or so for me (though the first R-based
> one I've tried) and so far I'm finding it so much easier than
> everything else I can't believe it. Thanks!

Great :)

> I've got a couple of questions though. I'm generating a stacked
> histogram as follows:
>
> ggplot(pystone, aes(x=ml*16,fill=factor(level))) + geom_bar()
>
> I am able to customize the x and y axis labels:
>
> ggplot(pystone, aes(x=ml*16,fill=factor(level))) + geom_bar() +
> scale_x_log2('bytes referenced') + scale_y_continuous('transactions')
>
> but nothing I can do seems to set the legend label, which is just
> factor(level). I tried various scale_colour_* stuff and
> labs(colour='...') but neither worked.

You were pretty close: scale_fill_discrete("My label") should do the trick.

> Also, the legend has 1 at the top and 9 at the bottom, whereas the
> graph has 1 at the bottom and 9 at the top; is it possible to reverse
> the legend so it matches the graph?

scale_fill_discrete("My label", breaks = 9:1)


> The tick labels are 2^10, 2^11, 2^12, etc... I'd rather them show up
> as 1K, 2K, etc. since they represent memory use. Is there a way to
> provide some kind of custom formatter function for the tick labels?

You can do something like this:

qplot(mpg, wt, data=mtcars) + scale_x_continuous(formatter=dollar)

The formatter function should take a numeric vector as input and
return an appropriately formatted character vector.
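
A minimal sketch of that contract in base R, independent of any ggplot2 version. The name `format_bytes` is a hypothetical helper, and the K/M cutoff is an assumption made only for illustration:

```r
# Hypothetical formatter (not part of ggplot2): numeric vector in,
# character vector of the same length out.
format_bytes <- function(x) {
  ifelse(x >= 1024^2,
         paste0(x / 1024^2, "M"),  # 1024^2 bytes or more: show as M
         paste0(x / 1024, "K"))    # anything smaller: show as K
}

labels <- format_bytes(c(1024, 2048, 1048576))
print(labels)
```

Because the function is vectorised, it can be handed a whole vector of tick positions at once, which is what a scale would be expected to do.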

Regards,

Hadley


--
http://had.co.nz/

Nicholas Riley

Jan 22, 2009, 9:41:06 PM
to ggplot2
On Jan 22, 3:23 pm, hadley wickham <h.wick...@gmail.com> wrote:
> On Thu, Jan 22, 2009 at 1:52 PM, Nicholas Riley <nri...@gmail.com> wrote:
>
> > ggplot(pystone, aes(x=ml*16,fill=factor(level))) + geom_bar() +
> > scale_x_log2('bytes referenced') + scale_y_continuous('transactions')
>
> > The tick labels are 2^10, 2^11, 2^12, etc... I'd rather them show up
> > as 1K, 2K, etc. since they represent memory use.  Is there a way to
> > provide some kind of custom formatter function for the tick labels?
>
> You can do something like this:
>
> qplot(mpg, wt, data=mtcars) + scale_x_continuous(formatter=dollar)
>
> The formatter function should take a numeric vector as input and
> return an appropriately formatted character vector.

This doesn't seem to work for me:

kilobytes = function (x)
{
  paste(x / 1024, 'K', sep='')
}
ggplot(pystone, aes(x=ml*16,fill=factor(level))) + geom_bar() +
  scale_x_log2(name='bytes referenced', formatter=kilobytes) +
  scale_y_continuous('transactions') +
  scale_fill_discrete('max nesting depth', breaks=9:1)

I still get 2^10, 2^11, 2^12, etc.

I was able to get the labels to appear as desired with:

scale_x_continuous(name='bytes referenced', breaks=2**(10:15),
formatter=kilobytes)

but the ticks have a linear rather than logarithmic scale, which makes
the data hard to interpret in my case. I tried with scale_x_discrete
but no labels at all appeared. Is there a way to get custom labels
but keep the log scale, somehow, or do I need to rescale my data?

Thanks for the quick help on the other stuff - it works perfectly now.

--Nicholas

hadley wickham

Jan 22, 2009, 10:18:15 PM
to Nicholas Riley, ggplot2
On Thu, Jan 22, 2009 at 8:41 PM, Nicholas Riley <nri...@gmail.com> wrote:
>
> On Jan 22, 3:23 pm, hadley wickham <h.wick...@gmail.com> wrote:
>> On Thu, Jan 22, 2009 at 1:52 PM, Nicholas Riley <nri...@gmail.com> wrote:
>>
>> > ggplot(pystone, aes(x=ml*16,fill=factor(level))) + geom_bar() +
>> > scale_x_log2('bytes referenced') + scale_y_continuous('transactions')
>>
>> > The tick labels are 2^10, 2^11, 2^12, etc... I'd rather them show up
>> > as 1K, 2K, etc. since they represent memory use. Is there a way to
>> > provide some kind of custom formatter function for the tick labels?
>>
>> You can do something like this:
>>
>> qplot(mpg, wt, data=mtcars) + scale_x_continuous(formatter=dollar)
>>
>> The formatter function should take a numeric vector as input and
>> return an appropriately formatted character vector.
>
> This doesn't seem to work for me:
>
> kilobytes = function (x)
> {
>   paste(x / 1024, 'K', sep='')
> }
> ggplot(pystone, aes(x=ml*16,fill=factor(level))) + geom_bar() +
>   scale_x_log2(name='bytes referenced', formatter=kilobytes) +
>   scale_y_continuous('transactions') +
>   scale_fill_discrete('max nesting depth', breaks=9:1)
>
> I still get 2^10, 2^11, 2^12, etc.

Oooh, yeah, the log scale would override the formatter.

> I was able to get the labels to appear as desired with:
>
> scale_x_continuous(name='bytes referenced', breaks=2**(10:15),
> formatter=kilobytes)
>
> but the ticks have a linear rather than logarithmic scale, which makes
> the data hard to interpret in my case. I tried with scale_x_discrete
> but no labels at all appeared. Is there a way to get custom labels
> but keep the log scale, somehow, or do I need to rescale my data?

Try this:

breaks <- 2^(10:15)
scale_x_log2(name='bytes referenced', breaks = breaks,
  labels = kilobytes(breaks))
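
To see what is being handed to the scale, it may help to look at the two vectors on their own. This is a sketch in base R that reuses the `kilobytes` helper from earlier in the thread; no ggplot2 call is made, so nothing here depends on the scale API:

```r
# Same helper as posted earlier in the thread.
kilobytes <- function(x) paste(x / 1024, "K", sep = "")

breaks <- 2^(10:15)          # tick positions, in bytes
labels <- kilobytes(breaks)  # one precomputed label per break

print(labels)
# On a log2 axis these breaks sit one unit apart, i.e. evenly spaced.
print(diff(log2(breaks)))
```

Passing precomputed labels sidesteps the formatter entirely, which is presumably why it survives the log scale's own labelling.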

> Thanks for the quick help on the other stuff - it works perfectly now.

Great!

Hadley

--
http://had.co.nz/

Nicholas Riley

Feb 22, 2009, 12:54:49 PM
to ggplot2
On Jan 22, 3:23 pm, hadley wickham <h.wick...@gmail.com> wrote:
> > Also, the legend has 1 at the top and 9 at the bottom, whereas the
> > graph has 1 at the bottom and 9 at the top; is it possible to reverse
> > the legend so it matches the graph?
>
> scale_fill_discrete("My label", breaks = 9:1)

I'm now plotting a different stacked histogram, with a similar issue.
The legend is in alphabetical order, which does not match the plotting
order at all. Instead the plot stacks values in the order they appear
in the data set, with the first at the bottom. Similar to your
suggestion above, I used an explicit scale_fill_* and set the breaks
to match the observed plotting order.

cat = factor(thresholds$category)
ggplot(thresholds,
       aes(x=factor(threshold), y=count, fill=cat)) +
  geom_bar(stat='identity', position='stack') +
  facet_grid(. ~ bench) +
  xlab('-XX:CompileThreshold') + ylab('methods') +
  scale_fill_discrete('method category',
                      breaks=as.vector(rev(unique(cat)))) +
  scale_y_log10()

I can't help but think that I must be doing something wrong, though.
Is there a simpler way to always match the legend order to the plot
stacking order? Without the 'as.vector' I end up with the labels being
right justified and overlapping the color swatches; is this
intentional?

Thanks,

--Nicholas

James Howison

Feb 22, 2009, 1:03:44 PM
to ggplot2

On 22 Feb 2009, at 12:54 PM, Nicholas Riley wrote:

>
> On Jan 22, 3:23 pm, hadley wickham <h.wick...@gmail.com> wrote:
>>> Also, the legend has 1 at the top and 9 at the bottom, whereas the
>>> graph has 1 at the bottom and 9 at the top; is it possible to
>>> reverse
>>> the legend so it matches the graph?
>>
>> scale_fill_discrete("My label", breaks = 9:1)
>
> I'm now plotting a different stacked histogram, with a similar issue.
> The legend is in alphabetical order, which does not match the plotting
> order at all. Instead the plot stacks values in the order they appear
> in the data set, with the first at the bottom. Similar to your
> suggestion above, I used an explicit scale_fill_* and set the breaks
> to match the observed plotting order.
>
> cat = factor(thresholds$category)

btw, why the = sign there, you are doing assignment, no?

But more importantly try

thresholds$category <- factor(thresholds$category,
  levels = c("YourBottomLevel", "YourNextLevel", "AndSoOn", "YourTopLevel"),
  ordered = TRUE)

Note that you can leave the category inside the data frame for ease of
use.
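
A small base R sketch of how level order is determined. The category names are ones that appear in this thread, but the vector itself is made up for illustration. By default `factor()` sorts the levels, while passing `levels = unique(x)` keeps them in order of first appearance:

```r
# Made-up sample; only the category names come from the thread.
category <- c("Jython modules", "Java", "Jython compiler",
              "Python library", "Python benchmark", "Java")

default_levels <- levels(factor(category))                             # sorted
data_levels    <- levels(factor(category, levels = unique(category)))  # as seen

print(default_levels)
print(data_levels)
```

Whichever order is wanted in the legend, setting the levels explicitly like this is one way to pin it down instead of relying on the default sort.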

Not sure about the justification etc.

--J

hadley wickham

Feb 23, 2009, 9:14:57 AM
to Nicholas Riley, James Howison, ggplot2

Did James' answer help you? If not, it's very hard to tell exactly
what is going wrong and suggest how to fix it without a reproducible
example. Reproducing it with a built-in data set is best; otherwise
include a small sample of your data, if possible.

Nicholas Riley

Feb 23, 2009, 2:05:43 PM
to ggplot2
On Feb 22, 12:03 pm, James Howison <ja...@freelancepropaganda.com>
wrote:
> btw, why the = sign there, you are doing assignment, no?

I guess <- is more traditional? Is there a difference?
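
For a top-level assignment like the one above, the two behave the same. One difference shows up inside a function call, where `=` names an argument and `<-` still performs an assignment. A minimal sketch; the scratch environment `env` exists only to keep the demonstration self-contained:

```r
env <- new.env()

# '=' inside a call names the argument; no variable x is created.
local(mean(x = c(1, 2, 3)), envir = env)
made_by_equals <- exists("x", envir = env, inherits = FALSE)

# '<-' inside a call assigns x in the calling environment, then passes it on.
local(mean(x <- c(1, 2, 3)), envir = env)
made_by_arrow <- exists("x", envir = env, inherits = FALSE)

print(c(equals = made_by_equals, arrow = made_by_arrow))
```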

Thanks for the tip about the levels.

To be specific with my data:

> thresholds$category[1:5]
[1] Jython modules   Java             Jython compiler  Python library   Python benchmark
Levels: Java Jython compiler Jython core Jython modules Python benchmark Python library

The stacking order is that of the data (first to last, bottom to top);
the legend is sorted as with the levels. If I use:

thresholds$category = factor(thresholds$category,
  levels=rev(unique(thresholds$category)))

as you suggest, I get the same ordering, though with different colors,
as the code I posted.

I realized in retrospect it was trivial to change the stacking order
by sorting the data, in this case to match the default levels:

thresholds = thresholds[order(thresholds$category, decreasing=TRUE),]
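
A toy illustration of that sort in base R; the `count` values are invented. `order()` on a factor sorts by level position, so `decreasing = TRUE` puts the last level first:

```r
# Invented data; the category names are taken from the thread.
thresholds <- data.frame(
  category = factor(c("Jython modules", "Java", "Python library")),
  count    = c(3, 5, 2)
)

# Reorder the rows so they run from the last level to the first.
sorted <- thresholds[order(thresholds$category, decreasing = TRUE), ]
print(sorted)
```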

Hopefully my original question is now clearer. Is there a way to keep
the stacking order and legend order the same, or should I continue to
use sorting and reversing as above to match the observed behavior?

Thanks, and sorry for any confusion.

--Nicholas