Transformed y scale breaks stacked bar chart

Delta Kappa

Feb 15, 2016, 9:52:41 AM
to ggplot2

Hi,

I just encountered a strange behaviour when using a stacked geom_bar with scale_y_sqrt.
Stacked bars that have the same total height on the normal scale end up with differing total heights on the sqrt-transformed scale when one of them has more than one stack element.

To me it looks like a bug. At least it's not the behaviour I would expect.

Any ideas?

Best,
Daniel

Example:

library(ggplot2)

# Two bars: two stack elements at x=1 (heights 1 and 2) and one stack element at x=2 (height 3)
d <- data.frame(x=factor(c(1,1,2)), y=c(1,2,3), fill=factor(c(1,2,3)))

# Just stacking with normal scale gives two bars with equal total height of 3
ggplot(d, aes(x=x, fill=fill)) + geom_bar(aes(y=y), stat="identity", position="stack")

# Transforming with scale_y_sqrt leads to differing heights
ggplot(d, aes(x=x, fill=fill)) + geom_bar(aes(y=y), stat="identity", position="stack") + scale_y_sqrt()

# With ggvis, the result is as expected:
library(ggvis)
ggvis(d, x=~x, y=~y, fill=~fill) %>% layer_bars() %>% scale_numeric("y", trans="sqrt")

Dennis Murphy

Feb 15, 2016, 10:22:05 AM
to Delta Kappa, ggplot2
ggplot2 applies transformations at various points in the process of
training a ggplot, in the following order:

1. scale transformations (effected by scale_ functions);
2. statistical transformations (effected by stat_ functions);
3. coordinate transformations (effected by coord_ functions).

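geom_bar() draws the bars after the statistical transformation and the
position adjustment (stacking); with stat = "identity", as in your
example, the stat leaves the data unchanged. Thus, when you use
scale_y_sqrt(), it transforms each y before the values are stacked, so
the bar at x = 1 ends up with height sqrt(1) + sqrt(2) rather than
sqrt(3), which is why you don't get what you expected.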
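
To make the arithmetic concrete, here is a minimal sketch in base R. It
assumes only that the scale transformation is applied to each y value
before the values are stacked, and that a coordinate transformation is
applied to the already stacked totals; the variable names are made up
for illustration.

```r
# Heights from the example: two stack elements at x = 1, one at x = 2
y_x1 <- c(1, 2)
y_x2 <- 3

# scale_y_sqrt(): each value is transformed first, then the results are stacked
scale_then_stack <- c(x1 = sum(sqrt(y_x1)), x2 = sum(sqrt(y_x2)))

# coord_trans(y = "sqrt"): values are stacked first, then the totals are transformed
stack_then_coord <- c(x1 = sqrt(sum(y_x1)), x2 = sqrt(sum(y_x2)))

print(round(scale_then_stack, 3))  # x1 = 2.414, x2 = 1.732 (unequal)
print(round(stack_then_coord, 3))  # x1 = 1.732, x2 = 1.732 (equal)
```

Because sqrt(a) + sqrt(b) is larger than sqrt(a + b) for positive a and
b, any bar made of several stack elements comes out taller than a
single-element bar with the same total.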

OTOH, coordinate scaling takes place after statistical transformation,
so a coordinate change affects both the y-scale and the bars. So the
following code does what you expected:

d <- data.frame(x=factor(c(1,1,2)), y=c(1,2,3), fill=factor(c(1,2,3)))

library(ggplot2)
ggplot(d, aes(x = x, fill = fill)) +
  geom_bar(aes(y = y), stat = "identity", position = "stack") +
  coord_trans(y = "sqrt")
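
One possible way to check the difference without looking at the plots is
to compare the computed bar tops for both approaches. This is only a
sketch: it assumes your installed ggplot2 exports layer_data() and that
the bar layer's computed data has a ymax column; base_plot, top_scale
and top_coord are names I made up.

```r
library(ggplot2)

d <- data.frame(x = factor(c(1, 1, 2)), y = c(1, 2, 3), fill = factor(c(1, 2, 3)))

base_plot <- ggplot(d, aes(x = x, y = y, fill = fill)) +
  geom_bar(stat = "identity", position = "stack")

# Computed data of the first (and only) layer, without drawing the plot
built_scale <- layer_data(base_plot + scale_y_sqrt(), 1)
built_coord <- layer_data(base_plot + coord_trans(y = "sqrt"), 1)

# Top of the tallest stack in each case
top_scale <- max(built_scale$ymax)  # stacked in sqrt-transformed units
top_coord <- max(built_coord$ymax)  # stacked in original data units

print(c(scale = top_scale, coord = top_coord))
```

With scale_y_sqrt() the tallest stack should come out near
sqrt(1) + sqrt(2), while with coord_trans() it should be 3, the same
total as the other bar.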

This ordering is documented in both the first and second editions of
Hadley's ggplot2 book, the latter of which is forthcoming. To answer
your question, the behavior is not a bug - it's a design feature of
which you were apparently unaware.

Notice that the ggvis() call uses trans = inside scale_numeric(); it
borrows from ggplot2::coord_trans().

Dennis

Delta Kappa

Feb 15, 2016, 10:44:17 AM
to ggplot2, the.del...@googlemail.com
Hi Dennis,

thanks so much for clarifying.