Stacked ggplot chart does not look accurate.


Raphael

Oct 6, 2010, 8:15:39 PM
to ggplot2
Hello everyone!

These data compose the first bar of the graph. Naturally, the sum of
these four values of C2G equals the y value of the bar (about 14.198):
> cog[1:4,]
  Quarter        Segment        C2G
1 2008 Q1 Professional D  0.7509433
2 2008 Q1 Professional M  7.4151935
3 2008 Q1         Home D -0.4347982
4 2008 Q1         Home M  6.4669215
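
For readers who want to follow along without the full dataset, here is a minimal sketch that re-enters the four rows above by hand and checks the stated total. The name `cog_q1` is made up for this note; the real `cog` data frame covers many more quarters.

```r
# The four rows printed above, re-entered by hand (values copied from the post).
# `cog_q1` is a hypothetical name; it is only a slice of the poster's `cog`.
cog_q1 <- data.frame(
  Quarter = "2008 Q1",
  Segment = c("Professional D", "Professional M", "Home D", "Home M"),
  C2G     = c(0.7509433, 7.4151935, -0.4347982, 6.4669215)
)

# Net total of the four values: what the post calls the y value of the bar.
net_total <- sum(cog_q1$C2G)
print(round(net_total, 3))
```

The rounded total agrees with the "about 14.198" quoted in the message.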

This is the code I'm using:
stackedchart <- ggplot(cog, aes(x=Quarter, y=C2G, fill=factor(Segment)))
stackedchart <- stackedchart + layer(
  geom="bar",
  stat="identity",
  position="stack")
stackedchart

I expected the stacked "2008 Q1" bar to show all four Segment levels,
each with its own color. But the result only has three colors.
The smallest value, "Home D" at -0.435, is missing.

Note that even though the bar only has three visible factors, the
stacked bar "begins" slightly below the x-axis at y = -0.435. This
tells me that the negative "Home D" value is included in the
calculations--it's just not visible.

Why does this happen?
Is there any way to make it look right?
Does the resulting warning message "Stacking not well defined when
ymin != 0" have anything to do with it and, if so, how do I make it
well defined?

I'm a new R user. Any help you all provide is much appreciated!

-Raphael

Hadley Wickham

Oct 6, 2010, 10:16:56 PM
to Raphael, ggplot2
Hi Raphael,

How exactly do you expect to stack a bar with a negative height?

Hadley

> --
> You received this message because you are subscribed to the ggplot2 mailing list.
> Please provide a reproducible example: http://gist.github.com/270442
>
> To post: email ggp...@googlegroups.com
> To unsubscribe: email ggplot2+u...@googlegroups.com
> More options: http://groups.google.com/group/ggplot2
>

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

Raphael

Oct 6, 2010, 11:09:20 PM
to ggplot2, Hadley Wickham, Raphael
Well, it's a negative value more than a negative height in my view.

Perhaps I shouldn't, but I was expecting something similar to Excel. In
Excel, all four elements would be stacked. The negative "Home D" value
would begin at y = -0.435 and extend to y = 0. The other three elements
would stack on top of that. In the end, the top of the bar would be at
y = sum of all elements - Home D = 14.198 + 0.435 = 14.633. Effectively,
the bar runs from -0.435 up to 14.633, and its net value is still the
sum of the elements (14.198).
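
Worked through for the four first-quarter values, the Excel-style layout described above can be sketched in base R. `stack_signed` is a hypothetical helper written for this note, not a function from ggplot2 or Excel; it assumes positives stack upward from zero and negatives stack downward from zero.

```r
# First-quarter values copied from the thread.
c2g <- c("Professional D" = 0.7509433, "Professional M" = 7.4151935,
         "Home D" = -0.4347982, "Home M" = 6.4669215)

# Hypothetical helper: compute the lower and upper edge of each segment.
stack_signed <- function(v) {
  ymin <- ymax <- numeric(length(v))
  pos <- v >= 0
  # Positive values stack upward from zero.
  ymax[pos] <- cumsum(v[pos])
  ymin[pos] <- ymax[pos] - v[pos]
  # Negative values stack downward from zero.
  ymin[!pos] <- cumsum(v[!pos])
  ymax[!pos] <- ymin[!pos] - v[!pos]
  data.frame(segment = names(v), ymin = ymin, ymax = ymax)
}

bars <- stack_signed(c2g)
print(bars)
# Lowest and highest edge of the whole stacked bar.
print(range(c(bars$ymin, bars$ymax)))
```

Under these assumptions the "Home D" segment occupies the interval from -0.435 to 0, and the three positive segments share the interval from 0 up to their own sum.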

It seems to me like you're suggesting that negative values can't be
stacked in ggplot?

In fact... there are negative values in all Segment levels when I look
at the entire dataset (2008 Q1 - 2014 Q4) but "Home D" is the only
level that "disappears". It's almost as if whenever "Home D" is
negative, ggplot uses its negative value as the starting y-value for
the other three stacked values? That's what it seems like, at
least.

By the way, I find your Elegant Graphics book very helpful. I received
it from Amazon yesterday.

Thanks,

Raphael

Ista Zahn

Oct 7, 2010, 9:27:18 AM
to Raphael, ggplot2, Hadley Wickham
Hi Raphael,

On Wed, Oct 6, 2010 at 11:09 PM, Raphael <rhv....@gmail.com> wrote:
> Well, it's a negative value more than a negative height in my view.
>
> Perhaps I shouldn't, but I was expecting something similar to Excel. In
> Excel, all four elements would be stacked. The negative "Home D" value
> would begin at y = -0.435 and extend to y = 0. The other three elements
> would stack on top of that. In the end, the top of the bar would be at
> y = sum of all elements - Home D = 14.198 + 0.435 = 14.633. Effectively,
> the bar runs from -0.435 up to 14.633, and its net value is still the
> sum of the elements (14.198).
>
> It seems to me like you're suggesting that negative values can't be
> stacked in ggplot?

This can be done quite easily actually. You just need to plot the
positive and negative values in separate layers, like this:

# Flag each row by sign; exact zeros are grouped with the positives.
cog$pn <- ifelse(cog$C2G < 0, "-", "+")

stackedchart <- ggplot(cog, aes(x=Quarter, y=C2G, fill=factor(Segment)))
# Layer 1: positive values, stacked upward from zero.
stackedchart <- stackedchart + layer(
  geom="bar",
  stat="identity",
  position="stack",
  data=subset(cog, pn == "+"))
# Layer 2: negative values, stacked downward from zero.
stackedchart <- stackedchart + layer(
  geom="bar",
  stat="identity",
  position="stack",
  data=subset(cog, pn == "-"))
stackedchart + geom_hline(yintercept=0)
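
For readers on a newer ggplot2, here is a hedged sketch of the same plot in a single layer. It assumes ggplot2 2.2.0 or later, where, per the release notes, stacking places negative values below the axis on its own and `geom_col()` is available. `cog_q1` is a made-up name holding only the four rows quoted earlier in the thread, and the code is skipped entirely if a suitable ggplot2 is not installed.

```r
# Only the four first-quarter rows quoted in the thread; `cog_q1` is hypothetical.
cog_q1 <- data.frame(
  Quarter = "2008 Q1",
  Segment = c("Professional D", "Professional M", "Home D", "Home M"),
  C2G     = c(0.7509433, 7.4151935, -0.4347982, 6.4669215)
)

if (requireNamespace("ggplot2", quietly = TRUE) &&
    utils::packageVersion("ggplot2") >= "2.2.0") {
  library(ggplot2)
  # geom_col() draws bars from the y values as given (stat = "identity").
  p <- ggplot(cog_q1, aes(x = Quarter, y = C2G, fill = Segment)) +
    geom_col(position = "stack") +
    geom_hline(yintercept = 0)
  # Inspect the computed bar edges rather than relying on the picture alone.
  edges <- layer_data(p, 1)[, c("fill", "ymin", "ymax")]
  print(edges)
}
```

Printing `p` draws the chart; `edges` lists where each segment starts and ends, which makes it easy to confirm that the negative segment sits below zero.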

>
> In fact... there are negative values in all Segment levels when I look
> at the entire dataset (2008 Q1 - 2014 Q4) but "Home D" is the only
> level that "disappears". It's almost as if whenever "Home D" is
> negative, ggplot uses its negative value as the starting y-value for
> the other three stacked values? That's what it seems like, at
> least.

It's there, underneath the others. Try

stackedchart <- ggplot(cog, aes(x=Quarter, y=C2G, fill=factor(Segment)))
stackedchart <- stackedchart + layer(
  geom="bar",
  stat="identity",
  position="stack",
  alpha=.5)
stackedchart

and you will see it.
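
The same transparency check can also be written with `geom_bar()`, which accepts `alpha` as a fixed setting. This is only a sketch on the four quoted rows (`cog_q1` and `p_alpha` are made-up names), and whether any overlap actually shows up will depend on how the installed ggplot2 version stacks negative values.

```r
# Only the four first-quarter rows quoted in the thread; `cog_q1` is hypothetical.
cog_q1 <- data.frame(
  Quarter = "2008 Q1",
  Segment = c("Professional D", "Professional M", "Home D", "Home M"),
  C2G     = c(0.7509433, 7.4151935, -0.4347982, 6.4669215)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Semi-transparent bars let overlapping segments show through each other.
  p_alpha <- ggplot(cog_q1, aes(x = Quarter, y = C2G, fill = Segment)) +
    geom_bar(stat = "identity", position = "stack", alpha = 0.5)
  # Confirm the transparency setting reached the drawn layer.
  alpha_used <- unique(layer_data(p_alpha, 1)$alpha)
  print(alpha_used)
}
```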

Best,
Ista

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

Raphael

Oct 10, 2010, 1:14:48 PM
to ggplot2
Thanks for the help! It makes sense!

-Raphael

On Oct 7, 6:27 am, Ista Zahn <iz...@psych.rochester.edu> wrote:
> Hi Raphael,
>