Trouble getting stat summary and geom=ribbon to work

Sock

Sep 17, 2009, 12:49:40 PM
to ggplot2
Hi,

I'm trying to plot a longitudinal data set using ggplot, adding some
summary info (e.g. mean, 1 SD bounds) with geom=ribbon. The summary
info is based on a subset of the original data (e.g. with an outlier
removed). But I'm having trouble getting the ribbons to show up
correctly. It's probably something obvious that I'm missing as a
novice, and any help is much appreciated!

Here's a simple example. I tried several things:
- if I use geom=crossbar instead of geom=ribbon, everything is OK
- if Day is set as rep(c(1,2,3,8,9), each=8), everything is OK, which
makes me wonder whether the problem has to do with the ordering of
Day. Day is supposed to be numeric.
- if I calculate the summary stats externally and feed them to
geom_ribbon (per Thierry's suggestion on the R-help list), only Days 1
and 20 come up OK; the other days appear to be ignored.

Thanks!
Sock

### Example data. Ran using R version 2.9.2, ggplot2 version 0.8.3
###

library(ggplot2)
library(plyr)  # provides ddply, used further down

set.seed(13)

Day <- rep(c(1, 2, 3, 8, 20), each=8)
# The plot is ok if Day <- rep(c(1,2,3,8,9), each=8)

ID <- rep(LETTERS[1:8], 5)
Y <- rnorm(length(Day), 100, 5)
dat <- data.frame(Day=Day, ID=ID, Y=Y)

# outlier
dat$Y[dat$ID=="A" & dat$Day==8] <- 150
dat.less <- dat[!(dat$ID=="A" & dat$Day==8),]

# Longitudinal data plot. Observations for each subject are connected by a
# line over time.

p <- ggplot(dat, aes(x=Day, y=Y, group=ID)) +
scale_x_continuous(breaks=sort(unique(dat$Day))) +
geom_line(colour=alpha("blue", 5/10))

# Adding the mean and 1 SD bounds with the crossbar geom is OK, but the same
# info with the ribbon geom doesn't work.

p + stat_summary(data=dat.less, aes(group=1), geom="crossbar",
                 fun.data="mean_sdl", mult=1) +
    stat_summary(data=dat.less, aes(group=1), geom="ribbon",
                 fun.data="mean_sdl", mult=1, fill=alpha("blue", 1/10))

# Calculating summary stats externally and feeding it to geom_ribbon

RibbonData <- ddply(dat.less, "Day", function(x){
  mean(x$Y) + c(ymin = -1, ymax = 1) * sd(x$Y)
})

p + stat_summary(data=dat.less, aes(group=1), geom="crossbar",
                 fun.data="mean_sdl", mult=1) +
    geom_ribbon(data = RibbonData, aes(group = 1, ymin = ymin, ymax = ymax),
                fill=alpha("blue", 1/10))


Harlan Harris

Sep 18, 2009, 8:32:54 AM
to ggplot2
This looks like the ordering bug I and several others ran into a while
back. Some numeric data was being sorted alphabetically in some
stat_summary code. It's broken in ggplot2 0.8.3, but fixed in the
development version, fwiw.
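
To illustrate the kind of ordering problem described above, here is a minimal sketch in plain R. It is not the actual ggplot2 code; it only shows how the Day values from the example come out when they are sorted as strings rather than as numbers, and why swapping 20 for 9 would hide the problem. The variable names are made up for the illustration.

```r
# Day values from the example in this thread
days <- c(1, 2, 3, 8, 20)

# Sorting the values as strings compares them character by character,
# so "20" lands between "2" and "3".
alphabetical <- sort(as.character(days))

# Sorting them as numbers keeps 20 at the end.
numeric_order <- as.character(sort(days))

print(alphabetical)
print(numeric_order)

# With single-digit days only (1, 2, 3, 8, 9) the two orderings agree,
# which would explain why that variant of the plot looks fine.
days_single <- c(1, 2, 3, 8, 9)
same_order_single_digit <- identical(sort(as.character(days_single)),
                                     as.character(sort(days_single)))
print(same_order_single_digit)
```

If the x values reach the ribbon in the first ordering, the band is drawn through the days out of sequence, which is consistent with the symptom Sock describes.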

One of the threads on the bug:
http://groups.google.com/group/ggplot2/browse_thread/thread/6e52d349d66c3526/39044af12298774c

Development version: http://github.com/hadley/ggplot2

Installation in some versions of Linux (Windows is trickier):
wget -O - http://github.com/hadley/ggplot2/tarball/master | tar -xzf -
R CMD INSTALL hadley-ggplot2-[tab]

-Harlan

Sock

Sep 18, 2009, 11:53:07 AM
to ggplot2
Thanks Harlan, it does sound like an ordering issue.

If anyone else has any other thoughts on this, I'm all ears.

If it's really a bug, I may just put off implementing the ribbons for
now and wait for the fixed version to be released, because of issues
around using development code for this project. I'm trying to build
an analysis template that needs to be stable and reproducible for the
long run.
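
One way a template could guard the ribbon layer until a fixed release is installed is a small version check, sketched below. The helper name ribbon_summary_ok is hypothetical, and the rule that any release newer than 0.8.3 contains the fix is an assumption; the thread only says that 0.8.3 is broken and the development version is fixed.

```r
# Hypothetical helper: returns TRUE when the given ggplot2 version string is
# newer than 0.8.3, the release reported as broken in this thread.
# ASSUMPTION: every release after 0.8.3 includes the ordering fix.
ribbon_summary_ok <- function(version) {
  package_version(version) > package_version("0.8.3")
}

print(ribbon_summary_ok("0.8.3"))
print(ribbon_summary_ok("0.8.4"))
```

In a template this could be called as ribbon_summary_ok(as.character(packageVersion("ggplot2"))), assuming an R version that provides packageVersion(), with the crossbar geom as the fallback when it returns FALSE.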

Thanks,
Sock

