mean+se for stat_summary


takahashi kohske

Aug 10, 2010, 4:44:21 AM
to ggplot2
Hi,

I usually use mean +/- SE (standard error) for error bars, even though
mean +/- SE is old-fashioned.
Could you please consider adding a small function for this in the next
version of ggplot2?

# Summary function for stat_summary: mean +/- one standard error, NAs dropped.
mean_se <- function(x, ...) {
  x <- na.omit(x)
  se <- function(x) sqrt(var(x) / length(x))
  data.frame(y = mean(x), ymin = mean(x) - se(x), ymax = mean(x) + se(x))
}

You can use it like this:

qplot(cyl, mpg, data = mtcars) + stat_summary(fun.data = mean_se, colour = "red")
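For anyone who wants to check the numbers without drawing a plot, here is a minimal base R sketch that applies the same helper to mpg within each cyl group. It assumes stat_summary calls the function once per x value; the name by_cyl is only for illustration.

```r
# Same helper as in the post: mean +/- one standard error, NAs dropped.
mean_se <- function(x, ...) {
  x <- na.omit(x)
  se <- sqrt(var(x) / length(x))
  data.frame(y = mean(x), ymin = mean(x) - se, ymax = mean(x) + se)
}

# Apply the helper to mpg within each cyl group (assumed to mirror
# what stat_summary does per x value). One row per group is returned.
by_cyl <- do.call(rbind, lapply(split(mtcars$mpg, mtcars$cyl), mean_se))
print(by_cyl)
```

Each row holds the y, ymin and ymax values that the red point range would show for that cylinder count.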

Actually, it is easy to define this myself, but I have written this
function many times.

Thanks in advance.

Hadley Wickham

Aug 10, 2010, 9:07:21 AM
to takahashi kohske, ggplot2
Good idea - it'll be there in ggplot2 0.8.9.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

Elaine

Aug 10, 2010, 9:40:43 AM
to ggplot2
Great to hear. We use this all the time, but it would be nice not to
have to source the function. We most often want to show 2 SEs, so a
multiplicative factor would be a good parameter to have.

takahashi kohske

Aug 10, 2010, 1:07:42 PM
to Elaine, ggplot2
Hi,

Thanks Hadley, I'm glad you agree.

As for Elaine's proposal, there are probably two approaches.

One is fully parameterized:

mean_se <- function(n = 1) {
  return(function(x) {
    x <- na.omit(x)
    se <- function(x) sqrt(var(x) / length(x)) * n
    data.frame(y = mean(x), ymin = mean(x) - se(x), ymax = mean(x) + se(x))
  })
}

Usage:

qplot(cyl, mpg, data = mtcars) + stat_summary(fun.data = mean_se(), colour = "red")
qplot(cyl, mpg, data = mtcars) + stat_summary(fun.data = mean_se(2), colour = "red")

This is flexible, but inconsistent with the other mean_* helpers, e.g.
mean_cl_boot, because those are summary functions themselves, not
functions that return one.
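To make the difference concrete, here is a small base R sketch of the factory style. The name mean_se_factory is hypothetical and used only to avoid clashing with the other definitions in this thread; the point is that calling it returns the function that would be handed to fun.data.

```r
# Hypothetical name: a factory that returns a summary function
# with the multiplier n fixed in its enclosing environment.
mean_se_factory <- function(n = 1) {
  function(x) {
    x <- na.omit(x)
    se <- n * sqrt(var(x) / length(x))
    data.frame(y = mean(x), ymin = mean(x) - se, ymax = mean(x) + se)
  }
}

# The factory must be called first; the result is then applied to the data.
one_se <- mean_se_factory()(mtcars$mpg)
two_se <- mean_se_factory(2)(mtcars$mpg)

# Helper to compare interval widths.
width <- function(d) d$ymax - d$ymin
print(c(one = width(one_se), two = width(two_se)))
```

The 2 SE interval comes out twice as wide as the 1 SE interval, but the extra pair of parentheses is exactly the inconsistency described above.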

The other approach is a dedicated function for each multiplier:

mean_2se <- function(x) {
  x <- na.omit(x)
  se <- function(x) sqrt(var(x) / length(x)) * 2
  data.frame(y = mean(x), ymin = mean(x) - se(x), ymax = mean(x) + se(x))
}

qplot(cyl, mpg, data = mtcars) + stat_summary(fun.data = mean_2se, colour = "red")

This may be too ad hoc, but it is consistent with the others. In addition,
2 SE or 3 SE is enough for this purpose.

I don't know which approach is better. Either way, the implementation is easy.

Thanks.

Hadley Wickham

Aug 12, 2010, 10:57:41 AM
to takahashi kohske, Elaine, ggplot2
Or a little more simply:

mean_se <- function(x, mult = 1) {
  x <- na.omit(x)
  se <- mult * sqrt(var(x) / length(x))
  mean <- mean(x)
  data.frame(y = mean, ymin = mean - se, ymax = mean + se)
}

qplot(cyl, mpg, data = mtcars) +
  stat_summary(fun.data = mean_se, colour = "red", mult = 2)
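A note for readers on later ggplot2 releases: as far as I know, extra arguments for the summary function are now supplied through the fun.args parameter of stat_summary rather than passed straight through, and a mean_se with a mult argument ships with the package. The sketch below assumes such a release; the local variable m replaces the shadowed name mean, and the plotting part is skipped if ggplot2 is not installed.

```r
# Same logic as the function above, with a local name that
# does not shadow base::mean.
mean_se <- function(x, mult = 1) {
  x <- na.omit(x)
  se <- mult * sqrt(var(x) / length(x))
  m <- mean(x)
  data.frame(y = m, ymin = m - se, ymax = m + se)
}

# Direct call: mult scales the half-width of the interval.
res <- mean_se(mtcars$mpg, mult = 2)
print(res)

# Plotting sketch for a later ggplot2 release (assumption: fun.args is
# available). Runs only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mtcars, aes(cyl, mpg)) +
    geom_point() +
    stat_summary(fun.data = mean_se, fun.args = list(mult = 2), colour = "red")
}
```

Printing p draws the same picture as the qplot call: points per car, with a red mean and a 2 SE range for each cylinder count.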

Hadley

takahashi kohske

Aug 12, 2010, 7:47:26 PM
to Hadley Wickham, Elaine, ggplot2
Yes... I completely missed that we can pass extra arguments through to
fun.*. Excellent.
I'm looking forward to the new version.