geom_boxplot by mean and std deviations instead of medians


Giovanni Marco Dall'Olio

Feb 26, 2010, 10:08:25 AM
to ggp...@googlegroups.com
Hi,
the current documentation for geom_boxplot is not very specific: I don't understand what the boxplots produced by geom_boxplot show (the median and quartiles, or some other measure), or whether it is possible to change this.

For example, if you do:
> qplot(data=diamonds, x=color, y=price, geom='boxplot')

it seems that ggplot2 plots the median and quartiles of price, but the documentation does not say so explicitly.

Moreover, I would like to know whether I can draw boxplots based on the mean and a 95% confidence interval around the mean, instead of quantiles. I have tried geom_boxplot(stat='mean') and similar calls, and I also looked at the documentation for stat_boxplot, but I couldn't find more information.

Many thanks, ggplot2 is one of my favorite R libraries.

--
Giovanni Dall'Olio, phd student
Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain)

My blog on bioinformatics: http://bioinfoblog.it

hadley wickham

Feb 26, 2010, 11:20:45 AM
to dallo...@gmail.com, ggp...@googlegroups.com
Hi Giovanni,

It's a boxplot - http://en.wikipedia.org/wiki/Box_plot

If you want to do something else, it's not a boxplot, and you might
try looking at stat_summary.
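
For a mean with a 95% confidence interval, a stat_summary call might look like the sketch below. The helper mean_ci95 is a hypothetical name introduced here, and it assumes a t-based interval is acceptable for the data. stat_summary accepts any function passed as fun.data that returns a data frame with columns y, ymin and ymax.

```r
# Hypothetical helper (not part of ggplot2): mean with a t-based 95% CI.
# Assumes at least two observations per group.
mean_ci95 <- function(y) {
  n <- length(y)
  m <- mean(y)
  half <- qt(0.975, df = n - 1) * sd(y) / sqrt(n)  # half-width of the interval
  data.frame(y = m, ymin = m - half, ymax = m + half)
}

# Plot one point with an interval bar per diamond color (needs ggplot2).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(diamonds, aes(x = color, y = price)) +
    stat_summary(fun.data = mean_ci95, geom = "pointrange")
  # print(p) draws the plot
}
```

This draws a point range rather than a box, which avoids suggesting quartiles where none are shown.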

Hadley


--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

takahashi kohske

Feb 26, 2010, 11:06:32 AM
to dallo...@gmail.com, ggp...@googlegroups.com
Hi, I'm a newbie in ggplot2, so there may be a more elegant solution.

As for the first question, the statistics are documented on the ggplot2 website:
http://had.co.nz/ggplot2/stat_boxplot.html
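
As a rough sketch of what a standard (Tukey) box plot summarises, the five values can be computed in base R as below. This reflects my understanding of stat_boxplot (quartiles as the box edges, whiskers reaching the most extreme point within 1.5 times the IQR of the box); the exact quantile method may differ between versions. The function box_stats and the prices vector are made up for illustration.

```r
# Hypothetical helper: the five statistics of a Tukey-style box plot.
box_stats <- function(y, coef = 1.5) {
  q <- quantile(y, probs = c(0.25, 0.5, 0.75), names = FALSE)
  iqr <- q[3] - q[1]
  # Points inside the fences; anything outside would be drawn as an outlier.
  inside <- y >= q[1] - coef * iqr & y <= q[3] + coef * iqr
  data.frame(
    ymin   = min(y[inside]),  # end of lower whisker
    lower  = q[1],            # bottom of box (25th percentile)
    middle = q[2],            # median
    upper  = q[3],            # top of box (75th percentile)
    ymax   = max(y[inside])   # end of upper whisker
  )
}

prices <- c(300, 350, 400, 450, 500, 550, 600, 5000)
stats <- box_stats(prices)
print(stats)
```

Here 5000 lies beyond the upper fence, so the upper whisker stops at 600 and 5000 would be plotted as a separate point.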

As for the second question, you can combine stat_summary and geom_boxplot:

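library(ggplot2)

# Summary function for stat_summary:
# box = bootstrap 95% CI of the mean (from Hmisc), middle line = mean,
# whiskers = full range of the data.
fun <- function(y) {
  ci <- Hmisc::smean.cl.boot(y)  # named vector: Mean, Lower, Upper
  data.frame(lower = ci[["Lower"]], upper = ci[["Upper"]],
             ymin = min(y), ymax = max(y), middle = mean(y))
}

# First 100 rows only, to keep the bootstrap fast
qplot(data = diamonds[1:100, ], x = color, y = price) +
  stat_summary(fun.data = fun, geom = 'boxplot')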

Please play with the function fun(), e.g., change what it returns. Note
that the column names of the returned data frame must match the
aesthetic names used by geom_boxplot (see ?geom_boxplot).
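
For instance, a variant based on the mean and standard deviation could look like the sketch below. The helper mean_sd_box is a hypothetical name, and the choice of one standard deviation for the box and two for the whiskers is only an assumption; pick whatever multiples suit your data.

```r
# Hypothetical helper: box = mean +/- 1 sd, whiskers = mean +/- 2 sd.
# Column names follow the aesthetics geom_boxplot expects.
mean_sd_box <- function(y) {
  m <- mean(y)
  s <- sd(y)
  data.frame(
    ymin   = m - 2 * s,
    lower  = m - s,
    middle = m,
    upper  = m + s,
    ymax   = m + 2 * s
  )
}

# One box per diamond color (needs ggplot2).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(diamonds, aes(x = color, y = price)) +
    stat_summary(fun.data = mean_sd_box, geom = "boxplot")
  # print(p) draws the plot
}
```

Since readers will assume a box shows quartiles, it is worth stating in the caption what the box and whiskers mean here.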

HTH

-KO


Giovanni Marco Dall'Olio

Mar 1, 2010, 5:36:54 AM
to takahashi kohske, ggp...@googlegroups.com

Many thanks, it seems to work fine.
I know that boxplots are usually drawn with the median and quartiles, but even the definition on Wikipedia admits that it is not uncommon to draw them with means and standard deviations, so I thought there might already be an option for it.

 