On 2012-Apr-10, at 01:56 , Mikhail Titov wrote:
>
> I am a lattice user but from what I understand there is an attitude against group-wise boxplots. So I decided to switch to ggplot2 for that. I saw an example that adds means to boxplot. However I can't figure out how to properly use it when there are groups, i.e. when I use "fill" aesthetics (do I use proper terms?). So far I have the following as a quick example:
>
> library(ggplot2)
> df <- data.frame(
> x = factor(month.abb[rep(1:12, each=30)], month.abb),
> y = runif(12*30, max=rep(1:12, each=30))*rep(3:1,each=12*10),
> z = rep(factor(c("a","b","c")), 12*10)
> )
> ggplot(df, aes(x=x,y=y, fill=z)) +
> geom_boxplot() +
> stat_summary(fun.y=mean, geom="point", aes(x=x,group=z), shape=5, size=2)
> # stat_summary(fun.y=mean, geom="point", aes(x=x,fill=z), shape=5, size=2)
>
> which apparently doesn't work properly. How can I make means appear properly along the x axis?
The boxplots are automatically "dodged" so that they do not overlap. You also need to dodge the points for the means. By default, though, points are dodged along the y axis. If you do:
ggplot(df, aes(x=x,y=y, fill=z)) +
  geom_boxplot() +
  stat_summary(fun.y=mean, geom="point", aes(x=x,group=z), shape=5, size=2,
               position=position_dodge(width=0.75, height=0))
it issues a warning but works. Someone more familiar than I am with ggplot's internals might be able to explain why.
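For more recent ggplot2 releases, a minimal sketch of the same idea might look like the following. It assumes ggplot2 3.3.0 or later, where `fun` replaces `fun.y` in stat_summary(), and it drops the `height` argument to position_dodge(), which I assume is the source of the warning. The dodge width of 0.75 is taken from the call above; set.seed(1) is added only to make the example reproducible.

```r
library(ggplot2)

set.seed(1)  # assumption: added only for reproducibility

# Same example data as in the question: 12 months, 3 groups per month
df <- data.frame(
  x = factor(month.abb[rep(1:12, each = 30)], levels = month.abb),
  y = runif(12 * 30, max = rep(1:12, each = 30)) * rep(3:1, each = 12 * 10),
  z = rep(factor(c("a", "b", "c")), 12 * 10)
)

p <- ggplot(df, aes(x = x, y = y, fill = z)) +
  geom_boxplot() +
  # One mean per month and group, dodged by the same width as the boxes
  stat_summary(fun = mean, geom = "point", aes(group = z),
               shape = 5, size = 2,
               position = position_dodge(width = 0.75))

print(p)
```

The points only line up with the boxes if both layers are dodged by the same width, so the 0.75 needs changing if the boxplot width is changed.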
> Also is there an easy way to rename factors in legend only like aaaaa, bbb, cccccc instead of a,b,c ? The reason for this is that I use some ids internally as it is convenient for me but I'd like use full descriptions in plots.
Just rename the factor levels in the data and then plot:
df$z = factor(df$z, levels=c("a", "b", "c"), labels=c("aaa", "bbb", "ccc"))
or change the labels on the fill scale only, leaving the data untouched:
last_plot() + scale_fill_discrete(breaks=levels(df$z), labels=c("aaa", "bbb", "ccc"))
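A self-contained sketch of the scale-only approach, assuming a small made-up data frame (the y values are purely illustrative and not from the original post). The data keep the short ids and only the legend text changes.

```r
library(ggplot2)

# Small made-up data set; z holds the short internal ids
df <- data.frame(
  x = factor(rep(c("Jan", "Feb"), each = 6), levels = c("Jan", "Feb")),
  y = c(1, 2, 3, 2, 3, 4, 2, 4, 6, 3, 5, 7),
  z = factor(rep(c("a", "b", "c"), times = 4))
)

p <- ggplot(df, aes(x = x, y = y, fill = z)) +
  geom_boxplot() +
  # breaks are the ids stored in the data, labels are what the legend shows
  scale_fill_discrete(breaks = levels(df$z), labels = c("aaa", "bbb", "ccc"))

print(p)
levels(df$z)  # the factor levels in the data are unchanged
```

This keeps the ids usable in the rest of the code, at the cost of repeating the scale call for every plot that needs the long descriptions.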
Jean-Olivier Irisson
---
Observatoire Océanologique
Station Zoologique, B.P. 28, Chemin du Lazaret
06230 Villefranche-sur-Mer
Tel: +33 04 93 76 38 04
Mob: +33 06 21 05 19 90
http://jo.irisson.com/