vline using stat_summary


Adam_L...@keybank.com

Oct 3, 2011, 11:24:05 AM
to ggp...@googlegroups.com
Is there an easy way to use geom_vline or stat_summary to produce a vertical line that represents the mean of a distribution? I can do it by summarizing outside of ggplot2 and then using multiple data frames, but I have had difficulty getting it to work against a single data frame. Yet this seems like a simple enough summary to ask ggplot2 to compute itself.


example:

data.x = data.frame(x = rnorm(1000))
ggplot(data.x,aes(x)) + geom_density() + stat_summary(aes(xintercept = ?),<this is where it gets fuzzy>,geom="vline")  


Thanks very much for the help.

Adam Loveland




Brian Diggs

Oct 3, 2011, 1:28:21 PM
to ggplot2
On 10/3/2011 8:24 AM, Adam_Loveland-QlCk...@public.gmane.org wrote:
> Is there an easy way to use geom_vline or stat_summary to produce a
> vertical line that represents the mean of a distribution? I can do it by
> summarizing outside of ggplot2 and then using multiple dataframes, but
> have had difficulty getting it to work against a single dataframe. And
> yet, this seems like a simple enough task to ask ggplot2 to do the
> summarizing.
>
>
> example:
>
> data.x = data.frame(x = rnorm(1000))
> ggplot(data.x,aes(x)) + geom_density() + stat_summary(aes(xintercept =
> ?),<this is where it gets fuzzy>,geom="vline")

In this case, you don't even need to invoke stat_summary:

ggplot(data.x, aes(x)) +
  geom_density() +
  geom_vline(aes(xintercept = mean(x)))
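
For a single, unfaceted distribution, an equivalent way to write this is to compute the mean once and pass it to geom_vline() as a constant instead of mapping it inside aes(). The following is only a minimal sketch: the name x_mean and the fixed seed are assumptions added for illustration, not part of the original example.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
data.x <- data.frame(x = rnorm(1000))

# Compute the overall mean once, outside of aes()
x_mean <- mean(data.x$x)

# Pass the mean as a constant xintercept rather than an aesthetic mapping
p <- ggplot(data.x, aes(x)) +
  geom_density() +
  geom_vline(xintercept = x_mean)

print(p)
```

Either form draws one overall mean, so neither gives a separate line per facet or group.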


> Thanks very much for the help.
>
> Adam Loveland


--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University

Adam_L...@keybank.com

Oct 3, 2011, 1:33:36 PM
to Brian Diggs, ggplot2
Brian,

Thanks. But that doesn't work inside different facets. Does it?



Adam Loveland


Brian Diggs

Oct 3, 2011, 2:12:30 PM
to ggplot2, Adam_L...@keybank.com
On 10/3/2011 10:33 AM, Adam_Loveland-QlCk...@public.gmane.org wrote:
> Brian,
>
> Thanks. But that doesn't work inside different facets. Does it?

No, it doesn't. That wasn't part of the original question :)

I don't think you can get it to work with stat_summary either.
stat_summary will "Summarise y values at every unique x", not allow a
summary of each facet/group. (I'd love to see a more general version
that would, though.) For this, you will have to pre-compute the means.


library(ggplot2)
library(plyr)  # provides ddply() and summarise()

# smaller data, to better see mean differences
data.x = data.frame(x = rnorm(20), set = rep(c("A","B"), each=10))

# this gives the overall mean in each facet, not a per-facet mean
ggplot(data.x, aes(x=x)) +
  geom_density() +
  geom_vline(aes(xintercept = mean(x), group=set)) +
  facet_grid(set~.)

# pre-compute the means (one row per level of set)
means <- ddply(data.x, .(set), summarise, xmean=mean(x))

# supply the alternate data set to just the geom_vline layer.
ggplot(data.x, aes(x=x)) +
  geom_density() +
  geom_vline(data=means, aes(xintercept = xmean)) +
  facet_grid(set~.)
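
If plyr is not available, the same pre-computation can be sketched with base R's aggregate(). This is an illustrative alternative, not what was used above; the fixed seed and the renaming step (to keep the column name xmean) are assumptions added for the sketch.

```r
library(ggplot2)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible
data.x <- data.frame(x = rnorm(20), set = rep(c("A", "B"), each = 10))

# Base R alternative to ddply(): one mean of x per level of `set`
means <- aggregate(x ~ set, data = data.x, FUN = mean)

# Rename the summary column so it matches the xmean name used above
names(means)[names(means) == "x"] <- "xmean"

# Hand the summary data frame to the geom_vline layer only
p <- ggplot(data.x, aes(x = x)) +
  geom_density() +
  geom_vline(data = means, aes(xintercept = xmean)) +
  facet_grid(set ~ .)

print(p)
```

Because means keeps the set column, facet_grid() can place each line in its matching panel.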


Dennis Murphy

Oct 3, 2011, 2:28:59 PM
to Brian Diggs, ggplot2, Adam_L...@keybank.com
Hi:

I tried the same process as you did, Brian, but your first method
doesn't work on my system (Win 7, R 2.13.1, ggplot2 0.8.9). I needed to
summarize the means into a data frame with ddply and pass those means
to geom_vline() (method 2) before it would work. I tried using group =
both in the ggplot() statement and in geom_vline(), but neither
worked, and offhand I don't see why they shouldn't.

My example:

library(ggplot2)
library(plyr)  # provides ddply() and summarise()

d <- data.frame(g = factor(rep(paste('Group', 1:3), each = 100)),
                y = rnorm(300))
# produces same overall mean in each panel
ggplot(d, aes(x = y)) +
  geom_density() +
  geom_vline(aes(xintercept = mean(y), group = g)) +
  facet_grid(g ~ .)

# one row per group, holding that group's mean
dm <- ddply(d, .(g), summarise, m = mean(y))
# produces different means in each panel
ggplot(d, aes(x = y)) +
  geom_density() +
  geom_vline(data = dm, aes(xintercept = m)) +
  facet_wrap( ~ g, ncol = 1)

Dennis

Hadley Wickham

Oct 4, 2011, 11:17:50 AM
to Brian Diggs, ggplot2, Adam_L...@keybank.com
> I don't think you can get it to work with stat_summary either. stat_summary
> will "Summarise y values at every unique x", not allow a summary of each
> facet/group. (I'd love to see a more general version that would, though.)

It's called ddply ;)

Hadley


--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
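
To illustrate the ddply suggestion, here is a minimal sketch of computing more than one per-group summary in a single call and drawing each as its own vertical line. The median column (med), the dashed linetype, and the fixed seed are assumptions added for illustration; they do not appear earlier in the thread.

```r
library(ggplot2)
library(plyr)

set.seed(123)  # assumption: fixed seed so the sketch is reproducible
d <- data.frame(g = factor(rep(paste("Group", 1:3), each = 100)),
                y = rnorm(300))

# One row per group, with as many summary columns as needed
dm <- ddply(d, .(g), summarise, m = mean(y), med = median(y))

# Solid line at each group's mean, dashed line at each group's median
p <- ggplot(d, aes(x = y)) +
  geom_density() +
  geom_vline(data = dm, aes(xintercept = m)) +
  geom_vline(data = dm, aes(xintercept = med), linetype = "dashed") +
  facet_wrap(~ g, ncol = 1)

print(p)
```

The summary step is ordinary R code, so any function that returns a single value per group can be used in place of mean() or median().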
