Add mean line to density plot, but clip it so that it stays inside the border of the density curve.

eipi10

Oct 1, 2013, 5:59:22 PM
to ggp...@googlegroups.com
With some help from this group and Stack Overflow, I was able to add vertical lines marking the means in a grouped density plot. However, the vertical lines extend all the way from the very bottom to the very top of the plot region. Is there any way to clip the lines so that they stay inside the borders of the density plot?

Here's a reproducible example with fake data:

df =structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L), .Label = c("Group 1", "Group 2"), class = "factor"), score = c(10.8,
11.7, 15.2, 5.8, 11.7, 8.4, 7.7, 12.7, 11.5, 8.9, 9.6, 7.2, 10,
9.8, 7.3, 7.9, 7.4, 10.3, 13, 11.5, 13.1, 7.9, 14.5, 4.1, 4,
7.6, 0.1, 4.5, 6.5, -1.3, 2.2, 5.1, 6.2, 3.1, 8.3, 5, 4, 4.1,
-0.2, 12.4)), .Names = c("group", "score"), row.names = c(NA,
-40L), class = "data.frame")

library(plyr)     # for ddply()
library(ggplot2)

grp.mean = ddply(df, .(group), summarise, mean=mean(score))

ggplot(df, aes(score, fill=group)) +
  geom_density(lwd=.7,alpha=.4) +
  geom_vline(data=grp.mean,
             mapping=aes(xintercept=mean, colour=group), lwd=1)

I'd like to clip the vertical lines so that they remain inside the borders of the density curves.

Thanks for any suggestions on how to do this.

Joel

Thomas

Oct 3, 2013, 9:22:07 AM
to ggp...@googlegroups.com
Below is an inexact solution you can use until someone posts the exact solution.

Use geom_segment()

library(ggplot2)
library(plyr)
...
grp.mean = cbind(ddply(df, .(group), summarise, mean=mean(score)), O2=c(0.135,0.107), O=c(0,0))

ggplot(df, aes(score, fill=group)) +
    geom_segment(data=grp.mean, aes( x=mean, y=O, xend=mean, yend=O2), lwd=1)+
    geom_density(lwd=.7,alpha=.4)

This requires that you guess at the Y values. A more exact solution can probably be achieved by finding the coordinates geom_density is using at those points.
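
One way to find those coordinates without guessing, sketched here on the assumption that geom_density() is left at its defaults (so that base R's density() with default settings should give very nearly the same curve), is to interpolate the density estimate at each group's sample mean with approx(). The helper name dens_height_at_mean is made up for this illustration, and the data frame is rebuilt from the scores in the original post so that the snippet runs on its own.

```r
# Example data from the original post, rebuilt compactly
df <- data.frame(
  group = factor(rep(c("Group 1", "Group 2"), each = 20)),
  score = c(10.8, 11.7, 15.2, 5.8, 11.7, 8.4, 7.7, 12.7, 11.5, 8.9,
            9.6, 7.2, 10, 9.8, 7.3, 7.9, 7.4, 10.3, 13, 11.5,
            13.1, 7.9, 14.5, 4.1, 4, 7.6, 0.1, 4.5, 6.5, -1.3,
            2.2, 5.1, 6.2, 3.1, 8.3, 5, 4, 4.1, -0.2, 12.4)
)

# Hypothetical helper (not from the thread): height of the kernel density
# estimate at the sample mean, by linear interpolation on density()'s grid.
dens_height_at_mean <- function(x) {
  d <- density(x)                 # default bandwidth, Gaussian kernel
  m <- mean(x)
  data.frame(mean = m, est.dens = approx(d$x, d$y, xout = m)$y)
}

# One row per group; row names carry the group labels
grp.mean <- do.call(rbind, lapply(split(df$score, df$group),
                                  dens_height_at_mean))
grp.mean$group <- factor(rownames(grp.mean))
grp.mean
```

The resulting grp.mean has the columns mean, est.dens and group, so it can be passed to geom_segment() with y = 0 and yend = est.dens in place of the guessed O and O2 columns.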

eipi10

Oct 3, 2013, 2:10:19 PM
to ggp...@googlegroups.com
Dennis Murphy sent me the solution below. It adapts code from a post on Cross Validated to calculate the y-value of the density function at each group's mean, and then uses geom_segment() to draw the vertical lines between zero and that y-value.

Joel

library(plyr)
library(ggplot2)


df =structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L), .Label = c("Group 1", "Group 2"), class = "factor"), score = c(10.8,
11.7, 15.2, 5.8, 11.7, 8.4, 7.7, 12.7, 11.5, 8.9, 9.6, 7.2, 10,
9.8, 7.3, 7.9, 7.4, 10.3, 13, 11.5, 13.1, 7.9, 14.5, 4.1, 4,
7.6, 0.1, 4.5, 6.5, -1.3, 2.2, 5.1, 6.2, 3.1, 8.3, 5, 4, 4.1,
-0.2, 12.4)), .Names = c("group", "score"), row.names = c(NA,
-40L), class = "data.frame")

dens_at_mean <- function(d)
{
  # Adapted from a Cross Validated answer by whuber:
  # http://stats.stackexchange.com/questions/32093/how-to-draw-mean-median-and-mode-lines-in-r-that-end-at-density
  dens <- density(d$score)
  n <- length(dens$y)
  dx <- mean(diff(dens$x))                  # Typical spacing in x
  y.unit <- sum(dens$y) * dx                # Check: this should integrate to 1
  dx <- dx / y.unit                         # Make a minor adjustment
  x.mean <- sum(dens$y * dens$x) * dx
  # Density at the last grid point to the left of the mean
  y.mean <- dens$y[length(dens$x[dens$x < x.mean])]
  data.frame(mean = x.mean, est.dens = y.mean)
}

grp.mean <- ddply(df, .(group), dens_at_mean)

ggplot(df, aes(score, fill=group)) +
  geom_density(lwd=.7,alpha=.4, color = "transparent") +
  geom_segment(data = grp.mean, aes(x = mean, xend = mean, y = 0,
                                    yend = est.dens, color = group), size = 1)

Setting color to transparent in geom_density gets rid of those
diagonal lines many people hate in the legend key, leaving a
reasonably nice-looking colored segment in their place. To get rid of
that segment as well, suppress the color scale with

... + scale_color_discrete(guide = "none")
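
A side note on dens_at_mean(): it takes the mean of the estimated density on density()'s grid rather than the mean of the raw scores, so the segment can sit very slightly off from mean(score). The base R check below is only a sketch using the Group 1 scores from the example data; the variable names area and grid.mean are made up for the illustration.

```r
# Group 1 scores from the example data in this thread
score <- c(10.8, 11.7, 15.2, 5.8, 11.7, 8.4, 7.7, 12.7, 11.5, 8.9,
           9.6, 7.2, 10, 9.8, 7.3, 7.9, 7.4, 10.3, 13, 11.5)

dens <- density(score)
dx   <- mean(diff(dens$x))          # spacing of the evaluation grid
area <- sum(dens$y) * dx            # Riemann-sum area under the curve

# Mean of the estimated density, rescaled so the area counts as 1
grid.mean <- sum(dens$y * dens$x) * dx / area

# Difference from the plain sample mean; expected to be small
grid.mean - mean(score)
```

If the difference matters for your plot, you can use mean(score) for x and look up the density height at that value instead.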

If you like it, feel free to post it.

Dennis