density strips in ggplot


Tom Wood

May 8, 2011, 1:28:46 AM
to ggp...@googlegroups.com
hi all

is it possible to replicate these density strips (attached, and also on page 4 of this pdf) in ggplot? 

i thought it would be something like the following

df2 <- data.frame(a   = rep(c("a","b","c"), each = 200),
                  b   =   c(rnorm(200, 1.25, .35), 
                            rnorm(200, -.5, .15),
                            rnorm(200,  2, 1)))

p3 <-ggplot(df2, aes(x = a, y = b)) +                   
           stat_density(aes(geom =  "area",
                            fill = ..density..)) 
but that just results in an error.

hope someone can help- this seems to be a really useful way to depict a univariate distribution 

Tom
density strip example.PNG

Art

May 8, 2011, 4:18:10 PM
to ggp...@googlegroups.com

Hello Tom,


There might be a way to incorporate the density strip CRAN package in ggplot2.

denstrip-package {denstrip}

Best,
 Art

Tom W

May 10, 2011, 10:47:18 AM
to ggplot2
Hi Art

Thanks for that suggestion.

The denstrip package has an option of making its output a lattice
object. To my very rudimentary understanding, this should offer us an
opportunity to simply drop denstrip output into ggplot, since lattice and ggplot
are both built on grid.

Is there a particularly elegant way to layer ggplot and lattice grobs
in the one plot?

Tom

Hadley Wickham

May 11, 2011, 1:12:48 PM
to Tom Wood, ggp...@googlegroups.com
Hi Tom,

You want geom_tile - there are some examples at
http://had.co.nz/ggplot2/geom_tile.html

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

Tom W

May 12, 2011, 11:25:55 AM
to ggplot2
Hi Hadley

Thanks for this fantastic advice. From my earlier example, the
following

ggplot(df2, aes(x = b, y = a)) +
  stat_density(aes(fill = ..density..), geom = "tile") +
  scale_fill_gradient(low = "gray75", high = "gray0",
                      limits = c(.05, 3))

gives something very close to what I'm looking for.

I was wondering: is there any way to use the scale_discrete syntax to
regulate the height of the tiles, and insert whitespace between tiles?

Thanks again

Tom



On May 11, 12:12 pm, Hadley Wickham <had...@rice.edu> wrote:
> Hi Tom,
>
> You want geom_tile - there are some examples at http://had.co.nz/ggplot2/geom_tile.html
>
> Hadley

Dennis Murphy

May 12, 2011, 2:32:13 PM
to Tom W, ggplot2
Hi:

I played around with this a bit and came up with the following for the
given example:

df2 <- data.frame(a = rep(c("a","b","c"), each = 200),
                  b = c(rnorm(200, 1.25, .35),
                        rnorm(200, -.5, .15),
                        rnorm(200, 2, 1)))

ggplot(df2, aes(x = b, y = a)) +
  stat_density(aes(fill = ..density..), geom = "tile") +
  scale_fill_gradient(low = "gray75", high = "gray0",
                      limits = c(.05, 3)) +
  facet_grid(a ~ ., scales = 'free_y', space = 'free') +
  opts(axis.text.y = theme_blank())
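
For anyone on a current ggplot2 release, the same idea can be sketched roughly as follows. This assumes ggplot2 3.4 or later, where after_stat(density) stands in for ..density.. and theme()/element_blank() stand in for opts()/theme_blank(). The set.seed() call, position = "identity" and dropping the fill limits are my own assumptions, not part of the code above, so treat it as a starting point rather than a drop-in replacement.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
df2 <- data.frame(a = rep(c("a", "b", "c"), each = 200),
                  b = c(rnorm(200, 1.25, .35),
                        rnorm(200, -.5, .15),
                        rnorm(200, 2, 1)))

p <- ggplot(df2, aes(x = b, y = a)) +
  # one kernel density estimate per level of a, drawn as shaded tiles;
  # position = "identity" (an assumption) keeps the tiles from being stacked
  stat_density(aes(fill = after_stat(density)),
               geom = "tile", position = "identity") +
  scale_fill_gradient(low = "gray75", high = "gray0") +
  # one facet row per group, which also leaves whitespace between strips
  facet_grid(a ~ ., scales = "free_y", space = "free") +
  theme(axis.text.y = element_blank())

print(p)
```

Swapping after_stat(density) for after_stat(scaled) should make each strip reach the darkest shade at its own mode, if that is the look you are after.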

I like the concept of this graphic; I could see potential for it to
replace rug plots and even boxplots, particularly when the sample
sizes are large and the variable is continuous (at least in
principle). Someone should give serious consideration to creating a
geom for it :)

Cheers,
Dennis

Tom W

May 12, 2011, 3:18:50 PM
to ggplot2
Hey Dennis

That's really a very elegant solution- I'm certain I would have wasted
a ton of time before coming to facet_grid as an answer. Thanks a lot!

Like you, I think this is a really nice graphical concept. I think it's
a much better use of high resolution graphics and printing to more
accurately depict a distribution, compared to rug and box plots.

Tom

Hadley Wickham

May 12, 2011, 3:26:48 PM
to Tom W, ggplot2
> Like you, I think this is a really nice graphical concept. I think it's
> a much better use of high resolution graphics and printing to more
> accurately depict a distribution, compared to rug and box plots.

I agree that it's better than rug/box plots, but I'm not sure it's
better than frequency polygons or histograms.

Hadley

Aaron Mackey

Feb 21, 2014, 10:03:52 AM
to ggp...@googlegroups.com, Tom W
Resurrecting this older post to see if I can encourage a "densitystrip=T" option to geom_rug.  My use case is a 2D binhex plot, where I'd also like rug-like 1D distributions on the axes.  geom_rug() with alpha of 0.01 almost does what I want, but with 50K+ data points, the resulting PDFs are enormous (yes, I can of course make a PNG or JPEG but then I lose vector graphics), and I don't really need to see each individual point/tick in the rug, I just want the 1D density.

Peeking at the code for geom_rug, I see that I might be able to overlay a geom_tile() as a single ribbon, but I would need the coordinates of the 3%/97% of the axes, e.g. unit(0.03, "npc"), but I can't send those as valid y values to geom_tile ... ?

My ability to grok ggplot innards is minimal, but it seems like the segmentGrob's that geom_rug draws could be optionally replaced by polygonGrob's with the density/count info across a smaller number of bins than points.

Thoughts?

Thanks,
-Aaron

Chris Jackson

Mar 24, 2015, 10:43:31 AM
to ggp...@googlegroups.com, thoma...@gmail.com

I'm the author of the denstrip package and associated paper, and came across this old thread while (finally!) learning how to use the excellent ggplot2.   I think ggplot2 can do everything I suggested - I should redo the denstrip documentation to include ggplot2 recipes!

Tom W and others - something like the original attachment can be done as follows: note the use of a "ypos" variable to set the y position of each strip, and the "height" aesthetic for the strip thickness. 

x <- seq(-1, 5, by=0.01) # points to evaluate the density at
df2 <- data.frame( x = rep(x, 3),
    dens = c(dnorm(x, 1.25, .35),  dnorm(x, -.5, .15),  dnorm(x,  2, 1)),
    ypos = rep(c(1, 4, 5), each=length(x)))

ggplot(df2, aes(x = x, y = ypos)) + theme_bw() +
  geom_tile(aes(fill =  dens), height=0.2) +
  scale_fill_continuous(low="white", high="black")

or if you want the maximum density to be black in every strip (see the paper for discussion of this):

df2$maxdens <- rep(with(df2, tapply(dens, factor(ypos, levels=unique(ypos)), max)), each=length(x))
df2$dens.unscaled <- df2$dens / df2$maxdens

ggplot(df2, aes(x = x, y = ypos)) + theme_bw() +
  geom_tile(aes(fill =  dens.unscaled), height=0.2) +
  scale_fill_continuous(low="white", high="black")

If you need to estimate the density from data, then use stat_density(.., geom="tile"), as already suggested. 
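
If you would rather keep explicit control over "ypos" and "height" while still estimating the density from data, one option is to run base R's density() per group yourself and hand the result to geom_tile. The following is only a sketch under assumptions: strip_density() is a hypothetical helper, the data frame "raw", the seed and the column name "dens_scaled" are made up for illustration, and scale_fill_gradient() is used in place of scale_fill_continuous().

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
raw <- data.frame(a = rep(c("a", "b", "c"), each = 200),
                  b = c(rnorm(200, 1.25, .35),
                        rnorm(200, -.5, .15),
                        rnorm(200, 2, 1)))

# Hypothetical helper: kernel density of one group, evaluated on a shared grid
strip_density <- function(values, from, to, n = 512) {
  d <- density(values, from = from, to = to, n = n)
  data.frame(x = d$x, dens = d$y)
}

rng  <- range(raw$b)
ypos <- c(a = 1, b = 4, c = 5)  # y position of each strip, as in the recipe above

strips <- do.call(rbind, lapply(names(ypos), function(g) {
  out <- strip_density(raw$b[raw$a == g], rng[1], rng[2])
  out$ypos <- ypos[[g]]
  # divide by the group maximum so every strip is black at its own mode
  out$dens_scaled <- out$dens / max(out$dens)
  out
}))

p <- ggplot(strips, aes(x = x, y = ypos)) + theme_bw() +
  geom_tile(aes(fill = dens_scaled), height = 0.2) +
  scale_fill_gradient(low = "white", high = "black")

print(p)
```

Mapping fill to dens instead of dens_scaled gives the version where all strips share one density scale.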

Aaron - a 2D binhex plot with density strips of the marginal distribution can be done using something like

qplot(x, y, data = diamonds, geom="hex", fill=..density..*10, xlim = c(3, 10),  ylim = c(3, 10), binwidth = c(0.1, 0.1)) + theme_bw() +
        stat_density(aes(y=3.5, fill=..density.., height=0.5), geom="tile", position="identity") +
        stat_ydensity(aes(x=3.5, fill=..density.., width=0.4, height=1), geom="tile", position="identity") +
        scale_fill_continuous(low="white", high="black")

The difficulty here is scaling the densities of the strips and the hexagons so that both go from (roughly) white to black.

Chris
