Recreating a plot from Hadley's recent ggplot2 webinar

Sam Albers

Feb 10, 2012, 2:08:17 PM
to ggplot2
Hello all,

I, as I am sure many of you were as well, was privy to an excellent
talk given by Hadley last Wednesday covering several topics to do with
ggplot2. It was an excellent, informative session, and a big thanks
goes to Hadley for taking the time to do this. One plot that Hadley
displayed in his talk caught my eye, but I haven't been able to find a
reference to it or even a starting point for recreating it. I was
wondering if someone might be able to point me in the right direction.
The plot is one where many small distributions are plotted on top of a
larger plot, replacing a larger and more confusing scatter plot. I've
attached a screenshot from Hadley's presentation which illustrates
exactly the type of plot I am looking for.

Thanks in advance and again thanks Hadley for the great talk!

Sam

Many-dist-plot.png

Brian Diggs

Feb 13, 2012, 1:33:21 PM
to ggplot2


I missed the talk (hope to watch the recording sometime), but that plot
looked familiar. I finally hunted down where I had seen it before:

http://blog.revolutionanalytics.com/2011/10/ggplot2-for-big-data.html

That blog post is about 4 months old and ends with the sentence "I'm
currently with working another student, Yue Hu, to turn our research
into a robust R package."

Poking around Hadley's github repositories, it looks like bigvis is the
related repository. That might give you something to get started on how
to create such a plot.

--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University

Yuri Zharikov

Feb 13, 2012, 10:47:32 PM
to Sam Albers, ggplot2
Is a recording of the talk, or the ppt, available online? Thank you, Yuri



--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: http://gist.github.com/270442

To post: email ggp...@googlegroups.com
To unsubscribe: email ggplot2+u...@googlegroups.com
More options: http://groups.google.com/group/ggplot2

Chris Neff

Feb 14, 2012, 9:42:26 AM
to Brian Diggs, ggplot2
Wow, that looks really cool. I'm looking forward to this being
released and will happily be one of the first testers.

Sam Albers

Feb 15, 2012, 7:25:42 PM
to Chris Neff, Brian Diggs, ggplot2
Thanks for the response Brian. Since you mentioned it, I also remember
seeing the post that you referenced. I might be out of my depth, but I
tried to install the development version of "bigvis" and something
seemed to go wrong. Maybe I did something wrong here, and I apologize
if this seems basic. I have also included the session info in case that
helps. Can anyone see any obvious mistakes? Perhaps 'bigvis' isn't
meant to be installed like this. It is just that I have the perfect
application for that type of plot, so its existence is
tantalizing to say the least.

Thanks in advance.

Sam

> library(devtools)
> install_github('bigvis')
Installing bigvis from hadley
Error in unzip(src, list = TRUE) :
zip file '/tmp/Rtmpk4tJje/hadley-bigvis.zip' cannot be opened
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
[5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] devtools_0.5.1

loaded via a namespace (and not attached):
[1] RCurl_1.9-5 tools_2.14.1

Prabhas Pokharel

Feb 15, 2012, 8:37:45 PM
to ggp...@googlegroups.com, Chris Neff, Brian Diggs
I had an error with similar symptoms on Mac OS X Lion. Check out https://groups.google.com/d/topic/ggplot2-dev/5qlN4n464ME/discussion to see if it helps (the last 3 emails in particular).

Brian Diggs

Feb 16, 2012, 10:29:05 AM
to ggplot2
On 2/13/2012 7:47 PM, Yuri Zharikov wrote:
> Is the talk in recording or ppt available online? Thank you, Yuri

http://blog.revolutionanalytics.com/2012/02/slides-and-replay-for-a-backstage-tour-of-ggplot2.html


Brian Diggs

Feb 16, 2012, 10:41:06 AM
to ggplot2
On 2/15/2012 4:25 PM, Sam Albers wrote:
> Thanks for the response Brian. Since you mentioned it, I also remember
> seeing the post that you referenced. I might be out of my depth, but I
> tried to install the development version of "bigvis" and something
> seemed to go wrong. Maybe I did something wrong here, and I apologize
> if this seems basic. I have also included the session info in case that
> helps. Can anyone see any obvious mistakes? Perhaps 'bigvis' isn't
> meant to be installed like this. It is just that I have the perfect
> application for that type of plot, so its existence is
> tantalizing to say the least.
>
> Thanks in advance.

I hadn't tried installing it, but when I did, I also got an error,
though one different than yours (a complaint about Rcpp). Two things
come to mind:

1) Hadley has not "released" this package, or advertised it, or ever
claimed that it was in general working shape. It may be that it cannot
be installed as is, and that it is not some problem on your end. This is a
risk with public repositories; they aren't all finished, polished products.

2) The description includes the phrase "particularly in conjunction with
RevoScaleR." Maybe it depends on RevoScaleR in some non-obvious way, or
will only run on RevoScaleR and not stock R.


Hadley Wickham

Feb 27, 2012, 12:22:54 PM
to Brian Diggs, ggplot2
> I hadn't tried installing it, but when I did, I also got an error, though
> one different than yours (a complaint about Rcpp). Two things come to mind:
>
> 1) Hadley has not "released" this package, or advertised it, or ever claimed
> that it was in general working shape. It may be that it cannot be installed
> as is, and that it is not some problem on your end. This is a risk with public
> repositories; they aren't all finished, polished products.

It does work - but currently you need to load it with load_all from devtools.

> 2) The description includes the phrase "particularly in conjunction with
> RevoScaleR." Maybe it depends on RevoScaleR in some non-obvious way, or will
> only run on RevoScaleR and not stock R.

It should run with both - if you're working with Revo's xdf data
format, it will use Revo's tools, but it also works with base
data.frames.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

Andrew

Apr 30, 2012, 4:58:32 PM
to ggp...@googlegroups.com
Any news on recreating this plot? I tried to recreate it using stat_summary2d() with geom set to "histogram", but it doesn't seem to like nesting geoms like that. 

Andrew

Sam Albers

May 31, 2012, 1:33:53 PM
to Andrew, ggp...@googlegroups.com
Hello all,

I took another stab at this but just can't seem to figure it out. I
keep coming back to it only because it is such an attractive way to
present data, and I suspect I am not the only one who thinks so.

I am using the most current version of both R and ggplot2 for this. So
I installed bigvis using the following:

library(Rcpp)
library(ggplot2)
library(devtools)
install_github('bigvis')
library(bigvis)

## Now I am able to access the help file. For example:
?density_2d
## But not the function
density_2d

## I can, however, access density_1d but the example in the help
## doesn't seem to work, i.e.:

> bin <- bin_nd(mtcars, "mpg", 0.01)
> dens <- density_1d(bin, 0.5)
Error in .Primitive(".Call")(<pointer: (nil)>, sampleS, kernelS) :
NULL value passed as symbol address
> plot(dens)
Error in plot(dens) : object 'dens' not found

This might just mean that it isn't ready for general consumption,
but I wanted to give it a further interest bump.

Thanks in advance!

Sam

Charlotte Wickham

May 31, 2012, 7:23:51 PM
to Sam Albers, Andrew, ggp...@googlegroups.com
Hi Sam,

Instead of using install_github, download the package source directly and use load_all to load it. That is, I grabbed the zip file from https://github.com/hadley/bigvis/zipball/master and unzipped it into ~/R-dev/bigvis, then:

library(devtools)
setwd("~/R-dev/")
load_all("bigvis")

This gets pretty close, without the reference boxes and relative scaling:

library(ggplot2)
library(plyr)
diamonds_sub <- mutate(subset(diamonds, carat < 1.5 & price < 10000),
 price_grid = cut_number(price, 20),
 carat_grid = cut_number(carat, 20))

dia_binned <- bin_nd(diamonds_sub, c("price", "carat", "color"),
 binwidth = c(1000, 0.1, 1))

nbars <- length(dia_binned$centers$color)
dia_binned$centers$color <- 0:(nbars - 1)

dia_glyph <- glyphs(dia_binned, "carat", "color", "price", height = rel(0.9),
 width = nbars/(nbars + 1) * 0.9 * 0.1)

bar_width <- attr(dia_glyph, "width") / (nbars)
glyph_height <- attr(dia_glyph, "height")
dia_glyph <- ddply(dia_glyph, "gid", mutate, total = sum(value))

ggplot(subset(dia_glyph, total > 0)) +
 geom_rect(aes(xmin = gx - bar_width/2, xmax = gx + bar_width/2,
  ymin = price - glyph_height/2, ymax = gy,
  fill = factor(color), color = factor(color))) +
 xlab("carat") + ylab("price")

Inline image 1
image.png
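
For anyone who cannot get bigvis to load, the same idea can be roughly approximated with ggplot2 and plyr alone. This is only a sketch and not how bigvis does it. The cell sizes (0.1 carat by 1000 price) are borrowed from the binwidths used above, and the names cell_w, cell_h, ix, iy, k, rel and bar_w are made up for this illustration. Each cell gets one small bar per color, scaled to the largest count in that cell.

```r
library(ggplot2)
library(plyr)

# Assumed cell sizes, borrowed from the bin_nd() binwidths in the post above
cell_w <- 0.1    # carat
cell_h <- 1000   # price

d <- subset(diamonds, carat < 1.5 & price < 10000)

# Integer cell index for each diamond; the small epsilon guards against
# floating-point error when carat sits exactly on a cell boundary
d$ix <- floor(d$carat / cell_w + 1e-9)
d$iy <- floor(d$price / cell_h)

# Number of diamonds of each color in each cell (plyr::count adds a freq column)
counts <- count(d, c("ix", "iy", "color"))

# Scale bar heights relative to the largest count within each cell
counts <- ddply(counts, c("ix", "iy"), mutate, rel = freq / max(freq))

# One narrow bar per color, laid out left to right inside 90% of the cell
n_colors <- nlevels(diamonds$color)
bar_w <- 0.9 * cell_w / n_colors
counts$k <- as.integer(counts$color) - 1

p <- ggplot(counts) +
  geom_rect(aes(xmin = ix * cell_w + k * bar_w,
                xmax = ix * cell_w + (k + 1) * bar_w,
                ymin = iy * cell_h,
                ymax = iy * cell_h + 0.9 * cell_h * rel,
                fill = color)) +
  xlab("carat") + ylab("price")
print(p)
```

Like the bigvis version, this leaves out the reference boxes. Because the scaling is per cell, bar heights are comparable within a cell but not between cells.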