Nested facets


Michael Kubovy

Apr 16, 2012, 9:28:42 AM
to ggp...@googlegroups.com
I am trying to imitate the following:

library( lme4 )     # provides the Pastes data
library( lattice )  # provides dotplot()
data( Pastes )
Pastes$bb <- with( Pastes, reorder( batch, strength ) )
Pastes$ss <- with( Pastes,
  reorder(
    reorder( sample, strength ), as.numeric( bb )
  )
)
dotplot( ss ~ strength | bb, Pastes,
  strip = FALSE, strip.left = TRUE, layout = c( 1, 10 ),
  scales = list( y = list( relation = 'free' ) ),
  xlab = 'Paste strength', ylab = 'Sample within batch',
  type = c( 'p', 'a' ), jitter.y = TRUE )

# As close as I can get with ggplot2:
library( ggplot2 )
Pastes$batch <- with( Pastes, reorder( batch, strength ) )
Pastes$cask <- with( Pastes, reorder( sample, strength ) )
Pastes$sample <- with( Pastes,
  reorder(
    reorder( sample, strength ), as.numeric( batch )
  )
)
ggplot( Pastes, aes( y = strength, x = cask ) ) +
   geom_point( shape = 1,
      position = position_jitter( width = 0.05  ) ) +
   geom_smooth( aes( group = 1 ), se = FALSE )+
   facet_wrap( ~ batch, ncol = 1, drop = TRUE) + coord_flip() +
   xlab( 'Sample within batch' )

______________________________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
for mail add:                     for FedEx or UPS add:
P.O.Box 400400                    Gilmer Hall, Room 102
Charlottesville, VA 22904-4400    485 McCormick Road
USA                               Charlottesville, VA 22903
           room    phone
Office:    B011    +1-434-982-4729
Lab:       B019    +1-434-982-4751
WWW:    http://www.people.virginia.edu/~mk9y/

Jean-Olivier Irisson

Apr 16, 2012, 11:14:46 AM
to Michael Kubovy, ggplot2
First, having the casks (small a, b, c) not in the same order in the legend of all facets might be a problem; but I'm not sure I understand the data and what you are trying to show with this plot. A bit of background would be welcome.
It seems to me that the cask letter is just a label, rather than a coordinate, and that you just want the casks ordered by increasing strength in each batch. In which case you may want to mask the y coordinate and label points with a colour or a shape (or forget about cask completely).

data(Pastes)
# reorder batches
Pastes$batch <- reorder(Pastes$batch, -Pastes$strength)
# compute the order of the casks within each batch
library(plyr)
Pastes <- ddply(Pastes, ~batch, function(x) {
  x$caskScore <- as.numeric(reorder(x$cask, x$strength))
  return(x)
})

# label the casks with a colour/shape
ggplot( Pastes, aes( y = factor(caskScore), x = strength, shape = cask ) ) +
  geom_point() +
  facet_grid(batch ~ .)
# NB: ideally you would hide only the y axis but that's not possible with the current theming system
# NB: using colour here prevents you from using geom_smooth because the smooth would be done by colour, but geom_smooth is probably not what you want here anyway


Another possibility is to create a fake y coordinate and relabel it afterwards:

data(Pastes)
# reorder batches
Pastes$batch <- reorder(Pastes$batch, -Pastes$strength)
# compute the order of the casks within each batch
library(plyr)
Pastes <- ddply(Pastes, ~batch, function(x) {
  x$caskScore <- as.numeric(reorder(x$cask, x$strength))
  return(x)
})
# compute an overall y scale, ranking by batch (tens) and then by cask within each batch (units)
Pastes$batchScore <- as.numeric(Pastes$batch)
Pastes$score <- Pastes$batchScore*10 + Pastes$caskScore

ggplot( Pastes, aes( y = score, x = strength) ) + geom_point() +
# the facets do not share a y scale, so let it vary per facet
facet_grid(batch~., scales="free_y") +
# relabel the y axis breaks
scale_y_continuous("Cask within batch", breaks=Pastes$score, labels=Pastes$sample, expand=c(0.05, 0.2))

# you also jitter points along the y axis but I don't see why actually
# you could do it this way
ggplot( Pastes, aes( y = score, x = strength) ) + geom_point(position = position_jitter( height = 0.05 )) +
# the facets do not share a y scale, so let it vary per facet
facet_grid(batch~., scales="free_y") +
# relabel the y axis breaks
scale_y_continuous("Cask within batch", breaks=Pastes$score, labels=Pastes$sample, expand=c(0.05, 0.2))
# except that removes some y breaks (<- no idea why...)

# now you want to add the lines, which are just connecting the mean for each sample if I understand correctly
# compute the mean per sample (but keep all other info - batch, cask etc. - for the plot)
PastesAvg <- ddply(Pastes, ~batch+cask+sample+score, function(x) mean(x$strength))
# plot everything
ggplot( Pastes, aes( y = score, x = strength) ) + geom_point() +
geom_line(aes(x=V1, y=score), data=PastesAvg) +
facet_grid(batch~., scales="free_y") +
scale_y_continuous("Cask within batch", breaks=Pastes$score, labels=Pastes$sample, expand=c(0.05, 0.2))

I think this is exactly the plot you want, apart from minor aesthetic aspects (in a future version of ggplot, you'll be able to put the y axis labels on the right, next to the facet labels).
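
For readers who want to check the "fake y coordinate" arithmetic without loading lme4, here is a minimal base R sketch on a toy data frame. The data values and the name `toy` are made up for illustration; the column names `caskScore`, `batchScore` and `score` follow the code above. It assumes fewer than ten casks per batch, otherwise the tens and units would collide, and it keeps the default batch level order rather than reordering batches by strength.

```r
# Toy stand-in for Pastes (hypothetical values, not the lme4 data set).
toy <- data.frame(
  batch    = factor(c("A", "A", "A", "A", "B", "B", "B", "B")),
  cask     = factor(c("a", "a", "b", "b", "a", "a", "b", "b")),
  strength = c(60, 62, 55, 57, 50, 54, 64, 66)
)

# Rank of each cask within its batch by mean strength (1 = weakest).
toy$caskScore <- NA_integer_
for (b in levels(toy$batch)) {
  rows <- toy$batch == b
  ordered_cask <- reorder(droplevels(toy$cask[rows]), toy$strength[rows])
  toy$caskScore[rows] <- as.integer(ordered_cask)
}

# Batch rank goes in the tens, cask rank in the units.
toy$batchScore <- as.integer(toy$batch)
toy$score <- toy$batchScore * 10 + toy$caskScore

print(unique(toy[, c("batch", "cask", "score")]))
```

Each batch ends up in its own block of ten on the y axis, which is why `scales = "free_y"` is needed to avoid large empty gaps inside every facet.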

Jean-Olivier Irisson
---
Observatoire Océanologique
Station Zoologique, B.P. 28, Chemin du Lazaret
06230 Villefranche-sur-Mer
Tel: +33 04 93 76 38 04
Mob: +33 06 21 05 19 90
http://jo.irisson.com/

Dennis Murphy

Apr 16, 2012, 3:24:56 PM
to Michael Kubovy, ggp...@googlegroups.com
Hi:

A little bit tricky, but I think this is what you were after:

library('plyr')
Pastes <- ddply(Pastes, 'ss', mutate, avgstr = mean(strength),
                bb2 = factor(bb, levels = rev(levels(bb))))

ggplot(Pastes, aes(y = reorder(sample, strength, mean))) +
  theme_bw() +
  geom_point(aes(x = strength), color = 'dodgerblue') +
  geom_line(aes(x = avgstr, group = 1), color = 'dodgerblue') +
  facet_grid(bb2 ~ ., scales = 'free_y') +
  labs(x = 'Paste strength', y = 'Sample within batch') +
  # in ggplot2 before 0.9.2 this was opts(strip.background = theme_rect(...))
  theme(strip.background = element_rect(fill = 'peachpuff'))

It's less of a headache if you compute the sample within batch means
outside of ggplot2; I used the mutate() function from plyr to generate
the means as well as reversing the levels of bb, since as.table didn't
work for me in facet_grid() - I got the same graph with as.table =
TRUE and as.table = FALSE, where the strengths were low at the top and
high at the bottom.

Notice that the x inputs in geom_point() and geom_line() are
different: the strength points are plotted, but the mean strengths are
connected with lines. In this plot, strength is the x and ordered
factor level is the y, so the group = 1 aesthetic is necessary to
connect the points across factor levels. This is how you get the
equivalent of panel.average() in ggplot2. To get separate y labels per
facet, use scales = 'free_y'.
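
If plyr is not at hand, the per-sample means described above can also be computed in base R. This is only a sketch on a toy data frame (the values and the names `toy` and `toy_avg` are invented for illustration; `avgstr` matches the column name used in the mutate() call above). It relies on `ave()` from the stats package, which returns the group mean repeated on every row of the group.

```r
# Toy stand-in for Pastes (hypothetical values, not the lme4 data set).
toy <- data.frame(
  batch    = factor(c("A", "A", "A", "A", "B", "B", "B", "B")),
  sample   = factor(c("A:a", "A:a", "A:b", "A:b", "B:a", "B:a", "B:b", "B:b")),
  strength = c(60, 62, 55, 57, 50, 54, 64, 66)
)

# Mean strength per sample, repeated on every row of that sample.
toy$avgstr <- ave(toy$strength, toy$sample, FUN = mean)

# One row per sample, e.g. as the data for a separate geom_line() layer.
toy_avg <- unique(toy[, c("batch", "sample", "avgstr")])

print(toy_avg)
```

Keeping `avgstr` in the same data frame as the raw points means both layers can share one `ggplot()` call, with only the x aesthetic differing between geom_point() and geom_line().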

The only obvious difference is that the strip labels are on the right
rather than the left; if there's a way to do it in ggplot2, someone
else will have to chime in because I don't know how to do it at
present, short of hacking into the facet_grid() code.

HTH,
Dennis


Hadley Wickham

Apr 16, 2012, 6:26:38 PM
to Dennis Murphy, Michael Kubovy, ggp...@googlegroups.com
> It's less of a headache if you compute the sample within batch means
> outside of ggplot2; I used the mutate() function from plyr to generate
> the means as well as reversing the levels of bb, since as.table didn't
> work for me in facet_grid() - I got the same graph with as.table =
> TRUE and as.table = FALSE, where the strengths were low at the top and
> high at the bottom.

That's https://github.com/hadley/ggplot2/issues/497 and should be
fixed in the next version.

Hadley


--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
