drop=FALSE not working in facet_wrap()? bug in plyr::split_labels?

Ben Bolker

Feb 18, 2011, 9:33:18 AM
to ggplot2
I was trying to respond to

<http://stackoverflow.com/questions/5036938/include-unused-factor-levels-in-facets-with-ggplot2>

where the questioner wants to know if there is a way to reserve space
for unused levels in a faceted plot. From the graph he shows, he is
stretching the ggplot idiom a little bit -- what he wants might work
better in facet_grid, where space is left for empty *combinations* of
levels, but he seems to be arranging plots of different variables with
different color schemes in three rows (with grid.arrange() or something).

According to the documentation for facet_wrap(), the drop=FALSE
argument should do this, but it didn't seem to work.

# By default, any empty factor levels will be dropped
mpg$cyl2 <- factor(mpg$cyl, levels = c(2, 4, 5, 6, 8, 10))
qplot(displ, hwy, data = mpg) + facet_wrap(~ cyl2)
# Use drop = FALSE to force their inclusion
qplot(displ, hwy, data = mpg) + facet_wrap(~ cyl2, drop = FALSE)
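
Before blaming the faceting code, it may help to confirm that the unused levels really are present on the factor. The following is only a sketch: it assumes mtcars (from base R's datasets package) as a stand-in for mpg so that it runs without ggplot2, and the names cyl2, counts and empty_levels are chosen for illustration.

```r
# Assumption: mtcars stands in for mpg; its cyl column also holds 4, 6 and 8
cyl2 <- factor(mtcars$cyl, levels = c(2, 4, 5, 6, 8, 10))

# table() reports every declared level, including those with zero rows
counts <- table(cyl2)
empty_levels <- names(counts)[as.vector(counts) == 0]

print(counts)
print(empty_levels)   # the levels that drop = FALSE is expected to keep
```

If empty_levels is non-empty and the panels are still missing, the levels are being lost somewhere downstream of the data.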


I have dug in a little way to see if there is some obvious small
bug, but I can't see how this was supposed to work.

From facet-wrap.r:

# Data shape
initialise <- function(., data) {
  # Compute facetting variables for all layers
  vars <- ldply(data, function(df) {
    as.data.frame(eval.quoted(.$facets, df))
  })

  .$facet_levels <- split_labels(vars, .$drop)
  .$facet_levels$PANEL <- factor(1:nrow(.$facet_levels))
}


I checked to see that '.$drop' was in fact FALSE when we got inside
this function ...

OK, so the facet levels are obtained via split_labels(), which is in plyr:

function (splits, drop, id = plyr::id(splits, drop = TRUE))
{
    if (length(splits) == 0)
        return(data.frame())
    if (drop) {
        representative <- which(!duplicated(id))[order(unique(id))]
        quickdf(lapply(splits, function(x) x[representative]))
    }
    else {
        unique_values <- llply(splits, function(x) sort(unique(x)))
        names(unique_values) <- names(splits)
        rev(expand.grid(rev(unique_values), stringsAsFactors = FALSE))
    }
}

I can't see how this was supposed to work. plyr is using expand.grid()
from base R, but I see no evidence that expand.grid() uses (or is
supposed to use?) the *levels* of the factor it is given rather than the
*values* it is given. Was this previously true?
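
The values-versus-levels distinction can be seen with base R alone. This is a minimal sketch rather than plyr's code; the names f, by_values and by_levels are made up for the illustration.

```r
f <- factor("A", levels = c("A", "B"))

# What the drop = FALSE branch effectively does: expand over observed values.
# unique() keeps the declared levels as an attribute, but returns one element.
by_values <- expand.grid(list(x = sort(unique(f))), stringsAsFactors = FALSE)

# What one might expect instead: one row per declared level
by_levels <- expand.grid(list(x = levels(f)), stringsAsFactors = FALSE)

print(nrow(by_values))
print(nrow(by_levels))
```

The two row counts differ, which suggests that expand.grid() only ever sees the values it is handed and never consults levels() itself.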

I would have thought these were supposed to give different answers:

> split_labels(factor("A",levels=c("A","B")),drop=FALSE)
Var1
1 A
> split_labels(factor("A",levels=c("A","B")),drop=TRUE)
X1
1 A

(There is also a small typo in the documentation for plyr: the
description of the arguments is mangled ...)

Does anyone have any thoughts?

R 2.12.1, ggplot2_0.8.9, plyr_1.4

Hadley Wickham

Feb 23, 2011, 8:34:31 AM
to Ben Bolker, ggplot2
Hi Ben,

In the NEWS for plyr 1.5 (unreleased):

* `split_labels` correctly preserves empty factor levels, which means that
`drop = FALSE` should work in more places. Use `base::droplevels` to remove
levels that don't occur in the data, and `drop = T` to remove combinations
of levels that don't occur.

I suspect that fixes the bug.
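
For readers on an older plyr, the behaviour described in that NEWS entry can be approximated as follows. This is a rough sketch and not the plyr 1.5 implementation: label_grid is a hypothetical helper, and only base::droplevels (available since R 2.12.0) is a real function here.

```r
# Hypothetical helper: build the label grid from declared levels, not values
label_grid <- function(splits) {
  all_values <- lapply(splits, function(x) {
    if (is.factor(x)) factor(levels(x), levels = levels(x)) else sort(unique(x))
  })
  rev(expand.grid(rev(all_values), stringsAsFactors = FALSE))
}

df <- data.frame(g = factor(c("A", "A"), levels = c("A", "B")))

print(nrow(label_grid(df)))              # one row per declared level
print(nrow(label_grid(droplevels(df))))  # only levels present in the data
```

Calling droplevels() first is then the explicit way to ask for the old behaviour.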

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
