Change levels with dplyr

Earl Brown

Feb 19, 2015, 2:25:25 PM
to manip...@googlegroups.com
I'm wondering if there is a way to use dplyr to change the levels of a factor before piping the data frame to ggplot(), in order to create more human-reader-friendly levels in a ggplot. Here's a toy example:

toy.df <- data.frame(originalVar = c("A", "B", "C", "D", "E"))

I've tried the following three ideas, but to no avail:

toy.df %>%
  do(
    levels(.$originalVar) <- c("V", "W", "X", "Y", "Z")
  )

toy.df %>%
  mutate(newVar = factor(.$originalVar, levels = c("V", "W", "X", "Y", "Z")))

toy.df %>%
  mutate(
    newVar = do(
                factor(.$originalVar, levels = c("V", "W", "X", "Y", "Z"))
              )
  )

You can simply change the levels of the factor with levels() before calling ggplot(), or use ggplot(transform(...)), as was offered on SO here:


Still, it feels like this should be possible within the dplyr pipeline itself, and I can't figure out how.

Once again, the end goal is to change the level labels in a ggplot boxplot or a faceted scatterplot to something more human-friendly, as can be done with base::boxplot(..., names = c("V", "W", "X", "Y", "Z")).

Thanks for any ideas. Earl

Hadley Wickham

Feb 19, 2015, 2:51:27 PM
to Earl Brown, manipulatr
I'd think you'd want

toy.df %>%
  mutate(newVar = factor(originalVar, levels = c("V", "W", "X", "Y", "Z")))

Hadley

--
http://had.co.nz/

Earl Brown

Feb 20, 2015, 2:38:24 PM
to manip...@googlegroups.com, ekbr...@gmail.com
I think I had tried that before, but didn't mention it. It results in "<NA>":

> toy.df <- data.frame(originalVar = c("A", "B", "C", "D", "E"))
> toy.df %>%
+   mutate(newVar = factor(originalVar, levels = c("V", "W", "X", "Y", "Z")))
  originalVar newVar
1           A   <NA>
2           B   <NA>
3           C   <NA>
4           D   <NA>
5           E   <NA>

Hadley Wickham

Feb 20, 2015, 2:43:21 PM
to Earl Brown, manipulatr
x <- factor(c("A", "B", "C", "D", "E"))
factor(x, levels = c("V", "W", "X", "Y", "Z"))
factor(x, labels = c("V", "W", "X", "Y", "Z"))

Hadley
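
For readers following along, here is a minimal base R sketch of what the lines above appear to demonstrate. The variable names by_levels and by_labels are made up for illustration. The point is that levels = declares which existing values are allowed, while labels = renames the levels that are already there.

```r
# Base R only; x mirrors the vector in the message above.
x <- factor(c("A", "B", "C", "D", "E"))

# levels = ... says which *existing* values count as valid.
# None of "A".."E" is in the new set, so every element turns into NA.
by_levels <- factor(x, levels = c("V", "W", "X", "Y", "Z"))
print(by_levels)

# labels = ... keeps the existing levels (here A, B, C, D, E in sorted order)
# and renames them position by position.
by_labels <- factor(x, labels = c("V", "W", "X", "Y", "Z"))
print(by_labels)
```

This is why the earlier mutate() call with levels = produced a column of <NA>.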

Earl Brown

Feb 20, 2015, 3:01:31 PM
to manip...@googlegroups.com, ekbr...@gmail.com
Simple fix. Thank you very much. That is what I was looking for. 

Here's my (slightly larger) toy example, for those who would like to change the labels in ggplots:

library("ggplot2")
library("dplyr")

toy.df <- data.frame(
  originalVar = rep(c("A", "B", "C", "D", "E"), 10),
  value = runif(50)
)

toy.df %>%
  mutate(newVar = factor(originalVar, labels = c("V", "W", "X", "Y", "Z"))) %>% # mind the 'labels', not the 'levels'
  ggplot(aes(newVar, value)) +
  geom_boxplot()
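
One caveat about labels = on its own: it pairs the new names with the default level order (the sorted unique values), so the mapping is implicit. A slightly more defensive variant, sketched below with a hypothetical lookup vector called label_map, spells out both levels and labels so that each original value is tied to its display label explicitly.

```r
# Hypothetical lookup: names are the original values, elements are the display labels.
label_map <- c(A = "V", B = "W", C = "X", D = "Y", E = "Z")

# Data deliberately not in alphabetical order.
original <- c("C", "A", "E", "B", "D")

# levels fixes the order of the original values; labels renames them one-to-one.
relabelled <- factor(original,
                     levels = names(label_map),
                     labels = unname(label_map))
print(relabelled)
```

The same call should drop into the pipeline above, for example mutate(newVar = factor(originalVar, levels = names(label_map), labels = unname(label_map))).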

