Why does dplyr::recode not allow changing the number of levels, when plyr::revalue does?

Jan-Michael Becker

Oct 12, 2016, 10:01:42 AM
to manipulatr
I have a quite simple task: I want to recode a factor into another factor thereby reducing the number of levels. This will be part of a larger data wrangling task.

plyr::revalue lets me do this, but recode in dplyr doesn't. At the moment I have a mix of both packages in my code, which I want to avoid because I think it could cause problems, such as functions from one package masking those of the other, when transferring the code to other systems.

Any ideas how to solve the problem?

Example code (using revalue):
df <- dataset %>% mutate(prod_type = plyr::revalue(category, c("cars" = "durables", "detergents" = "cpg", "frozen food" = "cpg", "cosmetics" = "cpg", "household appliances" = "durables", ...)))

Ista Zahn

Oct 12, 2016, 12:58:38 PM
to Jan-Michael Becker, manipulatr
In what way doesn't dplyr::recode let you do this? Can you make a
reproducible example showing the actual and desired result?
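
For reference, here is a minimal sketch of the kind of example that would help. The data frame and its values are made up for illustration (they are not the original poster's data), and it assumes a dplyr version in which recode() on a factor relabels the levels, so that levels mapped to the same new value are merged:

```r
library(dplyr)

# Hypothetical stand-in for the poster's data: a factor with five levels.
dataset <- data.frame(
  category = factor(c("cars", "detergents", "frozen food",
                      "cosmetics", "household appliances"))
)

# Map several old levels onto the same new value.
# Names that are not syntactic (contain spaces) must be quoted.
df <- dataset %>%
  mutate(prod_type = recode(category,
                            "cars" = "durables",
                            "detergents" = "cpg",
                            "frozen food" = "cpg",
                            "cosmetics" = "cpg",
                            "household appliances" = "durables"))

# Compare the number of levels before and after recoding.
nlevels(df$category)
nlevels(df$prod_type)
```

If the second count is not smaller than the first on your system, posting the output together with the dplyr version would make the problem easier to diagnose.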

--Ista

Earl Brown

Oct 13, 2016, 9:43:16 AM
to manipulatr
I'm also not sure I understand exactly what you're looking for, but the recode function in the car package may help. The following two lines of code give the same result:

# with car package
car::recode(letters, "c('a', 'b', 'c', 'd', 'e') = 'first_five_letters'; else = 'other'")


# with dplyr
dplyr::recode(letters, a = "first_five_letters", b = "first_five_letters", c = "first_five_letters", d = "first_five_letters", e = "first_five_letters", .default = "other")

A question for the dplyr wizards: is there a more concise way to write the dplyr code, something as concise as the car code above?
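
One possible answer, offered as a sketch rather than the canonical way: build the old-to-new mapping as a named vector and splice it into recode() with `!!!`. This assumes a dplyr version with tidy-evaluation support (later than the one current when this thread was written). The name `lookup` is only illustrative. For a simple two-way split, if_else() with %in% is a shorter alternative.

```r
library(dplyr)

# Named lookup vector: names are the old values, elements are the new values.
lookup <- setNames(rep("first_five_letters", 5), letters[1:5])

# Splice the lookup into recode(); everything not listed gets .default.
res_recode <- recode(letters, !!!lookup, .default = "other")

# Alternative for a two-group split: test membership directly.
res_ifelse <- if_else(letters %in% letters[1:5], "first_five_letters", "other")

# Both approaches should agree element by element.
all(res_recode == res_ifelse)
```

The lookup-vector form scales to many target groups, since the mapping can be built programmatically or read from a table, whereas if_else() only stays readable for two groups.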