Making a pick verb

Kyle Andrews

Mar 29, 2015, 2:59:35 PM
to manip...@googlegroups.com
I'd like to create a new verb for dplyr, following this StackOverflow post, and am seeking help getting started.

While my "production code" uses the pick function shown in the question inside of mutate (hard-coded to choose from only 4 columns), I'd rather be able to choose from any set of columns. Thinking about the problem some more, I came up with the pipeline at the bottom of this post. However, instead of typing that block of code every time, I'd like to turn it into a function. How would I go about it? I'm especially confused by the non-standard evaluation of columns.

What I want:

df %>% consolidate(cols = y1:y4, pick = x)

What I have:

df %>%
  mutate(row = row_number()) %>%
  gather(n, y, y1:y4) %>%
  mutate(n = as.integer(str_extract(n, "[0-9]+"))) %>%
  filter(x == n) %>%
  arrange(row) %>%
  select(-c(row, n))
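
For reference, here is a rough base R sketch of what such a verb might look like. It is not a dplyr or tidyr implementation: it borrows the capture trick that base R's subset() uses for its select argument (substitute plus eval against a list mapping column names to positions). The name consolidate and its arguments come from the wish above; the body, the output column name y, and the example data are assumptions. Two behaviours differ from the pipeline: columns are picked by position within cols rather than by the digit in their name, and rows whose selector matches no column get NA instead of being dropped.

```r
# Hypothetical sketch of consolidate(); base R only.
consolidate <- function(df, cols, pick) {
  # Map each column name to its position, so `y1:y4` evaluates to e.g. 2:5.
  positions <- as.list(seq_along(df))
  names(positions) <- names(df)
  col_idx <- eval(substitute(cols), positions, parent.frame())
  stopifnot(length(col_idx) > 0)

  # Evaluate the selector expression with the data frame's columns in scope.
  choice <- as.integer(eval(substitute(pick), df, parent.frame()))

  # Out-of-range selectors become NA rather than raising an error.
  choice[is.na(choice) | choice < 1L | choice > length(col_idx)] <- NA_integer_

  # Matrix indexing: one (row, column) pair per row of df.
  # Assumes the selected columns share a type; mixed types are coerced.
  values <- as.matrix(df[col_idx])
  out <- df[-col_idx]
  out$y <- values[cbind(seq_len(nrow(df)), choice)]
  out
}

# Made-up example data.
df <- data.frame(
  x  = c(1L, 3L, 2L, 5L),
  y1 = c(10, 20, 30, 40),
  y2 = c(11, 21, 31, 41),
  y3 = c(12, 22, 32, 42),
  y4 = c(13, 23, 33, 43)
)
res <- consolidate(df, cols = y1:y4, pick = x)
```

Because the arguments are captured with substitute, the same call should also work at the end of a pipe, as in the "What I want" line.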

Hadley Wickham

Mar 30, 2015, 10:10:23 AM
to Kyle Andrews, manipulatr
The link didn't come through. Can you just explain inline what you're
trying to do?
Hadley
--
http://had.co.nz/

Kyle Andrews

Mar 30, 2015, 5:03:39 PM
to manip...@googlegroups.com, kyle.c....@gmail.com
Sorry about that. The link is below:

http://stackoverflow.com/questions/28593265/is-there-a-function-like-switch-which-works-inside-of-dplyrmutate

Basically, I am using a hacked-up window function called pick inside of mutate to select meaningful columns of data into another column. It works great, but it limits me to four choices, and the only way I can see to support more choices is to extend the ifelse spaghetti.

pick <- function(x, v1, v2, v3, v4) {
    ifelse(x == 1, v1,
           ifelse(x == 2, v2,
                  ifelse(x == 3, v3,
                         ifelse(x == 4, v4, NA))))
}


Later, I came up with a tidyr-based workflow which accomplishes largely the same thing, but requires a lot more manipulation.

df %>%
  mutate(row = row_number()) %>%
  gather(n, y, y1:y4) %>%
  mutate(n = as.integer(str_extract(n, "[0-9]+"))) %>%
  filter(x == n) %>%
  arrange(row) %>%
  select(-c(row, n))


Let me start with my original, immediately practical question: is there a better implementation of pick that works for n columns?
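
One possible answer, as a minimal sketch rather than an established dplyr idiom: bind the candidate vectors into a matrix and use base R matrix indexing, so each row selects the column named by x. The name pick_n is hypothetical. The sketch assumes all candidate vectors have the same length as x and share a type (cbind coerces mixed types). As with the nested ifelse version, a selector that is NA or matches no column yields NA.

```r
# Hypothetical n-column generalisation of pick(); base R only.
pick_n <- function(x, ...) {
  m <- do.call(cbind, list(...))   # one matrix column per candidate vector
  stopifnot(length(x) == nrow(m))

  x <- as.integer(x)
  # Selectors outside 1..ncol(m) become NA instead of raising an error.
  x[is.na(x) | x < 1L | x > ncol(m)] <- NA_integer_

  # A two-column index matrix picks one (row, column) cell per row;
  # rows of the index matrix containing NA produce NA in the result.
  m[cbind(seq_along(x), x)]
}

# Made-up example: the selector chooses among three vectors.
picked <- pick_n(c(1, 3, 2, 5),
                 c(10, 20, 30, 40),
                 c(11, 21, 31, 41),
                 c(12, 22, 32, 42))
```

Since it takes and returns plain vectors, it should be usable inside mutate the same way as the original pick, e.g. mutate(y = pick_n(x, y1, y2, y3, y4)).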

Hadley Wickham

Mar 30, 2015, 5:17:22 PM
to Kyle Andrews, manipulatr
Have a look at the discussion at https://github.com/hadley/dplyr/issues/631

Hadley