using dplyr to create new variables on the fly

ArjunaCap

Dec 15, 2015, 6:00:39 PM
to manipulatr
Suppose we have the following data:

library(dplyr)


df <- data_frame(base = 1:10)


and that we want to generate a series of new, dynamically named variables, along the lines of the following (failing) pseudocode:

for (i in 2:5) {
  # not valid R: an argument name cannot be computed with a function call
  df <- df %>% mutate(paste0("power_", i) = base^i)
}



I've also tried working around NSE with mutate_(), which I probably should understand better. The following works, but I can't figure out how to NAME the variables on the fly.

I'm sure it's something really simple:

for (i in 2:5) {
  df <- df %>% mutate_(paste0("base^", i))
}



Any guidance is appreciated

Brandon Hurr

Dec 15, 2015, 10:26:55 PM
to ArjunaCap, manipulatr
I don't have an answer for the names, but you can express the same thing in your loop directly in the paste0() with: 

df %>%
  mutate_(paste0("base^", 2:5)) 

Something related from here:
https://stackoverflow.com/questions/30382908/r-dplyr-rename-variables-using-string-functions
df %>%
  mutate_(paste0("base^", 2:5)) %>%
  setNames(c("base", paste0("base", 2:5)))

HTH,
B

--
You received this message because you are subscribed to the Google Groups "manipulatr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to manipulatr+...@googlegroups.com.
To post to this group, send email to manip...@googlegroups.com.
Visit this group at https://groups.google.com/group/manipulatr.
For more options, visit https://groups.google.com/d/optout.

Dennis Murphy

Dec 16, 2015, 3:38:45 AM
to ArjunaCap, manipulatr
Hi:

This is a simple vectorized operation in base R using the outer()
function, so apply it as the argument to do() and then use setNames()
as Brandon suggested:

library(dplyr)

df <- data.frame(base = 1:10)

df %>%
  do(., as.data.frame(outer(.$base, 1:5, FUN = "^"))) %>%
  setNames(paste0("power_", seq(5)))


We need to wrap the output of outer() with as.data.frame() because
dplyr expects the resulting data object along each step of the
pipeline to be a data frame or tbl object.

Dennis
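
For readers who want to see what outer() is doing here without the dplyr pipeline, the following is a minimal base-R sketch. It departs from the post above in two ways that are assumptions of this sketch: it uses exponents 2:5 (matching the original question) rather than 1:5, and it keeps the original base column alongside the new ones. The names powers, power_df and result are illustrative only.

```r
base <- 1:4

# outer() applies "^" to every (base value, exponent) pair,
# giving one row per base value and one column per exponent
powers <- outer(base, 2:5, FUN = "^")

# Convert the matrix to a data frame and name each column after its exponent
power_df <- setNames(as.data.frame(powers), paste0("power_", 2:5))

# Keep the original column next to the generated ones
result <- cbind(data.frame(base = base), power_df)
result
```

Because the whole computation is a single vectorized call, no loop over the exponents is needed.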


Hadley Wickham

Dec 16, 2015, 8:37:56 AM
to ArjunaCap, manipulatr
You need to give mutate_ a named list with the .dots argument:

df <- df %>% mutate_(.dots = setNames(list(substitute(base ^ i)),
                                      paste0("power_", i)))

Hadley



--
http://had.co.nz/

Michael Cawthon

Dec 16, 2015, 12:33:02 PM
to Hadley Wickham, manipulatr
Hmm, this fails:

for (i in 2:5) {
  df <- df %>% mutate_(.dots = setNames(list(substitute(base ^ i)),
                                        paste0("power_", i)))
}

with "Error: object 'i' not found".

I've looked at each of the constituent functions, and it's unclear to me
why the loop index doesn't get passed.
Michael Cawthon
Chief Investment Officer
Green Street Energy LLC
mcaw...@greenstenergy.com
p: 479-442-1407
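
One likely contributor to the error above is documented behaviour of substitute(): when it is called at top level (which includes the body of a top-level for loop), symbols are left unchanged, so the expression still refers to i rather than to its current value. The base-R sketch below illustrates that, under the assumption that this is what happened in the loop; it passes globalenv() explicitly so the top-level behaviour is reproduced wherever the code is run. The names unfilled, filled and filled_too are illustrative.

```r
i <- 3L

# At top level, substitute() leaves `i` as a symbol.
# Passing globalenv() explicitly reproduces that behaviour anywhere.
unfilled <- substitute(base ^ i, globalenv())

# bquote() evaluates whatever is wrapped in .() and splices in the value.
filled <- bquote(base ^ .(i))

# substitute() does the same when given an explicit list of values.
filled_too <- substitute(base ^ i, list(i = i))

all.vars(unfilled)    # still mentions `i`
all.vars(filled)      # only `base` remains
all.vars(filled_too)  # only `base` remains
```

An expression that still mentions i only works if whoever evaluates it later can find i; an expression with the value already spliced in does not have that dependency.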

Hadley Wickham

Dec 16, 2015, 2:27:11 PM
to Michael Cawthon, manipulatr
I realised there's a better approach anyway:

df <- data_frame(base = 1:10)

pow <- lapply(2:5, function(i) bquote(base ^ .(i)))
names(pow) <- paste0("pow", 2:5)

df %>% mutate_(.dots = pow)

Hadley
--
http://had.co.nz/
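
For readers on later dplyr releases: the underscored verbs such as mutate_() were subsequently deprecated, and the original loop can be written with tidy evaluation instead. The sketch below assumes dplyr 0.7.0 or newer is installed, which postdates this thread; it is an alternative to the .dots approach above, not the method described in these posts.

```r
library(dplyr)

df <- tibble(base = 1:10)

for (i in 2:5) {
  # `:=` allows a computed name on the left-hand side;
  # `!!` unquotes the string so it is used as the column name
  df <- df %>% mutate(!!paste0("power_", i) := base^i)
}

names(df)
```

Here i is found through the ordinary environment of the loop, so no explicit quoting of the right-hand side is needed.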

Michael Cawthon

Dec 16, 2015, 5:40:43 PM
to Hadley Wickham, manipulatr
Thank you; one problem (my MRE was too minimal):

suppose we define:


f1 <- function(x, i) x ^ i


following the suggested lapply pattern:

df <- data_frame(base = 1:10)
pow <- lapply(2:5, function(i) bquote(f1(base , .(i))))

names(pow) <- paste0("pow", 2:5)
df %>% mutate_(.dots = pow)


This fails with: could not find function "f1"

I can't figure out what looks like a problem with environments.

Brandon Hurr

Dec 16, 2015, 6:08:32 PM
to Michael Cawthon, Hadley Wickham, manipulatr

df <- data_frame(base = 1:10)
f1 <- function(x, i) x ^ i
pow <- lapply(2:5, function(i) as.formula(paste0("~", 'f1(base, ', i, ')')))
names(pow) <- paste0("pow", 2:5)
df %>% mutate_(.dots = pow)



Seems to work... after I run the example in the link first. 
df <- data.frame(x = rnorm(10), y=rnorm(10))
test <- function(x) x^2
df %>% mutate_(as.formula(paste0("~", "test(x)")))
 
Otherwise, I get an error:
Error in UseMethod("mutate_") : 
  no applicable method for 'mutate_' applied to an object of class "function"

I have no idea why though. 

> devtools::session_info()
Session info -------------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.2.3 (2015-12-10)
 system   x86_64, darwin13.4.0        
 ui       AQUA                        
 language (EN)                        
 collate  en_US.UTF-8                 
 tz       America/Los_Angeles         
 date     2015-12-16                  

Packages -----------------------------------------------------------------------------------------------------------
 package    * version     date       source                          
 assertthat   0.1         2013-12-06 CRAN (R 3.2.0)                  
 DBI          0.3.1       2014-09-24 CRAN (R 3.2.0)                  
 devtools     1.9.1.9000  2015-12-15 Github (hadley/devtools@9aaa3af)
 digest       0.6.8       2014-12-31 CRAN (R 3.2.0)                  
 dplyr      * 0.4.3.9000  2015-11-16 Github (hadley/dplyr@4f2d7f8)   
 lazyeval     0.1.10.9000 2015-05-26 Github (hadley/lazyeval@ecb8dc0)
 magrittr     1.5         2014-11-22 CRAN (R 3.2.0)                  
 memoise      0.2.1       2014-04-22 CRAN (R 3.2.0)                  
 R6           2.1.1       2015-08-19 CRAN (R 3.2.0)                  
 Rcpp         0.12.2      2015-11-15 CRAN (R 3.2.2)   

Michael Cawthon

Dec 16, 2015, 9:58:09 PM
to Brandon Hurr, Hadley Wickham, manipulatr
Very relevant SO entry, thanks.

The as.formula construction works well for my non-minimal use case.

Still, I'm not sure I fully understand why this construction works and the other doesn't, aside from the formula preserving the correct calling environment.
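
The point about the formula preserving its environment can be illustrated in base R, independently of dplyr. In the sketch below, a helper function is deliberately hidden inside another function so that it is not visible from the calling code; make_exprs, exprs, bare and via_formula are hypothetical names used only for this illustration, and the sketch makes no claim about how mutate_() chooses an environment internally.

```r
make_exprs <- function() {
  f1 <- function(x, i) x ^ i       # helper visible only inside this function
  list(
    call    = quote(f1(base, 2)),  # bare call: carries no environment
    formula = ~ f1(base, 2)        # formula: remembers where it was created
  )
}

exprs <- make_exprs()
df <- data.frame(base = 1:3)

# Evaluating the bare call from here cannot see f1, so it errors;
# the error message is captured instead of stopping the script.
bare <- tryCatch(eval(exprs$call, df), error = function(e) conditionMessage(e))

# The right-hand side of the formula, evaluated in the formula's own
# environment, still finds f1.
rhs <- exprs$formula[[2]]
via_formula <- eval(rhs, df, environment(exprs$formula))

bare
via_formula
```

A bare call leaves the choice of environment to whoever evaluates it, while a formula carries that choice along with the expression.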