Help programming function using dplyr and the _ versions of commands


Michael Ash

May 22, 2017, 8:29:34 AM
to manipulatr
The point of the function in the MWE program below is to start with a data frame containing the variables z, x1, x2, and y, and to generate a new data frame that is ordered by z and contains z and the first differences of x1, x2, and y.

I'm confused by the referencing and dereferencing rules in dplyr, by the underscore versions of the basic dplyr functions (e.g., mutate_), and by why assignments don't seem to work normally but apparently need I().

The code below sort of works, but I can't figure out how to name the transformed version of each individual x variable (I'm planning to do this with many variables, not just x1 and x2). The program currently overwrites the previous transformed x variable each time the loop runs. Also, although I use the "~" and as.name() constructions that are recommended by some of the web pages on programming with dplyr, I don't really understand what I'm doing. I also suspect that although the ~I() construction works, it is considered inelegant compared to what I'm supposed to be doing.

Comments or suggestions welcome. I've been working as best I can from, for example, http://dplyr.tidyverse.org/articles/programming.html

Thank you.

Yours,
Michael


(my.D = data.frame(my.y=runif(10), my.x1=runif(10), my.x2=1:10, my.z=runif(10)))


my.plm  <- function(D,y,x,z) {
    require(dplyr)
    require(lazyeval)
    z  <- as.name(z)
    y  <- as.name(y)
    print(D  <- arrange_(D,z))
    print(summarize_( D,~mean(z),~mean(y) ))
    print(D.fd  <- mutate_(D,
                     z = z,
                     y = ~I(y - lag(y,n=1L))
               )
          )
    for(xvar in x) {
        print(varname  <- paste("x_", xvar, sep=""))
        xvar  <- as.name(xvar)
        print(D.fd  <- mutate_(D.fd,
                           ~I(xvar - lag(xvar,n=1L))
                           )
              )
    }
}

my.plm(D=my.D,y="my.y",x=c("my.x1","my.x2"),z="my.z")


M. Edward (Ed) Borasky

May 22, 2017, 4:18:48 PM
to Michael Ash, manipulatr
I'd go with the almost-released dplyr 0.6.0 out of GitHub and learn the new API. It's quite different from the "_" convention but it's well documented, so unless you're on a deadline maintaining some code I'd move up to the new semantics.
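
For what it's worth, here is a rough sketch of how the function from the question might look under the newer semantics. It assumes a dplyr version that supports `!!` and `:=` together with the rlang package; the function name `first_diff` and the `d_` prefix for the new columns are made up for illustration and are not from the original post. Treat it as one possible approach rather than the recommended one.

```r
library(dplyr)

# Hypothetical rewrite of my.plm: order D by z, then add one first-difference
# column per variable named in y and x. Column names are passed as strings.
first_diff <- function(D, y, x, z) {
    out <- arrange(D, !!rlang::sym(z))      # turn the string into a symbol, then unquote it
    vars <- c(y, x)
    new_names <- paste0("d_", vars)         # "d_" prefix is an arbitrary choice
    for (i in seq_along(vars)) {
        v <- rlang::sym(vars[i])
        # `:=` lets the left-hand side be a computed name, so each variable
        # gets its own column instead of overwriting the previous one.
        out <- mutate(out, !!new_names[i] := (!!v) - dplyr::lag(!!v, n = 1L))
    }
    # Keep only z and the differenced columns.
    out[, c(z, new_names), drop = FALSE]
}

my.D <- data.frame(my.y = runif(10), my.x1 = runif(10),
                   my.x2 = 1:10, my.z = runif(10))
first_diff(my.D, y = "my.y", x = c("my.x1", "my.x2"), z = "my.z")
```

The first row of each differenced column is NA because there is no earlier row to subtract. The parentheses around `!!v` are there only to make the intended precedence explicit.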

--
How many people can stand on the shoulders of a giant before the giant collapses?