Again about do(.) in dplyr

Stuart Luppescu

Oct 25, 2016, 6:55:25 PM
to manip...@googlegroups.com
I had a problem like this a couple of months ago. Dennis Murphy helped by suggesting do(.). It worked then, but when I try to apply it to a different problem, it fails.

I wrote a function to calculate reliability:
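calc.rel <- function(x, se) {
    # observed variance of the measures
    var.x <- var(x, na.rm=TRUE)
    # mean squared standard error (error variance)
    mse <- sum(se^2, na.rm=TRUE)/sum(!is.na(se))
    # reliability: share of observed variance that is not error
    rel <- (var.x - mse)/var.x
    data.frame(rel)
}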

I have a data frame that looks like this:

 head(dm.both)
   meas   se agency     type
1  4.99 1.10    CPS Combined
2 -0.11 0.62    CPS Combined
3  2.29 0.74    CPS Combined
4  3.42 0.78    CPS Combined
5 -1.19 0.59    CPS Combined
6 -0.84 0.59    CPS Combined

I would like to calculate basic statistics and reliability for every combination of agency and type:

    dm.comp <- dm.both %>%
        group_by(type, agency) %>%
            summarize(mean=mean(meas),
                      stddev=sd(meas),
                      rel=do(calc.rel(meas, se))
                      )

But I'm getting this error:
Error in summarize(., mean = mean(meas), stddev = sd(meas), rel = do(calc.rel(meas,  :
  argument "by" is missing, with no default

It works fine without the do(calc.rel()) line. 

Any ideas?
--
Stuart Luppescu -- pixbuf .at. gmail.com
I made up a new word: plagiarism.
  --jpuller

holger brandl

Oct 26, 2016, 5:12:57 AM
to manipulatr
Stuart,

Just remove do within summarize! The grouping is taken care of by summarize.

However, I think there's another mistake, which is that calc.rel should be invoked with the summarized mean and stddev:

    summarize(
        mean=mean(meas),
        stddev=sd(meas),
        rel=calc.rel(mean, stddev)
    )

Cheers,
Holger
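
For reference, here is a minimal sketch of the "no do() inside summarize" idea. It makes a few assumptions that are not in the thread: calc.rel.num is a hypothetical variant of calc.rel that returns a bare number instead of a data frame, dm.toy is made-up data shaped like dm.both, and the function is fed the raw meas and se columns, since calc.rel as posted computes a variance and a mean squared error from vectors. The last statement shows the do() form, which is piped after group_by() rather than nested inside summarize().

```r
library(dplyr)

# Hypothetical variant of calc.rel: same arithmetic, but it returns a
# single number so summarize() can store it as an ordinary column.
calc.rel.num <- function(x, se) {
    var.x <- var(x, na.rm = TRUE)
    mse <- sum(se^2, na.rm = TRUE) / sum(!is.na(se))
    (var.x - mse) / var.x
}

# Made-up data with the same columns as dm.both (illustration only).
dm.toy <- data.frame(
    meas   = c(4.99, -0.11, 2.29, 3.42, -1.19, -0.84, 1.50, 0.20),
    se     = c(1.10, 0.62, 0.74, 0.78, 0.59, 0.59, 0.70, 0.60),
    agency = rep(c("CPS", "Other"), each = 4),
    type   = "Combined"
)

# summarize() already evaluates each expression once per group,
# so the helper is called directly on the grouped columns.
dm.comp <- dm.toy %>%
    group_by(type, agency) %>%
    summarize(mean   = mean(meas),
              stddev = sd(meas),
              rel    = calc.rel.num(meas, se))

# The do() form: inside do(), the current group's rows are available as `.`
# and the expression must return a data frame.
dm.rel <- dm.toy %>%
    group_by(type, agency) %>%
    do(data.frame(rel = calc.rel.num(.$meas, .$se)))
```

Both dm.comp and dm.rel should end up with one row per type/agency combination; the summarize() version keeps the mean and standard deviation alongside the reliability.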