ddply -> dplyr: .fun = summarize with several rows


Sebastian Schubert

Jun 23, 2014, 10:09:50 AM
to manip...@googlegroups.com
Hi,

I'm rather new to (d)plyr and want to focus on learning the more recent package. However, I failed to translate something like the following to dplyr:

library(plyr)
#library(dplyr)

dfx <- data.frame(
    group = c(rep('A', 8), rep('B', 15), rep('C', 6)),
    sex = sample(c("M", "F"), size = 29, replace = TRUE),
    age = runif(n = 29, min = 18, max = 54)
    )
 
p <- c(.2,.4,.6,.8)
ddply(dfx, .(group), .fun = summarize, p=p, stats=quantile(age,probs=p))
# dfx %>% group_by(group) %>% do(stats=quantile(.$age, probs=p))

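The commented-out line shows what I got working with dplyr, but the resulting data structure is more complicated: stats ends up as a list column with one entry per group instead of one row per quantile. (Don't load dplyr alongside plyr, or the ddply call will not work.)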
What can I do to get the same result with dplyr as with ddply?
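
For illustration, a minimal sketch of what the commented-out dplyr call returns when only dplyr is loaded. The set.seed() call and the name res are additions for this example, and the sex column is left out because it is not used:

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed, only so the example is reproducible
dfx <- data.frame(
  group = c(rep("A", 8), rep("B", 15), rep("C", 6)),
  age   = runif(n = 29, min = 18, max = 54)
)
p <- c(.2, .4, .6, .8)

# A named argument in do() stores each group's result in a list column
res <- dfx %>% group_by(group) %>% do(stats = quantile(.$age, probs = p))

nrow(res)         # one row per group
class(res$stats)  # "list": each element holds the four quantiles of one group
res$stats[[1]]    # quantiles for group A
```

So the quantiles are there, but nested inside a list rather than spread over rows as in the ddply output.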

Thanks a lot,
Sebastian

Sebastian Schubert

Jun 25, 2014, 6:56:11 AM
to manip...@googlegroups.com
> library(plyr)
> #library(dplyr)
>
> dfx <- data.frame(
> group = c(rep('A', 8), rep('B', 15), rep('C', 6)),
> sex = sample(c("M", "F"), size = 29, replace = TRUE),
> age = runif(n = 29, min = 18, max = 54)
> )
>
> p <- c(.2,.4,.6,.8)
> ddply(dfx, .(group), .fun = summarize, p=p, stats=quantile(age,probs=p))
> # dfx %>% group_by(group) %>% do(stats=quantile(.$age, probs=p))

I got help on Stack Overflow:
http://stackoverflow.com/a/24405853/1463740
http://stackoverflow.com/a/24406376/1463740

The solution is to pass an unnamed data.frame to do():

dfx %>%
    group_by(group) %>%
    do(data.frame(p = p, stats = quantile(.$age, probs = p)))
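
A self-contained sketch of the same call, in case someone wants to run it directly. The set.seed() call and the names res_do and res_reframe are additions for this example. The second variant uses reframe(), which assumes dplyr 1.1.0 or later and is not part of the Stack Overflow answers linked above:

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the example is reproducible
dfx <- data.frame(
  group = c(rep("A", 8), rep("B", 15), rep("C", 6)),
  sex   = sample(c("M", "F"), size = 29, replace = TRUE),
  age   = runif(n = 29, min = 18, max = 54)
)
p <- c(.2, .4, .6, .8)

# do() with an unnamed data.frame: its rows are bound together per group,
# giving one row per group and quantile, as ddply + summarize did
res_do <- dfx %>%
  group_by(group) %>%
  do(data.frame(p = p, stats = quantile(.$age, probs = p)))

# Alternative sketch, assuming dplyr >= 1.1.0: reframe() allows
# several rows per group without going through do()
res_reframe <- dfx %>%
  group_by(group) %>%
  reframe(p = p, stats = quantile(age, probs = p))

dim(res_do)  # 12 rows (3 groups x 4 quantiles), columns group, p, stats
```

Both results should have the columns group, p and stats, with four rows per group.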

Thanks
Sebastian