mimicking ldply with dplyr/tidyr: converting a list with observations of data to a data frame


Don Boyd

Jul 9, 2014, 8:11:06 AM
to manip...@googlegroups.com

I've never posted before so if I'm in the wrong place or am providing the wrong information please just let me know.

I love dplyr/tidyr and want to convert new work and some old work to it, from plyr/reshape2. The question below is in that spirit: I know how to do it well in plyr, and have figured out how to do it unattractively with dplyr, but I want to know if there is a way to do it well with dplyr.

I want to be able to convert a list of data to a data frame, where each list element is, conceptually, an observation in a data set. A very practical application is when data returned from an API, after conversion from JSON, is in list form like this. I can do this easily with ldply but my dplyr solution is ugly. My question is, is there a better way to do it with dplyr/tidyr?

Here is a reproducible example:

# Create a list of data, where each element is an observation
obs1 <- list(x="a", value=123)
obs2 <- list(x="b", value=27)
obs3 <- list(x="c", value=99)
dlist <- list(obs1, obs2, obs3)
dlist

# I want to convert this into a data frame that looks like
goal <- data.frame(x=c("a", "b", "c"), value=c(123, 27, 99))
goal

# In the days of plyr I would do this by
library(plyr)
df1 <- ldply(dlist, data.frame)
df1

# Here is how I have figured out how to do it with dplyr. (It produces rbind_all warnings and slight differences in column types, but neither concerns me.)
# No self-respecting analyst should accept a solution like this. This CAN'T be the best dplyr solution, can it?
library(dplyr)
df2 <- data.frame(recs = seq_along(dlist)) %>%
       group_by(recs) %>%
       do(data.frame(t(unlist(dlist[.$recs])))) %>%
       ungroup() %>%
       select(-recs)
df2

Again, I know I can just use ldply, but I am trying to use dplyr as well as possible, and I am trying to convert to dplyr/tidyr as much as possible. Many thanks if you have a better way.

Don


jim holtman

Jul 10, 2014, 12:57:43 PM
to Don Boyd, manipulatr
Try this:

> as.data.frame(do.call(rbind, dlist), stringsAsFactors = FALSE)
  x value
1 a   123
2 b    27
3 c    99
>
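
One caveat with the call above: rbind() on plain lists builds a list-matrix, so the resulting data frame prints like the goal but its columns are lists rather than atomic vectors. A minimal base-R sketch of that difference follows, assuming the same toy dlist as in the original post. The names df_list_cols and df_atomic are made up for illustration.

```r
# Same toy data as in the original post.
dlist <- list(
  list(x = "a", value = 123),
  list(x = "b", value = 27),
  list(x = "c", value = 99)
)

# rbind() on plain lists yields a list-matrix, so each column of the
# resulting data frame is itself a list.
df_list_cols <- as.data.frame(do.call(rbind, dlist), stringsAsFactors = FALSE)
is.list(df_list_cols$value)

# Converting each observation to a one-row data frame first keeps the
# columns atomic (character for x, numeric for value).
df_atomic <- do.call(rbind, lapply(dlist, as.data.frame, stringsAsFactors = FALSE))
str(df_atomic)
```

If list columns are acceptable for the task at hand, the one-liner is fine as is; the second form is only worth the extra step when later code expects ordinary vectors.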

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

Don Boyd

Jul 13, 2014, 8:18:50 AM
to manip...@googlegroups.com, donb...@gmail.com
Thanks. It's not a dplyr approach, but it is a lot nicer than the dplyr approach I had devised. 

I'll be curious to see whether dplyr ever can do this as elegantly as plyr can.

Don

Hadley Wickham

Jul 16, 2014, 7:20:52 PM
to Don Boyd, manipulatr
This falls outside of the scope of dplyr, since the input is not a
data frame. I think it's more in scope for tidyr, but that currently
provides no tools for converting lists to data frames. Maybe it
should.

Hadley
--
http://had.co.nz/

Don Boyd

Jul 16, 2014, 7:38:46 PM
to Hadley Wickham, manipulatr
Understood and thanks.

Don

Patrick Nicholson

Jul 24, 2014, 11:20:00 AM
to manip...@googlegroups.com, donb...@gmail.com
If you want to use a dplyr-like syntax, you can pipe to lapply. 

dlist %>% lapply(as.data.frame) %>% rbind_all()

For a more complex operation, you'll have to define a function to use in lapply that returns a data.frame.
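
As a rough sketch of that idea, the example below defines a small helper and pipes through lapply. It assumes a dplyr version that provides bind_rows(), which superseded rbind_all() in later releases. The helper name obs_to_df and the third record with a missing field are hypothetical, added only for illustration.

```r
library(dplyr)

# Toy data modelled on the original post; the third record deliberately
# lacks a "value" field (a hypothetical case, not from the thread).
dlist <- list(
  list(x = "a", value = 123),
  list(x = "b", value = 27),
  list(x = "c")
)

# Hypothetical helper: turn one observation into a one-row data frame,
# keeping strings as character rather than factor.
obs_to_df <- function(obs) {
  as.data.frame(obs, stringsAsFactors = FALSE)
}

# bind_rows() stacks the one-row data frames and fills columns that are
# missing from a record with NA.
df3 <- dlist %>% lapply(obs_to_df) %>% bind_rows()
df3
```

Any per-record cleanup (renaming fields, flattening nested elements) would go inside the helper, as long as it still returns a data frame.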