Issue applying a function with ddply()


David Mercer

Aug 8, 2012, 2:52:39 PM
to manip...@googlegroups.com
I have a data set containing 6 groups.  I am trying to impute the missing values and need to do it within groups.  I am using the irmi() function from the VIM package.  I have found a clunky approach to get the job done, but I would like to know how to do it with ddply().  I cannot figure out my problem.  Below is an example using the sleep data frame provided in VIM.  I created a group variable for the purpose of this post.



library(VIM)   # provides irmi() and the sleep data frame
library(plyr)  # provides rbind.fill() and ddply()

sleep$group<-c(rep(1,31),rep(2,31))

# runs fine on one group:
Group1_Fixed <- irmi( sleep[sleep$group==1, 1:7], init.method="median" )
Group1_Fixed

# this approach works:
Fixed <- rbind.fill(
  irmi(sleep[sleep$group==1,1:7]) ,
  irmi(sleep[sleep$group==2,1:7]))
Fixed

# this does not work:
imputed <- ddply(.data=sleep[,1:7],
                 .variables=sleep$group,
                 .fun=irmi)




Dennis Murphy

Aug 8, 2012, 5:55:16 PM
to David Mercer, manip...@googlegroups.com
Hi:

You need group to be in the data frame when you make the ddply() call
(variable 11):

ddply(sleep[, c(1:7, 11)], "group", irmi, init.method = 'median')

I don't know the details of the irmi() function, but the warnings that
are signaled might be worth perusing.
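
To illustrate the point about the grouping column, here is a minimal sketch that runs without VIM. The toy data frame and the impute_median() helper are made up for this example (a simple stand-in for irmi(), not its actual algorithm). The only thing it is meant to show is that ddply() looks up "group" inside the data frame it is given, then hands each piece to the function.

```r
library(plyr)

# Hypothetical toy data: two groups of three rows, with a few missing values.
toy <- data.frame(
  x     = c(1, NA, 3, 10, 20, NA),
  y     = c(2, 4, NA, 5, NA, 7),
  group = rep(1:2, each = 3)
)

# Stand-in for irmi(): fill each NA with the median of its column.
# This helper is an assumption for illustration only.
impute_median <- function(df) {
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      missing <- is.na(df[[col]])
      df[[col]][missing] <- median(df[[col]], na.rm = TRUE)
    }
  }
  df
}

# "group" is a column of toy, so ddply() can split on it by name.
imputed <- ddply(toy, "group", impute_median)
imputed
```

Because the medians are computed per piece, the missing x in group 1 is filled from group 1 only, and likewise for group 2.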

Dennis

David Mercer

Aug 9, 2012, 2:33:55 PM
to manip...@googlegroups.com, David Mercer
Thank you, Dennis.  That definitely worked for my example.  Of course, I tried to make the necessary changes for my actual data and I received a different error; it seems to think that the data is singular: 

Error in rlm.default(x, y, weights, method = method, wt.method = wt.method,  : 
  'x' is singular: singular fits are not implemented in rlm

That error is how I wound up not including my group variable in the identified data frame for ddply. I started off trying to do this with by() and received the above error. When I ran the analysis with by() minus the group variable it worked. So far it seems that by() does not like the group to be included in the data frame, but ddply() does. Any ideas?

David Mercer

Aug 9, 2012, 4:07:33 PM
to manip...@googlegroups.com, David Mercer
OK, I strongly suspect that the error in the last post is due to a bug in either plyr or VIM.

In my analysis I need to add some arguments to the irmi function, specifically:

imputed <- ddply(sleep,"group",irmi, robust=TRUE, noise=FALSE, init.method="median")

The command executes fine without the arguments, as in Dennis's fix for my initial problem.  It also executes with just the noise and init.method arguments.  However, it does not like the robust argument; the error arises only when robust=TRUE is included, whether by itself or with the other arguments.  As I mentioned before, the robust argument did not cause any issues when using by() or rbind.fill() without the group variable in the data, which is why I suspect an error in plyr or VIM.
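
One possible explanation, not confirmed anywhere in this thread: ddply() passes each piece to the function with the grouping column still attached, and within a piece that column is constant. A constant column could plausibly make the design matrix singular for the robust fit, which would also fit the observation that by() works once the group variable is left out. The sketch below shows one way to test that idea: a small wrapper that drops the grouping column before calling the imputation function. The names impute_without_group(), impute_fun and group_col are made up for this example, and impute_median() is again a simple stand-in for irmi().

```r
library(plyr)

# Hypothetical toy data: two groups of three rows, with a few missing values.
toy <- data.frame(
  x     = c(1, NA, 3, 10, 20, NA),
  y     = c(2, 4, NA, 5, NA, 7),
  group = rep(1:2, each = 3)
)

# Stand-in for irmi() (an assumption for illustration only).
impute_median <- function(df) {
  for (col in names(df)) {
    missing <- is.na(df[[col]])
    df[[col]][missing] <- median(df[[col]], na.rm = TRUE)
  }
  df
}

# Wrapper: remove the grouping column from each piece, then impute.
# Extra arguments in ... are forwarded to impute_fun.
impute_without_group <- function(piece, impute_fun, group_col = "group", ...) {
  predictors <- piece[, setdiff(names(piece), group_col), drop = FALSE]
  impute_fun(predictors, ...)
}

# ddply() adds the group labels back onto the combined result.
result <- ddply(toy, "group", impute_without_group, impute_fun = impute_median)
result
```

With the real data, something along the lines of ddply(sleep[, c(1:7, 11)], "group", impute_without_group, impute_fun = irmi, robust = TRUE, noise = FALSE, init.method = "median") might be worth trying. That call is untested here, so treat it as a starting point rather than a fix.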

