"weight" argument with sample_n and sample_frac in dplyr

Jonathan Judge

Dec 15, 2017, 11:27:17 AM
to manipulatr
The documentation for these functions does not seem to explain what input the "weight" argument expects.

Is this meant to be a simple set of counts, which the sample function will then take into account as necessary?

Is this meant to be a set of inverse probability weights?

Is this meant to be a set of pre-determined bootstrap weights, akin to what is created from survey::bootweights?

My goal is to resample clusters (which amounts to a mixture of replacement and non-replacement sampling), and I was curious whether the "weight" argument might help with this. But before I can figure that out, I need to know what input "weight" expects.

I'd appreciate any insight, thanks.

Jonathan

Hadley Wickham

Dec 19, 2017, 4:21:50 PM
to Jonathan Judge, manipulatr
The weights are scaled to sum to 1, then passed on to sample() as
probability weights.
Hadley
--
http://hadley.nz
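
The behavior described in the reply above can be sketched in base R. This is a minimal illustration under stated assumptions, not dplyr's actual source: the data frame `df`, its columns `id` and `w`, and the sample size of 2 are all hypothetical.

```r
# Hypothetical data: 'w' holds arbitrary non-negative weights
# (counts, inverse probabilities, or anything else).
df <- data.frame(id = c("a", "b", "c", "d"), w = c(1, 1, 2, 6))

# Step 1: scale the weights so they sum to 1.
p <- df$w / sum(df$w)

# Step 2: pass the scaled weights to sample() as probability weights.
set.seed(1)
idx <- sample(nrow(df), size = 2, replace = FALSE, prob = p)
picked <- df[idx, ]

# Assumed dplyr counterpart of the two steps above:
# dplyr::sample_n(df, 2, weight = w)
```

Because only the relative sizes of the weights matter after scaling, counts and inverse probability weights are both treated simply as relative selection probabilities. Note that with `replace = FALSE`, `sample()` applies the probabilities sequentially to the remaining rows, so the resulting inclusion probabilities are not exactly proportional to the weights.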