Hi everyone,
I’m using the packages plyr and dplyr, and I’m having trouble attaching appropriate labels to the output and obtaining the sum of each variable for each subset.
As an example I’m going to use the baseball dataset from the plyr package.
The objective is to obtain the sum of rbi (runs batted in) per year and team. It is not the case here, but in my own dataset some subsets have few observations, so I draw a sample with replacement from each subset of the data.
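library(plyr); library(dplyr)
samp_size <- 50
iter <- 10
# For one subset, draw samp_size*iter row indices with replacement, arranged as an iter x samp_size matrix
funct_df <- function(df) {matrix(sample(1:nrow(df), samp_size*iter, replace=TRUE), ncol=samp_size, byrow=TRUE)}
model <- dlply(baseball, .(year, team), funct_df)
str(model)  # list of 2527 matrices, each [1:10, 1:50]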
I need to transpose and then calculate the sum of each variable, but I don’t know how.
I have tried ldply, as in the split-apply-combine example, but without success.
Could someone help me?
Cheers,
Matias
--
You received this message because you are subscribed to the Google Groups "manipulatr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to manipulatr+unsubscribe@googlegroups.com.
To post to this group, send email to manip...@googlegroups.com.
Visit this group at https://groups.google.com/group/manipulatr.
For more options, visit https://groups.google.com/d/optout.
Hi Brandon,
Yes, as David wrote, the dataset is in the plyr package.
The reason I have to sample with replacement and build a matrix is that I need a representative mean for the population. 50 is the optimal number of crustaceans that should be sampled per station each year in the monitoring program, but many times I have fewer.
My dataset has 8 variables that I want to summarise by two variables (year and station), and some variables contain a large number of zeros, so by sampling with replacement I get a good representation.
I know how to do it if the result is just one list, but not for many:
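library(plyr); library(dplyr)
samp_size <- 50
iter <- 10
# For one subset, draw samp_size*iter row indices with replacement, arranged as an iter x samp_size matrix
funct_df <- function(df) {matrix(sample(1:nrow(df), samp_size*iter, replace=TRUE), ncol=samp_size, byrow=TRUE)}
model <- dlply(baseball, .(year, team), funct_df)
str(model)  # list of 2527 matrices, each [1:10, 1:50]
# x is the data for a single subset (not defined in this snippet)
y <- t(apply(model, 1, function(i) colSums(x[i,])))
y[1:8,]  # works only if model is a single matrix, not a list of them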
Matias
group_by(year, team) %>%
  nest() %>%
  slice(1:10) %>% # this is so it doesn't do the whole dataset
  mutate(samples = map(data, .f = function(x) sample_iter(x)))
# here we iterate over the nested data and sample it with the function we made
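For anyone who wants to try the idea without extra packages, here is a minimal base R sketch of the same split, resample, and sum steps. It is only an illustration under assumptions: the toy data frame and the helper name sample_sums are made up for this example and are not from the thread, and the two columns rbi and hr stand in for the 8 real variables.

```r
# Toy data standing in for the real monitoring data (hypothetical values).
set.seed(1)
toy <- data.frame(
  year = rep(c(2001, 2002), each = 6),
  team = rep(c("A", "B"), times = 6),
  rbi  = 1:12,
  hr   = 12:1
)

samp_size <- 50  # rows drawn per iteration
iter <- 10       # number of resampling iterations

# Hypothetical helper: resample one subset with replacement and return
# an iter x length(vars) matrix of column sums (one row per iteration).
sample_sums <- function(df, vars) {
  idx <- matrix(sample(seq_len(nrow(df)), samp_size * iter, replace = TRUE),
                ncol = samp_size, byrow = TRUE)
  do.call(rbind, lapply(seq_len(iter), function(k) {
    colSums(df[idx[k, ], vars, drop = FALSE])
  }))
}

# Split by year and team, resample each piece, and label every row
# with the year, team, and iteration it belongs to.
pieces <- split(toy, list(toy$year, toy$team), drop = TRUE)
result <- do.call(rbind, lapply(pieces, function(d) {
  data.frame(year = d$year[1], team = d$team[1], iteration = seq_len(iter),
             sample_sums(d, c("rbi", "hr")))
}))

# Mean of the resampled sums per year and team.
means <- aggregate(cbind(rbi, hr) ~ year + team, data = result, FUN = mean)
```

Here result has one labelled row per year, team, and iteration, so the labels travel with the sums instead of being lost inside a list of matrices. The same pattern should carry over to ldply or to the nest() and map() version above, with sample_sums playing the role of the sampling function.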