> loo_RL
Computed from 4000 by 129 log-likelihood matrix

         Estimate    SE
elpd_loo  -6920.1 150.0
p_loo       172.6  13.0
looic     13840.2 299.9

Warning messages:
1: 110 (85.3%) Pareto k estimates between 0.5 and 1
2: 19 (14.7%) Pareto k estimates greater than 1

> loo_RLnc
Computed from 4000 by 129 log-likelihood matrix

         Estimate    SE
elpd_loo  -6562.5 166.6
p_loo       230.7   8.9
looic     13125.1 333.2

Warning messages:
1: 74 (57.4%) Pareto k estimates between 0.5 and 1
2: 55 (42.6%) Pareto k estimates greater than 1

> compare(loo_RL, loo_RLnc)
elpd_diff        se   weight1   weight2
    357.6      42.3       0.0       1.0

Hi Stan,

I've been modeling a 2-armed bandit task with a basic reinforcement learning model in a trial-by-trial fashion. I have 129 subjects who each performed the same task for 100 trials, and I draw 4000 posterior samples in Stan. I've implemented 'loglik' in the generated quantities section, which gives me a 4000 by 129 log-likelihood matrix. For such a trial-by-trial model, we normally do not consider the log-likelihood per subject per trial; rather, we sum the log-likelihood over all trials and keep only one value per subject, because the trials are not fully independent according to the model, i.e. the expected value on the current trial is updated from the previous trial.
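(Equivalently, had I saved the trial-level log-likelihoods as a 4000 x 129 x 100 array, the subject-level matrix would just be their sum over trials; a hypothetical R illustration, where 'log_lik_trial' is my name for such an array:

# hypothetical array of trial-level log-likelihoods: draws x subjects x trials
log_lik_trial <- array(rnorm(4000 * 129 * 100), dim = c(4000, 129, 100))  # placeholder values
# summing over the trial dimension gives the 4000 x 129 subject-level matrix
log_lik_subj <- apply(log_lik_trial, c(1, 2), sum)
)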
When passing the two model fits to the loo() function and then comparing them, I obtained a huge proportion of subjects whose k-hat estimates are larger than 0.5 and even 1.0 (see above). Please note that each pointwise k-hat here represents a subject rather than a single data point. For even more complex models, we have even more subjects whose k-hat is greater than 1.

(1) I wonder whether I am doing this correctly. Is model comparison based on PSIS-LOO still reliable here?
(2) I also read in the paper (Vehtari et al., 2015) that I could consider computing k-fold CV instead; however, as this is not implemented in the {loo} package, I have no clear idea how to do it for our trial-by-trial model. Am I supposed to run a k(-trial)-fold CV within each subject, or a k(-subject)-fold CV across all the subjects? Any suggestions on how to calculate a k-fold CV for such a reinforcement learning model?
(3) Or, alternatively, how about using WAIC?
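(i.e., would it be as simple as the following, with 'fit_RL' being a hypothetical name for my stanfit object?

library(loo)
# same 4000-by-129 subject-level log-likelihood matrix as above
log_lik <- extract_log_lik(fit_RL, parameter_name = "loglik")
waic_RL <- waic(log_lik)
)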
Thanks, Aki, for your clear and comprehensive comments and suggestions! I now have two more questions concerning implementing the k-fold CV. I apologize if these questions seem too naive.

(1) Partitioning the data. If I run a 10-fold CV, this gives 12-13 subjects per fold. Do I select the subjects for each fold randomly, or systematically? And do I run the model fitting only K times, or do I use M different data partitions and fit the model M x K times?
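For the random option, I imagine something like this (K and the seed are arbitrary choices of mine):

set.seed(123)                               # arbitrary, just for reproducibility
K <- 10
fold <- sample(rep(1:K, length.out = 129))  # random subject-to-fold assignment
table(fold)                                 # 12 or 13 subjects per fold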
(2) Obtaining the k-fold CV output. Suppose I run a 10-fold CV, selecting subjects systematically (i.e., the first 13, the second 13, and so on), and I obtain ten log-likelihood matrices, one for each test fold. What do I do next to summarize the results and compare different models?

As I understood the paper, once I have the 10 log-likelihood matrices, then, e.g., for a single 4000-by-13 matrix L, I sum L across the S simulation draws and divide by S, i.e. colSums(L)/S; I then sum these results across the 10 folds. This would be the k-fold CV result for model comparison. Am I correct here?
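Or, if the elpd requires averaging the likelihood rather than the log-likelihood, perhaps it should instead be a log-mean-exp over the draws? In R, my guess ('log_lik_folds' is a hypothetical list holding the ten held-out matrices):

# numerically stable log of the mean of exp(x)
log_mean_exp <- function(x) max(x) + log(mean(exp(x - max(x))))
elpd_kfold <- sum(vapply(
  log_lik_folds,
  function(L) sum(apply(L, 2, log_mean_exp)),  # one elpd term per held-out subject
  numeric(1)
))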
And does it also make sense to multiply the result by -2 to translate it onto the information criterion scale?
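(i.e., something like looic_kfold <- -2 * elpd_kfold, by analogy with looic?)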
Hi!
Apologies for straying slightly, but this happens so often: suppose I want to do k-fold CV on the basis of subjects, as in the example described by Lei; in his case, I understood that each subject has exactly 100 trials. However, I am often facing the situation that subjects have vastly different numbers of observations. Is a leave-subjects-out approach still meaningful then, or should each subject have, on average, the same number of observations?
Thanks again, Aki, your explanation is indeed helpful!

Just one more concern: when computing elpd_loo as you described, is there also a way to calculate the estimated effective number of parameters 'p_loo', as in the loo() function? Could I still derive p_loo using the variance-based calculation from the 4000-by-129 loglik matrix? If so, that would be really great.
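To make this concrete, my guess (possibly wrong) is one of these two, using the full-data 4000-by-129 matrix 'log_lik' and the log_mean_exp helper from my earlier message:

# (a) difference between the full-data lpd and the cross-validated elpd
lpd <- sum(apply(log_lik, 2, log_mean_exp))  # log pointwise predictive density
p_kfold <- lpd - elpd_kfold
# (b) WAIC-style variance-based estimate
p_waic <- sum(apply(log_lik, 2, var))  # variance over draws, summed over subjects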
BTW, would you mind sharing your MATLAB code for k-fold CV, as you mentioned? Is it one of the .m files in your GitHub repository?
I hate multiplying by -2. Without it, the difference between lpd and elpd_loo is related to the effective number of parameters (p_loo = lpd - elpd_loo). I also prefer that a larger value is better.

Aki
Thanks again, Aki. I agree that it becomes tricky when interpreting the estimated effective number of parameters. However, when reporting LOO, PSIS-LOO, or k-fold CV, given that they are rather new methods for model comparison, I wonder whether it is plausible to keep the results comparable with those from AIC, BIC, DIC, etc. For such consistency reasons, I might put the bias correction alongside.
Last but not least, thanks (Aki) for putting effort into the MATLAB code. It will be a great contribution to the community.