Hello all, I’m trying to run a model using rstanarm, but I am having some problems and I’m unsure where I can find more advice.
Questions / comments:
- I tried changing the algorithm to “meanfield” and “fullrank” to test models, but the output seems quite random: sometimes it comes out with errors, sometimes it just goes quiet with no feedback and doesn’t create a model, and sometimes it gives some answers, mainly with warnings about not converging. When it doesn’t complain, I am unsure how to check convergence, as there is no Rhat statistic… So I have gone back to MCMC, which takes long, so I don’t get immediate feedback, but at least when it doesn’t collapse I seem to be able to assess convergence or use shinystan (which is really good!! Congrats!)…
- I tried centering the response (so I have positives and negatives) and using a gaussian, and I do get some results converging after some time; however, the posterior predictive check shows that the normal doesn’t really fit the response, which is quite skewed.
- I tried family = Gamma, which seems conceptually a better option for my response. It doesn’t converge.
- I tried adjusting the link to see if that helped, but it drops the “link =” parameter from the function… I followed the usual glm specification of family and link, but it doesn’t work. Do you know how I specify different links (e.g. identity / inverse / log)?
- I have tried setting adapt_delta to higher values like 0.98 / 0.99, and still no convergence.
- There was a comment in one of the replies about increasing the value for tree depth, but I haven’t found exactly how to pass that to stan_glmer(). I tried control = list(max_treedepth = 20), but it seems to ignore / drop it. Any ideas on how to make that go through?
- I’ve tried adding some priors to the coefficients, but maybe I need some more appropriate / vaguer ones, as when I add them the process goes into limbo. So for the moment I’m leaving that out.
- On the priors: when writing BUGS-type code, one usually specifies one distribution for each of the parameters... here it all seems to be collapsed into the "prior" option. I probably don't need to add different priors for each coefficient, and I'm guessing the priors for the variances are ok to leave at their defaults.
- Basic question: when specifying priors in rstanarm, does the scale follow the usual R convention of being the standard deviation, or does it refer to precisions (1/sigma^2)?
Hi Ana,
Sorry you're running into these issues. Some comments and questions below:
On Monday, February 15, 2016 at 10:17:48 PM UTC-5, AM Madrigal wrote:
Hello all, I’m trying to run a model using rstanarm, but I am having some problems and unsure where I can find more advice.
You found the right place!
Questions / comments:
- I tried changing the algorithm to “meanfield” and “fullrank” to test models, but the output seems quite random: sometimes it comes out with errors, sometimes it just goes quiet with no feedback and doesn’t create a model, and sometimes it gives some answers, mainly with warnings about not converging. When it doesn’t complain, I am unsure how to check convergence, as there is no Rhat statistic… So I have gone back to MCMC, which takes long, so I don’t get immediate feedback, but at least when it doesn’t collapse I seem to be able to assess convergence or use shinystan (which is really good!! Congrats!)…
When you say it "goes quiet with no feedback and doesn't create the model" do you mean nothing at all happens when you run the code? When it errors, are the messages about "numeric overflow"? Those do seem to happen occasionally with the variational algorithms. If it does converge it should say "MEDIAN ELBO CONVERGED", although that's not the same sort of convergence as for MCMC, which is why there is no Rhat statistic. That said, there has been some discussion of developing something similar to Rhat, but we haven't gotten there yet.
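For anyone following along, the algorithm is selected with the algorithm argument; everything else stays the same. A minimal sketch (the formula, data, and variable names here are made up for illustration):

```r
library(rstanarm)

# Full MCMC (the default): slower, but gives Rhat and the usual diagnostics
fit_mcmc <- stan_glmer(y ~ x + (1 | group), data = mydata,
                       family = gaussian())

# Variational approximations: much faster, but "convergence" here means
# the ELBO stabilized, which is not the same as MCMC convergence
fit_mf <- stan_glmer(y ~ x + (1 | group), data = mydata,
                     family = gaussian(), algorithm = "meanfield")
fit_fr <- stan_glmer(y ~ x + (1 | group), data = mydata,
                     family = gaussian(), algorithm = "fullrank")
```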
- I tried centering the response (so I have positives and negatives) and using a gaussian, and I do get some results converging after some time; however, the posterior predictive check shows that the normal doesn’t really fit the response, which is quite skewed.
Have you tried a log transformation of the positive response in combination with a gaussian likelihood?
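A sketch of that suggestion (variable names are placeholders): log-transforming the strictly positive response and keeping the gaussian family amounts to a log-normal model on the original scale:

```r
library(rstanarm)

# assumes the response y is strictly positive
mydata$log_y <- log(mydata$y)

fit_lognormal <- stan_glmer(log_y ~ x + (1 | group), data = mydata,
                            family = gaussian())

# posterior predictive check, on the log scale
pp_check(fit_lognormal)
```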
- I tried family = Gamma, which seems conceptually a better option for my response. It doesn’t converge.
- I tried adjusting the link to see if that helped, but it drops the “link =” parameter from the function… I followed the usual glm specification of family and link, but it doesn’t work. Do you know how I specify different links (e.g. identity / inverse / log)?
The link parameter should work the way you describe. How do you know it's ignoring it? I just tried out a few stan_glmer models with family = Gamma(link = "inverse") and family = Gamma(link = "log") and they seem to be ok.
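That is, the link goes inside the family constructor, exactly as with glm() (model and data names are placeholders):

```r
# the link is specified inside the family function, as with glm()
fit_inv <- stan_glmer(y ~ x + (1 | group), data = mydata,
                      family = Gamma(link = "inverse"))  # inverse is Gamma's default
fit_log <- stan_glmer(y ~ x + (1 | group), data = mydata,
                      family = Gamma(link = "log"))
```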
- I have tried setting adapt_delta to higher values like 0.98 / 0.99, and still no convergence.
By "no convergence" do you mean you're getting warnings about divergences? Or bad-looking trace plots / large Rhat values?
- There was a comment in one of the replies about increasing the value for tree depth, but I haven’t found exactly how to pass that to stan_glmer(). I tried control = list(max_treedepth = 20), but it seems to ignore / drop it. Any ideas on how to make that go through?
control = list(max_treedepth = 20) is the right way to do it. I just double-checked and I think it's working properly. If you specify control = list(max_treedepth = 20) and then run a model, what does it say if you then look at model$stanfit@stan_args[[1]]$control, which will pull out the internal control arguments used?
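Putting that check together in one place (fit/data names are placeholders):

```r
fit <- stan_glmer(y ~ x + (1 | group), data = mydata,
                  family = gaussian(),
                  control = list(max_treedepth = 20))

# inspect the control arguments Stan actually used;
# max_treedepth should show up here as 20
fit$stanfit@stan_args[[1]]$control
```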
- I’ve tried adding some priors to the coefficients, but maybe I need some more appropriate / vaguer ones, as when I add them the process goes into limbo. So for the moment I’m leaving that out.
- On the priors: when writing BUGS-type code, one usually specifies one distribution for each of the parameters... here it all seems to be collapsed into the "prior" option. I probably don't need to add different priors for each coefficient, and I'm guessing the priors for the variances are ok to leave at their defaults.
In rstanarm, the location and scale arguments for the different priors (e.g. normal, student_t, etc) can be vectors if you want to provide different values for the different coefficients. If they're just scalars then the values are replicated to the appropriate length.
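A short sketch of both forms (variable names are placeholders):

```r
library(rstanarm)

# one scalar scale, recycled across all coefficients
fit1 <- stan_glmer(y ~ x1 + x2 + (1 | group), data = mydata,
                   prior = normal(location = 0, scale = 5))

# or one location/scale per coefficient (here: x1 and x2)
fit2 <- stan_glmer(y ~ x1 + x2 + (1 | group), data = mydata,
                   prior = normal(location = c(0, 0), scale = c(2, 10)))
```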
- Basic question: when specifying priors in rstanarm, does the scale follow the usual R convention of being the standard deviation, or does it refer to precisions (1/sigma^2)?
Yes, the scale is the standard deviation.
Also, which version of rstanarm are you using? There is a new version on CRAN that we haven't announced yet because the binaries might still be building, but for now it should be available at least on the RStudio CRAN mirror. So I would definitely update to the most recent version if you haven't already.
Best,
Jonah
In the meantime, I tried running the same model as above, just changing the algorithm to both "fullrank" and "meanfield", and it does come back with the "numerical overflow" message. What does that mean?
It means that some value became too large. It's just a computational issue due to arithmetic with finite bit representations of numbers. The gamma distribution can have this issue sometimes, especially in combination with the variational approximations. Maybe Dustin or Alp (they wrote those algorithms) will have a better suggestion, but I'd start with using more informative priors if possible.
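As one hedged example of that suggestion (the specific scales here are illustrative, not a recommendation from the thread), tightening the priors looks like:

```r
# tighter priors can keep extreme parameter values, and hence the
# variational approximations, from overflowing
fit_vb <- stan_glmer(y ~ x + (1 | group), data = mydata,
                     family = Gamma(link = "log"),
                     prior = normal(0, 1),
                     prior_intercept = normal(0, 5),
                     algorithm = "meanfield")
```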
I tried at some point running a simple frequentist glmer from lme4 to compare, and sometimes even that would "go quiet" too... if I manage to reproduce it I'll let you know.
- I tried centering the response (so I have positives and negatives) and using a gaussian, and I do get some results converging after some time; however, the posterior predictive check shows that the normal doesn’t really fit the response, which is quite skewed.
Have you tried a log transformation of the positive response in combination with a gaussian likelihood?
That is a great idea: a log-normal like that might work fine. I'll try it and come back to you. I have actually now set it running and I'll wait and see if it converges.
So I guess it was running with the default link (inverse?) when it failed to converge.
By "no convergence" do you mean you're getting warnings about divergences? Or bad-looking trace plots / large Rhat values?
I mean getting messages at the end of the process saying something like "chains did not converge. do not analyse the results!"
I'll look into this and come back to you. From memory, at the same time that simulation came up with the warning that it was ignoring the "link =" instruction, it said something similar about max_treedepth.
Thanks! I ran the log-normal combination and it looks much more promising. It took about an hour and seems to be converging. It came with some warnings:
Warning messages:
1: There were 1 divergent transitions after warmup. Increasing adapt_delta above 0.98 may help.
2: Examine the pairs() plot to diagnose sampling problems
What does it mean to have "X divergent transitions"? Is that counted over all chains? If so, having one across chains of length 8000 doesn't seem dangerous (?)
What is the right use of the pairs() plot in this context? I tried using it directly as pairs(stan_glmermodel) and it returns errors, so I'm assuming that is not the way.
Quick question in terms of goodness of fit and comparing models with different explanatory variables. I know there are no BIC / AIC measures to compare whether a model with / without a variable is "better" than another... do you have any ideas for that?
What is the right use of the pairs() plot in this context? I tried using it directly as pairs(stan_glmermodel) and it returns errors, so I'm assuming that is not the way.
That is the right way. What were the errors?
Loading required namespace: KernSmooth
Error in check_pars(allpars, pars) :
no parameter b[(Intercept) filmCluster:1], b[(Intercept) filmCluster:2], b[(Intercept) filmCluster:3], ..... etcetera...
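One thing worth trying (my own suggestion, not something confirmed in the thread): restrict pairs() to a handful of parameters via the underlying stanfit object, since plotting every group-level b[...] term at once is rarely readable anyway. The parameter names below are illustrative; list the real ones first:

```r
# see what the internal parameter names are first
fit$stanfit@sim$pars_oi

# then restrict the pairs plot to a manageable subset, e.g.
pairs(fit$stanfit, pars = c("alpha", "beta"))
```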
--
You received this message because you are subscribed to a topic in the Google Groups "Stan users mailing list" group.
Quick question. I have a factor with levels "A", "B", "C", but when I run the model I get two coefficients (I guess the third level is taken as the baseline), and the output labels them "L" and "Q". I saw the same in another test I was running with another factor variable, where two levels seem to be recoded. Is this standard in rstanarm? If so, is there any way I can return to my original levels to identify which is which?
options(contrasts = c(unordered = 'contr.treatment',
ordered = 'contr.treatment'))
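To see where the "L" / "Q" labels come from (a small self-contained illustration, not from the thread): ordered factors use polynomial contrasts (contr.poly) by default, whose columns are labeled linear / quadratic, while treatment contrasts keep the original level names:

```r
f <- factor(c("A", "B", "C"), ordered = TRUE)

# default for ordered factors: polynomial contrasts -> columns f.L and f.Q
colnames(model.matrix(~ f))

# with treatment contrasts the original level names show up again (fB, fC)
colnames(model.matrix(~ f, contrasts.arg = list(f = "contr.treatment")))
```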