I'll let Andrew or Aki answer the WAIC/DIC, etc questions.
You may need to repost with WAIC in the title to get their
attention.
Almost all the posts on this list are problem-oriented. We
wish more people would write in with success stories and to
share models. We hear them from people in person and via
off-list e-mail, but not so much on-list. It gives a very skewed
perspective!
Whatever you do, a model generating negative probabilities is
ill posed. The usual thing to do to convert unbounded values into
probabilities is to use softmax (exp-transform before normalizing).
It's the link for multi-logit.
As to (1), I don't know if Martyn Plummer updated rjags to
use split-R-hat the way we do in Stan. The split-R-hat (aka
split-"psrf") defined in our manual and in the latest Gelman
et al. Bayesian Data Analysis book is more conservative,
as is our n_eff calculation, which also uses cross-chain information.
You can pull the JAGS output into Stan this way, assuming you can
get the same kind of output out of rjags as out of R2jags.
fit_jags <- jags(model.file = "irt_hier.jags",
data = list(I=I, J=J, y=y),
init = init_list,
parameters.to.save = c("mu_a", "sigma_a", "a",
"mu_b", "sigma_b", "b",
"theta"),
n.chains = num_chains,
n.iter = 20000,
n.thin = 2,
jags.seed = init_seed);
print(c("jags", elapsed_time_jags));
fit_jags_mcmc <- as.mcmc(fit_jags);
arr <- as.array(fit_jags_mcmc);
arr <- aperm(arr, c(1,3,2));
mon <- monitor(arr, inc_warmup = FALSE);
The mon object should everything you need in it.
Some comments on the model below.
> On Jun 25, 2015, at 8:39 PM,
jun...@gmail.com wrote:
>
> My question actually can be rephrased as whether there is way to estimate generalized ordered logit/probit models, in which we can get negative predicted probabilities (so if we replace those negative predicted probabilities with zero, we might have sums greater than one). Thanks!
>
> On Thursday, June 25, 2015 at 8:05:38 PM UTC-4,
jun...@gmail.com wrote:
> Dear rstanians,
>
> This is a bit problems loaded post since I ran into couple of problems estimating a generalized ordered logit model, which is an extension to ordered logit/probit model discussed in p.50 of the Stan Modeling Language User's Guide v2.6.2. Instead of assuming that slopes (for the same independent variables) are the same across different cut-point equations (hence the proportional odds assumption), the generalized ordered logit model allows slopes to vary across equations. Because simple ordered logit model assumes the equivalence of coefficients, stan uses the ordered_logistic() function to set up the likelihood; but to set up a generalized ordered logit model, one may have to some modified version of of the stan codes on p.51 for an ordered probit regression model. Here are my questions (from the most to least straightforward):
>
> 1. how to get those model convergence statistics. I know using print(stanfit) (assuming stanfit is the stan function output object) can produce potential scale reduction factor (psrf); is it the same as those produced by gelman.diag() after rjags? How about other convergence and model fit/comparison statistics (DIC)? Professor Gelman suggested that one do not use DIC since it is not fully Bayesian, but is there any quick way to get it and triangulate results with those from WAIC?
>
> 2. I have a couple technical questions about stan and rstan codes. In terms of the empirical example that I have been working on, the ordinal response variable has eight levels, so one way to write the model is:
>
> data {
>
> int<lower=2> K; // total number of levels for y
>
> int<lower=0> N; // total number of cases
>
> int<lower=1> D; // total number of independent variables
>
> int<lower=0, upper=K> y[N];
>
> row_vector[D] x[N];
>
> }
>
> parameters {
>
> vector[D] b1;
>
> vector[D] b2;
>
> vector[D] b3;
>
> vector[D] b4;
>
> vector[D] b5;
>
> vector[D] b6;
>
> vector[D] b7;
>
This should just be
matrix[7, D] b;
> ordered[K-1] thresh;
>
> }
>
> model {
>
> vector[K] prob;
>
> vector[K-1] eta;
>
> for (d in 1:D) {
>
> b1[d] ~ normal(0, 10);
>
> b2[d] ~ normal(0, 10);
>
> b3[d] ~ normal(0, 10);
>
> b4[d] ~ normal(0, 10);
>
> b5[d] ~ normal(0, 10);
>
> b6[d] ~ normal(0, 10);
>
> b7[d] ~ normal(0, 10);
>
> }
Then this should be:
to_vector(b) ~ normal(0, 10);
though we'd recommend tighter priors unless you're expecting values
of the scale 10.
>
> for ( k in 1:K-1)
>
> thresh[k] ~ normal( k+0.5, 10);
I don't understand why you're doing this. You usually want
to deal with cutpoints. Check out either the Gelman and Hill
book or the ordered_logit example in the manual.
>
> for (n in 1:N) {
>
> eta[1] <- x[n]*b1;
>
> eta[2] <- x[n]*b2;
>
> eta[3] <- x[n]*b3;
>
> eta[4] <- x[n]*b4;
>
> eta[5] <- x[n]*b5;
>
> eta[6] <- x[n]*b6;
>
> eta[7] <- x[n]*b7;
This then becomes
eta <- x * b;
>
> prob[1] <- inv_logit(thresh[1] - eta[1]);
>
> for (k in 2:(K-1)) {
>
> prob[k] <- inv_logit(thresh[k] - eta[k]) - inv_logit(thresh[k-1] - eta[k-1]);
>
> }
>
The problem here is that you have no guarantee these are going
to be positive.
> prob[K] <- 1 - inv_logit(thresh[K-1] - eta[K-1]);
>
> y[n] ~ categorical(prob);
>
> }
>
> }
>
>
> After running my model, I got many error message
Right --- the model as formulated won't work.
> like " the current indicating that the Metropolis proposal is about to be rejected because of the following issues:
> stan::prob::categorical_log(N4stan5agrad3varE): Probabilities is not a valid simplex: Probabilities parameter .... should be greater than or equal to zero....
> If this warning occurs sporadically....then the sampler is fine, but if this warning occurs often, then your model may be either severely ill-conditioned or misspecified."
>
> Here I am guessing that the reason why I got so many error messages and the process continued forever is because the generalized ordered logit/probit does not guarantee that the probabilities are positive since they are differences in two cumulative densities, which is a design flaw with the model. And I don't know how to resolve this issue. I also tried using
Correct. See above.
>
> "prob[k] <- max(0, inv_logit(thresh[k] - eta[k]) - inv_logit(thresh[k-1] - eta[k-1]));"
>
> But looks like it doesn't work
You want to model the positive differences.
>
> 3. My way of specifying model as laid out above is clearly not efficient.
See above. The computational inefficiency comes from not
vectorizing, but the bigger problem is all the cut-and-paste
and repetition making it hard to understand.
> Since I have different slope vectors, I am thinking about using a matrix for the slopes like (see the highlighted part), but then I don't know how to code the rest of it....
>
I'd seriously recommend single-spacing code. But maybe
this is some kind of mailer or cut-and-paste artificact.
Let us know if that answer's not clear --- but do have a look
at the built-in ordered logit and how it's defined.
- Bob