Hello,
I am estimating a simple graded response model (4 items, each with four categories). I have slightly modified a model that runs in JAGS so that it runs in Stan, but I am encountering some problems and am wondering if anybody can provide some pointers.
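For reference, the model I am trying to fit is the usual graded response model, which is what the code below is meant to implement. Writing $\theta_i$ for the ability of person $i$, $\alpha_j$ for the discrimination of item $j$, and $\tau_{j1} < \dots < \tau_{j,K_j-1}$ for its ordered thresholds, the cumulative probabilities are

$$Q_{ijk} = \Pr(y_{ij} > k \mid \theta_i) = \operatorname{logit}^{-1}\big(\alpha_j(\theta_i - \tau_{jk})\big), \qquad k = 1, \dots, K_j - 1,$$

and the category probabilities are obtained by differencing:

$$\Pr(y_{ij} = 1) = 1 - Q_{ij1}, \qquad \Pr(y_{ij} = k) = Q_{ij,k-1} - Q_{ijk}, \qquad \Pr(y_{ij} = K_j) = Q_{ij,K_j-1}.$$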
Below is everything, including the data from the ltm package.
I would greatly appreciate any help.
Thanks
require(ltm)
data <- Science[c(1, 3, 4, 7)]          # four items from the ltm Science data
data <- cbind(as.numeric(data[, 1]), as.numeric(data[, 2]),
              as.numeric(data[, 3]), as.numeric(data[, 4]))  # factor levels -> integer codes 1..4
n <- nrow(data)                         # number of respondents
p <- ncol(data)                         # number of items
K <- apply(data, 2, max, na.rm = TRUE)  # number of categories per item
data.1 <- list(y = data, n = n, p = p, K = K)
grm_code<-"
data {
int<lower=0> n; //Number of participants
int<lower=0> p; //Number of items
int<lower=0> K; //Number of categories per item
int<lower=1,upper=4> y[n]; //Data
}
parameters {
real theta[n]; // ability of each respondent
ordered[K-1] tau[p]; // ordered category thresholds (difficulties) for each item
real<lower=0> alpha[p]; // discrimination parameter for each item
real<lower=0> sigma_tau; // scale of the threshold (difficulty) prior
real<lower=0> sigma_alpha; // scale of the discrimination prior
}
model{
real Q[n, p, 3];
real prob[n, p, 4];
theta~normal(0, 1);
tau ~ cauchy(0,sigma_tau);
alpha~cauchy(0, sigma_alpha);
sigma_tau~cauchy(0,5);
sigma_alpha~cauchy(0,5);
for(i in 1:n){
for (j in 1:p){
for (k in 1:(K[j]-1)){
Q[i,j,k]<-inv_logit(alpha[j]*(theta[i]-tau[j,k]));
}
prob[i,j,1]<-1-Q[i,j,1];
for (k in 2:(K[j]-1))
prob[i,j,k]<-prob[i,j,k-1]-prob[i,j,k];
prob[i,j,K[j]]<-Q[i,j,K[j]-1];
y[i,j]~categorical(prob[i,j,1:K[j]]);
}
}
}
"
fit <- stan(model_code = grm_code,
data = data.1, iter = 1000, chains = 2)
grm_code<-"
data {
int<lower=0> n; //Number of participants
int<lower=0> p; //Number of items
int<lower=0> K; //Number of categories per item
int<lower=1,upper=4> y[n,p]; //Data
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model 'grm_code' with error message:
EXPECTATION FAILURE LOCATION: file=input; line=33, column=42
y[i,j]~categorical(prob[i,j,1:K[j]])
^-- here
DIAGNOSTIC(S) FROM PARSER:
Parser expecting: "]"
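Side note on the parse error: the parser appears to be choking on the 1:K[j] range index inside the categorical call rather than on K itself. Since every item here has four categories anyway, one possible workaround (just a sketch, assuming the data and parameters blocks above with y declared as y[n,p] and tau declared as ordered[3] tau[p]) is to hold the category probabilities in a fixed-length vector and pass the whole vector, so no range index is needed:

model {
  real Q[3];
  vector[4] pr;
  theta ~ normal(0, 1);
  alpha ~ cauchy(0, sigma_alpha);
  sigma_tau ~ cauchy(0, 5);
  sigma_alpha ~ cauchy(0, 5);
  for (j in 1:p)
    tau[j] ~ cauchy(0, sigma_tau);
  for (i in 1:n) {
    for (j in 1:p) {
      for (k in 1:3)
        Q[k] <- inv_logit(alpha[j] * (theta[i] - tau[j, k]));  // P(y > k)
      pr[1] <- 1 - Q[1];
      pr[2] <- Q[1] - Q[2];
      pr[3] <- Q[2] - Q[3];
      pr[4] <- Q[3];
      y[i, j] ~ categorical(pr);  // pr sums to 1 by construction
    }
  }
}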
grm_code<-"
data {
int<lower=0> n; //Number of participants
int<lower=0> p; //Number of items
int<lower=0> K[p]; //Number of categories per item
int<lower=1,upper=4> y[n,p]; //Data
}
parameters {
real theta[n]; // ability of each respondent
ordered[K[p]-1] tau[p]; // ordered category thresholds (difficulties) for each item
real<lower=0> alpha[p]; // discrimination parameter for each item
real<lower=0> sigma_tau; // scale of the threshold (difficulty) prior
real<lower=0> sigma_alpha; // scale of the discrimination prior
}
model{
real Q[n, p, 3];
real prob[n, p, 4];
theta~normal(0, 1);
for (i in 1:p)
tau[i] ~ cauchy(0,sigma_tau);
alpha~cauchy(0, sigma_alpha);
sigma_tau~cauchy(0,5);
sigma_alpha~cauchy(0,5);
for(i in 1:n){
for (j in 1:p){
for (k in 1: (K[j]-1)){
Q[i, j, k]<-inv_logit(alpha[j]*(theta[i]-tau[j,k]));
}
prob[i,j,1]<-1-Q[i,j,1];
for (k in 2: (K[j]-1))
prob[i,j,k]<-Q[i,j,k-1]-Q[i,j,k];
prob[i,j,K[j]]<-Q[i,j,(K[j]-1)];
require(rstan)
require(ltm)
data<-Science[c(1,3,4,7)]
data<-cbind(as.numeric(data[,1]),as.numeric(data[,2]), as.numeric(data[,3]), as.numeric(data[,4]) )
c  <- 4                          # number of categories per item (here, 4)
J  <- nrow(data)                 # number of respondents
K  <- ncol(data)                 # number of items
N  <- J * K                      # total observations in long format
jj <- rep(1:J, times = K)        # person index for each observation
kk <- rep(1:K, each = J)         # item index for each observation
y  <- c(data[, 1], data[, 2], data[, 3], data[, 4])  # responses stacked item by item
dat <- list(N = N, c = c, J = J, K = K, jj = jj, kk = kk, y = y)
grm_code<-"
data {
int<lower=0> N; //Number of observations
int<lower=1> c; //Number of categories
int<lower=1> J; //Number of ids
int<lower=1> K; //Number of items
int<lower=1, upper=J> jj[N]; // person index for each observation
int<lower=1, upper=K> kk[N]; // item index for each observation
int<lower=1, upper=c> y[N]; // responses
}
parameters {
real theta[J]; // ability, one per respondent
ordered[c-1] tau[K]; // ordered cutpoints (difficulties) for each item
real<lower=0> alpha[K]; // discrimination parameter for each item
}
model{
alpha~cauchy(0, 5);
theta~normal(0,1);
for (p in 1:K){
tau[p, 1]~uniform(-100, tau[p, 2]);
tau[p, 2]~uniform(tau[p, 1], tau[p, 3]);
tau[p, 3]~uniform(tau[p, 2], 100);
}
for (n in 1:N)
y[n]~ordered_logistic(alpha[kk[n]]*theta[jj[n]], tau[kk[n]]);
}
"
fit <- stan(model_code = grm_code,
data = dat, iter = 100, chains = 3)
On Wednesday, July 10, 2013 11:48:20 AM UTC-4, Bob Carpenter wrote:
> We put that message in for diagnostic reasons and it's been confusing users ever since.

On 7/10/13 12:19 PM, Ben Goodrich wrote:
> Maybe it should be off by default during warmup? Always on after warmup is reasonable, but the message isn't even that meaningful on iteration 1.

It will diagnose the problem of too few constraints being declared, which is one of the main reasons it's there.
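The excerpt doesn't show which message is being discussed, but the "too few constraints being declared" situation Bob describes is easy to reproduce: a parameter whose prior has support on only part of the real line, with no matching bound in its declaration. A minimal sketch (not from the thread, parameter name illustrative):

parameters {
  real sigma;              // missing <lower=0>
}
model {
  sigma ~ lognormal(0, 1); // zero density for sigma <= 0, so negative proposals get rejected
}

Declaring the parameter as real<lower=0> sigma; lets Stan sample it on an unconstrained log scale instead, and the rejection messages go away.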
Dear Bob,

Thank you for the feedback. I agree on the identifiability issue; I knew that, but must have missed it, and somehow my quick check did not reveal the problem. I am, however, a bit puzzled by your comment on the support. Could you clarify the following?

- I thought it is OK to put a restriction on the declaration and have an unrestricted prior, e.g. declare real<lower=0> alpha[K] and later on alpha ~ cauchy(0, 2), but the opposite is not OK? That is, no restriction in the declaration but a prior with restricted support, e.g. uniform(a, b)?

- What would you suggest in the example above? (How would you restrict the ordered declaration, if that is the suggestion?)
On Thursday, November 27, 2014, <louis...@gmail.com> wrote:
> I thought it is OK to put a restriction on the declaration and have an unrestricted prior ... but the opposite is not OK? No restriction in the declaration but specifying a prior with some restriction, e.g. uniform(a, b)?

That is correct. If the parameters end up outside of the support of the uniform distribution, the log probability is -inf and there is no gradient, so NUTS / HMC stops working efficiently. There is a chance it stumbles back into an area of support, but I think our implementation won't.

By default, when Stan initializes its parameters, it chooses random values on the unconstrained scale. If the initial parameters evaluate to a -inf log probability, Stan chooses a different random starting point. It does this a maximum of 100 times and then quits if it can't find a reasonable starting value.

If you're insistent on using that sort of construction, you could start sampling by providing initial values, but it would still be very inefficient. Stan does a change of variables under the hood, which makes sampling much easier over bounded variables: there is no possible chance of stepping into a place with no support.

> What would you suggest in the example above? (How would you restrict the ordered declaration, if that is the suggestion?)

You might be able to do a change of variables yourself and add the right Jacobian term, maybe use a positive_ordered, then subtract 100?
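To make the two constructions concrete, here is a minimal sketch (the parameter names are only illustrative, and these are two separate toy programs, not part of the model above):

// Fine: the constraint lives in the declaration, the prior is unrestricted
parameters {
  real<lower=0> alpha;
}
model {
  alpha ~ cauchy(0, 2);   // effectively a half-Cauchy, since alpha is declared positive
}

// Trouble: no constraint declared, support imposed only through the prior
parameters {
  real beta;              // can be initialized or proposed anywhere on the real line
}
model {
  beta ~ uniform(0, 10);  // log probability is -inf whenever beta is outside (0, 10)
}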
On Thursday, November 27, 2014, Daniel Lee <bea...@alum.mit.edu> wrote:
> You might be able to do a change of variables yourself and add the right Jacobian term, maybe use a positive_ordered, then subtract 100?

Better solution: scale a simplex. I think that'll
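The reply is cut off there, but one reading of "scale a simplex" (an assumption on my part, not something spelled out in the thread) is to build the ordered cutpoints from a simplex in the transformed parameters block, so both the ordering and the bounds live in declarations rather than in uniform priors:

parameters {
  simplex[c] tau_raw[K];   // one simplex per item; its cumulative sums are increasing
  real theta[J];
  real<lower=0> alpha[K];
}
transformed parameters {
  ordered[c - 1] tau[K];
  for (k in 1:K)
    // first c - 1 cumulative sums lie in (0, 1); rescale to the (-100, 100)
    // range the uniform priors were aiming for
    tau[k] <- -100 + 200 * head(cumulative_sum(tau_raw[k]), c - 1);
}

The likelihood stays exactly as before (y[n] ~ ordered_logistic(alpha[kk[n]] * theta[jj[n]], tau[kk[n]])), the explicit uniform priors on tau disappear, and whatever prior you want on the cutpoints can be placed on tau_raw instead.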
My apologies -- you are correct. More information would be useful. Below I've included comments throughout the code. I ended up testing it on the Science data in the R ltm package, which consists of four items with four response categories each. I realize that Stan cannot yet accommodate ragged arrays, but the number of categories is constant across items here (and in my work), so the vector of response categories by item could just as easily be changed to a scalar where K=4. Other than that, I think the comments will help. If not, please let me know. And thanks in advance for the help. It's much appreciated.
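For completeness, a minimal sketch of that scalar-K data block (only the data block changes; the bounds are assumptions matching this data set):

data {
  int<lower=1> n;                 // number of respondents
  int<lower=1> p;                 // number of items
  int<lower=2> K;                 // number of categories, the same for every item
  int<lower=1, upper=K> y[n, p];  // observed responses
}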