On 9/25/13 8:48 AM, Newbie (: wrote:
> Hi Bob,
> Thank you so much for helping! :) I have attached the stan R code with inf~categorical(k) in the model block. What I
> want to do is to let k[1]=1-pi1 (probability of a person being non-infected) and k[2]=pi1 (probability of someone being
> infected. int inf[I] and inf~categorical(k) so that inf[i] can be either equal 1 or 2. When inf[i]=2, I let mu[i]=alpha
> and when inf[i]=1, I let mu[i]=-alpha.
> The abovementioned code works in JAGS: inf[i]~dcat(k[]).
> But I will get an error after running the R stan code:
> inf~categorical(k);
> ^-- here
>
> DIAGNOSTIC(S) FROM PARSER:
> no matches for function name="categorical_log"
> arg 0 type=real
> arg 1 type=vector
> available function signatures for categorical_log:
> 0. categorical_log(int, vector) : real
> unknown distribution=categorical
> Parser expecting: "}"
> I am not sure whether is it because inf is unknown that cause this error. But I want to let inf to be equal 1 or 2
> depending on pi1 (probability of infected).
You just need to define the variable "inf" as an int rather
than a real.
This is a subtle issue in our parser. The name "inf" shouldn't
be used for a variable because the parser wants to parse it as
a numeric literal denoting +infinity. That's why it thinks
the type of "inf" in:
inf~categorical(k);
is real.
We can solve this problem by declaring a different variable:
int foo[I];
And now
for (i in 1:I) {
foo ~ categorical(k);
gives you a more interpretable error message, namely:
EXPECTATION FAILURE LOCATION: file=../temp2/newbie.stan; line=29, column=5
foo ~ categorical(k);
^-- here
DIAGNOSTIC(S) FROM PARSER:
no matches for function name="categorical_log"
arg 0 type=int[1]
arg 1 type=vector
available function signatures for categorical_log:
0. categorical_log(int, vector) : real
What you really wanted to do was say:
for (i in 1:I) {
foo[i] ~ categorical(k);
We like to stick to Gelman's notation where Greek letters
are used for parameters (e.g., theta), lower-case roman letters (e.g., k)
are used for loop variables and indexes and upper-case roman
letters (e.g., K) for un-modeled constants.
> Btw the issue 3 you have mentioned, the tau in my model actually can be negative and they do not have to sum up to 1. So
> I let tau[1] to be normally distributed and the rest of the tau to follow a uniform distribution. So tau[1] can actually
> be negative too.
The way you've declared these variables won't work because
the declared support of the variables still does not match
the sampling constraints. Everything's OK but the upper bound
of 100.
parameters {
...
ordered[8] tau;
}
model {
...
tau[1]~normal(0,0.01);
tau[2]~uniform(tau[1],tau[3]);
tau[3]~uniform(tau[2],tau[4]);
tau[4]~uniform(tau[3],tau[5]);
tau[5]~uniform(tau[4],tau[6]);
tau[6]~uniform(tau[5],tau[7]);
tau[7]~uniform(tau[6],tau[8]);
tau[8]~uniform(tau[7],100);
The particular issue is that tau[8]'s sampling distribution
puts an upper limit of support at 100 whereas there's no such
constraint on the declaration. There's not a good way to code
this constraint into Stan without declaring each variable tau[1]
through tau[8] separately with the appropriate constraints, which
has to be done in reverse order because tau[8] is the key value.
parameters {
real<upper=100> tau8;
real<upper=tau8> tau7;
...
real<upper=tau2> tau1;
This keeps tau8 under 100, and each successive value after that
under the previous one.
I think this will work, but it's ugly.
An alternative is to do the transforms (and Jacobians) yourself,
which is also clunky (not to mention error prone).
- Bob
P.S. I'm opening an issue to put "inf" on our list of reserved
words so you get a more sensible error message up front when
trying to define the variable:
https://github.com/stan-dev/stan/issues/239
You can use the above link to track progress. We'll probably get
this fix in before the official 2.0 release. Beta will be real soon.
We're feature frozen and just finalizing some doc at this point.
P.P.S. Could everyone please start indenting their code so it's more
readable? My standard operating procedure is to cut out of R strings,
move to a file, then let emacs reset it all. But it's for your own
good to do this the "right" way to begin with --- it'll save you huge
headaches in missing where blocks start and end if the visual code
layout matches the logical layout. The result for this model is:
data {
int<lower=0> I; //number of subjects
int<lower=2> J;
int<lower=1,upper=J> y[I];
}
parameters {
real<lower=0,upper=10> alpha;
real<lower=0,upper=1> pi1;
ordered[8] tau;
}
model {
vector[J] p;
real mu[I];
vector[2] k;
int inf[I];
tau[1]~normal(0,0.01);
tau[2]~uniform(tau[1],tau[3]);
tau[3]~uniform(tau[2],tau[4]);
tau[4]~uniform(tau[3],tau[5]);
tau[5]~uniform(tau[4],tau[6]);
tau[6]~uniform(tau[5],tau[7]);
tau[7]~uniform(tau[6],tau[8]);
tau[8]~uniform(tau[7],100);
k[1]<- 1-pi1;
k[2]<- pi1;
for (i in 1:I) {
inf~categorical(k);
if (inf==2) {