Warning (non-fatal): Left-hand side of sampling statement (~) may contain ...


Dung Tran

Sep 20, 2016, 8:55:18 AM
to Stan users mailing list
Hi all,

When I run the following code for a linear regression of Y on X, where Y has some missing values,

data {
  int ssize; // Sample size
  int<lower=0> Nobs;
  int<lower=0> Nmis;

  vector[Nobs] Yobs;

  real X[ssize];
}

parameters {
  vector[2] beta; // Regression parameters
  real<lower=0> sigma; // SD of measurement error
  vector[Nmis] Ymis;
}

model {
  vector[ssize] Y;

  Y = append_row(Yobs, Ymis);

  // Prior distributions

  sigma ~ uniform(0, 100); // Prior for sigma

  for (i in 1:2) {
    beta[i] ~ normal(0, 1000); // Prior distribution for beta
  }

  // Model

  for (i in 1:ssize) {
    real mu1; // Definition of mean
    mu1 = beta[1] + beta[2] * X[i];
    Y[i] ~ normal(mu1, sigma);
  }
}

then I got a warning:

DIAGNOSTIC(S) FROM PARSER:
Warning (non-fatal):
Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable.
If so, you need to call increment_log_prob() with the log absolute determinant of the Jacobian of the transform.
Left-hand-side of sampling statement:
    Y[i] ~ normal(...)

I do not understand the details of the warning. However, the parameter estimates are fine (close to the true values). Does this warning indicate something serious, such that ignoring it might lead to wrong conclusions?

Has anyone run into this before?

Thank you,

Tran.
Model2.stan

Ben Goodrich

Sep 20, 2016, 9:14:27 AM
to Stan users mailing list
On Tuesday, September 20, 2016 at 8:55:18 AM UTC-4, Dung Tran wrote:
then I got a warning:

DIAGNOSTIC(S) FROM PARSER:
Warning (non-fatal):
Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable.
If so, you need to call increment_log_prob() with the log absolute determinant of the Jacobian of the transform.
Left-hand-side of sampling statement:
    Y[i] ~ normal(...)

I do not understand the details of the warning. However, the parameter estimates are fine (close to the true values). Does this warning indicate something serious, such that ignoring it might lead to wrong conclusions?

The append_row() function is an identity "transformation", so the warning is not something serious here. Not understanding what the warning indicates is serious, however; see

http://mc-stan.org/misc/warnings.html#parser-warnings

Ben
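
One way to make the warning disappear altogether, sketched here under some assumptions, is to put the data and the parameter directly on the left-hand side of the sampling statements instead of a local variable built with append_row(). The sketch assumes that X is supplied as a vector ordered with the observed rows first and the missing rows last (the same order that append_row(Yobs, Ymis) implies), and it adds upper=100 to sigma so that the declared constraint matches the uniform(0, 100) prior. Neither change is from the original post.

```stan
data {
  int<lower=0> Nobs;            // number of observed outcomes
  int<lower=0> Nmis;            // number of missing outcomes
  vector[Nobs] Yobs;            // observed outcomes
  vector[Nobs + Nmis] X;        // assumption: observed rows first, then missing rows
}

parameters {
  vector[2] beta;               // regression parameters
  real<lower=0, upper=100> sigma; // bounds match the uniform prior's support
  vector[Nmis] Ymis;            // missing outcomes, treated as unknowns
}

model {
  beta ~ normal(0, 1000);
  sigma ~ uniform(0, 100);

  // Data (Yobs) and a declared parameter (Ymis) appear directly on the
  // left-hand side, so no Jacobian adjustment is involved.
  Yobs ~ normal(beta[1] + beta[2] * X[1:Nobs], sigma);
  Ymis ~ normal(beta[1] + beta[2] * X[(Nobs + 1):(Nobs + Nmis)], sigma);
}
```

The posterior should be the same as for the original model, because append_row() only rearranges values; the difference is that the parser no longer sees a local variable on the left of `~`.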

Dung Tran

Sep 21, 2016, 5:34:02 AM
to Stan users mailing list
Hi Ben,

Thank you for your answer!

Continuing with this, I ran a model for an ordinal variable using the following code:

data {
  int<lower=1> N; // Sample size
  int<lower=1> Nsubj; // Number of subjects
  int<lower=2> Ncate; // Number of categories
  int<lower=1, upper=Nsubj> ID[N]; // Subject ID

  int<lower=0> Nobsadl;
  int<lower=0> Nmisadl;

  int adlobs[Nobsadl];
  real age[N]; // Predictor: Age
  int indadl[N];
}

parameters {
  vector[1] beta; // Regression parameters
  ordered[Ncate-1] cutpoints;

  vector[Nsubj] b0; // Random intercepts
  real<lower=0> sigmab; // RI SD

  int adlmis[Nmisadl];
}

model {
  vector[N] adl;

  adl[indadl] = append_row(adlobs, adlmis);

  // Prior distributions

  b0 ~ normal(0, sigmab); // Subject random effects
  // sigmab ~ uniform(0, 10); // Prior for sigma RI

  for (i in 1:1) {
    beta[i] ~ normal(0, 1000); // Prior distribution for beta
  }

  // Model

  for (i in 1:N) {
    real mu; // Definition of mean
    mu = beta[1] * age[i] + b0[ID[i]];
    adl[i] ~ ordered_logistic(mu, cutpoints);
  }
}

and it gives an error: integer parameters or transformed parameters are not allowed. But I cannot declare the missing values as real numbers because of the model statement adl[i] ~ ordered_logistic(mu, cutpoints), i.e., the response adl has to be an integer.

Do you know how to get around this?

Thank you,
Tran.

Ben Goodrich

Sep 21, 2016, 9:18:34 AM
to Stan users mailing list
On Wednesday, September 21, 2016 at 5:34:02 AM UTC-4, Dung Tran wrote:
it gives an error: integer parameters or transformed parameters are not allowed. But I cannot declare the missing values as real numbers because of the model statement adl[i] ~ ordered_logistic(mu, cutpoints), i.e., the response adl has to be an integer.

Well, HMC cannot have integer unknowns because integers are not differentiable. For missingness in an outcome only, you can introduce a latent utility variable that is constrained to be between the appropriate, albeit unknown, cutpoints when the outcome is observed and is unconstrained when the outcome is missing. However, this approach is quite difficult and may not sample particularly well.

Ben

Bob Carpenter

Sep 21, 2016, 3:48:04 PM
to stan-...@googlegroups.com
You can also just generate the missing data in the generated quantities block
if it's not going to affect the marginal density of the other parameters
of the model, as in this case.

- Bob
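
A minimal sketch of that suggestion, under stated assumptions: the likelihood uses only the observed outcomes, and the missing ones are drawn in generated quantities with ordered_logistic_rng(). The names IDobs, IDmis, ageobs, and agemis are hypothetical; they assume the data have been split into observed and missing rows before being passed to Stan. The array declarations follow the syntax used in this thread; recent Stan releases write them as `array[N] int` instead.

```stan
data {
  int<lower=1> Nsubj;                       // number of subjects
  int<lower=2> Ncate;                       // number of categories
  int<lower=0> Nobsadl;                     // rows with observed outcome
  int<lower=0> Nmisadl;                     // rows with missing outcome
  int<lower=1, upper=Ncate> adlobs[Nobsadl]; // observed ordinal outcomes
  int<lower=1, upper=Nsubj> IDobs[Nobsadl]; // hypothetical: subject ID, observed rows
  int<lower=1, upper=Nsubj> IDmis[Nmisadl]; // hypothetical: subject ID, missing rows
  vector[Nobsadl] ageobs;                   // hypothetical: age, observed rows
  vector[Nmisadl] agemis;                   // hypothetical: age, missing rows
}

parameters {
  real beta;                    // regression coefficient for age
  ordered[Ncate - 1] cutpoints;
  vector[Nsubj] b0;             // random intercepts
  real<lower=0> sigmab;         // random-intercept SD (no explicit prior, as in the post)
}

model {
  b0 ~ normal(0, sigmab);
  beta ~ normal(0, 1000);

  // Only observed outcomes enter the likelihood.
  for (i in 1:Nobsadl)
    adlobs[i] ~ ordered_logistic(beta * ageobs[i] + b0[IDobs[i]], cutpoints);
}

generated quantities {
  // Integer draws are allowed here because this block is not differentiated.
  int adlmis[Nmisadl];
  for (i in 1:Nmisadl)
    adlmis[i] = ordered_logistic_rng(beta * agemis[i] + b0[IDmis[i]], cutpoints);
}
```

This works when only the outcome is missing, since a missing outcome summed over all of its categories contributes a factor of 1 to the likelihood and so carries no information about the other parameters.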

Dung Tran

Sep 22, 2016, 5:47:06 AM
to Stan users mailing list
Thank you so much for developing such an excellent software package! I have good news: convergence for ordinal data is much faster with Stan than with JAGS. (It takes me 7 hours instead of three days, and JAGS still had not converged.)

I have overcome the difficulty in my program, and I can now specify models with missing outcomes and covariates. Thank you for your help!

I would like to ask about the append_row function. It works well for vectors of real values but not for integers, so I have to use for loops:

for (i in 1:Nobsex62) {
  ex62[index62[i]] = ex62obs[i];
}

for (i in 1:Nmisex62) {
  ex62[index62[Nobsex62 + i]] = ex62mis[i];
}

to join two integer arrays (ex62obs, ex62mis). As far as I know, a for loop is less efficient than a join, but I haven't found another solution. Do you know of one?

Thank you,

Tran. 

Bob Carpenter

Sep 22, 2016, 10:22:49 AM
to stan-...@googlegroups.com
The loop's just as efficient because Stan just gets
compiled to C++. So you don't pay the loop overhead
you pay in R, (uncompiled) Python, BUGS, or JAGS. We just
like the vectorized forms because they're easier to read
and write (and less error prone).

- Bob

Ben Goodrich

Sep 22, 2016, 12:42:59 PM
to Stan users mailing list
On Thursday, September 22, 2016 at 5:47:06 AM UTC-4, Dung Tran wrote:
I would like to ask about the append_row function. It works well for vectors of real values but not for integers, so I have to use for loops:

for (i in 1:Nobsex62) {
  ex62[index62[i]] = ex62obs[i];
}

for (i in 1:Nmisex62) {
  ex62[index62[Nobsex62 + i]] = ex62mis[i];
}

to join two integer arrays (ex62obs, ex62mis). As far as I know, a for loop is less efficient than a join, but I haven't found another solution. Do you know of one?
 
There is no efficiency loss in writing a loop in Stan language versus calling a Stan function that does the loop internally in C++. However, the problem remains that HMC cannot deal with integer unknowns.

Ben
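
To illustrate both points, here is a hedged sketch of the same loop-based join done in generated quantities, where integer-valued unknowns are allowed. It is not the poster's model: the categorical likelihood, the simplex theta, and the names Ncate, index, yobs, and y are placeholders chosen to keep the example short. The array declarations follow the syntax used in this thread.

```stan
data {
  int<lower=0> Nobs;                        // number of observed values
  int<lower=0> Nmis;                        // number of missing values
  int<lower=2> Ncate;                       // number of categories
  // Positions in the full array: observed positions first, then missing ones.
  int<lower=1, upper=Nobs + Nmis> index[Nobs + Nmis];
  int<lower=1, upper=Ncate> yobs[Nobs];     // observed integer values
}

parameters {
  simplex[Ncate] theta;                     // placeholder category probabilities
}

model {
  for (i in 1:Nobs)
    yobs[i] ~ categorical(theta);
}

generated quantities {
  int y[Nobs + Nmis];                       // completed integer array

  // Plain loops compile to C++ loops, so there is no interpreter overhead.
  for (i in 1:Nobs)
    y[index[i]] = yobs[i];

  // Missing integers are simulated rather than sampled by HMC.
  for (i in 1:Nmis)
    y[index[Nobs + i]] = categorical_rng(theta);
}
```

The completed array y is available per draw in the output, while the sampler itself only ever sees the continuous parameter theta.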

Dung Tran

Sep 26, 2016, 11:30:04 AM
to Stan users mailing list
Thanks to all!