Modeling reaction time data with shifted log normal hierarchical model


Bruno Nicenboim

Mar 13, 2015, 10:29:26 AM
to stan-...@googlegroups.com

Hi, 

I've been trying to fit some reading/reaction time (rt) data with a shifted lognormal hierarchical model, where the shift also varies by subject. In psycholinguistics we usually use linear mixed models with lmer (from the lme4 R package), with either log(rt) or 1/rt as the dependent variable (to make it "normally distributed"), and I would like to know whether fitting a shifted lognormal distribution is justified.
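[Editor's note: for readers unfamiliar with the shifted lognormal, here is a minimal simulation sketch in Python; the parameter values match the fake data described later in this post, but the variable names are mine, not from the attached models.]

```python
import numpy as np

rng = np.random.default_rng(42)

# Shifted lognormal: rt = shift + exp(mu + sigma * z), z ~ Normal(0, 1).
# The shift puts a hard lower bound on the reaction times.
shift, mu, sigma = 180.0, 5.0, 0.7
rt = shift + rng.lognormal(mean=mu, sigma=sigma, size=10_000)

print(rt.min() > shift)                            # True: rt is bounded below by shift
print(abs(np.log(rt - shift).mean() - mu) < 0.05)  # log(rt - shift) has mean ~ mu
```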

So I built the model using Sorensen & Vasishth's tutorial, and I followed Bob Carpenter's post for the log-shifted part. And I added shifts by subject.

First of all, when I compile the model I get the following warning, which I understood from Bob's post I can safely ignore:

Warning (non-fatal): Left-hand side of sampling statement (~) contains a non-linear transform of a parameter or local variable.
You must call increment_log_prob() with the log absolute determinant of the Jacobian of the transform.
Sampling Statement left-hand-side expression:
    subtract(rt,c) ~ lognormal_log(...)

And then during sampling, I get a bunch of these:
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
stan::prob::lognormal_log(N4stan5agrad3varE): Random variable[475] is -0.450891:0, but must be >= 0!
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine, but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.


I guess that's because the general shift plus the by-subject shift is below 0 in some iterations. But the model converges anyway, so the first question is: can I ignore these errors, or should I fix them? And how? Should I force the by-subject shift to be no smaller than -shift and no bigger than min(rt)?


The second question is: how can I check which model fits the data better? I thought about using WAIC (using this code). Is there an alternative/better way?
I'm extracting the log likelihood this way:

    log_lik[i] <- normal_log(rt[i]-c[i],log_mu[i], sigma);

There's something wrong for sure, because I get a huge WAIC for the shifted lognormal, and the credible intervals are a bit wider than those of the lognormal distribution (even though I recover the means of the parameters decently in both).
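[Editor's note: for reference, WAIC can be computed from an S x N matrix of pointwise log likelihoods (S posterior draws, N observations). A minimal sketch in Python of the standard formulation; the function name and deviance scaling are my choices, not from the code linked above.]

```python
import numpy as np

def waic(log_lik):
    """WAIC from an (S draws x N observations) log-likelihood matrix.

    lppd   = sum_i log( mean_s exp(log_lik[s, i]) )  (log pointwise predictive density)
    p_waic = sum_i var_s( log_lik[s, i] )            (effective number of parameters)
    WAIC   = -2 * (lppd - p_waic)                    (deviance scale: lower is better)
    """
    # log-mean-exp over draws, computed stably per observation
    m = log_lik.max(axis=0)
    lppd = np.sum(m + np.log(np.exp(log_lik - m).mean(axis=0)))
    p_waic = np.sum(log_lik.var(axis=0, ddof=1))
    return -2.0 * (lppd - p_waic)

# Degenerate check: a constant log likelihood of -1 for N = 10 observations
# gives lppd = -10 and p_waic = 0, hence WAIC = 20.
print(waic(np.full((1000, 10), -1.0)))  # 20.0
```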



I'm posting the regular lmm, the shifted lognormal model, the R code, and fake data that should have a shifted lognormal distribution (betas = 5, .12, .05, .08; shift = 180, sigma_shift = 60, sigma = .7).


Thanks!
Bruno
logshifted_lmm.stan
lmm.stan
evallmm.R
fakedata.Rda

Bob Carpenter

Mar 13, 2015, 8:32:22 PM
to stan-...@googlegroups.com
Thanks for including the models.

You can ignore the Jacobian warning for subtract(rt,c) (corresponding
to model code "(rt - c) ~ ..."). Because c is a transformed parameter,
Stan can't track what might have happened to it at compile time (at least
in general, or in specific cases without a lot of work).
We should change wording of "contains a non-linear transform" to "may contain
a non-linear transform".

The other message is serious, because it's telling you that (rt - c) is
negative in some cases. I have no idea what the [475] is doing there. Daniel?
Given that rt[i] is data and c[i] == (shift + shift_u[subj[i]]),
and shift is a scalar parameter, you need to constrain shift properly
to make sure

shift < rt[i] - shift_u[subj[i]]

for all i in 1:num_elements(rt).

functions {

  real shift_max(vector shift_u, int[] subj, vector rt) {
    real shift_max <- infinity();
    for (i in 1:num_elements(rt))
      shift_max <- min(shift_max, rt[i] - shift_u[subj[i]]);
    return shift_max;
  }

}

You should most definitely check all my algebra here. Then you
can declare:

real<lower=0, upper=shift_max(shift_u,subj,rt)> shift;
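[Editor's note: that algebra is easy to check numerically. A sketch with made-up data; the numbers and names are mine, not from the attached models.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake data: 5 subjects, 50 observations.
n_subj = 5
subj = rng.integers(0, n_subj, size=50)
shift_u = rng.normal(0.0, 60.0, size=n_subj)
rt = 200.0 + rng.lognormal(5.0, 0.7, size=50)

# The bound: shift must stay below rt[i] - shift_u[subj[i]] for every i.
shift_max = np.min(rt - shift_u[subj])

# Any shift strictly below shift_max keeps every rt[i] - c[i] positive.
shift = shift_max - 1.0
c = shift + shift_u[subj]
print(np.all(rt - c > 0))  # True
```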

Given that you're not averse to reading, I'd recommend the sections on
priors in the manual chapter on regression. We've been advocating against
using this kind of thing

real<lower=0,upper=10000> sigma;

in favor of unconstrained variables with at least weakly informative priors (e.g.,
if you expect sigma to be in the unit range, something like cauchy(0,5) or even
cauchy(0,2.5)). The manual cites some papers by Andrew going over why the other
priors are bad news for inference.

As to fitting the data better, if you can afford the computation, and it
makes sense given your groupings, just use cross-validation. You can build that
right in Stan. I have it on my to-do list to add an example to the manual.

I think the issue you're going to have to deal with is marginalizing out all the latent
parameters if you want to properly compute a likelihood here. But others are much
better than me at WAIC. If the model comparison question gets buried in the noise
here, feel free to repost with a more directed title so those in the know (usually Andrew
or Aki, since they're most invested) see it.

- Bob

Bruno Nicenboim

Mar 14, 2015, 6:05:58 AM
to stan-...@googlegroups.com
Thanks for the fast answer! Unfortunately, I got a long ugly error; see below:


On Saturday, March 14, 2015 at 1:32:22 AM UTC+1, Bob Carpenter wrote:
Thanks for including the models.

You can ignore the Jacobian warning for subtract(rt,c) (corresponding
to model code "(rt - c) ~ ...").  Because c is a transformed parameter,
Stan can't track what might have happened to it at compile time (at least
in general, or in specific cases without a lot of work).  
We should change wording of "contains a non-linear transform" to "may contain
a non-linear transform".

The other message is serious, because it's telling you that (rt - c) is
negative in some cases.  I have no idea what the [475] is doing there.  Daniel?
Given that rt[i] is data and c[i]  == (shift + shift_u[subj[i]]),
and shift is a scalar parameter, you need to constrain shift properly
to make sure

   shift < rt[i] - shift_u[subj[i]]

for all i in 1:num_elements(rt).  

functions {

  real shift_max(vector shift_u, int[] subj, vector rt) {
    real shift_max <- infinity();
    for (i in 1:num_elements(rt))
      shift_max <- min(shift_max, rt[i] - shift_u[subj[i]]);
    return shift_max;
  }

}

So I had  to change the function to:

functions {

  real shift_max(vector shift_u, int[] subj, vector rt) {
    real shift_max;
    real values[2];
    shift_max <- positive_infinity();
    for (i in 1:num_elements(rt)){
          values[1] <- shift_max;
          values[2] <- rt[i] - shift_u[subj[i]];
          shift_max <- min(values);
        }
    return shift_max;
  }

}

In version 2.5 at least, min() doesn't accept two reals, and I don't know a more elegant way to initialize these two values; [shift_max, rt[i] - shift_u[subj[i]]] didn't work for me.
I also had to change infinity() to positive_infinity() and split the declaration and assignment across two lines. With this it started to compile, but after a couple of seconds I got a very long error, which I'm attaching.

 

You should most definitely check all my algebra here.  Then you
can declare:

  real<lower=0, upper=shift_max(shift_u,subj,rt)> shift;

Given that you're not averse to reading, I'd recommend the sections on
priors in the manual chapter on regression.  We've been advocating against
using this kind of thing

  real<lower=0,upper=10000> sigma;

in favor of unconstrained variables with at least weakly informative priors (e.g.,
if you expect sigma to be in the unit range, something like cauchy(0,5) or even
cauchy(0,2.5)).  The manual cites some papers by Andrew going over why the other
priors are bad news for inference.
Yes, since I was fitting different transformations of the data in the other model, I was too lazy to change the priors. But yes, I'm fixing that in the shifted lognormal model.

As to fitting the data better, if you can afford the computation, and it
makes sense given your groupings, just use cross-validation.  You can build that
right in Stan.  I have it on my to-do list to add an example to the manual.
Ok, I'll check that. 
error_stan.txt

Bob Carpenter

Mar 14, 2015, 6:36:20 AM
to stan-...@googlegroups.com
I forgot our own function naming conventions, like positive_infinity().
fmax() and fmin() are the two-argument versions for reals (the "f" is
for floating point, which follows C++ convention and is hence confusing
in the Stan context).

We want to trap errors before they become compile-time errors and
spit out long ugly errors. There's good news and bad news for this one.

The good news is that we know the source of the error and are working on a fix
for the next release, which we really hope to get out next week (but
no promises --- we're at least 50 person-hours into fixing
this "upgrade" that included the nonstandard type __float128 and messed
up our autodiff, but we think we have it tackled).

The bad news is that you have to revert to an earlier compiler to fix it.
Or wait for the next release. Or check out the current develop branch and
plug it in yourself (only easy with CmdStan). Until our next release or until
you revert to an earlier compiler, I doubt any Stan program will compile.

- Bob




> On Mar 14, 2015, at 9:05 PM, Bruno Nicenboim <bruno.n...@gmail.com> wrote:
>
> Thanks for the fast answer! Unfortunately, I got a long ugly error, see below:
>
> On Saturday, March 14, 2015 at 1:32:22 AM UTC+1, Bob Carpenter wrote:
> Thanks for including the models.
>
> You can ignore the Jacobian warning for subtract(rt,c) (corresponding
> to model code "(rt - c) ~ ..."). Because c is a transformed parameter,
> Stan can't track what might have happened to it at compile time (at least
> in general, or in specific cases without a lot of work).
> We should change wording of "contains a non-linear transform" to "may contain
> a non-linear transform".
>
> The other message is serious, because it's telling you that (rt - c) is
> negative in some cases. I have no idea what the [475] is doing there. Daniel?
> Given that rt[i] is data and c[i] == (shift + shift_u[subj[i]]),
> and shift is a scalar parameter, you need to constrain shift properly
> to make sure
>
> shift < rt[i] - shift_u[subj[i]]
>
> for all i in 1:num_elements(rt).
>
> functions {
>
> real shift_max(vector shift_u, int[] subj, vector rt) {
> real shift_max <- infinity();
> for (i in 1:num_elements(rt))
> shift_max <- min(shift_max, rt[i] - shift_u[subj[i]]);
> return shift_max;
> }
>
> }
>
> So I have to change the function to:
> <error_stan.txt>

Bob Carpenter

Mar 14, 2015, 6:41:19 AM
to stan-...@googlegroups.com
Oops. I was looking at the wrong error file --- I get lots of
them via e-mail.

Could you send me the Stan program (logshifted_lmm.stan)
that's causing that error?

I'll try to get that fixed, too, before our next release,
which I hope is next week. Until then, I may be able to find
a workaround for you.

- Bob

Bob Carpenter

Mar 14, 2015, 6:49:20 AM
to stan-...@googlegroups.com
It may be fixed in 2.6 already --- I can check if you send
me the file. It's an issue of user-defined functions not getting
an output stream passed to them for print statements.

- Bob

Bob Carpenter

Mar 14, 2015, 7:53:22 AM
to stan-...@googlegroups.com
You're in luck. We fixed the bug that prevented functions
from being used in constraints in 2.6. Here's the report:

https://github.com/stan-dev/stan/issues/1151

- Bob

Bruno Nicenboim

Mar 14, 2015, 8:25:21 AM
to stan-...@googlegroups.com
Yes, sure, so this is the updated file.

Thanks!!
logshifted_lmm.stan

Bruno Nicenboim

Mar 14, 2015, 9:53:15 AM
to stan-...@googlegroups.com
Hi!
I saw your last message after I sent mine. So you mean the problem was fixed in 2.6?
I just installed 2.6 alongside my previous version, and I also get an error (it looks slightly different, though): something with validate_non_negative_index.
I'm attaching the error.

Bruno
error_stan2.txt

Bob Carpenter

Mar 14, 2015, 10:14:22 AM
to stan-...@googlegroups.com
I'm not sure what you mean by "installing alongside," but
I don't know much about R.

I can compile your model in RStan 2.6 on Mac OSX using clang++.

-------------------------------
> mod <- stan_model("logshifted_lmm.stan")

TRANSLATING MODEL 'logshifted_lmm' FROM Stan CODE TO C++ CODE NOW.
DIAGNOSTIC(S) FROM PARSER:Warning (non-fatal): Left-hand side of sampling statement (~) contains a non-linear transform of a parameter or local variable.
You must call increment_log_prob() with the log absolute determinant of the Jacobian of the transform.
Sampling Statement left-hand-side expression:
subtract(rt,c) ~ lognormal_log(...)
COMPILING THE C++ CODE FOR MODEL 'logshifted_lmm' NOW.
>
-----------------------------

Maybe one of our other devs who knows more about R can help out here.

- Bob
> <error_stan2.txt>

Bruno Nicenboim

Mar 14, 2015, 10:33:40 AM
to stan-...@googlegroups.com
OK, I uninstalled the previous versions, did a clean new install of Stan 2.6, and it worked!!! :)
I ran a couple of iterations and didn't get any errors. I'll investigate cross-validation a bit and then start a new post with more questions about that and WAIC.

Thanks a lot, this was extremely quick and fruitful!

Bruno

Bob Carpenter

Mar 14, 2015, 7:34:21 PM
to stan-...@googlegroups.com
We're especially concerned with mail when there are bugs on
our end or users have trouble compiling or installing, because
we know how frustrating that is from a user perspective.

And please do post questions about WAIC and x-validation in a new
thread.

- Bob

Bruno Nicenboim

Apr 26, 2015, 10:34:49 AM
to stan-...@googlegroups.com
Hi, I have a new question related to the same shifted log normal hierarchical model.

The problem that I had at the beginning was that I was modeling the distribution of rt (reading times) like this:

(rt-c) ~ lognormal(log_mu, sigma);

and in some cases rt - c < 0,

where

c[i] <- shift + shift_u[subj[i]];

and both shift and shift_u are parameters: shift is the general shift, and shift_u is the shift that each subject has.


So you (Bob C.) explained that I should constrain the shift to prevent rt - c < 0,
and I also constrained shift so that c is positive:

      real<lower=shift_bounds(shift_u,subj,rt)[1],
           upper=shift_bounds(shift_u,subj,rt)[2]> shift;

with a function:

functions {
  vector shift_bounds(row_vector shift_u, int[] subj, vector rt) {
    vector[2] shift_bounds;
    real shift_min;
    real shift_max;
    shift_min <- negative_infinity();
    shift_max <- positive_infinity();
    for (i in 1:num_elements(rt)) {
      shift_min <- fmax(shift_min, -shift_u[subj[i]]);
      shift_max <- fmin(shift_max, rt[i] - shift_u[subj[i]]);
    }
    shift_bounds[1] <- shift_min;
    shift_bounds[2] <- shift_max;
    return shift_bounds;
  }
}
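[Editor's note: both bounds can be sanity-checked numerically. A sketch with fake numbers of my own, not from the attached model.]

```python
import numpy as np

rng = np.random.default_rng(1)

n_subj = 6
subj = rng.integers(0, n_subj, size=80)
shift_u = rng.normal(0.0, 20.0, size=n_subj)
rt = 180.0 + rng.lognormal(5.0, 0.7, size=80)

# The two bounds that shift_bounds() computes:
shift_min = np.max(-shift_u[subj])      # keeps c = shift + shift_u > 0
shift_max = np.min(rt - shift_u[subj])  # keeps rt - c > 0

# Any shift strictly inside (shift_min, shift_max) satisfies both constraints.
shift = 0.5 * (shift_min + shift_max)
c = shift + shift_u[subj]
print(np.all(c > 0) and np.all(rt - c > 0))  # True
```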


 So far so good. It worked pretty well.


Now the issue is what happens if shift_u is not part of the parameters section but of transformed parameters. I assume that the individual shift may be correlated with the by-subject intercepts or slopes of the mean (the so-called random effects in linear mixed models). But then, if I generate shift_u from the same multivariate distribution that I use to generate the "random effects", how can I constrain the shift? I can't declare it in parameters any longer, because its bounds would depend on a transformed parameter, and if it's in transformed parameters it would have to be defined in terms of some other parameter... So is there a solution?
I hope I'm expressing myself clearly enough. I might also be doing something that doesn't make sense; in that case I'll be happy to know it :)

I'm attaching the latest version of the model (not working at the moment).

Thanks!
Bruno
lmm_BF_shift_normales-cor.stan

Bob Carpenter

Apr 28, 2015, 10:35:37 AM
to stan-...@googlegroups.com
The typical approach is to apply exp() to make sure everything's
positive. But that changes the meaning of the variables.

If you want to stick to a linear scale, then you have to do the
algebra to make sure all your parameters are constrained so that
all the transformed parameters satisfy whatever constraints you need.

- Bob
> <lmm_BF_shift_normales-cor.stan>
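[Editor's note: Bob's exp() suggestion amounts to reparameterizing the shifts on the log scale. A minimal sketch in Python; the names and numbers are mine.]

```python
import numpy as np

rng = np.random.default_rng(3)

# Sample by-subject shifts on the log scale; exp() then guarantees c > 0
# by construction, with no data-dependent bounds needed.
log_shift = np.log(180.0)                   # population-level log shift
log_shift_u = rng.normal(0.0, 0.3, size=6)  # by-subject offsets, unconstrained
c = np.exp(log_shift + log_shift_u)         # always positive

print(np.all(c > 0))  # True
```

Note that on this scale the by-subject offsets act multiplicatively (c varies lognormally across subjects rather than normally), which is what Bob means by the transform changing the meaning of the variables; the rt - c > 0 constraint still has to be handled separately.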

Bruno Nicenboim

Apr 28, 2015, 11:32:34 AM
to stan-...@googlegroups.com
mmh, ok, it makes sense.
thanks!