Non-linear models: How to specify priors for multiple random effects?

Meghna Krishnadas

Jun 1, 2017, 9:27:45 AM
to brms-users
Hi,

I'm modeling a power law relationship between two measured variables, coded as:
Y = a * X^b, where Y is drawn from a negative binomial distribution.

I am also modeling multiple random effects for 'a' and 'b' such that the final model looks like:
Y[i] = a1[G1[i]] * a2[G2[i]] * a[G3[i]] * X[i]^b[G3[i]], where the Gj are different grouping factors (random effects).

In addition, a[G3] and b[G3] should be modeled as linear functions of a group-level predictor Z for G3, i.e. a[j] = g0 + g1 * Z[j] and b[j] = h0 + h1 * Z[j], with separate coefficients for a and b.
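
For concreteness, here is a toy R simulation of the data-generating process I have in mind (all parameter values purely illustrative, and without the group structure):

set.seed(1)
N  <- 1000
X  <- runif(N, 1, 100)
a  <- 0.5                            # illustrative power-law scale
b  <- 0.8                            # illustrative power-law exponent
mu <- a * X^b                        # mean of the negative binomial
Y  <- rnbinom(N, size = 2, mu = mu)  # size = overdispersion parameter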

I have specified this non-linear model in brms as:

sm.brmod2 <- brm(bf(y ~ a * X^b,
                    b ~ 1 + Z + (1|G3),
                    a ~ 1 + Z + (1|G3) + (1|G2) + (1|G1),
                    nl = TRUE),
                 prior = c(set_prior("normal(0, 1)", nlpar = "a"),
                           set_prior("normal(0, 1)", nlpar = "b")),
                 data = SDM, family = "negbinomial",
                 iter = 1000, chains = 3, thin = 5,
                 control = list(adapt_delta = 0.99),
                 save_model = "brm-mod-nb2.txt")

This model takes a very long time to run (each chain takes ~3 hours), and R-hat values remain very high even with 1000 iterations. I suspect this may be due to uninformative priors or poor initial values, but I am unable to specify those priors in brms because I do not know how to address the random-effect vectors individually; brms creates them internally.
I have previously run this model in JAGS, where I specified vectors of initial values and tight priors for all the random effects, but model convergence was an issue there too.

I have ~47k data points, 200 levels in G1, 20 levels in G2 and 140 levels in G3.


Have I coded the model correctly? Can convergence and speed be improved with better specification of priors or initial values? If so, how can I specify priors and initial values for each of the random effects separately?


Any help is greatly appreciated.


Thanks,
Meghna



Paul Buerkner

Jun 1, 2017, 1:20:47 PM
to brms-users
You can use the "group" (and possibly the "coef") argument of set_prior to achieve what you want.

To set a prior on the random effect of, say, G1, go for

set_prior(<your prior>, nlpar = "a", group = "G1")
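
For example (the scale of 0.5 below is purely illustrative, not a recommendation; get_prior() lists every slot you can fill):

library(brms)

# All prior slots of your model, including one "sd" row per nlpar/grouping-factor pair:
get_prior(bf(y ~ a * X^b,
             b ~ 1 + Z + (1|G3),
             a ~ 1 + Z + (1|G3) + (1|G2) + (1|G1),
             nl = TRUE),
          data = SDM, family = negbinomial())

# A tighter prior on the SD of the "a" intercepts varying over G1:
set_prior("normal(0, 0.5)", class = "sd", nlpar = "a", group = "G1")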

Setting custom initial values is likely neither required nor helpful. You may try inits = 0, though, to initialize everything at zero; sometimes this helps.

A few other thoughts:

1. Thinning is not required for only a few thousand samples (so it is basically never required in Stan); you just throw away useful information.
2. I would run 2k iterations per chain (with 1k warmup). For such complicated models, 500 warmup samples may not be sufficient.

Best,
Paul

Meghna Krishnadas

Jun 2, 2017, 6:56:07 AM
to brms-users
Thanks Paul. I changed my code as per your suggestions to:

sm.brmod2 <- brm(bf(sdl.rec ~ a * seed.new^b,
                    b ~ 1 + SMlog + (1|sp),
                    a ~ 1 + SMlog + (1|sp) + (1|year) + (1|trap),
                    nl = TRUE),
                 prior = c(set_prior("normal(0, 1)", nlpar = "a"),
                           set_prior("normal(0, 1)", nlpar = "b"),
                           set_prior("normal(0, 1)", class = "sd", nlpar = "a", group = "year"),
                           set_prior("normal(0, 1)", class = "sd", nlpar = "a", group = "trap"),
                           set_prior("normal(0, 1)", class = "sd", nlpar = "a", group = "sp"),
                           set_prior("normal(0, 1)", class = "sd", nlpar = "b", group = "sp")),
                 data = SDM, family = "negbinomial", inits = 0,
                 iter = 2000, chains = 2, warmup = 1000,
                 control = list(adapt_delta = 0.99),
                 save_model = "bci-brm-mod-nb2.txt")

1. Is this formulation correct?

2. The model takes even longer to run now! The first warmup iteration alone ran for several minutes. To give you an idea of the time it is taking, here is Stan's message:

Gradient evaluation took 0.041 seconds
1000 transitions using 10 leapfrog steps per transition would take 410 seconds.
Adjust your expectations accordingly!

Could you provide any insights as to why this is happening? Would you have any suggestions to improve the model?

Thanks,
Meghna

Paul Buerkner

Jun 2, 2017, 7:02:40 AM
to brms-users
The prior specification looks good, and I am not surprised by the time it takes. You have 47k observations, a complicated non-linear multilevel model, and a negbinomial family. Before, it ran in only 3 hours because it wasn't actually sampling well; now Stan has a chance to fit the model properly, and that simply takes more time. The first warmup iterations are particularly slow, but subsequent iterations should be faster.

I would be happy to hear whether the model, once it has finished sampling, has actually converged.

Meghna Krishnadas

Jun 2, 2017, 7:12:32 AM
to brms-users
Hi Paul,

Thanks for that clarification. I wasn't sure if the speed issues were due to the size of the data and the nature of the model, or a problem with the model specification.
I'll let this model run and update you on the outcome.

Appreciate your help!

Best,
Meghna

Meghna Krishnadas

Jun 2, 2017, 8:03:29 AM
to brms-users
Quick update:

I set the model to run with 2000 iterations (1000 warmup), and after >45 minutes it is still at
Iteration:    1 / 2000 [  0%]  (Warmup)

Does this still seem reasonable? Is the issue here something other than prior specification or data size?

Thanks,
Meghna

Paul Buerkner

Jun 2, 2017, 8:11:24 AM
to brms-users
Let it run for a day. Then we will see what's happening. I think a little bit of patience might be needed.

Meghna Krishnadas

Jun 5, 2017, 12:08:39 AM
to brms-users
Hi Paul,

Here's an update on the model:

It ran for ~22 hours and the parameter estimates converged (Rhat < 1.1). The population-level effects appear reasonable (biologically plausible).

However, I am unsure whether the syntax in which I specified the model was correct. To make myself clear, I would like to reiterate the scientific idea I am seeking to test with this model (please bear with me). I'm modeling a power law relationship between two variables coded as:


Y = a * X^b, where Y is drawn from a negative binomial distribution.


In addition, a[G3] and b[G3] are modeled as linear functions of a group-level predictor Z for G3,
i.e. a[j] = g0 + g1 * Z[j] and b[j] = h0 + h1 * Z[j].


Using terminology from Gelman & Hill (ARM), the main hypothesis is about how a group-level predictor 'Z' correlates with, or explains, variation in 'a' and 'b' across groups.

As currently formulated, the model output gives the effect of Z as a population-level effect. The plots of marginal effects also display Z versus the main response Y. I just wanted to make sure that I have specified the model correctly for the hypothesis I want to examine. Should I instead specify the effect of the group-level predictor as:

sm.brmod3 <- brm(bf(sdl.rec ~ a * seed.new^b,
                    b ~ 1 + (SMlog|sp),
                    a ~ 1 + (SMlog|sp) + (1|year) + (1|trap),
                    nl = TRUE), ...

Thanks for your time.

Best wishes,
Meghna

Paul Buerkner

Jun 6, 2017, 5:34:56 AM
to brms-users
Hmm,

If I understand your model correctly, it should be something like:

bf(y ~ a * X^b,
   b ~ 1 + Z + (Z|G3),
   a ~ 1 + Z + (Z|G3) + (1|G2) + (1|G1),
   nl = TRUE)


That is, you model the effect of Z as varying over G3, centered around the respective population-level effect of Z for a and b.

One important note about your model: by default, the negbinomial family applies the log link to ensure the mean is positive. So the expected value of your response is

E(Y) = exp(a*X^b)

You can change to the identity link via family = negbinomial("identity"), but then you have to make sure that a * X^b is always positive.
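
Sketched with your model from the first post, that would be:

brm(bf(y ~ a * X^b,
       b ~ 1 + Z + (1|G3),
       a ~ 1 + Z + (1|G3) + (1|G2) + (1|G1),
       nl = TRUE),
    data = SDM,
    family = negbinomial("identity"))  # mean is a * X^b directly, so it must stay positive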

Paul Buerkner

Jun 7, 2017, 3:19:43 AM
to brms-users
I think I misread your model. It should be

bf(y ~ a * X^b,
   b ~ 1 + Z + (1|G3),
   a ~ 1 + Z + (1|G3) + (1|G2) + (1|G1),
   nl = TRUE)


If you write (Z|G3), as I previously suggested, you model Z as having a different effect for each species; in your notation, a[j] = g0 + g1[j] * Z[j]. For this to work and be meaningful, Z has to vary within species, but I guess that's not the case, right?

Paul Buerkner

Jun 9, 2017, 3:52:20 AM
to brms-users
With regard to the questions asked via email:

1. The model estimates correlations between all parameters, since everything is in the posterior anyway. If you mean the correlations between all the b_* coefficients, simply go for vcov(fit, cor = TRUE) (see the sketch after point 3).

2. Do you mean you used just 200 iterations, or is the effective sample size (Eff.Sample) equal to 200? You can easily run more samples via the iter argument (by default, half will be used as warmup).

3. I currently don't see how you could do posterior predictive checks on 'a' or 'b', nor what the benefit would be. I guess I will have to think about it a bit more.
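
For point 1, a minimal usage sketch (assuming your fitted object is called sm.brmod2, as in your earlier post):

round(vcov(sm.brmod2, cor = TRUE), 2)  # posterior correlation matrix of the population-level (b_*) coefficients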

Best,
Paul

Meghna Krishnadas

Jun 26, 2017, 5:29:51 AM
to brms-users
Hi Paul,

I am writing with an update about the two-level negative binomial models for a power-law relationship, for which I had sought your advice. The models ran and converged with 2500 iterations, and they seem to provide reasonable parameter estimates. Thank you very much for your help!

A few additional issues:

1. I am unable to calculate WAIC due to memory issues with my computer (8 GB RAM). Is there any other way to compare multiple models?

2. Other than posterior predictive checks, how might I test model fit and validity?

Thanks,
Meghna

Paul Buerkner

Jun 26, 2017, 5:47:38 AM
to brms-users
1. Use WAIC(..., pointwise = TRUE). This will take longer, but you won't run into memory issues (see the sketch after point 2).

2. Information criteria (e.g., LOO or WAIC) allow you to compare the fit of different models. Between comparing models via LOO or WAIC and investigating PPCs, you should be fine with regard to model fit, I think. What exactly do you mean by "validity" in this context?
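
For point 1, a minimal sketch, assuming two fitted brms models named fit1 and fit2 (hypothetical names):

w1 <- WAIC(fit1, pointwise = TRUE)  # pointwise trades speed for a much smaller memory footprint
w2 <- WAIC(fit2, pointwise = TRUE)
w1; w2  # lower WAIC indicates better expected out-of-sample fit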

Meghna Krishnadas

Jul 4, 2017, 2:29:24 PM
to brms-users
Hi Paul,

WAIC(..., pointwise = TRUE) worked well. Thank you.

Regarding my query about model fit: in addition to comparing different models for their fit, I would also like to ensure that the model represents the original data well. I suppose pp_check() does this? In effect, if I were to simulate data based on parameters from this model, how closely would I recover the observed data?
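
For instance, I imagine something like the following, with my fitted object sm.brmod2 (the type and number of draws chosen just for illustration):

pp_check(sm.brmod2, type = "dens_overlay", nsamples = 50)  # overlay densities of 50 simulated datasets against the observed outcome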

Best,
Meghna

Meghna Krishnadas

Jul 5, 2017, 6:31:11 PM
to brms-users
Hi Paul,

I tried re-running the non-linear model to constrain a to >= 0. However, this model estimates unreasonably small group-level effects, suggesting minimal difference among groups, which I am fairly sure is not correct. Further, even the group-level effects for 'b' (i.e., from ranef()) are totally different from those of the unconstrained model. Why would constraining a lead to such a large change in the output?

I need to constrain a because of how this variable behaves in my system. But if imposing the constraint gives unreasonable estimates, how does one interpret this?


sm.brmod2 <- brm(bf(y ~ a * X^b,
                    b ~ 1 + Z + (1|G3),
                    a ~ 1 + Z + (1|G3) + (1|G2) + (1|G1),
                    nl = TRUE), ...)


Further, to better interpret the outputs, I have a couple of questions:

1. In the Stan code, why are the group-level effects scaled by their SD? Does that change how we should interpret and plot the variation estimated within groups? For example, if I want to plot a histogram of the parameters estimated for each group, am I correct in thinking that the output from ranef() should be used?

2. At what step is the effect of the group-level predictor 'Z' modeled?

I hope this is the right forum for these questions, since they may be considered statistical queries rather than brms-specific ones!

Thanks,
Meghna

Meghna Krishnadas

Jul 5, 2017, 8:05:57 PM
to brms-users
Here's my formulation:

priors1 <- c(set_prior("normal(0, 1)", nlpar = "a", lb = 0),
             set_prior("normal(0, 1)", nlpar = "b"),
             set_prior("uniform(0, 1)", class = "sd", nlpar = "a", group = "year"),
             set_prior("uniform(0, 1)", class = "sd", nlpar = "a", group = "trap"),
             set_prior("uniform(0, 1)", class = "sd", nlpar = "a", group = "sp"),
             set_prior("uniform(0, 1)", class = "sd", nlpar = "b", group = "sp"))

brm2a <- brm(bf(Y ~ a * X^b,
                b ~ 1 + Z + (1|sp),
                a ~ 1 + Z + (1|sp) + (1|year) + (1|trap),
                nl = TRUE),
             prior = priors1, data = sdsl1,
             family = "zero_inflated_negbinomial",
             iter = 200, chains = 2, # warmup = 1000,
             control = list(adapt_delta = 0.88), cores = 3,
             save_model = "bci-brm-SM-nb2a.txt")

For question 2: I was looking at the Stan code generated by brm to understand how exactly the effect of the group-level predictor had been coded. This was also to double-check my own previous (utterly novice) attempts at formulating the model. I want to understand how predictor Z is incorporated into

mu_b[n] = mu_b[n] + (r_1_b_1[J_1[n]]) * Z_1_b_1[n]

Hope that makes it clearer.

Thanks,
Meghna

Paul Buerkner

Jul 6, 2017, 12:12:17 PM
to brms-users
Just putting uniform priors on the SDs won't make sure "a" stays in [0, 1]; you are basically just biasing your results by imposing far too informative priors.

The only way to ensure that a as a whole stays in [0, 1] is to model f(a) instead of a, where f() maps the real line onto [0, 1]. The most convenient such function is probably inv_logit(), as shown in my previous post.
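
A minimal sketch of that reparameterization, written out for your model (the earlier post is not quoted in this thread, so this is a reconstruction):

bf(Y ~ inv_logit(a) * X^b,  # inv_logit() keeps the multiplier in (0, 1)
   b ~ 1 + Z + (1|sp),
   a ~ 1 + Z + (1|sp) + (1|year) + (1|trap),  # 'a' itself stays unconstrained
   nl = TRUE)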

2) Z goes into mu_b[n] via mu_b = X_b * b_b, since you model Z only as a population-level effect, not as a group-level effect. To do the latter, that is, to model the effect of Z as varying over "sp", go for

bf(Y ~ a * X^b,
   b ~ 1 + Z + (1 + Z|sp),
   a ~ 1 + Z + (1 + Z|sp) + (1|year) + (1|trap),
   nl = TRUE)

Meghna Krishnadas

Jul 6, 2017, 12:17:26 PM
to brms-users
Hi Paul,

For constraining posterior estimates of 'a', I can't use inv_logit(), because I only want 'a' > 0; it could be > 1.

About the effect of 'Z', we had this discussion previously and arrived at the conclusion that 'Z' should be modeled as:

b ~ 1 + Z + (1|sp) (similarly for a).

To reiterate, this was because I do not want 'Z' to vary within the group 'sp'. Instead, I want to see how 'Z' correlates with the estimated values of 'b' per 'sp' (group); the value of 'Z' is the same for all individuals within a group. In a non-joint modeling approach, I would fit a separate model for the individuals within each sp, get the estimates of 'a' and 'b', and then model the effect of Z as:

b(i) = g0 + g1*Z(i)


So, to confirm (and I do apologize for belaboring this issue!): should I write

b ~ 1 + Z + (1|sp) or b ~ 1 + Z + (1 + Z|sp) ?

Wouldn't the latter mean that Z also varies across individuals within the i-th group of sp?

On a tangential note: do you have any plans to organize a workshop or video conference about using brms to set up various kinds of models? It would be most helpful to have an opportunity to speak with you in person, and also to interact with other users.

Thanks,
Meghna

Paul Buerkner

Jul 7, 2017, 5:42:28 AM
to brms-users
If you want a > 0, use exp(a).

Indeed, b ~ 1 + Z + (1 + Z|sp) would mean that Z has to vary within the i-th group, so b ~ 1 + Z + (1|sp) seems appropriate in your case.

With regard to the tangential note: I am already giving quite a few workshops but mostly after being invited by universities close to Germany, where I live.
I haven't thought of making an "online workshop" yet, but after you mentioned it, I think it would actually make much sense.
That said, if you want to talk to me in person about your research question, we can always to it via skype.

Meghna Krishnadas

Jul 7, 2017, 10:33:01 AM
to brms-users
Thanks Paul.

I already tried exp(a), but all model parameters and CIs are estimated as 0 or 1, even after running 1000+ iterations. I'm not sure what is going wrong. Perhaps I need different priors from the ones I am using now?

I wrote the model as

brm(bf(Y ~ exp(a) * X^b,
       a ~ 1 + Z + (1|G1) + (1|G2) + (1|G3),
       b ~ 1 + Z + (1|G1),
       nl = TRUE), ...)

Both a and b are given N(0, 1) priors. Any obvious errors? Should the prior and group-level formulation refer to exp(a)?

Thanks,
Meghna

Paul Buerkner

Jul 7, 2017, 10:37:33 AM
to Meghna Krishnadas, brms-users
I guess it is a prior issue. Try removing the uniform priors from the SDs and all boundary specifications in the priors.


Meghna Krishnadas

Jul 7, 2017, 11:02:49 AM
to brms-users
I did remove all priors on the SDs and the bounds on a, but the issue persists. Should I try giving initial values?

Paul Buerkner

Jul 7, 2017, 11:14:23 AM
to Meghna Krishnadas, brms-users
Initial values won't help much, I think. At this point, it may be most reasonable for me to take a look at the data and model. If you cannot privately share the data with me, could you create some fake data with which the problem also occurs? I will then try to amend your model to get it working.
