beta regression using the brms package

1,013 views
Skip to first unread message

Jannik Vindeløv

unread,
Jan 13, 2017, 5:32:24 AM1/13/17
to brms-users
I would like to use brms to model the substrate conversion rate (response) as a function of initial substrate level (lac_init), enzyme dose (dose), temperature (temp), time (time), and pH (finalpH).

The response is a fraction: (initial substrate level - final substrate level) / (intial substrate level) bounded in the interval ]0 ; 1[ . I assume that Beta regression would be suitable for such a problem.

I can model this using the betareg package getting good results

package �betareg� was built under R version 3.2.5
Call:
betareg(formula = response ~ (sqrt(dose) + time + finalpH + temp)^2 + lac_init, data = as.data.frame(in_train))

Standardized weighted residuals 2:
    Min      1Q  Median      3Q     Max 
-7.1395 -0.5896 -0.0256  0.5600  8.1095 

Coefficients (mean model with logit link):
                     Estimate Std. Error z value Pr(>|z|)    
(Intercept)        21.1354940  4.4914824   4.706 2.53e-06 ***
sqrt(dose)         -0.8011171  0.0894485  -8.956  < 2e-16 ***
time               -0.2734958  0.0370627  -7.379 1.59e-13 ***
finalpH            -4.7567368  1.0552170  -4.508 6.55e-06 ***
temp               -0.8340544  0.1265061  -6.593 4.31e-11 ***
lac_init           -1.0515355  0.1043750 -10.075  < 2e-16 ***
sqrt(dose):time     0.0022238  0.0001349  16.487  < 2e-16 ***
sqrt(dose):finalpH  0.1847247  0.0210916   8.758  < 2e-16 ***
sqrt(dose):temp     0.0051239  0.0005236   9.785  < 2e-16 ***
time:finalpH        0.0559777  0.0085047   6.582 4.64e-11 ***
time:temp           0.0015417  0.0002115   7.289 3.13e-13 ***
finalpH:temp        0.1817306  0.0291293   6.239 4.41e-10 ***

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)    
(phi)   73.648      5.939    12.4   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 772.2 on 13 Df
Pseudo R-squared: 0.9243
Number of iterations: 68 (BFGS) + 5 (Fisher scoring) 

The model performs quite well:
      RMSE   Rsquared 
0.04320536 0.96851647 

I would like to do the same model using brms:

brms_model <- brm(response ~ (sqrt(dose) + time + finalpH + temp)^2 + lac_init,
                  data = as.data.frame(in_train),
                  family = Beta())

But get the following error(s):

SAMPLING FOR MODEL 'beta(logit) brms-model' NOW (CHAIN 1).
  [1] "Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:"                        
  [2] "Exception thrown at line 39: beta_log: First shape parameter[194] is 0, but must be > 0!"                                              
  [3] "If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,"
  [4] "but if this warning occurs often then your model may be either severely ill-conditioned or misspecified."      

Any ideas on what i should do? Is the approach correct? do I need to specify a prior on the shape parameter? - and how to do it?

Any help would be highly appreciated

/Jannik

PS. I can provide the data-set if needed

Paul Buerkner

unread,
Jan 13, 2017, 5:51:56 AM1/13/17
to brms-users
HI Jannik,

providing the data would be great! If you don't want to share it here, just send me and email and I find out what goes wrong.

Best,
Paul

Jannik Vindeløv

unread,
Jan 13, 2017, 6:14:03 AM1/13/17
to brms-users
Thanks Paul,

I will send it by mail

/Jannik

Paul Buerkner

unread,
Jan 13, 2017, 6:38:40 AM1/13/17
to brms-users
The sampler has problems finding adequate initial values possibly because the scale of your predictors is too large.

If you use inits = 0, it should work, but it will be slow presumably because there are some many interactions that there is some sort of multicollineraity going on.

Jannik Vindeløv

unread,
Jan 13, 2017, 7:26:02 AM1/13/17
to brms-users
The inits = 0 appears to be working (but is very slow, it is still running on my laptop...) 

Would it help to normalize the predictors (e.g. mean centered unit variance) before modelling and would you generally recommend doing this when using brms?
If so could it be a feature for the brm function to do this internally (e.g brm(..., normalize = TRUE))?

Thank you for your help!

/Jannik

Paul Buerkner

unread,
Jan 13, 2017, 9:36:27 AM1/13/17
to brms-users
When I normalize all predictors, everything works fine and fits in a few seconds suggesting that it is indeed a problem of predictor scales and / or high correlation of predictors (due to the interactions).

Here is what I did:

in_train$zsqrt_dose <- scale(sqrt(in_train$dose))
in_train$ztime
<- scale(in_train$time)
in_train$zfinalpH
<- scale(in_train$finalpH)
in_train$ztemp
<- scale(in_train$temp)
in_train$zlac_init
<- scale(in_train$lac_init)


brms_model
<- brm(response ~ (zsqrt_dose + ztime + zfinalpH + ztemp)^2 + zlac_init,
                  data
= in_train, family = Beta(), inits = 0)

summary
(brms_model)
marginal_effects
(brms_model)
Reply all
Reply to author
Forward
0 new messages