ZeroOne Inflated Beta Regression

741 views
Skip to first unread message

Michael

unread,
Mar 17, 2014, 3:32:31 PM3/17/14
to stan-...@googlegroups.com
I am trying to develop a zero / one inflated beta regression model using rstan.  The idea is that 0s and 1s are modeled using a bernoulli distribution, (0,1) are modeled using a beta distribution. I have attached the R code and the Stan model code.

The gamlss.dist package has a beta inflated distribution, BEINF. The R code uses that to generate example data.  The R code also uses the gamlss function to compare against the stan model.

There are four parameters for the model: mu, lambda, nu, tau.  Mu is the mean of the beta distribution, lambda is the total count.  I am following the re-parameterization of Beta from the Stan Reference 2.2.0 p. 155.  Nu is the probability of getting 0 / probability of (0,1).  Tau is the probability of getting 0 / probability (0,1).  Each parameter is modeled using an intercept and coefficients, with normal distribution for the priors.  

The main problem is this: the model does a particularly poor job of estimating the lambda intercept, and sometimes the lambda coefficients.  (It does fine with everything else).  Using 4 chains, this is how it compares:
lambda_intercept
actual: 1, model: 2.3 (2.0–2.6)

lambda_coef1
actual: 0.2, model: 0.5 (0.2–0.6)

lambda_coef2
actual: -0.5, model: -0.5 (-0.7 – -0.5)

Is this variation to be expected or am I doing something wrong?  I am getting a few of these messages: 
Informational Message: The current Metropolis proposal is about to be rejected becuase of the following issue:
Error in function stan::prob::bernoulli_log(N4stan5agrad3varE): Probability parameter is -nan:0, but must be finite!

Informational Message: The current Metropolis proposal is about to be rejected becuase of the following issue:
Error in function stan::prob::beta_log(N4stan5agrad3varE): Second shape parameter is 0:0, but must be > 0!

I imagine that there is a lot I could do to improve on this model.  Any help is appreciated!  Thanks!

Example - ZeroOne-Inflated Beta Regression GAMLSS.R
ZeroOne Inflated Beta Regression.stan

Bob Carpenter

unread,
Mar 17, 2014, 3:55:11 PM3/17/14
to stan-...@googlegroups.com
The mixture model you wrote for 0/1 inflation is here:

if (y[n] == 0)
increment_log_prob(bernoulli_log(1, p0));
else if (y[n] == 1)
increment_log_prob(bernoulli_log(1, p1));
else
increment_log_prob(beta_log(y[n], alpha, beta) + bernoulli_log(1, p2));

I find this a very confusing way to write a mixture of a bernoulli
and beta, when you could just write it out more directly using
phi as the mixing ratio as

if (y[n] == 0 || y[n] == 1)
increment_log_prob(bernoulli_log(y[n] | p) + bernoulli_log(1,phi))
else
increment_log_prob(beta_log(y[n], alpha, beta) + bernoulli_log(0,phi));

I think we should add (real / vector); a better hack for real x divided
by vector v is:

rep_vector(x,size(v)) ./ v

To answer your question, the amount of variation from the expected values
depend both on the amount of data and whether the model you wrote down
actually identifies the parameters. I'd worry about there being too
many degrees of freedom in your encoding with both tau and nu.

But as always, I'd suggest working up from a simple model. In this case
it might be one where there's only an intercept, say, or not even a regression,
just a simple mixture estimating alpha, beta and phi above.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
> <Example - ZeroOne-Inflated Beta Regression GAMLSS.R><ZeroOne Inflated Beta Regression.stan>

Reply all
Reply to author
Forward
0 new messages