Poisson Likelihood with non-integer response


AJ

Jul 23, 2016, 8:37:16 AM
to R-inla discussion group
Hello,

Is it possible to fit a Poisson model to non-integer data in INLA (similar to what the quasipoisson family does in glm)?

Thanks in advance,

Abdollah

INLA help

Jul 23, 2016, 8:46:13 AM
to AJ, R-inla discussion group
On Sat, 2016-07-23 at 05:37 -0700, AJ wrote:
> Hello,
>
> Is it possible to fit a Poisson model to non-integer data in INLA
> (similar to what the quasipoisson family does in glm)?


no. 

what would the expression for that likelihood be in that case?


--
Håvard Rue
he...@r-inla.org

Finn Lindgren

Jul 23, 2016, 8:58:07 AM
to he...@r-inla.org, AJ, R-inla discussion group
The document
has this to say about quasi-Poisson:
"Another way of dealing with over-dispersion is to use the mean regression function and the variance function from the Poisson GLM but to leave the dispersion parameter φ unrestricted. Thus, φ is not assumed to be fixed at 1 but is estimated from the data. This strategy leads to the same coefficient estimates as the standard Poisson model but inference is adjusted for over-dispersion. Consequently, both models (quasi-Poisson and sandwich-adjusted Poisson) adopt the estimating function view of the Poisson model and do not correspond to models with fully specified likelihoods."

Finn

AJ

Jul 23, 2016, 9:51:32 AM
to R-inla discussion group, abdullah...@gmail.com, he...@r-inla.org

> what would the expression for that likelihood be in that case?

Thanks for your reply,

Let S = {y_1, y_2, ...} be a countable set of non-negative numbers (not necessarily integers) such that y_i \to \infty as i \to \infty. Then for any \lambda > 0,
   C \exp( -\lambda + y_i \log\lambda ) / \Gamma(1 + y_i),   i = 1, 2, ...
defines a distribution (probability mass function) on S, where
   1/C = \exp(-\lambda) \sum_{i=1}^{\infty} \exp( y_i \log\lambda ) / \Gamma(1 + y_i) < \infty
defines the normalizing factor C. Thus the likelihood is proportional to the likelihood of an ordinary Poisson distribution.
This means that the same inference is obtained by using the Poisson likelihood for such data.

In the glm function, family=poisson with a non-integer response produces warnings, because the Poisson mass function (dpois) is called with non-integer values.
> x <- c(2.1, 1.3, 6.2, 7.8, 2.2, 4.6, 3.4, 0.5)
> y <- c(1.4, 1.2, 3.1, 3.2, 1.5, 2.4, 1.8, 0.1)
> glm(y ~ x, family=poisson)

Call:  glm(formula = y ~ x, family = poisson)

Coefficients:
(Intercept)            x  
    -0.2077       0.1991  

Degrees of Freedom: 7 Total (i.e. Null);  6 Residual
Null Deviance:        5.026
Residual Deviance: 1.474     AIC: Inf
Warning messages:
1: In dpois(y, mu, log = TRUE) : non-integer x = 1.400000
2: In dpois(y, mu, log = TRUE) : non-integer x = 1.200000
3: In dpois(y, mu, log = TRUE) : non-integer x = 3.100000
4: In dpois(y, mu, log = TRUE) : non-integer x = 3.200000
5: In dpois(y, mu, log = TRUE) : non-integer x = 1.500000
6: In dpois(y, mu, log = TRUE) : non-integer x = 2.400000
7: In dpois(y, mu, log = TRUE) : non-integer x = 1.800000
8: In dpois(y, mu, log = TRUE) : non-integer x = 0.100000
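
A hedged illustration of where these warnings and the "AIC: Inf" come from (a minimal sketch; mu = 2 and the helper name log_kernel are arbitrary choices for illustration, not taken from the fit above). dpois() gives a log-density of -Inf at a non-integer argument, whereas the Poisson log-kernel written with lgamma() stays finite there and agrees with dpois() at integers:

```r
# Hypothetical values chosen only for illustration.
y_nonint <- 1.4
mu <- 2

# dpois() warns and returns -Inf on the log scale for a non-integer argument.
log_dens <- suppressWarnings(dpois(y_nonint, mu, log = TRUE))

# Poisson log-kernel with the factorial replaced by the gamma function;
# it is defined for any y >= 0.
log_kernel <- function(y, mu) -mu + y * log(mu) - lgamma(y + 1)

kernel_nonint <- log_kernel(y_nonint, mu)

# At an integer the two expressions coincide.
kernel_int <- log_kernel(3, mu)
dens_int <- dpois(3, mu, log = TRUE)

print(c(dpois = log_dens, kernel = kernel_nonint))
```

The IRLS fit itself only needs the mean and variance functions, which is why the coefficients are still returned; only the AIC, which is built from dpois(), is affected.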


But with family=quasipoisson, the same results are obtained with no warnings.

> glm(y ~ x, family=quasipoisson)

Call:  glm(formula = y ~ x, family = quasipoisson)

Coefficients:
(Intercept)            x  
    -0.2077       0.1991  

Degrees of Freedom: 7 Total (i.e. Null);  6 Residual
Null Deviance:        5.026
Residual Deviance: 1.474     AIC: NA
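
To make the comparison explicit, here is a minimal sketch using the same x and y as above (the object names fit_pois, fit_quasi and phi_hat are just illustrative). It checks that the two families give identical coefficients and differ only through the estimated dispersion, which rescales the standard errors:

```r
x <- c(2.1, 1.3, 6.2, 7.8, 2.2, 4.6, 3.4, 0.5)
y <- c(1.4, 1.2, 3.1, 3.2, 1.5, 2.4, 1.8, 0.1)

# Same mean and variance functions, so the same estimating equations.
fit_pois  <- suppressWarnings(glm(y ~ x, family = poisson))
fit_quasi <- glm(y ~ x, family = quasipoisson)

# quasipoisson estimates the dispersion from the Pearson residuals
# instead of fixing it at 1.
phi_hat <- summary(fit_quasi)$dispersion
phi_manual <- sum(residuals(fit_quasi, type = "pearson")^2) /
  df.residual(fit_quasi)

se_pois  <- coef(summary(fit_pois))[, "Std. Error"]
se_quasi <- coef(summary(fit_quasi))[, "Std. Error"]

print(rbind(poisson = coef(fit_pois), quasipoisson = coef(fit_quasi)))
print(c(dispersion = phi_hat))
```

The quasi-Poisson standard errors are the Poisson ones multiplied by sqrt(phi_hat), which is the "inference is adjusted" part of the quoted document.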

So, I guess it depends on the way the Poisson likelihood is implemented in INLA. It seems to work fine.
> stk <- inla.stack(data=list(y=y), A=list(1), effects=list(x=x))
> res <- inla(y ~ x, family="poisson", data=inla.stack.data(stk))
> res$summary.fixed[, 1:5]
                  mean        sd   0.025quant   0.5quant 0.975quant
(Intercept) -0.1903892 0.5603239 -1.379172167 -0.1587073  0.8251839
x            0.1990556 0.1053645 -0.005815719  0.1983585  0.4076384

How does INLA treat non-integer values in the Poisson likelihood?
Does it simply calculate the likelihood at the data points regardless of their type (integer or non-integer)?

Best regards,

Abdollah

INLA help

Jul 23, 2016, 9:59:08 AM
to AJ, R-inla discussion group
On Sat, 2016-07-23 at 06:51 -0700, AJ wrote:
>
> > what would the expression for that likelihood be in that case? 
> >
> Thanks for your reply,
>
> Let S = {y_1, y_2, ...} be a countable set of non-negative numbers (not
> necessarily integers) such that y_i \to \infty as i \to \infty. Then
> for any \lambda > 0,
>    C \exp( -\lambda + y_i \log\lambda ) / \Gamma(1 + y_i),   i = 1, 2,
> ...
> defines a distribution (probability mass function) on S, where
>    1/C = \exp(-\lambda) \sum_{i=1}^{\infty} \exp( y_i \log\lambda ) /
> \Gamma(1 + y_i) < \infty
> is the normalizing factor. Thus the likelihood is proportional to the
> likelihood of an ordinary Poisson distribution. 
> This means that the same inference is obtained by using the Poisson
> likelihood for such data.


so to compute the normalizing constant, we'll need to know all the
'y_i' values it _could_ take?
--
Håvard Rue
he...@r-inla.org

AJ

Jul 23, 2016, 10:36:19 AM
to R-inla discussion group, abdullah...@gmail.com, he...@r-inla.org

> so to compute the normalizing constant, we'll need to know all the
> 'y_i' values it _could_ take?

Yes, for non-integer data the normalizing constant 'C' is not easy (if not impossible) to calculate. But for parameter estimation, the normalizing constant cancels out in the posterior density and therefore we do not need to compute it.

INLA help

Jul 23, 2016, 4:29:38 PM
to AJ, R-inla discussion group
it does depend on \lambda... 
--
Håvard Rue
he...@r-inla.org

Finn Lindgren

Jul 23, 2016, 4:47:59 PM
to he...@r-inla.org, AJ, R-inla discussion group

>> Yes, for non-integer data the normalizing constant 'C' is not easy
>> (if not impossible) to calculate. But for parameter estimation, the
>> normalizing constant cancels out in the posterior density and
>> therefore we do not need to compute it.
>
> it does depend on \lambda...

Precisely. As the document I referenced says, the quasi-Poisson "model" (it appears to be the same one discussed here) used by glm is based on estimating equations and doesn't have a likelihood model. One can define one in the way discussed in this thread, and that may mean something similar, but as Håvard says, the normalization depends on \lambda, so it only cancels out for a fixed \lambda, and not over the whole posterior range of \lambda.
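
The dependence on \lambda can be checked numerically. A minimal sketch, assuming a hypothetical support S = {0, 0.5, 1, 1.5, ...} truncated at 200 (the grid, the truncation point and the function name norm_sum are illustrative assumptions, not something from this thread):

```r
# Normalising sum  exp(-lambda) * sum_i lambda^{y_i} / Gamma(1 + y_i)
# over a given support, computed on the log scale for stability.
norm_sum <- function(lambda, support) {
  sum(exp(-lambda + support * log(lambda) - lgamma(1 + support)))
}

integer_support <- 0:200                   # ordinary Poisson support (truncated)
half_support    <- seq(0, 200, by = 0.5)   # hypothetical non-integer support

lambdas <- c(0.5, 1, 2, 5)
z_int  <- sapply(lambdas, norm_sum, support = integer_support)
z_half <- sapply(lambdas, norm_sum, support = half_support)

print(data.frame(lambda = lambdas, integer = z_int, half_grid = z_half))
```

On the integer support the sum equals 1 for every \lambda, so nothing is lost by ignoring it. On the half-integer grid it changes with \lambda, so dropping it changes the shape of the likelihood as a function of \lambda rather than just rescaling it.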

Finn

AJ

Jul 23, 2016, 5:43:35 PM
to R-inla discussion group, he...@r-inla.org, abdullah...@gmail.com

> it does depend on \lambda...

That is right. Sorry, my mistake.

> Precisely. As the document I referenced says, the quasi-Poisson "model" (it appears to be the same one discussed here) used by glm is based on estimating equations and doesn't have a likelihood model. One can define one in the way discussed in this thread, and that may mean something similar, but as Håvard says, the normalization depends on \lambda, so it only cancels out for a fixed \lambda, and not over the whole posterior range of \lambda.


Yes, estimation for the quasipoisson family in glm is based on an estimating equation that happens to coincide with the likelihood (score = 0) equation of the Poisson model. Thank you for the detailed answer. In fact, the model considered above belongs to an over-dispersed exponential family of distributions, as discussed in the following reference:

Gelfand, A. E., & Dalal, S. R. (1990). A note on overdispersed exponential families. Biometrika, 77(1), 55-64.

 

INLA help

Jul 24, 2016, 4:43:25 AM
to AJ, R-inla discussion group
On Sat, 2016-07-23 at 14:43 -0700, AJ wrote:



maybe this is a more natural version? 

http://ac.inf.elte.hu/Vol_039_2013/137_39.pdf
--
Håvard Rue
he...@r-inla.org

AJ

Jul 24, 2016, 7:45:49 AM
to R-inla discussion group, abdullah...@gmail.com, he...@r-inla.org


> maybe this is a more natural version?
>
> http://ac.inf.elte.hu/Vol_039_2013/137_39.pdf

Thanks, it is more natural. However, the likelihood involves a two-dimensional integral, and the moments are not available in closed form. I guess it would be hard to work out the link function and implement the likelihood in INLA.