[R] how to define a geometric distribution for "glm"


Amanda Li

Oct 27, 2014, 5:31:22 PM
to r-h...@r-project.org
Hello,

I am trying to fit a "glm" to a dataset that is assumed to follow a geometric
distribution. I cannot use "glm.nb" in the MASS package (negative.binomial(1))
because it tries to estimate this "1", whereas I am interested in "p", the
probability of success. Does anyone know how I can define a geometric
distribution within "family" so that I can use glm under a geometric
distribution to estimate "p"?

I am not sure how "quasi" within the family works in this case, or whether
it can be used to assume a geometric distribution.

Thanks in advance for your help! I really appreciate it!
Best regards,
Amanda

______________________________________________
R-h...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

peter dalgaard

Oct 27, 2014, 6:29:59 PM
to Amanda Li, r-h...@r-project.org
The likelihood for the geometric distribution is the same as for the binomial distribution, except for the constant term, so estimates and LRT will be the same. The properties of the estimator will be different, e.g. the estimate of p is not unbiased, but asymptotically the likelihood procedures should work (asymptotic in this case means a reasonably large total number of both successes and failures, I suppose.)
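
To spell out the equivalence, here is a short sketch of the two likelihoods, assuming n independent observations y_i that each count failures before the first success (the symbol n and the subscripts are introduced here for illustration; with covariates, p becomes p_i and the same argument applies term by term):

```latex
% Geometric likelihood, with y_i = number of failures before the first success
L_{\mathrm{geom}}(p) = \prod_{i=1}^{n} p\,(1-p)^{y_i}

% Binomial likelihood for 1 success and y_i failures in y_i + 1 trials
L_{\mathrm{bin}}(p) = \prod_{i=1}^{n} \binom{y_i+1}{1}\, p\,(1-p)^{y_i}
                    = \Bigl[\prod_{i=1}^{n} (y_i+1)\Bigr]\, L_{\mathrm{geom}}(p)
```

The bracketed factor does not involve p, so both likelihoods are maximised at the same value and likelihood ratios between nested models coincide.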

So, if your geometric variate is called y, with the R convention of counting the number of failures (not number of experiments), it should work with

glm(cbind(1,y) ~ whatever, family="binomial")
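
As a rough check of this call, the following sketch simulates geometric data and compares the binomial fit with a direct maximisation of the geometric log-likelihood. The covariate x, the sample size, and the coefficient values are made up for illustration; only base R functions (rgeom, dgeom, plogis, glm, optim) are used.

```r
# Simulated data: x and the "true" coefficients are illustrative assumptions.
set.seed(1)
n <- 500
x <- rnorm(n)
p <- plogis(-0.5 + 0.8 * x)   # success probability, logit-linear in x
y <- rgeom(n, prob = p)       # number of failures before the first success

# Binomial fit: each observation contributes 1 success and y failures.
fit <- glm(cbind(1, y) ~ x, family = binomial)

# Direct maximisation of the geometric log-likelihood, for comparison.
negloglik <- function(beta) {
  -sum(dgeom(y, prob = plogis(beta[1] + beta[2] * x), log = TRUE))
}
opt <- optim(c(0, 0), negloglik, method = "BFGS")

# The two sets of estimates should agree up to optimiser tolerance.
print(rbind(glm = coef(fit), optim = opt$par))
```

The coefficients are on the logit scale for p, so plogis() applied to the linear predictor gives the fitted probability of success.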

[The likelihood equivalence is fairly well-known in statistical theory as a counterargument to the strong likelihood principle that all inference should be based solely on the likelihood function.]

- Peter D.
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk Priv: PDa...@gmail.com

Amanda Li

Oct 27, 2014, 6:43:36 PM
to peter dalgaard, r-h...@r-project.org
Hi Peter,

Thank you very much for your help! However, for my dataset, the asymptotic
results may not hold. May I ask whether you know how to define a new family?

Thank you very much again!

Best,
Amanda