why statsmodels OLS uses t-test but GLM uses z-test for confidence interval


Mohsen Vazirizade

Oct 6, 2021, 3:45:13 PM
to pystatsmodels
Hi,
As you may know, both smf.ols and smf.glm can be used to fit a regression model. In theory, they should give the same results if the error is normally distributed. I ran two regressions (please find the code attached), once using ols and once using glm. While their coefficient estimates are identical, ols uses a t-test but glm uses a z-test. Consequently, their confidence intervals are different.

To my knowledge, we should always use a t-test, since we do not know the actual value of sigma and estimate it from the samples we have from the population. I was wondering if someone could explain why smf.glm uses a z-test.
Thank you


Screen Shot 2021-10-06 at 2.27.04 PM.png
Screen Shot 2021-10-06 at 2.27.12 PM.png
question_regression.py

josef...@gmail.com

Oct 6, 2021, 3:50:33 PM
to pystatsmodels

In GLM the default is the same for all families, so we only assume that the parameter estimates are asymptotically normally distributed.
Using the t-distribution for Wald tests will often have better small-sample behavior than the normal distribution, but that's mostly from Monte Carlo simulations and not a theoretical result.
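
To give a rough sense of how much the choice matters, here is a minimal sketch that compares the two-sided 95% critical values of the normal and t distributions. It assumes scipy is installed; the degrees of freedom listed are arbitrary illustrative values, not taken from the attached script.

```python
from scipy import stats

alpha = 0.05  # two-sided 95% interval

# Critical value used when inference is based on the normal distribution (z).
z_crit = stats.norm.ppf(1 - alpha / 2)

# Critical values used when inference is based on the t distribution,
# for a few illustrative residual degrees of freedom.
t_crits = {df: stats.t.ppf(1 - alpha / 2, df) for df in (5, 10, 30, 100, 1000)}

print(f"z critical value: {z_crit:.4f}")
for df, t_crit in t_crits.items():
    # The ratio is the factor by which the t-based interval is wider.
    print(f"df={df:5d}  t critical value: {t_crit:.4f}  ratio t/z: {t_crit / z_crit:.4f}")
```

The t critical value is always larger than the z value and approaches it as the residual degrees of freedom grow, so the two kinds of interval differ noticeably only in small samples.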

However, you can set `use_t=True` as an option in `fit(...)`, and the distribution is changed from z to t and from chi2 to F.

Josef



 

