(partial followup on the robust covariance discussion)
"The user is responsible for checking whether a statistical analysis is valid."
In many cases, statsmodels will happily produce results even if the analysis doesn't make much sense or is statistically not justified. Some of these are older cases, and as we add more options and models to statsmodels, there will be more combinations that are statistically not justified.
(Besides the statistical "validity", there are also possible numerical problems, about which we also warn only partially.)
Users Beware! or More Warnings Please!
Encoding statistical knowledge requires that the developer has that knowledge in the first place, and that's not always the case.
Help Wanted:
If you know a case where we produce an invalid result, then please help us by opening an issue or a pull request.
Stata, in some cases, puts a warning in the documentation for situations where the calculations are not valid, but the program has no way to verify this. In other cases, Stata's estimation results carry something like a blacklist of results that cannot or should not be calculated. In still other cases, Stata simply refuses to calculate. But we have neither the knowledge base nor the manpower of Stata development.
(I have no idea what R is doing in these cases.)
Josef
Example
Statsmodels:
>>> res_olsg.compare_lr_test(res_ols2)
(4.6679440835894184, 0.030730693840286833, 1.0)
Stata (refuses to calculate without 'force'):
. lrtest A B, stats
LR test likely invalid for models with robust vce
r(498);
. lrtest A B, stats force
Likelihood-ratio test LR chi2(1) = 4.67
(Assumption: B nested in A) Prob > chi2 = 0.0307
-----------------------------------------------------------------------------
Model | Obs ll(null) ll(model) df AIC BIC
-------------+---------------------------------------------------------------
B | 202 . -766.3092 2 1536.618 1543.235
A | 202 . -763.9752 3 1533.95 1543.875
-----------------------------------------------------------------------------
Note: N=Obs used in calculating BIC; see [R] BIC note