> By now I have a guilty feeling, costing you guys so much time in going
> through my comments. Please tell me when to stop!
>
> Problem
> This time it is the standard errors for Poisson regression that don't match
> up. The fitted parameters, as well as the "fittedvalues" are pretty close (I
> guess that the mis-matches are just differences in when to stop the
> iteration). But the standard errors are WAY out.
I get the same standard errors as Stata.
Stata code:
insheet using /home/skipper/scratch/try_poisson.csv
encode age, gen(agecat)
gen agesq = agecat^2
gen smkage=0
gen smoke = 0
replace smkage = agecat if smoking=="smoker"
replace smoke = 1 if smoking=="smoker"
glm deaths agecat agesq smoke smkage, family(poisson) lin(log) lnoffset(personyears)
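Statsmodels code:
import numpy as np
import statsmodels.api as sm
from statsmodels.formula.api import glm

# df is the DataFrame that already holds the data
# indicator: 1 for smokers, 0 otherwise
df['smoke'] = np.zeros(len(df))
df.loc[df['smoking'] == 'smoker', 'smoke'] = 1
df['agecat'] = np.array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
df['agesq'] = df['agecat']**2
# age category for smokers, 0 for non-smokers
df['smkage'] = df['agecat']
df.loc[df['smoking'] == 'non-smoker', 'smkage'] = 0

model = glm('deaths ~ agecat + agesq + smoke + smkage',
            family=sm.families.Poisson(), data=df,
            exposure=df["person-years"]).fit()
print(model.summary())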
In [28]: model.bse
Out[28]:
Intercept 0.450077
agecat 0.207949
agesq 0.027367
smoke 0.372199
smkage 0.097041
dtype: float64
hth,
Skipper