Statsmodels mixedlm variance parameters differ from lmer in R


Alan Samuels

Nov 19, 2019, 7:48:25 AM11/19/19
to pystatsmodels
Hi,

I am trying to fit a linear mixed effects model to some data (which unfortunately I am not able to share). The model has an intercept and a single covariate, and includes both a random intercept and a random slope. The statsmodels mixedlm function is suited to fitting such models, and the resulting parameter estimates appear reasonable, except that the variance of the random slope term is considerably larger than expected. To verify this, I fit the same model to the same data in R, using the lmer function from the lme4 package. The estimated coefficients are, on the whole, very comparable, but the estimated random slope variance is far more plausible in R: 0.332 in statsmodels versus 0.0000324 in lmer, a much larger gap than could be attributed to numerical differences alone.

What's more, the predictions of the random effects are very similar in both packages, despite the fact that these predictions are computed conditionally on the random effects covariance matrix. I am therefore not sure whether I have interpreted the statsmodels output correctly. I am treating cov_re as the random effects covariance matrix, as explained in the mixedlm documentation, so its diagonal elements should be equivalent to the random effects variances reported by lmer. The scale in this case is also close to 0.34 (in both statsmodels and R), i.e. very similar to the random slope variance estimated by statsmodels. On top of this, if I fit the model without a random intercept term, the estimated variance of the random slope comes out exactly equal to the residual variance (the scale), which, again, does not happen in lmer. I may thus have missed some connection between the two, but it is not clear from the documentation.
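
For reference, here is a self-contained sketch on simulated data (all names and variance values illustrative, not from my data set) showing how I am reading cov_re and scale — on data with known variance components the diagonal of cov_re does seem to line up with the generating variances:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate grouped data with known random-intercept variance (0.5),
# random-slope variance (0.1) and residual variance (1.0)
rng = np.random.default_rng(0)
n_groups, n_per = 100, 50
g = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=g.size)
b0 = rng.normal(scale=np.sqrt(0.5), size=n_groups)  # random intercepts
b1 = rng.normal(scale=np.sqrt(0.1), size=n_groups)  # random slopes
y = 1.0 + 2.0 * x + b0[g] + b1[g] * x + rng.normal(scale=1.0, size=g.size)
df = pd.DataFrame({"Response": y, "Predictor": x, "Group": g})

res = smf.mixedlm("Response ~ Predictor", df,
                  groups=df["Group"], re_formula="~Predictor").fit()
print(res.cov_re)  # diagonal should be near [0.5, 0.1]
print(res.scale)   # residual variance, should be near 1.0
```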

I have tried different optimisation methods but reach similar conclusions. I have attached code below and screenshots of the model summaries; any help as to why this might happen would be greatly appreciated. Thanks.
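
This is how I am switching optimisers, via the method argument of fit (a scipy optimizer name, or a list of names tried in sequence); the data here are a toy stand-in since the real arrays cannot be shared:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy stand-in data with a random intercept and a random slope
rng = np.random.default_rng(1)
n_groups, n_per = 30, 20
g = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=g.size)
b0 = rng.normal(scale=0.7, size=n_groups)
b1 = rng.normal(scale=0.3, size=n_groups)
y = 1 + 2 * x + b0[g] + b1[g] * x + rng.normal(size=g.size)
df = pd.DataFrame({"Response": y, "Predictor": x, "Group": g})

md = smf.mixedlm("Response ~ Predictor", df,
                 groups=df["Group"], re_formula="~Predictor")
# method accepts a single optimizer name or a list tried in order
res = md.fit(method=["lbfgs", "powell"])
print(res.summary())
```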


# %% Statsmodels
import statsmodels.formula.api as smf
import pandas as pd

# response_array, predictor_array and group_array are defined earlier
df = pd.DataFrame({'Response': response_array, 'Predictor': predictor_array, 'Group': group_array})
md = smf.mixedlm("Response ~ Predictor", df, groups=df["Group"], re_formula="~Predictor").fit()

print(md.summary())

# lme4 (having loaded the same data frame)

library(lme4)

fit <- lmer(Response ~ Predictor + (1+Predictor|Group), data=df)

summary(fit)
[Attached: statsmodelssummary.PNG, lme4summary.PNG — screenshots of the two model summaries]