Mixed Linear Model prediction

Tarik Luisman

Mar 27, 2017, 11:08:13 AM

to pystatsmodels

Hi all,

I'm trying to predict an outcome with a MixedLM model fitted on a training set that contains 2/3 of the measurements per group. I want to predict the remaining 1/3 of the data and calculate the mean squared error, so that I can compare different models on the same dataset.
In R I can just call predict() with my fitted training model and the test dataset as arguments, and it gives me predictions.
I saw that there is a predict function in the MixedLM documentation (http://statsmodels.sourceforge.net/0.6.0/generated/statsmodels.regression.mixed_linear_model.MixedLM.predict.html), but it doesn't seem to work yet.

This is why I want to calculate it manually by using the individual intercepts of the subjects, but I can't seem to find any place/variable where this is stored while running the model.
Can anyone point me in the direction on where to find these intercepts? Or is there another way to use a predict function for MixedLM?

Thank you in advance,
Tarik Luisman

josef...@gmail.com

Mar 27, 2017, 11:14:14 AM

to pystatsmodels

I don't know the specific answer, but the new documentation is at www.statsmodels.org

predict and similar should be accessed through the results instance

The random effects that you want might be here http://www.statsmodels.org/stable/generated/statsmodels.regression.mixed_linear_model.MixedLMResults.random_effects.html
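For what it's worth, here is a minimal sketch of reading the per-group random effects off a fitted results instance. The data, the column names (`y`, `x`, `subject`) and the seed are made up for illustration; the statsmodels pieces are `smf.mixedlm`, `fit()`, `params` and `random_effects`. It assumes a recent statsmodels release, where `random_effects` is a dict mapping each group label to a pandas Series; older releases may return something different.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical): 20 subjects, 6 measurements each,
# generated with a subject-specific random intercept.
rng = np.random.default_rng(0)
n_subjects, n_obs = 20, 6
subject = np.repeat(np.arange(n_subjects), n_obs)
x = rng.normal(size=subject.size)
true_intercepts = rng.normal(scale=1.0, size=n_subjects)
y = 1.0 + 0.5 * x + true_intercepts[subject] + rng.normal(scale=0.3, size=subject.size)
df = pd.DataFrame({"y": y, "x": x, "subject": subject})

# Random-intercept model: one intercept deviation per subject.
result = smf.mixedlm("y ~ x", df, groups=df["subject"]).fit()

# Predicted random effects, keyed by group label. With only a random
# intercept, each entry holds a single value.
deviations = {g: float(np.asarray(re)[0]) for g, re in result.random_effects.items()}

# Subject-specific intercept = fixed intercept + predicted deviation.
fixed_intercept = float(result.params["Intercept"])
subject_intercepts = {g: fixed_intercept + d for g, d in deviations.items()}
print(len(subject_intercepts))
```

The deviations are relative to the fixed intercept, so the fixed intercept has to be added back to get a per-subject intercept.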

Josef

Tarik Luisman

Mar 28, 2017, 5:59:22 AM

to pystatsmodels

Hi Josef,

Thanks for your quick response. I've looked into the predict function, but it raises an error if I try to pass another dataframe into it:

Traceback (most recent call last):
  File "<ipython-input-18-8c20f6b44e40>", line 1, in <module>
    mod_lme.predict(datatest)
  File "C:\Users\533018\AppData\Local\Continuum\lib\site-packages\statsmodels\base\model.py", line 749, in predict
    return self.model.predict(self.params, exog, *args, **kwargs)
  File "C:\Users\533018\AppData\Local\Continuum\lib\site-packages\statsmodels\base\model.py", line 177, in predict
    raise NotImplementedError
NotImplementedError

This is why I thought it was not implemented yet.
mod_lme is the model fitted on the training set and datatest is the test set dataframe.

How come some of the methods of MixedLMResults don't work?
fittedvalues, for example, gives me an attribute error:
'MixedLMResults' object has no attribute 'fittedvalues'

Thanks for all the help!

Yajuan Wang

May 22, 2017, 3:14:06 PM

to pystatsmodels

I have a similar problem. Is there a solution to it yet?

Thanks,

Yajuan

josef...@gmail.com

May 22, 2017, 3:52:17 PM

to pystatsmodels

can you provide a failing example?

https://github.com/statsmodels/statsmodels/blame/master/statsmodels/regression/mixed_linear_model.py#L935

The code shows the predict method, and it works when I try a quick example with variance components from the test suite, or at least it doesn't raise an exception. (I didn't try to check what it does.)

>>> result.fittedvalues.iloc[0:4]
0   -0.100235
1    0.028168
2   -0.221646
3   -0.124647
dtype: float64

>>> result.predict(df.iloc[0:4])
0    0.013580
1    0.008197
2    0.018671
3    0.014604
dtype: float64

>>> result.model.predict(result.params, result.model.exog[:4])
array([ 0.01358046, 0.00819696, 0.01867076, 0.01460395])

Check against the linear prediction from the fixed effects only:

>>> result.model.exog[:4].dot(result.params.values[:2])
array([ 0.01358046, 0.00819696, 0.01867076, 0.01460395])

Josef

josef...@gmail.com

May 22, 2017, 4:39:26 PM

to pystatsmodels

Looking a bit more:

`fittedvalues` includes the predicted random effects: https://github.com/statsmodels/statsmodels/blob/master/statsmodels/regression/mixed_linear_model.py#L2107

But I don't see a method that returns new predictions for a group including its predicted random effect, so out-of-sample prediction with predict uses the fixed effects only.
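A rough sketch of doing this by hand for a random-intercept model: take the fixed-effects prediction from `predict` and add each group's predicted random intercept from `random_effects`. Everything about the data here (column names `y`, `x`, `subject`, the 6/3 split per subject, the seed) is invented for illustration, and the approach as written only covers a single random intercept per group, not random slopes or variance components.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical): 30 subjects, 9 measurements each.
rng = np.random.default_rng(42)
n_subjects, n_obs = 30, 9
subject = np.repeat(np.arange(n_subjects), n_obs)
x = rng.normal(size=subject.size)
u = rng.normal(scale=1.0, size=n_subjects)  # true random intercepts
y = 2.0 + 0.7 * x + u[subject] + rng.normal(scale=0.3, size=subject.size)
df = pd.DataFrame({"y": y, "x": x, "subject": subject})

# Train on the first 2/3 of each subject's measurements, test on the rest.
position = df.groupby("subject").cumcount()
train = df[position < 6]
test = df[position >= 6]

result = smf.mixedlm("y ~ x", train, groups=train["subject"]).fit()

# predict() gives the fixed-effects part only.
pred_fixed = result.predict(test)

# Add the predicted random intercept of each test row's subject.
# Subjects not seen in training get 0 (fixed-effects prediction only).
re_intercept = {g: float(np.asarray(re)[0]) for g, re in result.random_effects.items()}
pred_mixed = pred_fixed + test["subject"].map(re_intercept).fillna(0.0)

mse_fixed = float(np.mean((test["y"] - pred_fixed) ** 2))
mse_mixed = float(np.mean((test["y"] - pred_mixed) ** 2))
print(mse_fixed, mse_mixed)
```

Applied to the training rows, the same sum should reproduce `fittedvalues`, which is a cheap sanity check before trusting the out-of-sample numbers.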

Josef

Yajuan Wang

May 22, 2017, 5:45:53 PM

to pystatsmodels

Thanks for the quick response. I've attached an HTML export of a simple Jupyter notebook.

Thanks,

Yajuan

testing.html

josef...@gmail.com

May 22, 2017, 6:43:42 PM

to pystatsmodels

Based on the commit when predict was added, it wasn't included in statsmodels 0.6.1.

Can you upgrade to 0.8 and try again?
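A quick way to see which version is actually being imported (useful when several Python environments are installed):

```python
import statsmodels

# Version of the statsmodels package that this interpreter imports.
installed = statsmodels.__version__
print(installed)
```

If this prints 0.6.1 or older, upgrading with `pip install --upgrade statsmodels` (or the conda equivalent) is the first thing to try.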

I'm not sure about your second predict problem. It looks like patsy doesn't accept a pandas Series in predict because the Series has no column named "Time". I don't know whether this has been made more general or whether you have to pass a dict or DataFrame to predict. (A pandas Series didn't, or doesn't, have a column name that patsy can use to select the variable.)
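As a sketch of the DataFrame route, assuming a model fitted with a formula that uses a column called "Time": wrap the new values in a DataFrame whose column name matches the formula term before calling predict. The data, the `group` column, and the seed are hypothetical; only the "Time" name comes from the notebook.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic longitudinal data (hypothetical): 15 groups, 5 time points each.
rng = np.random.default_rng(1)
n_groups, n_obs = 15, 5
group = np.repeat(np.arange(n_groups), n_obs)
time = np.tile(np.arange(n_obs, dtype=float), n_groups)
u = rng.normal(size=n_groups)
y = 1.0 + 0.4 * time + u[group] + rng.normal(scale=0.2, size=group.size)
df = pd.DataFrame({"y": y, "Time": time, "group": group})

result = smf.mixedlm("y ~ Time", df, groups=df["group"]).fit()

# A bare Series carries no column name for patsy to look up, so turn it
# into a one-column DataFrame named after the formula term.
new_times = pd.Series([5.0, 6.0, 7.0])
new_data = new_times.to_frame(name="Time")

# Fixed-effects prediction for the new time points.
pred = result.predict(new_data)
print(pred)
```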

It's difficult to read an HTML file without a stylesheet.

Josef

Yajuan Wang

May 23, 2017, 3:37:58 PM

to pystatsmodels

Thanks for the suggestions, Josef! I upgraded statsmodels, and now it works. Sorry about the HTML format.

Yajuan
