using "ezMixed" to deal with missing data in a mixed ANOVA

kaiyu.m...@gmail.com

unread,

Jul 9, 2018, 6:09:20 PM7/9/18

to ez4r

Thank you Mike Lawrence for developing this great tool!

Now I have a question about running a 2x7 mixed ANOVA with missing values.

In order to run ANOVA with "ezANOVA", I removed some participants, due to missing values. The code was:

ezANOVA(data = mixed_anova_data, dv=.(PERCLOS.value), wid=.(Participant), within=.(Time), between=.(Group), detailed=T, type=3)

1) After looking at this post, I was wondering how can I conduct a mixed ANOVA with "ezMixed" without removing participants. My data is attached as Rdata format.

2) Could you please explain a bit more about how the missing data is dealt with, in the "ezMixed"?

Thanks and looking forward to your reply.

Best regards,

Kaiyu

mixed_anova_data.Rdata

Mike Lawrence

unread,

Jul 9, 2018, 6:26:16 PM7/9/18

to ez...@googlegroups.com

ezMixed provides an interface to fitting/evaluating mixed effects models (a.k.a. hierarchical models), which are different from mixed anovas. They are generally superior to anova and permit missing data. It's a little involved to explain precisely how they achieve inference amidst missing data, but if you should be able to find explanations if you search "mixed effects models and missing data".

Below is how you'd go about using ezMixed. Note that when you have a variable like Time that you've turned into a 7-level factor, the first model will treat that variable categorically (i.e. it doesn't "see" the ordering of levels nor their equal spacing), which generally has the risk of losing power to detect linear effects. The code below shows how to run a second version of the analysis that treats Time properly as a continuous variable, employing generalized additive modelling to estimate a possibly-non-linear function. It actually ends up deciding for your case that a linear function fits best, but behind the scenes it tries a bunch of wiggly functions, so you should trust that the linear is the best prediction.

Note also that both these versions of ezMixed use yield what Barr et al (2013, 10.1016/j.jml.2012.11.001) would call "non-maximal" in that they only model a random effect of participants on the intercept (in other words, participants vary from one another in their overall mean, but manifest identical effects of Time), which isn't ideal. The last example demonstrates how to do a maximal model (this time imposing an assumption of linearity on the effects of Time) manually using lmer directly.

#load libraries
library(lme4)
library(mgcv)
library(ez)
library(ggplot2)

#make sure "Group" has sum contrasts
contrasts(mixed_anova_data$Group) = contr.sum

#make sure "Time" has helmert contrasts
contrasts(mixed_anova_data$Time) = contr.helmert

#fit mixed effects models, obtaining likelihood ratios for each fixed effect
# note: since "Time" is a factor, it is treated as a categorical variable
fit = ezMixed(
    data = mixed_anova_data[!is.na(mixed_anova_data$PERCLOS.value),]
    , dv=.(PERCLOS.value)
    , random=.(Participant)
    , fixed=.(Time,Group)
)

#get posterior predictions from the maximal model
preds = ezPredict(fit$models$`Group:Time`$unrestricted)

#plot all groups
ezPlot2(
    preds = preds
    , x = Time
    , split = Group
)

#plot group differences
ezPlot2(
    preds = preds
    , x = Time
    , diff = Group
)

#Turn "Time" into a number to run a GAMM
mixed_anova_data$Time2 = as.numeric(as.character(mixed_anova_data$Time))

#fits generally converge better and posterior plots look nicer if numeric variables are scaled
mixed_anova_data$Time3 = scale(mixed_anova_data$Time2)

#fit mixed effects models, obtaining likelihood ratios for each fixed effect
# note: since "Time3" is a number, a generalized additive mixed model will be fit
fit2 = ezMixed(
    data = mixed_anova_data[!is.na(mixed_anova_data$PERCLOS.value),]
    , dv=.(PERCLOS.value)
    , random=.(Participant)
    , fixed=.(Time3,Group)
)

#get posterior predictions from the maximal model
preds2 = ezPredict(fit2$models$`Group:Time3`$unrestricted)

#unscale Time
preds2$cells$Time = preds2$cells$Time3*sd(mixed_anova_data$Time2)+mean(mixed_anova_data$Time2)
preds2$boots$Time = preds2$boots$Time3*sd(mixed_anova_data$Time2)+mean(mixed_anova_data$Time2)

#plot all groups
ezPlot2(
    preds = preds2
    , x = Time
    , split = Group
)

#plot group differences
ezPlot2(
    preds = preds2
    , x = Time
    , diff = Group
)

#Estimate a truly "maximal" mixed effects model by hand
# Note: this treats time as having LINEAR effects
fit3 = lmer(
    data = mixed_anova_data[!is.na(mixed_anova_data$PERCLOS.value),]
    , formula = PERCLOS.value ~ Time3*Group + (1+Time3*Group|Participant)
)

#get posterior predictions from the maximal model
preds3 = ezPredict(fit3)

#unscale Time
preds3$cells$Time = preds3$cells$Time3*sd(mixed_anova_data$Time2)+mean(mixed_anova_data$Time2)
preds3$boots$Time = preds3$boots$Time3*sd(mixed_anova_data$Time2)+mean(mixed_anova_data$Time2)

#plot all groups
ezPlot2(
    preds = preds3
    , x = Time
    , split = Group
)

#plot group differences
ezPlot2(
    preds = preds3
    , x = Time
    , diff = Group
)

kaiyu.m...@gmail.com

unread,

Jul 9, 2018, 8:20:23 PM7/9/18

to ez4r

Hi Mike, thank you so much for your quick and detailed reply! I think I need a bit more reading to fully understand the theories, since I am quite new to statistics.

I have another 2 questions here:

1) since my data violates the assumptions of a mixed ANOVA, I also conducted a non-parametric test using ezPerm in the package. Is "Time" treated as a 7-level factor here in the permutation test?

ezPerm(data = mixed_anova_data, dv=.(PERCLOS.value), wid=.(Participant), within=.(Time), between=.(Group))

2) in the "Warning" it reads "ezPerm() is a work in progress. Under the current implementation, only main effects may be trusted." If it is up to date, the interaction effect results are not valid. Am I right?

Regards,

Kaiyu

Mike Lawrence

unread,

Jul 9, 2018, 9:03:40 PM7/9/18

to ez...@googlegroups.com

On Mon, Jul 9, 2018 at 9:20 PM <kaiyu.m...@gmail.com> wrote:

1) since my data violates the assumptions of a mixed ANOVA, I also conducted a non-parametric test using ezPerm in the package. Is "Time" treated as a 7-level factor here in the permutation test?
ezPerm(data = mixed_anova_data, dv=.(PERCLOS.value), wid=.(Participant), within=.(Time), between=.(Group))

Yes, time is being treated as a categorical predictor in that analysis.

2) in the "Warning" it reads "ezPerm() is a work in progress. Under the current implementation, only main effects may be trusted." If it is up to date, the interaction effect results are not valid. Am I right?

Correct. I should probably just take ezPerm out of ez as I never really put much work into making sure it's doing the right thing.

Reply all

Reply to author

Forward