ezANOVA error message


Kitti Ban

Jul 23, 2018, 4:12:33 PM
to ez4r
Hi, 

I'm trying to run a 2 (between-subjects) x 3 (within-subjects) x 3 (within-subjects) ANOVA; however, I keep getting the following messages:

Warning: Converting "ID" to factor for ANOVA.
Warning: You have removed one or more levels from variable "offered". Refactoring for ANOVA.
Warning: Data is unbalanced (unequal N per group). Make sure you specified a well-considered value for the type argument to ezANOVA().
Error in ezANOVA_main(data = data, dv = dv, wid = wid, within = within,  : 
  One or more cells is missing data. Try using ezDesign() to check your data.


I used the following command: 

proposer_anova <- ezANOVA(data = proposer_data_frame, 
                   dv = .(percentage_offered), 
                   wid = .(ID), 
                   between = .(native),
                   within = .(trustworthiness, offered),
                   type = 2, 
                   detailed = TRUE, 
                   return_aov = TRUE)

Thank you for any help with this issue; I'm really stuck on it.

All the best,
Kitti 

This is my data file:


ID  native  offered  trustworthiness  percentage_offered  mean_reaction_time
1  TRUE  ambiguous  low  13.64  2480.000
1  TRUE  ambiguous  middle  8.16  1809.000
1  TRUE  fair  middle  8.16  1671.250
1  TRUE  unfair  high  100.00  2765.000
1  TRUE  unfair  low  86.36  1673.158
1  TRUE  unfair  middle  83.67  3181.220
2  TRUE  ambiguous  low  22.22  6877.500
2  TRUE  ambiguous  middle  7.84  6451.250
2  TRUE  fair  high  100.00  2044.583
2  TRUE  fair  low  77.78  6601.429
2  TRUE  fair  middle  92.16  2736.766
3  TRUE  ambiguous  high  33.33  3880.000
3  TRUE  ambiguous  low  44.00  2713.091
3  TRUE  ambiguous  middle  78.95  3080.833
3  TRUE  fair  high  66.67  3333.000
3  TRUE  fair  low  16.67  1830.000
3  TRUE  fair  middle  7.89  4561.667
3  TRUE  unfair  low  52.00  2886.538
3  TRUE  unfair  middle  13.16  2544.800
4  FALSE  fair  high  100.00  1026.000
4  FALSE  fair  middle  100.00  1499.831
5  FALSE  ambiguous  high  50.00  2536.111
5  FALSE  ambiguous  low  50.00  4336.500
5  FALSE  ambiguous  middle  48.00  2470.292
5  FALSE  fair  high  38.89  5534.429
5  FALSE  fair  low  33.33  2602.000
5  FALSE  fair  middle  42.00  2835.810
5  FALSE  unfair  high  11.11  3501.500
5  FALSE  unfair  low  50.00  14762.000
5  FALSE  unfair  middle  10.00  8654.800
6  FALSE  ambiguous  low  65.00  2518.923
6  FALSE  ambiguous  middle  65.38  1820.794
6  FALSE  fair  low  35.00  1925.571
6  FALSE  fair  middle  32.69  1723.706
6  FALSE  unfair  middle  5.00  1225.000
7  TRUE  fair  middle  100.00  1841.194
8  FALSE  ambiguous  low  28.26  2735.462
8  FALSE  ambiguous  middle  50.00  2728.308
8  FALSE  fair  low  19.57  2500.889
8  FALSE  fair  middle  23.08  2181.333
8  FALSE  unfair  low  52.17  2283.500
8  FALSE  unfair  middle  26.92  1856.857
9  TRUE  ambiguous  middle  7.69  6107.500
9  TRUE  fair  high  100.00  3801.450
9  TRUE  fair  middle  92.31  3537.917
10  TRUE  ambiguous  high  12.50  1224.000
10  TRUE  ambiguous  low  12.50  1926.000
10  TRUE  ambiguous  middle  15.62  2480.700
10  TRUE  fair  high  63.16  3705.500
10  TRUE  fair  low  78.57  1822.455
10  TRUE  fair  middle  64.06  2123.024
10  TRUE  unfair  high  31.58  1403.500
10  TRUE  unfair  low  14.29  2635.500
10  TRUE  unfair  middle  20.31  2584.154
11  FALSE  ambiguous  high  35.71  3313.000
11  FALSE  ambiguous  low  50.00  1625.333
11  FALSE  ambiguous  middle  55.77  2853.793
11  FALSE  fair  high  42.86  1936.833
11  FALSE  fair  low  20.00  2824.000
11  FALSE  fair  middle  17.31  2190.667
11  FALSE  unfair  high  21.43  2444.667
11  FALSE  unfair  low  33.33  1684.000
11  FALSE  unfair  middle  26.92  2245.214
12  TRUE  fair  high  100.00  1821.000
12  TRUE  fair  middle  100.00  1833.843
13  TRUE  fair  high  100.00  1427.263
13  TRUE  fair  middle  100.00  1189.265
14  FALSE  ambiguous  high  11.11  5896.000
14  FALSE  ambiguous  low  23.53  5764.750
14  FALSE  ambiguous  middle  36.96  5615.294
14  FALSE  fair  high  88.89  4037.750
14  FALSE  fair  low  23.53  4724.000
14  FALSE  fair  middle  45.65  6287.095
14  FALSE  unfair  low  52.94  5821.333
14  FALSE  unfair  middle  17.39  4549.250
15  FALSE  ambiguous  high  12.00  3225.000
15  FALSE  ambiguous  low  50.00  5252.000
15  FALSE  ambiguous  middle  14.29  7657.500
15  FALSE  fair  high  88.00  2435.591
15  FALSE  fair  middle  78.57  3376.182
15  FALSE  unfair  low  80.00  5826.750
15  FALSE  unfair  middle  7.14  5223.000
16  FALSE  ambiguous  high  25.00  2839.000
16  FALSE  ambiguous  low  100.00  1801.000
16  FALSE  ambiguous  middle  33.33  3898.150
16  FALSE  fair  high  50.00  3965.333
16  FALSE  fair  low  25.00  5146.000
16  FALSE  fair  middle  36.67  4196.682
16  FALSE  unfair  high  33.33  3649.500
16  FALSE  unfair  low  66.67  3424.500
16  FALSE  unfair  middle  30.00  3738.389
17  FALSE  ambiguous  high  100.00  3310.000
17  FALSE  ambiguous  low  61.54  1347.375
17  FALSE  ambiguous  middle  58.14  1817.000
17  FALSE  fair  high  100.00  1677.000
17  FALSE  fair  low  15.38  2290.750
17  FALSE  fair  middle  25.58  2057.182
17  FALSE  unfair  high  33.33  1780.000
17  FALSE  unfair  low  23.08  2096.167
17  FALSE  unfair  middle  16.28  3682.000
18  TRUE  fair  high  100.00  1110.143
18  TRUE  fair  low  100.00  1908.000
18  TRUE  fair  middle  100.00  1608.053
19  TRUE  ambiguous  high  100.00  4173.000
19  TRUE  ambiguous  low  16.67  8632.000
19  TRUE  ambiguous  middle  7.14  3792.667
19  TRUE  fair  high  91.67  2141.136
19  TRUE  fair  low  83.33  1894.600
19  TRUE  fair  middle  92.86  2181.385
19  TRUE  unfair  high  11.11  4211.000
20  FALSE  fair  high  100.00  1546.083
20  FALSE  fair  low  100.00  1537.600
20  FALSE  fair  middle  100.00  2037.436
21  FALSE  ambiguous  high  14.29  2134.000
21  FALSE  ambiguous  low  100.00  2503.000
21  FALSE  ambiguous  middle  21.67  2470.923
21  FALSE  fair  high  92.31  1500.083
21  FALSE  fair  middle  76.67  1890.457
21  FALSE  unfair  middle  11.11  2100.000
22  TRUE  ambiguous  high  31.58  3251.500
22  TRUE  ambiguous  low  35.00  13025.286
22  TRUE  ambiguous  middle  48.48  2529.688
22  TRUE  fair  high  57.89  2099.000
22  TRUE  fair  middle  24.24  2385.500
22  TRUE  unfair  high  10.53  2490.500
22  TRUE  unfair  low  65.00  4453.923
22  TRUE  unfair  middle  27.27  2053.444
23  TRUE  ambiguous  low  100.00  2728.000
23  TRUE  ambiguous  middle  30.43  3448.857
23  TRUE  fair  high  100.00  2114.000
23  TRUE  fair  middle  68.12  3374.660
23  TRUE  unfair  middle  3.57  9638.000
24  TRUE  fair  high  100.00  1359.619
24  TRUE  fair  middle  100.00  1381.433
25  TRUE  ambiguous  high  33.33  5585.000
25  TRUE  ambiguous  low  59.26  4978.500
25  TRUE  ambiguous  middle  18.42  4096.143
25  TRUE  fair  high  85.71  3759.500
25  TRUE  fair  low  40.74  3209.364
25  TRUE  fair  middle  81.58  4489.710
26  FALSE  ambiguous  high  34.48  5026.600
26  FALSE  ambiguous  low  30.77  4018.750
26  FALSE  ambiguous  middle  40.00  4121.417
26  FALSE  fair  high  44.83  4283.231
(144 of 473 rows shown)

Mike Lawrence

Jul 23, 2018, 6:20:39 PM
to ez...@googlegroups.com
Run the following and send a copy of the resulting plot:

ezDesign(
    data = proposer_data_frame,
    y = ID,
    x = native,
    row = trustworthiness,
    col = offered
 )

Kitti Ban

Jul 25, 2018, 1:02:37 PM
to ez4r
Hi Mike, 

Thank you for your reply, and sorry for the delay; I had some other problems with R that I needed to sort out before I could run the code. I've attached the resulting graph.

Thanks again, 
Kitti
Rplot.png

Mike Lawrence

Jul 25, 2018, 1:33:25 PM
to ez...@googlegroups.com
Ok, it looks like you have missing data. If you run this:

ezDesign(
    data = proposer_data_frame, 
    y = ID, 
    x = offered,
    row = trustworthiness
 )

You'll notice that some rows have blank (grey) areas; those are the condition combinations where a participant doesn't have any data.
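
One way to list those empty cells directly is a small base-R check (a minimal sketch, assuming the same proposer_data_frame as above):

# Count observations per ID x offered x trustworthiness cell;
# rows with Freq == 0 are the combinations with no data.
cell_counts <- as.data.frame(
    with(proposer_data_frame, table(ID, offered, trustworthiness))
)
subset(cell_counts, Freq == 0)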

--
Mike Lawrence
Graduate Student
Department of Psychology & Neuroscience
Dalhousie University

~ Certainty is (possibly) folly ~



Kitti Ban

Jul 25, 2018, 1:37:24 PM
to ez4r
Thank you for your fast reply! Do you know any way I could run the ANOVA despite the missing combinations?

Mike Lawrence

Jul 25, 2018, 6:22:44 PM
to ez...@googlegroups.com
No, it is not possible to use ANOVA when data are missing. See here for discussion of alternatives: https://groups.google.com/d/msg/ez4r/PbtPgKtH_zI/XWZhhXQWAwAJ
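
The alternatives discussed there are mixed-effects models, which tolerate missing cells. A minimal sketch with lme4 (assuming the same proposer_data_frame; the random-intercept structure is only a starting point):

# Mixed-model sketch: fixed effects for the design factors,
# random intercept per participant.
library(lme4)
fit_lmm <- lmer(
    percentage_offered ~ native * offered * trustworthiness + (1 | ID),
    data = proposer_data_frame
)
summary(fit_lmm)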


--
Mike Lawrence
Graduate Student
Department of Psychology & Neuroscience
Dalhousie University

~ Certainty is (possibly) folly ~

Kitti Ban

Jul 25, 2018, 9:44:44 PM
to ez4r
Thank you for the suggestion and all of the help!  

Kitti Ban

Jul 25, 2018, 10:04:28 PM
to ez4r
I started following the instructions given in that post and wrote the following: 

# between-subjects variable
contrasts(proposer_data_frame$native) = contr.sum

# within-subjects variables
contrasts(proposer_data_frame$offered) = contr.helmert
contrasts(proposer_data_frame$trustworthiness) = contr.helmert

# ezMixed

fit = ezMixed(data = proposer_data_frame[!is.na(proposer_data_frame$percentage_offered),],
  dv=.(percentage_offered), 
  random=.(ID), 
  fixed=.(offered, native, trustworthiness))
  
# Post-hoc
preds = ezPredict(fit$models$`native:offered:trustworthiness`$unrestricted)


For the last line, I got the following error message: 
Error in ezPredict(fit$models$`native:offered`$unrestricted) : 
  ezPredict does not know how to handle fits of class "lmerModLmerTest"

Could you please help me with this analysis? 

Thank you once again,
Kitti




Mike Lawrence

Jul 26, 2018, 9:53:01 AM
to ez...@googlegroups.com
Is it possible that you loaded the package "lmerTest"? It overrides things that ez uses behind the scenes. Try running this code to load an updated version of ezPredict and let me know if it works:

ezPredict <-
function(
    fit
    , to_predict = NULL
    , numeric_res = 0
    , boot = TRUE
    , iterations = 1e3
    , zero_intercept_variance = FALSE
){
    fit_class = class(fit)[1]
    if((fit_class=='mer')|(fit_class=='glmerMod')|(fit_class=='lmerMod')|(fit_class=='lmerModLmerTest')){
        data = attr(fit,'frame')
        vars = as.character(attr(data,'terms'))
        dv = vars[2]
        vars = gsub('\\(.+?\\) ?\\+','',vars[3])
        vars = gsub('\\+ ?\\(.+?\\)','',vars)
        # vars = gsub('\\(.+\\)','',vars[3])
        vars = unlist(strsplit(vars,'+',fixed=T))
        vars = str_replace_all(vars,' ','')
        vars = vars[nchar(vars)>0]
        these_terms = vars
        vars = vars[!str_detect(vars,':')]
        vars = unlist(strsplit(vars,'*',fixed=T))
        # vars = as.character(attr(attr(data,'terms'),'variables'))
        # dv = as.character(vars[2])
        # vars = vars[3:length(vars)]
    }else{
        if(fit_class%in%c('gam','bam')){
            data = fit$model
            randoms = NULL
            for(i in fit$smooth){
                if(class(i)[1]=='random.effect'){
                    randoms = c(randoms,i$term)
                }
            }
            vars = as.character(attr(attr(data,'terms'),'variables'))
            dv = as.character(vars[2])
            vars = vars[3:length(vars)]
            vars = vars[!(vars%in%randoms)]
            BY = vars[str_detect(vars,'BY')]
            vars = vars[!str_detect(vars,'BY')]
        }else{
            stop(paste('ezPredict does not know how to handle fits of class "',fit_class,'"',sep=''))
        }
    }
    if(is.null(to_predict)){
        if(length(grep('poly(',vars,fixed=TRUE))>0){
            stop('Cannot auto-create "to_predict" when the fitted model contains poly(). Please provide a data frame to the "to_return" argument.')
        }
        data_vars = vars[grep('I(',vars,fixed=T,invert=T)]
        temp = list()
        for(i in 1:length(data_vars)){
            this_fixed_data = data[,names(data)==data_vars[i]]
            if(is.numeric(this_fixed_data)&(numeric_res>0)){
                temp[[i]] = seq(
                    min(this_fixed_data)
                    , max(this_fixed_data)
                    , length.out=numeric_res
                )
            }else{
                temp[[i]] = sort(unique(this_fixed_data))
                if(!is.numeric(this_fixed_data)){
                    contrasts(temp[[i]]) = contrasts(this_fixed_data)
                }
            }
        }
        to_return = data.frame(expand.grid(temp))
        names(to_return) = data_vars
    }else{
        to_return = to_predict
    }
    data_vars = names(to_return)
    if(fit_class%in%c('gam','bam')){
        for(i in randoms){
            to_return$EZTEMP = data[1,names(data)==i]
            names(to_return)[ncol(to_return)] = i
        }
        for(i in BY){
            to_return$EZTEMP = ''
            for(j in str_split(i,'BY')[[1]]){
                to_return$EZTEMP = paste(to_return$EZTEMP,as.character(to_return[,names(to_return)==j]),sep='')
            }
            to_return$EZTEMP = ordered(to_return$EZTEMP)
            names(to_return)[ncol(to_return)] = i
        }
    }
    to_return$ezDV = 0
    names(to_return)[ncol(to_return)] = dv
    if((fit_class=='mer')|(fit_class=='glmerMod')|(fit_class=='lmerMod')|(fit_class=='lmerModLmerTest')){
        requested_terms = terms(eval(parse(text=paste(
            dv
            , '~'
            , paste(
                these_terms#attr(attr(data,'terms'),'term.labels')
                , collapse = '+'
            )
        ))))
        mm = model.matrix(requested_terms,to_return)
        f = lme4::fixef(fit)
        v = vcov(fit)
        if(zero_intercept_variance){
            v[1,] = 0
            v[,1] = 0
        }
    }else{
        mm <- predict(fit,to_return,type="lpmatrix") # get a coefficient matrix
        for(i in randoms){
            mm[,grep(paste('s(',i,')',sep=''),dimnames(mm)[[2]],fixed=T)] = 0 #zero the subject entry
        }
        f = coef(fit)
        for(i in randoms){
            f[grep(paste('s(',i,')',sep=''),names(f),fixed=T)] = 0 #zero the subject entry
        }
        v = vcov(fit)
        if(zero_intercept_variance){
            v[1,] = 0
            v[,1] = 0
        }
        for(i in randoms){
            row = grep(paste('s(',i,')',sep=''),dimnames(v)[[1]],fixed=T)
            col = grep(paste('s(',i,')',sep=''),dimnames(v)[[2]],fixed=T)
            v[row,] = 0
            v[,col] = 0
        }
    }
    value = mm %*% f
    to_return$value = as.numeric(value[,1])
    tc = Matrix::tcrossprod(v,mm)
    to_return$var = Matrix::diag(mm %*% tc)
    to_return = to_return[,names(to_return) %in% c(data_vars,'value','var')]
    if(boot){
        samples = mvrnorm(iterations,f,v)
        mat = matrix(NA,nrow=nrow(to_return),ncol=iterations)
        for(i in 1:iterations){
            mat[,i] <- mm%*%samples[i,]
        }
        boots = as.data.frame(to_return[,names(to_return) %in% data_vars])
        names(boots) = data_vars
        boots = cbind(boots,as.data.frame(mat))
        boots = melt(
            data = boots
            , id.vars = names(boots)[1:(ncol(boots)-iterations)]
            , variable.name = 'iteration'
        )
        to_return = list(
            cells = to_return
            , boots = boots
        )
    }
    return(to_return)
}
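
A short usage sketch for the function above (assumptions: the fit object from the earlier ezMixed call, and that stringr, reshape2 and MASS are installed; the pasted function calls str_replace_all(), str_detect(), melt() and mvrnorm(), which ez normally resolves internally via its imports, but a copy defined in the global environment needs them attached):

# Attach the packages the pasted ezPredict() relies on.
library(stringr)   # str_replace_all(), str_detect()
library(reshape2)  # melt()
library(MASS)      # mvrnorm()

# Optionally drop the lmerTest override mentioned above.
if ("package:lmerTest" %in% search()) detach("package:lmerTest", unload = TRUE)

# Retry the earlier prediction call with the patched function.
preds = ezPredict(fit$models$`native:offered:trustworthiness`$unrestricted)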



Kitti Ban

Jul 26, 2018, 11:20:45 AM
to ez4r
Indeed, I had the lmerTest package loaded. However, when I detached it, I received the error message "Error in str_replace_all(vars, " ", "") : could not find function "str_replace_all"". The function you sent ran without any warnings or errors; however, I'm not entirely sure what it does. Am I supposed to be looking for a table similar to the summary table of an ANOVA?