Trouble with ezStats

Bill Altermatt

Aug 19, 2011, 1:45:31 PM
to ez...@googlegroups.com
Hi, Mike,

Love the program but am having trouble with descriptive statistics via ezStats.  Using the attached data (GoodSamaritan), ezANOVA works:

ezANOVA(
    data = GoodSamaritan
    , dv = .(helping)
    , wid = .(subj)
    , between = .(time.pressure,messag)
    , type = 3)

but ezStats fails:

ezStats(
    data = GoodSamaritan
    , dv = .(helping)
    , wid = .(subj)
    , between = .(time.pressure,messag)
    , type = 3
    )

with this error message:
Warning: Data is unbalanced (unequal N per group). Make sure you specified a well-considered value for the type argument to ezANOVA().
Error in names(x) <- value :
  'names' attribute [8] must be the same length as the vector [7]
Error in ezANOVA_main(data = data, dv = dv, wid = wid, within = within,  :
  The car::Anova() function used to compute results and assumption tests seems to have failed. Most commonly this is because you have too few subjects relative to the number of cells in the within-Ss design. It is possible that trying the ANOVA again with "type=1" may yield results (but definitely no assumption tests).


I've tracked this down a little and it seems like ezStats calls ezANOVA_main, which calls ezANOVA_summary in this line:

try(to_return <- ezANOVA_summary(Anova(wide_lm$lm, idata = wide_lm$idata,
    type = 2, idesign = eval(parse(text = wide_lm$idesign_formula)))))

The problem seems to be that the argument for ezANOVA_summary is an anova object, and does not contain some of the elements that ezANOVA_summary asks for.  For example, in ezANOVA_summary:

nterms <- length(object$terms)

has trouble because there is no $terms element in the anova object.  The structure of the anova object in this case is:

Classes 'anova' and 'data.frame':    4 obs. of  4 variables:
 $ Sum Sq : num  33.34 2.14 3.29 57.12
 $ Df     : num  2 1 2 34
 $ F value: num  9.92 1.27 0.98 NA
 $ Pr(>F) : num  0.000403 0.267126 0.385743 NA
 - attr(*, "heading")= chr  "Anova Table (Type II tests)\n" "Response: helping"

nterms is thus 0, which causes trouble in these lines:

table <- data.frame(matrix(0, nterms, 8))   # 'table' is now 8 columns wide with no rows
table3[, 1] <- table2[, 1] <- table[, 1] <- object$terms    # 'table' is now just 7 columns wide because object$terms is NULL

And here is where the error occurs:

colnames(table) <- c("Effect", "DFn", "DFd", "SSn", "SSd", "F",
    "p", "p<.05")

Because 'table' is just 7 columns wide now, those 8 colnames don't fit.
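The chain of events described above can be reproduced in base R without ez or car. This is a minimal sketch, assuming only what the post states: the object handed to ezANOVA_summary is a data frame with no $terms element. The names anova_like and err are hypothetical.

```r
# Hypothetical stand-in for the 'anova' object: a data frame with no $terms.
anova_like <- data.frame(Df = c(2, 1, 2, 34))
class(anova_like) <- c("anova", "data.frame")

nterms <- length(anova_like$terms)          # $terms is NULL, so nterms is 0

table <- data.frame(matrix(0, nterms, 8))   # 0 rows, 8 columns
table[, 1] <- anova_like$terms              # assigning NULL drops column 1

# Assigning 8 names to the remaining 7 columns fails; capture the message.
err <- tryCatch(
    {
        colnames(table) <- c("Effect", "DFn", "DFd", "SSn", "SSd", "F",
            "p", "p<.05")
        NA_character_
    },
    error = function(e) conditionMessage(e)
)

print(ncol(table))
print(err)
```

The point of the sketch is that assigning NULL to a data frame column deletes it, so the width mismatch is a side effect of the missing $terms rather than a separate problem.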

The solution seems to be to specify an object for ezANOVA_summary that contains $terms and $error.df, but I'm not sure what that would be.  I tried using wide_lm$lm, but its $terms are a formula (helping ~ time.pressure * messag) rather than 3 separate terms, and it has a $df.residual instead of an $error.df.  So that doesn't seem like a good fit. 

I would be very grateful for any suggestions, and thanks very much for all your work on this project!

--Bill
GoodSamaritan.rda

Bill Altermatt

Aug 22, 2011, 8:32:36 AM
to ez4r
Couple of things I should have mentioned in the original post:

1. The design is unbalanced. The 2 factors are "messag" and
"time.pressure":

> table(GoodSamaritan$messag,GoodSamaritan$time.pressure)

            High Medium Low
  Jobs         6      7   7
  Samaritan    7      7   6

ezStats runs fine if you add 2 data points to make it balanced:
> GoodSamaritan[41,]<- c("Jobs", "High", 2, 41L)
> GoodSamaritan[42,] <- c("Samaritan", "Low", 1, 42L)

So the problem seems specific to requesting descriptive stats for
unbalanced designs.
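A balance check like the one above can be done before calling ezStats. The following is a rough sketch in base R; since the attached GoodSamaritan.rda is not reproduced here, it builds a hypothetical stand-in data frame (design) with the same cell counts as the table() output, and the names cells, n_per_cell, and is_balanced are illustrative only.

```r
# Hypothetical stand-in for GoodSamaritan: one row per subject, with the
# cell sizes reported in the table() output above.
cells <- expand.grid(
    messag = c("Jobs", "Samaritan"),
    time.pressure = c("High", "Medium", "Low"),
    stringsAsFactors = TRUE
)
# Order follows expand.grid: Jobs/High, Samaritan/High, Jobs/Medium, ...
n_per_cell <- c(6, 7, 7, 7, 7, 6)

design <- cells[rep(seq_len(nrow(cells)), times = n_per_cell), ]
design$subj <- factor(seq_len(nrow(design)))

# Cross-tabulate the two between-Ss factors and test for equal cell sizes.
counts <- table(design$messag, design$time.pressure)
is_balanced <- length(unique(as.vector(counts))) == 1

print(counts)
print(is_balanced)
```

With real data, only the last three statements are needed, with design replaced by the actual data frame.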

2. In the example above, I was using ezANOVA as sourced from:

https://github.com/mike-lawrence/ez/blob/master/R/ezANOVA.R
and
https://github.com/mike-lawrence/ez/blob/master/R/ez-internal.R

Mike Lawrence

Aug 22, 2011, 8:40:51 AM
to ez...@googlegroups.com
Hi Bill,

Thanks for the detailed description of your problem. I have a
mission-critical project due today that I'll be working on all day,
but I'll try to take a look at this tonight.

Mike

Bill Altermatt

Aug 22, 2011, 10:12:19 AM
to ez4r
Thanks, Mike.

I think I have a solution, but it involves running the descriptive
statistics from within ezANOVA_main. ezStats is simply computing
means and standard deviations until it runs into Fisher's Least
Significant Difference (FLSD), for which it needs an ANOVA table.
Because it's a separate function from ezANOVA, ezStats needs to run an
ANOVA in order to get that ANOVA table, and that's where my error is
turning up. By putting descriptive stats into ezANOVA, you don't need
to re-run the ANOVA because you've already got the table in the
output. Just grab the bits you need to compute FLSD and voila.

If we add an option for 'descriptives' in the ezANOVA function:

ezANOVA <-
function(
    data
    , dv
    ...
    , return_aov = FALSE
    , descriptives = FALSE
){

[Also add "descriptives = descriptives" to the ezANOVA_main() call in
the "to_return =" statement, and a matching descriptives argument to
ezANOVA_main itself.]

and then copy and paste the descriptive statistics formulas from
ezStats into ezANOVA_main right before the last line
(return(to_return)):
[note that I have added a line defining this_ANOVA as to_return$ANOVA
and a line at the very end appending the descriptives to the to_return
list].


if (descriptives) {
    if (!is.null(within) & !is.null(between)) {
        if (!is.logical(diff) & length(within) > 1) {
            warning("Mixed within-and-between-Ss effect requested; FLSD is only appropriate for within-Ss comparisons (see warning in ?ezStats or ?ezPlot).",
                call. = FALSE)
        }
    }
    this_ANOVA = to_return$ANOVA
    vars = as.character(c(between, within))
    temp = idata.frame(cbind(data, ezWID = data[, names(data) ==
        as.character(wid)], dummy = rep(1, length(data[, 1]))))
    N = ddply(temp, structure(as.list(c(.(dummy), between)),
        class = "quoted"), function(x) {
        to_return = length(unique(x$ezWID))
        names(to_return) = "N"
        return(to_return)
    })
    if (!all(N[, length(N)] == N[1, length(N)])) {
        warning("Unbalanced groups. Mean N will be used in computation of FLSD")
        N = mean(N[, length(N)])
    } else {
        N = N[1, length(N)]
    }
    DFd = this_ANOVA$DFd[length(this_ANOVA$DFd)]
    MSd = this_ANOVA$SSd[length(this_ANOVA$SSd)]/DFd
    Tcrit = qt(0.975, DFd)
    CI = Tcrit * sqrt(MSd/N)
    FLSD = sqrt(2) * CI
    temp = idata.frame(cbind(data, ezDV = data[, names(data) ==
        as.character(dv)]))
    data <- ddply(temp, structure(as.list(c(between, within)),
        class = "quoted"), function(x) {
        N = length(x$ezDV)
        Mean = mean(x$ezDV)
        SD = sd(x$ezDV)
        return(c(N = N, Mean = Mean, SD = SD))
    })
    data$FLSD = FLSD
    to_return$descriptives = data
}

Here's a test. If I go back to my unbalanced dataset and run ezANOVA
with the new descriptives=TRUE option:

> GSu <- GoodSamaritan[-42,]
> GSu <- GSu[-41,]
> table(GSu$messag,GSu$time.pressure)

            High Medium Low
  Jobs         6      7   7
  Samaritan    7      7   6

> ezANOVA(
    data = GSu
    , dv = .(helping)
    , wid = .(subj)
    , between = .(time.pressure,messag)
    , type = 3
    , descriptives = TRUE)

I get the following, which works fine:

Warning: You have removed one or more Ss from the analysis.
Refactoring "subj" for ANOVA.
Warning: Data is unbalanced (unequal N per group). Make sure you
specified a well-considered value for the type argument to ezANOVA().
$ANOVA
                Effect DFn DFd         F          p p<.05        ges
2        time.pressure   2  34 4.4331805 0.01946073     * 0.20683727
3               messag   1  34 0.8547151 0.36173728       0.02452222
4 time.pressure:messag   2  34 0.9797785 0.38574290       0.05449336

$`Levene's Test for Homogeneity of Variance`
  DFn DFd     SSn      SSd         F         p p<.05
1   5  34 4.07619 31.02381 0.8934459 0.4964538

$descriptives
  time.pressure    messag N      Mean        SD     FLSD
1          High      Jobs 6 0.3333333 0.8164966 1.442738
2          High Samaritan 7 1.0000000 0.8164966 1.442738
3        Medium      Jobs 7 1.8571429 0.8997354 1.442738
4        Medium Samaritan 7 1.5714286 1.7182494 1.442738
5           Low      Jobs 7 2.4285714 1.5118579 1.442738
6           Low Samaritan 6 3.5000000 1.6431677 1.442738

Warning message:
In ezANOVA_main(data = data, dv = dv, wid = wid, within = within, :
Unbalanced groups. Mean N will be used in computation of FLSD
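
The FLSD value reported in the output can be checked by hand from numbers already in this thread. This is a worked sketch, not ez's own code path: it assumes the residual sum of squares (57.12) and residual df (34) from the anova structure in the first post, and the cell sizes from the table() output. The variable names mirror those in the code above.

```r
# Values taken from the thread: residual SS and df from the anova table in
# the first post, cell sizes from the table() output.
SSd <- 57.12
DFd <- 34
cell_n <- c(6, 7, 7, 7, 7, 6)

N <- mean(cell_n)             # unbalanced design, so the mean cell size is used
MSd <- SSd / DFd              # mean squared error
Tcrit <- qt(0.975, DFd)       # two-tailed critical t at alpha = .05
CI <- Tcrit * sqrt(MSd / N)   # half-width of the interval around a cell mean
FLSD <- sqrt(2) * CI          # Fisher's Least Significant Difference

print(FLSD)
```

The result should land close to the 1.44 shown in the FLSD column; any small difference comes from the rounded sum of squares used here.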

It's not an ideal solution because it sidesteps the question of why
ezStats isn't working, but it is computationally a bit simpler, since
the ANOVA only has to be run once.

--Bill

Mike Lawrence

Aug 24, 2011, 10:46:49 AM
to ez...@googlegroups.com
Hi Bill,

Well, this is rather strange: when I run your data through ezStats,
using ezDev (https://raw.github.com/mike-lawrence/ez/master/R/ezDev.R)
to grab the latest ez source, I get no errors. Maybe you were running
a slightly earlier version of ez and I happened to inadvertently fix
this in the interim?

Mike

Bill Altermatt

Aug 24, 2011, 10:49:21 AM
to ez...@googlegroups.com
Thanks, Mike. I'll get the latest ezDev and retry it.

Bill

_____________________
Bill Altermatt, PhD
Department of Psychology
Hanover College

Bill Altermatt

Aug 25, 2011, 8:24:14 AM
to ez...@googlegroups.com
Thanks, Mike.  I tried the latest version of ezStats and it worked fine.  Would there be an easy way for me to add some code to my package so that it can check to make sure the user has the necessary version of ez?

--Bill

> ezDev()
Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
  SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

--Bill

Mike Lawrence

Aug 27, 2011, 11:05:39 AM
to ez...@googlegroups.com
Until I officially update ez on CRAN to the long-promised version 4.0,
your best bet is to include the ezDev() function (or something like it)
in your package and have that run when your package loads.
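
For the version check Bill asks about, one possible approach is sketched below. It is not part of ez: has_min_version and check_ez are hypothetical helpers, and the "4.0" threshold is an assumption based on the version number mentioned above, which may not be where the fix finally lands. The sketch relies only on requireNamespace() and utils::packageVersion().

```r
# Hypothetical helper: TRUE if 'pkg' is installed at 'min_version' or newer.
has_min_version <- function(pkg, min_version) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        return(FALSE)
    }
    utils::packageVersion(pkg) >= min_version
}

# Hypothetical check a dependent package could run from its .onLoad() hook.
# The default of "4.0" is an assumption taken from the message above.
check_ez <- function(min_version = "4.0") {
    ok <- has_min_version("ez", min_version)
    if (!ok) {
        warning("ez >= ", min_version,
            " not found; ezStats() may fail on unbalanced designs.",
            call. = FALSE)
    }
    invisible(ok)
}
```

Once a fixed release is on CRAN, declaring the minimum version in the dependent package's DESCRIPTION file (for example "Depends: ez (>= 4.0)") would enforce the same requirement at install time.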