magnitude of effect of independent variables

ekbrown77

unread,
Feb 5, 2015, 9:26:33 PM2/5/15
to statforli...@googlegroups.com
After arriving at the minimal adequate model of a linear regression, I'd like to order the remaining, significant independent variables by their magnitude of effect (i.e., effect size), with the variable with the strongest effect on top, the second most influential variable next, and so on. The stats::step() function seems to order the variables by p-value (generally, at least: variables can end up mis-ordered when a gradient (continuous) variable is adjacent to a categorical one).

My question for you:
Can we assume that the independent variable with the smallest p-value has the strongest effect on the dependent variable? If not, how can we place the significant variables in a hierarchy according to their respective magnitudes of effect?

SFLWR, 2nd edition presents the effects::effect() function to measure the relative effect of levels within a variable, but I'm not sure if this function can be applied to measuring the relative effect sizes of the several significant variables of a linear regression.

Thanks in advance for any help. Earl Brown

Stefan Th. Gries

unread,
Feb 5, 2015, 9:39:08 PM2/5/15
to StatForLing with R
> Can we assume that the independent variable with the smallest p-value has the strongest effect on the dependent variable?
Definitely not!

> If not, how can we place the significant variables in a hierarchy according to their respective magnitudes of effect?
By computing effect sizes such as eta-squared or partial eta-squared; also see
<http://www.r-bloggers.com/example-8-14-generating-standardized-regression-coefficients/>
and the packages relimp, lsr, BaylorEdPsych, and bootES, which can probably help, too.
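
For instance, something along these lines (a minimal sketch with hypothetical data and variable names, not your model):

################################
library(lsr)
m <- aov(reaction.time ~ frequency + length + register, data=dat)   # 'dat' and the predictors are hypothetical
etaSquared(m)                                      # eta-squared and partial eta-squared per predictor
sort(etaSquared(m)[, "eta.sq"], decreasing=TRUE)   # rank the predictors by eta-squared
################################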

> SFLWR, 2nd edition presents the effects::effect() function to measure the relative effect of levels within a variable, but I'm not sure if this function can be applied to measuring the relative effect sizes of the several significant variables of a linear regression.
I don't think so.

HTH,
STG
--
Stefan Th. Gries
----------------------------------
Univ. of California, Santa Barbara
http://tinyurl.com/stgries
----------------------------------

ekbrown77

unread,
Mar 12, 2015, 6:06:28 PM3/12/15
to statforli...@googlegroups.com
I went with the eta-squared values given by lsr::etaSquared(). Quick question about the interpretation of those values:

Does an eta-squared value that is twice (or three times, four times, etc.) larger than another eta-squared value mean that the corresponding independent variable is twice (or 3x, 4x, etc.) more "influential" or "important" (or what?) than the other variable? I don't know how to describe to my reader what exactly the values mean. Maybe the relative position of variables within the hierarchy is most meaningful.

Thanks for any help.

Stefan Th. Gries

unread,
Mar 12, 2015, 6:24:19 PM3/12/15
to StatForLing with R
> Does an eta-squared value that is twice (or three times, four times, etc.) larger than another eta-squared value mean that the corresponding independent variable is twice (or 3x, 4x, etc.) more "influential" or "important" (or what?) than the other variable?
I think it does, yes, because eta-squared and partial eta-squared quantify the share of explained variance (normalized against different denominators). Plus, this little simulation demo shows that eta-squared and R^2 are all but perfectly correlated (in univariate ANOVAs at least):

################################
rm(list=ls(all=TRUE)); set.seed(1); library(lsr)
etas <- r2s <- rep(0, 1000)
for (i in 1:1000) {
   y <- c(rnorm(50), rnorm(50)+2)            # two groups with population means 0 and 2
   x <- factor(rep(letters[1:2], each=50))   # the grouping factor
   etas[i] <- etaSquared(aov(y~x))[[1]]      # eta-squared from the one-way ANOVA
   r2s[i] <- cor(y, as.numeric(x)-1)         # point-biserial r (here, r^2 = R^2 = eta-squared)
}
plot(etas ~ r2s)   # near-perfect monotonic relation
cor(etas, r2s)     # 0.9989864
################################

ekbrown77

unread,
Nov 3, 2016, 9:31:01 AM11/3/16
to StatForLing with R
A follow-up question to a previous conversation about effect sizes:

Is it theoretically sound to calculate the effect sizes of the fixed effects of a mixed-effects linear regression (created with lme4::lmer) by fitting a linear regression model (with stats::lm) of only the fixed effects (and some of their interactions), and then using lsr::etaSquared to get their effect sizes?

Is there a more direct way to calculate effect sizes for the fixed-effect variables of a mixed-effects linear regression (created with lme4::lmer)?

Thanks.

Stefan Th. Gries

unread,
Nov 3, 2016, 10:05:53 AM11/3/16
to StatForLing with R

No, that does not seem sound to me at all - just report the coefficients from the mixed-effects model, which are unstandardized effect sizes, plus, I'd say, the predicted values with CIs. Not to make it a point of pride, but I virtually never report standardized effect sizes, only unstandardized ones.
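
Something like this, roughly (a minimal sketch; 'm' stands for a hypothetical model fit with lme4::lmer):

################################
library(lme4)
fixef(m)                    # the unstandardized fixed-effect coefficients (the "Estimates")
confint(m, method="Wald")   # quick Wald CIs; method="profile" or method="boot" is slower but preferable
################################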

Cheers,


STG
--
Stefan Th. Gries
----------------------------------
Univ. of California, Santa Barbara
http://tinyurl.com/stgries
----------------------------------

Martin Schweinberger

unread,
Nov 3, 2016, 10:14:55 AM11/3/16
to statforli...@googlegroups.com
Hi all,

If I understood Field, Miles & Field (2012), "Discovering Statistics Using R", correctly, then one way to measure effect sizes of fixed effects in mixed-effects models would be to follow these steps:

1. Build a model without the fixed effect one is interested in.
2. Build a model with the fixed effect.
3. Compare the models using anova(model1, model2, test = "Chi").
4. The chi-squared statistic represents the effect size of the fixed effect.

Am I right, or did I miss something here? (A minimal sketch of what I mean is below.)
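
In lme4 terms, I mean something like this (a minimal sketch; the data and variable names are hypothetical):

################################
library(lme4)
m0 <- lmer(y ~ covariate + (1|subject), data=dat, REML=FALSE)              # without the fixed effect of interest
m1 <- lmer(y ~ covariate + predictor + (1|subject), data=dat, REML=FALSE)  # with the fixed effect of interest
anova(m0, m1)   # likelihood-ratio comparison; reports a chi-squared value and a p-value
################################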

best
martin



=====================================
Dr. Martin Schweinberger
Gählerstraße 11
22767 Hamburg

Fon.: ++49 (0)176 387 48 283
Home: http://www.martinschweinberger.de/

Matías Guzmán Naranjo

unread,
Nov 3, 2016, 10:47:27 AM11/3/16
to statforli...@googlegroups.com
Can't you work with a pseudo R2 measure? That won't tell you about the individual predictors, but as STG mentions, the coefficients already do that.
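
For example (a sketch; this assumes the MuMIn package and a hypothetical lmer model 'm'):

################################
library(MuMIn)
r.squaredGLMM(m)   # marginal R2 (fixed effects only) and conditional R2 (fixed + random effects)
################################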

Stefan Th. Gries

unread,
Nov 3, 2016, 11:59:16 AM11/3/16
to StatForLing with R
@Martin: As far as I know, you don't want to use that chi-squared value as an effect size. It is used as a significance test of whether the model with one predictor fewer is not just worse but _significantly_ worse; it is not really an effect size.

Martin Schweinberger

unread,
Nov 3, 2016, 12:55:52 PM11/3/16
to statforli...@googlegroups.com
Stefan, you are right again (of course ;)).
I thought that the chi-squared value could serve as an effect-size measure, as its value should correlate with explained variance when comparing the models with the anova() call; however, this would probably conflate significance and effect size, as you pointed out. I just checked Field et al. (2012: 640): they calculate r to quantify effect size in linear mixed-effects models.

best
martin

=====================================
Dr. Martin Schweinberger
Gählerstraße 11
22767 Hamburg

Fon.: ++49 (0)176 387 48 283
Home: http://www.martinschweinberger.de/

Stefan Th. Gries

unread,
Nov 3, 2016, 1:00:42 PM11/3/16
to StatForLing with R
> Stefan, you are right again (of course ;)).
lol, I wish I was as confident about my 'expertise' :-)

> I thought that the chi-squared value could serve as an effect-size measure, as its value should correlate with explained variance when comparing the models with the anova() call; however, this would probably conflate significance and effect size, as you pointed out.
Yes, I actually just ran a simulation in which I doubled a data set: the LRT values are different even though the correlations are the same. So, yes, better not to use the chi-squared/LRT value for that.
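
Roughly along these lines (a minimal sketch with made-up data, not the simulation I actually ran):

################################
set.seed(1)
d1 <- data.frame(x=rnorm(100)); d1$y <- 0.3*d1$x + rnorm(100)
d2 <- rbind(d1, d1)   # same correlation, twice the data
lrt <- function(d) as.numeric(2 * (logLik(lm(y~x, data=d)) - logLik(lm(y~1, data=d))))
c(lrt(d1), lrt(d2))                   # the LRT value doubles with n ...
c(cor(d1$x, d1$y), cor(d2$x, d2$y))   # ... but r stays exactly the same
################################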

> I just checked Field et al. (2012: 640): they calculate r to quantify effect size in linear mixed-effects models.
Do you have a section number for that? I was looking for that earlier ...

Martin Schweinberger

unread,
Nov 3, 2016, 1:42:28 PM11/3/16
to statforli...@googlegroups.com
It's Section 14.7. And thanks for running the simulation (the proof is in the pudding!).

best
martin

=====================================
Dr. Martin Schweinberger
Gählerstraße 11
22767 Hamburg

Fon.: ++49 (0)176 387 48 283
Home: http://www.martinschweinberger.de/

Stefan Th. Gries

unread,
Nov 3, 2016, 1:45:18 PM11/3/16
to StatForLing with R
Oh, ok, I was looking in Ch. 19 for something specific to MEM - thanks!

Stefan Th. Gries

unread,
Nov 3, 2016, 2:01:43 PM11/3/16
to StatForLing with R
Ok, one word of caution re that: Section 14.7 refers back to Section 10.7, which provides a function rcontrast to compute the effect size Martin mentions. Two caveats, however:

1) Note that the book says "An alternative is to compute effect sizes *for the orthogonal contrasts*. We can use the same equation as in section 9.5.2.8":

r_contrast = sqrt(t^2 / (t^2 + df))

So I don't know whether you can use this measure if your contrasts are not orthogonal, and the default treatment contrasts of R, which it seems to me most people are using, are *not* orthogonal. I also don't know whether that has a bearing only on the p-values or also on the effect size(s) here.

2) One of the bigger points of discussion for mixed-effects models is precisely the df issue and the question of how to compute dfs (e.g., which approximation to use or whether to proceed differently), so I don't know whether the approximations some packages offer provide df values that work with the above formula.
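
For reference, the book's rcontrast function boils down to this one-liner (t and df come from the model output, with caveat 2 in mind; the values below are made up):

################################
rcontrast <- function(t, df) sqrt(t^2 / (t^2 + df))
rcontrast(t=3.2, df=48)   # hypothetical values, just to show the call
################################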

So, I know I'm just problematizing here and not offering much in terms
of solutions for those who insist on standardized effect sizes, and
I'm not saying whether rcontrast is right here or not - all I'm saying
is to be careful and check with someone who knows more about this
(than I do).

Cheers,
STG

ekbrown77

unread,
Nov 3, 2016, 2:39:35 PM11/3/16
to StatForLing with R
The coefficients (called "Estimates" in the output of lme4::lmer) of continuous variables that are on different scales cannot be compared in order to ascertain effect size, right? 

An example:
In my data set, contextual frequency ranges from 0.0 to 1.0, and lme4::lmer returns a coefficient ("Estimate") of -0.86 for it. Logged lexical frequency ranges from 0.6 to 3.1 and has a coefficient of -0.01. Because the scales are different, I cannot directly compare the coefficients in order to determine effect size, right?

If not, how might I determine which variable has a larger effect size?

What about comparing the coefficient of a continuous variable with the coefficient of a categorical variable? What about the coefficient of interaction terms/variables?

Stefan Th. Gries

unread,
Nov 3, 2016, 3:22:47 PM11/3/16
to StatForLing with R
> The coefficients (called "Estimates" in the output of lme4::lmer) of continuous variables that are on different scales cannot be compared in order to ascertain effect size, right?
Not unless you z-standardize them, no; same in your example - after z-standardization, you can.
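
E.g., something like this (a minimal sketch; the data frame and variable names are hypothetical, not your actual ones):

################################
library(lme4)
dat$ctx.freq.z <- as.numeric(scale(dat$ctx.freq))          # z-standardize: mean 0, sd 1
dat$lex.freq.log.z <- as.numeric(scale(dat$lex.freq.log))
m <- lmer(response ~ ctx.freq.z + lex.freq.log.z + (1|speaker), data=dat)
fixef(m)   # coefficients are now per 1 sd of each predictor and thus comparable in magnitude
################################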

> If not, how might I determine which variable has a larger effect size?
I haven't seen much in terms of standardized effect sizes for (g)lmer
models. If a reviewer was inappropriately intransigent, I might make
up something like change in AIC or r-squared per predictor, with the
biggest caveat footnote ever saying that I have no idea whether one
can do that. (So cite me for this only if it passes muster ;-))
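
If one were to go that route, something like this could be a starting point (a sketch; 'm' is a hypothetical lmer model, and the same huge caveat applies):

################################
library(lme4)
drop1(m)   # change in AIC when each fixed-effect term is dropped, as a rough gauge of relative importance
################################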