Inflating alpha for Interaction to Increase Power

j...@aggienetwork.com

Jan 10, 2005, 2:45:13 PM

Has anyone ever heard of inflating the alpha level for the F-test in
declaring significance for the interaction terms of a model? For
instance, if you test your single-factor terms at alpha=0.05, testing
your interaction terms at alpha=0.20?

I ask this because I seem to recall having a professor suggest doing so
to protect against "missing" a significant interaction. However, my
review of texts, etc., shows that this is not commonly done. Based on
power tests, it appears that such a change might be useful.

Does anyone have any thoughts or references about this procedure?
Thanks,

Warren Schlechte

Ray Koopman

Jan 10, 2005, 2:51:48 PM

I've never heard of it. Sounds misguided, to me.

Richard Ulrich

Jan 10, 2005, 7:40:19 PM

On 10 Jan 2005 11:45:13 -0800, j...@aggienetwork.com wrote:

> Has anyone ever heard of inflating the alpha level for the F-test in
> declaring significance for the interaction terms of a model? For
> instance, if you test your single-factor terms at alpha=0.05, testing
> your interaction terms at alpha=0.20?

I've thought of saying, "There aren't any interactions evident,
even at the (looser) 10% test level. Therefore, there is
nothing to suggest that the main-effect is not simple
and properly modeled."

Making a formal test of it seems wrong-headed to me, too.

>
> I ask this because I seem to recall having a professor suggest doing so
> to protect against "missing" a significant interaction.

- Sure, you can stretch the limits to make sure of not
missing something. But if you are going to stick in
an extra dozen tests, you either (a) don't take them as
formal, or don't take them seriously; or (b) had better
adjust the nominal p-level in the other direction.
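
To put rough numbers on option (b), here is a minimal sketch of how the chance of at least one false positive grows across a dozen tests, and what a Bonferroni-style per-test level would be. It assumes the tests are independent, which interaction tests from a single model usually are not exactly, and the function names are illustrative only.

```python
def familywise_error(alpha: float, m: int) -> float:
    """Chance of at least one false positive in m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test level that keeps the familywise rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    m = 12  # "an extra dozen tests"
    for alpha in (0.05, 0.10, 0.20):
        fwe = familywise_error(alpha, m)
        print(f"per-test alpha={alpha:.2f}: familywise error={fwe:.3f}")
    print(f"Bonferroni per-test level for 0.05: {bonferroni_alpha(0.05, m):.4f}")
```

Under that independence assumption, a dozen tests at the looser 0.20 level make at least one false positive very likely, which is the reason for adjusting "in the other direction".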

> However, my
> review of texts, etc., shows that this is not commonly done. Based on
> power tests, it appears that such a change might be useful.
>
> Does anyone have any thoughts or references about this procedure?

Describing what you have is useful. Describing tests in
a way that people can understand them is useful.
I don't see what is here that you think might be "useful"....

--
Rich Ulrich, wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html

j...@aggienetwork.com

Jan 10, 2005, 9:25:08 PM

Thanks for your reply.

It could well be that your quote of "There aren't any interactions
evident, even at the (looser) 10% test level. Therefore, there is
nothing to suggest that the main-effect is not simple and properly
modeled." is what my professor had intended to illustrate, and I took
it as a formal test -- it was almost 20 years ago. And that could
explain why I have been unable to find a text or article where this
approach was used.

By "useful", I meant in the fashion of your quote above, or the
converse.

Sincerely,

Warren Schlechte

jim clark

Jan 11, 2005, 8:41:17 AM

Hi

On 10 Jan 2005, j...@aggienetwork.com wrote:
> It could well be that your quote of "There aren't any interactions
> evident, even at the (looser) 10% test level. Therefore, there is
> nothing to suggest that the main-effect is not simple and properly
> modeled." is what my professor had intended to illustrate, and I took
> it as a formal test -- it was almost 20 years ago. And that could
> explain why I have been unable to find a text or article where this
> approach was used.
>
> By "useful", I meant in the fashion of your quote above, or the
> converse.

There is good reason not to have much faith in the omnibus F test
to detect interactions. In many patterns of interaction, the
"interaction" effect is diluted by being distributed across the main
and interaction effects (in the narrow ANOVA sense of
these terms). Consider the following, which might be predicted in many
pre/post test by treatment/control studies.

             Pre   Post
Control       10     10
Treatment     10     20

The interaction might not be significant, but the simple effects
would show the pre/post difference is not significant for
Control, but is for Treatment. That is, the effect of pre/post
differs across the level of control/treatment, which meets the
conceptual definition of interaction.
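
The dilution can be seen by splitting the four cell means from the table above into a grand mean, two main effects, and an interaction term. This is a minimal sketch on the means only (no error variance, equal cell sizes assumed); the helper name `decompose` is illustrative.

```python
# Cell means from the 2x2 table above (rows: group, columns: time).
cells = {
    ("Control", "Pre"): 10.0,
    ("Control", "Post"): 10.0,
    ("Treatment", "Pre"): 10.0,
    ("Treatment", "Post"): 20.0,
}
groups = ("Control", "Treatment")
times = ("Pre", "Post")


def decompose(cells, rows, cols):
    """Split cell means into grand mean, row, column and interaction effects."""
    grand = sum(cells.values()) / len(cells)
    row_eff = {r: sum(cells[r, c] for c in cols) / len(cols) - grand for r in rows}
    col_eff = {c: sum(cells[r, c] for r in rows) / len(rows) - grand for c in cols}
    inter = {
        (r, c): cells[r, c] - grand - row_eff[r] - col_eff[c]
        for r in rows
        for c in cols
    }
    return grand, row_eff, col_eff, inter


grand, row_eff, col_eff, inter = decompose(cells, groups, times)

# Sum of squared effects over the four cells, one total per source.
ss_group = sum(row_eff[r] ** 2 for r in groups for _ in times)
ss_time = sum(col_eff[c] ** 2 for _ in groups for c in times)
ss_inter = sum(v ** 2 for v in inter.values())

print("grand mean:", grand)
print("group effects:", row_eff)
print("time effects:", col_eff)
print("interaction effects:", inter)
print("squared-effect totals (group, time, interaction):", ss_group, ss_time, ss_inter)
```

For this pattern the three totals come out equal, so only a third of the between-cell variation lands in the term that the omnibus interaction F test examines.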

So whether it is by using a more liberal alpha, increasing n,
using planned simple effects, or other means, researchers do need
to ensure their analyses are going to be sufficiently sensitive
to detect interactions in the data.
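
As a rough illustration of how those choices trade off, the sketch below approximates power for the interaction contrast and for the planned Treatment simple effect, using the cell means from the table above. Everything else is an assumption made for the example: four independent groups (a real pre/post design is usually repeated measures), a known error SD of 15, n = 10 per cell, and a two-sided z-test in place of the F test.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()

# Cell order: Control/Pre, Control/Post, Treatment/Pre, Treatment/Post.
means = (10.0, 10.0, 10.0, 20.0)
sigma = 15.0  # hypothetical error SD, not from the thread
n = 10        # hypothetical observations per cell

contrasts = {
    "interaction": (1, -1, -1, 1),
    "simple effect (Treatment)": (0, 0, -1, 1),
}


def contrast_value(weights, means):
    """Weighted sum of cell means."""
    return sum(w * m for w, m in zip(weights, means))


def contrast_se(weights, sigma, n):
    """SE of a weighted sum of independent cell means, n observations per cell."""
    return sigma * sqrt(sum(w * w for w in weights) / n)


def power_two_sided(effect, se, alpha):
    """Approximate power of a two-sided z-test for a contrast with known SE."""
    z_crit = Z.inv_cdf(1.0 - alpha / 2.0)
    shift = effect / se
    return Z.cdf(shift - z_crit) + Z.cdf(-shift - z_crit)


for label, weights in contrasts.items():
    effect = contrast_value(weights, means)
    se = contrast_se(weights, sigma, n)
    for alpha in (0.05, 0.20):
        power = power_two_sided(effect, se, alpha)
        print(f"{label:<26} alpha={alpha:.2f} power={power:.2f}")
```

Both contrasts equal 10 here, but the interaction contrast involves all four cells and so has the larger standard error, which is why the planned simple effect comes out more sensitive at the same alpha. Changing `n` or `alpha` shows the other two routes mentioned above.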

Best wishes
Jim

============================================================================
James M. Clark (204) 786-9757
Department of Psychology (204) 774-4134 Fax
University of Winnipeg 4L05D
Winnipeg, Manitoba R3B 2E9 cl...@uwinnipeg.ca
CANADA http://www.uwinnipeg.ca/~clark
============================================================================

nau...@nil.com

Jan 11, 2005, 11:17:08 AM

I've always been pretty conflicted about what to do with interactions,
especially in observational data. On one hand, the power of the test for
interactions is generally pretty poor (and I have seen people use a more
liberal alpha because of this). On the other hand, with smaller Ns, the
"cell" sizes can vary greatly and the effects can be very unstable. In
the case you show above, Jim, if those simple effects weren't planned
(with the study designed to ensure I have enough power to test them), I'd
be inclined to believe the omnibus test.

Mike Babyak

nau...@nil.com

Jan 11, 2005, 11:43:03 AM

nau...@nil.com wrote:
>
> I've always been pretty conflicted about what to do with interactions,
> especially in observational data. On one hand, the power of the test for
> interactions is generally pretty poor (and I have seen people use a more
> liberal alpha because of this). On the other hand, with smaller Ns, the
> "cell" sizes can vary greatly and the effects can be very unstable. In
> the case you show above, Jim, if those simple effects weren't planned
> (with the study designed to ensure I have enough power to test them), I'd
> be inclined to believe the omnibus test.
>
> Mike Babyak

That should say: On the other hand, the extra tests inflate the error
rate, and, with smaller Ns, the "cell" sizes can vary greatly and the

Aleks Jakulin

Jan 13, 2005, 7:05:35 AM

jws wrote:
> Has anyone ever heard of inflating the alpha level for the F-test in
> declaring significance for the interaction terms of a model? For
> instance, if you test your single-factor terms at alpha=0.05,
> testing your interaction terms at alpha=0.20?

It's known that interactions are hard to prove significant at 0.05;
you need a *lot* of data.

@article{McClelland93,
author={G. H. McClelland and C. M. Judd},
title={Statistical difficulties of detecting interactions and
moderator effects},
journal={Psychological Bulletin},
volume={114},
year={1993},
pages={376--390}
}

What I usually do is to sort the interactions by increasing p-value,
and list all of them out, rather than to try to draw an arbitrary
threshold by setting the alpha.
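
A minimal sketch of that kind of listing follows. The term names and p-values are made up purely for illustration; in practice they would come from the fitted model.

```python
# Hypothetical interaction terms and p-values, for illustration only.
interaction_pvalues = {
    "A:B": 0.180,
    "A:C": 0.030,
    "B:C": 0.620,
    "A:B:C": 0.095,
}

# Sort by increasing p-value and report every term, with no cutoff applied.
ranked = sorted(interaction_pvalues.items(), key=lambda item: item[1])

for term, p in ranked:
    print(f"{term:<8} p = {p:.3f}")
```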

--
mag. Aleks Jakulin
http://www.ailab.si/aleks/
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science,
University of Ljubljana, Slovenia.

j...@aggienetwork.com

Jan 14, 2005, 2:00:21 PM

Thanks for all your thoughts, and the McClelland and Judd reference.
Respectfully,

Warren Schlechte