how to interpret interaction term when the main effects are not statistically significant

Adaikalavan Ramasamy

Oct 3, 2008, 9:41:07 AM
to MedStats
Hi,

I have analysed a continuous response as a linear model which included
two binary factors, say sex and ever smoke, and their interaction. I
find the sex/smoke interaction statistically significant. However,
neither sex nor ever smoke is statistically significant.

I spoke to a couple of people here on this. The first person said that
the significance of main effects does not matter when interactions are
involved. The second person said that you should only look at
interaction if and only if the main effects were significant. I tend
to agree with first person but I could not explain my reasons well
enough and this is troubling me.

Can anyone please offer any advice on this matter? Reference to a
famous practical example where this happens would help greatly. Thank
you.

Regards, Adai

Adrian Sayers

Oct 3, 2008, 9:52:29 AM
to MedS...@googlegroups.com
Depends on the situation. If a priori you're looking for the interaction,
then that's the model you should test.

If in an epi situation, check the main effects, look for confounding,
then test for an interaction. Once you have an interaction, it makes the
lower-order parameters' p-values meaningless.

But watch out for the phantom degrees of freedom in this process.

Everyone seems to have different opinions on how you should do your
modeling. There is the kitchen sink approach or the more thoughtful
approach.

bw
A



Peter Flom

Oct 3, 2008, 10:20:07 AM
to MedStats
The first person is right, the second person is wrong.

Why?

Well, simply, an interaction means that the main effects can't be interpreted on their own.

In more detail, if the interaction is a strong crossover (the effect of one variable reverses direction across the levels of the other), the main effects can average out to 0, which would obviously not be sig.

I don't have an example, but it's easy to make up some data that would show this. Suppose the DV is some continuous measure, and you get the following means on a large sample:

Black males --- 100
Black females --- 130
White males --- 130
White females --- 100

No main effect, but a big interaction, and saying 'nothing is going on here' is clearly inadequate.
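
The arithmetic behind those four means can be sketched in Python. This is only an illustration: the function name and dictionary layout are made up here, and the "main effects" are taken as differences of marginal means, assuming equal numbers in each cell.

```python
# Cell means from the example above, keyed by (race, sex).
cell_means = {
    ("black", "male"): 100.0,
    ("black", "female"): 130.0,
    ("white", "male"): 130.0,
    ("white", "female"): 100.0,
}

def marginal_effects(m):
    """Return (race effect, sex effect, interaction) from 2x2 cell means.

    Assumes equal cell sizes, so marginal means are plain averages.
    """
    bm, bf = m[("black", "male")], m[("black", "female")]
    wm, wf = m[("white", "male")], m[("white", "female")]
    race = (wm + wf) / 2 - (bm + bf) / 2   # white minus black
    sex = (bf + wf) / 2 - (bm + wm) / 2    # female minus male
    interaction = (wf - wm) - (bf - bm)    # difference of the two sex differences
    return race, sex, interaction

race_effect, sex_effect, interaction = marginal_effects(cell_means)
print(race_effect, sex_effect, interaction)
```

Both marginal effects come out as zero while the interaction is 60 points in absolute size, which is the pattern described above.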

HTH

Peter

Peter L. Flom, PhD
Statistical Consultant
www DOT peterflom DOT com

BXC (Bendix Carstensen)

Oct 3, 2008, 10:49:16 AM
to MedS...@googlegroups.com
The statement that the main effects do not matter when there is an interaction present is a slight oversimplification.

Suppose you have a model with main effects of A and B, and an interaction:

A + B + A:B

The main effect of B in this model is the B-effect at the reference level of A. Unless the reference level of A has a special status this is hardly of any interest. This is the background for the general recommendation also known as the "principle of marginality": Never test a term when higher order terms are present in the model.

But occasionally it has a meaning:
Suppose A is exposure to some agent, in doses 0,10,100, and that B is some treatment (yes/no) designed to counteract the effect.
Fitting a model where exposure 0 is the reference level for factor A makes it perfectly sensible to test for the main effect of B --- it will be the test of whether B has effect on the outcome when exposure is 0:
"Does B have any effect in the absence of the agent it is designed to counteract?".

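To see how the choice of reference level changes what the "main effect of B" measures, here is a small Python sketch. The cell means are made-up numbers and the function name is hypothetical. It relies on the fact that in the full model A + B + A:B with treatment (dummy) coding, the B coefficient equals the B difference at the reference level of A.

```python
# Hypothetical cell means (made-up numbers), keyed by (exposure dose, treated?).
means = {
    (0, "no"): 10.0, (0, "yes"): 10.0,
    (10, "no"): 14.0, (10, "yes"): 11.0,
    (100, "no"): 25.0, (100, "yes"): 15.0,
}

def b_main_effect(cell_means, reference_a):
    """B coefficient in A + B + A:B under treatment (dummy) coding.

    In this saturated model it equals the treated-minus-untreated
    difference at the reference level of A.
    """
    return cell_means[(reference_a, "yes")] - cell_means[(reference_a, "no")]

effect_at_0 = b_main_effect(means, 0)      # reference level: no exposure
effect_at_100 = b_main_effect(means, 100)  # reference level: highest dose
print(effect_at_0, effect_at_100)
```

With these numbers the same data give a "main effect of B" of 0 or of -10, depending only on which exposure level is the reference.
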
Another story is that if you accept this hypothesis and would like to fit a model where the A by B interaction is present only for A levels 10 and 100, most packages will give you problems.

If you in SAS say:

model y = A + A*B

or in R say:

y ~ A + A:B

you still get the full interaction model, just differently parametrized.
I vaguely recall that Stata has a similar behaviour.

So if you really want to get rid of the main effect, you will have to hand-code your (in this case 2) interaction variables so that the program does not recognize them as interactions.
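
A rough Python sketch of what such hand-coding could look like, using the exposure/treatment example above. The function name and the column order are assumptions made for illustration; the resulting columns would then be passed to whatever regression routine you use as ordinary covariates.

```python
def design_row(a, b):
    """Row of a hand-coded design matrix: intercept, two exposure dummies,
    and a treatment term only at exposures 10 and 100 (none at exposure 0).

    a is the exposure dose (0, 10 or 100); b is 1 if treated, else 0.
    """
    a10 = 1 if a == 10 else 0
    a100 = 1 if a == 100 else 0
    return [1, a10, a100, a10 * b, a100 * b]

rows = {(a, b): design_row(a, b) for a in (0, 10, 100) for b in (0, 1)}
for cell, row in rows.items():
    print(cell, row)
```

The rows for treated and untreated subjects at exposure 0 are identical, so this model has 5 parameters instead of 6 and forces the treatment effect to be zero in the absence of exposure.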

In summary: It is always wise to be explicit about the hypotheses you test.
You should be able to phrase them in subject matter wording, with NO statistical terminology whatsoever. If you cannot, the hypothesis is most likely irrelevant (p=0.0036).

Best regards,
Bendix
______________________________________________

Bendix Carstensen
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2-4
DK-2820 Gentofte
Denmark
+45 44 43 87 38 (direct)
+45 30 75 87 38 (mobile)
b...@steno.dk http://www.biostat.ku.dk/~bxc

Ted Harding

Oct 3, 2008, 11:05:09 AM
to MedS...@googlegroups.com

In my view neither of those statements is correct (as a general rule).
In addition, there is a potential complication which depends on how
the "main effects" are estimated (see below).

Since your response is a continuous variable it is appropriate to
discuss the question in terms of means of subsets of the data.

Basically, the Sex*Smoke interaction expresses the change in
the "main effect" of Sex which corresponds to a change from
Smoke=0 to Smoke=1.

One way to look at this is to consider

Smoke=0: "main effect of Sex"
Sex[Smoke=0] =
mean(Y[Sex=M]&[Smoke=0]) - mean(Y[Sex=F]&[Smoke=0])
Smoke=1: "main effect of Sex"
Sex[Smoke=1] =
mean(Y[Sex=M]&[Smoke=1]) - mean(Y[Sex=F]&[Smoke=1])
Interaction:
Sex*Smoke = Sex[Smoke=1] - Sex[Smoke=0]

From this it follows that

Sex*Smoke =
(mean(Y[Sex=M]&[Smoke=1]) - mean(Y[Sex=F]&[Smoke=1]))
- (mean(Y[Sex=M]&[Smoke=0]) - mean(Y[Sex=F]&[Smoke=0]))

= (mean(Y[Sex=M]&[Smoke=1]) - mean(Y[Sex=M]&[Smoke=0]))
- (mean(Y[Sex=F]&[Smoke=1]) - mean(Y[Sex=F]&[Smoke=0]))

= Smoke[Sex=M] - Smoke[Sex=F],
i.e. the change in the effect of Smoke which corresponds to a
change from Sex=F to Sex=M.
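
The symmetry just derived can be checked numerically. The following Python sketch uses made-up cell means and hypothetical function names; it simply computes the interaction both ways.

```python
# Hypothetical cell means, keyed by (sex, smoke); the numbers are made up.
mean_y = {("M", 0): 5.0, ("M", 1): 9.0, ("F", 0): 6.0, ("F", 1): 7.0}

def sex_effect(smoke):
    """Male minus female mean at a given level of Smoke."""
    return mean_y[("M", smoke)] - mean_y[("F", smoke)]

def smoke_effect(sex):
    """Smoker minus non-smoker mean within a given Sex."""
    return mean_y[(sex, 1)] - mean_y[(sex, 0)]

# Interaction as the change in the Sex effect across Smoke levels ...
interaction_via_sex = sex_effect(1) - sex_effect(0)
# ... and as the change in the Smoke effect across Sexes.
interaction_via_smoke = smoke_effect("M") - smoke_effect("F")
print(interaction_via_sex, interaction_via_smoke)
```

Both routes give the same number, whichever factor is treated as the "modifier".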

Clearly, it is entirely possible for mean(Y[Sex=M]&[Smoke=1])
to differ significantly from mean(Y[Sex=M]&[Smoke=0]),
AND for mean(Y[Sex=F]&[Smoke=1]) to differ significantly from
mean(Y[Sex=F]&[Smoke=0]) (i.e. for there to be a significant
Smoke effect at either level of Sex); and these could be in
the same direction (e.g. both positive);
AND for the Interaction to be significant (i.e. the effects
of Smoke, significant in each Sex, are significantly different
between Sexes).

You would surely want to know ALL of these facts. That is a
counter-example to the statement "The first person said that
the significance of main effects does not matter when interactions
are involved".

Similarly, it may be that mean(Y[Sex=M]&[Smoke=1]) is greater
than mean(Y[Sex=M]&[Smoke=0]), but not significantly; AND that
mean(Y[Sex=F]&[Smoke=1]) is less than mean(Y[Sex=F]&[Smoke=0]),
but again not significantly; AND that the Interaction
(mean(Y[Sex=M]&[Smoke=1]) - mean(Y[Sex=M]&[Smoke=0]))
- (mean(Y[Sex=F]&[Smoke=1]) - mean(Y[Sex=F]&[Smoke=0]))
(being a positive minus a negative) is positive AND significantly
so. So, despite the non-significant "main effects", you have
a significant interaction. Surely you want to know that??
(The presence of a significant interaction, using this definition
of "main effect", indicates that there is a real effect of Smoke
in one Sex or the other, but you do not have enough information to
establish it).

This is a counter-example to the second statement: "you should only
look at interaction if and only if the main effects were significant."


Complication: Another way of looking at it
Depending on what system of contrasts you use to compute main
effects and interactions, you may alternatively obtain:

Main effect of Smoke:
mean(Y[Smoke=1]) - mean(Y[Smoke=0]) (including both levels of Sex)
Main effect of Sex:
mean(Y[Sex=M]) - mean(Y[Sex=F]) (including both levels of Smoke).
Interaction (say):
(mean(Y[Sex=M]&[Smoke=1]) - mean(Y[Sex=M]&[Smoke=0]))
- (mean(Y[Sex=F]&[Smoke=1]) - mean(Y[Sex=F]&[Smoke=0]))

This alternative view of "main effect" can give different results
from the previous one, but once again all possible combinations
of "significant" and "non-significant" in main effects and
interactions are possible, namely:

Main Sex:     0 1 0 1 0 1 0 1
Main Smoke:   0 0 1 1 0 0 1 1
Interaction:  0 0 0 0 1 1 1 1

and you surely want to know which is which!

NOTE: In the second view of "main effect", if the effects of Smoke
are equal and opposite in the two Sexes, it would be possible to
have both main effects (Sex and Smoke) exactly zero, yet have a
big interaction, E.g.:

        |  Sex=F  |  Sex=M  |
--------+---------+---------+
Smoke=0:|  100000 | -100000 |
Smoke=1:| -100000 |  100000 |
--------+---------+---------+

Main Effect Smoke = 0, Main Effect Sex = 0, Interaction = 400000.
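
For comparison, here is a short Python sketch that computes both views of "main effect" from the table above. The variable names are illustrative; the first view takes Smoke=0 and Sex=F as reference levels, and the second assumes equal numbers in each cell.

```python
# Cell means from the table above, keyed by (sex, smoke).
y = {("F", 0): 100000, ("M", 0): -100000,
     ("F", 1): -100000, ("M", 1): 100000}

# First view: "main effect" at the reference level of the other factor
# (Smoke=0 and Sex=F are taken as the reference levels here).
sex_at_smoke0 = y[("M", 0)] - y[("F", 0)]
smoke_at_female = y[("F", 1)] - y[("F", 0)]

# Second view: marginal means, averaging over the other factor
# (assumes equal numbers in each cell).
sex_marginal = (y[("M", 0)] + y[("M", 1)]) / 2 - (y[("F", 0)] + y[("F", 1)]) / 2
smoke_marginal = (y[("F", 1)] + y[("M", 1)]) / 2 - (y[("F", 0)] + y[("M", 0)]) / 2

# The interaction is the same under either view.
interaction = (y[("M", 1)] - y[("M", 0)]) - (y[("F", 1)] - y[("F", 0)])
print(sex_at_smoke0, smoke_at_female, sex_marginal, smoke_marginal, interaction)
```

Under the first view both "main effects" are -200000, under the second both are 0, and the interaction is 400000 either way.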

Summary: Rules like those two don't work in general.
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.H...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 03-Oct-08 Time: 15:39:18
------------------------------ XFMail ------------------------------

Adrian Sayers

Oct 3, 2008, 11:59:40 AM
to MedS...@googlegroups.com
Stata parametrizes the model as you describe.

y = a + b +ab
is the same as
y = ab

It creates all your dummies automatically.
It can actually be a real pain in a complicated model.

However, for continuous variables you have to create your interactions yourself!

A


Peter Flom

Oct 3, 2008, 12:11:16 PM
to MedS...@googlegroups.com
I agree with this statement by Bendix, and the one by Ted.

I just read the statement 'significance of main effect does not matter when there is an interaction' somewhat differently. I thought it meant that you don't need to have sig. main effects in order for it to make sense to look at the interaction.

But, of course, the main effects are still important; they are the effects of each variable when the other variable is 0.

Peter

Swank, Paul R

Oct 3, 2008, 12:38:04 PM
to MedS...@googlegroups.com
But you can't look at the main effects without taking the interaction into account. At the same time you shouldn't ignore them.

Paul R. Swank, Ph.D
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center
Houston, TX 77038

Adaikalavan Ramasamy

Oct 4, 2008, 3:24:20 AM
to MedS...@googlegroups.com
Wow, thank you for the wonderful responses. I need some time to digest
these replies. In the meantime thank you to Adrian Sayers, Peter Flom,
Bendix Carstensen, Ted Harding and Paul Swank on very helpful
insights.

Regards, Adai


Bland, M.

Oct 9, 2008, 11:01:52 AM
to MedS...@googlegroups.com
If there is an interaction, it means that the effect of each variable
depends on the level of the other. If an effect is different depending
on something else, it must exist! So a significant interaction says
that both variables predict the outcome.

If the model is

logit(p) = a +b*smoke + c*male + d*smoke*male

the effect of male is c + d*smoke and the effect of smoke is c + d*male.

Martin

--
***************************************************
J. Martin Bland
Prof. of Health Statistics
Dept. of Health Sciences
Seebohm Rowntree Building Area 2
University of York
Heslington
York YO10 5DD

Email: mb...@york.ac.uk
Phone: 01904 321334 Fax: 01904 321382
Web site: http://martinbland.co.uk/
***************************************************

Martin Holt

Oct 10, 2008, 4:56:54 AM
to MedS...@googlegroups.com

----- Original Message -----
From: "Bland, M." <mb...@york.ac.uk>
To: <MedS...@googlegroups.com>
Sent: Thursday, October 09, 2008 4:01 PM
Subject: {MEDSTATS} Re: how to interpret interaction term when the main
effects are not statistically significant


> If there has an interaction, it means that the effect of each variable
> depends on the level of the other. If an effect is different depending on
> something else, it must exist! So a significant interaction says that
> both variables predict the outcome.
> If the model is
>
> logit(p) = a +b*smoke + c*male + d*smoke*male
>
> the effect of male is c + d*smoke and the effect of smoke is c + d*male.

I'll probably get hit over the head for this, but isn't the effect of smoke
"b + d*male" ?

Best Wishes,

Martin Holt

Bland, M.

Oct 14, 2008, 4:31:14 AM
to MedS...@googlegroups.com
Martin is of course correct and I made a mistake. I shall try to be
more careful in future!
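
With the corrected expression, the two conditional effects can be checked in a few lines of Python. The coefficient values are made up purely for illustration and the function names are hypothetical.

```python
# Hypothetical coefficients for logit(p) = a + b*smoke + c*male + d*smoke*male.
a, b, c, d = -1.0, 0.5, 0.25, -0.75

def linear_predictor(smoke, male):
    """Value of logit(p) for given 0/1 values of smoke and male."""
    return a + b * smoke + c * male + d * smoke * male

def smoke_effect(male):
    """Change in logit(p) from smoke=0 to smoke=1 at a given value of male."""
    return linear_predictor(1, male) - linear_predictor(0, male)

def male_effect(smoke):
    """Change in logit(p) from male=0 to male=1 at a given value of smoke."""
    return linear_predictor(smoke, 1) - linear_predictor(smoke, 0)

print(smoke_effect(0), smoke_effect(1))  # b, then b + d
print(male_effect(0), male_effect(1))    # c, then c + d
```

The effect of smoke is b + d*male and the effect of male is c + d*smoke, so each "main effect" coefficient on its own only describes the group where the other variable is 0.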

Martin
