two-way ANOVA interaction, contrast coefficients

Kayle Sawyer

Jan 22, 2015, 10:11:02 PM
to piface-d...@googlegroups.com

Hi,

Could you please clarify how the contrast coefficients for an interaction should be specified in PiFace?

I am interested in determining a sample size to detect an interaction effect with a 2X2 between-subjects ANOVA design. I selected "Two-way ANOVA," and set row and col both to 2. I selected Differences/Contrasts and set my SD to 0.5 (from my pilot study). Next, I set detectable contrast to 0.3 (we need a difference of differences, i.e. interaction, 0.3 or larger for us to consider it important). I selected contrast levels of row*col, and method as t.

My question is, what do I set the contrast coefficients to be? Usually, when examining this kind of interaction, my software has the model set up with three contrasts:
row: 1 1 -1 -1
col: 1 -1 -1 1
row*col: 1 -1 1 -1

However, this does not appear to be how they are constructed in PiFace, because more than one contrast cannot be specified simultaneously, and the help says the order does not matter (which makes sense if there is only one factor).
Here are the Ns for each contrast at power = 0.80:
1 -1 1 -1: 89
-1 1: 45
-3 1 1 1: 262
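
As a rough cross-check of the three Ns above, here is a minimal sketch using a normal approximation. It is not PiFace's actual computation, which presumably uses the noncentral t distribution. It assumes n is the per-cell sample size, a two-sided test at alpha = 0.05, SD = 0.5, and a detectable contrast of 0.3; the helper name `approx_n` is made up for illustration. The point it illustrates is that the required n scales with the sum of squared coefficients (4, 2, and 12 here), which is why the three contrasts give such different answers.

```python
from math import ceil
from statistics import NormalDist


def approx_n(coefs, sd, delta, alpha=0.05, power=0.80):
    """Normal-approximation n per cell to detect a contrast of size delta.

    Hypothetical helper: assumes equal n in every cell and a two-sided test.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    sum_sq = sum(c * c for c in coefs)  # sum of squared contrast coefficients
    return ceil(z ** 2 * sd ** 2 * sum_sq / delta ** 2)


# Values reported in the post, keyed by contrast coefficients
reported = {(1, -1, 1, -1): 89, (-1, 1): 45, (-3, 1, 1, 1): 262}

approx = {c: approx_n(c, sd=0.5, delta=0.3) for c in reported}
for coefs, n in approx.items():
    print(coefs, "approx n =", n, "| reported =", reported[coefs])
```

The approximation should land within a unit or two of the reported values; small differences are expected because it ignores the t-distribution correction.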

Should I use -3 1 1 1? Do I leave it as -1 1? Intuitively, I suspect about 20 to 25 per subgroup (total N of 80 to 100) would be big enough to detect an effect of about d=0.6, but perhaps my intuition is wrong. 

(I am only interested in the interaction, not the overall F, nor the main effect of row, nor main effect of col.)

Thanks for your help!

Lenth, Russell V

Jan 23, 2015, 3:13:02 PM
to ksl...@gmail.com, piface-d...@googlegroups.com

I'll try to answer, but I need to note that different parts of your question are asking for different things, so I am not sure which part you really want the answer for.

 

The initial part of your question seems wrong:

     row: 1 1 -1 -1

     col: 1 -1 -1 1         } These seem backwards!

     row*col: 1 -1 1 -1     } (see below)

 

If you think of the means arranged in a matrix, the contrast coefficients are as shown below

 

                     rows       cols      row*col
     mu11  mu12       1  1       1 -1       1 -1
     mu21  mu22      -1 -1       1 -1      -1  1

 

Look at the marginal sums of each table. In the first one, the rows sum to (2,-2) and the columns to (0,0) - that's why it has only to do with rows. The second table has marginal sums of (0,0) for rows and (2,-2) for columns; and the third's marginal sums are (0,0) for both rows and columns, so it has only to do with interaction.

If you are really interested in just the interaction and not the main effects, then why are you troubled by not being able to specify more than one contrast simultaneously? In a 2x2 table, there is only the one contrast that has to do with interaction alone. When you ask about a contrast like (-3, 1; 1, 1) [which I am assuming is in the order 1st row; 2nd row], that contrast specifies a combination of row effects, column effects, and interaction effects -- in fact it is the negative of (table 1 + table 2 + table 3) above. Bottom line -- for interaction only, the contrast of interest is 1 -1 -1 1.
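
The marginal-sum argument can be checked mechanically. The sketch below is only an illustration: the dictionary keys and helper names are made up, and each table is stored as [[mu11, mu12], [mu21, mu22]]. It computes the row and column sums of the three coefficient tables and the negative of their elementwise sum.

```python
# Coefficient tables, laid out as [[mu11, mu12], [mu21, mu22]]
tables = {
    "rows":    [[1, 1], [-1, -1]],
    "cols":    [[1, -1], [1, -1]],
    "row*col": [[1, -1], [-1, 1]],
}


def row_sums(t):
    return tuple(sum(r) for r in t)


def col_sums(t):
    return tuple(sum(c) for c in zip(*t))


# (row sums, column sums) for each table
margins = {name: (row_sums(t), col_sums(t)) for name, t in tables.items()}
for name, (rs, cs) in margins.items():
    print(f"{name:8s} row sums {rs}  col sums {cs}")

# Negative of (table 1 + table 2 + table 3), element by element
combined = [
    [-sum(t[i][j] for t in tables.values()) for j in range(2)]
    for i in range(2)
]
print(combined)
```

A table whose row sums and column sums are all zero carries no main-effect information, which is what singles out the row*col table as the pure interaction contrast.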

 

The other important matters are the error SD and the target effect size. You can't just leave these at their initial settings. You have to think about it. For error SD, you need results of some kind of pilot study with the same measurement instrument. For target effect size, you need to think about how big an interaction would be important to detect. I guess you're trying to use Cohen's d as the effect size, but (a) I really, really, really recommend against using it -- you need to think about the actual measurements you will get. And (b) even if you insist on using it, Cohen's d applies to a contrast with coefficients (1,-1) and not to any other contrast.

 

Supposing it's IQ scores. The error SD might be 15 or so. For an interaction, think about a pattern of means like this:

 

     100 + a   100 - a    for instance,   103  97

     100 - a   100 + a                    97 103

 

... for which the value of the contrast with coefficients (1,-1,-1,1) is 4*a, or 12 in the example above. Choose the value of a to reflect the smallest possibility that would produce a result of practical importance. In my illustration, I am saying that a difference of 6 in IQ scores for males, accompanied by a difference of 6 in the opposite direction for females, would be considered important, but a smaller discrepancy would not be particularly important.
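
Written out term by term, with the symbol \psi introduced here only as a label for the contrast value (it is not PiFace notation), the arithmetic for this pattern of means is:

```latex
\begin{aligned}
\psi &= (+1)(100 + a) + (-1)(100 - a) + (-1)(100 - a) + (+1)(100 + a) \\
     &= (100 - 100 - 100 + 100) + (a + a + a + a) \\
     &= 4a, \qquad \text{so } a = 3 \text{ gives } \psi = 12 .
\end{aligned}
```

The baseline of 100 cancels, so only the size of a matters for the contrast.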

 

Hope this helps. I hope Google Groups doesn't mangle my carefully formatted tables...

 

Russ

 

Russell V. Lenth  -  Professor Emeritus

Department of Statistics and Actuarial Science  

The University of Iowa  -  Iowa City, IA 52242  USA  

Voice (319)335-0712 (Dept. office)  -  FAX (319)335-3017


Kayle Sawyer

Jan 23, 2015, 5:56:31 PM
to piface-d...@googlegroups.com, ksl...@gmail.com, russel...@uiowa.edu
Thanks so much for your careful answer. You are correct, I had switched the 'col' and 'row*col'. It should look like this:

     row: 1 1 -1 -1

     col: 1 -1 1 -1         

     row*col: 1 -1 -1 1     

So specifying '1 -1 -1 1' as the contrast makes sense. 

We have determined the error SD for each of our projects with pilot studies (all pilot N's >40), and it's consistent with the SD reported in the literature. The within SD used by PiFace is basically the same as the SD of the overall group mean, right?

The effect size of interest is harder. Your example is helpful, but I want to be sure I understand the number PiFace is using for "Detectable contrast." This is in raw units, right? So, in your example, the number would be 12 (a gender effect of -6 for controls, and +6 for the experimental group). Just to be clear, this would also be 12, right?

100  87

99 98

This would indicate a deficit of 13 for males, and a deficit of 1 for females. Subtract the gender effect for the experimental group from the control group ((100 - 87) - (99 - 98)), and we have a "Detectable contrast" equal to 12. 
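
A small sketch of that check, under the assumption that the means are laid out as [[mu11, mu12], [mu21, mu22]] (the function names are illustrative, not part of PiFace): the difference of differences and the contrast with coefficients (1, -1, -1, 1) are the same quantity, so both tables of means in this thread give the same value.

```python
INTERACTION = [[1, -1], [-1, 1]]  # coefficients for mu11, mu12, mu21, mu22


def contrast_value(means, coefs=INTERACTION):
    """Sum of coefficient * cell mean over the 2x2 table."""
    return sum(
        c * m
        for coef_row, mean_row in zip(coefs, means)
        for c, m in zip(coef_row, mean_row)
    )


def diff_of_diffs(means):
    """(mu11 - mu12) - (mu21 - mu22)."""
    return (means[0][0] - means[0][1]) - (means[1][0] - means[1][1])


symmetric = [[103, 97], [97, 103]]   # the crossover example with a = 3
one_sided = [[100, 87], [99, 98]]    # deficit of 13 vs. deficit of 1

for means in (symmetric, one_sided):
    print(means, contrast_value(means), diff_of_diffs(means))
```

Very different patterns of cell means can therefore share the same "Detectable contrast" value; the contrast only measures how much the two row differences disagree.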

Thanks also for your advice about selection of effect size by practical importance; it will be useful for the grant proposal we are writing. Say we selected 12 IQ points as our minimal interaction effect size of practical interest. Would you then justify this minimum with additional citations of studies showing how a 12 point deficit is meaningful, say with a review showing that a 12 point deficit has been associated with some percent lower income, or with ability to take medication regularly, or whatever relevant practical consequence we're interested in? For absolute scales, say mm^3 of tissue, my collaborator said people use a rule of thumb that a 10% difference is "practically important." If it's less than 10%, who cares.

However, how would you specify that a deficit of say 8 IQ points greater for the males than the females is NOT meaningful? Perhaps we don't need to say that for the grant proposal. It seems more important to have preliminary (pilot) evidence that shows there is a deficit for males that's bigger than for females, and ideally our preliminary evidence should be the same size as (or larger than) the minimum we selected for the power analysis. That way we can demonstrate we have a good chance of finding significant effects. I have seen successfully funded grant proposals that use this technique, but I have also read advice from you and others that specifically says NOT to use the pilot effect size for the power analysis.

Regardless of whether this is the right way to think about it, it would be wonderful if you could point me to a source that I could read further. Perhaps a textbook or NIH guide that focuses on how to justify the effect size chosen for our hypothesis? If you know of one, perhaps you could link to it on the "Put science before statistics" section of your home page.

Anyway, you've given me the answer to my main question: The PiFace contrast for a 2x2 interaction is: 1 -1 -1 1. Thanks again.

Best,

-Kayle Sawyer

Ph.D. Program in Behavioral Neuroscience 
Laboratory of Neuropsychology 
Boston University School of Medicine L-815 
72 E. Concord St 
Boston, MA 02118

Lenth, Russell V

Jan 24, 2015, 5:29:24 PM
to Kayle Sawyer, piface-d...@googlegroups.com

-- My responses are inserted and set off in the style of this paragraph...

Thanks so much for your careful answer. You are correct, I had switched the 'col' and 'row*col'. It should look like this:

     row: 1 1 -1 -1

     col: 1 -1 1 -1         

     row*col: 1 -1 -1 1     

So specifying '1 -1 -1 1' as the contrast makes sense. 

We have determined the error SD for each of our projects with pilot studies (all pilot N's >40), and it's consistent with the SD reported in the literature. The within SD used by PiFace is basically the same as the SD of the overall group mean, right?

-- Right

The effect size of interest is harder. Your example is helpful, but I want to be sure I understand the number PiFace is using for "Detectable contrast." This is in raw units, right? So, in your example, the number would be 12 (a gender effect of -6 for controls, and +6 for the experimental group). Just to be clear, this would also be 12, right?

100  87

99 98

This would indicate a deficit of 13 for males, and a deficit of 1 for females. Subtract the gender effect for the experimental group from the control group ((100 - 87) - (99 - 98)), and we have a "Detectable contrast" equal to 12. 

-- Right again

Thanks also for your advice about selection of effect size by practical importance; it will be useful for the grant proposal we are writing. Say we selected 12 IQ points as our minimal interaction effect size of practical interest. Would you then justify this minimum with additional citations of studies showing how a 12 point deficit is meaningful, say with a review showing that a 12 point deficit has been associated with some percent lower income, or with ability to take medication regularly, or whatever relevant practical consequence we're interested in? For absolute scales, say mm^3 of tissue, my collaborator said people use a rule of thumb that a 10% difference is "practically important." If it's less than 10%, who cares.

-- Sounds like a start. In some fields, that might be unrealistically small. In acoustics, for example, I think a standard threshold for a practical difference is 3 dB -- which corresponds to roughly a two-fold difference in energy. But every field is different.

However, how would you specify that a deficit of say 8 IQ points greater for the males than the females is NOT meaningful?

-- I was just giving a numerical example. I did not mean to imply that that would necessarily be the right effect size for IQ scores. Sorry if I left that impression.

Perhaps we don't need to say that for the grant proposal. It seems more important to have preliminary (pilot) evidence that shows there is a deficit for males that's bigger than for females, and ideally our preliminary evidence should be the same size as (or larger than) the minimum we selected for the power analysis. That way we can demonstrate we have a good chance of finding significant effects. I have seen successfully funded grant proposals that use this technique, but I have also read advice from you and others that specifically says NOT to use the pilot effect size for the power analysis.

-- I do feel that it is right to base target effect sizes on scientific goals. Basing it on past data is just finding a way to try to inflate those results to a level of importance, without asking whether it really is. However, if those past results really are of an important magnitude, then a case could be made that it is wasteful to power for an effect size much smaller than that.

Regardless of whether this is the right way to think about it, it would be wonderful if you could point me to a source that I could read further. Perhaps a textbook or NIH guide that focuses on how to justify the effect size chosen for our hypothesis? If you know of one, perhaps you could link to it on the "Put science before statistics" section of your home page.

-- Every field is different, and I don't have a specific reference for that. Sorry.

Kayle Sawyer

Jan 26, 2015, 2:45:59 PM
to piface-d...@googlegroups.com, ksl...@gmail.com, russel...@uiowa.edu
That all makes sense. 

Regarding the 8 IQ points, I wasn't asking about 8 in particular; I was asking more generally how you would provide evidence or reasoning that an effect smaller than your cutoff is not meaningful. But as you said, I suppose it depends on your field and scientific goals.

Thanks again for your help, and for PiFace!