Dummy coding? Sum coding?

Apr 15, 2018, 12:44:39 PM
to StatForLing with R
Hi all,

I am analyzing my data with a linear mixed-effects regression and am not sure whether I should use dummy (treatment) coding or sum coding.

My model includes Variable 1, Variable 2 and their interaction as fixed effects.
Variable 1 has three levels “A”, “B” and “C”, where “A” is the baseline. 
Variable 2 also has three levels, “a”, “b” and “c”, where “a” is the baseline.

As far as I know, when a model contains an interaction term, sum coding must be used rather than dummy coding in order to see the main effect of each factor. With dummy coding, the coefficient for a factor is a simple effect (its effect at the baseline level of the other factor), not a true main effect. However, I am also aware that with sum coding one level of each factor does not get its own coefficient in the output. That seems fine when a variable has only two levels, but problematic when it has three or more, as in my data.
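
To make the two schemes concrete, here is a minimal sketch of what the Variable 1 coefficients stand for under each coding. It assumes a balanced design with the full interaction in the model, so each coefficient can be written in terms of cell means. The cell means and function names are made up for illustration and do not come from my data, and the sketch is plain Python rather than R, only to show the arithmetic.

```python
# Hypothetical cell means for a 3x3 design (rows: Variable 1 = A, B, C;
# columns: Variable 2 = a, b, c). The numbers are made up for illustration.
cell_means = {
    "A": {"a": 10.0, "b": 12.0, "c": 14.0},
    "B": {"a": 13.0, "b": 18.0, "c": 17.0},
    "C": {"a": 16.0, "b": 15.0, "c": 20.0},
}


def treatment_coefficients(means, baseline_row="A", baseline_col="a"):
    """Variable 1 coefficients under dummy (treatment) coding with an interaction.

    Each coefficient is a simple effect: the difference from the baseline row,
    evaluated only at the baseline level of Variable 2.
    """
    base = means[baseline_row][baseline_col]
    return {
        row: means[row][baseline_col] - base
        for row in means
        if row != baseline_row
    }


def sum_coefficients(means):
    """Variable 1 coefficients under sum (deviation) coding.

    Each coefficient is a row mean minus the grand mean of the cell means.
    Only the first k-1 levels get a coefficient; the last level's deviation
    is implied, because the deviations sum to zero.
    """
    row_means = {row: sum(cols.values()) / len(cols) for row, cols in means.items()}
    grand_mean = sum(row_means.values()) / len(row_means)
    rows = list(means)
    estimated = {row: row_means[row] - grand_mean for row in rows[:-1]}
    implied_last = {rows[-1]: -sum(estimated.values())}
    return estimated, implied_last


print(treatment_coefficients(cell_means))
print(sum_coefficients(cell_means))
```

In this sketch the dummy-coded coefficients compare B and C with A only within level “a”, while the sum-coded coefficients compare each row mean with the grand mean. The level without its own sum-coded coefficient (“C” here) is the negative of the sum of the other two, so under these assumptions it can still be recovered from the output.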

Which coding method should I use for my data?
