Categorical variable coding in SPSS

Max

Jan 26, 2006, 4:27:23 AM
I wonder if anyone can help please?

I've been advised to treat the categorical variable GENDER as continuous for
a particular GLM procedure, that is, to use it as a covariate rather than as
a fixed effect.

I was told that to do this in SPSS, I must code it as a dummy variable
rather than using effect coding. Is this correct? What is the rationale
please?

Thanks

Max


Thom

Jan 26, 2006, 6:43:42 AM

It shouldn't make any difference for most analyses. There may be some
procedures in SPSS that won't accept categorical variables in certain
roles, so dummy coding may simply be a way to run the appropriate
analysis without getting error messages.

Thom

Art Kendall

Jan 26, 2006, 9:57:54 AM
Any dichotomy can be considered interval level because when there is only
one interval it meets the definition that all intervals are equal to
each other. The coding does not make much difference: (1000, 2000),
(3, 4), (-1, 1), (0, 1), etc.

However, treating gender as a covariate in GLM terminology usually is a
BAD IDEA. When you do this, you do get the main effect of gender, but
you are not taking into account the interaction of gender with the other
independent variables. Why throw information away?


Art
A...@DrKendall.org
Social Research Consultants

Bruce Weaver

Jan 26, 2006, 12:41:42 PM
Art Kendall wrote:
> Any dichotomy can be considered interval level because when there is only
> one interval it meets the definition that all intervals are equal to
> each other. The coding does not make much difference: (1000, 2000),
> (3, 4), (-1, 1), (0, 1), etc.
>
> However, treating gender as a covariate in GLM terminology usually is a
> BAD IDEA. When you do this, you do get the main effect of gender, but
> you are not taking into account the interaction of gender with the other
> independent variables. Why throw information away?
>

I don't follow, Art. There is nothing preventing you from including
covariate x factor interactions in GLM UNIANOVA syntax.

However, while playing with an example, I DID find that you can get
different results for the two methods if you go with the default Type
III SS. Is this what you were getting at?

Here's the example I tried.

GET FILE='C:\Program Files\SPSS\1991 U.S. General Social Survey.sav'.

* Use Type III SS (the default) .

UNIANOVA
educ BY race sex
/DESIGN = race sex race*sex .

UNIANOVA
educ BY race WITH sex
/DESIGN = race sex race*sex .

* Use Type I SS .

UNIANOVA
educ BY race sex
/METHOD = SSTYPE(1)
/DESIGN = race sex race*sex .

UNIANOVA
educ BY race WITH sex
/METHOD = SSTYPE(1)
/DESIGN = race sex race*sex .

The first two analyses use Type III SS, and give different F-tests for
the main effect of RACE. The F-tests for Sex and the interaction are
identical in the two models.

The last pair of analyses use Type I SS, and give identical F-tests for
all 3 terms.
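
One likely reason for the Type III difference, offered as a sketch rather than a statement of what SPSS does internally: when sex enters as a covariate and race*sex is in the design, the race test compares the races at sex = 0, so it depends on where the coding puts zero. The Python below (not SPSS syntax) illustrates this with made-up scores and a hypothetical two-level group factor in place of the GSS file; `cells` and `group_f_at_zero` are illustrative names. Because this model is saturated, each group's fitted line passes through its two cell means, so the test can be computed directly from the cell means and the pooled error.

```python
# Made-up scores per (group, sex) cell; hypothetical, not the GSS data.
cells = {
    ("a", "m"): [12.0, 14.0, 13.0],
    ("a", "f"): [15.0, 17.0, 16.0, 18.0, 14.0],
    ("b", "m"): [10.0, 9.0, 12.0, 11.0],
    ("b", "f"): [16.0, 15.0],
}


def mean(values):
    return sum(values) / len(values)


def group_f_at_zero(code_m, code_f):
    """1-df F for the a-vs-b difference at sex code 0 in the model
    y = group + sex_code + group*sex_code (sex_code used as a covariate)."""
    # Pooled within-cell error: the model is saturated, so fitted values
    # are the cell means.
    sse = sum((y - mean(v)) ** 2 for v in cells.values() for y in v)
    n_total = sum(len(v) for v in cells.values())
    mse = sse / (n_total - len(cells))
    # Each group's line runs through its two cell means; its height at
    # code 0 is w_m * mean(m cell) + w_f * mean(f cell).
    w_f = (0 - code_m) / (code_f - code_m)
    w_m = 1 - w_f
    weights = {("a", "m"): w_m, ("a", "f"): w_f,
               ("b", "m"): -w_m, ("b", "f"): -w_f}
    contrast = sum(w * mean(cells[k]) for k, w in weights.items())
    variance = mse * sum(w ** 2 / len(cells[k]) for k, w in weights.items())
    return contrast ** 2 / variance


f_by_coding = {codes: group_f_at_zero(*codes)
               for codes in [(0, 1), (-1, 1), (-0.5, 0.5), (1, 2)]}
for codes, f in f_by_coding.items():
    print(f"sex coded {codes}: F for group = {f:.4f}")
```

With a coding centred on zero, such as (-1, 1), the comparison is between unweighted marginal means, which is the hypothesis a Type III factor main effect tests. With (0, 1) the groups are compared within the sex coded 0, and with (1, 2) zero lies outside the observed codes, so the comparison is an extrapolation.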

Cheers,
Bruce

--
Bruce Weaver
bwe...@lakeheadu.ca
www.angelfire.com/wv/bwhomedir

Art Kendall

Jan 26, 2006, 3:19:08 PM
Although usage differs, the old-fashioned ANOVA models would not include
an interaction term between a covariate and a factor.
In common usage, especially in psych and ed, a covariate is a continuous
variable. Sex has 2 levels and most commonly would be used as a factor.
When a variable is used as a factor, most software automatically includes
the interaction term.
The inclusion of the interaction term is what distinguishes one-way
ANCOVA from 2-way ANOVA.
UNIANOVA does allow you to force inclusion of the interaction in the design.

SPSS has an article by Dave Nichols on types of SS somewhere on their
site, but they have rearranged the site and I don't have a link.

Art

Thom

Jan 27, 2006, 6:24:40 AM

I think the default ANCOVA analysis from the menus (GLM Univariate ...)
doesn't include factor-by-covariate interactions, whereas ANOVA includes
the interactions between factors. Leaving them out is an advantage in
analyses where you are short of d.f. In ANCOVA I'd always run the tests
of parallelism (homogeneity of regression slopes) in any case, though I'd
usually do this by creating the product terms using the compute function
so I can centre them if I need to.
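
A minimal sketch of building a centred product term by hand, written in Python rather than with the SPSS compute function. The data and the names `group`, `age`, `age_c` and `centred_product` are made up for illustration.

```python
# Hypothetical data: a 0/1 group dummy and a continuous covariate.
group = [0, 0, 0, 1, 1, 1, 1]
age = [23.0, 31.0, 27.0, 40.0, 35.0, 29.0, 46.0]

# Centre the covariate at its sample mean so that zero means "average age".
mean_age = sum(age) / len(age)
age_c = [a - mean_age for a in age]

# Product terms for the parallelism (group x covariate) test.
raw_product = [g * a for g, a in zip(group, age)]
centred_product = [g * a for g, a in zip(group, age_c)]

print("mean age:", mean_age)
print("centred covariate:", age_c)
print("raw product:", raw_product)
print("centred product:", centred_product)
```

Centring leaves the slopes and the product-term coefficient unchanged. What changes is the point at which the group difference is evaluated: the covariate mean instead of a covariate value of zero.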

Thom

Max

Jan 29, 2006, 5:38:33 PM
Does the coding system chosen for a categorical variable (e.g. gender)
not influence the value of the intercept, that is, the predicted value
of the dependent variable when all predictors are zero?

I ask because I have to use gender as a covariate rather than a fixed
factor (an SPSS bug - see post elsewhere) and someone suggested that
SPSS expects dichotomous categorical variables used as covariates to be
dummy coded. I could not see why SPSS would want this. What I am
hearing here is that the coding should make no difference whether gender
is used as a dichotomous factor or as a covariate.
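
A small sketch of the point in question, assuming a model with gender as the only predictor and no interaction terms. It is written in Python rather than SPSS syntax, the scores are made up, and `simple_ols` and `codings` are illustrative names. It fits the same two-group data under several codings and reports the intercept, the slope, and the t statistic for the gender term.

```python
import math


def simple_ols(x, y):
    """Return (intercept, slope, t statistic for the slope) for y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return intercept, slope, slope / se_slope


# Made-up outcome scores for two groups (not taken from the thread).
group = ["m", "m", "m", "m", "f", "f", "f", "f", "f"]
y = [12.0, 14.0, 11.0, 13.0, 15.0, 17.0, 16.0, 14.0, 18.0]

codings = {
    "dummy (0, 1)": {"m": 0, "f": 1},
    "effect (-1, 1)": {"m": -1, "f": 1},
    "(1, 2)": {"m": 1, "f": 2},
    "(1000, 2000)": {"m": 1000, "f": 2000},
}

results = {}
for name, code in codings.items():
    x = [code[g] for g in group]
    results[name] = simple_ols(x, y)
    a, b, t = results[name]
    print(f"{name:>15}: intercept={a:8.3f} slope={b:8.4f} t={t:.4f}")
```

The intercept moves with the coding because it is the predicted score where the code equals zero: the first group's mean under dummy coding, the midpoint of the two group means under effect coding. The t statistic for the gender term is the same under every coding. This invariance is shown only for a model without product terms involving gender.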

FTR

Jan 30, 2006, 1:19:03 PM
Art Kendall wrote:
> Although usage differs, the old-fashioned ANOVA models would not include
> an interaction term between a covariate and a factor.
> In common usage, especially in psych and ed, a covariate is a continuous
> variable. Sex has 2 levels and most commonly would be used as a factor.
> When a variable is used as a factor, most software automatically includes
> the interaction term.
> The inclusion of the interaction term is what distinguishes one-way
> ANCOVA from 2-way ANOVA.
> UNIANOVA does allow you to force inclusion of the interaction in the
> design.
>
> SPSS has an article by Dave Nichols on types of SS somewhere on their
> site, but they have rearranged the site and I don't have a link.
>
> Art
>

Art,

The articles of Dave Nichols can be found here:
ftp://ftp.spss.com/pub/spss/statistics/nichols

F. Thomas
