As there is an intervention, I assume you also have a control group that
did not receive the intervention. If so, the *standard* approach is a
model with:
Y = post-intervention score
X1 = indicator for intervention group (1=Int, 0=control)
X2 = covariate = baseline score
There may be other covariates too, but that's the basic model.
If instead you use Y = Change Score (post - baseline), you will get
exactly the same t-test on X1 (your group indicator). I wrote some
syntax to demonstrate this after a similar question came up in 2001.
You can see it here.
www.angelfire.com/wv/bwhomedir/spss/change_scores_and_ANCOVA.txt
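
For readers without SPSS to hand, here is a minimal sketch of the same equivalence in Python. The data are simulated, and the helper ols() is a hypothetical, dependency-free least-squares routine written for this illustration; it is not the syntax from the linked file.

```python
import math
import random

def ols(rows, y):
    """Least-squares fit via the normal equations (intercept added here).
    Returns (coefficients, t_statistics), intercept first."""
    X = [[1.0] + list(r) for r in rows]
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | I]; Gauss-Jordan turns it into [I | inv(X'X)].
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
         + [1.0 if a == c else 0.0 for c in range(k)] for a in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    inv = [row[k:] for row in A]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = [sum(inv[a][b] * Xty[b] for b in range(k)) for a in range(k)]
    resid = [y[i] - sum(beta[a] * X[i][a] for a in range(k)) for i in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)          # residual variance
    t = [beta[a] / math.sqrt(s2 * inv[a][a]) for a in range(k)]
    return beta, t

# Simulated two-group pre/post data (assumed values, for illustration only).
random.seed(1)
group = [i % 2 for i in range(40)]                     # X1: 1 = intervention
baseline = [random.gauss(5 + 0.5 * g, 1.5) for g in group]   # X2
post = [1.0 + 0.6 * b + 0.8 * g + random.gauss(0, 1)
        for b, g in zip(baseline, group)]
change = [p - b for p, b in zip(post, baseline)]

predictors = [[g, b] for g, b in zip(group, baseline)]
beta_post, t_post = ols(predictors, post)      # Y = post-intervention score
beta_chg, t_chg = ols(predictors, change)      # Y = change score

print("group effect, Y = post:   b = %.4f, t = %.4f" % (beta_post[1], t_post[1]))
print("group effect, Y = change: b = %.4f, t = %.4f" % (beta_chg[1], t_chg[1]))
print("baseline coefficient: %.4f (post) vs %.4f (change)"
      % (beta_post[2], beta_chg[2]))
```

The two group lines should agree to rounding error, while the baseline coefficient in the change-score model is the one from the post-score model minus 1.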
HTH.
--
Bruce Weaver
bwe...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home
"When all else fails, RTFM."
* Add the grand means of pre and income to every case.
compute constant = 1.
aggregate outfile = * mode = addvariables
 /break = constant
 /mean_pre = mean(pre)
 /mean_income = mean(income).
* Centre the continuous predictors before forming the product terms.
compute interact1 = (pre - mean_pre) * (income - mean_income).
compute interact2 = (pre - mean_pre) * treatment.
compute interact3 = (pre - mean_pre) * (income - mean_income) * treatment.
* Hierarchical regression: main effects first, then the product terms.
regression
 /variables = post pre income treatment interact1 interact2 interact3
 /statistics = r anova cha outs
 /dependent = post
 /method = enter pre income treatment
 /method = enter interact1 interact2
 /method = enter interact3.
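
As a rough illustration of what the centring step in the syntax above produces, here is a small Python sketch. The variable names follow the syntax; the six data rows are made up purely for the example.

```python
from statistics import mean

# Hypothetical toy data; only the variable names come from the syntax above.
pre = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0]
income = [30.0, 55.0, 42.0, 61.0, 38.0, 47.0]
treatment = [0, 1, 0, 1, 0, 1]

# Grand-mean centre the continuous predictors.
mean_pre = mean(pre)
mean_income = mean(income)
pre_c = [p - mean_pre for p in pre]
income_c = [x - mean_income for x in income]

# Product terms, built the same way as interact1 to interact3.
interact1 = [p * x for p, x in zip(pre_c, income_c)]
interact2 = [p * t for p, t in zip(pre_c, treatment)]
interact3 = [p * x * t for p, x, t in zip(pre_c, income_c, treatment)]

for row in zip(pre_c, income_c, treatment, interact1, interact2, interact3):
    print(row)
```

Each centred variable has mean zero, and the product terms involving treatment are zero for every control case, since treatment is coded 0 there.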
In addition, use the GUI to paste syntax from GLM Repeated Measures and
run it with pre and post as the repeated measure and income and
treatment as independent variables.
If you have 4 variables (Pre_fruit, Post_fruit, Pre_vegs, Post_vegs),
use the GUI to paste syntax with a doubly repeated measure:
(pre vs post) * (fruit vs veg).
Art Kendall
Social Research Consultants
On 3/19/2011 1:01 PM, Ang wrote:
> I need to run a regression analysis to see if income predicts change
> in fruit and vegetable intake (changeFV) as a result of an
> intervention. ChangeFV is calculated by post-intervention fruit &
> vegetable intake (postFV) minus pre-intervention fruit & vegetable
Just as a note for the OP, this problem of whether to use the change
scores as the dependent variable or the levels at post test is
sometimes referred to as "Lord's Paradox". Paul Allison has a good
paper on the topic for observational data:
http://www.pauldallison.com/downloads/Allison.SM90.pdf
But notice that the two models that Allison is contrasting in his
Lord's Paradox section differ in that the one using the change score
as the outcome variable does not include Y1 as an explanatory
variable. I.e., his two models are:
1. Y2 = b0 + b1*Y1 + b2*X
2. (Y2-Y1) = b0 + b1*X
where X = 1 for the treatment group and 0 for the control group. He
calls Model 1 "the regressor variable approach", and Model 2 the
"change score method".
For the data in his Table 1, b2 from Model 1 was positive and had a
p-value around .03. In Model 2, b1 was close to 0 and non-significant.
But if he had run Model 3 below, he would have found that b2 from
Model 3 equals b2 from Model 1. This is what the example on my
webpage illustrates.
3. (Y2-Y1) = b0 + b1*Y1 + b2*X
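
A short algebraic sketch of why Models 1 and 3 have to agree on the group effect. The primed symbols for Model 3's coefficients and the error term e are notation introduced here for illustration; they are not Allison's.

```latex
\begin{align*}
\text{Model 1:}\quad Y_2 &= b_0 + b_1 Y_1 + b_2 X + e \\
\text{subtract } Y_1 \text{ from both sides:}\quad
Y_2 - Y_1 &= b_0 + (b_1 - 1)\,Y_1 + b_2 X + e \\
\text{Model 3:}\quad Y_2 - Y_1 &= b_0' + b_1' Y_1 + b_2' X + e' \\
\Rightarrow\quad b_0' &= b_0, \qquad b_1' = b_1 - 1, \qquad b_2' = b_2, \qquad e' = e
\end{align*}
```

Since the predictors and the residuals are the same in both fits, the standard error and t-test for X are the same as well; only the coefficient for Y1 shifts by 1.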
HTH.
--
Bruce Weaver
bwe...@lakeheadu.ca
I don't deny your math is right, Bruce, but I'm not quite sure why that
matters. Unless I'm missing something, just because your models
1 and 3 give equivalent answers doesn't make them preferable to model
2. In the paper Allison gives examples where model 2 may be preferable
and where model 1 is obviously inappropriate in different
observational contexts.
Your model 3 can be rewritten as
1. (Y2 - Y1) = b0 + b1*Y1 + b2*X2 + e
2. Y2 = b0 + b1*Y1 + b2*X2 + (e + Y1)
Doesn't this mean that b1*Y1 is correlated with the error term (or
does that not matter since all I am interested in is b2?)
So am I missing something? Although these types of experiments are
different than time series analysis of the economics sort, it is
inappropriate to use differences on one side of the equation and
levels on the other. See http://www.griffith.edu.au/__data/assets/pdf_file/0017/88100/Greenberg-2001.pdf
for an example. Maybe that is not a good example, as the nature of
unemployment is different than "vegetable intake", but still aren't
there issues in saying the levels of vegetable intake affect the change
scores?
Andy W
Hi Andy. All I was getting at was that the reason Allison found
different group effects in his two models was not *simply* that he
changed from using Y2 to using Y2-Y1 as his outcome variable. The key
factor was dropping Y1 as a covariate when he went to the model with
Y2-Y1. Had he retained Y1 as a covariate, then he would have found
exactly the same group effect.
>
> Your model 3 can be rewritten as
>
> 1. (Y2 - Y1) = b0 + b1*Y1 + b2*X2 + e
> 2. Y2 = b0 + b1*Y1 + b2*X2 + (e + Y1)
>
> Doesn't this mean that b1*Y1 is correlated with the error term (or
> does that not matter since all I am interested in is b2?)
As I recall, this is the argument against using Y2-Y1 as the outcome
when you are including Y1 as a covariate. The group effect (the thing
of main interest) is the same if you just use Y2; and the coefficient
for Y1 is much easier to interpret.
>
> So am I missing something? Although these types of experiments are
> different than time series analysis of the economics sort, it is
> inappropriate to use differences on one side of the equation and
> levels on the other. See http://www.griffith.edu.au/__data/assets/pdf_file/0017/88100/Greenber...
> for an example. Maybe that is not a good example, as the nature of
> unemployment is different than "vegetable intake", but still aren't
> there issues in saying the levels of vegetable intake affect the change
> scores?
Thanks for the reference.
>
> Andy W
--
Bruce Weaver
bwe...@lakeheadu.ca