>>Given a multiple regression equation with (say) 4 IVs, how can I test
>> for the significance of the difference between (say) beta 1 and beta
>> 2, or (say) beta 1 and beta 4?
>Replace the two variables in question by their sum and difference,
>then rerun the regression. If the coefficient of the difference is
>significant then the difference between the coefficients of the two
>original variables is significant.
I suppose the sum and difference are calculated from z-transformed
variables. When I performed the proposed analysis with some real-life
data, it produced plausible results. But the rationale of this
procedure remains unclear to me. Could anyone give me a hint
or a reference as to why the coefficient of the difference should answer
the question of whether the coefficients of the original variables
are significantly different?
Thank you in advance
Michael
You've rewritten the original equation in a way that doesn't change
the equation; it just makes the new parameters functions (for example,
the difference) of the parameters in the original equation.
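This reparameterization point can be checked numerically. A minimal sketch
with made-up data (the coefficients and seed below are arbitrary, not from
the thread): fitting with x1+x2 and x1-x2 in place of x1 and x2 gives the
same fitted values, and the coefficient of the difference comes out as
(b1 - b2)/2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 4))                    # 4 IVs, hypothetical data
y = X @ np.array([1.0, 2.0, 0.5, -1.0]) + rng.normal(size=n)

# Original design (with intercept) and its least-squares fit
A = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(A, y, rcond=None)

# Reparameterized design: x1+x2 and x1-x2 replace x1 and x2
A2 = np.column_stack([np.ones(n),
                      X[:, 0] + X[:, 1],
                      X[:, 0] - X[:, 1],
                      X[:, 2], X[:, 3]])
b2, *_ = np.linalg.lstsq(A2, y, rcond=None)

# Same column space, so the fitted values are identical: same equation,
# different parameterization.
print(np.allclose(A @ b, A2 @ b2))             # True
# The coefficient of the difference variable is (b1 - b2)/2.
print(np.isclose(b2[2], (b[1] - b[2]) / 2))    # True
```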
-Dick Startz
>In an older posting in sci.stat.math
>Message-ID: <1193606771....@i13g2000prf.googlegroups.com>
>I found the following question and answer (answer by Ray Koopman):
>
>>>Given a multiple regression equation with (say) 4 IVs, how can I test
>>> for the significance of the difference between (say) beta 1 and beta
>>> 2, or (say) beta 1 and beta 4?
>
>>Replace the two variables in question by their sum and difference,
>>then rerun the regression. If the coefficient of the difference is
>>significant then the difference between the coefficients of the two
>>original variables is significant.
>
>I suppose sum and difference are calculated from z-transformed
>variables.
- That would be the case if your original question was using
"beta" to denote the standardized regression coefficient, in the
way that computer programs usually do. When I read the original
question, I assumed you meant the ordinary (unstandardized)
regression coefficient, in which case you would just sum-and-difference
the original variables themselves.
The reply by Richard Startz otherwise covers the question.
> When I performed the proposed analysis with some real-life
>data it produced plausible results. But the rationale of this
>procedure remains unclear to me. Could anyone give me a hint
>or reference why the coefficient of the difference should answer
>the question whether the coefficients of the original variables
>are significantly different?
>
--
Rich Ulrich
A less parsimonious but more general answer (which may help you
appreciate why the above is correct) is to look at the sampling
distribution of the coefficients. Let X be the design matrix and b
the estimated coefficient vector; then the mean of b is beta (the true
coefficient vector) and the theoretical covariance matrix of b given X
is sigma^2 * inv(X'X), where sigma is the standard deviation of the
disturbances. So far, we are making no assumptions about the
distribution family of the disturbances (but we are making the usual
regression assumptions about independent observations, etc.).
Now consider a linear contrast L = lambda'b of the estimated
coefficients. For instance, lambda = (1, -1, 0, ..., 0)' to look at
b_1 - b_2. L has mean lambda'beta and variance sigma^2 *
lambda'inv(X'X)lambda. If we now add the assumption that the disturbances are
normally distributed, and approximate sigma by the std. dev. of
residuals (s), then under the null hypothesis of a mean contrast of
zero, we get L/se(L) ~ t(n-p) where se(L) = s*sqrt(lambda'inv(X'X)
lambda), n = sample size and p = dim(b). Left to the reader as an
exercise: Ray Koopman's solution is a special (and simpler) case of
this.
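The contrast test above can be written out directly. A sketch with
hypothetical data (the design, coefficients, and seed are illustrative
only): compute b by least squares, estimate sigma^2 from the residuals
with n - p degrees of freedom, and form L/se(L) for lambda = (0, 1, -1,
0, 0)', i.e. the contrast b_1 - b_2 after the intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 60, 5                                   # p counts the intercept
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
beta = np.array([0.0, 1.0, 1.0, 0.5, -0.5])    # true b_1 = b_2 here
y = X @ beta + rng.normal(size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
s2 = resid @ resid / (n - p)                   # s^2, estimate of sigma^2
XtX_inv = np.linalg.inv(X.T @ X)

lam = np.array([0.0, 1.0, -1.0, 0.0, 0.0])     # contrast: b_1 - b_2
L = lam @ b
se_L = np.sqrt(s2 * lam @ XtX_inv @ lam)
t = L / se_L                                   # ~ t(n - p) under H0
print(t)
```

Comparing |t| against the t(n - p) quantile gives the significance test
for b_1 = b_2.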
Hope this helps. (Also hope I didn't scramble anything.)
Paul
x1' = x1 + x2
x2' = x1 - x2
x1 = (x1' + x2')/2
x2 = (x1' - x2')/2
b1*x1 + b2*x2 = b1*(x1' + x2')/2 + b2*(x1' - x2')/2
= ((b1 + b2)/2)*x1' + ((b1 - b2)/2)*x2'
= ( b1' )*x1' + ( b2' )*x2'
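The two answers can also be tied together numerically. A sketch with
made-up data (names and seed are illustrative): the t statistic of the
difference coefficient in the sum-and-difference refit equals the
contrast t statistic for b1 - b2 in the original fit, since b2' =
(b1 - b2)/2 and its standard error is halved in the same proportion.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 80
Z = rng.normal(size=(n, 4))
y = Z @ np.array([1.0, 1.5, 0.0, -1.0]) + rng.normal(size=n)

def t_stats(X, y):
    """Coefficient t statistics for an OLS fit of y on X."""
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / (n - p)
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return b / se

X1 = np.column_stack([np.ones(n), Z])                       # original
X2 = np.column_stack([np.ones(n),
                      Z[:, 0] + Z[:, 1],                    # x1' = x1 + x2
                      Z[:, 0] - Z[:, 1],                    # x2' = x1 - x2
                      Z[:, 2], Z[:, 3]])

# Contrast t statistic for b1 - b2 in the original fit
b, *_ = np.linalg.lstsq(X1, y, rcond=None)
resid = y - X1 @ b
s2 = resid @ resid / (n - X1.shape[1])
lam = np.array([0.0, 1.0, -1.0, 0.0, 0.0])
t_contrast = lam @ b / np.sqrt(s2 * lam @ np.linalg.inv(X1.T @ X1) @ lam)

# t statistic of the "difference" coefficient in the refit
t_diff = t_stats(X2, y)[2]
print(np.isclose(t_contrast, t_diff))                       # True
```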
Regards
Michael