Can anybody tell me how the Hausman test for endogeneity works?
I have a simulated model with three correlated predictors (X1-X3). I also
have an instrument W for X1.
Now I want to test for endogeneity of X1 (i.e., when I omit X2 and X3 from
the equation).
My current approach:
library(systemfit)
fit2sls <- systemfit(Y~X1,data=data,method="2SLS",inst=~W)
fitOLS <- systemfit(Y~X1,data=data,method="OLS")
print(hausman.systemfit(fitOLS, fit2sls))
This seems to work fine. However, when I include X2 as a further predictor,
the 2SLS estimation doesn't work.
Thanks in advance
Holger
--
View this message in context: http://r.789695.n4.nabble.com/Hausman-Test-tp3220016p3220016.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-h...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
> This seems to work fine. However, when I include X2 as a further predictor,
> the 2SLS estimation doesn't work.
When you don't need any instruments for X2, you should use
Y ~ X1 + X2, inst = ~ W + X2
Then, regressor X2 is unaltered in the second stage of the regression
(after projection onto the instruments).
hth,
Z
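Z's projection point can be illustrated with a base-R sketch (an illustration of the idea, not code from the thread; variable names follow Holger's simulation, and the two-stage regression is done "by hand" with lm() rather than systemfit()):

```r
## Manual two-stage least squares with base R only, showing why an exogenous
## regressor (X2) that serves as its own instrument is unaltered by the
## first-stage projection onto the instrument set.
set.seed(1)
n  <- 1000
W  <- rnorm(n)                       # instrument for X1
X2 <- rnorm(n)                       # exogenous regressor
X1 <- 0.5 * W + rnorm(n)             # regressor instrumented by W
Y  <- 0.4 * X1 + 0.5 * X2 + rnorm(n)

## First stage: project each regressor onto all instruments (W and X2).
X1.hat <- fitted(lm(X1 ~ W + X2))
X2.hat <- fitted(lm(X2 ~ W + X2))    # X2 is itself in the instrument set ...

## ... so its projection reproduces X2 exactly:
all.equal(unname(X2.hat), X2)        # TRUE

## Second stage: regress Y on the projected regressors.
coef(lm(Y ~ X1.hat + X2))
```

The second-stage coefficients are the 2SLS estimates; X2 enters the second stage unchanged, which is exactly why `inst = ~ W + X2` is the right instrument specification when X2 is exogenous.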
Thank you very much.
One follow-up question: the Hausman test always gives me a p-value of 1, no
matter how small the statistic is.
I now generated orthogonal regressors (X1-X3) and the test gives me
Hausman specification test for consistency of the 3SLS estimation
data: data
Hausman = -0.0138, df = 2, p-value = 1
What is confusing to me is the "3SLS". I am just beginning to learn about
instrumental variables (I am a psychologist ;) Perhaps that's a problem?
As a background, here's the complete simulation:
W = rnorm(1000)
X2 = rnorm(1000)
X3 = rnorm(1000)
X1 = .5*W + rnorm(1000)
Y = .4*X1 + .5*X2 + .6*X3 + rnorm(1000)
data = as.data.frame(cbind(X1,X2,X3,Y,W))
fit2sls <- systemfit(Y~X1,data=data,method="2SLS",inst=~W)
fitOLS <- systemfit(Y~X1,data=data,method="OLS")
print(hausman.systemfit(fitOLS, fit2sls))
Best,
Holger
On 16 January 2011 15:53, Holger Steinmetz <Holger.s...@web.de> wrote:
> One follow up question. The Hausman-test always gives me a p-value of 1 - no
> matter how small the statistic is.
Please do read the documentation of hausman.systemfit(). I regret that
comparing 2SLS with OLS results has not been implemented yet:
====== part of documentation of hausman.systemfit() =================
Usage:
hausman.systemfit( results2sls, results3sls )
Arguments:
results2sls : result of a _2SLS_ (limited information) estimation
returned by ‘systemfit’.
results3sls : result of a _3SLS_ (full information) estimation
returned by ‘systemfit’.
Details:
The null hypothesis of the test is that all exogenous variables
are uncorrelated with all disturbance terms. Under this
hypothesis both the 2SLS and the 3SLS estimator are consistent but
only the 3SLS estimator is (asymptotically) efficient. Under the
alternative hypothesis the 2SLS estimator is consistent but the
3SLS estimator is inconsistent.
The Hausman test statistic is
m = ( b_2 - b_3 )' ( V_2 - V_3 )^{-1} ( b_2 - b_3 )
where $b_2$ and $V_2$ are the estimated coefficients and their
variance covariance matrix of a _2SLS_ estimation and $b_3$ and
$V_3$ are the estimated coefficients and their variance covariance
matrix of a _3SLS_ estimation.
=========================================
Please don't hesitate to write a new version of hausman.systemfit()
that can also compare 2SLS with OLS results.
Best regards from Copenhagen,
Arne
--
Arne Henningsen
http://www.arne-henningsen.name
> What is confusing to me is the "3SLS".
Hausman tests can be used for comparisons of various models. The
implementation in systemfit is intended for comparison of 2SLS and 3SLS
but can also be (ab)used for comparison of 2SLS and OLS. You just have to
enter the models in the reverse order, i.e., hausman.systemfit(fit2sls,
fitOLS).
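The reversed order also explains the p-value of 1: with hausman.systemfit(fitOLS, fit2sls) the variance difference is V_OLS - V_2SLS, which is negative (OLS is the more efficient estimator under the null), so the quadratic form comes out negative, and the upper-tail chi-squared probability of any negative statistic is 1. A base-R illustration (using the -0.0138 from Holger's reported output):

```r
## An upper-tail chi-squared probability is 1 for any negative statistic,
## which is what a reversed argument order produces.
pchisq(-0.0138, df = 2, lower.tail = FALSE)  # 1, as in Holger's output
pchisq( 0.0138, df = 2, lower.tail = FALSE)  # ~0.993, a proper p-value
```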
A worked example that computes the test statistic "by hand" is also
included in
help("Baltagi2002", package = "AER")
in the section about the US consumption data, Chapter 11.
An adaptation is also shown below:
## data
library("AER")
data("USConsump1993", package = "AER")
usc <- as.data.frame(USConsump1993)
usc$investment <- usc$income - usc$expenditure
## 2SLS via ivreg(), Hausman by hand
fm_ols <- lm(expenditure ~ income, data = usc)
fm_iv <- ivreg(expenditure ~ income | investment, data = usc)
cf_diff <- coef(fm_iv) - coef(fm_ols)
vc_diff <- vcov(fm_iv) - vcov(fm_ols)
x2_diff <- as.vector(t(cf_diff) %*% solve(vc_diff) %*% cf_diff)
pchisq(x2_diff, df = 2, lower.tail = FALSE)
## 2SLS via systemfit(), Hausman via hausman.systemfit()
library("systemfit")
sm_ols <- systemfit(expenditure ~ income, data = usc, method = "OLS")
sm_iv <- systemfit(expenditure ~ income, data = usc, method = "2SLS",
inst = ~ investment)
hausman.systemfit(sm_iv, sm_ols)
hth,
Z
> Please don't hesitate to write a new version of hausman.systemfit()
> that can also compare 2SLS with OLS results.
Arne: Unless I'm missing something, hausman.systemfit() essentially does
the right thing and computes the right statistic and p-value (see my other
mail to Holger). Maybe some preliminary check on the input objects could
be used for determining the right order of models.
Best,
Z
On 16 January 2011 16:37, Achim Zeileis <Achim....@uibk.ac.at> wrote:
Thanks for the response and the suggestions. Adding a check for the
input objects and extending the documentation is a good idea! I will
change the systemfit package accordingly in the future.
/Arne
This helped me a lot.
Best,
Holger