
Wage regression - use sample weights?


Bob

Mar 27, 2008, 6:18:50 AM
Hi,

I am trying to do a wage regression (ln(wage) dependent on several
individual characteristics like age, education, region, etc.) based on
household survey data and now I don't know if and why sample weights
(here: sample inflation factors, multipliers to inflate the sample to
the total population) should be used in the regression and if so, how
this is done.
I saw some references where they discussed the issue but I didn't
really understand why the one way or the other is preferred.

Theoretically, I think one could clone each individual observation
(single household) as many times as its sample inflation factor and
add an error term from the distribution of the subgroup sample to
each clone. But practically, the size of the data would not be
manageable.
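
A minimal sketch of the cloning idea, on made-up data with hypothetical integer inflation factors: if each row is repeated as many times as its weight (without adding any extra error term), ordinary least squares on the expanded data gives the same coefficients as a weighted fit on the original rows, so the expansion is not needed for the point estimates. The standard errors from the cloned data would not be trustworthy, since the software would believe it has sum(w) independent observations.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: solve (X'WX) b = X'Wy."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one regressor
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)       # made-up "log wage" data
w = rng.integers(1, 50, size=n)                          # hypothetical integer inflation factors

# Literal cloning: repeat row i exactly w[i] times, then run ordinary least squares.
X_big = np.repeat(X, w, axis=0)
y_big = np.repeat(y, w)
beta_cloned = np.linalg.lstsq(X_big, y_big, rcond=None)[0]

# Weighted fit on the original n rows only.
beta_weighted = wls(X, y, w.astype(float))
```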

Can anyone point me to a gentle introductory reference on this
issue or give me some clues?

Many thanks,
Bob

Bob

Apr 1, 2008, 5:39:32 PM

No one doing regression analysis on household survey data and knowing
about the sample weight issue?!

Richard Ulrich

Apr 1, 2008, 9:06:20 PM
On Tue, 1 Apr 2008 14:39:32 -0700 (PDT), Bob <frot...@yahoo.com>
wrote:

> On Mar 27, 11:18 am, Bob <frott...@yahoo.com> wrote:
> > Hi,
> >
> > I am trying to do a wage regression (ln(wage) dependent on several
> > individual characteristics like age, education, region, etc.) based on
> > household survey data and now I don't know if and why sample weights
> > (here: sample inflation factors, multipliers to inflate the sample to
> > the total population) should be used in the regression and if so, how
> > this is done.

SPSS, for instance, allows you to specify case-weights
in general, which are then used for (almost) every procedure.

Computer programs for surveys, I think, allow weighting
within the regression program. I don't remember if SPSS
does.

If you want the tests to be useful at all, then the total
N after weighting should be about the same as the total N
before weighting, i.e. the weights should be rescaled to
average 1. If the weighting does very much to distort the
actual cell sizes, then the tests will be screwed up to a
corresponding degree.
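
A rough illustration of this point, on made-up data: the function below is a sketch of what happens when a procedure treats weights as case counts, so that "N" is the sum of the weights. The names (weighted_fit_naive_se, w_raw, w_norm) and the inflation factors are hypothetical. Rescaling the weights to sum to the real sample size leaves the coefficients alone but changes the naive standard errors a great deal.

```python
import numpy as np

def weighted_fit_naive_se(X, y, w):
    """WLS coefficients plus 'naive' standard errors that treat the
    weights as case counts, i.e. as if there were sum(w) observations."""
    Xw = X * w[:, None]
    XtWX = Xw.T @ X
    beta = np.linalg.solve(XtWX, Xw.T @ y)
    resid = y - X @ beta
    dof = w.sum() - X.shape[1]                 # "N" is taken to be the sum of the weights
    sigma2 = (w * resid**2).sum() / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(XtWX)))
    return beta, se

rng = np.random.default_rng(1)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([2.0, 0.3]) + rng.normal(size=n)
w_raw = rng.uniform(100, 1000, size=n)        # hypothetical inflation factors
w_norm = w_raw * n / w_raw.sum()              # rescaled so the weights sum to n

beta_raw, se_raw = weighted_fit_naive_se(X, y, w_raw)
beta_norm, se_norm = weighted_fit_naive_se(X, y, w_norm)
```

With the raw inflation factors the naive standard errors shrink by roughly the square root of the average weight, which is why the tests become meaningless. Even after rescaling, these are still not design-based standard errors.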

> > I saw some references where they discussed the issue but I didn't
> > really understand why the one way or the other is preferred.
> >
> > Theoretically, I think one could clone each individual observation
> > (single household) as many times as its sample inflation factor and
> > add an error term from the distribution of the subgroup sample to
> > each clone. But practically, the size of the data would not be
> > manageable.
> >
> > Can anyone point me to a gentle introductory reference on this
> > issue or give me some clues?
> >
> > Many thanks,
> > Bob
>
> No one doing regression analysis on household survey data and knowing
> about the sample weight issue?!

--
Rich Ulrich

http://www.pitt.edu/~wpilib/index.html

Aniko

Apr 4, 2008, 8:54:10 AM

The design of the survey has to be taken into account beyond the
weights, since most household surveys use quite complex designs with
multiple levels of stratification, sampling, etc. If the data come
from one of the big national surveys, they usually have some document
describing the recommended method of analysis. Usually you have to use
regression methods designed specifically for survey data to get
correct standard errors of the estimates (weighted regression will
give the right estimates, but the standard errors will be too low). I
know that Stata, SAS, and R all have such facilities. In those programs you
just specify the strata, primary sampling units and weights and they
will adjust for those in the regression models.
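
To give a feel for what such an adjustment can look like, here is a minimal sketch of a Taylor-linearization ("sandwich") variance estimator for a weighted regression. It is not the code of any particular package. It assumes PSUs are sampled with replacement within strata, at least two PSUs per stratum, and no finite population correction; the function name survey_wls and all the data are made up for illustration.

```python
import numpy as np

def survey_wls(X, y, w, strata, psu):
    """Weighted regression with design-based (linearization) standard errors.

    Assumes with-replacement sampling of PSUs within strata, at least two
    PSUs per stratum, and no finite population correction.
    """
    Xw = X * w[:, None]
    A = Xw.T @ X                                   # X'WX
    beta = np.linalg.solve(A, Xw.T @ y)
    scores = Xw * (y - X @ beta)[:, None]          # w_i * x_i * e_i, one row per observation

    p = X.shape[1]
    meat = np.zeros((p, p))
    for h in np.unique(strata):
        in_h = strata == h
        psus_h = np.unique(psu[in_h])
        # Sum the scores within each PSU of this stratum.
        totals = np.array([scores[in_h & (psu == j)].sum(axis=0) for j in psus_h])
        n_h = len(psus_h)
        dev = totals - totals.mean(axis=0)         # deviations from the stratum mean
        meat += n_h / (n_h - 1) * dev.T @ dev
    A_inv = np.linalg.inv(A)
    cov = A_inv @ meat @ A_inv
    return beta, np.sqrt(np.diag(cov))

# Made-up clustered sample: 4 strata, 10 PSUs each, 15 people per PSU.
rng = np.random.default_rng(2)
n_strata, psus_per_stratum, per_psu = 4, 10, 15
strata = np.repeat(np.arange(n_strata), psus_per_stratum * per_psu)
psu = np.tile(np.repeat(np.arange(psus_per_stratum), per_psu), n_strata)
n = strata.size
psu_effect = rng.normal(scale=0.5, size=(n_strata, psus_per_stratum))[strata, psu]
educ = rng.normal(12, 3, size=n)
X = np.column_stack([np.ones(n), educ])
log_wage = 1.0 + 0.08 * educ + psu_effect + rng.normal(scale=0.4, size=n)
w = rng.uniform(50, 500, size=n)                   # hypothetical sampling weights

beta, se = survey_wls(X, log_wage, w, strata, psu)
```

The coefficients are the ordinary weighted ones; only the variance changes, because the residual scores are first summed within each primary sampling unit and their spread is measured between PSUs inside each stratum.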

Aniko

Bob

Apr 9, 2008, 5:26:38 AM
>
> The design of the survey has to be taken into account beyond the
> weights, since most household surveys use quite complex designs with
> multiple levels of stratification, sampling, etc. If the data come
> from one of the big national surveys, they usually have some document
> describing the recommended method of analysis. Usually you have to use
> regression methods designed specifically for survey data to get
> correct standard errors of the estimates (weighted regression will
> give the right estimates, but the standard errors will be too low). I
> know that Stata, SAS, and R all have such facilities. In those programs you
> just specify the strata, primary sampling units and weights and they
> will adjust for those in the regression models.
>
> Aniko

Many thanks for the answers, Rich and Aniko!
I indeed know the survey design but they did not recommend any
specific regression method for this.

Apart from using these methods in a statistics package, I also would
like to understand why and how the data / weights / errors etc. are
adjusted but I found very little about this and nothing detailed
enough so that I could understand it. Of course, the main reason for
this is my insufficient knowledge of econometrics. But maybe there is
something out there describing the problems and solutions of wage
regressions on household data in detail?

Thanks,
Bob
