weighted standard error of the mean

lina

Sep 29, 2010, 3:56:57 AM

to MedStats

Hello,

I have a short question concerning the definition of a standard error
of a weighted mean.

I have two groups: a patient group and a reference group. First, I
computed the mean value for some property for the patient group and a
weighted mean value for the control group with weights according to
the age and sex distribution of the patient group. Second, I compute
the usual standard error of the mean for the patient group, but I am
not sure how I can correctly compute the standard error of the
weighted mean of the control group. I used the sum of the weights (the
number of people in the patient group) as N for the standard error.
But in my case this is perhaps quite conservative, as I have many more
controls than patients and, thus, the standard errors for the controls
should be smaller than those for the patients. But as the variance in
both groups is comparable, my standard error definition results in
standard errors of approximately the same size for controls and
patients.

Do you know of any "official" definition of a standard error of a
weighted mean? Or do you know of any articles/references about this
topic?

I look forward to receiving any advice.
Cheers
Lina

Ted Harding

Sep 29, 2010, 5:15:53 AM

to meds...@googlegroups.com

Your query does not fully pin down the problem in detail, but the
following general approach should be applicable.

Let X1, X2, ... , Xn be n variables (independent of each other,
otherwise it gets more complicated ... ), with standard deviations
S1, S2, ... , Sn. Let W1, W2, ... , Wn be weights to be applied in
forming the weighted mean. The weighted mean is then

M = (W1*X1 + W2*X2 + ... + Wn*Xn)/(SW)

where SW = W1 + W2 + ... + Wn is the sum of the weights.

Then the variance of M is

((W1^2)*(S1^2) + (W2^2)*(S2^2) + ... + (Wn^2)*(Sn^2))/(SW^2)

and this is the square of the standard error of M.

Provided you know the values of W1, ... , Wn and S1, ... , Sn,
that is the end of the matter!
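
A minimal Python sketch of this calculation, assuming the Xi are
independent and the Si are known. The function name
weighted_mean_and_se and the example numbers are made up for
illustration and are not from any particular library.

```python
import math

def weighted_mean_and_se(x, w, s):
    """Weighted mean M and its standard error for independent X1..Xn.

    x: the values X1..Xn
    w: the weights W1..Wn
    s: the standard deviations S1..Sn of the individual Xi
    """
    sw = sum(w)  # SW, the sum of the weights
    m = sum(wi * xi for wi, xi in zip(w, x)) / sw
    # Var(M) = sum(Wi^2 * Si^2) / SW^2, valid only for independent Xi
    var_m = sum((wi * si) ** 2 for wi, si in zip(w, s)) / sw ** 2
    return m, math.sqrt(var_m)

# Hypothetical numbers, for illustration only.
m, se = weighted_mean_and_se(x=[10.0, 12.0, 11.0],
                             w=[2.0, 1.0, 1.0],
                             s=[1.0, 2.0, 2.0])
print(m, se)
```

With equal weights and a common standard deviation S, the result
reduces to the familiar S/sqrt(n).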

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.H...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 29-Sep-10 Time: 10:15:50
------------------------------ XFMail ------------------------------

lina

Oct 1, 2010, 2:42:24 AM

to MedStats

Thanks for your answer. But the formula you presented appears to be the
formula for the weighted sample variance and the corresponding weighted
sample standard deviation (as given in Wikipedia:
http://en.wikipedia.org/wiki/Weighted_mean). I computed the
corresponding value on my data set and the result had the expected
size for a sample variance.

My problem is that I need the standard error of the mean. Without
weighting, I would take the sample standard deviation divided by the
square root of the number of observations. But what do I do in the case
of weighting? I am not sure what the denominator should be.

Cheers,
Lina

Ray Koopman

Oct 2, 2010, 2:38:37 AM

to MedStats

[I have put the prior posts into temporal order.
My post is at the end. -- RFK]

On Sep 30, 11:42 pm, lina <lina....@googlemail.com> wrote:

> Thanks for your answer. But the formula you presented appears to be the
> formula for the weighted sample variance and the corresponding weighted
> sample standard deviation (as given in Wikipedia:
> http://en.wikipedia.org/wiki/Weighted_mean). I computed the
> corresponding value on my data set and the result had the expected
> size for a sample variance.
>
> My problem is that I need the standard error of the mean. Without
> weighting, I would take the sample standard deviation divided by the
> square root of the number of observations. But what do I do in the case
> of weighting? I am not sure what the denominator should be.

For a weighted mean and variance computed as in the Wikipedia
article, the standard error of the weighted mean is s/sqrt(n'),
where s is the weighted s.d., n' = V1^2/V2, and V1 and V2 are as in
the article. The df would be n' - 1. n' is the "effective sample
size" and is conceptually the same as Simpson's Reciprocal Index
of Diversity.
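
A minimal Python sketch of the s/sqrt(n') calculation. It assumes
that V1 is the sum of the weights, V2 is the sum of the squared
weights, and s is the unbiased weighted s.d. with denominator
V1 - V2/V1; check these against the article before relying on them.
The function name effective_n_se is made up for illustration.

```python
import math

def effective_n_se(values, weights):
    """Weighted mean, its standard error s/sqrt(n'), and n'."""
    v1 = sum(weights)                  # assumed: V1 = sum of weights
    v2 = sum(w * w for w in weights)   # assumed: V2 = sum of squared weights
    mean = sum(w * x for w, x in zip(weights, values)) / v1
    # Weighted sum of squared deviations from the weighted mean
    ss = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    s2 = ss / (v1 - v2 / v1)           # assumed unbiased weighted variance
    n_eff = v1 ** 2 / v2               # effective sample size n'
    return mean, math.sqrt(s2 / n_eff), n_eff

# Hypothetical numbers, for illustration only.
mean, se, n_eff = effective_n_se([1.0, 2.0, 3.0, 4.0], [3.0, 1.0, 1.0, 1.0])
print(mean, se, n_eff)
```

With equal weights, n' equals n and the result is the ordinary
standard error of the mean; the more unequal the weights, the
smaller n' becomes.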

However, the variance described in the Wikipedia article (and the
standard error I just gave) may not be what you should be using. The
precision of your mean, weighted or not, depends on how precisely
each of the values that are averaged estimates its own age-sex
subpopulation mean. The sizes of the differences between the
subpopulations do not matter. In Ted's expression for the weighted
variance, each term Si^2 should be understood as being (an estimate
of) the true variance of subpopulation i, divided by the number of
scores that enter into your sample mean for that subpopulation.

Why? Because the model that underlies the Wikipedia "Weighted sample
variance" section is not the same as the model used in previous
sections. The model that underlies the weighted variance section is
that X1,...,Xn are independent identically distributed observations
from a population with mean mu and variance sigma^2, whereas the
model that is appropriate for your data is that the Xi are
independent but not identically distributed: each Xi comes from
a population with its own mean mu_i and variance sigma_i^2.
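
One way to read this model in code, as a sketch under stated
assumptions: each Xi is the control-group sample mean in age-sex
subpopulation i, Si^2 is estimated by the sample variance within that
subpopulation divided by its number of control observations, and Wi
is the weight for that subpopulation (for example, the number of
patients in it). The function name, the stratum labels and the data
are hypothetical, and every stratum needs at least two observations.

```python
import math
import statistics

def stratified_weighted_mean_se(strata, weights):
    """Weighted mean of stratum means and its standard error.

    strata:  dict mapping stratum label -> list of control values
    weights: dict mapping stratum label -> weight Wi
    """
    sw = sum(weights[k] for k in strata)  # sum of the weights
    m = 0.0
    var_m = 0.0
    for k, values in strata.items():
        w = weights[k]
        n = len(values)
        # Estimated variance of the stratum mean: s_i^2 / n_i
        var_of_mean = statistics.variance(values) / n
        m += w * statistics.mean(values)
        var_m += w ** 2 * var_of_mean
    return m / sw, math.sqrt(var_m) / sw

# Hypothetical strata and weights, for illustration only.
controls = {"F_young": [1.0, 2.0, 3.0], "M_old": [4.0, 6.0, 8.0, 10.0]}
patients_per_stratum = {"F_young": 1.0, "M_old": 1.0}
print(stratified_weighted_mean_se(controls, patients_per_stratum))
```

Differences between the stratum means do not enter the standard
error here; only the within-stratum variances and sample sizes do.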

lina

Oct 4, 2010, 3:05:16 AM

to MedStats

Thank you very much.
