
Simple Proof that c=median minimizes E[ |X-c| ] needed.


Menees, Bill

Sep 3, 1992, 2:44:00 PM
I'm a senior math major taking my first p&s course, and this problem
has come up and it intrigues me. My prof. has a proof for it, but he said it
was way over my head. Does anyone know of a proof suitable for a senior
undergrad? Thanks in advance.

USENET News System

Sep 3, 1992, 6:10:35 PM
If you are a senior math major then you should have no difficulty
in following the proof. I suspect the real problem is that it's way
over your prof.'s head. :-)

Well, I must admit that I'm going to cheat. :-) Let's prove the result
for a discrete distribution only.

When c != any possible x,

d/dc (sum |x-c|) = (number of x's less than c) - (number of x's greater than c);

at the possible values of x the derivative does not exist.
This derivative is negative if c < median and positive if c > median,
so sum |x-c| decreases to the left of the median and increases to the
right. As sum |x-c| is continuous and piecewise linear, the minimum
either occurs where there is a discontinuity in the derivative, or
where the derivative is zero over an interval. The latter case
corresponds to an interval of medians. In the former case the minimum
occurs at the discontinuity where the derivative changes sign, which
is exactly the sample median.

In the continuous case the derivative exists nowhere but a limiting
argument could be used. (This might be over my head :-).
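
A quick concrete check of the discrete argument (a sketch only; the
sample values are an arbitrary choice): evaluate sum |x - c| on a fine
grid in Python and the minimum sits at the sample median.

    import numpy as np

    # Sum of absolute deviations is piecewise linear in c, with kinks
    # at the data points; its minimum is at the sample median.
    x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])  # arbitrary example sample

    grid = np.linspace(0.0, 12.0, 1201)
    sums = [np.abs(x - c).sum() for c in grid]
    print(grid[np.argmin(sums)], np.median(x))  # both 4.0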


T. Scott Thompson

Sep 3, 1992, 10:01:11 PM
ne...@massey.ac.nz (USENET News System) writes:

>In article <3SEP1992...@utkvx2.utk.edu>, men...@utkvx2.utk.edu (Menees, Bill) writes:
>>
>> I'm a senior math major taking my first p&s course, and this problem
>> has come up and it intrigues me. My prof. has a proof for it, but he said it
>> was way over my head. Does anyone know of a proof suitable for a senior
>> undergrad? Thanks in advance.
>>

[explanation for the discrete case deleted]

>In the continuous case the derivative exists nowhere but a limiting
>argument could be used. (This might be over my head :-).

I think that the continuous version is more straightforward since then
the objective function is _everywhere_ differentiable. To get a feel
for the continuous case consider that any solution to the problem

min E(|x-c|)
c

must satisfy a first-order condition provided E(|x-c|) is
differentiable in c. When x is continuous this function is _always_
differentiable since the smooth distribution for x smooths out the
kink point in the absolute value function. Differentiability only
fails if Pr( x = c ) is strictly positive.

The trick is to compute the derivative of E(|x-c|). We can commute
the derivative operator and the expectation integral (showing that
this is OK is the technically difficult part of the proof that your
professor may have had in mind) in order to derive

d/dc E(|x-c|) = E( d/dc |x-c| )

= E [ 1{ x < c } - 1{ x > c } ]

= E 1{ x < c } - E 1{ x > c}

= Pr( x < c ) - Pr( x > c )

where 1{ ... } is the indicator function taking the value +1 if "..."
is true and zero otherwise. To see why this is true, simply note that
the derivative of |x-c| with respect to c equals +1 if x < c and -1 if
x > c. We can ignore the case x = c, since this has zero probability,
and since any reasonable extension of the definition of
differentiability to handle this case would yield a value for the
derivative somewhere between -1 and +1 at the point x = c.
Technically, to verify the first equality you must check that the
conditions of the "Lebesgue dominated convergence theorem" are
satisfied. You can look this up in any advanced probability theory
text.

Now notice that the derivative is zero if and only if

Pr( x < c ) = Pr( x > c )

and this equation is solved by choosing c = median(x). The solution
is unique if the probability density of x is positive on some
neighborhood of the median.
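
As a numerical sanity check, here is a short Python sketch (the Exp(1)
example, the seed, and the sample sizes are arbitrary choices): it
approximates E(|x-c|) by Monte Carlo, and the minimizer lands at the
median, log(2) =~ 0.693.

    import numpy as np

    # Approximate g(c) = E|X - c| for X ~ Exp(1) by Monte Carlo and
    # locate its minimizer on a grid; it should sit near the median.
    rng = np.random.default_rng(0)
    x = rng.exponential(scale=1.0, size=200_000)  # example; median = log(2)

    grid = np.linspace(0.0, 3.0, 601)
    g = [np.mean(np.abs(x - c)) for c in grid]
    print(grid[np.argmin(g)], np.log(2.0))  # both close to 0.693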

--
T. Scott Thompson email: thom...@atlas.socsci.umn.edu
Department of Economics phone: (612) 625-0119
University of Minnesota fax: (612) 624-0209

Brian Junker

Sep 3, 1992, 9:55:05 PM
Here are some ideas off the top of my head, that should get you going in
the right direction. You will need some decent calculus skills, but
that's about it.

Case 1. X is a real-valued continuous random variable with density f(x)
such that xf(x) -> 0 as x -> +infinity and as x -> -infinity. Write
E[|X-c|] as the sum of two integrals, one going from -infinity to c and
the other going from c to +infinity, where neither integral involves
abs. val's anymore. Differentiate this sum with respect to c and
explore... (a worked sketch for one concrete density follows Case 3
below).

Case 2. x_1, x_2, ... x_n are fixed real numbers (i.e. you have a
sample of size n). Then c=the sample median minimizes the sum of
absolute deviations |x_1-c| + |x_2-c| + ... + |x_n-c|. You can do the
cases n=2, 3, and 4 "by hand" by graphing the sum of abs. deviations as
a function of c---it is piecewise linear with bends at the data points
x_1, x_2, ... What do you have to do to give a rigorous proof for n
data points now?

Case 3. X is a discrete random variable taking on the values x_1, x_2,
.... with probabilities p_1, p_2, .... The ideas you explored above
should be useful in proving that the median minimizes E[|X-c|] for this
case.
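
To make Case 1 concrete, here is a sketch in Python with sympy for one
particular density (the choice X ~ Exp(1), f(x) = exp(-x) on
[0, infinity), is arbitrary); it carries out the split-and-differentiate
recipe of Case 1 and recovers the median, log(2).

    import sympy as sp

    # Split E|X - c| at c, integrate each piece for the example density
    # f(x) = exp(-x), differentiate in c, and solve the first-order
    # condition.
    x, c = sp.symbols('x c', positive=True)
    f = sp.exp(-x)
    E = (sp.integrate((c - x) * f, (x, 0, c))
         + sp.integrate((x - c) * f, (x, c, sp.oo)))
    print(sp.solve(sp.Eq(sp.diff(E, c), 0), c))  # [log(2)], the median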

-BJ

T. Scott Thompson

Sep 4, 1992, 12:02:05 PM
The original poster might also be interested in the fact that
c = median also solves the problem

min E( |x-c| - |x| )
c

The advantage of this characterization over the original one is that
since |x-c| - |x| is a bounded function of x, the expectation
E( |x-c| - |x| ) always exists, while E( |x-c| ) might be infinite.
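
For instance (a Monte Carlo sketch; the standard Cauchy and the sizes
are arbitrary choices), if x is standard Cauchy then E( |x-c| ) is
infinite for every c, yet |x-c| - |x| is bounded by |c| and the
bounded objective behaves perfectly well, with its minimum at the
median, 0:

    import numpy as np

    # E|X - c| diverges for Cauchy X, but |X - c| - |X| is bounded,
    # so its sample average converges; the minimizer is near 0.
    rng = np.random.default_rng(1)
    x = rng.standard_cauchy(size=500_000)  # arbitrary sample size

    grid = np.linspace(-2.0, 2.0, 401)
    h = [np.mean(np.abs(x - c) - np.abs(x)) for c in grid]
    print(grid[np.argmin(h)])  # close to 0.0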

Charles Geyer

Sep 5, 1992, 6:45:29 PM
In article <3SEP1992...@utkvx2.utk.edu> men...@utkvx2.utk.edu
(Menees, Bill) writes:

Because this might be a homework problem, I won't post a proof right now,
but there is a one-line proof that requires no calculus and no probability
beyond the fact that expectation is a positive linear operator, i.e.

E(aX + Y) = a E(X) + E(Y)

and

0 <= X implies 0 <= E(X)

It works for a general probability distribution using the definition that
c is a median if

1/2 <= P(c <= X) and 1/2 <= P(X <= c)

I'll post the proof in a week or so.

--
Charles Geyer
School of Statistics
University of Minnesota
cha...@umnstat.stat.umn.edu

m...@waikato.ac.nz

Sep 6, 1992, 5:32:35 PM
In article <3SEP1992...@utkvx2.utk.edu>, men...@utkvx2.utk.edu (Menees, Bill) writes:
This reminds me of an interesting approach to several
common location statistics. I'll state it for samples
to avoid technicalities.

Suppose we have a sample x_1, x_2, ..., x_n.
Given a positive real number p define m to minimise

Sum |x_i - m|**p

then we get

p --> 0 mode

p = 1 median

p = 2 mean

p --> infinity midrange

Neat, eh?
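
A grid-search sketch in Python (the sample is an arbitrary choice)
makes the pattern visible for a small dataset:

    import numpy as np

    # Minimize sum |x_i - m|**p over a grid of m for several p.
    x = np.array([0.0, 1.0, 2.0, 3.0, 10.0])  # arbitrary example sample
    grid = np.linspace(-1.0, 11.0, 2401)

    for p in (1.0, 2.0, 50.0):
        sums = [(np.abs(x - m) ** p).sum() for m in grid]
        print(p, grid[np.argmin(sums)])
    # p=1  -> 2.0  (the median)
    # p=2  -> 3.2  (the mean)
    # p=50 -> 5.0  (already essentially the midrange, (0+10)/2)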
--
Murray A. Jorgensen [ m...@waikato.ac.nz ] University of Waikato
Department of Mathematics and Statistics Hamilton, New Zealand
__________________________________________________________________
'Tis the song of the Jubjub! the proof is complete,
if only I've stated it thrice.'

Charles Geyer

Sep 12, 1992, 8:30:11 PM
In article <3SEP1992...@utkvx2.utk.edu> men...@utkvx2.utk.edu
(Menees, Bill) writes:

> I'm a senior math major taking my first p&s course, and this problem
> has come up and it intrigues me. My prof. has a proof for it, but he said it
> was way over my head. Does anyone know of a proof suitable for a senior
> undergrad? Thanks in advance.

In article <1992Sep5.2...@news2.cis.umn.edu> I replied:

> Because this might be a homework problem, I won't post a proof right now,
> but there is a one-line proof that requires no calculus and no probability
> beyond the fact that expectation is a positive linear operator, ... [and]
>
> It works for a general probability distribution using the definition that
> c is a median if
>
> 1/2 <= P(c <= X) and 1/2 <= P(X <= c)
>
> I'll post the proof in a week or so.

The promised proof, if anyone is still interested.

Suppose c is a median and c < d (the case c > d follows by symmetry).

E{|X - d| - |X - c|} >= (d - c)[P(X <= c) - P(X > c)] >= 0,

the first inequality following because |x - d| - |x - c| is equal to d - c
for x <= c and at least -(d - c) elsewhere, and the second inequality
following by the definition of "median". Hence E|X - d| >= E|X - c| for
every d, which is the claim.
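
A simulation sketch (the lognormal sample is an arbitrary choice) shows
both inequalities holding in a skewed sample, with c the sample median:

    import numpy as np

    # Check E{|X-d| - |X-c|} >= (d-c)[P(X <= c) - P(X > c)] >= 0
    # empirically, with c the sample median and several d > c.
    rng = np.random.default_rng(2)
    x = rng.lognormal(size=100_000)  # arbitrary skewed sample
    c = np.median(x)

    for d in (c + 0.1, c + 1.0, c + 5.0):
        lhs = np.mean(np.abs(x - d) - np.abs(x - c))
        bound = (d - c) * (np.mean(x <= c) - np.mean(x > c))
        print(lhs >= bound, bound >= -1e-12)  # True True each time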

There is a similar proof that the mean minimizes expected squared deviation
-- that shouldn't use any calculus either.
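
For what it's worth, the mean version is the usual one-line expansion:
write m = E(X), then

    E(X - d)^2 = E(X - m)^2 + 2(m - d)E(X - m) + (m - d)^2
               = E(X - m)^2 + (m - d)^2,

since E(X - m) = 0, and the right-hand side is clearly smallest at
d = m.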
