Use of the notation "d" in measure theory

Toby Bartels

Aug 23, 1998, 3:00:00 AM

It seems to me that the use of "d" in measure theory is illogical.
It's used where no symbol is necessary, and, because of this,
it contradicts both Leibniz's use and the use in differential geometry.
Conversely, it could be used in the measure theory of the real line
in such a way as to agree with both Leibniz and geometers.
But maybe it is I who am mixed up on this.
So let me tell you what I think, and you tell me
if I'm crazy or if the measure theorists are.

Leibniz originally used "d" in two places: integrals and derivatives.
In integrals, it's used in front of a variable x;
in derivatives, it's used in front of both x and a real function f.
In differential geometry, "d" can be used in front of any differential form.
This is a generalization of Leibniz's use,
since a real function is a 0-form on the real line
and the variable x can be interpreted as the identity function.
Then df/dx is an actual ratio,
because it asks for the unique function f' such that f' dx = df.
\int_A f dx also makes sense, because f dx is a 1-form,
which is just what you want to integrate over submanifolds of the real line.
(In some ways, it is also a restriction,
because differential geometry only considers continuous functions.)

In measure theory, "d" is also used in two places:
integrals and Radon Nikodym derivatives.
It is always used in front of measures.
If m is a measure, f is a measurable function,
and A is a locally measurable set,
then \int_A f dm is the integral of f with respect to m over A.
If m and n are both measures,
then dm/dn is their Radon Nikodym derivative, if it exists.
You can make the use in integrals a generalization of Leibniz's
by identifying the variable x with Lebesgue measure.
The Radon Nikodym derivative, however,
is not a generalization of Leibniz's derivative,
at least not with the standard notation.

But note this: these two uses of "d" are completely superfluous!
Suppose you just wrote "\int_A f m" and "m/n".
These expressions don't mean anything in the standard notation.
The only thing the "d" does is help you remember what you're talking about;
but simply remembering that m and n are measures is enough for that.
There actually might be some confusion with "\int_A f m"
or, more generally, with "\int_A g f m",
because f m is itself a measure and you can't tell whether to
integrate g f with respect to m or integrate g with respect to f m.
But, of course, it's the same thing either way.
So, instead of having the theorem that \int_A g dfm = \int_A gf dm,
you have a theorem that the notation "\int_A gfm" is unambiguous.
(In fact, the notation lends itself to a different tack
on presenting the logical development of integration;
see the appendix.)

I've removed meaningless "d"s; but what real good does that do?
Well, here's the kicker: in measure theory on the real line,
you *can* give "d" a meaning, in such a way that
both the integral *and* the Radon Nikodym derivative
agree with Leibniz's integral and derivative.
First of all, identify x with the identity function on the real line,
just as the differential geometers do.
Then, given a left continuous function f
whose positive or negative variation is bounded,
define df to be the Lebesgue Stieltjes measure of f,
the signed Borel measure such that (df)[a,b[ = f(b) - f(a).
Then Lebesgue measure is the completion of dx.

f is differentiable at a point p exactly when
the Radon Nikodym derivative df/dx is defined at p,
and the derivative f' is in fact df/dx.
(Remember that I am writing "m/n" instead of "dm/dn";
if you were to use "d" in both its standard measure theory sense
*and* the sense which I'm introducing in this paragraph,
you'd have to write "d(df)/d(dx)", which obviously you don't want to do.)
Also, any differentiable function is of course left continuous.
Boundedness may be a problem, but you can always define df
on some suitable subset of the real line.
This also covers the case where f isn't everywhere defined.
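
To make the "ratio" reading concrete, here is a minimal numerical sketch.
It only compares the masses that df and dx give to a small interval [p, p+h[;
that is a heuristic for df/dx, not the definition of the Radon Nikodym ratio.
The function names and the sample f(x) = x^3 are assumptions for illustration.

```python
def stieltjes_measure(f, a, b):
    # (df)[a,b[ = f(b) - f(a), as in the definition above
    return f(b) - f(a)

def ratio_on_interval(f, g, p, h):
    # Ratio of the masses that df and dg assign to the small interval [p, p+h[
    return stieltjes_measure(f, p, p + h) / stieltjes_measure(g, p, p + h)

def cube(x):
    return x ** 3

def identity(x):
    # The "variable x", read as the identity function
    return x

# As h shrinks, the ratio df/dx on [2, 2+h[ approaches f'(2) = 3 * 2**2
for h in (1e-2, 1e-4, 1e-6):
    print(h, ratio_on_interval(cube, identity, 2.0, h))
```

Replacing identity by another increasing function g gives the same kind of
heuristic for df/dg.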

You can also call df the "differential" of f,
since that's what Leibniz and geometers call it.
The definition of df involves a difference;
but the Radon Nikodym derivative itself doesn't.
In fact, the only differentiation in the calculation of df/dx
comes in calculating df and dx, not in applying Radon & Nikodym.
Even in general measure theory, m/n (previously written "dm/dn")
doesn't depend on how m and n *change* with respect to each other
but on the relative *sizes* of the measures they assign to sets.
Thus, the so called Radon Nikodym "derivative"
is actually the Radon Nikodym *ratio*, as my notation "m/n" implies.
(After all, m/n is a function f such that m = f n.)

If f and g are both functions such that df and dg are defined,
you can also form the Radon Nikodym ratio df/dg, assuming df << dg.
This can happen even when f and g are not themselves differentiable!
The reason is that df may be defined even when df/dx is not.
For example, let f- and g- be bounded left continuous functions
on the nonpositive real line, and let f+ and g+ be
bounded left continuous functions on the positive real line.
Then df/dg is just what you'd expect: (f-)'/(g-)'
on the negative reals and (f+)'/(g+)' on the positive reals;
(df/dg)(0) is the ratio of the jumps at the discontinuity.
I don't think this really gets you anything useful and new,
but it shows the extent to which the notation can go.
Also, you can do \int_A g df; for example,
if f is the left continuous Heaviside step function,
\int_A g df = g(0) if 0 is in A or 0 if 0 is outside A,
just as you would expect.
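
A rough numerical check of the Heaviside example, assuming A is a half-open
interval [lo, hi[ and g is continuous. The sum below is a Riemann Stieltjes
style approximation built from (df)[a,b[ = f(b) - f(a); the helper names
are made up for this sketch.

```python
def heaviside_left(x):
    # Left continuous Heaviside step: 0 for x <= 0, 1 for x > 0
    return 1.0 if x > 0 else 0.0

def stieltjes_sum(g, f, lo, hi, n=1000):
    # Approximate \int_[lo,hi[ g df by splitting [lo, hi[ into n cells [a, b[
    # and adding g(a) * (df)[a,b[ over the cells
    total = 0.0
    for i in range(n):
        a = lo + (hi - lo) * i / n
        b = lo + (hi - lo) * (i + 1) / n
        total += g(a) * (f(b) - f(a))
    return total

def g(x):
    return x * x + 3

# 0 lies in [-1, 1[, so the sum should be close to g(0) = 3
print(stieltjes_sum(g, heaviside_left, -1.0, 1.0))
# 0 lies outside [1, 2[, so every term vanishes
print(stieltjes_sum(g, heaviside_left, 1.0, 2.0))
```

Only the one cell that contains 0 contributes, which is why the whole
integral collapses to a single value of g.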


Appendix:

Here is a logical development of integration theory using my notation:
Define measurable spaces, measurable functions, and *signed* measures.
Then define the measure fm by first defining it
when f is a characteristic function, then when f is a step function, etc,
so that you develop fm(A) the same way you normally develop
\int_A f m (previously called "\int_A f dm").
There is a theorem here; that (gf)m = g(fm).
Now introduce, purely as a matter of *notation*, "\int_A m"
to mean m(A) for any measure m, including one of the form fn.
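
For measures with finitely many atoms, the development above can be sketched
in a few lines. This is only an illustration under that assumption:
a measure is a dict from points to point masses, and the names times and
measure_of are invented here.

```python
from fractions import Fraction

def measure_of(m, A):
    # "\int_A m" as pure notation for m(A): add the point masses inside A
    return sum((w for p, w in m.items() if p in A), Fraction(0))

def times(f, m):
    # The measure fm, defined pointwise by (fm)({p}) = f(p) * m({p})
    return {p: f(p) * w for p, w in m.items()}

# A hypothetical measure m on the points 0, 1, 2
m = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}

def f(p):
    return p + 1

def g(p):
    return p * p

# The theorem (gf)m = g(fm), checked on this example
print(times(lambda p: g(p) * f(p), m) == times(g, times(f, m)))
# "\int_A f m" is then just (fm)(A)
print(measure_of(times(f, m), {0, 2}))
```

Exact fractions are used so that the two sides of (gf)m = g(fm)
can be compared with ==.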


-- Toby
to...@ugcs.caltech.edu


G. A. Edgar

Aug 24, 1998, 3:00:00 AM

> It is always used in front of measures.

Not always. There are those who put it inside, like this:
\int_A f(x) \mu(dx)
[Here, we think of dx as a "small" set. We evaluate f at a point in
the set, multiply by the measure of the set, and add up lots
of these forming a partition of A. The integral is the limit of such things,
under refinement of the partition. Such a limit really produces the
Lebesgue integral.]

....

> Appendix:
>
> Here is a logical development of integration theory using my notation:
> Define measurable spaces, measurable functions, and *signed* measures.
> Then define the measure fm by first defining it
> when f is a characteristic function, then when f is a step function, etc,
> so that you develop fm(A) the same way you normally develop
> \int_A f m (previously called "int_A f dm").
> There is a theorem here; that (gf)m = g(fm).
> Now introduce, purely as a matter of *notation*, "\int_A m"
> to mean m(A) for any measure m, including one of the form fn.

I have seen a development something like this. But in fact it was used to
define \int_A f m, where m is a set function more general than a measure...
semigroup-valued, or something.

...

Here is a question for your notation: In the theory of Markov
chains, we often talk about a "transition probability", which
is a function of the form p(x,A), where for each fixed set A
it is a measurable function of x, and for each fixed point x it is
a measure as a function of A. We want to use integrals like:

\int_A g(y) p(x,dy)

What is your notation for this?
--
Gerald A. Edgar ed...@math.ohio-state.edu


Herman Rubin

Aug 24, 1998, 3:00:00 AM

In article <6rpsei$4...@gap.cco.caltech.edu>,

Toby Bartels <to...@ugcs.caltech.edu> wrote:
>It seems to me that the use of "d" in measure theory is illogical.
>It's used where no symbol is necessary, and, because of this,
>it contradicts both Leibniz's use and the use in differential geometry.

It is not that different in use on the real line, and the Riemann
and Riemann-Stieltjes integrals go with that usage.

Transformations of variables in the Euclidean case go along with
the use of differentials, and partially with the use in differential
geometry.

Also, we are not obliged to use a particular notation because
Leibniz used it. What you are proposing would change notation
from what thousands of mathematicians have used.

However, I will try to point out why your suggestions would be
likely to cause problems in any case, and why a use of d which
is slightly different from the usual one helps even more.

..................

>In measure theory, "d" is also used in two places:
>integrals and Radon Nikodym derivatives.
>It is always used in front of measures.

This is not the case. Many use m(dx) instead of dm(x), which
is the rather established notation.

>If m is a measure, f is a measurable function,
>and A is a locally measurable set,
>then \int_A f dm is the integral of f with respect to m over A.

One problem is that there is often not even a fair notation for
a function. We can write \int x^3 dm(x), and we know what this
means. However, we do not have agreement on what \int x^3 dm(1/x)
means, but we do on what \int x^3 m(d 1/x) means. This latter
notation is often used, and does not lead to confusion, and it
allows for convenient transformation of variables, even for
measures on non-Euclidean spaces.

>If m and n are both measures,
>then dm/dn is their Radon Nikodym derivative, if it exists.
>You can make the use in integrals a generalization of Leibniz's
>by identifying the variable x with Lebesgue measure.
>The Radon Nikodym derivative, however,
>is not a generalization of Leibniz's derivative,
>at least not with the standard notation.

I do not see it as that different from the modern use of
differential, and it works fine for Stieltjes integrals,
especially if m(dx) is used instead of dm(x).

>But note this: these two uses of "d" are completely superfluous!
>Suppose you just wrote "\int_A f m" and "m/n".
>These expressions don't mean anything in the standard notation.
>The only thing the "d" does is help you remember what you're talking about;
>but simply remembering that m and n are measures is enough for that.

This is not enough; measures are, to me, on any kind of space.
We usually mean by (f/g)(u) the quantity f(u)/g(u). If we then
have discrete measures, m/n would already be defined for sets,
and this would lead to confusion, and inconsistent notation.

.......................

>Appendix:

>Here is a logical development of integration theory using my notation:
>Define measurable spaces, measurable functions, and *signed* measures.
>Then define the measure fm by first defining it
>when f is a characteristic function, then when f is a step function, etc,
>so that you develop fm(A) the same way you normally develop
>\int_A f m (previously called "int_A f dm").
>There is a theorem here; that (gf)m = g(fm).
>Now introduce, purely as a matter of *notation*, "\int_A m"
>to mean m(A) for any measure m, including one of the form fn.

Again, this notation can lead to confusion as shown above.

To clean up notation, we should replace dm(x) by m(dx), which now
maintains transformation properties for 1-1 transformations.

We do not have any reasonable notation for functions in general.
There is no agreed-upon way of writing f(x^2) or g(sqrt(x)) or
even the function h with h(x) = x, so we need to be able to write
f(x) dm(x) or f(x) m(dx) or f(x)m(dq(x)).
--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558


Toby Bartels

Aug 25, 1998, 3:00:00 AM

G. A. Edgar <ed...@math.ohio-state.edu> wrote at last:

>Here is a question for your notation: In the theory of Markov
>chains, we often talk about a "transition probability", which
>is a function of the form p(x,A), where for each fixed set A
>it is a measurable function of x, and each fixed point x it is
>a measure as a function of A. We want to use integrals like:

> \int_A g(y) p(x,dy)

>What is your notation for this?

I would just say "\int_A g(y) p(x,y)".
This is analogous to "\int_A f(x,y) m(y)",
which I get from the more standard "\int_A f(x,y) dm(y)".
What I notice is that this last sort of standard notation,
the kind with which alone I was familiar,
is inadequate to deal with your integral,
because we don't want "\int_A g(y) dp(x,y)".
Both "\int_A g(y) p(x,dy)" and "\int_A g(y) p(x,y)"
avoid putting "d" in a bad place.

Now, I can charge that your "d" is unnecessary,
as long as we remember that p is a measure in its second variable.
BUT, given your explanation for "m(dx)" and given Herman Rubin's m(dg(x)),
I see first that the "d" is not so bad,
and second that the "d" is positively useful.
So I think you and Herman have a convert to the "m(dx)" notation.
(That said, I still think I was right to suggest
that "d" should not be used in front of measures,
and I'm glad to learn that it isn't always.)


-- Toby
to...@ugcs.caltech.edu


Toby Bartels

Aug 25, 1998, 3:00:00 AM

Herman Rubin <hru...@stat.purdue.edu> wrote in part:

>We are not obliged to use a particular notation because Leibniz used it.

That is certainly true.
What's relevant is that thousands of calculus students use it,
and some of them will eventually learn measure theory.
But it was the differential geometry I really cared about.

>A use of d which is slightly different from the usual one helps even more.

This usage's most general form is "\int_A f(x) m(dg(x))", right?
I've never seen that before (and I did do research for my article),
and it looks very nice to me. Do you have offhand a precise definition?
Better yet, a book I can look at which uses this?

Basically, this notation looks as if it solves all my problems,
because it puts d back in front of the function again, not the measure.
Arguably, the d is still pointless if you only say "\int_A f(x) m(dx)",
but apparently you can say much more than that; and, if m(dx)
>maintains transformation properties for 1-1 transformations<,
then I not only have no problem with it but am delighted by it.

So, I take it all back in the face of m(dg(x)), except for this:

>>Suppose you just wrote "m/n".

>This is not enough; measures are, to me, on any kind of space.
>We usually mean by (f/g)(u) the quantity f(u)/g(u). If we then
>have discrete measures, m/n would already be defined for sets,
>and this would lead to confusion, and inconsistent notation.

It doesn't seem to me to be such a terror that "m/n" has two meanings,
as long as they are consistent whenever conflation is possible.
The set function m/n is defined only on subsets of the space X,
while dm/dn (to use standard notation) is defined on points of X.
The only possible confusion I can see is (m/n)({p}).
But there are only 3 possibilities here:
m !<< n, in which case dm/dn is undefined;
n({p}) = 0, in which case (m/n)({p}) is undefined;
otherwise, in which case (dm/dn)(p) = (m/n)({p}) necessarily.
The fact that we have equality in the only case where comparison is possible
highlights what I said earlier about the Radon Nikodym derivative's
really being more of a ratio than a derivative.
dm/dn has very much to do with the set function m/n !
(OTOH, maybe when you say "discrete measure",
you're talking about something I don't know about
which actually does cause problems.)

Actually, now that I think about it,
even the undefined cases fit together.
If m !<< n and n({p}) != 0, we can restrict to the subset {p};
on this space, m << n and (dm/dn)(p) = (m/n)({p}) necessarily.
If n({p}) = 0 and m << n, then m({p}) = 0,
so (m/n)({p}) is undefined because it's an indeterminate form,
and (dm/dn)(p) isn't precisely defined either,
because dm/dn is defined only modulo n measure 0.
Finally, if m !<< n and n({p}) = 0,
it's even more obvious that neither (m/n)({p}) nor (dm/dn)(p) is defined.
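
The case analysis can be tried out on measures with finitely many atoms.
The sketch below assumes such measures are stored as dicts of point masses
and computes (dm/dn)(p) as m({p})/n({p}) wherever n({p}) != 0;
the function name rn_ratio is hypothetical.

```python
from fractions import Fraction

def rn_ratio(m, n):
    # Refuse if m puts mass on a point that n ignores, i.e. m !<< n
    for p, w in m.items():
        if w != 0 and n.get(p, 0) == 0:
            raise ValueError("m is not absolutely continuous with respect to n")
    # On atoms of n the ratio is m({p}) / n({p});
    # points of n measure 0 are left out, since the ratio is undetermined there
    return {p: Fraction(m.get(p, 0)) / Fraction(w) for p, w in n.items() if w != 0}

m = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
n = {"a": Fraction(1, 4), "b": Fraction(3, 4), "c": Fraction(0)}

f = rn_ratio(m, n)
print(f)
# m = f n on every atom of n
print(all(f[p] * n[p] == m[p] for p in f))
```

The point "c" carries n measure 0, so it gets no value at all,
matching the remark that dm/dn is defined only modulo n measure 0.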

>There is no agreed-upon way of writing f(x^2) or g(sqrt(x)) or
>even the function h with h(x) = x, so we need to be able to write
>f(x) dm(x) or f(x) m(dx) or f(x)m(dq(x)).

On this point, I was always in favour of "\int_A x^2 m(x)" and the like,
but that's superseded by the m(dg(x)) stuff.


-- Toby
to...@ugcs.caltech.edu


Toby Bartels

Aug 25, 1998, 3:00:00 AM

I <to...@ugcs.caltech.edu> wrote in small part:

>So, I take it all back in the face of m(dg(x)), except for this:
>>>Suppose you just wrote "m/n".

But, if you write "m(dx)/n(dx)" and could have, say, m(dg(x))/n(dh(x)),
I take that back too. So the only thing left is that I'd call
the Radon Nikodym derivative "ratio", not "derivative".

>OTOH, maybe when you say "discrete measure",
>you're talking about something I don't know about
>which actually does cause problems.

I found a book with discrete measures in it; they look pretty tame :-).


-- Toby
to...@ugcs.caltech.edu


Daniel Luecking

Aug 25, 1998, 3:00:00 AM

hru...@stat.purdue.edu (Herman Rubin) writes:

[snip]
> ... we do not have agreement on what \int x^3 dm(1/x)
> means, but we do on what \int x^3 m(d 1/x) means.

Well, I for one have not the slightest idea what "\int x^3 m(d 1/x)"
could mean. I have never seen it and I am not a tyro in measure theory.
Could someone enlighten me and justify this usage?

[snip]

--
Dan Luecking Dept. of Mathematical Sciences
luec...@comp.uark.edu University of Arkansas
http://comp.uark.edu/~luecking/ Fayetteville, AR 72101


Herman Rubin

Aug 25, 1998, 3:00:00 AM

In article <6rtfkm$q...@gap.cco.caltech.edu>,

Toby Bartels <to...@ugcs.caltech.edu> wrote:
>G. A. Edgar <ed...@math.ohio-state.edu> wrote at last:

>>Here is a question for your notation: In the theory of Markov
>>chains, we often talk about a "transition probability", which
>>is a function of the form p(x,A), where for each fixed set A
>>it is a measurable function of x, and each fixed point x it is
>>a measure as a function of A. We want to use integrals like:

(*) \int_A g(y) p(x,dy)

>>What is your notation for this?

>I would just say "\int_A g(y) p(x,y).
>This is analogous to "\int_A f(x,y) m(y)",
>which I get from the more standard "\int_A f(x,y) dm(y)".

We have more of a problem here than meets the eye. The
expression (*) above is the expression for the propagation
of probability, but if the argument of the function is
switched, and the position of the "d" is switched, it is
the equation for the (backward) propagation of expectation.

Also, there is a numerical method in evaluating the average
of multivariate integrals, with rather arbitrary measures,
known as the Gibbs sampler. In this, one goes through stages,
each one using a different variable. So it is necessary to
be able to put the "d" wherever it is appropriate at the time.
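
For a chain with finitely many states, the two placements of the "d" can be
written out as two different sums. This is only a sketch under that
assumption; the two-state kernel P and the helper names are invented, and
which sum deserves which name depends on conventions not fixed here.

```python
from fractions import Fraction

# Hypothetical two-state chain; P[x][y] stands for p(x, {y})
P = {
    "a": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    "b": {"a": Fraction(1, 4), "b": Fraction(3, 4)},
}

def integrate_function(P, g, x, A):
    # \int_A g(y) p(x,dy): x is fixed, g is integrated against the measure p(x, .)
    return sum((g[y] * P[x][y] for y in A), Fraction(0))

def integrate_measure(P, mu, A):
    # \int mu(dx) p(x,A): A is fixed, the function p(., A) is integrated against mu
    return sum((mu[x] * sum((P[x][y] for y in A), Fraction(0)) for x in P), Fraction(0))

states = {"a", "b"}
mu = {"a": Fraction(1, 3), "b": Fraction(2, 3)}
g = {"a": 1, "b": 5}

# Integrating g after one step against mu, or g against mu moved one step,
# gives the same number
lhs = sum(mu[x] * integrate_function(P, g, x, states) for x in states)
rhs = sum(integrate_measure(P, mu, {y}) * g[y] for y in states)
print(lhs == rhs)
```

The first sum acts on functions and the second on measures,
so the slot that carries the "d" tells you which one is meant.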

Toby Bartels

Aug 25, 1998, 3:00:00 AM

Herman Rubin <hru...@stat.purdue.edu> wrote:

>Toby Bartels <to...@ugcs.caltech.edu> wrote:

>>G. A. Edgar <ed...@math.ohio-state.edu> wrote at last:

>>>In the theory of Markov
>>>chains, we often talk about a "transition probability", which
>>>is a function of the form p(x,A), where for each fixed set A
>>>it is a measurable function of x, and each fixed point x it is
>>>a measure as a function of A.

[Discussion of notation of its integrals]

>We have more of a problem here than meets the eye. The
>expression (*) above is the expression for the propagation
>of probability, but if the argument of the function is
>switched, and the position of the "d" is switched, it is
>the equation for the (backward) propagation of expectation.

You mean you write "p(A,x)" for backward transition?


-- Toby
to...@ugcs.caltech.edu

G. A. Edgar

Aug 26, 1998, 3:00:00 AM

>
> Well, I for one have not the slightest idea what "\int x^3 m(d 1/x)"
> could mean. I have never seen it and I am not a tyro in measure theory.
> Could someone enlighten me and justify this usage?
>

It is a bit unusual. It has to do with change of variables.
If we write y = 1/x, and this correspondence maps a set A bijectively
onto a set B, then
\int_A x^3 m(d 1/x) = \int_B (1/y)^3 m(dy) .
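
For a measure m with finitely many atoms, one way to read this is that
m(d 1/x) weights each point x by the mass m gives to 1/x. The sketch below
assumes that reading and checks the change of variables on a made-up example;
the function names and the sample measure are hypothetical.

```python
from fractions import Fraction

def integral_in_x(A, m):
    # \int_A x^3 m(d 1/x): each x in A is weighted by the mass m gives to 1/x
    return sum((x ** 3 * m.get(1 / x, 0) for x in A), Fraction(0))

def integral_in_y(B, m):
    # \int_B (1/y)^3 m(dy): the same integral after substituting y = 1/x
    return sum(((1 / y) ** 3 * m.get(y, 0) for y in B), Fraction(0))

# Hypothetical measure with atoms at 1/2 and 1/4
m = {Fraction(1, 2): Fraction(1, 3), Fraction(1, 4): Fraction(2, 3)}

A = {Fraction(2), Fraction(4)}
B = {1 / x for x in A}  # image of A under x -> 1/x

print(integral_in_x(A, m) == integral_in_y(B, m))
```

In this discrete setting the equality is just a relabelling of the terms of
a sum, which is the content of the change of variables.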
