Frequentist probability confusion

Bartosz Milewski

Mar 22, 2004, 5:36:22 AM

I was trying to figure out if the frequentist interpretation could be used
as the foundation of the probabilistic interpretation of QM. As I understand
(correct me if I'm wrong) the basis of the frequentist interpretation can be
summarized in the following statement:
We have a repeatable experiment with M possible outcomes, each with
probability P_m. Let's repeat the experiment N times. The number of times we
get the m'th result is X_m. The statement is that,
when N goes to infinity, X_m/N -> P_m.
You can take this as a definition of probability P_m.
But how is this limit defined? Does it mean that for every epsilon > 0 there is
an N(epsilon) such that for any n > N(epsilon) we have |X_m / n - P_m| <
epsilon?
Can this N(epsilon) be calculated? Does it depend on the details of the
experiment?
This doesn't make sense to me, so I'm wondering if my intuition is
completely wrong.

Arnold Neumaier

Mar 22, 2004, 6:05:15 AM

In experimental terms, this limit is undefinable since during the
sum of lifetimes of all physicists ever, only a finite number of
experiments have been performed. And I fear this will be the case
in the near future, too. Thus a limit makes sense only on the
theoretical level. But there is no problem with probabilities anyway.

You can find the justification of a relative frequency interpretation
in any textbook of probability under the heading of the weak law
of large numbers. The limit is 'in probability', which means that
the probability of violating |X_m / n - P_m| < epsilon goes to zero
as n gets large. How large n must be at a given confidence level
can be calculated, if one is careful in the argument leading to the
proof. Unfortunately there is nothing that excludes the unlikely
remaining probability...
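
To make "how large n must be at a given confidence level" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not part of the textbook argument itself: the function names are hypothetical, `eps` is the tolerance on |X_m / n - P_m|, and `delta` is the allowed probability of violating it. The first bound comes from Chebyshev's inequality (the usual route to the weak law); the second uses Hoeffding's inequality, a sharper bound that is swapped in here for comparison.

```python
import math


def n_chebyshev(eps, delta):
    """Smallest n with 1 / (4 n eps^2) <= delta.

    Chebyshev gives P(|X/n - p| >= eps) <= p(1-p) / (n eps^2),
    and p(1-p) <= 1/4 for every p, so this n works for any P_m.
    """
    if eps <= 0 or not 0 < delta < 1:
        raise ValueError("need eps > 0 and 0 < delta < 1")
    return math.ceil(1.0 / (4.0 * eps ** 2 * delta))


def n_hoeffding(eps, delta):
    """Smallest n with 2 exp(-2 n eps^2) <= delta (Hoeffding's inequality)."""
    if eps <= 0 or not 0 < delta < 1:
        raise ValueError("need eps > 0 and 0 < delta < 1")
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


if __name__ == "__main__":
    # Tolerance 0.01, violation probability at most 5%.
    print(n_chebyshev(0.01, 0.05), n_hoeffding(0.01, 0.05))
```

Neither bound depends on the details of the experiment beyond independence of the repetitions, and both leave a residual probability `delta` that is small but never zero, which is exactly the remaining loophole.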

Arnold Neumaier

r...@maths.tcd.ie

Mar 23, 2004, 12:42:14 PM

Arnold Neumaier <Arnold....@univie.ac.at> writes:

>Bartosz Milewski wrote:
>> I was trying to figure out if the frequentist interpretation could be used
>> as the foundation of the probabilistic interpretation of QM. ...

> ...

>You can find the justification of a relative frequency interpretation
>in any textbook of probability under the heading of the weak law
>of large numbers. The limit is 'in probability', which means that
>the probability of violating |X_m / n - P_m| < epsilon goes to zero
>as n gets large. How large n must be at a given confidence level
>can be calculated, if one is careful in the argument leading to the
>proof. Unfortunately there is nothing that excludes the unlikely
>remaining probability...

Right, so actually, the frequentist interpretation of probability
suffers from the same disease that the many-worlds interpretation
does, or at least the non-Bayesian one. In many worlds, the problem
is that there's no way to justify dismissing worlds with a small
quantum amplitude as being rare, and in the frequentist
version of probability theory, there's no way to justify dismissing
outcomes with small probability as being rare.

The frequentist interpretation of probability suffers from worse
diseases as well. For example, you'll find in many probability
books and hear from the mouths of top probability theorists the
claim that no process can produce random, uniformly distributed
positive integers, but that processes can produce random uniformly
distributed real numbers between zero and one (e.g. toss a fair
coin exactly aleph_0 times to get the binary expansion).

In fact, a process which produces uniformly distributed random real
numbers between zero and one can be modified so that it produces
uniformly distributed random positive integers in the following
way: Consider [0,1) as an additive group of reals modulo 1. Then
it has a subgroup, S, consisting of rational numbers in [0,1). Form
a set X by choosing one element from each coset of S in [0,1). Then
define X_r = {a+r mod 1 | a \in X}, for each r in S. The X_r are
pairwise disjoint, pairwise congruent sets, with congruent meaning
they are related to each other by isometries of the group [0,1).
In that sense, they are as equiprobable as can be. Now if q is a
random number between 0 and 1, then it falls into exactly one X_r,
so there is a unique rational number, r, associated with that real
number, and since the rationals are countable, there is also a
unique positive integer associated with that real number. Since the
X_r's are congruent, no one can be any more or less likely than any
other, so no positive integer is any more or less likely than any
other to result from this process. Voila, we have a way to get a
"random" positive integer from a "random" real in [0,1).

The problem is that if you define probabilities in terms of
outcomes of repeated processes or experiments, then you might get
led astray when you find that certain probability distributions
don't exist (e.g. a uniform distribution over the positive
integers). You might start imagining that your probability theory
is telling you something about what kind of random-number-generation
processes are or aren't possible. As the example above shows, this
is incorrect.

R.

Ps. Yes, I know I used the axiom of choice and rely on axioms of
infinity, but that's not a problem. Nobody would actually
drop these axioms in order to save the frequentist interpretation,
or at least, nobody worth mentioning.

Bartosz Milewski

Mar 24, 2004, 9:45:31 PM

"Arnold Neumaier" <Arnold....@univie.ac.at> wrote in message
news:405EC787...@univie.ac.at...

> You can find the justification of a relative frequency interpretation
> in any textbook of probability under the heading of the weak law
> of large numbers. The limit is 'in probability', which means that
> the probability of violating |X_m / n - P_m| < epsilon goes to zero
> as n gets large. How large n must be at a given confidence level
> can be calculated, if one is careful in the argument leading to the
> proof. Unfortunately there is nothing that excludes the unlikely
> remaining probability...

Doesn't this justification suffer from the same problem? What's the precise
meaning of "probability goes to zero as n gets large"? I don't know if
statements like this have a meaning at all. In mathematics "goes to" or "has
a limit" are very well defined notions (i.e., for any epsilon there exist...
etc.).

I can think of an unorthodox way of dealing with limits in probability.
Introduce the "random number generator," just like in computer simulations
(in practice one uses a pseudo-random generator). The N(epsilon) such that
for each n > N(epsilon) |X_m / n - P_m| < epsilon could then be calculated
based on the properties of the "random" number stream that generates values
X_m.
Here's some handwaving: Imagine that a stream of "random" numbers has the
following property: If you generate heads and tails using this stream, then
there is a certain N_1 such that, for any N > N_1, you are guaranteed that the
first N tosses could not have been all heads or all tails (there is at least
one odd result). I don't know how much this would break randomness properties,
but it would make the definition of limits meaningful.
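
One way to picture such a stream is the following Python sketch. It is only an assumption about what a "cutoff property" could look like: the names `cutoff_coin_stream` and `first_tosses` are hypothetical, `n1` plays the role of N_1, and an ordinary pseudo-random generator stands in for the "random" source. The generator forces a flip whenever the first `n1` tosses would otherwise all agree.

```python
import random
from itertools import islice


def cutoff_coin_stream(n1, seed=None):
    """Yield 0/1 tosses whose first n1 values are never all equal.

    n1 is the hypothetical cutoff (must be at least 2).
    """
    if n1 < 2:
        raise ValueError("n1 must be at least 2")
    rng = random.Random(seed)
    count = 0        # number of tosses emitted so far
    first = None     # value of the very first toss
    all_same = True  # True while every toss so far equals `first`
    while True:
        toss = rng.randint(0, 1)
        if count == 0:
            first = toss
        elif count < n1:
            if count == n1 - 1 and all_same and toss == first:
                toss = 1 - toss  # forced flip: the tosses are no longer independent
            if toss != first:
                all_same = False
        count += 1
        yield toss


def first_tosses(n1, k, seed=None):
    """Return the first k tosses of a stream with cutoff n1."""
    return list(islice(cutoff_coin_stream(n1, seed), k))


if __name__ == "__main__":
    print(first_tosses(5, 10, seed=1))
```

The price is visible in the forced flip: the toss at position `n1` depends on all the earlier ones, so the stream is no longer a sequence of independent fair tosses, even though each run of `n1` tosses now contains both outcomes by construction.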

By the way, has anyone tested experimentally the randomness of quantum
experiments? Is a quantum random number generator perfectly random?

J. J. Lodder

Mar 25, 2004, 5:04:00 AM

Bartosz Milewski <bar...@nospam.relisoft.com> wrote:

> By the way, has anyone tested experimentally the randomness of quantum
> experiments? Is a quantum random number generator perfectly random?

Of course not, perfection doesn't exist.
There are always instrumental effects that spoil it.

A well known, and already very old example
is the dead time of Geiger counters.
This spoils the ideal Poisson distribution of the famous clicks.

For a practical implementation of a hardware random number generator
one often uses Zener diodes biased in the reverse direction.
Since the current is caused by tunnelling of electrons through a barrier,
this may well be considered to be a quantum-mechanical noise source.

Suitable electronics can transform the variable current
into (very nearly) random bits, hence numbers.
Some small bias remains though.

Jan

Patrick Van Esch

Mar 25, 2004, 8:25:50 PM

"Bartosz Milewski" <bar...@nospam.relisoft.com> wrote in message news:<c3qnae$scv$1...@brokaw.wa.com>...
>

> Here's some handwaving: Imagine that a stream of "random" numbers has the
> following property: If you generate heads and tails using this stream, then
> there is a certain N_1 above which you are guaranteed that the first N
> tosses could not have been all heads or all tails (there is at least 1 odd
> result).

But this is exactly what the original question was all about:
the frequentist application of probability theory doesn't make sense
if there is no "lower cutoff" below which we consider that the event
will never happen and which is NOT 0.
Indeed, naively said, the law of LARGE numbers cannot be replaced by
the law of INFINITE numbers because then probability theory makes NO
statement at all about finite statistics.
If you toss a coin N times, the probability to have all heads is
1/2^N. So if you toss a coin N times, to estimate the probability of
having "heads" on one go (which you expect to be 1/2), there is a
probability of 1/2^N that you will actually find 1 (and also a
probability of 1/2^N that you will find 0).
If N = 10000, then that's a small probability indeed, but according to
orthodox probability theory, it CAN occur.
Now let us say that we repeat this tossing of N coins M times. You'd
then expect that an estimate of 1 for the probability of heads in a single
toss will occur, on average, only M/2^N times. But there
is, again, a very small probability that we will find for ALL of these
M experiments (each with N coins) an average of 1 (namely, 1/2^(N*M)
is that probability).
So the statement that "events with an extremely small probability
associated to it will probably not occur" is an empty statement
because tautological. It is only when we say that "events with an
extremely small probability will NOT occur" that suddenly, all of the
frequentist interpretation of probability theory makes sense. That
is a strong statement; however, in practical and
experimental life we always make it. Most, if not all, experimental
claims are associated with a small probability (10 sigma for example)
that it is a statistical fluctuation, but beyond a certain threshold,
people take it as a hard fact.
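
For a sense of scale, the numbers in this argument can be worked out with a short Python sketch. The function names are hypothetical and the coin is assumed fair with independent tosses. Since 2^(-10000) is far below the smallest positive double-precision number (about 5e-324), the sketch works with logarithms and exact fractions instead of floats.

```python
import math
from fractions import Fraction


def log10_prob_all_heads(n_tosses):
    """Base-10 logarithm of 1/2^N, the chance that N fair tosses are all heads."""
    return -n_tosses * math.log10(2.0)


def prob_all_heads_exact(n_tosses):
    """The same probability as an exact fraction (no floating-point underflow)."""
    return Fraction(1, 2 ** n_tosses)


def expected_all_heads_runs(n_tosses, m_repeats):
    """Expected number of all-heads runs among M repetitions: M / 2^N."""
    return m_repeats * prob_all_heads_exact(n_tosses)


if __name__ == "__main__":
    # N = 10000 tosses, as in the text above.
    print(log10_prob_all_heads(10000))
    # A small case that can be checked by hand: N = 3, M = 16.
    print(expected_all_heads_runs(3, 16))
```

For N = 10000 the logarithm is roughly -3010, so the all-heads outcome has probability around 10^(-3010): astronomically small, yet still not zero, which is the whole difficulty.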

cheers,
Patrick.

J. J. Lodder

Mar 25, 2004, 8:26:24 PM

Bartosz Milewski <bar...@nospam.relisoft.com> wrote:

Your conceptual problem has nothing to do with quantum mechanics.
It arises in precisely the same form when you want to verify by
experiment that a coin being thrown repeatedly is fair
(That is, has exactly 50% probability of coming up heads or tails)

For the resolution see any textbook on probability:
you can never verify such a thing, you can only give confidence limits.
It has nothing to do with frequentism versus Bayesianism either:
a Bayesian can do no better.
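
As an illustration of what "confidence limits" look like for a coin, here is a minimal Python sketch. The choice of the Wilson score interval is an assumption made for this example (textbooks offer several constructions), and the function name `wilson_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(heads, tosses, confidence=0.95):
    """Wilson score interval for the heads probability of a coin."""
    if tosses <= 0 or not 0 <= heads <= tosses:
        raise ValueError("need tosses > 0 and 0 <= heads <= tosses")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Two-sided normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p_hat = heads / tosses
    denom = 1.0 + z * z / tosses
    centre = (p_hat + z * z / (2.0 * tosses)) / denom
    half = z * math.sqrt(
        p_hat * (1.0 - p_hat) / tosses + z * z / (4.0 * tosses ** 2)
    ) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # 5000 heads in 10000 tosses: compatible with a fair coin, not proof of one.
    print(wilson_interval(5000, 10000))
```

The interval narrows as the number of tosses grows, but it never collapses to the single value 1/2, so the data can support fairness to a stated confidence without ever verifying it.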

Best,

Jan

Matt Leifer

Mar 26, 2004, 2:37:42 AM

> You can find the justification of a relative frequency interpretation
> in any textbook of probability under the heading of the weak law
> of large numbers. The limit is 'in probability', which means that
> the probability of violating |X_m / n - P_m| < epsilon goes to zero
> as n gets large. How large n must be at a given confidence level
> can be calculated, if one is careful in the argument leading to the
> proof. Unfortunately there is nothing that excludes the unlikely
> remaining probability...

Notice that the argument is circular, i.e. one uses a concept of
probability in order to define the concept of probability. This
doesn't cause problems for most applications of probability theory,
but it is the main reason to be a Bayesian from the conceptual point
of view.

Matthew Donald

Mar 26, 2004, 4:49:59 AM

Bartosz Milewski wrote
> has anyone tested experimentally the randomness of quantum
> experiments?

Try the e-print physics/0304013. Here's the abstract:

*******

physics/0304013

From: Dana J. Berkeland
Date: Fri, 4 Apr 2003 18:16:19 GMT (143kb)

Tests for non-randomness in quantum jumps

Authors: D.J. Berkeland, D.A. Raymondson, V.M. Tassin
Comments: 4 pages, 5 figures
Subj-class: Atomic Physics

In a fundamental test of quantum mechanics, we have collected
over 250,000 quantum jumps from single trapped and cooled
88Sr+ ions, and have tested their statistics using a
comprehensive set of measures designed to detect non-random
behavior. Furthermore, we analyze 238,000 quantum jumps from
two simultaneously confined ions and find that the number of
apparently coincidental transitions is as expected. Similarly, we
observe 8400 spontaneous decays of two simultaneously trapped
ions and find that the number of apparently coincidental decays
agrees with expected value. We find no evidence for short- or
long-term correlations in the intervals of the quantum jumps or
in the decay of the quantum states, in agreement with quantum
theory.

*****

Matthew Donald (matthew...@phy.cam.ac.uk)
web site:
http://www.poco.phy.cam.ac.uk/~mjd1014
``a many-minds interpretation of quantum theory''
*****************************************

Arnold Neumaier

Mar 26, 2004, 4:50:02 AM

Bartosz Milewski wrote:
> "Arnold Neumaier" <Arnold....@univie.ac.at> wrote in message
> news:405EC787...@univie.ac.at...
>
>>You can find the justification of a relative frequency interpretation
>>in any textbook of probability under the heading of the weak law
>>of large numbers. The limit is 'in probability', which means that
>>the probability of violating |X_m / n - P_m| < epsilon goes to zero
>>as n gets large. How large n must be at a given confidence level
>>can be calculated, if one is careful in the argument leading to the
>>proof. Unfortunately there is nothing that excludes the unlikely
>>remaining probability...
>
> Doesn't this justification suffer from the same problem? What's the precise
> meaning of "probability goes to zero as n gets large"?

It has a precise mathematical meaning. It has no precise physical meaning.
But nothing at all has a precise physical meaning. Only theory can be
exact - you can say exactly what an electron is in QED; you cannot
say exactly what it is in reality.

The interface between theory and the real world is never exact.
This interface must just be clear enough to guarantee working protocols
for the execution of science in practice.

I think it is not reasonable to require of probability a more stringent
meaning than for an electron. It suffices that there are recipes that
give, in practice, acceptable approximations.

Arnold Neumaier

Mar 26, 2004, 5:49:48 AM

Matt Leifer wrote:
>>You can find the justification of a relative frequency interpretation
>>in any textbook of probability under the heading of the weak law
>>of large numbers. The limit is 'in probability', which means that
>>the probability of violating |X_m / n - P_m| < epsilon goes to zero
>>as n gets large. How large n must be at a given confidence level
>>can be calculated, if one is careful in the argument leading to the
>>proof. Unfortunately there is nothing that excludes the unlikely
>>remaining probability...
>
>
> Notice that the argument is circular, i.e. one uses a concept of
> probability in order to define the concept of probability.

No. It uses the concept of probability to explain why, within this
framework, relative frequencies are valid approximations to
probabilities. Probabilities themselves are defined axiomatically,
and not justified at all.

Any theory needs to start somewhere with some basic, unexplained
terms that get their meaning from the consequences of the theory,
and not from anything outside.

Arnold Neumaier

Mar 26, 2004, 5:49:51 AM

Patrick Van Esch wrote:

> So the statement that "events with an extremely small probability
> associated to it will probably not occur" is an empty statement
> because tautological.

Yes.

> It is only when we say that "events with an
> extremely small probability will NOT occur"

but this is wrong. If you randomly draw a real number x from a uniform
distribution in [0,1], and get as a result s, the probability that
you obtained exactly this number is zero, but it was the one you got.

> that suddenly, all of the
> frequentist interpretation of probability theory makes sense.

This has nothing to do with the frequentist interpretation; the
problem of unlikely things happening is also present in a Bayesian
interpretation.

The problem stems from the attempt to assign 100% precise meaning
to concepts that have an inherent uncertainty. Once one realizes
that any concept applied to reality is limited since we cannot
say too precisely what is meant by it on the operational level
(100% clarity is available only in theoretical models), the
difficulty disappears. That's why it has become standard practice
to distinguish between reality and our models of it.

Arnold Neumaier

Bartosz Milewski

Mar 26, 2004, 5:24:30 PM

"Patrick Van Esch" <van...@ill.fr> wrote in message
news:c23e597b.04032...@posting.google.com...

> So the statement that "events with an extremely small probability
> associated to it will probably not occur" is an empty statement
> because tautological. It is only when we say that "events with an
> extremely small probability will NOT occur" that suddenly, all of the
> frequentist interpretation of probability theory makes sense.

Yes, that's exactly my point. I was trying to make the cutoff somewhat
better defined by making it a property of a "random" number generator. Let me
clarify my point: In mathematics, probability theory describes properties of
"measures." It's a self-consistent theory (modulo Goedel) and that's it. In
physics we are trying to interpret these measures as probabilities, so we
have to provide a framework. The frequentist framework doesn't seem to be
consistent. I propose to extend the frequentist interpretation by
abstracting the random number generator part of it. Probability can then be
defined formally as a limit of frequencies, provided the frequencies fulfill
some additional properties--the cutoff properties. In essence, they must
behave as if they were generated by a computer program using a random number
generator with some well defined cutoff property. This random number
generator is necessary so that all the frequencies exhibit the same cutoff
property (i.e., the frequency of [quantum] heads has the same cutoffs as the
frequency of tails).

Bartosz Milewski

Mar 26, 2004, 5:21:56 PM

"J. J. Lodder" <nos...@de-ster.demon.nl> wrote in message
news:1gb70ve.44...@de-ster.xs4all.nl...

> Bartosz Milewski <bar...@nospam.relisoft.com> wrote:
>
> Your conceptual problem has nothing to do with quantum mechanics.
> It arises in precisely the same form when you want to verify by
> experiment that a coin being thrown repeatedly is fair
> (That is, has exactly 50% probability of coming up heads or tails)

There is a huge difference between quantum probability and classical
probability. Coin tosses are not "really" random. They are chaotic, which
means we can't predict the results because (a) we never know the initial
conditions _exactly_ (the butterfly effect) and (b) because we don't have
computers powerful enough to model a coin toss. So coin tossing is for all
"practical" purposes random, but theoretically it's not! In QM, on the other
hand, randomness is inherent. If you can prepare a system in a pure state, you
know the initial conditions _exactly_. And yet, the results of experiments
are only predicted probabilistically. Moreover, there are no hidden
variables (this approach has been tried), whose knowledge could specify the
initial conditions more accurately and maybe let you predict the exact
outcomes.

I have no problems with coin tosses as long as you don't use a quantum coin.

Bartosz Milewski

Mar 27, 2004, 6:15:11 AM

"Arnold Neumaier" <Arnold....@univie.ac.at> wrote in message

news:40640905...@univie.ac.at...

> but this is wrong. If you randomly draw a real number x from a uniform
> distribution in [0,1], and get as a result s, the probability that
> you obtained exactly this number is zero, but it was the one you got.

This argument is also circular. How do you draw a number randomly? It's not
a facetious question.

Patrick Van Esch

Mar 29, 2004, 2:36:38 AM

Arnold Neumaier <Arnold....@univie.ac.at> wrote in message news:<40640905...@univie.ac.at>...

> Patrick Van Esch wrote:
>
> > So the statement that "events with an extremely small probability
> > associated to it will probably not occur" is an empty statement
> > because tautological.
>
> Yes.
>
> > It is only when we say that "events with an
> > extremely small probability will NOT occur"
>
> but this is wrong. If you randomly draw a real number x from a uniform
> distribution in [0,1], and get as a result s, the probability that
> you obtained exactly this number is zero, but it was the one you got.

I'm not talking about the mathematical theory of probabilities (a la
Kolmogorov) which is a nice mathematical theory. I'm talking about
the application of this mathematical theory to the physical sciences,
which maps events (potential experimental outcomes) to numbers (called
probabilities, but we could call it "gnorck"). Saying that measuring
the number of neutrons scattered under an angle of 12 to 13 degrees
has gnorck = 10^(-6) doesn't mean much as such. So this assignment of
probabilities to experimental outcomes has only a meaning when it
eventually turns into "hard" statements, and that can only happen when
we apply a lower cutoff to probabilities.
Your argument of drawing a real number out of [0,1] doesn't apply
here, because the outcome of an experiment is never a true real number
(most of which cannot even be written down !). There are always a
finite number of possibilities in the outcome of an experiment
(otherwise it couldn't be written onto a hard disk!).

cheers,
Patrick.

Italo Vecchi

Mar 29, 2004, 4:22:03 AM

"Bartosz Milewski" <bar...@nospam.relisoft.com> wrote in message news:<c426a4$bvd$1...@brokaw.wa.com>...

> "J. J. Lodder" <nos...@de-ster.demon.nl> wrote in message
> news:1gb70ve.44...@de-ster.xs4all.nl...
> > Bartosz Milewski <bar...@nospam.relisoft.com> wrote:
> >
> > Your conceptual problem has nothing to do with quantum mechanics.
> > It arises in precisely the same form when you want to verify by
> > experiment that a coin being thrown repeatedly is fair
> > (That is, has exactly 50% probability of coming up heads or tails)
>
> There is a huge difference between quantum probability and classical
> probability. Coin tosses are not "really" random. They are chaotic, which
> means we can't predict the results because (a) we never know the initial
> conditions _exactly_ (the butterfly effect) and (b) because we don't have
> computers powerful enough to model a coin toss.

That "we can't predict the results" IS randomness. There is nothing
more to randomness than that.

Regards,

IV

Arnold Neumaier

Mar 29, 2004, 5:55:17 AM

Bartosz Milewski wrote:

> There is a huge difference between quantum probability and classical
> probability. Coin tosses are not "really" random.

What is "really" random??? The term "random" has no precise meaning
outside theory. But in the theory of stochastic processes, there is
"real" classical randomness.

> In QM, on the other
> hand, randomness is inherent. If you can prepare a system in pure state, you

But no one can. Pure states in quantum mechanics are as much idealizations
as classical random processes.

Arnold Neumaier

Mar 29, 2004, 2:38:36 PM

Patrick Van Esch wrote:
> Arnold Neumaier <Arnold....@univie.ac.at> wrote in message news:<40640905...@univie.ac.at>...
>
>>Patrick Van Esch wrote:

>>>It is only when we say that "events with an
>>>extremely small probability will NOT occur"
>>
>>but this is wrong. If you randomly draw a real number x from a uniform
>>distribution in [0,1], and get as a result s, the probability that
>>you obtained exactly this number is zero, but it was the one you got.
>
> I'm not talking about the mathematical theory of probabilities (a la
> Kolmogorov) which is a nice mathematical theory. I'm talking about
> the application of this mathematical theory to the physical sciences,
> which maps events (potential experimental outcomes) to numbers (called
> probabilities, but we could call it "gnorck").

In that case, all statements are approximate statements only,
and probability has no exact meaning. Saying p=1e-12
in a practical situation where you can never collect more than
a few thousand cases is as meaningless as saying that the
distance earth-moon is 384001.1283564032984930201549807km.

> Saying that measuring
> the number of neutrons scattered under an angle of 12 to 13 degrees
> has gnorck = 10^(-6) doesn't mean much as such. So this assignment of
> probabilities to experimental outcomes has only a meaning when it
> eventually turns into "hard" statements, and that can only happen when
> we apply a lower cutoff to probabilities.

But where exactly would you take it? If there is an eps such that
p<eps means 'it will not occur', what is the supremum of all such eps?
It would have to be a fundamental constant of nature.

The nonexistence of such a constant implies that your proposal cannot
be right.

> Your argument of drawing a real number out of [0,1] doesn't apply
> here, because the outcome of an experiment is never a true real number
> (most of which cannot even be written down !). There are always a
> finite number of possibilities in the outcome of an experiment
> (otherwise it couldn't be written onto a hard disk!).

This does not really help. Let eps be the "extremely small probability"
according to your proposal. Pick N >> - log eps / log 2, and
run a series of N coin tosses. You get the result x_1 ... x_N, say.
Although you really obtained exactly this result, the probability of
obtaining it was only 2^{-N} << eps. Thus your proposal amounts to
proving the impossibility of tossing coins more than a fairly small
number of times... Do you really want to claim that???
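
The arithmetic behind this objection can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the value of `eps` is an arbitrary stand-in for the proposed cutoff, and a pseudo-random generator stands in for the coin.

```python
import math
import random


def min_tosses_below_cutoff(eps):
    """Smallest N such that 2**-N < eps, i.e. N > -log(eps) / log(2)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return math.floor(-math.log2(eps)) + 1


def toss_sequence(n, seed=None):
    """Toss a simulated fair coin n times and return the observed sequence."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(n)]


def sequence_probability(seq):
    """Probability of observing exactly this sequence with a fair coin."""
    return 2.0 ** -len(seq)


if __name__ == "__main__":
    eps = 1e-6  # hypothetical cutoff below which events "will NOT occur"
    n = min_tosses_below_cutoff(eps)
    observed = toss_sequence(n, seed=0)
    print(n, observed, sequence_probability(observed) < eps)
```

Whatever sequence comes out, it had probability below `eps` beforehand and occurred anyway, so a fixed cutoff would forbid every outcome of a sufficiently long run of tosses.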

The truth is that probabilities, just like real numbers,
are concepts of theory, and as such apply only approximately
to reality, with a context-dependent and user-dependent accuracy
with fuzzy boundaries.

Arnold Neumaier

Patrick Powers

Mar 30, 2004, 3:47:24 AM

r...@maths.tcd.ie wrote in message news:<c3njd0$295l$1...@lanczos.maths.tcd.ie>...

> Arnold Neumaier <Arnold....@univie.ac.at> writes:
>
>
> >Bartosz Milewski wrote:
> >> I was trying to figure out if the frequentist interpretation could be used
> >> as the foundation of the probabilistic interpretation of QM. ...
>
> > ...
>
> >You can find the justification of a relative frequency interpretation
> >in any textbook of probability under the heading of the weak law
> >of large numbers. The limit is 'in probability', which means that
> >the probability of violating |X_m / n - P_m| < epsilon goes to zero
> >as n gets large. How large n must be at a given confidence level
> >can be calculated, if one is careful in the argument leading to the
> >proof. Unfortunately there is nothing that excludes the unlikely
> >remaining probability...

The idea is that the probability may be made as small as one likes.
So it can be made so small that the event is for all practical
purposes impossible.

>
> Right, so actually, the frequentist interpretation of probability
> suffers from the same disease that the many-worlds interpretation
> does, or at least the non-Bayesian one. In many worlds, the problem
> is that there's no way to justify dismissing worlds with a small
> quantum amplitude as being rare, and in the frequentist
> version of probability theory, there's no way to justify dismissing
> outcomes with small probability as being rare.
>

Quantum theory is a probabilistic theory and extremely unlikely events
are not excluded, nor should they be. So this is a property of the
theory, not the interpretation. It seems to me that an interpretation
that excluded such events absolutely would be in error.

> The frequentist interpretation of probability suffers from worse
> diseases as well. For example, you'll find in many probability
> books and hear from the mouths of top probability theorists the
> claim that no process can produce random, uniformly distributed
> positive integers, but that processes can produce random uniformly
> distributed real numbers between zero and one (e.g. toss a fair
> coin exactly aleph_0 times to get the binary expansion).

Yes, these claims as stated are contradictory. I suspect that the
definitions you are using are imprecise. The word "process" implies
computability, that the process is finite. A real number is cleverly
defined as a limit of a finite process. So a real number is
computable in this sense, that it can be approximated as closely as
one likes in finite time. The problem with your proof is that as the
real number is computed the choice of cosets changes with each step so
the process does not converge to an integer.

Using the axioms of choice and infinity then one can indeed choose a
natural number at random. There are some rather strange consequences.
It is then possible to prove that each number chosen in this way will
be greater than all such previously chosen numbers with probability
one. Let N be the greatest such number chosen so far. Then there are
finitely many natural numbers less than or equal to N but infinitely
many greater than N. So the next number chosen will be greater than N
with probability one. Note that our ostensibly random sequence is
strictly increasing with probability one. This is not the only
bizarre consequence of the axiom of choice: see the well-known
Banach-Tarski sphere paradox. So I should think a physicist would do
well to be wary of the axiom of choice as tending to produce
non-physical results.

The frequentist approach does not assume the axiom of choice and makes
no use of transfinite mathematics or completed limits. If it did, the
problems you mention would in fact arise.

Arnold Neumaier

Mar 30, 2004, 11:32:30 AM

> r...@maths.tcd.ie wrote in message news:<c3njd0$295l$1...@lanczos.maths.tcd.ie>...
>

>>For example, you'll find in many probability
>>books and hear from the mouths of top probability theorists the
>>claim that no process can produce random, uniformly distributed
>>positive integers, but that processes can produce random uniformly
>>distributed real numbers between zero and one (e.g. toss a fair
>>coin exactly aleph_0 times to get the binary expansion).

This has a very simple reason: There is no consistent definition of
random, uniformly distributed positive integers, while there is
one for random uniformly distributed real numbers between zero and one.
This is a purely mathematical statement independent of any
interpretation!

And of course, when people say 'produce' they mean
'produce in theory', or if they mean 'produce in practice' they
have in mind that it is produced only approximately.

Arnold Neumaier

Russell Blackadar

Mar 30, 2004, 12:42:53 PM

Patrick Powers wrote:
>
> r...@maths.tcd.ie wrote in message news:<c3njd0$295l$1...@lanczos.maths.tcd.ie>...
> > Arnold Neumaier <Arnold....@univie.ac.at> writes:

[snip]

> Using the axioms of choice and infinity then one can indeed choose a
> natural number at random. There are some rather strange consequences.
> It is then possible to prove that each number chosen in this way will
> be greater than all such previously chosen numbers with probability
> one. Let N be the greatest such number chosen so far. Then there are
> finitely many natural numbers less than or equal to N but infinitely
> many greater than N. So the next number chosen will be greater than N
> with probability one. Note that our ostensibly random sequence is
> strictly increasing with probability one.

Hmm, interesting post, thanks.

> This is not the only
> bizarre consequence of the axiom of choice: see the well-known
> Banach-Tarski sphere paradox. So I should think a physicist would do
> well to be wary of the axiom of choice as tending to produce
> non-physical results.

But this only happens if you use it in a scenario that is *already*
unphysical, e.g. if you claim it's possible to draw a lottery ball
from a cage containing aleph_0 ping-pong balls. Here the physical
problem is not in the drawing, but in the setting up of the cage to
begin with. As for Banach-Tarski, it also requires an unphysical
scenario; you can't make it work by pulverizing and reassembling a
ball made of any physical material.

So I don't see any problem with a physicist accepting AC and its
useful consequences as applied to the mathematics of continua.

J. J. Lodder

Mar 30, 2004, 12:29:07 PM

Bartosz Milewski <bar...@nospam.relisoft.com> wrote:

> "J. J. Lodder" <nos...@de-ster.demon.nl> wrote in message
> news:1gb70ve.44...@de-ster.xs4all.nl...
> > Bartosz Milewski <bar...@nospam.relisoft.com> wrote:
> >
> > Your conceptual problem has nothing to do with quantum mechanics.
> > It arises in precisely the same form when you want to verify by
> > experiment that a coin being thrown repeatedly is fair
> > (That is, has exactly 50% probability of coming up heads or tails)
>
> There is a huge difference between quantum probability and classical
> probability.

Not at all, from the point of view of probability theory.

> Coin tosses are not "really" random. They are chaotic, which
> means we can't predict the results because (a) we never know the initial
> conditions _exactly_ (the butterfly effect) and (b) because we don't have
> computers powerful enough to model a coin toss. So coin tossing is for all
> "practical" purposes random, but theoretically it's not!

We use probability theory when we don't know, or don't want to know,
about underlying causes.
Whether or not such causes are actually present is irrelevant.
Probability just deals with 'something' that produces heads or tails,
and determines properties of the sequence of them,
(like confidence in being fair)
using the means of probability theory.

> In QM, on the other
> hand, randomness is inherent. If you can prepare a system in pure state, you
> know the initial conditions _exactly_. And yet, the results of experiments
> are only predicted probabilistically. Moreover, there are no hidden
> variables (this approach has been tried), whose knowledge could specify the
> initial conditions more accurately and maybe let you predict the exact
> outcomes.
>
> I have no problems with coin tosses as long as you don't use a quantum coin.

All coins are quantum coins, for we live in a quantum world.
In practice it may be quite hard to say whether or not
a coin throw may be considered to be 'classical'.
Quantum mechanics may come in in the precise timing
of the twitching of your fingers, on the molecular level,
when flipping the coin.

Not that it matters,

Jan

Patrick Van Esch

Mar 30, 2004, 12:31:45 PM

"Bartosz Milewski" <bar...@nospam.relisoft.com> wrote in message news:<c428jd$cvv$1...@brokaw.wa.com>...

> "Patrick Van Esch" <van...@ill.fr> wrote in message
> news:c23e597b.04032...@posting.google.com...
> > So the statement that "events with an extremely small probability
> > associated to it will probably not occur" is an empty statement
> > because tautological. It is only when we say that "events with an
> > extremely small probability will NOT occur" that suddenly, all of the
> > frequentist interpretation of probability theory makes sense.
>
> Yes, that's exactly my point. I was trying to make the cutoff somewhat
> better defined making it a property of a "random" number generator.

In fact, this point is something that bothered me since I learned
about probability theory (20 years ago) and most people didn't seem to
even understand what my problem was (depends probably on the people
you talk to).
The fact that others here seem to struggle with the same problem
indicates that this is somehow a problem :-)
However, I have no idea if you can make a consistent probability
interpretation with such a cutoff. I don't know whether any work in
that direction has ever been undertaken.
The Bayesian interpretation of probabilities is a nice information
theoretical construct of course, I'm not expert enough in it to see if
the problem also exists there. But I have difficulties with people
who deny the frequentist interpretation: after all, this is - to me -
the only way to make a connection to experimental results ! How do
you verify differential cross sections ? You do a number of
experiments, and then you make the HISTOGRAM (counting the number of
occurrences) of the outcomes which you compare with your calculated
probability density. That's nothing else but applying the frequentist
interpretation, no ?

cheers,
Patrick.

Arnold Neumaier

Mar 30, 2004, 12:44:24 PM

I am not circular since I didn't attempt to give a definition of
probability but simply assumed it to refute a claim made.

If randomness has any meaning at all in practice, it must be possible to
draw numbers randomly; if not by the uniform distribution then by
whatever process is assumed.

If you accept that there is something like a 'fair coin toss'
which gives independent events with probability 1/2, you can
easily get arbitrary small probabilities without invoking real numbers,
as I showed in another reply in this thread.

If you can't accept a fair coin toss, I wonder whether you have
any place at all for probabilistic models in your world view.

The problem in all these issues related to probability is the silent
switch between theory and practice at some place, which different
people take at a different place, which makes communication difficult
and invites paradoxes. The interface between theory and reality is
always a little vague, and one has to be careful not to make statements
which are meaningless.

Raw reality has no concepts; it simply is. But to do science,
and indeed already to live intelligently, one needs to sort reality
into various conceptual bags that allow one to understand and predict.
Because of our incomplete access to reality, we can do this only in an
imperfect, somewhat fuzzy way. Respecting this in one's thinking
avoids all paradoxes; coming across a paradox means that one moved
somewhere across the border of what was permitted - though it is not
always easy to see where and why.

Full clarity can be obtained only on the logical level; this is
necessarily one level away from reality. In the foundations of
logic, one builds within an intuitive logic a complete model of
everything logic is about, and then is able to clarify the limits
of logical reasoning. This is the closest one can get to a clear
understanding of the foundations. To do the same in physics amounts
to building within mathematics a complete model with all the features
external reality is believed to have, and to discuss within this model
all the concepts and activities physicists work with. If this can be
done in a consistent way, it is as close as we can get to ascertaining
that the model is indeed faithful to external reality.

Therefore, when _I_ discuss probability, I choose such a model
as the background on the basis of which I can speak of well-defined
probabilities. In such a mathematical world, one can draw numbers
by decree (even though one cannot know what was drawn).

In reality, one needs to substitute pseudo-random number generators,
which makes drawing random numbers a practical activity. But of
course their properties only approximate the theoretical thing,
as always when a formal concept is implemented in nature.
Not even the Peano axioms for natural numbers can be realized in
nature - how much less more subtle concepts like probability.

Arnold Neumaier

Bartosz Milewski

Mar 30, 2004, 12:48:40 PM

"Arnold Neumaier" <Arnold....@univie.ac.at> wrote in message
news:4067FA65...@univie.ac.at...

> What is "really" random??? the term "random" has no precise meaning
> outside theory. But in the theory of stochastic processes, there is
> "real" classical randomness.

But stochastic theory is NOT a fundamental theory. It's an idealization of a
chaotic process. Can we interpret QM as an idealization of some more
fundamental theory? A theory where events are theoretically 100%
predictable, but so chaotic that in practice we can only make stochastic
predictions? I'm afraid this path has been tried (hidden variables) and
rejected.

Patrick Powers

Mar 30, 2004, 2:28:14 PM

vec...@weirdtech.com (Italo Vecchi) wrote in message news:<61789046.04032...@posting.google.com>...

>
> That "we can't predict the results" IS randomness. There is nothing
> more to randomness than that.
>

True. But some posters are saying that quantum processes are
essentially random in that assuming that the results can be predicted
leads to a contradiction. In other words, they say that it has been
proved impossible that the events will ever be predicted.

I know about Bell's theorem, but don't know that such a thing has been
proved in general. Can someone please provide a reference?

Patrick Van Esch

Mar 30, 2004, 2:28:20 PM

Arnold Neumaier <Arnold....@univie.ac.at> wrote in message news:<406803CA...@univie.ac.at>...

> Patrick Van Esch wrote:
> > Arnold Neumaier <Arnold....@univie.ac.at> wrote in message news:<40640905...@univie.ac.at>...
> >
>

> But where exactly would you take it? If there is an eps such that
> p<eps means 'it will not occur', what is the supremum of all such eps?
> It would have to be a fundamental constant of nature.
>
> The nonexistence of such a constant implies that your proposal cannot
> be right.
>

I agree that it should be a constant of this universe, if ever there
was such a thing. It is not unthinkable that there IS such a constant
(such as the inverse of the number of spacetime events, which could be
finite). But I do realise how speculative that statement is.
Nevertheless, without this vague idea, I cannot make any sense of what
it experimentally means to have an event with a probability p. Or
better, why every time we do statistics on real data, it works out !

>
> > Your argument of drawing a real number out of [0,1] doesn't apply
> > here, because the outcome of an experiment is never a true real number
> > (most of which cannot even be written down !). There are always a
> > finite number of possibilities in the outcome of an experiment
> > (otherwise it couldn't be written onto a hard disk!).
>
> This does not really help. Let eps be the "extremely small probability"
> according to your proposal. Pick N >> - log eps / log 2, and
> run a series of N coin tosses. You get the result x_1 ... x_N, say.
> Although you really obtained exactly this result, the probability of
> obtaining it was only 2^{-N} << eps. Thus your proposal amounts to
> proving the impossibility of tossing coins more than a fairly small
> number of times

Maybe that "fairly small number of times" is in fact a very big
number, and in our universe there's not enough matter and time to do
all this tossing around!

A very small cutoff can save ALL of frequentist interpretations of
probability, because you are allowed to consider combined, independent
events (what's the probability of tossing a coin 100 times and finding
100 heads in a row AND seeing the moon go supernova etc...).

cheers,
Patrick.

Arnold Neumaier

Mar 31, 2004, 2:30:17 AM

What is 'fundamental'? Stochastic processes are mathematically sound,
well-founded, and consistent, unlike current quantum field theory,
say. So they'd make a better foundation.

Chaotic processes are also idealizations, no less than stochastic
processes. And who can tell what is more basic? You can get one
from the other in suitable approximations...

Arnold Neumaier

Mar 31, 2004, 5:36:20 PM

Patrick Van Esch wrote:
> Arnold Neumaier <Arnold....@univie.ac.at> wrote in message news:<406803CA...@univie.ac.at>...

>
>>But where exactly would you take it? If there is an eps such that
>>p<eps means 'it will not occur', what is the supremum of all such eps?
>>It would have to be a fundamental constant of nature.
>>
>>The nonexistence of such a constant implies that your proposal cannot
>>be right.
>>
>
>
> I agree that it should be a constant of this universe, if ever there
> was such a thing. It is not unthinkable that there IS such a constant
> (such as the inverse of the number of spacetime events, which could be
> finite).

But the constant, to be meaningful in our real life, would have to be
not extremely small; and then it can be refuted easily.

> But I do realise how speculative that statement is.
> Nevertheless, without this vague idea, I cannot make any sense of what
> it experimentally means to have an event with a probability p.

For a single event, it means almost nothing.
For a large number of events, it means roughly the
relative frequency, but with a possibility of deviating by a not
precisely specified amount.

> Or
> better, why every time we do statistics on real data, it works out !

The sense it makes is the following: If you have a sound probabilistic
model of a multitude of independent events e_i with assigned
probability p you'd be surprised if the frequency of events is not
close to p within a small multiple of sqrt(p(1-p)/N). And you'd probably
rather try to explain away a rare occurrence (a brick going upwards due
to fluctuations) by assuming a hidden, unobserved cause (someone throwing
it) rather than just accept it as something within your probabilistic
model. The way probabilities are used in practice is always as rough guides
of what to expect, but not as statements with a 100% exact meaning.
I wrote a paper on surprise:
A. Neumaier,
Fuzzy modeling in terms of surprise,
Fuzzy Sets and Systems 135 (2003), 21-38.
http://www.mat.univie.ac.at/~neum/papers.html#fuzzy
that helps understand the fuzziness inherent in our concepts of reality.
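
The sqrt(p(1-p)/N) yardstick can be tried out numerically. This is a minimal sketch, assuming Python's built-in pseudo-random generator as a stand-in for independent events; the seed, the value of p and the function names are illustrative choices, not part of the argument.

```python
import math
import random


def relative_frequency(p, n, rng):
    """Fraction of n simulated independent events (each with probability p) that occur."""
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n


def standard_error(p, n):
    """The yardstick sqrt(p(1-p)/N) for the spread of the relative frequency."""
    return math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed only so that runs are repeatable
    p = 0.5
    for n in (100, 10_000, 1_000_000):
        freq = relative_frequency(p, n, rng)
        se = standard_error(p, n)
        # Deviation of the observed frequency from p, in units of the yardstick.
        print(f"N={n:>8}  freq={freq:.5f}  deviation={(freq - p) / se:+.2f} standard errors")
```

A deviation of many standard errors is what would count as a surprise in the sense above; nothing in the simulation forbids it.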

>>>Your argument of drawing a real number out of [0,1] doesn't apply
>>>here, because the outcome of an experiment is never a true real number
>>>(most of which cannot even be written down !). There are always a
>>>finite number of possibilities in the outcome of an experiment
>>>(otherwise it couldn't be written onto a hard disk!).
>>
>>This does not really help. Let eps be the "extremely small probability"
>>according to your proposal. Pick N >> - log eps / log 2, and
>>run a series of N coin tosses. You get the result x_1 ... x_N, say.
>>Although you really obtained exactly this result, the probability of
>>obtaining it was only 2^{-N} << eps. Thus your proposal amounts to
>>proving the impossibility of tossing coins more than a fairly small
>>number of times
>
>
> Maybe that "fairly small number of times" is in fact a very big
> number, and in our universe there's not enough matter and time to do
> all this tossing around!

Oh, this is possible only if the universal eps is so tiny that
almost everything is possible - against the use you wanted to make
of it in real life! An ordinary person would set the eps that justifies
their unconscious probabilistic models for assessing ordinary reality
quite high, probably at 1e-6 or so for events that are not very
repetitive. (So would even engineers who are responsible for the safety
of buildings, airplanes, etc.) This can only be justified by
being prepared to take the risk, not by an objective cutoff.
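
The coin-toss argument quoted above (pick N >> -log eps / log 2) can be put into numbers. This is a small sketch; the function name and the sample values of eps are illustrative assumptions, and it computes only the threshold N at which 2^-N first drops below eps.

```python
import math


def min_tosses(eps: float) -> int:
    """Smallest N such that 2**-N is strictly below eps (for 0 < eps < 1)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    # floor(...) + 1 rather than ceil(...) keeps the inequality strict
    # even when eps is an exact power of two.
    return math.floor(-math.log2(eps)) + 1


if __name__ == "__main__":
    for eps in (1e-6, 1e-30, 1e-100):
        n = min_tosses(eps)
        print(f"eps={eps:g}: every specific sequence of {n} fair tosses "
              f"has probability 2^-{n} < eps")
```

Even for an eps as small as 1e-100 the threshold is only a few hundred tosses, which is the sense in which a cutoff of that size would forbid ordinary experiments.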

Arnold Neumaier

Patrick Van Esch

Apr 1, 2004, 5:17:21 AM

Arnold Neumaier <Arnold....@univie.ac.at> wrote in message news:<c4fh54$uur$1...@lfa222122.richmond.edu>...

>
> For a single event, it means almost nothing.
> For a large number of events, it means roughly the
> relative frequency, but with a possibility of deviating by a not
> precisely specified amount.

I know, and that's what you usually do, and the funny thing is that
the deviation is not very high ! When experimental results follow
the predictions of probability in this sense, I will call the
experiment "statistically correct".
But you realize the problems here:
first of all, all observation, even if it is 10000 times flipping a
coin, is a *single event* when taken as a whole, which has a high
probability when the event is "statistically correct" and a low
probability when it is not. We observe that experimental results of
this single event which are not statistically correct *never* occur.
This, to me, is a kind of miracle if there's not some kind of "law"
stating exactly this. Now I know the statistical physics explanation
of course: the high and low probabilities of these combined events
just reflect the fact that we deal with subsets of events with
different sizes: the subset of all sequences of 10000 heads and tails
which have an average around 5000 heads and 5000 tails is much bigger
than the subset with 10000 heads, which essentially contains just one
sequence. So if you hit one sequence "blindly" you'd probably hit one
of the biggest subsets. In fact, when you are in such a case, you
don't really need probability theory as such, it is just a matter of
*counting* equivalent points.
However, I have much more difficulty with quantum mechanical
probability predictions. After all, these are more fundamental
predictions because they are not the result of "picking blindly into a
big set of possibilities". So in order to make sense of the QM probability
predictions, we need a better understanding of exactly what is meant
by the frequentist interpretation of events with probability p. Now I
know that the frequentist interpretation is not the favourite one
amongst theoreticians, but I don't see how, as an experimentalist, you
can get around it.
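
The counting argument for the 10000-flip example can be checked exactly. This is a minimal sketch, assuming that "around 5000 heads" means between 4800 and 5200 heads; that window and the function name are illustrative choices.

```python
from fractions import Fraction
from math import comb


def fraction_with_heads_between(n, lo, hi):
    """Exact fraction of the 2**n head/tail sequences with lo <= #heads <= hi."""
    count = sum(comb(n, k) for k in range(lo, hi + 1))
    return Fraction(count, 2**n)


n = 10_000
# The window 4800..5200 is an assumed reading of "around 5000 heads".
near_half = fraction_with_heads_between(n, 4_800, 5_200)
# The "10000 heads" subset contains exactly one sequence.
all_heads = fraction_with_heads_between(n, n, n)

if __name__ == "__main__":
    print("fraction of sequences with 4800..5200 heads:", float(near_half))
    print("all-heads subset is a single sequence:", all_heads == Fraction(1, 2**n))
```

This is pure counting of equally weighted sequences; it says nothing about why the sequences should be equally weighted in the first place.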

cheers,
Patrick.

Aaron Denney

Apr 1, 2004, 10:15:23 AM

On 2004-03-30, Patrick Van Esch <van...@ill.fr> wrote:
> The Bayesian interpretation of probabilities is a nice information
> theoretical construct of course, I'm not expert enough in it to see if
> the problem also exists there. But I have difficulties with people
> who deny the frequentist interpretation: after all, this is - to me -
> the only way to make a connection to experimental results ! How do
> you verify differential cross sections ? You do a number of
> experiments, and then you make the HISTOGRAM (counting the number of
> occurrences) of the outcomes which you compare with your calculated
> probability density. That's nothing else but applying the frequentist
> interpretation, no ?

You can view it that way, but it comes nicely out of the Bayesian
interpretation too. Basically, as the number of samples goes up, the
probability of getting something markedly different than having the
right frequencies gets incredibly small.
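
One way to see this concentration in Bayesian terms is sketched below. It assumes a uniform prior on the unknown probability, so that after observing k successes in n trials the posterior is Beta(k+1, n-k+1); the prior, the sample counts and the function name are illustrative assumptions.

```python
import math


def beta_posterior_mean_sd(successes, n):
    """Mean and standard deviation of the Beta(successes+1, n-successes+1) posterior.

    This is the posterior for an unknown probability under a uniform prior
    (an assumption made for this sketch).
    """
    a = successes + 1
    b = n - successes + 1
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        mean, sd = beta_posterior_mean_sd(n // 2, n)
        print(f"n={n:>7}  posterior mean={mean:.4f}  posterior sd={sd:.5f}")
```

The posterior width shrinks roughly like 1/sqrt(n), so the histogram comparison and the Bayesian update point at the same frequencies for large samples.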

--
Aaron Denney
-><-

Jerzy Karczmarczuk

Apr 1, 2004, 10:45:09 AM

Patrick Van Esch wrote:
> We observe that experimental results of [[ 10000 coin flip example]]

> this single event which are not statistically correct *never* occur.
> This, to me, is a kind of miracle if there's not some kind of "law"
> stating exactly this. Now I know the statistical physics explanation
> of course: the high and low probabilities of these combined events
> just reflect the fact that we deal with subsets of events with
> different sizes: the subset of all sequences of 10000 heads and tails
> which have an average around 5000 heads and 5000 tails is much bigger
> than the subset with 10000 heads, which essentially contains just one
> sequence. So if you hit one sequence "blindly" you'd probably hit one
> of the biggest subsets. In fact, when you are in such a case, you
> don't really need probability theory as such, it is just a matter of
> *counting* equivalent points.
> However, I have much more difficulty with quantum mechanical
> probability predictions. After all, these are more fundamental
> predictions because they are not the result of "picking blindly into a
> big set of possibilities". So in order to make sense of the QM probability
> predictions, we need a better understanding of exactly what is meant
> by the frequentist interpretation of events with probability p. Now I
> know that the frequentist interpretation is not the favourite one
> amongst theoreticians, but I don't see how, as an experimentalist, you
> can get around it.

The "miracle" is not only the cardinality of contributing sets of events,
but also the *ergodicity*, which makes the statistics actually work in
practice.

Now, let's speculate (without selling our multi-souls to Multi-Devil...)
about a variant of many-world model of the Quantum Reality. Imagine that
there ARE effectively many worlds, each of them forming a fibre upon the
configuration substrate. When an electron may pass through a double slit,
in one subset of worlds it passes by one, in the other - through the other.
I am *NOT* speaking about the decoherence. No "splitting" takes place, the
fibres *are there*, and as in many other fibrous spaces you may choose your
fibres as you wish. Since this picture is embedded within a normal quantum
evolution picture, you may imagine also the "fusion"; you follow two different
fibres, but which finally end up as one, since this is just drawing lines
in space, not physics.

And now, the dynamics in this fibrous space is *ERGODIC*. A kind of chaos in
multi-space... Following one fibre long enough to be able to repeat one
experiment (with identical preparation) many times, should give you the
distribution obtained from many fibres, from a "statistical ensemble" of them.

And you get a probabilistic model for the quantum reality. Of course, I didn't
really explain anything, I just shifted the focus from one set of words to
another. But such a "model" might raise its head again, when in some unspecified
future the experimentalists making very, very delicate measurements discover
that there are non-linear disturbances of the linear superposition principle.

I believe that one day the present quantum theory will be replaced by something
else. Of course we won't get back to classical physics. If we discover some
non-linearities, then we will probably have to change our
probabilistic/frequentist or whatever interpretation, but let's wait...

Jerzy Karczmarczuk

Patrick Powers

Apr 4, 2004, 8:37:02 AM

van...@ill.fr (Patrick Van Esch) wrote in message news:<c23e597b.04033...@posting.google.com>...

> Now I
> know that the frequentist interpretation is not the favourite one
> amongst theoreticians, but I don't see how, as an experimentalist, you
> can get around it.
>
>
> cheers,
> Patrick.

Actually, I think experimentalists use the Bayesian approach. Usually
an experiment is undertaken with the expectation of some result. If
the results do not match this expectation, the equipment is tweaked
until the expected result is obtained. If this doesn't work either
the experiment is dropped or (rarely) some other explanation is found.

It is also true that physics experiments and games of chance are
deliberately constructed so that the frequentist model holds. Other
applications of the frequentist model, such as predictions of the
weather, are a looser application of statistics. In many sciences the
application of statistics falls in the "better than nothing" category.

What problem do theoretical physicists have with the frequentist
approach? I don't see how else QED could be interpreted. For
cosmology it is somewhat questionable.

Italo Vecchi

Apr 4, 2004, 8:37:00 AM

nos...@de-ster.demon.nl (J. J. Lodder) wrote in message news:<1gba5ru.1be...@de-ster.xs4all.nl>...

> All coins are quantum coins, for we live in a quantum world.

Well said.

> In practice it may be quite hard to say whether or not
> a coin throw may be considered to be 'classical'.
> Quantum mechanics may come in in the precise timing
> of the twitching of your fingers, on the molecular level,
> when flipping the coin.
>

Quantum mechanics comes into coins also through the impossibility of
fixing/determining initial conditions.
It would be interesting to have estimates for the growth of initial
"quantum scale" indeterminacies in classical models of chaotic
physical systems.
Consider for example a set of macroscopic balls bouncing in a box. One
may assume that deviations grow by a factor 10 between bounces (this
is based on my experience as a billiard player) in a reasonable
position-momentum norm. If the above assumption is realistic the
uncertainty principle prevents macroscopically accurate* deterministic
forecasts spanning more than a few dozen bounces.

IV

* of the kind that's relevant in actual billiard games.
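
The "few dozen bounces" estimate can be reproduced with a back-of-the-envelope calculation. This is only a sketch: the factor 10 per bounce is the assumption stated above, while the initial deviation of order 1e-34 (roughly the size of hbar in SI units) and the macroscopic threshold of 5e-4 are illustrative numbers that are not in the post.

```python
def bounces_until_macroscopic(initial, target, growth_per_bounce=10.0):
    """Number of bounces until a deviation of size `initial` reaches `target`,

    assuming it is multiplied by `growth_per_bounce` at every bounce.
    """
    if not 0.0 < initial < target:
        raise ValueError("need 0 < initial < target")
    if growth_per_bounce <= 1.0:
        raise ValueError("deviations must grow between bounces")
    deviation = initial
    bounces = 0
    while deviation < target:
        deviation *= growth_per_bounce
        bounces += 1
    return bounces


if __name__ == "__main__":
    # 1e-34 and 5e-4 are assumed orders of magnitude, chosen for illustration.
    print(bounces_until_macroscopic(1e-34, 5e-4))
```

Because the growth is exponential, changing the assumed initial or final scale by several orders of magnitude shifts the answer by only a few bounces.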

Bartosz Milewski

Apr 4, 2004, 8:37:04 AM

"Arnold Neumaier" <Arnold....@univie.ac.at> wrote in message
news:c4fh54$uur$1...@lfa222122.richmond.edu...

> The sense it makes is the following: If you have a sound probabilistic
> model of a multitude of independent events e_i with assigned
> probability p you'd be surprised if the frequency of events is not
> close to p within a small multiple of sqrt(p(1-p)/N). And you'd probably
> rather try to explain away a rare occurrence (a brick going upwards due
> to fluctuations) by assuming a hidden, unobserved cause (someone throwing
> it) rather than just accept it as something within your probabilistic
> model. The way probabilities are used in practice is always as rough guides
> of what to expect, but not as statements with a 100% exact meaning.
> I wrote a paper on surprise:
> A. Neumaier,
> Fuzzy modeling in terms of surprise,
> Fuzzy Sets and Systems 135 (2003), 21-38.
> http://www.mat.univie.ac.at/~neum/papers.html#fuzzy
> that helps understand the fuzziness inherent in our concepts of reality.

This brings up the interesting possibility that the cutoff is anthropic.
Things that are statistically improbable (from the point of view of the
theory we are testing), even if they happen, are rejected. Conversely, if
too many improbable things happen, we reject the theory. So there is no
correct or incorrect theory (as long as it's self-consistent), only the
currently accepted one. Moreover, a theory once believed to be experimentally
confirmed (within a very good margin of error) might at some point be
refuted by another set of identical experiments.

There is a very good example of this phenomenon--the evolution of the speed
of light since 1935. After the first publication of the Michelson
measurement, up till 1947, all the measurements were lower than the currently
accepted value by more than the (admitted) experimental error (see diagram
at http://www.sigma-engineering.co.uk/light/lightindex.shtml). This was
probably caused by the experimenters rejecting the data points that were too
far from the then accepted Michelson's number. Notice that even here I'm not
considering the possibility that there was a 12-year long statistical
fluctuation ;-)

Danny Ross Lunsford

Apr 5, 2004, 3:02:06 PM

Patrick Powers wrote:

> Actually, I think experimentalists use the Bayesian approach. Usually
> an experiment is undertaken with the expectation of some result. If
> the results do not match this expectation, the equipment is tweaked
> until the expected result is obtained. If this doesn't work either
> the experiment is dropped or (rarely) some other explanation is found.

I've often worried that the vaunted accuracy of QED is illusory, that
is, data are used to "tune" the equipment.

-drl

Italo Vecchi

Apr 6, 2004, 1:55:52 PM

nos...@de-ster.demon.nl (J. J. Lodder) wrote in message news:<1gba5ru.1be...@de-ster.xs4all.nl>...

> All coins are quantum coins, for we live in a quantum world.

Well said.

> In practice it may be quite hard to say whether or not
> a coin throw may be considered to be 'classical'.
> Quantum mechanics may come in in the precise timing
> of the twitching of your fingers, on the molecular level,
> when flipping the coin.
>

Quantum mechanics kicks into coins also through the impossibility of
fixing/determining initial conditions.
In a chaotic system an initial "quantum scale" indeterminacy will
quickly grow macroscopic, as highlighted in "Newtonian Chaos +
Heisenberg Uncertainty = macroscopic indeterminacy" by Barone, S.R.,
Kunhardt, E.E., Bentson, J., and Syljuasen, A., American Journal of
Physics, Vol 61, No. 5, May 1993.

Cheers,

IV

r...@maths.tcd.ie

Apr 6, 2004, 1:57:27 PM

Arnold Neumaier <Arnold....@univie.ac.at> writes:

>> r...@maths.tcd.ie wrote in message news:<c3njd0$295l$1...@lanczos.maths.tcd.ie>...

>>>For example, you'll find in many probability
>>>books and hear from the mouths of top probability theorists the
>>>claim that no process can produce random, uniformly distributed
>>>positive integers, but that processes can produce random uniformly
>>>distributed real numbers between zero and one (e.g. toss a fair
>>>coin exactly aleph_0 times to get the binary expansion).

>This has a very simple reason: There is no consistent definition of
>random, uniformly distributed positive integers, while there is
>one for random uniformly distributed real numbers between zero and one.

Please give the definition you claim exists.

>This is a purely mathematical statement independent of any
>interpretation!

Wow.

>And of course, when people say 'produce' they mean
>'produce in theory', or if they mean 'produce in practice' they
>have in mind that it is produced only approximately.

My point was that probability distributions and methods of
generating random numbers are not in one-to-one correspondence, and
I gave an example of a method of generating integers which had
no corresponding probability distribution. I too meant "produce
in theory", since obviously we can't use the axiom of choice in
real life, but if we want to understand what probability theory
is and isn't about (in theory), then we shouldn't make mistakes
on this fundamental point.

R.

r...@maths.tcd.ie

Apr 6, 2004, 5:52:47 PM

frisbie...@yahoo.com (Patrick Powers) writes:

>r...@maths.tcd.ie wrote in message news:<c3njd0$295l$1...@lanczos.maths.tcd.ie>...
>>

>> Right, so actually, the frequentist interpretation of probability
>> suffers from the same disease that the many-worlds interpretation
>> does, or at least the non-Bayesian one. In many worlds, the problem
>> is that there's no way to justify dismissing worlds with a small
>> quantum amplitude as being rare, and in the frequentist
>> version of probability theory, there's no way to justify dismissing
>> outcomes with small probability as being rare.
>>
>Quantum theory is a probabilistic theory and extremely unlikely events
>are not excluded, nor should they be. So this is a property of the
>theory, not the interpretation. It seems to me that an interpretation
>that excluded such events absolutely would be in error.

I'm not saying that they should be excluded; as a good Bayesian
I would merely say that the information available to me leads
me to expect that they won't happen, although it's not impossible.

>> The frequentist interpretation of probability suffers from worse
>> diseases as well. For example, you'll find in many probability
>> books and hear from the mouths of top probability theorists the
>> claim that no process can produce random, uniformly distributed
>> positive integers, but that processes can produce random uniformly
>> distributed real numbers between zero and one (e.g. toss a fair
>> coin exactly aleph_0 times to get the binary expansion).

>Yes these claims as stated are contradictory. I suspect that the
>definitions you are using are imprecise. The word "process" implies
>computability, that the process is finite. A real number is cleverly
>defined as a limit of a finite process. So a real number is
>computable in this sense, that it can be approximated as closely as
>one likes in finite time. The problem with your proof is that as the
>real number is computed the choice of cosets changes with each step so
>the process does not converge to an integer.

You are right; there is no convergence and there's no way to
actually compute such an integer or any approximation to it
in a finite number of operations. Modern mathematics, however,
allows us to deal with infinite sets without having to always
consider what can and can not be done in a finite number of
operations. The set of (not necessarily continuous) functions
from R to R has a cardinality greater than R itself, for example,
although this fact is of no relevance to finite creatures like
us. We don't *need* to define reals in terms of limits (for example,
we can define them in terms of Dedekind cuts).

So, rather than considering what's happening with the
cosets as being something which happens while the number is being
generated, I suppose that some acquaintance of mine can merely
give me a random number between 0 and 1, and then I convert
it into an integer. You, for example, might tell me 0.5 which
you can do in finite time, having generated it by your own
algorithm, which might be "just pick the middle number". If my
choice function is independent of your choice of number, then the
integer corresponding to 0.5 will be as random as the integer
corresponding to any other real number.

On the other hand, your point is well taken; generating reals
between 0 and 1 is itself impossible in practice.

>Using the axioms of choice and infinity then one can indeed choose a
>natural number at random. There are some rather strange consequences.
> It is then possible to prove that each number chosen in this way will
>be greater than all such previously chosen numbers with probability
>one. Let N be the greatest such number chosen so far. Then there are
>finitely many natural numbers less than or equal to N but infinitely
>many greater than N. So the next number chosen will be greater than N
>with probability one. Note that our ostensibly random sequence is
>strictly increasing with probability one.

If we consider infinite processes then the notion that the probabilities
one or zero mean anything goes out the window. Note that if I
tell you the third "random" integer first, then with probability
one the second is bigger than it, so with probability one the sequence
is not strictly increasing.

>This is not the only
>bizarre consequence of the axiom of choice: see the well-known
>Banach-Tarski sphere paradox. So I should think a physicist would do
>well to be wary of the axiom of choice as tending to produce
>non-physical results.

Indeed; these are more mathematical facts than physical ones.
Also, you don't need the axiom of choice to produce things like
Banach-Tarski - the set of complex numbers:
A={\sum a_n exp(i*n): a_n in N}, where N is the set of non-negative
integers can be broken into A=B disjoint union C, where B is A+1 and
C is exp(i)*A. Both B and C are exactly the same shape and size
as A, B being a translated version of A and C being a rotated version
of A, so A can be broken into two parts, each as "big" as A itself.
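
The decomposition can be written out as follows. This is a sketch under two assumptions not spelled out above: the sums are finite, and the representation of each element of A is unique, which holds because exp(i) is transcendental (Lindemann-Weierstrass, since i is algebraic and nonzero).

```latex
A = \Bigl\{ \sum_{n=0}^{N} a_n e^{in} \;:\; N \ge 0,\ a_n \in \{0,1,2,\dots\} \Bigr\}
\\[4pt]
B = A + 1 = \{ z \in A : a_0 \ge 1 \}, \qquad
C = e^{i} A = \{ z \in A : a_0 = 0 \}
\\[4pt]
A = B \cup C, \qquad B \cap C = \emptyset
```

Every element of A has either a_0 = 0 or a_0 >= 1, so the two pieces exhaust A, and uniqueness of the coefficients keeps them disjoint.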

What I'm saying is that physicists don't need the axiom of choice
to get "unphysical" results. Infinite sets, which we use all the
time, are sufficient.

>The frequentist approach does not assume the axiom of choice and makes
>no use of transfinite mathematics or completed limits. If it did, the
>problems you mention would in fact arise.

Well, the problems I mentioned arise if we believe the axiom
of choice, the "experiment generating random numbers" version of
probability theory, and the idea that we can deal with infinite
sets (an axiom of infinity, eg "there exist infinite sets"), all
at the same time.

R.

Arnold Neumaier

Apr 6, 2004, 5:53:30 PM

Bartosz Milewski wrote:
> "Arnold Neumaier" <Arnold....@univie.ac.at> wrote in message
> news:c4fh54$uur$1...@lfa222122.richmond.edu...
>
>>The sense it makes is the following: If you have a sound probabilistic
>>model of a multitude of independent events e_i with assigned
>>probability p you'd be surprised if the frequency of events is not
>>close to p within a small multiple of sqrt(p(1-p)/N). And you'd probably
>>rather try to explain away a rare occurrence (a brick going upwards due
>>to fluctuations) by assuming a hidden, unobserved cause (someone throwing
>>it) rather than just accept it as something within your probabilistic
>>model. The way probabilities are used in practice is always as rough guides
>>of what to expect, but not as statements with a 100% exact meaning.
>>I wrote a paper on surprise:
>> A. Neumaier,
>> Fuzzy modeling in terms of surprise,
>> Fuzzy Sets and Systems 135 (2003), 21-38.
>> http://www.mat.univie.ac.at/~neum/papers.html#fuzzy
>>that helps understand the fuzziness inherent in our concepts of reality.
>
>
> This brings about an interesting possibility that the cutoff is anthropic.

Not only anthropic, but subjective. Different people have different
views on the matter and are prepared to take different risks.

> Things that are statistically improbable (from the point of the theory we
> are testing), even if they happen, are rejected. Conversely, if too many
> improbable things happen, we reject the theory. So there is no correct or
> incorrect theory (as long as it's self-consistent), only the currently
> accepted one.

Positrons were observed before they were predicted by theory, but the
observers didn't believe the phenomenon was real. Rather than face
ridicule with a premature publication they ignored their evidence.
On the other hand, cold fusion had a different story...

We take a small probability p seriously only if the associated phenomena
are repeatable frequently enough that an approximate frequentist
interpretation makes sense.
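
The "small multiple of sqrt(p(1-p)/N)" criterion quoted above can be tried numerically. The following is only a minimal sketch: the function name frequency_surprise is made up for illustration, and Python's pseudo-random generator stands in for genuinely independent events.

```python
import math
import random


def frequency_surprise(p, n, seed=0):
    """Simulate n independent events of probability p.

    Returns the observed frequency, the scale sqrt(p(1-p)/n), and the
    deviation |freq - p| measured in units of that scale.
    """
    rng = random.Random(seed)  # assumption: a seeded PRNG models the events
    hits = sum(1 for _ in range(n) if rng.random() < p)
    freq = hits / n
    std_err = math.sqrt(p * (1 - p) / n)
    return freq, std_err, abs(freq - p) / std_err


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        freq, std_err, z = frequency_surprise(0.3, n)
        print(f"N={n:>7}  freq={freq:.4f}  scale={std_err:.5f}  deviation={z:.2f}")
```

A deviation of a few units would not be surprising; a deviation of dozens of units would suggest that the model, or the assumption of independence, is wrong.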

Arnold Neumaier

eb...@lfa221051.richmond.edu

Apr 7, 2004, 6:45:05 AM

In article <9511688f.04032...@posting.google.com>,
Patrick Powers <frisbie...@yahoo.com> wrote:

>Using the axioms of choice and infinity then one can indeed choose a
>natural number at random

From context, let me add "with a uniform distribution" -- that is,
with all natural numbers equally probable.

Is this statement meant to be obvious? It's not at all clear to me
how the axiom of choice says anything about probabilities.

If it's not meant to be obvious, but is nonetheless true, can someone
point me to an appropriate place to read more on this?

-Ted

--
[E-mail me at na...@domain.edu, as opposed to na...@machine.domain.edu.]

Daryl McCullough

Apr 8, 2004, 2:26:49 PM

eb...@lfa221051.richmond.edu says...

>In article <9511688f.04032...@posting.google.com>,
>Patrick Powers <frisbie...@yahoo.com> wrote:
>
>>Using the axioms of choice and infinity then one can indeed choose a
>>natural number at random
>
>From context, let me add "with a uniform distribution" -- that is,
>with all natural numbers equally probable.
>
>Is this statement meant to be obvious? It's not at all clear to me
>how the axiom of choice says anything about probabilities.
>
>If it's not meant to be obvious, but is nonetheless true, can someone
>point me to an appropriate place to read more on this?

I'm cross-posting to sci.math, because maybe a mathematician
has something to add. Patrick's point is not complicated
to prove, but it's hard to understand how to interpret it.

1. Pick an enumeration of all positive rational numbers
between 0 and 1. For example, 1/2, 1/3, 2/3, 1/4, 3/4, 1/5, 2/5,
3/5, 4/5, ... Let q_n be the nth rational number.

2. Define an equivalence relation on real numbers between
0 and 1: x ~~ y if and only if |x-y| is rational.

3. Using the axiom of choice, construct a set S by picking
one element out of every equivalence class.

4. Define S_n to be { x | |x - q_n| is in S }

Note that S_0 union S_1 union S_2 union ... = (0,1).

5. So here's how you generate a random nonnegative integer: Generate
a random real x in (0,1), and let your random integer be that n such
that x is an element of S_n.

There is no probability distribution on the possible outcomes of this
process, so it isn't a "uniform distribution on the integers" in a
measure-theoretic sense. But you can argue by symmetry that in some
sense every n is "equally likely" because each of the sets S_n are
identical, except for a translation.

--
Daryl McCullough
Ithaca, NY

Arnold Neumaier

Apr 8, 2004, 2:26:54 PM

r...@maths.tcd.ie wrote:
> Arnold Neumaier <Arnold....@univie.ac.at> writes:

>>There is no consistent definition of
>>random, uniformly distributed positive integers, while there is
>>one for random uniformly distributed real numbers between zero and one.
>
> Please give the definition you claim exists.

Just to know what kind of answer you'd be prepared to accept,
please let me know what you regard as the definition of random
binary numbers with equal probabilities of 0 and 1. Then I'll
be able to answer your more difficult question satisfactorily.

Arnold Neumaier

Apr 8, 2004, 6:38:28 PM

eb...@lfa221051.richmond.edu wrote:
> In article <9511688f.04032...@posting.google.com>,
> Patrick Powers <frisbie...@yahoo.com> wrote:
>
>
>>Using the axioms of choice and infinity then one can indeed choose a
>>natural number at random
>
>
> From context, let me add "with a uniform distribution" -- that is,
> with all natural numbers equally probable.

There is no uniform distribution on natural numbers.
There is no way to make formal sense of the statement
'all natural numbers equally probable'.
Thus this 'context' is logically meaningless.

The natural least informative distribution on natural numbers
is a Poisson distribution, but here one has to be at least informed
about the mean.

Arnold Neumaier

Phillip Helbig---remove CLOTHES to reply

Apr 8, 2004, 6:40:54 PM

In article <40745CDC...@univie.ac.at>, Arnold Neumaier
<Arnold....@univie.ac.at> writes:

> > I've often worried that the vaunted accuracy of QED is illusory, that
> > is, data are used to "tune" the equipment.
>

> There is no way to tune the Lamb shift.
>
> Any equipment has to be calibrated to give maximally consistent
> results (i.e., smallest measurement errors), but this is a
> completely different matter.

When I was studying physics in Hamburg, in the course on electrodynamics
(which basically followed Jackson's book), the lecturer (Prof. Dr.
Heinrich Victor von Geramb) remarked on the good agreement of QED theory
and observation, to 13 significant digits or whatever it was at the
time. I asked him what happens after the least significant digit: is
that as accurate as experiment can get, or are genuine deviations
detected. His answer was neither: that's as accurate as the theory can
get (at present), since the actual numerical calculations are quite
involved.

r...@maths.tcd.ie

Apr 8, 2004, 6:43:19 PM

to sci-physic...@moderators.isc.org

da...@atc-nycorp.com (Daryl McCullough) writes:

>eb...@lfa221051.richmond.edu says...

>>In article <9511688f.04032...@posting.google.com>,
>>Patrick Powers <frisbie...@yahoo.com> wrote:
>>
>>>Using the axioms of choice and infinity then one can indeed choose a
>>>natural number at random

>Patrick's point is not complicated
>to prove, but it's hard to understand how to interpret it.

It was my point, and I'll offer four possible interpretations:

1. Probability theory only works for finite sets.
2. The axiom of choice is wrong.
3. Probability theory does not tell us what kinds of processes
of generating numbers are or aren't possible, and in fact
methods of generating numbers exist which have no corresponding
probability distribution.
4. The problem is a problem with real arithmetic. Replace
reals by (for example) Conway numbers and consistent probability
distributions can be generated.

I think 3 is the most conservative.

R.

r...@maths.tcd.ie

Apr 8, 2004, 6:42:51 PM

to sci-physic...@moderators.isc.org

Arnold Neumaier <Arnold....@univie.ac.at> writes:

I never claimed that there was a consistent definition of
a random number at all; in fact, I would say that there isn't.
We might very well have a situation where we have incomplete
information about the value of a particular number, and in
such cases we use probability theory to reason about it.

It would be incorrect to jump from there to the statement that there
was some process used to generate the number with the property that
the same process applied twice will produce a different number even
though nothing at all is different the second time around, except
the number produced. This is, approximately, what people mean when
they talk of randomness. If we suppose that some unknown difference
is responsible for producing a different number the second time,
then this is merely lack of knowledge and not randomness - that is,
it is a Bayesian interpretation.

So my position is the following - the use of probability theory
is not required because of a property that the number we
wish to talk about has (ie randomness is not a property of
numbers themselves, and hence there is no such thing as a
random number). Neither is it required because of a property
of a method of producing those numbers (the "randomness"
property described in the above paragraph), and in fact I have
proven that probability theory is not something which tells
us about methods of generating numbers. I would say that
probability theory is required when we do not have sufficient
information to reason deductively, and that it expresses the
relationship between given information and possible values of
the number, where possible means not ruled out by the
information available.

Instead of giving you a definition of a random number, I will
tell you that I would use probability if I did not know whether a
particular number was 0 or 1. Now you can let me know what
was the consistent definition you had in mind for "random uniformly
distributed real numbers between zero and one", when you claimed
that one existed.

R.

r.e.s.

Apr 11, 2004, 11:44:36 AM

"Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...

> The natural least informative distribution on natural numbers
> is a Poisson distribution, but here one has to be at least informed
> about the mean.

It would be Poisson if, in addition to the mean, one knew (only)
that the distribution is that of a sum of independent Bernoulli
rv's. It may be more "natural" not to assume any such additional
knowledge, in which case the max-entropy distribution would be
geometric rather than Poisson.
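
A quick numerical check of the entropy claim, as a sketch only. It assumes the support is {0, 1, 2, ...}, truncates both distributions after a fixed number of terms, and uses natural logarithms; the helper names are made up for illustration.

```python
import math


def entropy(pmf):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in pmf if p > 0.0)


def geometric_pmf(mean, terms=500):
    """Geometric distribution on {0, 1, 2, ...} with the given mean (truncated)."""
    q = mean / (1.0 + mean)
    return [(1.0 - q) * q**k for k in range(terms)]


def poisson_pmf(mean, terms=500):
    """Poisson distribution with the given mean (truncated), built iteratively."""
    pmf = [math.exp(-mean)]
    for k in range(1, terms):
        pmf.append(pmf[-1] * mean / k)
    return pmf


if __name__ == "__main__":
    for mean in (0.5, 3.0, 10.0):
        h_geo = entropy(geometric_pmf(mean))
        h_poi = entropy(poisson_pmf(mean))
        print(f"mean={mean:>4}  geometric={h_geo:.4f}  Poisson={h_poi:.4f}")
```

For each mean tried here the geometric entropy comes out larger, in line with the geometric distribution being the maximum-entropy choice when only the mean is fixed. Its entropy also has the closed form (1+m) ln(1+m) - m ln m for mean m.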

--r.e.s.

Italo Vecchi

Apr 11, 2004, 11:44:30 AM

frisbie...@yahoo.com (Patrick Powers) wrote in message news:<9511688f.04040...@posting.google.com>...

> Actually, I think experimentalists use the Bayesian approach. Usually
> an experiment is undertaken with the expectation of some result. If
> the results do not match this expectation, the equipment is tweaked
> until the expected result is obtained. If this doesn't work either
> the experiment is dropped or (rarely) some other explanation is found.
>

You're raising a huge issue.
Data must be interpreted and "unbiased interpretation" is certainly
an idealisation and arguably an oxymoron. Besides, no one ever lost
his job for bowing to mainstream consensus.

An interesting example of experimental evidence yielding divergent
interpretations concerns the univalence (or boson-fermion)
superselection rule ([4]). Observables exhibiting lack of periodicity
for 2pi rotations are ruled out by the superselection rule (sic),
which stems from the postulate of rotational invariance and is deeply
embedded in modern physics.
Embarrassingly, there are experimental results ([1], [2], [5], [7])
exhibiting broken 2pi-periodicity.
The interpretation of the experimental results is controversial
([3],[6]), also because the idea of invariance is a rather subtle one.
According to the critics "the standard response is that the
experiments are beautiful, but they are not the ones involved in the
univalence superselection rule where you must rotate the entire
isolated system" ([6]). I find the objection rather bizarre, since
there is no such thing as invariance of an isolated system. Invariance
can be meaningfully formulated and tested only for two non-isolated
systems, one of them including the observer.

Cheers,

IV

---------------------------------------------

"When I use a word," Humpty Dumpty said, in rather a scornful tone,
"it means just what I choose it to mean - neither more nor less."

"The question is," said Alice, "whether you can make words mean so
many different things."

"The question is," said Humpty Dumpty, "which is to be master - that's
all."

Lewis Carroll, Through the Looking-Glass, Chapter VI

Patrick Powers

Apr 11, 2004, 11:44:29 AM

r...@maths.tcd.ie wrote in message news:<c54a85$2nrl$1...@lanczos.maths.tcd.ie>...

> da...@atc-nycorp.com (Daryl McCullough) writes:
>
>
> It was my point, and I'll offer four possible interpretations:
>
> 1. Probability theory only works for finite sets.
> 2. The axiom of choice is wrong.
> 3. Probability theory does not tell us what kinds of processes
> of generating numbers are or aren't possible, and in fact
> methods of generating numbers exist which have no corresponding
> probability distribution.
> 4. The problem is a problem with real arithmetic. Replace
> reals by (for example) Conway numbers and consistent probability
> distributions can be generated.
>
> I think 3 is the most conservative.
>
> R.

You began with a criticism of frequentism. Frequentism is a
nuts-and-bolts area of mathematics intended for use by physicists,
statisticians, and the like. As such, it concerns itself with finite
processes, computable numbers, and limits. To use other tools to
attempt to invalidate such results is simply irrelevant: there is no
point in arguing about what tools are permissible, one simply chooses
one's arena and proceeds from there. So this area of probability
theory indeed places limits on processes of generating random numbers.
If you choose to play a different game, you will likely get different
results. And very well you may do so: probability theory is by no
means restricted to frequentism.

ZZBunker

Apr 11, 2004, 11:44:23 AM

eb...@lfa221051.richmond.edu wrote in message news:<c4v9oc$dlf$1...@lfa222122.richmond.edu>...

> In article <9511688f.04032...@posting.google.com>,
> Patrick Powers <frisbie...@yahoo.com> wrote:
>
> >Using the axioms of choice and infinity then one can indeed choose a
> >natural number at random
>
> From context, let me add "with a uniform distribution" -- that is,
> with all natural numbers equally probable.
>
> Is this statement meant to be obvious? It's not at all clear to me
> how the axiom of choice says anything about probabilities.

That's because AC doesn't say anything about natural numbers.
It says that the real numbers can be well-ordered.

All AC asserts is that there are choice functions
C:N->{0,1).

If C and D are choice functions, then
C=D iff C(n)=D(n) for all n.

It doesn't say anything about natural numbers.
It doesn't say anything about real numbers.
It doesn't say anything about probabilities.
It doesn't say how to construct a choice function.
It doesn't say anything about pi.

Ralph E. Frost

Apr 12, 2004, 10:06:11 AM

"Italo Vecchi" <vec...@weirdtech.com> wrote in message
news:61789046.0404...@posting.google.com...

Are you saying there is a substantial body of experimental results from
particle studies that are routinely and consistently thrown out because, to
include them would require a significantly different explanation than that
provided by the current version of the Standard Model?

Italo Vecchi

Apr 13, 2004, 3:43:31 AM

"Ralph E. Frost" <ra...@REMOVErefrost.com> wrote in message news:<107ke21...@corp.supernews.com>...

I don't know. Not necessarily anyway, even accepting that univalent
superselection doesn't hold. However I wonder how eagerly such
violations are being sought by QED experimentalists.

Below are the references for my previous post.

Cheers,

IV

[1] Rauch H. et al., "Verification of Coherent Spinor Rotation of
    Fermions", Physics Letters 54A (1975), pp. 425-427.
[2] Greenberger D.M., "The neutron interferometer as a device for
    illustrating the strange behaviour of quantum systems",
    Rev. Mod. Phys. 55 (1983), pp. 875-905.
[3] Hegerfeldt G. and Kraus K., "Critical remark on the observability
    of the sign change of spinors under 2pi rotations",
    Phys. Rev. 170 (1978), pp. 449-457.
[4] Wigner E.P., "Interpretation of Quantum Mechanics" (1981), in
    Quantum Theory and Measurement, Wheeler and Zurek eds., pp. 260-314.
[5] Rauch H. et al., "Precise determination of the 4pi-Periodicity
    factor of a Spinor Wave Function", Zeitschrift für Physik B29
    (1978), pp. 281-284.
[6] Wightman A.S., "Superselection Rules: Old and New", Nuovo Cimento
    110B (5-6) (1995), pp. 752-769.
[7] Badurek et al., "Polarized neutron interferometry: a survey",
    Physica B 151 (1988), pp. 82-92.

Bartosz Milewski

Apr 13, 2004, 3:43:50 AM

<r...@maths.tcd.ie> wrote in message
news:c549nk$2nnh$1...@lanczos.maths.tcd.ie...

> It would be incorrect to jump from there to the statement that there
> was some process used to generate the number with the property that
> the same process applied twice will produce a different number even
> though nothing at all is different the second time around, except
> the number produced.

Yet this is exactly what QM claims. For instance, take the following
process:
Create a large number of muons and pass them through a filter that will
select particles in a particular state. Measure the time until each muon so
prepared decays. You will get a different number every time. The theory claims
that this randomness happens not because there is a tiny clock ticking
inside a muon whose initial condition cannot be measured (that would be a
hidden variable), but because QM is inherently random.

It doesn't matter whether it is technically feasible or not to create muons
or other unstable elementary particles in a pure state, as long as QM
doesn't say it is impossible.
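
The statistics of such an experiment can be sketched in a few lines, with a caveat that bears directly on the argument. The sketch assumes exponentially distributed decay times with the measured mean muon lifetime of about 2.197 microseconds, and it uses a seeded pseudo-random generator. The seed plays exactly the role of the hidden clock that QM denies, so the code models the distribution of outcomes, not their claimed indeterminism.

```python
import random

MUON_MEAN_LIFETIME_US = 2.197  # mean muon lifetime in microseconds


def sample_decay_times(n, seed=None):
    """Draw n decay times (microseconds) from an exponential distribution."""
    rng = random.Random(seed)  # assumption: a PRNG stands in for nature
    rate = 1.0 / MUON_MEAN_LIFETIME_US
    return [rng.expovariate(rate) for _ in range(n)]


if __name__ == "__main__":
    times = sample_decay_times(10_000, seed=1)
    print("first five decay times:", [round(t, 3) for t in times[:5]])
    print("sample mean:", round(sum(times) / len(times), 3))
```

Identically "prepared" calls give different numbers from one draw to the next, yet rerunning with the same seed reproduces the whole list, which is precisely what a hidden-variable account would look like.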

Arnold Neumaier

Apr 13, 2004, 12:31:42 PM

Ralph E. Frost wrote:

> Are you saying there is a substantial body of experimental results from
> particle studies that are routinely and consistently thrown out because, to
> include them would require a significantly different explanation than that
> provided by the current version of the Standard Model?

This happened already long ago with positron tracks,
before the positron was officially shown to exist.
It takes courage to be against mainstream even when you are right,
since the chance that you are wrong and just misinterpreting
something is much much higher. Thus a competent physicist goes
against mainstream only if the evidence is overwhelming.

Arnold Neumaier

Patrick Powers

Apr 13, 2004, 5:46:44 PM

r...@maths.tcd.ie wrote in message news:<c549nk$2nnh$1...@lanczos.maths.tcd.ie>...

I hereby abandon discussion of frequentism and resort to the axioms of
probability. In this light, you are correct in believing that the
generation of random numbers is not part of probability theory. In
fact, randomness itself is not really part of probability theory,
which is the theory of normed measure spaces. So within the bounds of
probability theory it is perfectly all right to deny that random numbers
exist.

As to a consistent definition of "randomly uniformly distributed real
numbers between zero and one", this is what Lebesgue measure is all
about. You take all the sets with obvious measure like P[a<X<b]=b-a
and combine them in obvious ways and show that the resulting sets have
the expected measure. The interesting, useful theorem is that the union
of countably many sets of measure zero has measure zero, which makes it
easy to prove (among other things) that the set of rationals in [0,1] is
of measure zero. You take note of Cantor's clever construction of an
uncountable set of measure zero. Once finished constructing Lebesgue
measure there remain exotic sets which have no defined measure, but it
can be proved that any attempt to avoid this makes things worse.
Rudin has nice treatments of this.
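
For the rationals, the measure-zero claim can also be seen directly by a covering argument. The following is the standard textbook sketch, with lambda denoting Lebesgue measure and q_1, q_2, ... an assumed enumeration of the rationals in [0,1].

```latex
\text{Fix } \varepsilon > 0 \text{ and cover each } q_k \text{ by }
I_k = \bigl(q_k - \varepsilon\,2^{-k-1},\; q_k + \varepsilon\,2^{-k-1}\bigr),
\quad \lambda(I_k) = \varepsilon\,2^{-k}.
\\[4pt]
\lambda\bigl(\mathbb{Q}\cap[0,1]\bigr)
\;\le\; \sum_{k=1}^{\infty} \lambda(I_k)
\;=\; \sum_{k=1}^{\infty} \varepsilon\,2^{-k}
\;=\; \varepsilon .
```

Since epsilon is arbitrary, the measure is zero.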

There IS a theory of pseudo-random numbers -- iterated functions that
generate sequences of numbers that pass statistical tests of
randomness -- but this is part of number theory.
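
As an illustration of such an iterated function, here is a minimal sketch of a linear congruential generator. The constants are the widely quoted ones from Numerical Recipes; they are chosen purely as an example. Generators of this simple kind pass some statistical tests and fail others, so this is not a recommendation.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Linear congruential generator: x -> (a*x + c) mod m, scaled to [0, 1)."""
    state = seed % m
    while True:
        state = (a * state + c) % m
        yield state / m


def take(gen, n):
    """Return the next n values from a generator as a list."""
    return [next(gen) for _ in range(n)]


if __name__ == "__main__":
    print(take(lcg(42), 5))
    print(take(lcg(42), 5))  # same seed, same "random" sequence
    print(take(lcg(43), 5))  # different seed, different sequence
```

The sequence is entirely determined by the seed, so any randomness has to be supplied from outside the iteration.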

Daniel Waggoner

Apr 13, 2004, 5:48:45 PM

Unfortunately, S_n as defined above is not Lebesgue measurable. This
means that it does not really make sense to ask the question, "What is the
probability that a uniform random variable lies in S_n?" I realize that
this is counterintuitive, but probability over uncountable sample spaces
is tricky, precisely for these reasons.

As to the original question about natural numbers. If one requires that
probability measures be countably additive, then there is no probability
measure defined on the integers that I would want to label "uniform."
Countably additive means that if {A_n} is a countable collection of
disjoint (measurable) sets, then the probability of the union of the A_n
is equal to the sum of the probabilities of each of the A_n. I certainly
want my probability measures to be countably additive, but some authors
argue that it is enough to be finitely additive. If one allows finitely
additive probability measures, then one can define a probability measure
over the natural numbers that some might want to label "uniform." However,
the construction of this measure uses the axiom of choice (or at least
some large portion of the axiom of choice).
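
One way to get a feel for such a "uniform" finitely additive measure is natural density, the limiting fraction of 1..n that lies in a set. The sketch below is an illustration, not the construction referred to above: density is only defined for sets where the limit exists, and extending it to all subsets is the step that needs choice-type principles. The function name is made up.

```python
def density_up_to(predicate, n):
    """Fraction of the integers 1..n that satisfy predicate."""
    count = sum(1 for k in range(1, n + 1) if predicate(k))
    return count / n


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        evens = density_up_to(lambda k: k % 2 == 0, n)
        single = density_up_to(lambda k: k == 7, n)
        print(f"n={n:>6}  evens={evens:.5f}  singleton {{7}}={single:.5f}")
```

Every singleton has density tending to zero while the whole set has density one, so the countable union of singletons breaks countable additivity even though finite unions of disjoint sets behave additively.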

Daniel Waggoner

Patrick Powers

Apr 13, 2004, 5:52:17 PM

>
> In fact, a process which produces uniformly distributed random real
> numbers between zero and one can be modified so that it produces
> uniformly distributed random positive integers in the following
> way: Consider [0,1) as an additive group of reals modulo 1. Then
> it has a subgroup, S, consisting of rational numbers in [0,1). Form
> a set X by choosing one element from each coset of S in [0,1). Then
> define X_r = {a+r mod 1 | a \in X}, for each r in S. The X_r are
> pairwise disjoint, pairwise congruent sets, with congruent meaning
> they are related to each other by isometries of the group [0,1).
> In that sense, they are as equiprobable as can be. Now if q is a
> random number between 0 and 1, then it falls into exactly one X_r,
> so there is a unique rational number, r, associated with that real
> number, and since the rationals are countable, there is also a
> unique positive integer associated with that real number. Since the
> X_r's are congruent, no one can be any more or less likely than any
> other, so no positive integer is any more or less likely than any
> other to result from this process. Voila, we have a way to get a
> "random" positive integer from a "random" real in [0,1).
>

This heuristic argument is thought-provoking. To sharpen the results
we may attempt to assign a probability to the countably many sets X_r.
We encounter the classic dilemma of a normed measure space. If the
measure of each set is zero then it seems the measure of [0,1) should
also be zero. If positive and the sets are equiprobable, then [0,1)
would tend to have unbounded measure. The canonical measure of
probability, Lebesgue measure, is no help since it simply throws up
its hands and declares each X_r to be unmeasurable. It is not clear
where to proceed from here in constructing a suitable measure.
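
The dilemma can be stated in one line. The sketch assumes a measure mu on [0,1) that is countably additive, invariant under translation mod 1, and normalised to mu([0,1)) = 1; S is the set of rationals in [0,1) as above.

```latex
\mu(X_r) = c \ \text{ for all } r \in S
\quad\Longrightarrow\quad
1 = \mu\bigl([0,1)\bigr) = \sum_{r \in S} \mu(X_r) = \sum_{r \in S} c =
\begin{cases}
0, & c = 0,\\
\infty, & c > 0.
\end{cases}
```

Neither case is consistent with total measure 1, so at least one of the three assumptions has to be given up for the sets X_r.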

r...@maths.tcd.ie

Apr 13, 2004, 5:54:47 PM

frisbie...@yahoo.com (Patrick Powers) writes:

>You began with a criticism of frequentism. Frequentism is a
>nuts-and-bolts area of mathematics intended for use by physicists,
>statisticians, and the like. As such, it concerns itself with finite
>processes, computable numbers, and limits. To use other tools to
>attempt to invalidate such results is simply irrelevant: there is no
>point in arguing about what tools are permissible, one simply chooses
>one's arena and proceeds from there.

You are saying that we deal with finite sets in practice, when
we have our sleeves rolled up and are clutching a screwdriver,
so we don't need an interpretation of probability theory that
deals with infinite sets. Ok; but that's your interpretation,
and not everybody will be happy with it; somebody who spends more
time dealing with the axiom of choice than with nuts-and-bolts might
not, for example.

>So this area of probability
>theory indeed places limits on processes of generating random numbers.

That's not clear. The mere existence of a distribution over some
set doesn't immediately tell me anything about the existence or
nonexistence of processes selecting elements from that set. I can
have a uniform distribution over the set {0,1}; that doesn't
furnish me with a way to select one of those numbers "at random",
and doesn't tell me that anybody else knows a way either. Similarly,
the nonexistence of a uniform normalised distribution over the
integers tells me nothing about whether somebody I meet tomorrow
might or might not start spouting random integers at me.

The real problem is that "finite processes, computable numbers and
limits" as you say, will never suffice to produce anything random;
for that you need a seed to the random number generator, or some
unknown extra input which can serve as a seed, and you suppose that
that seed is already "random". Then your finite process argument
merely says that if we don't already have a random element of an
infinite set, we can't generate one, although we can generate
elements of perhaps large finite sets given many seeds from smaller
sets.

R.

Arnold Neumaier

Apr 14, 2004, 3:17:08 AM

Yes. High order radiative corrections require the sum of hundreds
(if not more) high-dimensional integrals that are hard to compute
numerically. Also, the fine structure constant is not known too well.
Finally, I think, the accuracy is now such that improvements
require corrections from proton form factors, etc., which are not known
to high precision. Thus it will be difficult to extend the accuracy...

Arnold Neumaier

Apr 14, 2004, 3:17:10 AM

r...@maths.tcd.ie wrote:
> Arnold Neumaier <Arnold....@univie.ac.at> writes:
>
>
>>r...@maths.tcd.ie wrote:
>>
>>>Arnold Neumaier <Arnold....@univie.ac.at> writes:
>
>
>>>>There is no consistent definition of
>>>>random, uniformly distributed positive integers, while there is
>>>>one for random uniformly distributed real numbers between zero and one.
>>>
>>>Please give the definition you claim exists.
>
>
>>Just to know what kind of answer you'd be prepared to accept,
>>please let me know what you regard as the definition of random
>>binary numbers with equal probabilities of 0 and 1. Then I'll
>>be able to answer your more difficult question satisfactorily.
>

> So my position is the following - the use of probability theory
> is not required because of a property that the number we
> wish to talk about has (ie randomness is not a property of
> numbers themselves, and hence there is no such thing as a
> random number). Neither is it required because of a property
> of a method of producing those numbers (the "randomness"
> property described in the above paragraph), and in fact I have
> proven that probability theory is not something which tells
> us about methods of generating numbers. I would say that
> probability theory is required when we do not have sufficient
> information to reason deductively, and that it expresses the
> relationship between given information and possible values of
> the number, where possible means not ruled out by the
> information available.
>
> Instead of giving you a definition of a random number, I will
> tell you that I would use probability if I did not know whether a
> particular number was 0 or 1. Now you can let me know what
> was the consistent definition you had in mind for "random uniformly
> distributed real numbers between zero and one", when you claimed
> that one existed.

With such a vague concept of probability as you use, I cannot
satisfy your curiosity. Giving a formal definition requires that
we agree on a common basis, which is obviously not the case.

Arnold Neumaier

Patrick Powers

Apr 15, 2004, 2:57:01 AM

Daniel Waggoner <danielNowa...@mindspring.com> wrote in message news:<pan.2004.04.10....@mindspring.com>...

> As to the original question about natural numbers. If one requires that
> probability measures be countably additive, then there is no probability
> measure defined on the integers that I would want to label "uniform."
> Countably additive means that if {A_n} is a countable collection of
> disjoint (measurable) sets, then the probability of the union of the A_n
> is equal to the sum of the probabilities of each of the A_n. I certainly
> want my probability measures to be countably additive, but some authors
> argue that it is enough to be finitely additive.

It seems to me that no countable probability space for which each
element has probability zero can have a countably additive measure.
By a countably additive measure we mean that we have the useful
theorem that a countable union of sets of measure zero has measure
zero. In the hypothesized uniform measure over the integers each
singleton set has measure zero. The integers are a countable union of
singleton sets. So a countably additive measure must have a measure
of zero for the entire space and it fails to be a probability space.
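
Spelled out as a short derivation (a sketch, writing p for the common probability of each singleton, a symbol not used in the post itself), countable additivity forces:

```latex
P(\mathbb{Z}) = P\Bigl(\bigcup_{n \in \mathbb{Z}} \{n\}\Bigr)
             = \sum_{n \in \mathbb{Z}} P(\{n\})
             = \sum_{n \in \mathbb{Z}} p
             = \begin{cases} 0, & p = 0, \\ \infty, & p > 0, \end{cases}
```

so neither choice of p gives total probability 1.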

Italo Vecchi

Apr 15, 2004, 11:33:28 AM

Arnold Neumaier <Arnold....@univie.ac.at> wrote in message news:<407BE49...@univie.ac.at>...
...

> Thus a competent physicist goes
> against mainstream only if the evidence is overwhelming.
>

Indeed. Nothing to the tune of "the standard model is flawed because
univalence superselection is testably false" has ever been published
or posted. Even the people who demonstrated broken 2pi-symmetry (guys
of Aharonov and Zeilinger caliber) seem uninterested in pressing the
point, although the factual evidence is rock solid. Wightman, who
co-invented superselection, acknowledges that the experimental results
are beautiful. He just explains them away with an argument (first set
forth in [1]) that appears meaningless, at least to me.

Possible violations of Lorentz symmetry are apparently a hot topic in
current physics research (see [2]), but the results on 2pi-rotations
of fermions are nowhere mentioned.

Cheers (oh well ... ),

IV

[1] Hegerfeldt G. and Kraus K. "Critical remark on the observability
of the sign change of spinors under 2pi rotations" Phys. Rev. 170
(1978), pp 449-457.

[2] http://physicsweb.org/article/world/17/3/7

---------------

Fools rush in where angels fear to tread.

A. Pope

Arnold Neumaier

Apr 15, 2004, 11:34:26 AM

Patrick Powers wrote:
> r...@maths.tcd.ie wrote in message news:<c549nk$2nnh$1...@lanczos.maths.tcd.ie>...
>
>>Arnold Neumaier <Arnold....@univie.ac.at> writes:
>>
>>>>>There is no consistent definition of
>>>>>random, uniformly distributed positive integers, while there is
>>>>>one for random uniformly distributed real numbers between zero and one.
>>>>
>>>>Please give the definition you claim exists.
>

> I hereby abandon discussion of frequentism and resort to the axioms of
> probability. In this light, you are correct in believing that the
> generation of random numbers is not part of probability theory. In
> fact, randomness itself is not really part of probability theory,
> which is the theory of normed measure spaces. So in the bounds of
> probability theory it is perfectly all right to deny that random numbers
> exist.

Not at all.

In probability theory, a random number is just a random variable x,
i.e., a measurable function on the sigma algebra of measurable subsets
of the set Omega of possible experiments.

For each experiment omega in Omega, x(omega) is a realization, i.e.,
the number drawn in this particular experiment.

The only thing not specified in probability
theory is the mechanism that draws the number, and hence there is no
way to know which experiment omega has been realized. Probability
theory makes only statements about _all_ realizations simultaneously.

Given the axioms of probability theory, it is clear that a random
variable x such that
<f(x)> = integral_0^1 f(s) ds
for all integrable functions f is a random number uniformly distributed
between zero and one, and any x(omega) is a realization of it, i.e.,
an actual number in [0,1].

On the other hand, there is no uniformly distributed random natural
number since the uniform measure on natural numbers,
mu(f) = integral dmu(k) f(k)
is not normalizable.
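
As a rough numerical illustration of the defining property <f(x)> = integral_0^1 f(s) ds, one can average f over many pseudo-random draws. This is only a sketch: Python's random.random() is assumed to stand in for realizations x(omega) in [0,1), and the name sample_mean and the test function f(s) = s^2 are chosen here for illustration.

```python
import random


def sample_mean(f, n, seed=0):
    """Average f over n pseudo-random draws from [0, 1).

    Each draw plays the role of one realization x(omega). The generator
    is a deterministic stand-in, not a genuinely random mechanism.
    """
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n


if __name__ == "__main__":
    # Exact value for comparison: integral_0^1 s^2 ds = 1/3.
    estimate = sample_mean(lambda s: s * s, 200_000)
    print(f"sample mean = {estimate:.4f}, exact integral = {1 / 3:.4f}")
```

Within the Kolmogorov framework no analogous sampler can be written for the natural numbers, since no normalized measure weights every k equally.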

Arnold Neumaier

r...@maths.tcd.ie

Apr 19, 2004, 1:32:32 PM

to sci-physic...@moderators.isc.org

frisbie...@yahoo.com (Patrick Powers) writes:

>Daniel Waggoner <danielNowa...@mindspring.com> wrote in message news:<pan.2004.04.10....@mindspring.com>...
>> As to the original question about natural numbers. If one requires that
>> probability measures be countably additive, then there is no probability
>> measure defined on the integers that I would want to label "uniform."
>> Countably additive means that if {A_n} is a countable collection of
>> disjoint (measurable) sets, then the probability of the union of the A_n
>> is equal to the sum of the probabilities of each of the A_n. I certainly
>> want my probability measures to be countably additive, but some authors
>> argue that it is enough to be finitely additive.

>It seems to me that no countable probability space for which each
>element has probability zero can have a countably additive measure.
>By a countably additive measure we mean that we have the useful
>theorem that a countable union of sets of measure zero has measure
>zero. In the hypothesized uniform measure over the integers each
>singleton set has measure zero. The integers are a countable union of
>singleton sets. So a countably additive measure must have a measure
>of zero for the entire space and it fails to be a probability space.

Right; the problem would be solved if our number system had genuine
infinitesimals. Then each integer could be assigned a probability
1/aleph_0 and everything would work out ok. Such number systems
exist and there is nothing inherent in probability theory which
ties it to real numbers except its current formulation. We could
even imagine a formal definition of probability distributions as
equivalence classes of algorithms for producing numbers from "random"
seeds. Then we could honestly claim to be talking about processes
generating numbers when we do probability.

>> If one allows finitely
>> additive probability measures, then one can define a probability measure
>> over the natural numbers that some might want to label "uniform." However,
>> the construction of this measure uses the axiom of choice (or at least
>> some large portion of the axiom of choice).

Do you have a reference for this?

R.

Patrick Powers

Apr 19, 2004, 1:42:00 PM

Arnold Neumaier <Arnold....@univie.ac.at> wrote in message news:<c5ma22$1be$1...@lfa222122.richmond.edu>...

> Patrick Powers wrote:
> > r...@maths.tcd.ie wrote in message news:<c549nk$2nnh$1...@lanczos.maths.tcd.ie>...

> >

Thank you for this post. I resorted to actually looking up the
Kolmogorov probability axioms. There are only three, and quite simple.
There is no mention of randomness at all. Indeed this is very much
in the Hilbert spirit of axioms. He believed that axioms should not
appeal to intuition, and stated that his axioms for geometry would be
equally valid if the word "line" were replaced with "xasdaegvm". In
this spirit I contend that the concept of randomness is not essential
for probability theory.

As you note, a random variable is really a measurable function on a
normed measure space. The realization of a random variable is a
rather tenuous concept because "probability theory makes only
statements about _all_ realizations simultaneously" so a single
realization is not part of this world. It is really about integrals
over sets. I concur that the concept of the random number and the
intuition therefrom are important and useful and confess to pedantry.
(Not pederasty! Pedantry!)

By the way, countable additivity is the third axiom. So just as you
say, any uniform distribution over the integers is not part of the
Kolmogorov world.

Daniel Waggoner

Apr 19, 2004, 2:07:53 PM

On Thu, 15 Apr 2004 06:57:01 +0000, Patrick Powers wrote:

> Daniel Waggoner <danielNowa...@mindspring.com> wrote in message
> news:<pan.2004.04.10....@mindspring.com>...
>> As to the original question about natural numbers. If one requires
>> that probability measures be countably additive, then there is no
>> probability measure defined on the integers that I would want to label
>> "uniform." Countably additive means that if {A_n} is a countable
>> collection of disjoint (measurable) sets, then the probability of the
>> union of the A_n is equal to the sum of the probabilities of each of
>> the A_n. I certainly want my probability measures to be countably
>> additive, but some authors argue that it is enough to be finitely
>> additive.
>
> It seems to me that no countable probability space for which each
> element has probability zero can have a countably additive measure. By
> a countably additive measure we mean that we have the useful theorem
> that a countable union of sets of measure zero has measure zero. In the
> hypothesized uniform measure over the integers each singleton set has
> measure zero. The integers are a countable union of singleton sets. So
> a countably additive measure must have a measure of zero for the entire
> space and it fails to be a probability space.

This is exactly correct. Things get more interesting if we only require
finite additivity. It is possible to have a finitely additive probability
measure on a countable set where the probability of each singleton is zero
but the probability of the entire set is one. This is possible because
countable collections of sets do not have to "add up" in probability.
There are some properties that a uniform probability measure on a
countable set must satisfy.

1) The probability of any finite set is zero.
2) The probability of any set with finite complement is one.

The problem, the place where the axiom of choice comes in, is in assigning
probabilities to infinite sets with infinite complements. For example,
what is the probability of the even naturals? What is the probability of
the odd naturals? What is the probability of the positive powers of two?
One might argue (though I would not) that the probability of the first two
is a half, but it is not so clear what the probability of the third should
be. Even if you could come up with a rule for assigning a probability to
this set, I could always come up with ever more complicated sets that your
rule would not apply to. Also, in assigning probabilities, care must be
taken so that finite additivity is preserved. This is where the axiom of
choice comes in. It allows one to make the (uncountable number of)
choices of probabilities for infinite sets with infinite complements in a
consistent manner.
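
One candidate rule for the three example sets is the relative frequency among 1..n as n grows (often called natural density). The sketch below only illustrates that one rule and is not the choice-based construction referred to above; density_up_to and the predicate names are made up here for the example.

```python
from fractions import Fraction


def density_up_to(predicate, n):
    """Exact fraction of the integers 1..n that satisfy `predicate`."""
    hits = sum(1 for k in range(1, n + 1) if predicate(k))
    return Fraction(hits, n)


def is_even(k):
    return k % 2 == 0


def is_odd(k):
    return k % 2 == 1


def is_positive_power_of_two(k):
    # 2, 4, 8, ...: exactly one bit set, and k > 1.
    return k > 1 and k & (k - 1) == 0


if __name__ == "__main__":
    for n in (10**2, 10**4, 10**5):
        print(
            n,
            density_up_to(is_even, n),
            density_up_to(is_odd, n),
            float(density_up_to(is_positive_power_of_two, n)),
        )
```

Under this rule the evens and odds each get 1/2 and the powers of two drift toward 0. The rule is silent on sets whose relative frequency never settles down (the integers with leading decimal digit 1 are a standard example), which is the kind of set for which some further choice has to be made.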

Herman Rubin

Apr 20, 2004, 2:34:59 AM

In article <c5mli1$1c1e$1...@lanczos.maths.tcd.ie>, <r...@maths.tcd.ie> wrote:
>frisbie...@yahoo.com (Patrick Powers) writes:

>>Daniel Waggoner <danielNowa...@mindspring.com> wrote in message news:<pan.2004.04.10....@mindspring.com>...
>>> As to the original question about natural numbers. If one requires that
>>> probability measures be countably additive, then there is no probability
>>> measure defined on the integers that I would want to label "uniform."
>>> Countably additive means that if {A_n} is a countable collection of
>>> disjoint (measurable) sets, then the probability of the union of the A_n
>>> is equal to the sum of the probabilities of each of the A_n. I certainly
>>> want my probability measures to be countably additive, but some authors
>>> argue that it is enough to be finitely additive.

>>It seems to me that no countable probability space for which each
>>element has probability zero can have a countably additive measure.
>>By a countably additive measure we mean that we have the useful
>>theorem that a countable union of sets of measure zero has measure
>>zero. In the hypothesized uniform measure over the integers each
>>singleton set has measure zero. The integers are a countable union of
>>singleton sets. So a countably additive measure must have a measure
>>of zero for the entire space and it fails to be a probability space.

>Right; the problem would be solved if our number system had genuine
>infinitesimals. Then each integer could be assigned a probability
>1/aleph_0 and everything would work out ok.

Wrong. In non-standard analysis, there exist
infinitesimals, but all of them are much smaller than
anything looking like that. All non-standard positive
integers have at least as many smaller integers as there
are ordinary real numbers. But they behave like finite
integers within the model.

>Such number systems
>exist and there is nothing inherent in probability theory which
>ties it to real numbers except its current formulation.

Non-standard models of the real numbers are much more
different from the usual ones than you seem to think,
and are at the same time more similar.

>We could
>even imagine a formal definition of probability distributions as
>equivalence classes of algorithms for producing numbers from "random"
>seeds. Then we could honestly claim to be talking about processes
>generating numbers when we do probability.

This is already the case in probability as we have it
now, but with random seeds being real numbers uniform
between 0 and 1.

>>> If one allows finitely
>>> additive probability measures, then one can define a probability measure
>>> over the natural numbers that some might want to label "uniform." However,
>>> the construction of this measure uses the axiom of choice (or at least
>>> some large portion of the axiom of choice).

>Do you have a reference for this?

One does not need much of the axiom of choice, but some
is needed. If one only wants some of the sets to be
measurable, nothing is needed; consider the field of
sets which are periodic from some point on, and give
it the limiting frequency. But what are you going to
do with it?
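
The field described here can be sketched concretely. In the sketch below (assumed names: EventuallyPeriodic, frequency, union; math.lcm needs Python 3.9 or later), a set of positive integers is stored as a finite part below some cutoff plus a set of residues modulo a period from the cutoff onward, and its limiting frequency is the number of residues divided by the period.

```python
from dataclasses import dataclass
from fractions import Fraction
from math import lcm


@dataclass(frozen=True)
class EventuallyPeriodic:
    """A set of positive integers that is periodic from `start` on."""

    finite_part: frozenset  # members smaller than `start`
    start: int              # from here on, membership depends only on n % period
    period: int
    residues: frozenset     # residues mod `period` that belong to the set

    def __contains__(self, n):
        if n < self.start:
            return n in self.finite_part
        return n % self.period in self.residues

    def frequency(self):
        """Limiting relative frequency; the finite part does not matter."""
        return Fraction(len(self.residues), self.period)


def union(a, b):
    """Union of two eventually periodic sets, again eventually periodic."""
    start = max(a.start, b.start)
    period = lcm(a.period, b.period)
    finite = frozenset(n for n in range(1, start) if n in a or n in b)
    residues = frozenset(
        r for r in range(period)
        if r % a.period in a.residues or r % b.period in b.residues
    )
    return EventuallyPeriodic(finite, start, period, residues)


if __name__ == "__main__":
    evens = EventuallyPeriodic(frozenset(), 1, 2, frozenset({0}))
    odd_multiples_of_3 = EventuallyPeriodic(frozenset(), 1, 6, frozenset({3}))
    print(evens.frequency(), odd_multiples_of_3.frequency())
    print(union(evens, odd_multiples_of_3).frequency())
```

In this representation finite sets get frequency 0, sets with finite complement get 1, and the frequency of a disjoint union is the sum of the frequencies, so the assignment is finitely additive on this field without any appeal to choice.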

--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

Patrick Powers

Apr 20, 2004, 2:35:22 AM

r...@maths.tcd.ie wrote in message news:<c5mli1$1c1e$1...@lanczos.maths.tcd.ie>...

> frisbie...@yahoo.com (Patrick Powers) writes:
>
> >Daniel Waggoner <danielNowa...@mindspring.com> wrote in message news:<pan.2004.04.10....@mindspring.com>...
> >> As to the original question about natural numbers. If one requires that
> >> probability measures be countably additive, then there is no probability
> >> measure defined on the integers that I would want to label "uniform."
> >> Countably additive means that if {A_n} is a countable collection of
> >> disjoint (measurable) sets, then the probability of the union of the A_n
> >> is equal to the sum of the probabilities of each of the A_n. I certainly
> >> want my probability measures to be countably additive, but some authors
> >> argue that it is enough to be finitely additive.
>
> >It seems to me that no countable probability space for which each
> >element has probability zero can have a countably additive measure.
> >By a countably additive measure we mean that we have the useful
> >theorem that a countable union of sets of measure zero has measure
> >zero. In the hypothesized uniform measure over the integers each
> >singleton set has measure zero. The integers are a countable union of
> >singleton sets. So a countably additive measure must have a measure
> >of zero for the entire space and it fails to be a probability space.
>
> Right; the problem would be solved if our number system had genuine
> infinitesimals. Then each integer could be assigned a probability
> 1/aleph_0 and everything would work out ok. Such number systems
> exist and there is nothing inherent in probability theory which
> ties it to real numbers except its current formulation.

Actually measure theory can be applied to just about any set. Complex
numbers, vector spaces, groups, rings, topological spaces, and so
forth are commonly used. I'd be very much surprised if non-standard
analysis had never been applied.

> We could
> even imagine a formal definition of probability distributions as
> equivalence classes of algorithms for producing numbers from "random"
> seeds. Then we could honestly claim to be talking about processes
> generating numbers when we do probability.
>
> >> If one allows finitely
> >> additive probability measures, then one can define a probability measure
> >> over the natural numbers that some might want to label "uniform." However,
> >> the construction of this measure uses the axiom of choice (or at least
> >> some large portion of the axiom of choice).
>
> Do you have a reference for this?
>
> R.

Finitely Additive Measures on Groups and Rings
with M. Pasteka, R. Tichy and R. Winkler
On arbitrary topological groups a natural finitely additive measure
can be defined via compactifications. It is closely related to
Hartman's concept of uniform distribution on non-compact groups (cf.
S. Hartman, Remarks on equidistribution on non-compact groups, Comp.
Math. 16 (1964) 66-71). Applications to several situations are
possible.

Some results of M. Pasteka and other authors on uniform distribution
with respect to translation invariant finitely additive probability
measures on Dedekind domains are transferred to more general
situations. Furthermore it is shown that the range of a polynomial of
degree >1 on a ring of algebraic integers has measure 0.

---
There seems to be a fair amount of literature under "finitely additive
measure".

I also tried "non-Kolmogorov probability" and found some Russians
refuting the non-locality result of Bell's theorem using the p-adics
to produce sets with negative probabilities. Humph.

r...@maths.tcd.ie

Apr 21, 2004, 4:25:17 PM

hru...@stat.purdue.edu (Herman Rubin) writes:

>In article <c5mli1$1c1e$1...@lanczos.maths.tcd.ie>, <r...@maths.tcd.ie> wrote:
>>frisbie...@yahoo.com (Patrick Powers) writes:

>>Right; the problem would be solved if our number system had genuine
>>infinitesimals. Then each integer could be assigned a probability
>>1/aleph_0 and everything would work out ok.

>Wrong. In non-standard analysis, there exist
>infinitesimals, but all of them are much smaller than
>anything looking like that. All non-standard positive
>integers have at least as many smaller integers as there
>are ordinary real numbers. But they behave like finite
>integers within the model.

I was thinking of the surreals rather than the hyperreals;
I believe they have well-defined multiplicative inverses for
every non-zero number (including aleph_0) along with the same
rules of distributivity as real numbers. I don't know
very much about this, but wouldn't that be sufficient?

> We could
>>even imagine a formal definition of probability distributions as
>>equivalence classes of algorithms for producing numbers from "random"
>>seeds. Then we could honestly claim to be talking about processes
>>generating numbers when we do probability.

>This is already the case in probability as we have it
>now, but with random seeds being real numbers uniform
>between 0 and 1.

If that were true there would be a distribution for my process which
generates random integers, but there isn't.

>>>> If one allows finitely
>>>> additive probability measures, then one can define a probability measure
>>>> over the natural numbers that some might want to label "uniform." However,
>>>> the construction of this measure uses the axiom of choice (or at least
>>>> some large portion of the axiom of choice).

>>Do you have a reference for this?

>One does not need much of the axiom of choice, but some
>is needed. If one only wants some of the sets to be
>measurable, nothing is needed; consider the field of
>sets which are periodic from some point on, and give
>it the limiting frequency. But what are you going to
>do with it?

Just some of the sets being measurable isn't satisfying, but
Daniel Waggoner has explained it in another post (thanks, Daniel).

R.

r...@maths.tcd.ie

Apr 22, 2004, 3:29:04 PM

Arnold Neumaier <Arnold....@univie.ac.at> writes:

>In probability theory, a random number is just a random variable x,
>i.e., a measurable function on the sigma algebra of measurable subsets
>of the set Omega of possible experiments.

>For each experiment omega in Omega, x(omega) is a realization, i.e.,
>the number drawn in this particular experiment.

The formalism of probability theory is not in dispute. The
interpretation in terms of randomness is. I showed that a method
of generating reals between zero and one can be modified to generate
integers instead, and further that if no one real is any more or
less likely to be generated than any other, then the same holds for
the corresponding integers generated. The implication of this is
that statements which are commonly made within probability theory,
of the sort "There is no process satisfying this or that", are *not*
in fact statements about processes in the way that we think of them
- processes generating uniformly distributed integers do in fact
exist if we accept the axiom of choice, or at least they exist if
processes generating uniformly distributed reals in [0,1) exist.

R.

Ralph E. Frost

Apr 22, 2004, 3:44:39 PM

"Italo Vecchi" <vec...@weirdtech.com> wrote in message

news:61789046.04041...@posting.google.com...

> Arnold Neumaier <Arnold....@univie.ac.at> wrote in message
news:<407BE49...@univie.ac.at>...
> ...
> > Thus a competent physicist goes
> > against mainstream only if the evidence is overwhelming.
> >

> Indeed. Nothing to ...
...snip...

> of Aharonov and Zeilinger caliber) seem uninterested in pressing the
> point, although the factual evidence is rock solid.

Are you both saying that univalence superselection is testably false but
that this is an absolutely irrelevant fact?
Or, are you both saying there is sound scientific evidence of a valid
anomaly, but that it is presently, seemingly insufficient, or deemed
insufficient, by itself, to modulate the mesmerizing chant of the dominant
trial theory?

Also, if it is an anomaly, how does this small anomaly compare with --Was
it?-- the "UV catastrophe" at the start of the 20th century?

Thanks, in advance, for any additions or corrections you or others can
offer.

Ralph Frost

Arnold Neumaier

Apr 22, 2004, 3:47:52 PM

But it is permitted to talk about realizations, which are just
function values f(omega). By giving a specific definition of the
sigma algebra of interest, and specific recipes defining f(omega),
one has model worlds in which realizations make perfect sense.
The caveat is, of course, that for the real world, we do not have
such a model.

Arnold Neumaier

Herman Rubin

Apr 22, 2004, 4:29:09 PM

In article <c64730$2cgb$1...@lanczos.maths.tcd.ie>, <r...@maths.tcd.ie> wrote:
>hru...@stat.purdue.edu (Herman Rubin) writes:

>>In article <c5mli1$1c1e$1...@lanczos.maths.tcd.ie>, <r...@maths.tcd.ie> wrote:
>>>frisbie...@yahoo.com (Patrick Powers) writes:

>>>Right; the problem would be solved if our number system had genuine
>>>infinitesimals. Then each integer could be assigned a probability
>>>1/aleph_0 and everything would work out ok.

>>Wrong. In non-standard analysis, there exist
>>infinitesimals, but all of them are much smaller than
>>anything looking like that. All non-standard positive
>>integers have at least as many smaller integers as there
>>are ordinary real numbers. But they behave like finite
>>integers within the model.

>I was thinking of the surreals rather than the hyperreals;
>I believe they have well-defined multiplicative inverses for
>every non-zero number (including aleph_0) along with the same
>rules of distributivity as real numbers. I don't know
>very much about this, but wouldn't that be sufficient?

Probably not. They do not behave enough like the real
numbers for much to work.

>> We could
>>>even imagine a formal definition of probability distributions as
>>>equivalence classes of algorithms for producing numbers from "random"
>>>seeds. Then we could honestly claim to be talking about processes
>>>generating numbers when we do probability.

>>This is already the case in probability as we have it
>>now, but with random seeds being real numbers uniform
>>between 0 and 1.

>If that were true there would be a distribution for my process which
>generates random integers, but there isn't.

How can you generate such "random integers"?

>>>>> If one allows finitely
>>>>> additive probability measures, then one can define a probability measure
>>>>> over the natural numbers that some might want to label "uniform." However,
>>>>> the construction of this measure uses the axiom of choice (or at least
>>>>> some large portion of the axiom of choice).

>>>Do you have a reference for this?

>>One does not need much of the axiom of choice, but some
>>is needed. If one only wants some of the sets to be
>>measurable, nothing is needed; consider the field of
>>sets which are periodic from some point on, and give
>>it the limiting frequency. But what are you going to
>>do with it?

>Just some of the sets being measurable isn't satisfying, but
>Daniel Waggoner has explained it in another post (thanks, Daniel).


Ralph Hartley

Apr 22, 2004, 4:30:57 PM

Patrick Powers wrote:
> r...@maths.tcd.ie wrote:

>>Right; the problem would be solved if our number system had genuine
>>infinitesimals. Then each integer could be assigned a probability
>>1/aleph_0 and everything would work out ok. Such number systems
>>exist and there is nothing inherent in probability theory which
>>ties it to real numbers except its current formulation.
>
> Actually measure theory can be applied to just about any set. Complex
> numbers, vector spaces, groups, rings, topological spaces, and so
> forth are commonly used.

You can define measures on any set, but I don't think that's what he means.

He wants to consider measures taking *values* in a field larger than R. For
instance you might want to consider surreal-valued measures (according to
another message, that *is* what he is thinking of).

Then you could have a countably additive measure on the integers that
assigns each integer a probability 1/omega. There are omega integers so
that measure is normalized. A finite subset of the integers with N members
would have probability N/omega (which in the surreals is different from 0
and from 1/omega).

Of course, that still doesn't give a good definition of "a randomly chosen
integer," but it might give a definition of "a (well behaved) function of a
randomly chosen integer." So you might be able to give a consistent meaning
to the sentence "The probability that a randomly chosen integer is even is
1/2." (for the measure given above it is true)

You need to be careful though! Integration, and limits in general, are
tricky in the Surreals, because of the gaps.

Quantum Mechanics (to get back to physics) can be formulated in terms of
measures taking different sorts of values (complex numbers, operators,
etc.), so you would think that someone would have worked all this out, but
I haven't seen it.

Ralph Hartley

Arnold Neumaier

Apr 24, 2004, 12:16:56 PM

r...@maths.tcd.ie wrote:
> Arnold Neumaier <Arnold....@univie.ac.at> writes:
>
>
>>In probability theory, a random number is just a random variable x,
>>i.e., a measurable function on the sigma algebra of measurable subsets
>>of the set Omega of possible experiments.
>
>
>>For each experiment omega in Omega, x(omega) is a realization, i.e.,
>>the number drawn in this particular experiment.
>
>
> The formalism of probability theory is not in dispute. The
> interpretation in terms of randomness is. I showed that a method
> of generating reals between zero and one can be modified to generate
> integers instead, and further that if no one real is any more or
> less likely to be generated than any other, then the same holds for
> the corresponding integers generated.

This is not conclusive. Without a formal probability model, the notion
of 'any more likely' does not make sense. If you have a uniform random
number generator on [0,1] and you transform the results x you get by
some function phi, the phi(x) are generally no longer uniformly
distributed.
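
A quick numerical check of this point, as a sketch only: phi(x) = x^2 is picked here purely for illustration, random.Random is assumed as the uniform generator, and the helper names are made up. If phi(x) were still uniform, about a quarter of the transformed values would fall below 0.25.

```python
import random


def fraction_below(values, threshold):
    """Share of `values` that are strictly below `threshold`."""
    return sum(1 for v in values if v < threshold) / len(values)


def compare(n=100_000, seed=0):
    """Return the share below 0.25 for x and for phi(x) = x^2."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [x * x for x in xs]  # phi(x) = x^2, an arbitrary example transform
    return fraction_below(xs, 0.25), fraction_below(ys, 0.25)


if __name__ == "__main__":
    share_x, share_phi = compare()
    print(f"x < 0.25: {share_x:.3f}   phi(x) < 0.25: {share_phi:.3f}")
```

The second share comes out near 1/2 rather than 1/4, because x^2 < 0.25 exactly when x < 0.5.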

A uniform distribution of integers would have to be one in which
the conditional probability that you draw i, given that you already know
that i is in [1:N], is 1/N for all i in [1:N] and all N. This requires
Pr(i)=Pr(i|[1:N])*Pr([1:N])=Pr([1:N])/N independent of i,
hence Pr(i)=p is constant, so that the sum of all probabilities is zero
(if p=0) or diverges (if p>0) instead of being 1.

By a similar reasoning one sees that in any fixed probability
distribution, sufficiently large integers are extremely unlikely.
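
The last remark can be made concrete: for any normalized distribution on the positive integers the partial sums Pr([1:N]) approach 1, so for every eps there is a cutoff N beyond which all integers together carry probability at most eps. A small sketch, with two example distributions (geometric and inverse-square) and the helper name cutoff chosen here only for illustration:

```python
import math


def cutoff(pmf, eps):
    """Smallest N with Pr([1:N]) >= 1 - eps, for a pmf on k = 1, 2, 3, ..."""
    total, n = 0.0, 0
    while total < 1.0 - eps:
        n += 1
        total += pmf(n)
    return n


def geometric(k):
    # Pr(k) = 2^-k, which sums to 1 over k >= 1.
    return 0.5 ** k


def inverse_square(k):
    # Pr(k) = 6 / (pi^2 k^2), which sums to 1 over k >= 1.
    return 6.0 / (math.pi ** 2 * k ** 2)


if __name__ == "__main__":
    for name, pmf in (("geometric", geometric), ("inverse square", inverse_square)):
        print(name, cutoff(pmf, 1e-3))
```

The cutoff depends strongly on the distribution, but it is always finite, which is exactly what a uniform distribution on the integers would have to violate.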

Arnold Neumaier

Apr 24, 2004, 12:17:32 PM

Ralph E. Frost wrote:
> "Italo Vecchi" <vec...@weirdtech.com> wrote in message
> news:61789046.04041...@posting.google.com...
>
>>Arnold Neumaier <Arnold....@univie.ac.at> wrote in message
>
> news:<407BE49...@univie.ac.at>...
>
>>...
>>
>>>Thus a competent physicist goes
>>>against mainstream only if the evidence is overwhelming.
>>>
>>
>>Indeed. Nothing to ...
>
> ...snip...
>
>>of Aharonov and Zeilinger caliber) seem uninterested in pressing the
>>point, although the factual evidence is rock solid.
>
>
> Are you both saying that univalence superselection is testably false but
> that this is an absolutely irrelevant fact?
> Or, are you both saying there is sound scientific evidence of a valid
> anomaly, but that it is presently, seemingly insufficient, or deemed
> insufficient, by itself, to modulate the mesmerizing chant of the dominant
> trial theory?

I was saying nothing about anomalies, only warning that to seriously
go against mainstream opinion (no matter about which subject)
requires much higher standards of care than when presenting
mainstream research.

Arnold Neumaier

r...@maths.tcd.ie

Apr 24, 2004, 12:17:49 PM

hru...@stat.purdue.edu (Herman Rubin) writes:

>>> We could
>>>>even imagine a formal definition of probability distributions as
>>>>equivalence classes of algorithms for producing numbers from "random"
>>>>seeds. Then we could honestly claim to be talking about processes
>>>>generating numbers when we do probability.

>>>This is already the case in probability as we have it
>>>now, but with random seeds being real numbers uniform
>>>between 0 and 1.

>>If that were true there would be a distribution for my process which
>>generates random integers, but there isn't.

>How can you generate such "random integers"?

The original description is at:
http://groups.google.com/groups?selm=c3njd0%24295l%241%40lanczos.maths.tcd.ie

Basically, [0,1) can be expressed as a countable disjoint union of
sets which are images of each other under translations (with a
wraparound at 1). Given that translations are isometries, the sets
are "the same size" and any seed in [0,1) falls into exactly one
of these sets, which can be put into correspondence with a positive
integer, since the number of sets involved is countable. The sets
aren't measurable, which is why there's no corresponding distribution
over the integers.

R.

Daniel Waggoner

Apr 27, 2004, 2:40:21 PM

I don't see why, from the definition given above, the probability that a
randomly chosen integer is even is 1/2. I assume that you want to use the
property that the probability of the union of a countable number of
disjoint sets is equal to the sum of the individual probabilities. I am
not an expert on the surreals, but I suspect that infinite sums of
surreals are even more tricky than in the standard reals.

If one attempted to define infinite sums via limits, then in the order
topology on the surreals the sequence n/omega cannot converge to 1/2, or
any other positive standard real for that matter. It may be the case
that one could use a different topology to define the convergence of an
infinite sequence or that infinite sums could be defined in a completely
different manner, but I am skeptical that a rigorous argument for the
above claims could be given. In particular, how could the countable sum
of 1/omega be different if we count by twos instead of by ones!

> You need to be careful though! Integration, and limits in general, are
> tricky in the Surreals, because of the gaps.
>
> Quantum Mechanics (to get back to physics) can be formulated in terms of
> measures taking different sorts of values (complex numbers, operators,
> etc.), so you would think that someone would have worked all this out,
> but I haven't seen it.
>
> Ralph Hartley

Daniel Waggoner

Italo Vecchi

Apr 27, 2004, 2:42:32 PM

"Ralph E. Frost" <ra...@REMOVErefrost.com> wrote in message news:<1088k82...@corp.supernews.com>...

>
> Are you both saying that univalence superselection is testably false but
> that this is an absolutely irrelevant fact?
> Or, are you both saying there is sound scientific evidence of a valid
> anomaly, but that it is presently, seemingly insufficient, or deemed
> insufficient, by itself, to modulate the mesmerizing chant of the dominant
> trial theory?

I guess it depends on people's agendas. Zeh for example writes that
"in spite of the success of the superposition principle it is evident
[huh?] that not all superpositions are found in nature. This led some
physicists to postulate superselection rules which restrict this
principle by axiomatically excluding certain superpositions" but then
he adds "most disturbing ... seem to be superpositions of states with
integer and half-integer spin (bosons and fermions). They violate
invariance under 2pi rotations ... but such a non-invariance has been
experimentally confirmed ... ." [1]. The brackets are mine. Zeh has
his own recipe to deal with weird superpositions. It goes under the
name of decoherence theory and is very fashionable these days.

>
> Also, if it is an anomaly, how does this small anomaly compare with --Was
> it?-- the "UV catastrophe" at the start of the 20th century?

Call it small. Besides, anomaly is in the eye of the beholder. I find
the very idea of superselection fishy.

IV

[1] http://arxiv.org/PS_cache/quant-ph/pdf/9506/9506020.pdf

----------------------------------------

"If you're so so smart, how come you're a scientist?"

r.e.s.

Apr 27, 2004, 2:49:24 PM

"r.e.s." <r...@ZZmindspring.com> wrote ...
> "Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...
>
> > The natural least informative distribution on natural numbers
> > is a Poisson distribution, but here one has to be at least informed
> > about the mean.
>
> It would be Poisson if, in addition to the mean, one knew (only)
> that the distribution is that of a sum of independent Bernoulli
> rv's. It may be more "natural" not to assume any such additional
> knowledge, in which case the max-entropy distribution would be
> geometric rather than Poisson.

I would like to state that more precisely ...

The maximum-entropy distribution on the natural numbers, constrained
only to have a given mean, is geometric, not Poisson. (It's a shifted
version of the usual geometric distribution if 0 is included in the
naturals. This is just the Boltzmann distribution for energy-levels
that are the natural numbers scaled by an appropriate constant.)

Even if more than the mean is known, the Poisson distribution still
doesn't arise if the additional constraints are all just mean values
of various functions (e.g. higher moments). If A denotes the Poisson
distribution with mean m, then Entropy(A) = sup{Entropy(B): B in S},
where S is the set of distributions of mean-m sums of finitely-many
independent Bernoulli rv's. (A isn't in S, but it is the limit of a
sequence in S.)

--r.e.s.

Esa A E Peuha

Apr 28, 2004, 2:49:19 AM

vec...@weirdtech.com (Italo Vecchi) writes:

> Embarrassingly, there are experimental results ( [1], [2], [5] , [7])
> exhibiting broken 2pi-periodicity.

You seem to have left out the references in your post. Could you post
them again?

--
Esa Peuha
student of mathematics at the University of Helsinki
http://www.helsinki.fi/~peuha/

Arnold Neumaier

Apr 28, 2004, 6:11:58 AM

r.e.s. wrote:
> "r.e.s." <r...@ZZmindspring.com> wrote ...
>
>>"Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...
>>
>>
>>>The natural least informative distribution on natural numbers
>>>is a Poisson distribution, but here one has to be at least informed
>>>about the mean.
>>
>>It would be Poisson if, in addition to the mean, one knew (only)
>>that the distribution is that of a sum of independent Bernoulli
>>rv's. It may be more "natural" not to assume any such additional
>>knowledge, in which case the max-entropy distribution would be
>>geometric rather than Poisson.
>
>
> I would like to state that more precisely ...
>
> The maximum-entropy distribution on the natural numbers, constrained
> only to have a given mean, is geometric, not Poisson. (It's a shifted
> version of the usual geometric distribution if 0 is included in the
> naturals. This is just the Boltzmann distribution for energy-levels
> that are the natural numbers scaled by an appropriate constant.)

The maximum entropy solution
S(rho)= <-rho(x)> = min!
for a distribution with density rho(x) depends on whether we define
densities rho of a random natural number x by
<f(x)> = sum_n rho(n) f(n)
or
<f(x)> = sum_n rho(n) f(n)/n!
corresponding to different choices of priors.

The first (your) choice gives rho(n) = Pr(n), while the second (my)
choice is more useful in most circumstances I came across. For example,
it is the right prior in statistical mechanics of systems with
indefinite number n of particles ('correct Boltzmann counting').

One of the problems of the Bayesian approach is that one always
needs a prior before information theoretic arguments make sense.
If there is doubt about the former the results become doubtful, too.

In particular, information theory in statistical mechanics works
out correctly _only_ if one uses the right prior (mine).
That the prior is objectively determined is strange for a subjective
approach such as the information theoretic one, and casts doubt on the
relevance of information theory in the foundations.

Arnold Neumaier

Marc Nardmann

Apr 28, 2004, 2:30:18 PM

r...@maths.tcd.ie wrote:

> Right; the problem would be solved if our number system had genuine
> infinitesimals. Then each integer could be assigned a probability

> 1/aleph_0 and everything would work out ok. Such number systems

> exist and there is nothing inherent in probability theory which
> ties it to real numbers except its current formulation.

There is something very important which ties (countably additive)
probability theory to the real numbers; see below.

> I was thinking of the surreals rather than the hyperreals;
> I believe they have well-defined multiplicative inverses for
> every non-zero number (including aleph_0) along with the same
> rules of distributivity as real numbers.

Yes, they are equipped with the structure of an ordered field. (The
field is not a set but a proper class. But that does not matter for the
discussion here; you can consider suitable subfields of the surreal
numbers which are sets.)

> I don't know
> very much about this, but wouldn't that be sufficient?

No. Look at the definition of a real-valued measure: There is a
condition about countable additivity. One side of this equation is a
countable infinite sum of nonnegative real numbers. A countable sum of
nonnegative real numbers is well-defined as a supremum in [0,infinity].
Now try to define the sum of a countable infinite sequence of
nonnegative surreal numbers and you will see the problem.

If one can circumvent this problem at all, then the result will probably
be neither more practical nor more aesthetical than standard measure
theory. Of course, you can prove me wrong in this respect by inventing a
nice definition of a surreal-valued measure. But that is certainly not
trivial.

-- Marc Nardmann

(To reply, remove every occurrence of a certain letter from my e-mail
address.)

Italo Vecchi

Apr 28, 2004, 3:28:02 PM

Esa A E Peuha <esa....@helsinki.fi> wrote in message news:<86poepd...@sirppi.helsinki.fi>...

> vec...@weirdtech.com (Italo Vecchi) writes:
>
> > Embarrassingly, there are experimental results ( [1], [2], [5] , [7])
> > exhibiting broken 2pi-periodicity.
>
> You seem to have left out the references in your post. Could you post
> them again?

They are in my subsequent post:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=61789046.0404120953.585cdad5%40posting.google.com

IV

r.e.s.

Apr 30, 2004, 3:03:21 AM

"Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...

> r.e.s. wrote:

> >>"Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...
> >>>The natural least informative distribution on natural numbers
> >>>is a Poisson distribution, but here one has to be at least informed
> >>>about the mean.

> > The maximum-entropy distribution on the natural numbers, constrained

> > only to have a given mean, is geometric, not Poisson. (It's a shifted
> > version of the usual geometric distribution if 0 is included in the
> > naturals. This is just the Boltzmann distribution for energy-levels
> > that are the natural numbers scaled by an appropriate constant.)
>
> The maximum entropy solution
> S(rho)= <-rho(x)> = min!

^^^^^^^^^^^^^^^^
?

> for a distribution with density rho(x) depends on whether we define
> densities rho of a random natural number x by
> <f(x)> = sum_n rho(n) f(n)
> or

> <f(x)> = sum_n rho(n) f(n)/n! [*]

> corresponding to different choices of priors.

I'm not sure what was intended, but your mention of priors suggests
that perhaps you meant to write the "relative entropy":
S(rho)= <log(rho(x)/p(x))> = min!
where the expectation is wrt probability density rho(), and p() is a
prior probability density -- but then the minus sign is out of place.

In any case, something else is amiss, for [*] contradicts the stated
assumption that rho() is a probability density; that is,
[*] ==> <1> = sum_n rho(n)/n! = 1, contradicting sum_n rho(n) = 1
-- the latter being required of rho() as a probability density.

Would you mind stating explicitly the prior probability density
that you intend as corresponding to [*]?

> The first (your) choice gives rho(n) = Pr(n), while the second (my)
> choice is more useful in most circumstances I came across.

-snip-

If one wants to consider prior states, it's not hard to see that
in the present case the geometric distribution is a limiting
min-relative-entropy result for a sequence of uniform priors on
{0,...,n-1}, as n -> oo. NB: The min-relative-entropy distributions
corresponding to these priors approach a limit distribution, namely
geometric, even though the priors themselves do not have a limit
distribution (there being no uniform distribution on the naturals).

However, the original point remains independent of priors ...
If a Poisson distribution and a geometric distribution on the same
set have the same mean, then the geometric distribution necessarily
has greater Shannon entropy than does the Poisson distribution. A
reasonable interpretation of this is that the geometric distribution
represents a state of knowledge that incorporates less information
than does a state corresponding to the Poisson, *regardless* of how
those states come about.
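
The entropy comparison above is easy to check numerically. What follows is only a minimal sketch, not something from the original posts: the function names and the truncation point n_max are illustrative assumptions, entropies are in nats, and the geometric distribution is taken on {0,1,2,...} with ratio q = mean/(1+mean).

```python
import math

def geometric_pmf(mean, n_max):
    """p_n = (1 - q) q^n on n = 0, 1, ..., n_max-1, with q = mean / (1 + mean)."""
    q = mean / (1.0 + mean)
    return [(1.0 - q) * q**n for n in range(n_max)]

def poisson_pmf(mean, n_max):
    """p_n = exp(-mean) mean^n / n!, built iteratively to avoid large factorials."""
    p = [math.exp(-mean)]
    for n in range(1, n_max):
        p.append(p[-1] * mean / n)
    return p

def shannon_entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def compare(mean, n_max=2000):
    """Return (geometric entropy, Poisson entropy) for a common mean."""
    h_geo = shannon_entropy(geometric_pmf(mean, n_max))
    h_poi = shannon_entropy(poisson_pmf(mean, n_max))
    return h_geo, h_poi

if __name__ == "__main__":
    for m in (0.5, 1.0, 5.0, 20.0):
        h_geo, h_poi = compare(m)
        print(f"mean={m:5.1f}  geometric={h_geo:.4f}  Poisson={h_poi:.4f}")
```

For the geometric case the truncated sum can be compared against the closed form (1+m) log(1+m) - m log(m), which gives a quick sanity check on the truncation.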

--r.e.s.

Arnold Neumaier

Apr 30, 2004, 8:04:14 AM

r.e.s. wrote:
> "Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...
>
>>r.e.s. wrote:
>
>>>>"Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...
>>>>
>>>>>The natural least informative distribution on natural numbers
>>>>>is a Poisson distribution, but here one has to be at least informed
>>>>>about the mean.
>
>>>The maximum-entropy distribution on the natural numbers, constrained
>>>only to have a given mean, is geometric, not Poisson. (It's a shifted
>>>version of the usual geometric distribution if 0 is included in the
>>>naturals. This is just the Boltzmann distribution for energy-levels
>>>that are the natural numbers scaled by an appropriate constant.)
>>
>>The maximum entropy solution
>> S(rho)= <-rho(x)> = min!

Sorry, I meant
S(rho)= <-log(rho(x))> = max!
I must have been quite tired when I wrote this.

>>for a distribution with density rho(x) depends on whether we define
>>densities rho of a random natural number x by
>> <f(x)> = sum_n rho(n) f(n)
>>or
>> <f(x)> = sum_n rho(n) f(n)/n! [*]
>>corresponding to different choices of priors.
>
>
> I'm not sure what was intended, but your mention of priors suggests
> that perhaps you meant to write the "relative entropy":
> S(rho)= <log(rho(x)/p(x))> = min!
> where the expectation is wrt probability density rho(), and p() is a
> prior probability density -- but then the minus sign is out of place.

This is an almost equivalent formulation. See below.

> In any case, something else is amiss, for [*] contradicts the stated
> assumption that rho() is a probability density; that is,
> [*] ==> <1> = sum_n rho(n)/n! = 1, contradicting sum_n rho(n) = 1
> -- the latter being required of rho() as a probability density.

No - probability densities and probabilities are distinct notions.
From [*] you can see that the probability to get n is
p_n = rho(n)/n!

> However, the original point remains independent of priors ...
> If a Poisson distribution and a geometric distribution on the same
> set have the same mean, then the geometric distribution necessarily
> has greater Shannon entropy than does the Poisson distribution.

Yes.

> A reasonable interpretation of this is that the geometric distribution
> represents a state of knowledge that incorporates less information
> than does a state corresponding to the Poisson, *regardless* of how
> those states come about.

You only proved that it incorporates less 'Shannon entropy'.
But the identification of 'information' and 'Shannon entropy' is
dubious for situations with infinitely many alternatives.
Shannon assumes in his analysis that in the absence of knowledge,
all alternatives are equally likely, which makes no sense
in the infinite case (and may even be debated in the finite case).

Here is a more careful setting that should explain our differences:

For a probability distribution on a finite set of alternatives,
given by probabilities p_n summing to 1, the Shannon entropy is
defined by
S = - sum p_n log_2 p_n.
The main use of the entropy concept is the maximum entropy principle,
used to define various interesting ensembles by maximizing the entropy
subject to constraints defined by known expectation values
<f> = sum p_n f(n)
for certain key observables f.

If the number of alternatives is infinite, this formula must be
appropriately generalized. In the literature, one finds various
possibilities, the most common being, for random vectors with
probability density p(x), the absolute entropy
S = - k_B integral dx p(x) log p(x)
with the Boltzmann constant k_B and Lebesgue measure dx.
The value of the Boltzmann constant k_B is conventional and has no
effect on the use of entropy in applications.
There is also the relative entropy
S = - k_B integral dx p(x) log (p(x)/p_0(x)),
which involves an arbitrary positive function p_0(x). If p_0(x)
is a probability density then the relative entropy is nonnegative.

For a probability distribution over an _arbitrary_ sigma algebra,
the absolute entropy makes no sense since there is no distinguished
measure and hence no probability density. Thus one needs to assume a
measure to be able to define a probability density (namely as the
Radon-Nikodym derivative, assuming it exists). This measure is
called the prior (it is often improper = not normalizable).
Once one has specified a prior dmu,
<f(x)> = integral dmu(x) rho(x) f(x)
defines the density rho(x), and then
S(rho)= <-k_B log(rho(x))>
defines the entropy with respect to this prior. Note that the
condition for rho to define a probability density is
integral dmu(x) rho(x) = <1> = 1.

In many cases, symmetry considerations suggest a unique natural prior.
For random variables on a homogeneous space, the conventional measure
is the invariant Haar measure. In particular, for probability theory
of finitely many alternatives, it is conventional to consider the
symmetric group on the set of alternatives and take as the prior the
uniform measure, giving
<f(x)> = sum_x rho(x) f(x).
The density rho(x) agrees with the probability p_x, and the
corresponding entropy is the Shannon entropy if one takes k_B=1/log2.

For random variables whose support is R or R^n, the conventional
symmetry group is the translation group, and the corresponding
(improper) prior is the Lebesgue measure. In this case one obtains
the absolute entropy given above. But one could also take as prior
a noninvariant measure
dmu(x) = dx p_0(x);
then the density becomes rho(x)=p(x)/p_0(x), and one arrives at the
relative entropy.

If there is no natural transitive symmetry group, there is no natural
prior, and one has to make other useful choices. In particular, this
is the case for random natural numbers.

Choice A. Treating the natural numbers as a limiting situation of
finite interval [0:n] suggests to use the measure with
integral dmu(x) phi(x) = sum_n phi(n)
as (improper) prior, making

<f(x)> = sum_n rho(n) f(n)

the definition of the density; in this case, p_n=rho(n) is the
probability of getting n.

Choice B. Statistical mechanics suggests to use instead the measure
with
integral dmu(x) phi(x) = sum_n phi(n)/n!
as prior, making

<f(x)> = sum_n rho(n) f(n)/n!

the definition of the density; in this case, p_n=rho(n)/n! is the
probability of getting n.

The maximum entropy ensemble defined by given expectations depends on
the prior chosen. In particular, if the mean of a random natural number
is given, choice A leads to a geometric distribution, while
choice B leads to a Poisson distribution. The latter is the one
relevant for statistical mechanics. Indeed, choice B is the prior
needed in statistical mechanics of systems with an indefinite
number n of particles to get the correct Boltzmann counting in the
grand canonical ensemble. With choice A, the maximum entropy
solution is unrelated to the distributions arising in statistical
mechanics.
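
The dependence on the prior can be illustrated numerically. The following is a rough sketch under stated assumptions, not a definitive implementation: the support is truncated to {0,...,size-1}, all function names are hypothetical, and the code relies on the stationarity condition of -sum_n mu_n rho(n) log(rho(n)) under the normalization and mean constraints, which gives rho(n) = C q^n and hence p_n proportional to mu_n q^n. The parameter q is then fixed by bisection on the mean.

```python
import math

def counting_prior(size):
    """Choice A: mu({n}) = 1 on {0, ..., size-1}."""
    return [1.0] * size

def boltzmann_prior(size):
    """Choice B: mu({n}) = 1/n!, built iteratively to avoid huge factorials."""
    weights = [1.0]
    for n in range(1, size):
        weights.append(weights[-1] / n)
    return weights

def maxent_probabilities(prior, target_mean, rel_tol=1e-12):
    """Probabilities p_n = mu_n * rho(n) maximizing entropy for a given mean.

    Uses p_n proportional to mu_n * q**n and finds q by bisection; the mean
    is an increasing function of q, so the bracket below is valid.
    """
    def pmf(q):
        w = [mu * q**n for n, mu in enumerate(prior)]
        z = sum(w)
        return [x / z for x in w]

    def mean(q):
        return sum(n * p for n, p in enumerate(pmf(q)))

    lo, hi = 0.0, 1.0
    while mean(hi) < target_mean:      # enlarge the bracket if needed
        hi *= 2.0
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return pmf(0.5 * (lo + hi))

if __name__ == "__main__":
    m, size = 3.0, 200
    p_a = maxent_probabilities(counting_prior(size), m)
    p_b = maxent_probabilities(boltzmann_prior(size), m)
    for n in range(5):
        print(n, round(p_a[n], 6), round(p_b[n], 6))
```

With the counting prior the result agrees with the geometric probabilities (1-q) q^n, q = m/(1+m); with the 1/n! prior it agrees with the Poisson probabilities exp(-m) m^n/n!, up to truncation and bisection error.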

Arnold Neumaier

Ralph Hartley

May 1, 2004, 8:52:32 AM

Daniel Waggoner wrote:
> Ralph Hartley wrote:

>>Then you could have a countably additive measure on the integers that
>>assigns each integer a probability 1/omega. There are omega integers so
>>that measure is normalized. A finite subset of the integers with N
>>members would have probability N/omega (which in the surreals is
>>different from 0 and from 1/omega).
>>

>>it might give a definition of "a (well behaved)
>>function of a randomly chosen integer." So you might be able to give a
>>consistent meaning to the sentence "The probability that a randomly
>>chosen integer is even is 1/2." (for the measure given above it is true)
>
> I don't see why, from the definition given above, the probability that a
> randomly chosen integer is even is 1/2.

Actually I rather doubt that I have uniquely described a measure. But given
such a measure it is at least a meaningful statement, and it is defined to
be true if the measure of the set of even integers is 1/2.

When I said:
>> There are omega integers

That was a bit of a fib. The number of integers is a Cardinal, and numbers
like omega and omega/2 are Surreals. Cardinals can be viewed as equivalence
classes of Surreals with omega and omega/2 in the same class. So just
saying that the measure of a set of size S is S/omega does not uniquely
describe a measure.

> I assume that you want to use the
> property that the probability of the union of a countable number of
> disjoint sets is equal to the sum of the individual probabilities. I am
> not an expert on the surreals, but I suspect that infinite sums of
> surreals are even more tricky than in the standard reals.

That is so, and limits are worse. One might *define* sums in terms of a
measure, with the requirement that sums that *do* converge have the "right"
value.

> but I am skeptical that a rigorous argument for the
> above claims could be given.

I have claimed nothing. I said you might be able to do something
consistent, but I certainly haven't done it! But I haven't seen a proof
that it can't be done either.

> In particular, how could the countable sum
> of 1/omega be different if we count by twos instead of by ones!

One has twice as many terms as the other, though both are countably
infinite. Of course, this isn't the normal way of describing the size of
a set, and I don't know if it can be made to make sense.

Ralph Hartley

r.e.s.

May 3, 2004, 1:18:30 PM

"Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...
> r.e.s. wrote:
> > "Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...
> >>r.e.s. wrote:
> >>>>"Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...
> >>>>
> >>>>>The natural least informative distribution on natural numbers
> >>>>>is a Poisson distribution, but here one has to be at least informed
> >>>>>about the mean.
> >
> >>>The maximum-entropy distribution on the natural numbers, constrained
> >>>only to have a given mean, is geometric, not Poisson. (It's a shifted
> >>>version of the usual geometric distribution if 0 is included in the
> >>>naturals. This is just the Boltzmann distribution for energy-levels
> >>>that are the natural numbers scaled by an appropriate constant.)
> >>
> >>The maximum entropy solution

> S(rho)= <-log(rho(x))> = max!

> >>for a distribution with density rho(x) depends on whether
> >>we define densities rho of a random natural number x by

> >> <f(x)> = sum_n rho(n) f(n) [A]
> >>or
> >> <f(x)> = sum_n rho(n) f(n)/n! [B]

> >>corresponding to different choices of priors.

<snip>

Without the measure-theoretic framework, it was difficult to see
what was meant, so thanks for the detailed explanation below. Yes,
of course constrained maximization of
- integral dmu(x) rho(x) log(rho(x)) [1]
leads to different densities rho with respect to the two different
measures mu, which in the present problem can be characterized by
their behavior on the singleton sets {n} (n in {0,1,2,...}):

(A) mu_A({n}) = 1
(B) mu_B({n}) = 1/n!

Jaynes' "finite sets policy" (see below) might help to see why
(A) is the correct choice here.

> > However, the original point remains independent of priors ...
> > If a Poisson distribution and a geometric distribution on the same
> > set have the same mean, then the geometric distribution necessarily
> > has greater Shannon entropy than does the Poisson distribution.
>
> Yes.
>
> > A reasonable interpretation of this is that the geometric distribution
> > represents a state of knowledge that incorporates less information
> > than does a state corresponding to the Poisson, *regardless* of how
> > those states come about.
>
> You only proved that it incorporates less 'Shannon entropy'.
> But the identification of 'information' and 'Shannon entropy' is
> dubious for situations with infinitely many alternatives.
> Shannon assumes in his analysis that in the absence of knowledge,
> all alternatives are equally likely, which makes no sense
> in the infinite case (and may even be debated in the finite case).

It's wise to distrust pat answers in the infinite case, but in the
**finite** case, the question has been effectively settled, IMO.
In the simplest version, if one must assign "state of knowledge"
probabilities concerning just two alternatives, any nonuniform
assignment clearly represents information that favors one of the
two alternatives. The same reasoning extends to other finite sets,
or you could appeal to an invariance argument such as you mention
below.

For distributions on *infinite* spaces, it's useful to consider
Jaynes' "finite sets policy". His last book was published
posthumously, and the following is from a pre-publication version:

"Our conclusion -- based on some forty years of mathematical
efforts and experience with real problems -- is that, at least
in probability theory, an infinite set should be thought of
only as the limit of a specific (i.e. unambiguously specified)
sequence of finite sets. Likewise, an improper pdf has meaning
only as the limit of a well-defined sequence of proper pdf's.
... Indeed, experience to date shows that almost any attempt
to depart from our recommended `finite sets' policy has the
potentiality for generating a paradox, in which two equally
valid methods of reasoning lead us to contradictory results."

(He devoted an entire chapter to illustrating how errors easily
arise when this policy is not followed.)

If we apply a finite-sets policy to the present problem, together
with your view of mu as a prior, then we should choose which of
the two priors, mu_A or mu_B, represents the least information
when the basic set is finite, say {0,...,n-1}, and no other
information is given (since these are *prior* to any such). Both
measures are then normalizable, mu_A being a uniform probability
distribution and mu_B being a particular Poisson probability
distribution, both restricted to the finite space.

But in each of the finite cases, any nonuniform distribution cannot
possibly be the least informative one; rather, a uniform
distribution is seen to be inescapable for each of the finite sets
{0,...,n-1}. Consequently, the constrained maximizations [1] on
these finite sets produce a sequence of geometric distributions
converging to the geometric distribution on the infinite set.
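
The limiting procedure described here can be sketched numerically. This is an illustration only, with hypothetical function names, and it assumes the target mean lies below (n-1)/2 so that the ratio q stays in (0,1). On each finite set {0,...,n-1} with a uniform prior, the mean-constrained maximum-entropy distribution has p_k proportional to q_n^k; the code finds q_n by bisection and lets n grow.

```python
def truncated_mean(q, n):
    """Mean of the distribution p_k proportional to q**k on {0, ..., n-1}."""
    weights = [q**k for k in range(n)]
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)

def maxent_ratio(n, target_mean, iterations=100):
    """Ratio q_n of the mean-constrained max-entropy distribution on {0, ..., n-1}.

    Assumes a uniform prior on the finite set and 0 < target_mean < (n-1)/2,
    in which case the solution has 0 < q_n < 1.
    """
    if not 0.0 < target_mean < (n - 1) / 2.0:
        raise ValueError("target_mean must lie strictly between 0 and (n-1)/2")
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid, n) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    m = 2.0
    limit = m / (1.0 + m)   # ratio of the geometric distribution on {0,1,2,...}
    for n in (6, 10, 20, 50, 200):
        q_n = maxent_ratio(n, m)
        print(f"n={n:4d}  q_n={q_n:.12f}  |q_n - limit|={abs(q_n - limit):.3e}")
```

As n grows, q_n approaches m/(1+m), the ratio of the geometric distribution with mean m on the full set of naturals.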

I admit, though, that in the statistical mechanics of many-particle
systems, there might be physical justifications for a Poisson prior
to represent some kind of information before the mean value is
specified -- e.g. to get the 'correct Boltzmann counting' that you
mention. But there is no such physical context in the present
problem. (I had cited the Boltzmann distribution only to give a
familiar example that formally can reduce to a geometric.)

> Here is a more careful setting that should explain our differences:
>
> For a probability distribution on a finite set of alternatives,
> given by probabilities p_n summing to 1, the Shannon entropy is
> defined by
> S = - sum p_n log_2 p_n.
> The main use of the entropy concept is the maximum entropy principle,
> used to define various interesting ensembles by maximizing the entropy
> subject to constraints defined by known expectation values
> <f> = sum p_n f(n)
> for certain key observables f.
>
> If the number of alternatives is infinite, this formula must be
> appropriately generalized. In the literature, one finds various
> possibilities, the most common being, for random vectors with
> probability density p(x), the absolute entropy
> S = - k_B integral dx p(x) log p(x)
> with the Boltzmann constant k_B and Lebesgue measure dx.
> The value of the Boltzmann constant k_B is conventional and has no
> effect on the use of entropy in applications.
> There is also the relative entropy
> S = - k_B integral dx p(x) log (p(x)/p_0(x)),
> which involves an arbitrary positive function p_0(x). If p_0(x)
> is a probability density then the relative entropy is nonnegative.

^^^^^^^^^^^
No, it won't be nonnegative. Relative entropy by your definition
is non*positive*. Suppose P, Q are probability measures with
P << m, Q << m for positive measure m. Then the Radon-Nikodym
derivatives satisfy

integral dm (dP/dm) log( (dP/dm)/(dQ/dm) ) >= 0

which provides the more usual definition of relative entropy.
Of course, maximizing your nonpositive quantity leads to the
same results as minimizing the nonnegative one.
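
The sign convention can be checked on a concrete pair of distributions. This is a small sketch with hypothetical helper names, using a Poisson P and a geometric Q of the same mean on a truncated support; the choice of P and Q is only for illustration. For this particular pair, the relative entropy also equals the entropy gap H(Q) - H(P), because log Q(n) is linear in n and both distributions share the mean.

```python
import math

def poisson_pmf(mean, n_max):
    """Poisson probabilities on {0, ..., n_max-1}, built iteratively."""
    p = [math.exp(-mean)]
    for n in range(1, n_max):
        p.append(p[-1] * mean / n)
    return p

def geometric_pmf(mean, n_max):
    """Geometric probabilities (1 - q) q^n with q = mean / (1 + mean)."""
    q = mean / (1.0 + mean)
    return [(1.0 - q) * q**n for n in range(n_max)]

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def relative_entropy(p, q):
    """sum_n p_n log(p_n / q_n), the nonnegative convention."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0.0 and b > 0.0)

if __name__ == "__main__":
    mean, n_max = 2.0, 200
    p = poisson_pmf(mean, n_max)
    q = geometric_pmf(mean, n_max)
    d = relative_entropy(p, q)
    print("nonnegative convention:", d)
    print("sign-flipped convention:", -d)
    print("entropy gap H(Q) - H(P):", entropy(q) - entropy(p))
```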

> For a probability distribution over an _arbitrary_ sigma algebra,
> the absolute entropy makes no sense since there is no distinguished
> measure and hence no probability density. Thus one needs to assume a
> measure to be able to define a probability density (namely as the
> Radon-Nikodym derivative, assuming it exists). This measure is
> called the prior (it is often improper = not normalizable).
> Once one has specified a prior dmu,
> <f(x)> = integral dmu(x) rho(x) f(x)
> defines the density rho(x), and then
> S(rho)= <-k_B log(rho(x))>
> defines the entropy with respect to this prior. Note that the
> condition for rho to define a probability density is
> integral dmu(x) rho(x) = <1> = 1.
>
> In many cases, symmetry considerations suggest a unique natural prior.
> For random variables on a homogeneous space, the conventional measure
> is the invariant Haar measure. In particular, for probability theory
> of finitely many alternatives, it is conventional to consider the
> symmetric group on the set of alternatives and take as the prior the
> uniform measure, giving
> <f(x)> = sum_x rho(x) f(x).
> The density rho(x) agrees with the probability p_x, and the
> corresponding entropy is the Shannon entropy if one takes k_B=1/log2.

Of course the above argument acquires special significance
if a comprehensive finite-sets policy can be implemented.

> For random variables whose support is R or R^n, the conventional
> symmetry group is the translation group, and the corresponding
> (improper) prior is the Lebesgue measure. In this case one obtains
> the absolute entropy given above. But one could also take as prior
> a noninvariant measure
> dmu(x) = dx p_0(x);
> then the density becomes rho(x)=p(x)/p_0(x), and one arrives at the
> relative entropy.

OK, but as mentioned above, this 'relative entropy' is nonpositive.

> If there is no natural transitive symmetry group, there is no natural
> prior, and one has to make other useful choices. In particular, this
> is the case for random natural numbers.

If we adopt a finite-sets policy, together with your view of the
measure mu as a prior, the above is shown to be a non sequitur.
The reason you don't see the natural prior is that you're not
examining the priors in terms of their representative sequences
on *finite* spaces.

> Choice A. Treating the natural numbers as a limiting situation of
> finite interval [0:n] suggests to use the measure with
> integral dmu(x) phi(x) = sum_n phi(n)
> as (improper) prior, making
> <f(x)> = sum_n rho(n) f(n)
> the definition of the density; in this case, p_n=rho(n) is the
> probability of getting n.
>
> Choice B. Statistical mechanics suggests to use instead the measure
> with
> integral dmu(x) phi(x) = sum_n phi(n)/n!
> as prior,

So mu_B({n}) = 1/n! (n in {0,1,2,...}),
which normalizes to a Poisson distribution with mean 1.

> making
> <f(x)> = sum_n rho(n) f(n)/n!
> the definition of the density; in this case, p_n=rho(n)/n! is the
> probability of getting n.
>
> The maximum entropy ensemble defined by given expectations depends on
> the prior chosen. In particular, if the mean of a random natural number
> is given, choice A leads to a geometric distribution, while
> choice B leads to a Poisson distribution. The latter is the one
> relevant for statistical mechanics. Indeed, choice B is the prior
> needed in statistical mechanics of systems with an indefinite
> number n of particles to get the correct Boltzmann counting in the
> grand canonical ensemble. With choice A, the maximum entropy
> solution is unrelated to the distributions arising in statistical
> mechanics.
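
The dependence on the prior can be checked numerically. The sketch below assumes the standard Lagrange-multiplier form of the solution, p_n proportional to mu({n}) exp(-lam n), and finds lam by bisection so that the mean matches. The names N_TERMS, TARGET_MEAN, tilted and maxent are illustrative and not from the thread, and the truncation to finitely many terms is a numerical assumption.

```python
import math

N_TERMS = 200        # truncation of {0,1,2,...}; a numerical assumption
TARGET_MEAN = 2.0    # illustrative value of the prescribed mean

def tilted(log_prior, lam):
    """Normalized p_n proportional to prior_n * exp(-lam * n)."""
    logs = [lp - lam * n for n, lp in enumerate(log_prior)]
    top = max(logs)                       # subtract the max to avoid overflow
    w = [math.exp(v - top) for v in logs]
    z = sum(w)
    return [x / z for x in w]

def mean_of(p):
    return sum(n * x for n, x in enumerate(p))

def maxent(log_prior, target_mean, lo=-20.0, hi=20.0):
    """Bisect on lam; the mean of the tilted distribution decreases with lam."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_of(tilted(log_prior, mid)) > target_mean:
            lo = mid
        else:
            hi = mid
    return tilted(log_prior, 0.5 * (lo + hi))

log_prior_a = [0.0] * N_TERMS                                # choice A: weight 1
log_prior_b = [-math.lgamma(n + 1) for n in range(N_TERMS)]  # choice B: 1/n!

p_a = maxent(log_prior_a, TARGET_MEAN)
p_b = maxent(log_prior_b, TARGET_MEAN)

# Closed forms to compare against, both with mean TARGET_MEAN.
q = TARGET_MEAN / (1.0 + TARGET_MEAN)    # geometric ratio giving that mean
geometric = [(1.0 - q) * q**n for n in range(N_TERMS)]
poisson = [math.exp(-TARGET_MEAN + n * math.log(TARGET_MEAN) - math.lgamma(n + 1))
           for n in range(N_TERMS)]

err_a = max(abs(x - y) for x, y in zip(p_a, geometric))
err_b = max(abs(x - y) for x, y in zip(p_b, poisson))
print(err_a, err_b)
```

Both errors should be at the level of rounding noise: with weight 1 the tilted distribution is geometric, and with weight 1/n! it is Poisson.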

The present problem is not about the statistical mechanics of a
many-particle system, so it seems hard to justify forcing the
prior to be Poisson by reference to 'correct Boltzmann counting'.

--r.e.s.

Arnold Neumaier

May 4, 2004, 4:04:55 PM

r.e.s. wrote:
> "Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...
>

> It's wise to distrust pat answers in the infinite case, but in the
> **finite** case, the question has been effectively settled, IMO.
> In the simplest version, if one must assign "state of knowledge"
> probabilities concerning just two alternatives, any nonuniform
> assignment clearly represents information that favors one of the
> two alternatives.

This depends. If you have nested hierarchies of finitely many
objects, such as objects A1, A2, A3, B1, B2, C1, C2, C3, C4,
there is no natural prior. Should each item get prior probability 1/9,
or should each class A, B, C get prior probability 1/3 and the members
of each class the same class-specific conditional probability?

Natural priors exist only where there is a natural transitive symmetry
group.
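
To make the two candidate priors concrete, here is a minimal sketch that computes both for the nine objects above. The dictionary layout and the names flat and nested are illustrative choices, not part of the argument.

```python
from fractions import Fraction

# The nested hierarchy from the example: three classes of sizes 3, 2 and 4.
classes = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2"],
    "C": ["C1", "C2", "C3", "C4"],
}
items = [x for members in classes.values() for x in members]

# Candidate 1: every item gets the same probability, 1/9.
flat = {x: Fraction(1, len(items)) for x in items}

# Candidate 2: every class gets 1/3, split evenly among its members.
nested = {
    x: Fraction(1, len(classes)) / len(members)
    for members in classes.values()
    for x in members
}

for x in items:
    print(x, flat[x], nested[x])
```

Both assignments sum to 1, and they agree on the members of class A (1/9 either way), but they disagree on B (1/6 instead of 1/9) and on C (1/12 instead of 1/9).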

> If we apply a finite-sets policy to the present problem, together
> with your view of mu as a prior, then we should choose which of
> the two priors, mu_A or mu_B, represents the least information
> when the basic set is finite, say {0,...,n-1}, and no other
> information is given (since these are *prior* to any such).

I can't accept this kind of reasoning as 'natural'. How would you
use it to derive a prior for R^2? You would not get Lebesgue measure.

> I admit, though, that in the statistical mechanics of many-particle
> systems, there might be physical justifications for a Poisson prior
> to represent some kind of information before the mean value is
> specified -- e.g. to get the 'correct Boltzmann counting' that you
> mention. But there is no such physical context in the present
> problem. (I had cited the Boltzmann distribution only to give a
> familiar example that formally can reduce to a geometric.)

Well, this is a physics newsgroup, where probability has a quite
specific context...

>>There is also the relative entropy
>> S = - k_B integral dx p(x) log (p(x)/p_0(x)),
>>which involves an arbitrary positive function p_0(x). If p_0(x)
>>is a probability density then the relative entropy is nonnegative.
>
> ^^^^^^^^^^^
> No, it won't be nonnegative. Relative entropy by your definition
> is non*positive*.

Yes, of course.

> Suppose P, Q are probability measures with
> P << m, Q << m for positive measure m. Then the Radon-Nikodym
> derivatives satisfy
>
> integral dm (dP/dm) log( (dP/dm)/(dQ/dm) ) >= 0
>
> which provides the more usual definition of relative entropy.
> Of course, maximizing your nonpositive quantity leads to the
> same results as minimizing the nonnegative one.
>
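
The sign question can be checked on a small discrete case, taking m to be counting measure so that the Radon-Nikodym derivatives are just the probabilities. This is a sketch only; the two distributions p and q are arbitrary illustrative values.

```python
import math

def kl_divergence(p, q):
    """sum_i p_i log(p_i / q_i), with the convention 0 log 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]   # illustrative distribution P
q = [0.2, 0.3, 0.5]   # illustrative reference distribution Q

d_pq = kl_divergence(p, q)   # the "usual" relative entropy, >= 0
s_rel = -d_pq                # the thread's S = -sum p log(p/p_0), with k_B = 1
d_pp = kl_divergence(p, p)   # equality case: P = Q gives zero

print(d_pq, s_rel, d_pp)
```

The first quantity is nonnegative and the second nonpositive, with both equal to zero exactly when the two distributions coincide, so maximizing one is the same as minimizing the other.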

>>If there is no natural transitive symmetry group, there is no natural
>>prior, and one has to make other useful choices. In particular, this
>>is the case for random natural numbers.
>
> If we adopt a finite-sets policy, together with your view of the
> measure mu as a prior, the above is shown to be a non sequitur.
> The reason you don't see the natural prior is that you're not
> examining the priors in terms of their representative sequences
> on *finite* spaces.

But the finite-sets policy is already a particular choice whose
usefulness depends on the circumstances.

>>The maximum entropy ensemble defined by given expectations depends on
>>the prior chosen. In particular, if the mean of a random natural number
>>is given, choice A leads to a geometric distribution, while
>>choice B leads to a Poisson distribution. The latter is the one
>>relevant for statistical mechanics. Indeed, choice B is the prior
>>needed in statistical mechanics of systems with an indefinite
>>number n of particles to get the correct Boltzmann counting in the
>>grand canonical ensemble. With choice A, the maximum entropy
>>solution is unrelated to the distributions arising in statistical
>>mechanics.
>
>
> The present problem is not about the statistical mechanics of a
> many-particle system, so it seems hard to justify forcing the
> prior to be Poisson by reference to 'correct Boltzmann counting'.

Hmm, the thread started with the question of whether the
frequentist interpretation could be used as the foundation of the
probabilistic interpretation of QM. Clearly this is closely related to
statistical mechanics, and the correct Boltzmann counting is needed
to take account of the (anti)symmetrization of the wave function...

Arnold Neumaier

r.e.s.

May 10, 2004, 6:02:02 AM

"Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...
> r.e.s. wrote:
> > "Arnold Neumaier" <Arnold....@univie.ac.at> wrote ...

[...]

> If you have nested hierarchies of finitely many
> objects, such as objects A1, A2, A3, B1, B2, C1, C2, C3, C4,
> there is no natural prior. Should each item get prior probability 1/9,
> or should each class A, B, C get prior probability 1/3 and the members
> of each class the same class-specific conditional probability?

Note that that argument applies as well to x in {a, b, c}. E.g.,
x is either in class {a} or class {b, c} -- if there is no other
information about x, should each *element* get prior probability
1/3, or should each *class* get prior probability 1/2?

ISTM that there's no ambiguity in the example; for if we know only
that x is in a given finite set X, then no additional information
is provided merely by noticing that X can be partitioned into
particular 'classes'. (We already know X can be partitioned in
that way, among others.) Consequently, in this situation, a
nonuniform prior would inappropriately reflect information that
we do not have. OTOH, if there is information to the effect that
x is the result of some multistage selection procedure involving
a particular partition, then of course we expect the prior to
reflect that, perhaps being appropriately nonuniform (depending
on the details).

That was a qualitative argument for the natural "non-informative"
prior being uniform in any finite space -- a conclusion that's
independently supported by the fact that Shannon entropy is
maximized by such a uniform distribution.
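
As a small illustration of that fact (not a proof), the sketch below compares the Shannon entropy, in bits, of a uniform distribution on four alternatives with two nonuniform ones. The particular distributions are illustrative.

```python
import math

def shannon_entropy(p):
    """Entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)

uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]
certain = [1.0, 0.0, 0.0, 0.0]

h_uniform = shannon_entropy(uniform)   # log2(4) bits, the maximum for n = 4
h_skewed = shannon_entropy(skewed)
h_certain = shannon_entropy(certain)   # no uncertainty at all

print(h_uniform, h_skewed, h_certain)
```

The uniform case attains log2(n) bits, and any departure from uniformity lowers the value, down to zero when one alternative is certain.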

(BTW, you'd said earlier ...

"Shannon assumes in his analysis that in the absence
of knowledge, all alternatives are equally likely ..."

but that's not true. His assumption nearest to it, perhaps, is
that among cases in which n alternatives are equally likely,
the uncertainty measure should increase with increasing n. That,
and two other reasonable axioms, leads to the Shannon entropy
functional. So the uniform distribution being "non-informative"
in Shannon's quantitative sense, is a conclusion following from
those axioms, independently of the qualitative arguments above.)

> > If we apply a finite-sets policy to the present problem, together
> > with your view of mu as a prior, then we should choose which of
> > the two priors, mu_A or mu_B, represents the least information
> > when the basic set is finite, say {0,...,n-1}, and no other
> > information is given (since these are *prior* to any such).
>
>
> I can't accept this kind of reasoning as 'natural'. How would you
> do it to derive a prior for R^2? You'd not get Lebesgue measure.

This concerns the limit set {0,1,2,...} -- 'infinite' here means
'countably infinite'.

> > Suppose P, Q are probability measures with
> > P << m, Q << m for positive measure m. Then the Radon-Nikodym
> > derivatives satisfy
> >
> > integral dm (dP/dm) log( (dP/dm)/(dQ/dm) ) >= 0

(I meant to say "for sigma-finite measure m".)

> > The present problem is not about the statistical mechanics of a
> > many-particle system, so it seems hard to justify forcing the
> > prior to be Poisson by reference to 'correct Boltzmann counting'.
>
> Hmm, the thread started with the question of whether the
> frequentist interpretation could be used as the foundation of the
> probabilistic interpretation of QM. Clearly this is closely related to
> statistical mechanics, and the correct Boltzmann counting is needed
> to take account of the (anti)symmetrization of the wave function...

Such a "foundations" question need not involve many-particle
systems; however, even there the general claim about the Poisson
distribution can be disproved by a single counter-example in which
the geometric distribution emerges as least-informative given a
specified mean:

The quantum-mechanical grand canonical ensemble is given correctly
by constrained maximization of Shannon entropy (i.e. using a
uniform prior) -- this was already in Jaynes' 1957 Phys Rev paper,
"Information Theory and Statistical Mechanics". The specific case
to consider is that of bosons at a single energy level, for which
the particle-number density is found to be *geometric*, i.e.
p_n = p_0 (1 - p_0)^n (n in {0,1,2,...}).
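
A numerical sketch of this case, assuming the usual grand canonical weights exp(-n x) for occupation number n of a single level, where x = beta (epsilon - mu) > 0. The symbol x, its value, and the truncation N_TERMS are illustrative assumptions, not taken from the post.

```python
import math

x = 0.7          # beta * (epsilon - mu), dimensionless; illustrative value
N_TERMS = 400    # truncation of {0,1,2,...}; the neglected tail is negligible

# Grand canonical weights for n bosons in one level, with a uniform prior on n.
weights = [math.exp(-n * x) for n in range(N_TERMS)]
z = sum(weights)
probs = [w / z for w in weights]

# Compare with the geometric form p_n = p_0 (1 - p_0)^n.
p0 = probs[0]
geometric = [p0 * (1.0 - p0) ** n for n in range(N_TERMS)]
max_diff = max(abs(a - b) for a, b in zip(probs, geometric))

# The mean occupation should match the Bose-Einstein value 1/(exp(x) - 1).
mean = sum(n * p for n, p in enumerate(probs))
bose_einstein = 1.0 / (math.exp(x) - 1.0)

print(p0, max_diff, mean, bose_einstein)
```

Here p_0 = 1 - exp(-x), so the weights exp(-n x) are exactly the powers (1 - p_0)^n, and the mean of the resulting geometric distribution is the Bose-Einstein occupation number.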

By regarding a prior on a countably infinite space as referring to
a sequence of priors on finite spaces, the Poisson example shows
how a nonuniform prior necessarily represents an incorporation of
information: If a nonuniform prior is needed in some case, the
implication is that it represents *some* missing piece of required
information that's not yet accounted for in the formulation of the
problem. For "correct Boltzmann counting", that missing piece of
information obviously is a QM constraint unaccounted for in the
classical formulation. The quantum-mechanical grand canonical
ensemble, in contrast, doesn't have that shortcoming, so a uniform
prior is appropriate there.

--r.e.s.