What follows is a slightly edited union of three
separate posts I made a few days ago in that other
group.
----------------------------------------------
----------------------------------------------
[snip question about 0.123456789101112...]
This is a very well known constant (especially in
math recreational and newsgroup settings) and it's
called "Champernowne's number" or "Champernowne's
constant". See p. 442 of Finch's book.
http://en.wikipedia.org/wiki/Champernowne_constant
In 1933 Champernowne showed that this number is normal
in base 10. This means that, as n approaches infinity,
the proportion of the digits that are 0's in the first
n digits approaches 1/10. Similarly for each of the
other digits 1, 2, ..., 9. Moreover, as n approaches
infinity, the proportion of the (possibly overlapping)
2-blocks '00' in the first n digits approaches 1/100.
Similarly for each of the other 2-block possibilities
'01', '02', '03', ..., '98', '99'. And the analogous
result for each of
the 1000 many 3-blocks of digits, and the analogous
result for each of the 10,000 many 4-blocks of digits,
and so on.
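As a concrete (if slow-converging) illustration of these
block frequencies, here is a short Python sketch -- my own,
not from the original post -- that builds a prefix of
Champernowne's digit string and tallies one 1-block and one
overlapping 2-block:

```python
from itertools import count

def champernowne_digits(n):
    """Return the first n decimal digits of 0.123456789101112... as a string."""
    out = []
    for k in count(1):
        out.extend(str(k))
        if len(out) >= n:
            return ''.join(out[:n])

digits = champernowne_digits(100_000)

# Single digits: the proportion of '5' should drift toward 1/10.
print('freq of 5 :', digits.count('5') / len(digits))

# Overlapping 2-blocks: the proportion of '42' should drift toward 1/100.
# (str.count only counts non-overlapping matches, so tally by hand.)
n = len(digits)
hits = sum(1 for i in range(n - 1) if digits[i:i + 2] == '42')
print('freq of 42:', hits / n)
```

With only 100,000 digits the proportions are near, but not
equal to, 1/10 and 1/100; the convergence that normality
guarantees is quite slow.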
Among other things, this implies that every finite
string of digits (no matter how long) appears infinitely
often in the decimal expansion. In fact, this latter
property is quite a bit weaker than being normal,
although in popular writing and on the internet
the two are often incorrectly identified. Here's why
every digit string has to appear infinitely often.
Suppose, for the sake of argument, that the digit
string 31415926 only appears a finite number of times,
the last time with '6' being the n'th decimal digit
of the number that is supposed to be normal. At the
most, this digit string can appear n/8 times in these
first n digits. [Or, if you're bothered about whether
overlapping could make it more than this (I don't
know either way, not having stopped to think about it,
although it seems to me this will vary according
to the digits of the string), then just use "n times"
instead of "n/8 times", since there's no way it can
appear more times than the number of digits.] Now think
about the proportion of appearances after n^2 digits,
after n^3 digits, after n^1000 digits, after n^(n^10)
digits, etc. Clearly, the proportion approaches zero
(which contradicts normality), since you'll never, ever
see another 31415926 again, but the total number of digits
keeps getting larger and larger and larger and . . .
This means, among other things, that every normal number
contains every possible "finitely expressed" coded message,
no matter how long. In fact, it will contain each such
message infinitely many times. Therefore, if you let 'a'
be 1, 'b' be 2, 'c' be 3, ..., 'z' be 26, 'space' be
'27', and so on for however many characters are used in
the totality of the books in the Library of Congress, and
then you take all these books in some order and generate
a VERY VERY LONG sequence of digits by concatenating
in order (as they occur in each book, with the books
taken in the order you happened to pick them) the digit
sequences for each of the characters appearing in all
the books, thereby obtaining a single digit string that
codes everything written in all the books in the Library
of Congress (and not only this, but it codes them in the
particular order that you picked them), then this digit
string will appear infinitely often in Champernowne's
number. If there is any sense in which the universe
is finitely describable (for example, if there are finitely
many particles and each particle can be in only a finite
number of locations, and there is a finite number of
quantized time units from the birth to death of the
universe), then a code for the entire universe from
past to future also appears infinitely often in
Champernowne's number.
Interestingly, most real numbers are normal to base 10
(in fact, even simultaneously normal to every positive
integer base) in the sense of Lebesgue measure. This
means that the numbers which are not normal, when taken
together, form a set that has "zero length". Less well
known is that most real numbers in another sense (Baire
category) are not normal (indeed, not normal in an
extremely strong way -- more on this below). Thus,
there are many (uncountably many in every open interval,
and more so) numbers that are transcendental and normal,
and there are many (again, uncountably many in every
open interval, and more so) numbers that are transcendental
and not normal. [This follows from the fact that what
I said holds for real numbers in place of transcendental
numbers, and the restriction to transcendental numbers
is now immediate, since there are only countably many
numbers ("small" in both the Lebesgue measure sense and
the Baire category sense) that are not transcendental.]
By the way, Champernowne's number is known to be
transcendental (it's obviously irrational).
This was proved by Mahler in 1937.
An interesting and not very difficult result is that if
a decimal expansion contains every digit string of every
length (i.e. each of the digit strings '0', '1', ...,
'9', '00', '01', ..., '99', '000', '001', ...) at least
once, then the decimal expansion contains every digit
string of every length, infinitely often. For example,
to show that '531' appears infinitely often, note that
the 3-digit string '531' has to appear at least once,
the 6-digit string '531531' has to appear at least once,
the 9-digit string '531531531' has to appear at least
once, etc.
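The trick above is easy to watch in action with a few lines
of Python (my own illustration; the cutoff 600,000 is simply
chosen large enough that the integer 531531 itself gets
concatenated, which forces the 6-digit string to appear):

```python
# Concatenate the decimal digits of 1, 2, 3, ..., going far
# enough that the integer 531531 itself is written out.
s = ''.join(str(k) for k in range(1, 600_000))

for block in ('531', '531531'):
    print(block, 'first appears at index', s.find(block))
```

The 9-digit string '531531531' would need a far longer
prefix (out to the 9-digit integer 531531531), so it is
omitted here, but the same reasoning applies.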
I've seen several names for these kinds of numbers in print,
the most common being "absolutely disjunctive real numbers"
(google the phrase) and "lexicons" (google along with the
phrase "real numbers"). Another name I've seen is "repetitious
number", which was probably only used once [in a published
talk abstract in Amer. Math. Monthly 51 (1944), p. 432].
The set of real numbers, each of which has this property
simultaneously in every base b and not just in base 10
(what I believe "absolutely disjunctive" and "lexicon"
actually mean), is very much smaller in many ways than
the set of normal numbers. I outlined some of these ways
in the following two posts. [For example, the collection
of real numbers in [0,1] whose b-ary expansions omit
a specific digit string is clearly a Cantor-like set
that is nowhere dense, and thus those numbers whose
b-ary expansions, for each b = 2, 3, 4, ..., omit any
of the only countably many possible finite-length digit
strings in base b will be a countable union of nowhere dense
sets, and hence a set of the first Baire category, whereas
the set of numbers that are not normal forms a much, much
larger set, one that is actually the complement of a first
Baire category set. It's kind of like saying the non-lexicons
have "measure zero" while the non-normals have "full measure",
and thus the lexicons have "full measure" while the normals
have "measure zero". Thus, to be a normal number is a much
more stringent property than to be a lexicon, since in
this sense there are far fewer normal numbers than lexicons.]
http://groups.google.com/group/sci.math/msg/4ec315328c1afdb8
http://groups.google.com/group/alt.math/msg/4cab6840d537f81c
I had a chance to look through some of my things
this weekend about normal numbers . . .
**********************************************************
For introductions to normal numbers in books (and one Master's
Thesis survey), see the following:
Adrian Belshaw, "On the Normality of Numbers",
Master's Thesis (Senior Supervisor: Peter Borwein),
Simon Fraser University, Fall 2005, vi + 34 pages.
http://www.math.sfu.ca/numthry/report/2006jan13.pdf
Billingsley, "Probability and Measure", 1995 3rd edition.
[Chapter 1.1, pp. 1-17]
Hardy/Wright, "An Introduction to the Theory of Numbers"
[Chapter 9.12 -- pp. 124-128 4th edition]
Ivan Niven, "Irrational Numbers" [Chapter 8]
Knuth, "The Art of Computer Programming" [Volume 2,
Chapter 3, Section 5 (pp. 142-169 in 1981 edition)]
Kuipers/Niederreiter "Uniform Distribution of Sequences"
[Has extensive literature and historical notes at the
end of Chapter 1.8.]
**********************************************************
For expository/elementary papers, see the following
[URL's given for those I was able to find freely
available on the internet]:
Jose Nilo G. Binongo, "Randomness, statistics, and pi",
Mathematics Teacher 95 #3 (March 2002), 224-230.
[In MT 95 #6 (September 2002), p. 478, Binongo says
that he has a longer version of the paper, which he
can send to anyone interested.]
Ralph P. Boas, "Distribution of digits in integers",
Mathematics Magazine 50 #4 (September 1977), 198-201.
Davar Khoshnevisan, "Normal Numbers are Normal",
Clay Mathematics Institute Annual Report,
2006, pp. 15 & 27-31.
http://tinyurl.com/n4xsam [293 .pdf file]
Greg Martin, "Absolutely abnormal numbers", American
Mathematical Monthly 108 #8 (October 2001), 746-754.
http://arxiv.org/abs/math/0006089
http://arxiv.org/PS_cache/math/pdf/0006/0006089v2.pdf
S. J. Taylor, "The regularity of randomness", Mathematical
Gazette 62 #419 (March 1978), 1-8.
Sergio B. Volchan, "What is a random sequence?", American
Mathematical Monthly 109 #1 (January 2002), 46-63.
http://www.maa.org/news/monthly046-063.pdf
Bodo Volkmann, "On non-normal numbers", Compositio
Mathematica 16 (1964), 186-190.
http://www.numdam.org/numdam-bin/fitem?id=CM_1964__16__186_0
**********************************************************
An excellent early history of normal numbers and how
the notion influenced (or was influenced by) the various
statistical laws of large numbers, ergodic theory, etc.
is the following:
Albert Novikoff and Jack Barone, "The Borel law of normal
numbers, the Borel zero-one law, and the work of Van Vleck",
Historia Mathematica 4 (1977), 43-65.
For example, on p. 54 they mention that in 1910 Faber
constructed a strictly increasing function that was
not (finitely) differentiable at each point of the
complement of the set of normal numbers. Faber noted
that this gives another proof that the set of non-normal
numbers has Lebesgue measure zero, since by a result
of Lebesgue's on the differentiability of monotone
functions (a standard result in virtually every graduate
real analysis class), the function he constructed must be
finitely differentiable almost everywhere. Regarding this
connection with real analysis, see also Mandelbrot/Jaffard's
paper in Mathematical Intelligencer 19 #4 (Fall 1997),
21-26. Another real analysis connection is the fact that
if f(x) is absolutely continuous, then for almost every
(Lebesgue measure) x we have x and f(x) simultaneously normal.
**********************************************************
There are quite a few miscellaneous results known about
normal numbers and related notions. I'll mention 4 that
are easy to state:
1. A normal number plus a rational number is always normal.
2. A normal number times a nonzero rational number is
always normal.
3. Every real number can be expressed as the sum of
two normal numbers.
4. Every nonzero real number can be expressed as the
product of two normal numbers.
The first two were proved in D. D. Wall's 1949 Ph.D.
Dissertation and the last two were proved by J. E.
Maxfield in Amer. Math. Monthly 59, 1952, p. 98.
**********************************************************
In this last section I'd like to discuss some extreme
forms of non-normality that do not seem to be very
well known.
In "Absolutely abnormal numbers" (see above), Greg Martin
constructs a number that is irrational and simultaneously
not normal in every base b = 2, 3, 4, ... Later in the paper
he notes that his method actually gives uncountably many
such numbers in every open interval. [In fact, his method
gives more, namely continuum many such numbers in every
open interval, since his proof is by an explicit
correspondence with a set whose cardinality is the
same as that of the set of real numbers, rather than
by (for example) a method in which a contradiction is
obtained from assuming there are only countably many
such numbers.]
Much more extreme results have previously been proved
for a much greater predominance of numbers, but rather
than try to sort through all the historical details
(which I'm not really prepared to do at this time anyway,
plus I doubt anyone would be interested), I'm going to
describe a result by a sometime (though very infrequent)
sci.math poster that is the most extreme result in
this direction I know of.
In order for a number to be normal, every possible
finite sequence of digits has to appear in the
appropriate limiting proportion as you look through
the digits to the right of the decimal point of
the number.
Thus, if x is a normal number and N(5,n) is the
number of appearances of the digit '5' in the first
n digits of the decimal expansion of x, then we have
N(5,n)/n --> 1/10 as n --> infinity. For this limit
to fail to be 1/10, either the limit doesn't exist, or
it exists but equals something other than 1/10. The
latter seems rather dull, as it corresponds to behavior
nearly as uniform
as being normal. This would be like a coin that isn't
fair, but it's still statistically stable in the sense
that with a large number of tosses we always get heads
just about 40% of the time and tails just about 60% of
the time. Far worse would be a coin that we simply cannot
make any statistical guesses as to how likely heads
will be. You toss it one million times and get 42%,
then toss it another million times and get 95%, then
another million times and get 3%, and so on, and the
same wild fluctuations occur when you change "million"
to "billion", or "million" to "trillion", or "million"
to "googol", etc.
The worst possible way for N(5,n)/n to not have 1/10
as its limit would be that, as you evaluate N(5,n)/n
for larger and larger values of n, the values of N(5,n)/n
bounce around, getting arbitrarily close to every single
real number between 0 and 1, inclusive. That means you
can pick any number r such that 0 <= r <= 1 and there
will be a sequence n_1, n_2, n_3, ... of positive
integers such that N(5,n_i)/n_i approaches r as
i --> infinity. It's like how (-1)^n + 1/n approaches
arbitrarily close to each of the numbers -1 and 1,
except N(5,n)/n does this for each real number between
0 and 1. And yes, it's EACH real number between 0 and 1,
not just each rational number between 0 and 1 or each
of some other countable set of values between 0 and 1.
Not only this, but we want this behavior to hold for
each of the 10 digits 0, 1, 2, ..., 9. Thus, for each
choice of a digit d = 0, 1, 2, ..., 9 and for each choice
of a real number r between 0 and 1 (inclusive), there
is a sequence {n_j} of positive integers such that
N(d,n_j)/n_j approaches r. Here, N(d,m) is just the
number of times the digit 'd' appears in the first
m digits to the right of the decimal point of x. Another
way to say this is that for each digit d, the set
of subsequential limits of the sequence N(d,n)/n is
equal to the closed interval [0,1].
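Here is a Python sketch of this kind of wild behavior -- not
Champernowne's number, of course, but an artificial digit
string of my own devising, built so that each new block of
digits swamps everything written before it, which makes
N(5,n)/n swing toward 1 and toward 0 along different
subsequences:

```python
# Build digits in blocks: even-numbered blocks are all 5's,
# odd-numbered blocks are all 0's, and each block is chosen
# so large that it dwarfs the entire prefix before it.
blocks = ['5']
total = 1
checkpoints = [1]
for j in range(1, 7):
    length = total * 2 ** j          # new block swamps the whole prefix
    blocks.append(('5' if j % 2 == 0 else '0') * length)
    total += length
    checkpoints.append(total)
s = ''.join(blocks)

# N(5,n)/n evaluated at the end of each block: the values
# alternate, climbing toward 1 and falling toward 0.
for n in checkpoints:
    print(n, round(s[:n].count('5') / n, 4))
```

With still faster-growing blocks (and a more careful mix of
digits) one can arrange for the set of subsequential limits
of N(5,n)/n to be all of [0,1], as described above.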
This sounds like the most extreme possible non-convergence
behavior we could require of N(d,n)/n, but an observation
I'll make shows that an even stronger kind of behavior
can be imagined. Suppose I pick a real number r between
0 and 1 and another real number s between 0 and 1. At
this point, all I've said is that there is a sequence
{n_i} of positive integers such that N(5,n_i)/n_i
approaches r and a possibly different sequence {n_k} of
positive integers such that N(6,n_k)/n_k approaches s.
Can I have my cake and eat it too, by requiring N(5,n_j)/n_j
to approach r and N(6,n_j)/n_j to approach s for the
SAME sequence {n_j}?
No, this is too much to ask for, at least if r + s > 1.
We simply can't have, for instance, roughly 65% of the
digits being 5's and roughly 50% of the digits being 6's,
at least not when we're looking back through the same
stretch of digits. The best we can hope for is that this
will happen whenever r + s doesn't exceed 1.
If we want to simultaneously deal with three digit frequency
values (instead of two digit frequency values, r and s),
the best we can hope for when using the same sequence of
positive integer digit counts is that each of the three
specified digits approaches its assigned frequency value,
provided the sum of the three frequency values isn't
greater than 1.
Indeed, all this -- and more -- is possible. Specifically,
there exists a real number x such that (take a deep breath):
Given any 10 nonnegative real numbers r_0, r_1, r_2, ...,
r_9 between 0 and 1 (inclusive) whose sum is equal to 1
(it must be, since the ten digit frequencies in any
prefix add up to exactly 1), then there exists a sequence
{n_i} of positive integers such that N(0,n_i)/n_i --> r_0
and N(1,n_i)/n_i --> r_1 and N(2,n_i)/n_i --> r_2 and ...
and N(9,n_i)/n_i --> r_9 (all ten limits being evaluated
as i approaches infinity).
Here's another way to describe this: The set of subsequential
limits of the sequence (N(0,n)/n, N(1,n)/n, ..., N(9,n)/n) in
the space R^10 is equal to the closed simplex in R^10
with vertices (1,0,0,...,0), (0,1,0,...,0), ...,
(0,0,0,...,1) (i.e. the 10-tuples of real numbers in which
exactly one coordinate is 1 and the remaining 9 coordinates
are 0).
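One sanity check on this simplex picture, in Python
(illustration only, using a random digit string as a
stand-in for the expansion of some real number x): the
vector of single-digit frequencies over any prefix is a
probability vector -- nonnegative entries summing to
exactly 1 -- so every subsequential limit must lie in the
simplex as well.

```python
import random

random.seed(0)
# Stand-in for the decimal expansion of some real number x.
digits = ''.join(random.choice('0123456789') for _ in range(10_000))

for n in (10, 100, 1000, 10_000):
    # Frequency vector (N(0,n)/n, ..., N(9,n)/n) over the first n digits.
    freq = [digits[:n].count(d) / n for d in '0123456789']
    print(n, 'sum of frequencies =', sum(freq))
```

(The printed sums may differ from 1 by floating-point
rounding only; the underlying counts sum to n exactly.)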
This result specifically pertains to base 10. What about
other bases? As you might expect, we can simultaneously
in all bases maximally un-limit the single digit behaviors.
Specifically, there exists a real number x such that
(take an even deeper breath than before):
Given any base b (a positive integer greater than 1) and
given b many nonnegative real numbers r_0, r_1, r_2, ...,
r_(b-1) between 0 and 1 (inclusive) such that their sum
is equal to 1, then there exists a sequence
{n_i} of positive integers such that N(0,n_i)/n_i --> r_0
and N(1,n_i)/n_i --> r_1 and N(2,n_i)/n_i --> r_2 and ...
and N(b-1,n_i)/n_i --> r_(b-1). This time, of course,
N(0,n_i) is the number of appearances of the digit '0'
in the first n_i digits past the decimal point in the
b-ary expansion of x.
Another way to describe this: The set of subsequential limits
of the sequence (N(0,n)/n, N(1,n)/n, ..., N(b-1,n)/n) in the
space R^b is equal to the closed simplex in R^b with vertices
(1,0,0,...,0), (0,1,0,...,0), ..., (0,0,0,...,1).
Incidentally, it is known that, given any nonempty closed
and connected subset E of this simplex in R^b, there exists
a real number x such that the set of subsequential limits
of the sequence (N(0,n)/n, N(1,n)/n, ..., N(b-1,n)/n) is
equal to E. See
the second half of p. 188 of Bodo Volkmann's paper, which
I mentioned above. In a certain sense, this is the best
possible result, since the set of these subsequence limits
always forms a nonempty closed connected subset of this
b-dimensional simplex. Actually, I can think of stronger
results, such as arranging for these subsequential limits
to be achieved via subsequences with specified upper densities,
and I strongly suspect we can get upper density 1 for each of
these subsequential limits, for residually many real numbers x.
What about messing up the frequencies of each 2-digit
string, and of each 3-digit string, and so on? And can
we do some or all of this in just one base, some bases
(where "some" might be finitely many, infinitely many
but with possibly infinitely many exceptions, infinitely
many exceptions that have small or even zero density
in the set of all positive integers, etc.), or all bases?
Rather than try to give Olsen's actual result, I'll just
say that he identified a certain set that the subsequential
limits of all the fixed-length string frequencies belong
to (the corresponding set in the version I gave above
is the set of all 10-tuples of nonnegative real numbers
r_0, r_1, r_2, ..., r_9 between 0 and 1 such that
their sum is equal to 1), no matter what
the real number x, and then he showed that "most" (in the
Baire category sense) real numbers are so extremely
non-normal that the limiting points of all the various
finite length string frequencies for each of these real
numbers completely fills up this set, for each base
b = 2, 3, 4, ..., of course.
Without describing the set in question, here's a more
precisely worded account of what he proved can happen
for a real number x (just keep breathing and hope for
the best):
Given any base b (a positive integer greater than 1) and
given any positive integer k (k is the specified string
length that we'll be investigating; in the version I gave
above, k = 1) and given any b^k many positive real
numbers satisfying a certain condition (the condition is
something analogous to the k = 1 case, in which they
were between 0 and 1 and summed to exactly 1),
then there exists a sequence {n_i} of positive integers
such that, as i --> infinity, all the proportions of the
b^k many k-length strings in the first n_i b-ary digits
past the decimal point approach the corresponding b^k
many chosen real numbers (the correspondence being
something you decide on in advance;
in the k = 1 case I gave above, this is the part where
N(0,n_i)/n_i --> r_0 and N(1,n_i)/n_i --> r_1 and ...).
Note that Greg Martin's "absolutely abnormal numbers"
have the following much, much, much weaker property:
Given any base b (a positive integer greater than 1),
then AT LEAST ONE OF N(0,n)/n, N(1,n)/n, N(2,n)/n, ...,
N(b-1,n)/n does not approach 1/b.
Martin also shows that there are uncountably many such
numbers in every open interval.
What Olsen did was to show that his "extremely non-normal"
numbers form a set whose complement has first Baire category
in the set of real numbers. Among other things, this implies
there are uncountably many (in fact, continuum many and
much more than this in ways that cardinality doesn't measure)
of his extremely non-normal numbers in every open interval.
Olsen's paper is not on the internet, but for those interested,
it was published as:
Lars Olsen, "Extremely non-normal numbers", Mathematical
Proceedings of the Cambridge Philosophical Society 137
(2004), 43-53.
Finally, I don't know if a Bodo Volkmann type of result
has been proved for the set Olsen identifies, but it's
almost certainly true. In fact, Volkmann's constructions
can probably be extended in a natural way to cover this
type of extension, and doing so might be a possible
jumping off point for someone's Ph.D. topic.
**********************************************************
Dave L. Renfro
"Dave L. Renfro" <renf...@cmich.edu> wrote in message
news:10080a99-edb4-4bdd...@31g2000vbf.googlegroups.com...
On the Arts side, there's Pi [the movie].
Cf.:
< http://www.imdb.com/title/tt0138704/ >
The trivia page there is interesting, but has film
spoilers: IMDb > Pi (1998) > Trivia.
David Bernier