Proof that A^n = 0 if A is nilpotent

vip.ma...@gmail.com

unread,

Dec 11, 2007, 5:40:32 PM12/11/07

to

Hi ,

I would like to request some help if anyone has ideas about how to
prove that if A is nilpotent matrix then A^n = 0?

Thanks

quasi

unread,

Dec 11, 2007, 5:55:27 PM12/11/07

to

On Tue, 11 Dec 2007 14:40:32 -0800 (PST), vip.ma...@gmail.com
wrote:

>I would like to request some help if anyone has ideas about how to
>prove that if A is nilpotent matrix then A^n = 0?

It's true by definition.

<http://en.wikipedia.org/wiki/Nilpotent_matrix>

quasi

Chip Eastham

unread,

Dec 11, 2007, 6:29:42 PM12/11/07

to

On Dec 11, 5:55 pm, quasi <qu...@null.set> wrote:
> On Tue, 11 Dec 2007 14:40:32 -0800 (PST), vip.maver...@gmail.com

> wrote:
>
> >I would like to request some help if anyone has ideas about how to
> >prove that if A is nilpotent matrix then A^n = 0?
>
> It's true by definition.
>
> <http://en.wikipedia.org/wiki/Nilpotent_matrix>
>
> quasi

I suspect that what the OP is asking has to do with
replacing the exponent that appears in the definition
with the dimension of the (square) matrix A. For
simplicity I'll assume that A is a complex matrix
rather than a matrix over some other coefficients.

That is, A is nilpotent iff A^k = 0 for some positive
integer k. Proposition: If A is nilpotent and the
dimensions of A are nxn, then A^n = 0.

One approach to proving this would be to show that
the characteristic polynomial of A, which is of
degree n, is specifically x^n. Then by the Cayley-
Hamilton Theorem, A satisfies its characteristic
polynomial, ie. A^n = 0.

Here's a sketch of an argument the characteristic
polynomial of A is x^n. Otherwise the polynomial
would have a nonzero root, possibly complex, say
r. Every characteristic root (eigenvalue) must
have at least one nonzero eigenvector, say:

Av = rv

Then A^n v = r^n v, but also A^n v must be zero.
This is a contradiction (since both r and v are
nonzero). It follows that the characteristic
polynomial of A must be simply x^n.

regards, chip

vip.ma...@gmail.com

unread,

Dec 11, 2007, 7:05:02 PM12/11/07

to

Hi ,

Thanks for the suggestion, if we assume say k <= n then A^k = 0 hence
obiviously A^n =0, how do we handle the case of k > n and show that
A^n is still 0?

Thanks

quasi

unread,

Dec 11, 2007, 7:24:53 PM12/11/07

to

On Tue, 11 Dec 2007 16:05:02 -0800 (PST), vip.ma...@gmail.com
wrote:

Read Chip's proof.

Note that it never assumes k <= n.

quasi

unread,

Dec 11, 2007, 7:41:48 PM12/11/07

to

Here's another proof, although essentially the same ...

Proposition: Let A be an n x n matrix over a field K. If A is
nilpotent, then A^n = 0.

proof (I'll just give an outline):

(1) Let J = {f in K[x] | f(A) = 0}. Then J is an ideal of K[x].

(2) Since K[x] is a PID, J must be a principal ideal, J = (g) say.

(3) Since A is nilpotent, A^s = 0 for some positive integer s. It
follows that x^s is in J.

(4) But then g must be a factor of x^s, hence g = x^r for some
positive integer r, where r <= s.

(5) But every eigenvalue of A is a root of the minimal polynomial,
that is, a root of g, hence all eigenvalues of A are zero.

(6) Since all eigenvalues of A are zero, the characteristic polynomial
of A must be x^n.

(7) Finally, since A satisfies its characteristic polynomial, it
follows that A^n = 0.

quasi

Jim Ferry

unread,

Dec 12, 2007, 12:14:32 AM12/12/07

to

If A : R^n -> R^n, then consider the sequence

A^0 R^n, A^1 R^n, A^2 R^n, ...

Each element of this sequence is a linear space that contains the next
element. If the containment is proper, the dimension goes down by at
least one. On the other hand, if and when the containment is actually
an equality, the sequence is constant from that point forward. This
constant is 0-d iff A is nilpotent.

Those would be the ideas I'd use if I were writing this up for
homework, though I'd want to put them together into a coherent proof.

-Jim Ferry
Metron, Inc.
f rr @m tsc .c m
e y e i o

quasi

unread,

Dec 12, 2007, 12:48:11 AM12/12/07

to

Nice proof strategy.

Works just as well over any field.

quasi

Bill Dubuque

unread,

Dec 12, 2007, 1:36:26 AM12/12/07

to

quasi <qu...@null.set> wrote: (*edited*)
>
> PROPOSITION Let A be an n x n matrix over a field K.

> If A is nilpotent, then A^n = 0.
>

> PROOF (sketch)

> (1) Let J = {f in K[x] | f(A) = 0}. Then J is an ideal of K[x].
> (2) Since K[x] is a PID, J must be a principal ideal, J = (g) say.
> (3) Since A is nilpotent, A^s = 0 for some positive integer s.
> It follows that x^s is in J.
> (4) But then g must be a factor of x^s, hence g = x^r for some
> positive integer r, where r <= s.
> (5) But every eigenvalue of A is a root of the minimal polynomial,
> that is, a root of g, hence all eigenvalues of A are zero.
> (6) Since all eigenvalues of A are zero, the characteristic polynomial
> of A must be x^n.
> (7) Finally, since A satisfies its characteristic polynomial,
> it follows that A^n = 0.

m deg f
SIMPLER f in PID K[x], x prime -> (f(x), x ) | x

--Bill Dubuque

Proginoskes

unread,

Dec 12, 2007, 3:26:32 AM12/12/07

to

Yet another idea: For any square matrix A, there is an invertible
matrix U and an upper triangular matrix T such that

A = U T U^(-1).

(Schur Triangulation.) Furthermore, the diagonal entries of T are the
eigenvalues of A, which in this case are all 0. Then

A^n = (U T U^(-1))^n = U T^n U^(-1),

and since T^n = 0 --- in particular, in T^k, the main diagonal and
the k-1 diagonals above it have all-zero entries --- this proves
A^n = 0 as well.

A bit longer, but the good proofs were taken.

--- Christopher Heckman

Bill Dubuque

unread,

Dec 12, 2007, 6:06:33 AM12/12/07

to

Jim Ferry <corkl...@hotmail.com> wrote:

>On Dec 11 2007, vip.ma...@gmail.com wrote:
>>
>> I would like to request some help if anyone has ideas about
>> how to prove that if A is nilpotent matrix then A^n = 0?
>

> If A : R^n -> R^n, then consider the sequence
>
> A^0 R^n, A^1 R^n, A^2 R^n, ...
>
> Each element of this sequence is a linear space that contains the next
> element. If the containment is proper, the dimension goes down by at
> least one. On the other hand, if and when the containment is actually
> an equality, the sequence is constant from that point forward. This

> constant is 0-d iff A is nilpotent [...]

It recalls a pretty pigeonhole proof of AB=I -> BA=I or, equivalently,
injective (1-1) -> surjective (onto), for finite dim vectors spaces V.
I mentioned this proof here [1] on Apr 24 1999, but forgot to followup
with details. The details are below, followed by further comments about
various generalizations, and some references to related literature.

BA = I -> A injective, since B times Ax = Ay yields x = y; and

A surjective -> AB = I, since for all x exist y: x = Ay = A(BA)y = ABx

Thus BA = I -> AB = I reduces to A injective -> A surjective; but

LEMMA A injective -> A surjective, for all linear A on finite dim V
where dim V := maximum length of all subspace chains R < S < ... < V

PROOF A injective -> A preserves injections: R < S -> AR < AS
thus for R < S < ... < V a subspace chain of max length
its image AR < AS < ... < AV <= V would have greater length
if AV < V, thus instead AV = V, i.e. A is surjective. QED

It's one of the prettiest simple pigeonhole proofs I've discovered.

It fails for infinite dim V; see [1] for some easy counterexamples
employing very simple shift operators inspired by the Hilbert Hotel.
The shift operator is of fundamental importance in linear algebra,
for example see the review of Fuhrmann's book in my prior post [2].

[1] http://google.com/groups?threadm=y8zogkdmz1g.fsf%40berne.ai.mit.edu
[2] http://google.com/groups?threadm=y8zogksm5v0.fsf%40berne.ai.mit.edu

The above proof does not depend upon any specific properties of
vector spaces. In fact, the key lemma is merely a lattice/poset
result that simply says that strict-order-preserving maps cannot
decrease heights. The exact same proof applies to any algebraic
structure V of finite height in its lattice of subalgebras, where
A is any injective endomorphism of V. For example, in the trivial
case (no operations) the structure is just a set and homs are just
set maps, and the proof specializes to a form of the pigeonhole
principle, namely that a one-to-one map on a finite set is onto.
For finite-length modules it yields: injective endomorphisms are
surjective. Also related are Jordan-Holder results on composition
series (which offer an alternative foundation for dimension theory
of vector spaces). Thus we see that the results AB = I -> BA = I
and injective endomorphisms are surjective are simple consequences
of a general pigeonhole principle for finitely generated structures;
moreover the crucial role of the finiteness hypothesis is now clear.

Of course it is highly unlikely that any of these results are new.
Indeed, a quick AMS Math Reviews search turned up the reviews below
discussing related results regarding finitely generated modules.

--Bill Dubuque
------------------------------------------------------------------------------
45:246 13C05
Vasconcelos, Wolmer V.
Regular endomorphisms of finitely generated modules.
J. London Math. Soc. (2) 4 (1971), 27--32.
------------------------------------------------------------------------------
The author studies the question: When is an injective endomorphism
of a finitely generated module over a commutative ring an isomorphism?

Let R be a commutative Noetherian ring, and let E be an R-module of
finite type. Consider the following properties of f in Hom_R(E,E):
(a) f is injective; (b) f is right cancellable; (c) f is left cancellable;
(d) f is surjective; (e) f is an isomorphism. Then the following relations
hold: (e) <-> (d) -> (c) -> (b) <-> (a). \O(E) = Hom_R(E,E) admits a two-sided
total ring of quotients which is obtained by localizing \O(E) at the set of
all non-zero divisors of E. Consequently, \O(E) has an Artinian total ring
of quotients if and only if I, the annihilator of E , has no embedded primes
and E is a torsion-free R/I-module. In the final section the hypothesis that
R is Noetherian is dropped. Theorem: Let E be a finitely generated torsion
module over a ring whose finitistic global dimension is 1; if E has finite
projective dimension, then any injective endomorphism of E is an isomorphism.
Reviewed by M. Teply
------------------------------------------------------------------------------
41:3460 13.40
Vasconcelos, Wolmer V.
Injective endormorphisms of finitely generated modules.
Proc. Amer. Math. Soc. 25 1970 900--901.
------------------------------------------------------------------------------
The author proves the following theorem concerning arbitrary commutative
rings. Theorem: Any injective endomorphism of a finitely generated R-module
is an isomorphism if and only if every prime ideal of R is maximal.

The proof is elementary and rests on the Cohen-Seidenberg theorem concerning
integral extensions of a ring.
Reviewed by R. E. Mac Rae

Zbl: ... is a well-known exercise that Artinian modules have this property.
http://www.emis.de/cgi-bin/Zarchive?an=0197.31404
------------------------------------------------------------------------------
57:9754 16A52
Armendariz, Efraim P.; Fisher, Joe W.; Snider, Robert L.
On injective and surjective endomorphisms of finitely generated modules.
Comm. Algebra 6 (1978), no. 7, 659--672.
------------------------------------------------------------------------------
The authors discuss the following two problems: (I) [(S]) For what rings are
injective[surjective] endomorphisms of finitely generated modules isomorphisms?
The answer to (I) (also proved by F. Dischinger] is: The matrix rings R_n over
R must be left \pi-regular (basically d.c.c. on ideals generated by the powers
of an element). This condition is an analog of the condition in the commutative
case that the Krull dimension be zero. In the commutative case, the R_n 's are
easily shown to be \pi-regular along with R: in general this is not known.
The characterization is however strong enough to prove (I) when R is a P.I.
\pi-ring or when R is perfect. As for (S), the results are less decisive, but
it is shown to be true for P.I.-rings that are integral over the center.
Another class of rings for which (S) is valid is that of von Neumann rings
whose primitive factor rings are Artinian. The paper concludes with various
examples which exhibit a variety of limitations on (S), in particular that
it does not hold for arbitrary P.I.-rings.
Reviewed by W. V. Vasconcelos
------------------------------------------------------------------------------
81k:16015 16A30
Menal, Pere
On \pi -regular rings whose primitive factor rings are Artinian.
J. Pure Appl. Algebra 20 (1981), no. 1, 71--78.
------------------------------------------------------------------------------
This paper is concerned with rings R that are \pi-regular (for each x in
R there exist a positive integer n and an element y in R such that
x^n y x^n = x^n and have the further property that all primitive factor
rings of R are Artinian. The main theorem of the paper is that if R is
a \pi-regular ring whose primitive factor rings are Artinian, then R is
strongly \pi-regular (for each x in R there exist a positive integer n
and an element y in R such that x^n = x^{n+1}y) and R has stable range 1
(for any a,b in R with aR + bR = R , there exists c in R such that a+bc
is a unit in R ). This theorem leads to the following results concerning
finitely generated modules: If M is a finitely generated module over a (von
Neumann) regular ring R whose primitive factor rings are Artinian;
consequently, End_R(M) is strongly \pi-regular and has stable range 1, and
M cancels from direct sums of R-modules. (The strong \pi-regularity of
End_R(M) had been proven earlier by E. P. Armendariz, J. W. Fisher and R. L.
Snider [Comm. Algebra 6 (1978), no. 1, 659--672; MR 57#9754]. The question of
stable range 1 for End_R(M) and the cancellation property for M had been
posed as open problems in the reviewer's book [von Neumann regular rings,
Problem 53, see p. 350, Pitman, London, 1979; MR 80e:16011].) Further results
in this paper deal with a \pi-regular ring R whose index of nilpotence is a
finite integer n: then the primitive factor rings of R are all Artinian of
index at most n. Also, if R is a regular ring of index n and M is an
m-generated R-module, then End_R(M) is a \pi-regular ring of index at most mn.
Reviewed by K. R. Goodearl
------------------------------------------------------------------------------
82a:16031 16A64
Hirano, Yasuyuki
Correction to: "On Fitting's lemma" [Hiroshima Math. J. 9 (1979), no. 3,
623--626; MR 80j: 16022].
Hiroshima Math. J. 10 (1980), no. 3, 699.
------------------------------------------------------------------------------
The author provides a revised proof of the implication (1) -> (2) in
Proposition 1. J. W. Fisher had pointed out the error to the author.
------------------------------------------------------------------------------
80j:16022 16A64 (16A30)
Hirano, Yasuyuki
On Fitting's lemma.
Hiroshima Math. J. 9 (1979), no. 3, 623--626.
------------------------------------------------------------------------------
A module M satisfies Fitting's lemma if for any endomorphism f of M
there exists an integer n such that M = f^n(M) (+) Ker(f^n). It is shown
that if for every finitely generated R-module injective endomorphisms are
isomorphisms and surjective endomorphisms are isomorphisms, then every
finitely generated R-module satisfies Fitting's lemma. This fact is used to
give an alternate treatment of some results of E. P. Armendariz, J. W. Fisher
and the reviewer [Comm. Algebra 6 (1978), no. 7, 659--672; MR 57#9754].
Reviewed by Robert L. Snider
------------------------------------------------------------------------------
92h:16008 16E50 (16P70)
Hirano, Yasuyuki(J-OKAY)
Some characterizations of \pi -regular rings of bounded index.
Math. J. Okayama Univ. 32 (1990), 97--101.
------------------------------------------------------------------------------
A ring R is said to have bounded index n if there is some number n for
which a^{n+1} = 0 implies a^n = 0. R is said to be \pi-regular, if for
any a in R there is r in R and n>0 for which a^n = a^n r a^n. The
author shows that prime \pi-regular rings of bounded index are simple
Artinian, thereby enabling him to characterize a ring of bounded index as
being \pi-regular if and only if every prime factor ring is Artinian. (This
result relies on a theorem of E. P. Armendariz, J. W. Fisher and R. L. Snider
[Comm. Algebra 6 (1978), no. 7, 659--672; MR 57 #9754].) He also shows
that a ring is \pi-regular of bounded index if and only if the indices of
the endomorphism rings of its cyclic modules are bounded.
Reviewed by Louis Rowen
------------------------------------------------------------------------------
93k:16010 16D80 (13C99 16P99)
Leary, F. C.(1-STBV)
Dedekind finite objects in module categories.
J. Pure Appl. Algebra 82 (1992), no. 1, 71--80.
------------------------------------------------------------------------------
An R-module M is called Dedekind-finite when every injective endomorphism
of M is an isomorphism. The author obtains some elementary results on
Dedekind-finite modules and, in particular, on the commutative rings R such
that each finitely generated free R-module is Dedekind-finite. Surprisingly
enough, the relevant papers on the subject, such as those by W. V. Vasconcelos
[Proc. Amer. Math. Soc. 25 (1970), 900--901; MR 41#3460] and E. P. Armendariz,
J. W. Fisher and R. L. Snider [Comm. Algebra 6 (1978), no. 7, 659--672; MR
57#9754], are not even mentioned.
Reviewed by J. L. Gomez Pardo
------------------------------------------------------------------------------
99g:13011 13C13
Barry, M.(SNG-DAKS-MI); Gueye, C. T.(SNG-DAKS-MI); Sanghare, M.(SNG-DAKS-MI)
On commutative FGI-rings.
Extracta Math. 12 (1997), no. 3, 255--259.
------------------------------------------------------------------------------
Let R be a commutative ring. An R-module M is said to satisfy property
(I) if every injective endomorphism on M is surjective. W. V. Vasconcelos
[Proc. Amer. Math. Soc. 25 (1970), 900--901; MR 41#3460] showed that every
finitely generated R-module satisfies property (I) if and only if every prime
ideal of R is maximal. Call R an FGI-ring if every R-module with property
(I) is finitely generated. It is shown that in a commutative FGI-ring every
prime ideal is maximal and the number of prime ideals is finite. Moreover, a
countable commutative ring is an FGI-ring if and only if it is an Artinian
principal ideal ring.
Reviewed by Daniel D. Anderson

Herman Rubin

unread,

Dec 12, 2007, 1:10:47 PM12/12/07

to

In article <l95ul3939p35kjvhc...@4ax.com>,

quasi <qu...@null.set> wrote:
>On Tue, 11 Dec 2007 14:40:32 -0800 (PST), vip.ma...@gmail.com
>wrote:

>>I would like to request some help if anyone has ideas about how to
>>prove that if A is nilpotent matrix then A^n = 0?

>It's true by definition.

I presume that the original poster meant by n the
dimension of the matrix. This definition just states
that it is true for SOME power.

If A^k = I for some k, it need not be true that A^n = I.

--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

quasi

unread,

Dec 12, 2007, 3:15:41 PM12/12/07

to

On 12 Dec 2007 13:10:47 -0500, hru...@odds.stat.purdue.edu (Herman
Rubin) wrote:

>In article <l95ul3939p35kjvhc...@4ax.com>,
>quasi <qu...@null.set> wrote:
>>On Tue, 11 Dec 2007 14:40:32 -0800 (PST), vip.ma...@gmail.com
>>wrote:
>
>>>I would like to request some help if anyone has ideas about how to
>>>prove that if A is nilpotent matrix then A^n = 0?
>
>>It's true by definition.
>
>I presume that the original poster meant by n the
>dimension of the matrix. This definition just states
>that it is true for SOME power.
>
>If A^k = I for some k, it need not be true that A^n = I.

Yes, if we assume that A is n x n, then there's something to prove.

Using that assumption, I did supply a proof in a later reply.

quasi

Michael Press

unread,

Dec 12, 2007, 9:15:35 PM12/12/07

to

In article
<abe5b6f5-eee7-48b9...@1g2000hsl.googleg
roups.com>,
vip.ma...@gmail.com wrote:

Take the dimension of A to be n.
Suppose s is the least s such that A^s = 0
Take u =/= 0 in R^n.
There is a least integer k <= n such that
A^0.u, A^1.u, A^2.u, ..., A^k.u are linearly dependent.
That is to say there is a polynomial P of degree k <= n
such that P(A) = 0. Therefore P(X) divides X^s, and P(X) = X^k.
Therefore s <= n, and A^n = 0.

--
Michael Press

Bill Dubuque

unread,

Dec 13, 2007, 3:55:27 AM12/13/07

to

Please follow the link below (to my prior message in this thread)
http://google.com/groups?selm=y8z4peo6ukm.fsf%40nestle.csail.mit.edu

Please ignore this post unless you were following the above link.
It exists only to correct a Google Groups indexing bug.

tag: dubuque fuhrmann holder

--Bill Dubuque

Chip Eastham

unread,

Dec 13, 2007, 7:42:28 PM12/13/07

to

It's a nice proof, avoiding both use of the big hammer
(Cayley-Hamilton) and any eigenvalue/algebraic closure
assumptions on the field of coefficients.

--c

Jim Ferry

unread,

Dec 13, 2007, 11:45:30 PM12/13/07

to

On Dec 13, 7:42 pm, Chip Eastham <hardm...@gmail.com> wrote:
> On Dec 12, 12:14 am, Jim Ferry <corkleb...@hotmail.com> wrote:
> >
> > A^0 R^n, A^1 R^n, A^2 R^n, ...
>

> It's a nice proof, avoiding both use of the big hammer
> (Cayley-Hamilton) and any eigenvalue/algebraic closure
> assumptions on the field of coefficients.

Thank you.

The variety of proofs was surprising. And yet one could imagine many
more: e.g., no one invoked a decomposition into the Jordan canonical
form.

Bill Dubuque

unread,

Dec 14, 2007, 4:23:03 AM12/14/07

to

Jim Ferry <corkl...@hotmail.com> wrote:
>Chip Eastham <hard...@gmail.com> wrote:
>>Jim Ferry <corkl...@hotmail.com> wrote:

>>>On Dec 11, 5:40 pm, vip.ma...@gmail.com wrote:
>>>>
>>>> I would like to request some help if anyone has ideas about
>>>> how to prove that if A is nilpotent matrix then A^n = 0?
>>>

>>> If A : R^n -> R^n, then consider the sequence
>>>

>>> A^0 R^n, A^1 R^n, A^2 R^n, ...
>>>

>>> Each element of this sequence is a linear space that contains the next
>>> element. If the containment is proper, the dimension goes down by at
>>> least one. On the other hand, if and when the containment is actually
>>> an equality, the sequence is constant from that point forward. This
>>> constant is 0-d iff A is nilpotent.
>>

>> It's a nice proof, avoiding both use of the big hammer (Cayley-Hamilton)

>> and any eigenvalue/algebraic closure assumptions on the field of coeff's.

>
> The variety of proofs was surprising. And yet one could imagine many more:
> e.g., no one invoked a decomposition into the Jordan canonical form.

But why employ huge sledgehammers when the essence is rather trivial?
As I stressed in my prior post here, both this theorem and the theorem
A 1-1 -> A onto for linear A on a finite dimensional vector space, are
both special cases of trivial theorems on posets (Partially Ordered SETs).
Indeed, there I presented a proof highlighting the key role of finiteness,
and consequent application of the pigeonhole principle. However, it seems
that elegant proofs like these are widely overlooked. Of course one can
only speculate why this is so. One reason may be that poset theory is not
widely taught - though it does prove crucial when one studies important
lattices of substructures and congruences later on in Universal Algebra.

Secondly and, no doubt, most importantly, naive students simply cannot
resist the temptation to dive in and calculate. Here that means working
with specific representations of vector spaces, i.e. in terms of bases,
coordinates, determinants, etc. If, instead, one resists this temptation
and, at first sight of the problem, attempts to distill the true essence
of the matter, one often finds that the desired proof is nothing but
a specialization of a rather trivial proof - when viewed at the proper
level of abstraction. I've emphasized this viewpoint many times here,
e.g. see my posts [1] on abstracting out "hidden ideals" in many common
proofs regarding factorization and divisors in elementary number theory.
Speaking of ideals, their inventor, Dedekind, was the champion of these
conceptual methodological issues. On such, below is a few page extract
from a nice 34 page paper [2] of Jeremy Avigad discussing such matters.
See also his other interesting publications [3] on related topics.

--Bill Dubuque

[1] http://google.com/groups/search?q=group%3A*math*+wgd+hidden+ideal
[2] http://www.andrew.cmu.edu/user/avigad/Papers/dedekind.pdf
[3] http://www.andrew.cmu.edu/user/avigad/papers.html
Excerpt from [2]
================
The work of Richard Dedekind provides a clear example of interplay
between general philosophical views and methodological concerns. His
work has certainly had a tangible effect on mathematics, inaugurating
practices that were to proliferate in the decades that followed. These
include the use of infinitary, set-theoretic language; the use of
nonconstructive reasoning; axiomatic and algebraic characterization
of structures, and a focus on properties that can be expressed in
terms of mappings between them; the use of particular algebraic
structures, like modules, fields, ideals, and lattices; and uses of
algebraic mainstays like equivalence relations and quotient structures.
These innovations are so far-reaching that it is hard not to attribute
them to a fundamentally different conception of what it means to do
mathematics. Indeed, we find Dedekind addressing foundational issues
in his Habilitationsrede (1854), and his constructions of the real
numbers (1872) and the natural numbers (1888) are well-known. But even
his distinctly mathematical writings bear a strongly reflective tone,
one that is further evident in his correspondence with colleagues; see
(Dedekind,1968; Dugac, 1976; Ewald, 1996; van Heijenoort, 1967
[...]
Dedekind's overall mathematical output reflects a remarkable
methodological unity, characterized, above all, by a drive to
radically reformulate the conceptual settings of the mathematical
problems he addressed, through the introduction and improvement of
new, more effective, and simpler concepts. (Corry, 2004, 68)
[...]
3 Dedekind's emphasis on conceptual reasoning

Below we will discuss a number of aspects of Dedekind's theory of
ideals. There is one, however, that gives it a character that is
squarely opposed to that of Kronecker's theory, namely, Dedekind's
emphasis on "conceptual" over algorithmic reasoning. Whereas
Kronecker's theory of ideal divisors is explicitly algorithmic
throughout, Dedekind's stated goal was to avoid algorithmic
reasoning:

Even if there were such a theory, based on calculation, it still
would not be of the highest degree of perfection, in my opinion.
It is preferable, as in the modern theory of functions, to seek
proofs based immediately on fundamental characteristics, rather than
on calculation, and indeed to construct the theory in such a way that
it is able to predict the results of calculation ... (Dedekind, 1877,
Werke vol. 3, 296; trans. Stillwell, 1996, 102; also translated in
Stein, 1988,245)

In this passage, Dedekind is referring to Riemann's approach to the
theory of functions of a complex variable, in which functions are
characterized by their topological and geometric properties. Dedekind
makes a similar claim in a letter to Lipschitz, written around the same
time:

My efforts in number theory have been directed towards basing the work
not on arbitrary representations or expressions but on simple
foundational concepts and thereby - although the comparison may sound a
bit grandiose - to achieve in number theory something analogous to
what Riemann achieved in function theory, in which connection I cannot
suppress the passing remark the Riemann's principles are not being
adhered to in a significant way by most writers - for example, even
in the newest work on elliptic functions. Almost always they mar the
purity of the theory by unnecessarily bringing in forms of representation
which should be results, not tools, of the theory. (Dedekind, 1876,
Werke vol. 3, 468-469; quoted and translated in Edwards, 1983)

Dedekind returns to this point in (1895), an essay we will discuss
below. There, he first quotes an excerpt from Article 76 of Gauss's
Disquisitiones Arithmeticae (1801), in which Gauss observes that
Wilson's theorem was first published by Waring. In that excerpt, Gauss
notes that

...neither of them was able to prove the theorem, and Waring
confessed that the demonstration was made more difficult because
no notation can be devised to express a prime number. But in our opinion
truths of this kind should be drawn from the ideas involved rather than
from notations. (Gauss, 1801, article 76; trans. Clarke, 1966.
Dedekind, 1895, quotes the original Latin.)

Dedekind goes on:

When one takes them in the most general sense, a great scientific
thought is expressed in these words, a decision in favor of the
internal [Innerliche], in contrast to the external [Ausserlichen].
This contrast is repeated in almost every area of mathematics; one
need only think of the theory of [Complex] functions, and Riemann's
definition of functions through internal characteristic properties,
from which the external forms of representation necessarily arise.
(Dedekind,1895, Werke vol. 2, 54-55; my translation)

When it comes to making sense of the passages above, it is easier to
say what Dedekind is trying to avoid: definitions of mathematical
objects and systems of objects in terms of symbolic expressions and
methods of acting upon them (for example, in thinking of functions as
given by expressions of a certain sort), and proofs that rely on a
particular choice of representation when many equivalent representations
are available. Saying, in a positive way, how the new methods of
reasoning are supposed to accomplish this takes more effort.
Dedekind's mathematical work, however, provides us with a sense of
what he had in mind. For one thing, his foundational essays (Dedekind
1854, 1872, 1888) all focus on axiomatic characterization of
structures of interest; and in (Dedekind, 1888) there is a proof that
his axiomatization of the natural numbers is categorical, i.e. it
characterizes the structure uniquely, up to isomorphism. We will see
below that this focus on axiomatic characterization is central to his
work. We will also see Dedekind embrace the ability to introduce new
mathematical structures and operations on these structures using
set-theoretic definitions, without concern for the way that the
elements of these structures are to be represented syntactically, and
therefore without providing algorithms that mirror the set-theoretic
operations. Indeed, Dedekind shows no qualms in taking the elements of
a structure to be infinitary objects in their own right, or referring,
in a definition, to the totality of subsets of an infinite domain. It
is clear that Dedekind sees these methods as part of the "conceptual"
approach as well.

It is worth emphasizing that it is not axiomatic methods in and of
themselves that distinguish Dedekind's work from Kronecker's. In
extending Kummer's work on ideal divisors of the cyclotomic integers,
Kronecker (1870) himself gave an early an influential axiomatization of
the notion of a group. Referring to Gauss's classification of binary
quadratic forms, he wrote:

The extremely simple principles on which Gauss's method rests cannot
only be applied in the place mentioned above, but also in many others
and, in particular, already in the most elementary parts of number
theory. This fact suggests, and it is easy to convince oneself of it,
that these principles belong to a more general and more abstract
realm of ideas. It seems therefore to be appropriate to free
the further development of the latter from all inessential
restrictions, so that one is then spared from having to repeat the same
argument in the different cases of application. The advantage comes to
the fore already in the development itself, and the presentation (if it
is given in the most general way possible) thereby gains in simplicity
and clarity, since it clearly exhibits what alone is essential.
(Kronecker,1870, Werke vol. 1, 274-275; trans. Schlimm, 2005; also
translated in Wussing, 1984, 64)

Below we will see Dedekind offer similar pronouncements as to the
simplicity and generality to be gained from algebraic methods. So the
differences between the two do not lie in the use of algebraic notions
per se, but, rather, in the manner in which these notions are employed.

The progression from Dedekind's first theory of ideals to his last
represents a steady transition from Kummer's algorithmic style of
reasoning to a style that is markedly more abstract and set theoretic.
Thus, it is not surprising that Edwards, who laments mathematics'
departure from the explicitly algorithmic styles of Gauss, Kummer, and
Kronecker, judges Dedekind's first version of ideal theory to be his
best (Edwards 1980, 1992). In contrast, Emmy Noether, who inherited the
mantle of structuralism from Dedekind through Hilbert, expressed a
clear preference for the last (see McLarty's contribution to this
volume). Tracing the development of Dedekind's thinking can therefore
help us understand what is at stake.

As noted in the introduction, this essay focuses on Dedekind's point
of view. This point of view was controversial at the time, and since
then there has been a small but committed minority that agrees with
Edwards' contemporary assessment that something important has been lost
in turning away from a more explicit, algorithmic standpoint. Although
it is hard for mathematicians with a modern training to recapture a
nineteenth century algorithmic sensibility, anyone studying the great
works of that century cannot but appreciate the crisp elegance and
efficiency of the associated conceptions. (Edwards' expository texts
(1984, 1990, 1994, 1995, 1996, 2001, 2005) do a fine job of conveying
a feel for this style of thought.) The reader should therefore keep in
mind that in the discussion that follows, alternative, Kroneckerian
points of view are not adequately represented.
[...]
Dedekind's writings show that he is acutely aware of the role that
definitions play in structuring a theory. One finds him concerned with
such issues of systematization as early as 1854, in his Habilitations-
rede. There, he characterizes a process of extending operations like
addition and multiplication to extended domains, whereby one
identifies the laws they satisfy in a restricted domain, and
stipulates that these laws are to maintain their general validity

One reason for this requirement can be found in Dedekind's insistence
that definitions and methods of proof used in an extended domain
should parallel the definitions and methods of proof that have been
effective in more restricted domains. For example, in his presentation
of the theory of higher congruences, he highlights the goal of
introducing concepts in such strong analogy to those of elementary
number theory that only a few words need to be changed in the number-
theoretic proofs" (1857, Werke vol. 1, 40). In his presentations of
ideal theory, he is careful to point out where definitions, basic
characteristics, theorems, and proofs with respect to algebraic
integers and ideals agree with their counterparts for the ordinary
integers, and he seems to enjoy citing parallel developments in
Dirichlet's Lectures wherever he can. The methodological benefits
are clear, since it is often easy and efficient to reuse, adapt,
and extend familiar modes of reasoning.
[...]
The insistence on treating systems of numbers - viewed as either sets
of numbers, or as predicates - as objects in their own right does,
however, have important methodological consequences: it encourages one
to speak of "arbitrary" systems, and allows one to define operations on
them in terms of their behavior as sets or predicates, in a manner
that is independent of the way in which they are represented. For
example, in 1871, Dedekind defines the least common multiple of two
modules to be their intersection, without worrying about how a basis
for this intersection can be computed from bases for the initial
modules. We Find a similar use of nonconstructivity when he
characterizes integral bases as those bases of integers whose
discriminants have the least absolute value; he does this without
giving an algorithm for Finding such a basis or determining this least
discriminant. In fact, in both examples just cited, algorithms can be
obtained. But Dedekind's presentation sends the strong message that
such algorithms are not necessary, i.e. that one can have a fully
satisfactory theory that fails to provide them. This paves the way to
more dramatic uses of nonconstructive reasoning, in which one uses
facts about infinitary functions, sets, and sequences that are false on
an algorithmic interpretation. Such reasoning was used, for example,
by Hilbert, in proving his Basissatz in 1890.

Dedekind's insistence on having an explicit set-theoretic object to
stand as the referent for a mathematical term is coupled with the
following exhortation:

o The definition of new objects should be independent of the way they
are represented, and arguments involving them should not depend on any
particular representation.

That way, the arguments "explain" why calculations with and properties
of the objects do not depend on these choices of representations. This
is why Dedekind, in the passage above, complains that his first
attempt at a theory of ideal divisors did not "allow one to recognize
the invariance the concepts in fact have from the outset."

It is precisely these last two criteria that push Dedekind to the use
of set-theoretic language and methods. In particular, his treatment
commits him to acceptance of the following:

o Infinite systems of numbers can be treated as objects in their own
right, and one can reason about the domain of all such objects.

This view is commonly accepted today, but it was novel to the mathematics
of the 1870's.14 It is striking that Dedekind adopts the use of such
systems without so much as a word of clarification or justification.
That is, he simply introduced a style of reasoning that was to have
decisive effects on future generations, without fanfare. When, in
(1883), Cantor published an introduction to his theory of the
infinite, he invoked considerations from history, philosophy and
theology to assess the legitimacy of the use of completed infinite
totalities, and responded to criticisms, both actual and anticipated.
In (1888), Dedekind did spell out, informally, some of the rules he
took to govern the use of infinite systems. But there he simply
characterized such systems as "objects of our thought" and took this
as sufficient justification of their legitimacy
[...]
Even in this short excerpt, we can discern a number of general claims,
including the following:

o Theorems should be stated at an "appropriate level of generality,"
even though it is often not evident how to do this at the outset.

o An important theorem should have a proof which "relates directly" to
the relevant concepts.

o Sometimes, casting a theorem in a "simpler" or more general form
makes it easier to Find a direct proof.

These can be viewed as calls for a kind of methodological directness,
or purity. In the case at hand, even though Theorems 1-4 above are
easily shown to be equivalent, Dedekind takes Theorems 3 and 4 to be
preferred, because they deal with the more general concept of a module
rather than an ideal. (A module, in Dedekind's terminology, is what we
would call a Z-module, and has only an additive structure; the concept
of an ideal relies on the notion of multiplication as well.)