Reduction of Radicals

dimitris

Dec 3, 2006, 6:40:26 AM

Based on this reference

Cardan Polynomials and the Reduction of Radicals (by T. Osler)

(see also references therein)

(you can download the paper here:
http://www.jstor.org/view/0025570x/di021218/02p0059q/0?currentResult=0025570x%2bdi021218%2b02p0059q%2b0%2c03&searchUrl=http%3A%2F%2Fwww.jstor.org%2Fsearch%2FBasicResults%3Fhp%3D25%26so%3DNewestFirst%26si%3D1%26Query%3DOsler
)

the following expression can be reduced to 1

z = (2 + Sqrt[5])^(1/3) + (2 - Sqrt[5])^(1/3)

Mathematica gives

N[%]
1.9270509831248424 + 0.535233134659635*I

This is because by default it returns a complex number for the cube
root of a negative number

List @@ z
N[%]

{(2 - Sqrt[5])^(1/3), (2 + Sqrt[5])^(1/3)}
{0.30901699437494756 + 0.535233134659635*I, 1.618033988749895}

However defining

mycuberoot[x_] := Block[{w}, w = w /. Solve[w^3 == 1][[3]];
  If[Re[x] < 0, w*x^(1/3), x^(1/3)]]

Then

{2 - Sqrt[5], 2 + Sqrt[5]}
mycuberoot /@ %
FullSimplify[%]
Together[Plus @@ %]

{2 - Sqrt[5], 2 + Sqrt[5]}
{(-1)^(2/3)*(2 - Sqrt[5])^(1/3), (2 + Sqrt[5])^(1/3)}
{(1/2)*(1 - Sqrt[5]), (1/2)*(1 + Sqrt[5])}
1

Is there a particular reason why Mathematica by default returns a
complex number for the cube root of a negative number, or is it a matter
of choice?

Following the same procedure I prove that

(10 + 6*Sqrt[3])^(1/3) + (10 - 6*Sqrt[3])^(1/3)

is equal to 2. Indeed

{10 + 6*Sqrt[3], 10 - 6*Sqrt[3]}
mycuberoot /@ %
FullSimplify[%]
Together[Plus @@ %]

{10 + 6*Sqrt[3], 10 - 6*Sqrt[3]}
{(10 + 6*Sqrt[3])^(1/3), (-1)^(2/3)*(10 - 6*Sqrt[3])^(1/3)}
{1 + Sqrt[3], 1 - Sqrt[3]}
2

Does this behavior of Mathematica affect simplifications by, e.g.,
RootReduce?

I must admit that I have gaps in my knowledge of these symbolic aspects
(I became interested in them after trying to solve the secular
Rayleigh equation), so I hope more experienced members of the forum will
forgive any mistakes of mine!

Anyway, I don't understand this difference in the treatment of nested
radicals between the literature and Mathematica.

I would really appreciate any insight, guidance, or comments.

Regards
Dimitris

José Carlos Santos

Dec 4, 2006, 6:56:51 AM

dimitris wrote:

> Is there a particular reason why Mathematica by default returns a
> complex number for the cube root of a negative number, or is it a matter
> of choice?

Mathematica gives you the principal value of the cube root function.
In the case of negative numbers, it turns out that it is not a real
number.
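
To make the two conventions concrete, here is a minimal sketch. The helper name realCubeRoot is hypothetical (it is not a built-in), and it assumes its argument is an exact real number.

```mathematica
(* Principal value: a negative base gives a complex cube root *)
N[(-8)^(1/3)]            (* approximately 1. + 1.73205 I *)

(* Hypothetical helper: the real cube root of a real number *)
realCubeRoot[x_] := Sign[x]*Abs[x]^(1/3)

realCubeRoot[-8]         (* -2 *)

(* The expression from the original question, with real cube roots *)
N[realCubeRoot[2 + Sqrt[5]] + realCubeRoot[2 - Sqrt[5]]]   (* approximately 1. *)
```

With the real-root convention the sum is 1, as in the literature; with the principal value it is the complex number shown in the original post.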

Best regards,

Jose Carlos Santos

Andrzej Kozlowski

Dec 4, 2006, 6:57:52 AM

The issue of which branch of a multivalued function should be chosen
as the so called "principal branch" is, of course, a matter of
convention. Since Mathematica defines Power[x,y] as Exp[y Log[x]],
the issue of what is (-1)^(1/3) is equivalent to choosing the value
of Log[-1]. Mathematica chooses the value

Log[-1]

I*Pi

Of course once that is decided, everything else follows:

FullSimplify[(-1)^(1/3) - Exp[(1/3)*I*Pi]]
0

It seems to me (though it is not something that lies within the scope
of my "professional" interest) that before the advent of computer
algebra no need was felt for a uniform way of choosing
principal values for the various multivalued functions that occur in
algebra and analysis. In other words, the relation

x^y = Exp[y Log[x]]

was not treated as the definition of x^y, but as a relation that held
only up to the choice of branches of the multivalued functions
involved. So it seems to me that it was always thought that the
natural choice for Log[-1] is I Pi, but before computer algebra
systems appeared it was not necessarily felt that the "principal
value" of x^(1/3) is the one that makes x^y = Exp[y Log[x]] hold.
Note that to keep this relation true and to have the principal value
of the cube root of -1 equal to -1, one would have to choose 3 I Pi
as the principal value of Log[-1], which does not seem very natural.
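
A small sketch of this last remark, enumerating a few branches of the logarithm of -1. The helper logBranch is a hypothetical name introduced only for illustration.

```mathematica
(* Every value of the logarithm of -1 has the form I Pi + 2 Pi I k, k an integer *)
logBranch[k_Integer] := I Pi + 2 Pi I k     (* hypothetical helper *)

(* Cube roots of -1 obtained as Exp[(1/3) log(-1)] on the branches k = 0, 1, 2 *)
N[Table[Exp[logBranch[k]/3], {k, 0, 2}]]

(* k = 0 is the principal branch; k = 1, i.e. the log value 3 I Pi, yields -1 *)
Exp[logBranch[1]/3]      (* -1 *)
```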

RootReduce is somewhat different. Given an algebraic number, it first
tries to find its minimal polynomial. This does not depend on any
particular representation of the algebraic number. After that, it
expresses the answer in terms of roots of that minimal polynomial,
using root isolation and the associated ordering. This is something
that can be done in several ways, with Mathematica alone having two
different ordering methods. Other systems use other methods, so the
order of roots may not be the same for different systems
(although the real roots will always correspond).

It is only when algebraic numbers are expressed in terms of radicals
that conventions about "principal branches" come into play. I expect
that all symbolic algebra systems use the same convention based on
the definition x^y = Exp[y Log[x]], but I have not tried to check
this. My own view is that radical representations of algebraic
numbers should now be considered as belonging only to the history of
mathematics.

Best regards

Andrzej

Bob Hanlon

Dec 4, 2006, 7:00:54 AM

Search the archives for "branch cut" and/or "principal value"

z==(2 + Sqrt[5])^(1/3) + (2 - Sqrt[5])^(1/3);

#^3&/@(#-(2 + Sqrt[5])^(1/3)&/@%)

(-(2 + Sqrt[5])^(1/3) + z)^3 == 2 - Sqrt[5]

Reduce[%,z,Reals]//ToRules

{z -> 1}

z==(10 + 6*Sqrt[3])^(1/3) + (10 - 6*Sqrt[3])^(1/3)

z == (10 - 6*Sqrt[3])^(1/3) + (10 + 6*Sqrt[3])^(1/3)

#^3&/@(#-(10 + 6*Sqrt[3])^(1/3)&/@%)

(-(10 + 6*Sqrt[3])^(1/3) + z)^3 == 10 - 6*Sqrt[3]

Reduce[%,z,Reals]//ToRules

{z -> 2}

Bob Hanlon

Murray Eisenberg

Dec 5, 2006, 6:17:17 AM

That Mathematica gives the value I Pi for Log[-1] is consistent with the
most common convention, namely that the principal argument, Arg, of a
nonzero complex number z satisfies -Pi < Arg[z] <= Pi.

Then the usual definition of the principal logarithm, Log, is

Log[z] = Log[Abs[z]] + I Arg[z],

and the multi-valued argument function, arg, would be given as:

arg[z] = set of all Arg[z] + 2 n Pi (n an integer)

The multi-valued logarithm, log, would be given as

log[z] = Log[Abs[z]] + I arg[z].

In this case one can define

z^w = Exp[w log[z]]

and the principal value of this as Exp[w Log[z]].
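
These definitions can be sketched directly. The names principalLog and principalPower are hypothetical and only mirror the formulas above; they are not built-ins.

```mathematica
(* Hypothetical helpers mirroring the definitions above *)
principalLog[z_] := Log[Abs[z]] + I Arg[z]        (* relies on -Pi < Arg[z] <= Pi *)
principalPower[z_, w_] := Exp[w*principalLog[z]]

principalLog[-1]                                  (* I Pi *)

(* Compare with the built-in Power; the difference is numerically zero *)
N[principalPower[-8, 1/3] - (-8)^(1/3)]
```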

Andrzej Kozlowski wrote:
> ...

Andrzej Kozlowski

Dec 6, 2006, 6:18:33 AM

Of course this is completely true. But perhaps it has somewhat
obscured the point I was trying to make, which was meant to answer
Dimitris's question: "why is Mathematica's convention about principal
values at variance with that of the classical algebraists?" It seems to
me that the answer is simply that they (the classical algebraists and
perhaps some of their modern successors) did not care about the
relation with "principal values" of multivalued functions in analysis,
and did not even know about them (Vieta, for one, lived a whole
century before Euler). Even in certain modern texts on algebra the
relationship between radicals and logarithms is not relevant, so it
is not surprising that some authors may even today find it convenient
to use different "principal values" when working in a
purely algebraic setting. The point I was making about computer
algebra programs like Mathematica is that, with their universal scope,
they have to be consistent across both algebra and analysis in a way
that a book on, say, Galois theory need not be.

Andrzej Kozlowski

Murray Eisenberg

Dec 6, 2006, 6:22:35 AM

No disagreement!

Daniel Lichtblau

Dec 7, 2006, 6:39:31 AM

The use of principal roots has received some replies but since I liked
the Osler article above and wanted to comment I thought I'd revisit.

(1) As pointed out by others (A. Kozlowski, M. Eisenberg) use of
principal values for fractional roots means, among other things, that
Power can be defined in terms of Log, and they can share a branch cut.
This is useful in and of itself (try figuring out jumps in definite
integration with a proliferation of functions having unrelated branch
cuts).

Another reason to like the definition a^b==Exp[b*Log[a]] is that for r>0
it makes f[x_]=(-r)^x differentiable in x. With a choice of negative
roots for x equal to 1/n, n an odd integer, such a function would fail
even to be continuous.

Another nice feature is that it becomes simple to recover "surds" (that
is, the full set of values for a radical a^(1/n)) simply by taking the
principal value and multiplying by powers of the principal nth root of
unity. Were we to have (-1)^(1/3), say, be simply -1, then one would be
forced to use explicit complex exponentials instead of root-of-unity
radicals in order to attain the principal value. But having (-1)^(1/3)
be the principal value means we can easily attain other roots such as -1
by multiplying by appropriate powers of this root of unity.
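
As a rough sketch of this recipe: the helper allRoots below is a hypothetical name, not a built-in. It builds every value of a^(1/n) from the principal value and powers of the principal nth root of unity (-1)^(2/n).

```mathematica
(* Hypothetical helper: all n values of a^(1/n), built from the principal value *)
allRoots[a_, n_Integer?Positive] := Table[a^(1/n)*(-1)^(2 k/n), {k, 0, n - 1}]

cubeRootsOfMinusOne = allRoots[-1, 3];
N[cubeRootsOfMinusOne]                        (* the k = 1 entry is the real root -1 *)

(* Sanity check: every entry cubes to -1 *)
Union[FullSimplify[cubeRootsOfMinusOne^3]]    (* {-1} *)
```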

(2) Osler's paper discusses some ways to reduce certain radicals to
simpler forms. This is a special case of radical denesting. In this case
one can use polynomial algebra techniques coupled with a selection
procedure to remove parasite roots.

One example uses something resembling

(2+Sqrt[5])^(1/3) + (2-Sqrt[5])^(1/3)

EXCEPT with the convention that the cube root of a negative number is
negative rather than the principal value. To find the desired value one
might make new variables for radicals and polynomials to define them (in
effect giving the surds, or full sets of values), eliminate all variables
other than the one representing the value of interest, and then find the
root of the resulting polynomial (in that remaining variable) that lies in
the region of interest. For this example we might let

x=(2+Sqrt[5])^(1/3)

with defining polynomial

x^3-(2+z) (where z is given by z^2-5). Letting y stand for the second
cube root and t be the value of interest, and continuing in this way, we
would get

polys = {t-(x+y), x^3-(2+z), y^3-(2-z), z^2-5};

Now form a Groebner basis eliminating all but t, and solve for t.

roots = t /. Solve[First[GroebnerBasis[polys,t,{x,y,z}]]==0, t];

Last, select the root that is real valued.

In[36]:= Select[roots, Element[#,Reals]&]
Out[36]= {1}
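
A hand cross-check of this result, as a sketch under the real-cube-root convention. It does not reproduce the Groebner basis computation; it only follows the single real branch x*y == -1.

```mathematica
(* With t = x + y we have t^3 = x^3 + y^3 + 3 x y t.                      *)
(* Here x^3 + y^3 = (2 + z) + (2 - z) = 4 and (x y)^3 = 4 - z^2 = -1.     *)
(* Real cube roots force x y = -1, hence t^3 + 3 t - 4 == 0.              *)
Factor[t^3 + 3 t - 4]                   (* (-1 + t) (4 + t + t^2) *)

(* The quadratic factor has negative discriminant, so t == 1 is the only real root *)
Reduce[t^3 + 3 t - 4 == 0, t, Reals]    (* t == 1 *)
```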

Of course we could use direct built in functionality, provided we first
translate the expression as per note (1) above so that we are indeed
getting the negative root for the second summand. This summand thus
becomes (-1)^(2/3)*(2-Sqrt[5])^(1/3)

and we do

In[37]:= RootReduce[(2+Sqrt[5])^(1/3) + (-1)^(2/3)*(2-Sqrt[5])^(1/3)]
Out[37]= 1

The last example in the paper is a bit more complicated but can be
handled in exactly the same ways. It was from the dedication of a 1997
paper, commemorating an anniversary of the birth of Ramanujan.

const = (32*(146410001/48400)^3 - 6*(146410001/48400));
polys = {t^6-(const+b), b^2-(const^2-1)};
roots = t /. Solve[First[GroebnerBasis[polys,t,b]]==0, t];

Now grab any root real and larger than 1.

In[41]:= Select[roots, Element[#,Reals]&&#>1&]
Out[41]= {110}
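
A direct cross-check, assuming (as the defining polynomials above suggest) that the radical being reduced is (const + Sqrt[const^2 - 1])^(1/6). The variable a is introduced here only for readability.

```mathematica
a = 146410001/48400;      (* equals (110^4 + 1)/(4*110^2) *)
const = 32 a^3 - 6 a;

(* Expected to reduce to 110, since const + Sqrt[const^2 - 1] equals 110^6 *)
RootReduce[(const + Sqrt[const^2 - 1])^(1/6)]
```

The reason is that 4*a = u + 1/u with u = 110^2, so const = (u^3 + u^-3)/2 and Sqrt[const^2 - 1] = (u^3 - u^-3)/2, whose sum is u^3.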

(3) Osler defines and uses "Cardan polynomials" to do radical reduction.
I cannot help but notice* that these have interesting combinatorial,
algebraic, and analytic properties, a few of which I'll describe. First
his definition:

Ca[n_,x_,y_] := Expand[2*y^(n/2)*ChebyshevT[n,x/(2*Sqrt[y])]]

For example:

In[45]:= InputForm[Ca[9,x,y]]
Out[45]= x^9 - 9*x^7*y + 27*x^5*y^2 - 30*x^3*y^3 + 9*x*y^4

We'll work with a closely related family wherein we take absolute values
of coefficients and also add a term y^n.

Da[n_,x_,y_] := Expand[-I^n*Ca[n,I*x,y]+y^n]

In[47]:= InputForm[Da[9,x,y]]
Out[47]= x^9 + 9*x^7*y + 27*x^5*y^2 + 30*x^3*y^3 + 9*x*y^4 + y^9

Finally we'll want to define some very familiar polynomials.

Ru[n_,x_,y_] := Expand[(x+y)^n]

In[49]:= InputForm[Ru[9,x,y]]
Out[49]=
x^9 + 9*x^8*y + 36*x^7*y^2 + 84*x^6*y^3 + 126*x^5*y^4 + 126*x^4*y^5 +
84*x^3*y^6 + 36*x^2*y^7 + 9*x*y^8 + y^9

Note that Ru[n,x,y] has coefficients from the nth row of the Pascal
triangle.

(A) If one looks at rows of coefficients from Da[n,x,y] as n increases
one sees an asymmetric triangle. For example, the 11th row would be

1 11 44 77 55 11 1

We can derive this from the 11th row of the Pascal triangle using a
one-sided differencing. That is, for the kth element of the mth row,
we'll take absolute value of alternating sums from the prior row, up to
the kth element.

Pa[n_,k_,0] := Binomial[n,k]
Pa[n_,k_,m_] := Abs[Sum[(-1)^(j-1)*Pa[n,j,m-1],{j,m,k}]]

Now let's look at a table of these values, for n=11.

assymmetrictable[n_] := Table[Pa[n,k,j], {j,0,(n-1)/2}, {k,0,n-1}]

In[146]:= InputForm[assymmetrictable[11]]
Out[146]//InputForm=
{{1, 11, 55, 165, 330, 462, 462, 330, 165, 55, 11},
{0, 11, 44, 121, 209, 253, 209, 121, 44, 11, 0},
{0, 0, 44, 77, 132, 121, 88, 33, 11, 0, 0},
{0, 0, 0, 77, 55, 66, 22, 11, 0, 0, 0},
{0, 0, 0, 0, 55, 11, 11, 0, 0, 0, 0},
{0, 0, 0, 0, 0, 11, 0, 0, 0, 0, 0}}

We have recovered the nontrivial "middle" coefficients of Da[11,x,y] (the
two "end" coefficients are simply unity) by taking the first nonzero
element of each row m = 1, ..., 5: 11, 44, 77, 55, 11.

There are nice combinatorial formulas for these coefficients; I wanted
to illustrate an algorithmic approach to recovering them from the
binomial coefficients of the Pascal triangle.

(B) The maps Ru[n,x,y] and Da[n,x,y] each possess a group invariance
property. And, returning to the original intent of the thread, these
involve multiplication by principal roots of unity. For Ru[n,x,y] we
have the map given by the matrix rumat = {{e,0},{0,e}} where
e=(-1)^(2/n) is the principal nth root of unity.

This gives rise to a group action {z,w} --> {e*z,e*w}.

For Da[n,x,y] we use damat = {{e,0},{0,e^2}}

giving the group action {z,w} --> {e*z,e^2*w}. So let's define these
actions explicitly.

groupAction[n_,m_,expr_,x_,y_] := expr /.
{x->x*(-1)^(2/n),y->y*(-1)^(2*m/n)}

Now we'll check the claimed invariance properties for n=9.

In[156]:= InputForm[groupAction[9,1,Ru[9,x,y],x,y]]
Out[156]//InputForm=
x^9 + 9*x^8*y + 36*x^7*y^2 + 84*x^6*y^3 + 126*x^5*y^4 + 126*x^4*y^5 +
84*x^3*y^6 + 36*x^2*y^7 + 9*x*y^8 + y^9

This is indeed simply Ru[9,x,y]

In[157]:= InputForm[groupAction[9,2,Da[9,x,y],x,y]]
Out[157]//InputForm=
x^9 + 9*x^7*y + 27*x^5*y^2 + 30*x^3*y^3 + 9*x*y^4 + y^9

which again is just Da[9,x,y]

(C) It turns out that both Ru[n,x,y] and Da[n,x,y] map the line x+y=1
to 1. That is, if we replace y by 1-x the polynomials will evaluate to
unity.

For example:

In[161]:= Ru[9,x,1-x]
Out[161]= 1

In[162]:= Da[9,x,1-x]
Out[162]= 1

The case of Ru[n,x,1-x] should not be a surprise. After all Ru[n,x,y] is
simply (x+y)^n and this is of course 1 on x+y=1. That Da[n,x,1-x]=1 is a
bit more subtle.
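
A quick way to probe the claim for several odd n at once is sketched below. The definitions of Ca and Da are repeated from above so that the snippet stands alone, and the range of n is an arbitrary choice for illustration.

```mathematica
Ca[n_, x_, y_] := Expand[2*y^(n/2)*ChebyshevT[n, x/(2*Sqrt[y])]]
Da[n_, x_, y_] := Expand[-I^n*Ca[n, I*x, y] + y^n]

(* Restrict Da[n, x, y] to the line y == 1 - x for odd n = 1, 3, ..., 15 *)
Table[Expand[Da[n, x, y] /. y -> 1 - x], {n, 1, 15, 2}]
(* each entry should be 1 if the claim above holds *)
```

The smallest nontrivial case can be done by hand: Da[3,x,y] = x^3 + 3*x*y + y^3, and substituting y = 1-x makes all the x-dependent terms cancel, leaving 1.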

Also: all polynomial maps with this property that are invariant under
the action of groupAction[n,1,...] can be obtained from straightforward
operations that amount to "tensoring", and inversion thereof, of this
basic polynomial Ru[n,x,y]. Similarly, all polynomial maps that take
x+y==1 to 1 and are invariant under groupAction[n,2,...] are obtained
from such operations on Da[n,x,y].

Clearly (he said,) there are no other 2x2 finite matrix group
representations for which there are invariant polynomials taking x+y=1
to 1 (except for those obviously equivalent to damat, as these can each
be represented in two ways).

Anyone else catch these?*

Daniel Lichtblau
Wolfram Research

*Just trolling. This stuff is far from obvious but was familiar from a
previous lifetime. "Pa" is for Pascal, "Ru" for Rudin, "Da" for
D'Angelo. Obviously Rudin's polynomials predate Rudin; his contribution
was to note that they both map the unit ball in C^2 to higher
dimensional balls (analogous to those polynomials taking x+y=1 to 1),
and satisfy a group invariance property.

As noted above, the Cardan polynomials from Osler's article are
D'Angelo's but with one term dropped and signs alternating. For odd n
D'Angelo had defined them in an article from 1988, showing how they give
maps from the unit ball in C^2 with properties similar to Rudin's, but
working with a less trivial group. He also conjectured no other such
groups could be used in this way, either for C^2 or higher dimension
domains. Some possible matrix groups had been ruled out around that time
by F. Forstneric. Ruling out the rest was interesting work, and I think
it's all covered in D'Angelo's book "Several Complex Variables and the
Geometry of Real Hypersurfaces". The cases of n even-valued were used in
an article by D'Angelo from the 90's to discuss mappings from the ball
to hyperquadrics that have similar invariance properties. We do not know
if these polynomials were defined in any context prior to that. Various
combinatorial properties of the D'Angelo polynomials appear in a number
of his articles. The connection of these to Chebyshev polynomials is
observed in an American Mathematical Monthly article by Dilcher and
Stolarsky (October 2005). That all polynomial maps invariant under that
groupAction[n,2,...] and taking x+y=1 to 1 can be obtained from
Da[n,x,y] was proved in a hammock.
