Bug in abs(I*x).diff(x)


Ondřej Čertík

Nov 13, 2014, 12:19:07 AM
to sage-...@googlegroups.com
Hi,

With Sage 6.3, I am getting:

sage: abs(x).diff(x)
x/abs(x)
sage: abs(I*x).diff(x)
-x/abs(I*x)

But abs(I*x) == abs(x). So abs(x).diff(x) and abs(I*x).diff(x)
must also be the same. But in the first case we get x/abs(x), and in the
second we get -x/abs(x).

In SymPy, the answer is:

In [1]: abs(x).diff(x)
Out[1]: (re(x)*Derivative(re(x), x) + im(x)*Derivative(im(x), x))/Abs(x)


In [2]: x = Symbol("x", real=True)

In [3]: abs(x).diff(x)
Out[3]: sign(x)

In [4]: abs(I*x).diff(x)
Out[4]: sign(x)

In [26]: var("x")
Out[26]: x

Which all seems correct --- in the complex case [1] we get a slightly
messy expression, but a correct one. For the real case, we get the
correct answer.

In Wolfram Alpha, the answer of abs(I*x).diff(x) is x/abs(x):

http://www.wolframalpha.com/input/?i=Diff%5BAbs%5Bi*x%5D%2C+x%5D

Which is only correct for real "x", but at least it is correct for
this special case.

The Sage result seems wrong for any "x".

Ondrej
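The discrepancy is easy to confirm numerically; a minimal plain-Python sketch (the helper `num_diff` is illustrative, not Sage API) using a central finite difference:

```python
def num_diff(f, x, h=1e-6):
    # central finite difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2.0 * h)

# For real x, d|i*x|/dx should equal x/|x| = sign(x),
# the same as d|x|/dx -- not the negated value Sage returns.
d_pos = num_diff(lambda t: abs(1j * t), 2.0)
d_neg = num_diff(lambda t: abs(1j * t), -2.0)
print(d_pos, d_neg)  # approximately 1.0 and -1.0
```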

Clemens Heuberger

Nov 13, 2014, 2:18:02 AM
to sage-...@googlegroups.com

possibly related to http://trac.sagemath.org/ticket/12588 ?

Regards, CH
--
Univ.-Prof. Dr. Clemens Heuberger Alpen-Adria-Universität Klagenfurt
Institut für Mathematik, Universitätsstraße 65-67, 9020 Klagenfurt, Austria
Tel: +43 463 2700 3121 Fax: +43 463 2700 99 3121
clemens....@aau.at http://wwwu.aau.at/cheuberg

Ondřej Čertík

Nov 13, 2014, 10:29:41 AM
to sage-...@googlegroups.com
Yes. Note also here:

http://mathworld.wolfram.com/AbsoluteValue.html

which says that the complex derivative d|z|/dz does not exist, as the
Cauchy-Riemann equations do not hold for Abs(z). And:

"As a result of the fact that computer algebra programs such as
Mathematica generically deal with complex variables (i.e., the
definition of derivative always means complex derivative), d|z|/dz
correctly returns unevaluated by such software."

So perhaps SymPy and Sage should return Derivative(Abs(x), x)
unevaluated. Only when the user specifies that "x" is real can we differentiate.

Ondrej

Bill Page

Nov 13, 2014, 12:16:36 PM
to sage-devel
It has always seemed very inconvenient to me that "computer algebra
programs such as Mathematica" choose to define derivative as
complex-derivative. I believe a reasonable alternative is what is
known as a Wirtinger derivative. Wirtinger derivatives exist for all
continuous complex-valued functions including non-holonomic functions
and permit the construction of a differential calculus for functions
of complex variables that is analogous to the ordinary differential
calculus for functions of real variables:

http://en.wikipedia.org/wiki/Wirtinger_derivatives

Wirtinger derivatives come in conjugate pairs, but we have

f(x).diff(conjugate(x)) = conjugate(conjugate(f(x)).diff(x))

so we really only need one derivative given an appropriate conjugate
function. The Cauchy-Riemann equations reduce to

f(x).diff(conjugate(x)) = 0

I also like that abs is related to the sgn function

abs(x).diff(x) = x/abs(x)

This is consistent with

abs(x)=sqrt(x*conjugate(x))

The Wirtinger derivative of abs(x) is 1/2 x/abs(x). Its total
Wirtinger derivative is x/abs(x).

I have implemented conjugate and Wirtinger derivatives in FriCAS

http://axiom-wiki.newsynthesis.org/SandBoxWirtinger

Unfortunately I have not yet been able to convince the FriCAS
developers of the appropriateness of this approach. I would be happy
to find someone with whom to discuss this further, pro and con. The
discussion on the FriCAS email list consisted mostly of the related
proper treatment of conjugate without making explicit assumptions
about variables.

Regards,
Bill Page.

Bill Page

Nov 13, 2014, 12:51:40 PM
to sage-devel
On 13 November 2014 12:16, Bill Page <bill...@newsynthesis.org> wrote:
>
> The Wirtinger derivative of abs(x) is 1/2 x/abs(x). Its total
> Wirtinger derivative is x/abs(x).
>

Sorry, I should have written that the Wirtinger derivative of abs(x) is

1/2 conjugate(x)/abs(x)

Bill.

maldun

Nov 13, 2014, 2:47:03 PM
to sage-...@googlegroups.com
We had a similar problem with the complex derivative of logarithms in combination with the complex conjugate, where the
use of Wirtinger operators would also solve the problem: https://groups.google.com/forum/?hl=en#!topic/sage-support/bEMPMEYeZKU

Having them in Sage would be a great achievement!

Although this makes some sense in complex analysis, one should be careful with 'deriving' the absolute value, since
it results in the weak derivative ( http://en.wikipedia.org/wiki/Weak_derivative ), which is, in a broader sense, the derivative in the distribution sense.
Thus we have infinitely many possible derivatives.

With this expression it is implicitly forbidden to assign a specific value at zero:

sage: f = abs(x).diff(x)
sage: f(x=0)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-14-615bbebcb37c> in <module>()
----> 1 f(x=Integer(0))

/home/maldun/sage/sage-6.3/local/lib/python2.7/site-packages/sage/symbolic/expression.so in sage.symbolic.expression.Expression.__call__ (build/cythonized/sage/symbolic/expression.cpp:21933)()

/home/maldun/sage/sage-6.3/local/lib/python2.7/site-packages/sage/symbolic/ring.so in sage.symbolic.ring.SymbolicRing._call_element_ (build/cythonized/sage/symbolic/ring.cpp:8493)()

/home/maldun/sage/sage-6.3/local/lib/python2.7/site-packages/sage/symbolic/expression.so in sage.symbolic.expression.Expression.substitute (build/cythonized/sage/symbolic/expression.cpp:21183)()

ValueError: power::eval(): division by zero



This is in some sense good, since we don't have to care about the derivative at zero,
but in another sense it is not so good: the subdifferential ∂abs(0) = [0,1] is bounded, yet with this definition one could come to the false conclusion that abs(x)
has a pole, although by taking limits one can easily see that it should be bounded at zero.

For symbolic purposes, of course, one could live with this.

maldun

Nov 13, 2014, 2:51:48 PM
to sage-...@googlegroups.com
The only clean solution for this behaviour would be a warning e.g: "Warning: This Identity holds only almost everywhere!"
But I don't know if it's worth the effort ...

maldun

Nov 13, 2014, 3:04:14 PM
to sage-...@googlegroups.com

> This is in some sense good, since we don't have to care about the derivative at zero,
> but in another sense it is not so good, since the subdifferential ∂abs(0) = [0,1] is bounded and with this definition one could come to the false conclusion that abs(x)
> has a pole, although by taking limits one can easily see that it should be bounded at zero.

Sorry, I meant ∂abs(0) = [-1,1] ...

And another thing to add: I think the only clean solution could be a warning like: "Warning: This is not a derivative in the classical sense!"
But I don't know if this is really worth the effort ... 

Ondřej Čertík

Nov 13, 2014, 4:00:24 PM
to sage-...@googlegroups.com
Hi Bill,
Thanks for your email! I haven't talked to you in a long time.
Literally just today I learned about Wirtinger derivatives. The
Wikipedia page is *really* confusing to me. It took me a while to
realize that the Wirtinger derivative is simply the derivative with
respect to z or conjugate(z). I.e.

z = x + i*y
conjugate(z) = x - i*y

From this it follows:

x = 1/2*(z + conjugate(z))
y = i/2*(-z+conjugate(z))

Then I take any function and write it in terms of z and conjugate(z),
some examples:

|z| = sqrt(z*conjugate(z))
Re z = x = 1/2 * (z + conjugate(z))
z^2 = (x+i*y)^2

And then I simply differentiate with respect to z or conjugate(z).
This is called the Wirtinger derivative. So:

d|z|/dz = d sqrt(z*conjugate(z)) / dz = 1/2*conjugate(z) / |z|
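That computation can be spot-checked with the coordinate form of the Wirtinger operator, d/dz = (1/2)*(d/dx - i*d/dy); a small plain-Python sketch (central finite differences, illustrative names):

```python
def wirtinger(f, z, h=1e-6):
    # d f / d z = (1/2) * (df/dx - i * df/dy), via central differences
    dfdx = (f(z + h) - f(z - h)) / (2 * h)
    dfdy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (dfdx - 1j * dfdy)

z0 = 1 + 2j
lhs = wirtinger(abs, z0)
rhs = z0.conjugate() / (2 * abs(z0))
print(lhs, rhs)  # the two agree to high precision
```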

As you said, the function is analytic if it doesn't functionally
depend on conjugate(z), as can be shown easily. So |z| or Re z are not
analytic, while z^2 is. If the function is analytic, then df/d
conjugate(z) = 0, and df/dz is the complex derivative. Right?

So for analytic functions, Wirtinger derivative gives the same answer
as Mathematica. For non-analytic functions, Mathematica leaves it
unevaluated, but Wirtinger derivative gives you something.

How do you calculate the total Wirtinger derivative? How is that defined?

Because I would like to get

d|x| / d x = x / |x|

for real x. And I currently don't see how this formula is connected to
Wirtinger derivatives. Finally, the derivative operator in a CAS could
return Wirtinger derivatives --- I think it's a great idea, if somehow
we can recover the usual formula for abs(x) with real "x".

What are the cons of this approach?

Ondrej

Ondřej Čertík

Nov 13, 2014, 7:24:05 PM
to sage-...@googlegroups.com
To elaborate on this point: if the function has a complex derivative
(i.e. it is analytic), then the complex derivative
f'(z) = \partial f / \partial x. So to calculate a complex derivative
with respect to z=x+i*y, we just need to differentiate with respect to x.

It can be shown that the Wirtinger derivative df/dz is equal to
\partial f / \partial x for analytic functions,
i.e. when df/d conjugate(z) = 0.

So in a CAS, we can simply define the derivative f'(z) as \partial f /
\partial x for any function, even if it doesn't have a complex
derivative.
For any function we can show that:

\partial f / \partial x = d f / d z + d f / d conjugate(z)

Bill, is this what you call the "total Wirtinger derivative"?

For example, for |z| we get:

|z|' = \partial |z| / \partial x = d |z| / d z + d |z| / d
conjugate(z) = conjugate(z) / (2*|z|) + z / (2*|z|) = Re(z) / |z|

Using our definition, this holds for any complex "z". Then, if "z" is
real, we get:

|z|' = z / |z|

Which is exactly the usual real derivative. Bill, is this what you had
in mind? That a CAS could return the derivative of abs(z)
as Re(z) / abs(z) ?

Ondrej
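Numerically, the total Wirtinger derivative is just the x-partial, and for |z| it indeed comes out as Re(z)/|z|; a quick plain-Python sketch (illustrative helper name):

```python
def ddx(f, z, h=1e-6):
    # partial derivative in x, where z = x + i*y; numerically this is
    # the "total Wirtinger derivative" d/dz + d/d conjugate(z)
    return (f(z + h) - f(z - h)) / (2 * h)

z0 = 3 - 4j            # |z0| = 5
total = ddx(abs, z0)
print(total, z0.real / abs(z0))  # both approximately 0.6
```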

Bill Page

Nov 13, 2014, 8:12:55 PM
to sage-devel
On 13 November 2014 19:24, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Thu, Nov 13, 2014 at 2:00 PM, Ondřej Čertík <ondrej...@gmail.com> wrote:
>>
>> As you said, the function is analytic if it doesn't functionally
>> depend on conjugate(z), as can be shown easily. So |z| or
>> Re z are not analytic, while z^2 is. If the function is analytic,
>> then df/d conjugate(z) = 0, and df/dz is the complex derivative.
>> Right?
>

Yes. In my email I notice that I wrote "holonomic" but what I meant
was "holomorphic". Complex-analytic functions are holomorphic and
vice-versa.

> ...
> So in a CAS, we can simply define the derivative f'(z) as
> \partial f / \partial x for any function, even if it doesn't have a
> complex derivative.

Yes.

> For any function we can show that:
>
> \partial f / \partial x = d f / d z + d f / d conjugate(z)
>
> Bill, is this what you call the "total Wirtinger derivative"?
>

Yes

> For example, for |z| we get:
>
> |z|' = \partial |z| / \partial x = d |z| / d z + d |z| / d
> conjugate(z) = conjugate(z) / (2*|z|) + z / (2*|z|) = Re(z) / |z|
>
> Using our definition, this holds for any complex "z". Then, if "z" is
> real, we get:
>
> |z|' = z / |z|
>
> Which is exactly the usual real derivative. Bill, is this what you had
> in mind? That a CAS could return the derivative of abs(z)
> as Re(z) / abs(z) ?
>
> Ondrej
>
>>
>> So for analytic functions, Wirtinger derivative gives the same answer
>> as Mathematica. For non-analytic functions, Mathematica leaves it
>> unevaluated, but Wirtinger derivative gives you something.
>>
>> How do you calculate the total Wirtinger derivative? How is that defined?
>>
>> Because I would like to get
>>
>> d|x| / d x = x / |x|
>>
>> for real x. And I don't see currently how is this formula connected to
>> Wirtinger derivatives. Finally, the derivative operator in a CAS could
>> return Wirtinger derivatives, I think it's a great idea, if somehow we
>> can recover the usual formula for abs(x) with real "x".
>>
>> What are the cons of this approach?
>>
>> Ondrej
>

Bill Page

Nov 13, 2014, 8:56:24 PM
to sage-devel
Sorry, I hit send before I was quite ready. To continue ...

On 13 November 2014 19:24, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Thu, Nov 13, 2014 at 2:00 PM, Ondřej Čertík <ondrej...@gmail.com> wrote:
> ...
> For example, for |z| we get:
>
> |z|' = \partial |z| / \partial x = d |z| / d z + d |z| / d
> conjugate(z) = conjugate(z) / (2*|z|) + z / (2*|z|) = Re(z) / |z|
>
> Using our definition, this holds for any complex "z". Then, if "z"
> is real, we get:
>
> |z|' = z / |z|
>
> Which is exactly the usual real derivative. Bill, is this what you
> had in mind? That a CAS could return the derivative of abs(z)
> as Re(z) / abs(z) ?
>

Yes, exactly. I think a question might arise whether we should treat
conjugate or Re as elementary.

>> ...
>> What are the cons of this approach?
>>

First, care needs to be taken to properly extend the chain rule to
include the conjugate Wirtinger derivative where necessary.

Second, in principle problems can arise when defining a test for
constant functions. For example this is necessary as part of
rewriting expressions in terms of the smallest number of elementary
functions (normalize) as a kind of zero test for expressions in
FriCAS/Axiom. Usually we assume that

df(x)/dx = 0

is necessary and sufficient for f to be a constant function. But
requiring that the total derivative

d f / d z + d f / d conjugate(z) = 0

is not what we mean by constant. In fact it seems to be an open
question whether Richardson's theorem can be extended to include
conjugate as an elementary function in such a way that the zero test
is still computable. This is the last point of discussion on the
FriCAS email list.

Bill.
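The constant-test problem can be made concrete; a plain-Python sketch (not FriCAS code) of a non-constant function whose total derivative vanishes:

```python
def ddx(f, z, h=1e-6):
    # total derivative d/dz + d/d conjugate(z) = partial in x
    return (f(z + h) - f(z - h)) / (2 * h)

def f(z):
    # f(z) = z - conjugate(z) = 2i*Im(z):
    # df/dz + df/d conjugate(z) = 1 + (-1) = 0
    return z - z.conjugate()

d = ddx(f, 1 + 2j)
print(d)                      # ~ 0, yet f is not constant:
print(f(1 + 1j), f(1 + 2j))   # 2j vs 4j
```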

Ondřej Čertík

Nov 14, 2014, 2:14:25 AM
to sage-...@googlegroups.com
On Thu, Nov 13, 2014 at 6:56 PM, Bill Page <bill...@newsynthesis.org> wrote:
> Sorry, I hit send before I was quite ready. To continue ...
>
> On 13 November 2014 19:24, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> On Thu, Nov 13, 2014 at 2:00 PM, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> ...
>> For example, for |z| we get:
>>
>> |z|' = \partial |z| / \partial x = d |z| / d z + d |z| / d
>> conjugate(z) = conjugate(z) / (2*|z|) + z / (2*|z|) = Re(z) / |z|
>>
>> Using our definition, this holds for any complex "z". Then, if "z"
>> is real, we get:
>>
>> |z|' = z / |z|
>>
>> Which is exactly the usual real derivative. Bill, is this what you
>> had in mind? That a CAS could return the derivative of abs(z)
>> as Re(z) / abs(z) ?
>>
>
> Yes, exactly. I think a question might arise whether we should treat
> conjugate or Re as elementary.

Ok, thanks for the confirmation.

There is an issue though --- since |z| is not analytic, the
derivatives depend on the direction. So along "x" you get

>
>>> ...
>>> What are the cons of this approach?
>>>
>
> First, care needs to be taken to properly extend the chain rule to
> include the conjugate Wirtinger derivative where necessary.
>
> Second, in principle problems can arise when defining a test for
> constant functions. For example this is necessary as part of
> rewriting expressions in terms of the smallest number of elementary
> functions (normalize) as a kind of zero test for expressions in
> FriCAS/Axiom. Usually we assume that
>
> df(x)/dx = 0
>
> is necessary and sufficient for f to be a constant function. But
> requiring that the total derivative
>
> d f / d z + d f / d conjugate(z) = 0
>
> is not what we mean by constant. In fact it seems to be an open
> question whether Richardson's theorem can be extended to include
> conjugate as an elementary function in such a way that the zero test
> is still computable. This is the last point of discussion on the
> FriCAS email list.
>
> Bill.
>

Ondřej Čertík

Nov 14, 2014, 2:19:56 AM
to sage-...@googlegroups.com
On Fri, Nov 14, 2014 at 12:14 AM, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Thu, Nov 13, 2014 at 6:56 PM, Bill Page <bill...@newsynthesis.org> wrote:
>> Sorry, I hit send before I was quite ready. To continue ...
>>
>> On 13 November 2014 19:24, Ondřej Čertík <ondrej...@gmail.com> wrote:
>>> On Thu, Nov 13, 2014 at 2:00 PM, Ondřej Čertík <ondrej...@gmail.com> wrote:
>>> ...
>>> For example, for |z| we get:
>>>
>>> |z|' = \partial |z| / \partial x = d |z| / d z + d |z| / d
>>> conjugate(z) = conjugate(z) / (2*|z|) + z / (2*|z|) = Re(z) / |z|
>>>
>>> Using our definition, this holds for any complex "z". Then, if "z"
>>> is real, we get:
>>>
>>> |z|' = z / |z|
>>>
>>> Which is exactly the usual real derivative. Bill, is this what you
>>> had in mind? That a CAS could return the derivative of abs(z)
>>> as Re(z) / abs(z) ?
>>>
>>
>> Yes, exactly. I think a question might arise whether we should treat
>> conjugate or Re as elementary.
>
> Ok, thanks for the confirmation.
>
> There is an issue though --- since |z| is not analytic, the
> derivatives depend on the direction. So along "x" you get

Sorry, a bug in gmail sent the message....

along "x" you get:

|z|' = \partial |z| / \partial x = d |z| / d z + d |z| / d
conjugate(z) = conjugate(z) / (2*|z|) + z / (2*|z|) = Re(z) / |z|

but along "y" you get:

|z|' = \partial |z| / \partial i*y = d |z| / d z - d |z| / d
conjugate(z) = conjugate(z) / (2*|z|) - z / (2*|z|) = i*Im(z) / |z|

So I get something completely different. So which direction should be preferred
in the CAS convention and why?

Ondrej
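The direction dependence is easy to see numerically; a sketch taking directional derivatives of |z| along the real and imaginary axes (plain Python, illustrative names):

```python
def ddir(f, z, direction, h=1e-6):
    # derivative of f at z along a unit direction in the complex plane
    return (f(z + h * direction) - f(z - h * direction)) / (2 * h)

z0 = 3 + 4j                    # |z0| = 5
along_x = ddir(abs, z0, 1)     # Re(z0)/|z0| = 0.6
along_y = ddir(abs, z0, 1j)    # Im(z0)/|z0| = 0.8
print(along_x, along_y)        # different answers in different directions
```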

Bill Page

Nov 14, 2014, 9:43:18 AM
to sage-devel
On 13 November 2014 14:47, maldun <dom...@gmx.net> wrote:
>
> Although this has some sense in complex analysis one should be careful
> with 'deriving' the absolute value, since it results in the weak derivative
> ( http://en.wikipedia.org/wiki/Weak_derivative) , which is in a broader sense
> the derivative in the distribution sense.

Yes, I first became interested in Wirtinger derivatives in the
context of distributions.

> Thus we have infinite possible derivatives
>

Maybe it is better for the derivative of abs to be a partial function,
i.e. just not defined everywhere.

> With this expression it is indirectly forbidden to assign a specific value to
> the unspecified value at zero:
>

Yes.

Bill Page

Nov 14, 2014, 10:57:58 AM
to sage-devel
On 14 November 2014 02:19, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Fri, Nov 14, 2014 at 12:14 AM, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> ...
>> Ok, thanks for the confirmation.
>>
>> There is an issue though --- since |z| is not analytic, the
>> derivatives depend on the direction. So along "x" you get
>
> |z|' = \partial |z| / \partial x = d |z| / d z + d |z| / d conjugate(z) =
> conjugate(z) / (2*|z|) + z / (2*|z|) = Re(z) / |z|
>
> but along "y" you get:
>
> |z|' = \partial |z| / \partial i*y = d |z| / d z - d |z| / d conjugate(z) =
> conjugate(z) / (2*|z|) - z / (2*|z|) = i*Im(z) / |z|
>
> So I get something completely different.

It seems to me that we should forget about x and y. All we really need is

|z|' = d |z| / d z = conjugate(z) / (2*|z|)

and the appropriate algebraic properties of conjugate.

> So which direction should be preferred in the CAS convention and why?
>

Well, um, you did write: "Because I would like to get

d|x| / d x = x / |x|

for real x".

The constant 1/2 is irrelevant.

Bill.

Ondřej Čertík

Nov 14, 2014, 1:18:36 PM
to sage-...@googlegroups.com


On Nov 14, 2014 8:57 AM, "Bill Page" <bill...@newsynthesis.org> wrote:
>
> On 14 November 2014 02:19, Ondřej Čertík <ondrej...@gmail.com> wrote:
> > On Fri, Nov 14, 2014 at 12:14 AM, Ondřej Čertík <ondrej...@gmail.com> wrote:
> >> ...
> >> Ok, thanks for the confirmation.
> >>
> >> There is an issue though --- since |z| is not analytic, the
> >> derivatives depend on the direction. So along "x" you get
> >
> > |z|' = \partial |z| / \partial x = d |z| / d z + d |z| / d  conjugate(z) =
> > conjugate(z) / (2*|z|) + z / (2*|z|) = Re(z) / |z|
> >
> > but along "y" you get:
> >
> > |z|' = \partial |z| / \partial i*y = d |z| / d z - d |z| / d  conjugate(z) =
> > conjugate(z) / (2*|z|) - z / (2*|z|) = i*Im(z) / |z|
> >
> > So I get something completely different.
>
> It seems to me that we should forget about x and y.  All we really need is
>
>  |z|'  = d |z| / d z = conjugate(z) / (2*|z|)
>
> and the appropriate algebraic properties of conjugate.

Sure, we can make a CAS return this. But then you get the 1/2 there.

>
> > So which direction should be preferred in the CAS convention and why?
> >
>
> Well, um, you did write: "Because I would like to get
>
>   d|x| / d x = x / |x|
>
>   for real x".
>
> The constant 1/2 is irrelevant.

Well, but how do I recover the real derivative from the complex one if they differ by a factor of 1/2?

In other words, what is the utility of such a definition then?

I can see the utility of differentiating with respect to x, as then at least one recovers the real-derivative results.

Ondrej

Bill Page

Nov 14, 2014, 1:30:24 PM
to sage-devel
On 14 November 2014 13:18, Ondřej Čertík <ondrej...@gmail.com> wrote:
>
> On Nov 14, 2014 8:57 AM, "Bill Page" <bill...@newsynthesis.org> wrote:
>>
>> It seems to me that we should forget about x and y. All we really need is
>>
>> |z|' = d |z| / d z = conjugate(z) / (2*|z|)
>>
>> and the appropriate algebraic properties of conjugate.
>
> Sure, we can make a CAS return this. But then you get the 1/2 there.
>

Yes.

>> ...
>> The constant 1/2 is irrelevant.
>
> Well, but how do I recover the real derivative from the complex one if they
> differ by a factor of 1/2?
>

What do you mean by "the real derivative"? Perhaps we can just define that as

d f / d z + d f / d conjugate(z)

> In other words, what is the utility of such a definition then?
>
> I can see the utility of differentiating with respect to x, as at least you
> must recover the real derivative results.
>

You are not differentiating with respect to x, you are differentiating
with respect to

(z+conjugate(z))/2

Bill.

Ondřej Čertík

Nov 14, 2014, 2:29:45 PM
to sage-...@googlegroups.com


On Nov 14, 2014 11:30 AM, "Bill Page" <bill...@newsynthesis.org> wrote:
>
> On 14 November 2014 13:18, Ondřej Čertík <ondrej...@gmail.com> wrote:
> >
> > On Nov 14, 2014 8:57 AM, "Bill Page" <bill...@newsynthesis.org> wrote:
> >>
> >> It seems to me that we should forget about x and y.  All we really need is
> >>
> >>  |z|'  = d |z| / d z = conjugate(z) / (2*|z|)
> >>
> >> and the appropriate algebraic properties of conjugate.
> >
> > Sure, we can make a CAS return this. But then you get the 1/2 there.
> >
>
> Yes.
>
> >> ...
> >> The constant 1/2 is irrelevant.
> >
> > Well, but how do I recover the real derivative from the complex one if they
> > differ by a factor of 1/2?
> >
>
> What do you mean by "the real derivative"? 

The absolute value doesn't have a complex derivative, but it has a real derivative, over the real axis.

> Perhaps we can just define that as
>
>   d f / d z + d f / d  conjugate(z)
>
> > In other words, what is the utility of such a definition then?
> >
> > I can see the utility of differentiating with respect to x, as at least you
> > must recover the real derivative results.
> >
>
> You are not differentiating with respect to x, you are differentiating
> with respect to
>
>   (z+conjugate(z))/2

Is that how you propose to define the derivatives for non-analytic functions? I am a little confused what exactly is your proposal.

I think one either leaves the derivatives of non-analytic functions unevaluated, or defines them in such a way that one recovers the real derivative as a special case, as long as there are no inconsistencies.

Bill Page

Nov 15, 2014, 11:18:55 AM
to sage-devel
On 14 November 2014 14:29, Ondřej Čertík <ondrej...@gmail.com> wrote:
>
> On Nov 14, 2014 11:30 AM, "Bill Page" <bill...@newsynthesis.org> wrote:
>>
>> What do you mean by "the real derivative"?
>
> The absolute value doesn't have a complex derivative, but it has a real
> derivative, over the real axis.
>

It seems to me that the concept of "real axis" is rather foreign to
the algebra. Assuming conjugate is implemented properly, i.e.
"algebraically", what you are actually saying is just that

z = conjugate(z)

> ...
>> You are not differentiating with respect to x, you are differentiating
>> with respect to
>>
>> (z+conjugate(z))/2
>
> Is that how you propose to define the derivatives for non-analytic
> functions? I am a little confused what exactly is your proposal.
>

I am sorry for the confusion. What I am proposing is that the
Wirtinger derivative(s) be considered the fundamental case (valid for
complex or even quaternion variables). As you noted previously this is
fine and doesn't change anything for the case of analytic functions.
If someone wants the derivative of a non-analytic function over a
given domain that should be called something else.

> I think one either leaves the derivatives of non analytic functions
> unevaluated,

No, this is just giving up. We should be able to do much better than that.

> or defines them in such a way that one recovers the real derivative
> as a special case, as long as there are no inconsistencies.
>

Yes exactly, the concept of "real derivative" is a special case.

Bill.

kcrisman

Nov 17, 2014, 10:14:40 AM
to sage-...@googlegroups.com
For reference (since Sage uses GiNaC for most derivatives) see http://www.cebix.net/pipermail/ginac-devel/2014-April/002105.html

Bill Page

Nov 17, 2014, 11:17:41 AM
to sage-devel
Vladimir V. Kisil's patch

http://www.ginac.de/pipermail/ginac-devel/2013-November/002053

looks like a good start to me especially if one doesn't want to
consider the issue of derivatives of non-analytic functions in
general.

On 17 November 2014 10:14, kcrisman <kcri...@gmail.com> wrote:
> For reference (since Sage uses Ginac for most derivatives) see
> http://www.cebix.net/pipermail/ginac-devel/2014-April/002105.html
>

kcrisman

Nov 17, 2014, 11:24:01 AM
to sage-...@googlegroups.com

Vladimir V. Kisil's patch

http://www.ginac.de/pipermail/ginac-devel/2013-November/002053

looks like a good start to me especially if one doesn't want to
consider the issue of derivatives of non-analytic functions in
general.


Ondřej Čertík

Nov 17, 2014, 3:17:15 PM
to sage-...@googlegroups.com
Hi Bill,

On Sat, Nov 15, 2014 at 9:18 AM, Bill Page <bill...@newsynthesis.org> wrote:
> On 14 November 2014 14:29, Ondřej Čertík <ondrej...@gmail.com> wrote:
>>
>> On Nov 14, 2014 11:30 AM, "Bill Page" <bill...@newsynthesis.org> wrote:
>>>
>>> What do you mean by "the real derivative"?
>>
>> The absolute value doesn't have a complex derivative, but it has a real
>> derivative, over the real axis.
>>
>
> It seems to me that the concept of "real axis" is rather foreign to
> the algebra. Assuming conjugate is implemented properly, i.e.
> "algebraically", what you are actually saying is just that
>
> z = conjugate(z)

That's fine.

>
>> ...
>>> You are not differentiating with respect to x, you are differentiating
>>> with respect to
>>>
>>> (z+conjugate(z))/2
>>
>> Is that how you propose to define the derivatives for non-analytic
>> functions? I am a little confused what exactly is your proposal.
>>
>
> I am sorry for the confusion. What I am proposing is that the
> Wirtinger derivative(s) be considered the fundamental case (valid for
> complex or even quaternion variables). As you noted previously this is
> fine and doesn't change anything for the case of analytic functions.
> If someone wants the derivative of a non-analytic function over a
> given domain that should be called something else.

I still don't understand exactly your proposal. We've played with a
few ideas above, in particular we have considered at least (below d/dz
is the Wirtinger derivative, d/dx and d/d(iy) are partial derivatives
with respect to "x" or "iy" in z=x+i*y) :

1) d/dz
2) d/dz + d/d conjugate(z) = d/dx
3) d/dz - d/d conjugate(z) = d/d(iy)
4) 2 * (d/dz + d/d conjugate(z))
5) 2 * d/dz

Which of these do you propose to use? For analytic functions, 1), 2),
and 3) reduce to the usual complex derivative; 4) and 5) are off
by a factor of 2. For example, for the function z^2 we get:

1) 2*z
2) 2*z
3) 2*z
4) 4*z
5) 4*z

Since z^2 is analytic, the correct derivative is 2*z, so 1), 2) and 3)
give the right answer.

For abs(z), we get:

1) conjugate(z) / (2*|z|)
2) Re(z) / |z|
3) -i*Im(z) / |z|
4) 2*Re(z) / |z|
5) conjugate(z) / |z|

When "z" is real, the (real) derivative is |z|' = z/|z|. We want
our complex formula to equal z/|z| when "z" is real. Of the above,
only 2) and 5) are equal to z/|z| when "z" is real. Note that I made a
sign mistake in my previous email regarding 3).


Comparing these two cases: options 1) and 3) are eliminated because
their results for abs(z) do not reduce to the correct real derivative.
Option 4) is eliminated because it gives wrong results for analytic
functions and does not reduce to the correct real derivative for
abs(z). Option 5) is eliminated because it gives wrong results for
analytic functions.

As such, only option 2) is consistent. For all analytic functions, it
gives the correct complex derivative, and for non-analytic functions,
at least for abs(z) it reduces to the correct real derivative in the
special case when "z" is real, i.e. z = conjugate(z). Note that you
cannot apply z = conjugate(z) to the definition of 2) to obtain the
(incorrect) result 2*d/dz, you need to treat z and conjugate(z)
separately when differentiating and only at the end apply z =
conjugate(z).

It seems to work for other non-analytic functions, for example for
Re(z) = (z+conjugate(z))/2, we get:

2) 1

for Im(z) = (-z+conjugate(z))*i/2 we get:

2) 0
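A numeric spot-check of option 2) on these two functions (plain Python, illustrative helper name):

```python
def ddx(f, z, h=1e-6):
    # option 2): d/dz + d/d conjugate(z), i.e. the partial in x
    return (f(z + h) - f(z - h)) / (2 * h)

z0 = 1 + 2j
d_re = ddx(lambda z: z.real, z0)   # expect 1
d_im = ddx(lambda z: z.imag, z0)   # expect 0
print(d_re, d_im)
```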

So that all works as expected. To prove that this works for all
non-analytic functions, we just use the fact that d/dz + d/d
conjugate(z) = d/dx, as we talked about above. So we are just
calculating the partial derivative with respect to "x" (we use the
notation z = x+i*y). When "z" is real, i.e. z = conjugate(z), it
follows that y = 0, and d/dx is just the usual (real) derivative. When
y is non-zero, we still calculate d/dx but using z and conjugate(z).
As shown above, for analytic functions this d/dx derivative is equal
to the complex derivative. For non-analytic functions the complex
derivative doesn't exist, so it's just our definition, but it reduces
to the usual real derivative (which is always equal to d/dx). In one
of my previous emails I raised the question why, for non-analytic
functions, we can't define the derivative in some other way, perhaps
d/d(iy), i.e. option 3) above. I think the answer is that we can, but
it will not reduce to the real derivative d/dx when "z" is real,
simply because d/d(iy) is not equal to d/dx unless the function is
analytic. I.e. 3) works for analytic functions, but fails for abs(z),
as shown above (and in my previous email --- note again that I made a
sign mistake there).

>
>> I think one either leaves the derivatives of non analytic functions
>> unevaluated,
>
> No, this is just giving up. We should be able to do much better than that.
>
>> or defines them in such a way that one recovers the real derivative
>> as a special case, as long as there are no inconsistencies.
>>
>
> Yes exactly, the concept of "real derivative" is a special case.

Hopefully the above clarifies that, from everything that we have
considered so far, only option 2) can work. It turns out that that's
also precisely what GiNaC chose for abs(z)'. So the conclusion seems
clear --- simply use 2) for any function, be it analytic or not.



However, Bill, from your emails, you seem to be giving conflicting
statements. It seems you agree that 2) is the way to go in some
emails, but then in some other emails you write:

> It seems to me that we should forget about x and y. All we really need is
>
> |z|' = d |z| / d z = conjugate(z) / (2*|z|)

Which is case 1) above, which was already shown not to work.

Right in the next paragraph you wrote:

>
> The constant 1/2 is irrelevant.

What do you mean that the constant 1/2 is irrelevant? I think it is
very relevant, as it makes the answer incorrect.

Or:

> You are not differentiating with respect to x, you are differentiating
> with respect to
>
> (z+conjugate(z))/2

But differentiating with respect to (z+conjugate(z))/2 is just case
5) above, which was shown not to work, while differentiating with
respect to "x" is case 2) above, which was shown to work.

Finally, in your last email you wrote:

> I am sorry for the confusion. What I am proposing is that the
> Wirtinger derivative(s) be considered the fundamental case (valid for
> complex or even quaternion variables). As you noted previously this is
> fine and doesn't change anything for the case of analytic functions.
> If someone wants the derivative of a non-analytic function over a
> given domain that should be called something else.

I am completely confused with this paragraph. Let's try to clarify this.

When you say "I am proposing that the Wirtinger derivative(s) be
considered the fundamental case", which of the five cases above are
you proposing? Strictly speaking, Wirtinger derivative is the case 1),
but that doesn't work. Are you proposing the case 2) instead?

> As you noted previously this is
> fine and doesn't change anything for the case of analytic functions.

Correct, cases 1), 2) and 3) don't change anything for analytic functions.

> If someone wants the derivative of a non-analytic function over a
> given domain that should be called something else.

Are you proposing to only consider analytic functions? I thought the
whole conversation in this thread was about how to extend this to
non-analytic functions... If you only consider analytic functions,
then we don't need Wirtinger derivatives at all, since we can just use
the usual complex derivative, as is already the case in most CAS
systems. Of course, then we need to leave abs(z) unevaluated, as it is
not analytic.

I thought the goal was rather to extend the definition of "derivative"
to also apply for non-analytic functions, in the whole complex domain
in such a way, so that it reduces to a complex derivative for analytic
functions, and a real derivative if we restrict "z" to be real. It
seems that 2) above is one such definition that would allow that.

Bill, would you mind clarifying the above misunderstandings? I think
we are on the same page, probably we both just understood something
else with the terminology we used, but I want to make 100% sure.

Ondrej

Ondřej Čertík

unread,
Nov 17, 2014, 3:21:52 PM11/17/14
to sage-...@googlegroups.com
> I still don't understand exactly your proposal. We've played with a
> few ideas above, in particular we have considered at least (below d/dz
> is the Wirtinger derivative, d/dx and d/d(iy) are partial derivatives
> with respect to "x" or "iy" in z=x+i*y) :
>
> 1) d/dz
> 2) d/dz + d/d conjugate(z) = d/dx
> 3) d/dz - d/d conjugate(z) = d/d(iy)
> 4) 2 * (d/dz + d/d conjugate(z))
> 5) 2 * d/dz
>
> Which of these do you propose to use? For analytic functions, only 1)
> and 2) reduce to the usual complex derivative. 4) and 5) will be off
> by a factor of 2. For example, for a function z^2 we get:


Correction: For analytic functions, only 1), 2) and 3) reduce to the
usual complex derivative.
(As is shown below on particular examples.)

Bill Page

unread,
Nov 17, 2014, 9:52:47 PM11/17/14
to sage-devel
On 17 November 2014 15:17, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Sat, Nov 15, 2014 at 9:18 AM, Bill Page <bill...@newsynthesis.org> wrote:
>>
>> I am sorry for the confusion. What I am proposing is that the
>> Wirtinger derivative(s) be considered the fundamental case (valid
>> for complex or even quaternion variables). As you noted previously
>> this is fine and doesn't change anything for the case of analytic
>> functions. If someone wants the derivative of a non-analytic
>> function over a given domain that should be called something
>> else.
>
> I still don't understand exactly your proposal. We've played with a
> few ideas above, in particular we have considered at least (below
> d/dz is the Wirtinger derivative, d/dx and d/d(iy) are partial derivatives
> with respect to "x" or "iy" in z=x+i*y) :
>
> 1) d/dz
> 2) d/dz + d/d conjugate(z) = d/dx
> 3) d/dz - d/d conjugate(z) = d/d(iy)
> 4) 2 * (d/dz + d/d conjugate(z))
> 5) 2 * d/dz
>
> Which of these do you propose to use?

Both d/dz and d/d conjugate(z), i.e. the Wirtinger derivatives.

> ...
> When "z" is real, then the (real) derivative of |z|' = z/|z|. We want
> our complex formula to be equal to z/|z| if "z" is real.

Presumably you intend to choose only one of these? But this cannot
work in the general case.

> ...
> As such, only option 2) is consistent. For all analytic functions, it
> gives the correct complex derivative, and for non-analytic functions,
> at least for abs(z) it reduces to the correct real derivative in the
> special case when "z" is real, i.e. z = conjugate(z).

Yes.

>> ...
>> Yes exactly, the concept of "real derivative" is a special case.
>
> Hopefully the above clarifies, that from everything that we have
> considered so far, only the option 2) can work. It turns out that
> that's also precisely what GiNaC considered for abs(z)'. So
> the conclusion seems clear --- simply use 2) for any function, be
> it analytic or not.
>

If there is only one derivative, how will you handle the chain rule?

>
> However, Bill, from your emails, you seem to be giving conflicting
> statements. It seems you agree that 2) is the way to go in some
> emails, but then in some other emails you write:
>
>> It seems to me that we should forget about x and y. All we really
>> need is
>>
>> |z|' = d |z| / d z = conjugate(z) / (2*|z|)
>
> Which is the case 1) above, and it is shown that it doesn't work.
>

We need both Wirtinger derivatives. Option 2) is their sum.

> Right in the next paragraph you wrote:
>
>>
>> The constant 1/2 is irrelevant.
>
> What do you mean that the constant 1/2 is irrelevant? I think it is
> very relevant, as it makes the answer incorrect.
>

I said that actually only one Wirtinger derivative was required
because the other can be expressed in terms of conjugate, but I did
not mean to imply that it would meet your criterion of reducing
exactly to the real case. It just happens that both Wirtinger
derivatives are the same in the case of abs.

>
> When you say "I am proposing that the Wirtinger derivative(s) be
> considered the fundamental case", which of the five cases above
> are you proposing? Strictly speaking, Wirtinger derivative is the case
> 1), but that doesn't work. Are you proposing the case 2) instead?
>

No, I meant that other derivatives (such as the real derivative) can
be obtained from the Wirtinger derivatives but not vice versa.

> ...
>> If someone wants the derivative of a non-analytic function over a
>> given domain that should be called something else.
>
> Are you proposing to only consider analytic functions?

No.

> I thought the whole conversation in this thread was about how to
> extend this to non-analytic functions...

Yes. The main issue is non-analytic (non-holomorphic) functions.

> ...
> I thought the goal was rather to extend the definition of "derivative"
> to also apply for non-analytic functions, in the whole complex domain
> in such a way, so that it reduces to a complex derivative for analytic
> functions, and a real derivative if we restrict "z" to be real. It
> seems that 2) above is one such definition that would allow that.
>

2) alone is not sufficient. In general for non-analytic functions two
derivatives are required.

> Bill, would you mind clarifying the above misunderstandings?
> I think we are on the same page, probably we both just understood
> something else with the terminology we used, but I want to make
> 100% sure.
>

Thank you. I am happy to continue the discussion.

Bill.

Ondřej Čertík

unread,
Nov 17, 2014, 11:16:15 PM11/17/14
to sage-...@googlegroups.com
Hi Bill,

Thanks for the clarification. So your point is that 2) is not
sufficient, that we really need two Wirtinger derivatives --- it's
just that one can be expressed using the other and a conjugate, so
perhaps a CAS can return only one, but the chain rule needs
modification, and probably some other derivative handling as well. I
need to think about this harder.

Here is a relation that I found today in [1] (see also the references
there), I don't know if you are aware of it:

D f / D z = df/dz + df/d conjugate(z) * e^{-2*i*theta}

Where Df/Dz is the derivative in a complex plane along the direction
theta (the angle between the direction and the x-axis) and df/dz and
df/d conjugate(z) are the two Wirtinger derivatives. This formula
holds for any function. So all the derivatives, no matter the
direction, lie on a circle of radius |df/d conjugate(z)| centered at
df/dz.

For analytic functions, we have df/d conjugate(z) = 0, and so the
above formula proves that all the derivatives are independent of
direction theta and equal to df/dz.

For non-analytic functions, the above formula gives all the possible
derivatives, and besides df/dz, the derivatives also depend on df/d
conjugate(z) and theta. But that's it. So as you said, the two
Wirtinger derivatives allow us to calculate the derivative along any
direction theta we want.

From my last email, the case 1) corresponds to df/d conjugate(z)=0,
i.e. analytic functions and the result is independent of theta. Case
2) is theta = 0, pi, 2*pi, ..., i.e. taking the derivative along the
x-axis. Case 3) is theta = pi/2, 3*pi/2, 5*pi/2, ..., i.e. taking the
derivative along the y-axis.


A real derivative of a real function g(x) is simply taken along the
x-axis. You can imagine that g(x) is also (arbitrarily) defined in the
whole complex plane and you are taking the Dg/Dz derivative above with
theta = 0. The result is the same. So that's why case 2), i.e.
theta=0, always reproduces the real derivative: the real derivative
is by definition taken along theta=0.
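This direction dependence is easy to check numerically. Here is a small pure-Python sketch (the helper names are mine): for the analytic f(z) = z^2 the finite-difference derivative is the same for every theta, while for abs(z) it follows the circle formula above:

```python
import cmath

def directional_diff(f, z0, theta, eps=1e-6):
    # finite-difference derivative of f at z0 along direction theta
    h = eps * cmath.exp(1j * theta)
    return (f(z0 + h) - f(z0)) / h

z0 = 1.2 - 0.8j

# Analytic case: independent of theta, equal to the complex derivative 2*z0.
for theta in (0.0, 1.0, 2.5):
    assert abs(directional_diff(lambda w: w * w, z0, theta) - 2 * z0) < 1e-4

# Non-analytic case: |z| follows Df/Dz = conj(z)/(2|z|) + z/(2|z|)*e^{-2i*theta}.
for theta in (0.0, 1.0, 2.5):
    expected = (z0.conjugate() + z0 * cmath.exp(-2j * theta)) / (2 * abs(z0))
    assert abs(directional_diff(abs, z0, theta) - expected) < 1e-4
```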

For a CAS, one could probably just say that theta=0 in our definition,
and then everything is consistent, and we only have one derivative,
2). The other option is to return both derivatives and make the
derivative Df/Dz of a non-analytic function equal to the above
formula, i.e. depending on df/dz, df/d conjugate(z) and theta.

I need to think about the chain rule. I would simply introduce the
theta dependence into all formulas, as that gives all possible
derivatives and gives the exact functional dependence of all
possibilities. And then see whether we need to keep all formulas in
terms of theta, or perhaps if we can set theta = 0 for everything.

Ondrej

[1] Pyle, H. R., & Barker, B. M. (1946). A Vector Interpretation of
the Derivative Circle. The American Mathematical Monthly, 53(2), 79.
doi:10.2307/2305454

Bill Page

unread,
Nov 18, 2014, 8:57:11 AM11/18/14
to sage-devel
On 17 November 2014 23:16, Ondřej Čertík <ondrej...@gmail.com> wrote:
> Hi Bill,
>
> Thanks for the clarification. So your point is that 2) is not
> sufficient, that we really need two Wirtinger derivatives --- it's
> just that one can be expressed using the other and a conjugate,
> so perhaps CAS can only return one, but a chain rule needs
> modification and probably some other derivatives handling as
> well. I need to think about this harder.
>

Yes, that is a good summary. My tentative conclusion was that we
could implement just one (Wirtinger) derivative, a modified chain rule
and a sufficiently strong conjugate operation. This derivative is the
same as the usual derivative in the case of analytic functions but we
would have to live with the fact that it is slightly different (factor
of 1/2) for the case of common real derivatives of non-analytic
functions such as abs. Introducing a factor of 2, such as in the case
of the definition of the sign function seems like a small price to
pay.

> Here is a relation that I found today in [1] (see also the references
> there), I don't know if you are aware of it:
>
> D f / D z = df/dz + df/d conjugate(z) * e^{-2*i*theta}
>
> Where Df/Dz is the derivative in a complex plane along the direction
> theta (the angle between the direction and the x-axis) and df/dz and
> df/d conjugate(z) are the two Wirtinger derivatives. This formula
> holds for any function. So all the derivatives no matter which
> direction lie on a circle of radius df/d conjugate(z) and center
> df/dz.
>
> [1] Pyle, H. R., & Barker, B. M. (1946). A Vector Interpretation of
> the Derivative Circle. The American Mathematical Monthly, 53(2), 79.
> doi:10.2307/2305454

http://phdtree.org/pdf/36421281-a-vector-interpretation-of-the-derivative-circle/

Thank you. I was not aware of that specific publication. I think
their geometric interpretation is useful.

>
> For CAS, one could probably just say that theta=0 in our definition,
> and then everything is consistent, and we only have one derivative,
> 2). The other option is to return both derivatives and make the
> derivative Df/Dz of non-analytic function equal to the above formula,
> i.e. depending on df/dz, df/d conjugate(z) and theta.

I think you are overly focused on trying to define a derivative that
reduces to the conventional derivative of non-analytic functions over
the reals.

>
> I need to think about the chain rule. I would simply introduce the
> theta dependence into all formulas, as that gives all possible
> derivatives and gives the exact functional dependence of all
> possibilities. And then see whether we need to keep all formulas
> in terms of theta, or perhaps if we can set theta = 0 for everything.
>

It is not clear to me how to use such a "generic" derivative in the
application of the chain rule.

Bill.

David Roe

unread,
Nov 18, 2014, 9:02:55 AM11/18/14
to sage-devel
I've just been casually following this conversation, but I think it's
important that the derivative of abs(x) be sign(x) not 2*sign(x) or
1/2*sign(x).

If you use a different function, like f.wirtinger_derivative(), then
it doesn't matter so much.
David

>
>>
>> I need to think about the chain rule. I would simply introduce the
>> theta dependence into all formulas, as that gives all possible
>> derivatives and gives the exact functional dependence of all
>> possibilities. And then see whether we need to keep all formulas
>> in terms of theta, or perhaps if we can set theta = 0 for everything.
>>
>
> It is not clear to me how to use such a "generic" derivative in the
> application of the chain rule.
>
> Bill.
>

kcrisman

unread,
Nov 18, 2014, 10:11:10 AM11/18/14
to sage-...@googlegroups.com
> I think you are overly focused on trying to define a derivative that
> reduces to the conventional derivative of non-analytic functions over
> the reals.

> I've just been casually following this conversation, but I think it's
> important that the derivative of abs(x) be sign(x) not 2*sign(x) or
> 1/2*sign(x).
>
> If you use a different function, like f.wirtinger_derivative(), then
> it doesn't matter so much.
> David


+1

That notwithstanding, this conversation is really great to see and I hope we get something that works for the usual cases in the original post too!

Bill Page

unread,
Nov 18, 2014, 11:05:54 AM11/18/14
to sage-devel
On 18 November 2014 09:02, David Roe <roed...@gmail.com> wrote:
> On Tue, Nov 18, 2014 at 5:57 AM, Bill Page <bill...@newsynthesis.org> wrote:
>>
>> > I think you are overly focused on trying to define a derivative that
>> > reduces to the conventional derivative of non-analytic functions
>> > over the reals.
>>
>> I've just been casually following this conversation, but I think it's
>> important that the derivative of abs(x) be sign(x) not 2*sign(x) or
>> 1/2*sign(x).
>>

What makes it important that "the" derivative of abs(x) be sign(x)?
An important point here is that there is no single unique derivative
of non-analytic functions like abs, but rather that all of their
derivatives can be expressed in terms of just two. I am seriously
interested in reasons for retaining the status quo.

>> If you use a different function, like f.wirtinger_derivative(), then
>> it doesn't matter so much.
>> David
>>

On 18 November 2014 10:11, kcrisman <kcri...@gmail.com> wrote:
>
> +1
>

Although I guess this would be consistent with the overall
"assimilation philosophy" adopted by Sage, I am rather strongly
against this in general. In my opinion it is in part what has led to
the rather confusing situation in most other computer algebra systems.
I think rather that one should strive for the most general solution
consistent with the mathematics. I suppose that to some extent this
is conditioned by how the subject is taught. It came as a surprise to
me that a solution of this problem (Wirtinger calculus or CR-calculus)
was apparently "well-known" in some circles but considered only a
marginal curiosity in others (if at all).

> That notwithstanding, this conversation is really great to see and I hope
> we get something that works for the usual cases in the original post
> too!
>

Provided that one realizes its limitations I think the solution
proposed by Vladimir V. Kisil for ginac and in more generality by
Ondrej is quite good. I don't think a new name for this is desirable.

Bill.

David Roe

unread,
Nov 18, 2014, 11:28:41 AM11/18/14
to sage-devel
On Tue, Nov 18, 2014 at 8:05 AM, Bill Page <bill...@newsynthesis.org> wrote:
> On 18 November 2014 09:02, David Roe <roed...@gmail.com> wrote:
>> On Tue, Nov 18, 2014 at 5:57 AM, Bill Page <bill...@newsynthesis.org> wrote:
>>>
>>> > I think you are overly focused on trying to define a derivative that
>>> > reduces to the conventional derivative of non-analytic functions
>>> > over the reals.
>>>
>>> I've just been casually following this conversation, but I think it's
>>> important that the derivative of abs(x) be sign(x) not 2*sign(x) or
>>> 1/2*sign(x).
>>>
>
> What makes it important that "the" derivative of abs(x) be sign(x)?
> An important point here is that there is no single unique
> derivative of non-analytic functions like abs, but rather that all of
> their derivatives can be expressed in terms of just two. I am
> seriously interested in reasons for retaining the status quo.

Because derivative is not just used in the context of functions of a
complex variable (whether they are analytic or not). Probably more
than 90% of Sage users don't know any complex analysis (as frequently
lamented by rtf). I will certainly acknowledge that people get things
wrong with regard to sqrt and log by not knowing about branch cuts.
But when it comes to the definition of derivative, we need to stay
consistent with the standard definition of lim_{h -> 0} (f(x + h) -
f(x))/h for functions of a real variable (or functions that many
people interpret as just functions of a real variable). Any other
decision will cause frustration for the vast majority of our users.
David

>
>>> If you use a different function, like f.wirtinger_derivative(), then
>>> it doesn't matter so much.
>>> David
>>>
>
> On 18 November 2014 10:11, kcrisman <kcri...@gmail.com> wrote:
>>
>> +1
>>
>
> Although I guess this would be consistent with the overall
> "assimilation philosophy" adopted by Sage, I am rather strongly
> against this in general. In my opinion it is in part what has led to
> the rather confusing situation in most other computer algebra systems.
> I think rather that one should strive for the most general solution
> consistent with the mathematics. I suppose that to some extent this
> is conditioned by how the subject is taught. It came as a surprise to
> me that a solution of this problem (Wirtinger calculus or CR-calculus)
> was apparently "well-known" in some circles but considered only a
> marginal curiosity in others (if at all).
>
>> That notwithstanding, this conversation is really great to see and I hope
>> we get something that works for the usual cases in the original post
>> too!
>>
>
> Provided that one realizes its limitations I think the solution
> proposed by Vladimir V. Kisil for ginac and in more generality by
> Ondrej is quite good. I don't think a new name for this is desirable.
>
> Bill.
>

Ondřej Čertík

unread,
Nov 18, 2014, 12:29:26 PM11/18/14
to sage-...@googlegroups.com
Well, I think it doesn't matter if you know complex analysis or not.
The point is rather that there is a real derivative and a complex
derivative. The complex derivative being a generalization of the real
one (http://en.wikipedia.org/wiki/Derivative#Generalizations,
http://en.wikipedia.org/wiki/Generalizations_of_the_derivative#Complex_analysis).
As such, it must reduce to the real derivative as a special case when
all variables are real, otherwise you get inconsistencies.

For example for real numbers, you get:

|x|' = x / |x| = sign(x)

and you can do this numerically. Here is a function that does this for
any angle theta in the complex plane:

def diff(f, z0, theta, eps=1e-8):
    h = eps*exp(I*theta)
    return (f(z0+h) - f(z0)) / h

For real numbers, you need to set theta=0. This then obviously becomes
the standard definition of a real derivative. So any other definition
than |x|' = sign(x) gives wrong answers. No matter if you know complex
analysis or not.

As far as the derivative of abs(x) in the complex plane for any theta,
the above "diff" function is just the directional derivative, i.e.
derivative in the direction theta. Based on my previous email, the
(only) correct analytic answer is, using Python notation:

x.conjugate()/(2*abs(x)) + x/(2*abs(x)) * exp(-2*I*theta)

And you can check numerically using the function "diff" above that
this is indeed the correct answer (just plug in various complex or
real values for "x" and check that "diff" and the above formula gives
the same numerical answer for all theta).
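For instance, a pure-Python version of that check might look as follows (a sketch; I substitute `cmath` equivalents for Sage's `exp` and `I`, and `abs_prime` is my name for the formula above):

```python
import cmath

def diff(f, z0, theta, eps=1e-6):
    # finite-difference directional derivative, as in the "diff" function above
    h = eps * cmath.exp(1j * theta)
    return (f(z0 + h) - f(z0)) / h

def abs_prime(x, theta):
    # x.conjugate()/(2*abs(x)) + x/(2*abs(x)) * exp(-2*I*theta)
    return (x.conjugate() + x * cmath.exp(-2j * theta)) / (2 * abs(x))

# Several complex and real values of "x", several angles theta:
for x0 in (0.5 + 0.0j, -1.0 + 2.0j, 3.0 - 0.25j):
    for theta in (0.0, 0.7, 1.9, 3.1):
        assert abs(diff(abs, x0, theta) - abs_prime(x0, theta)) < 1e-4
```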

Bill, you wrote "I think rather that one should strive for the most
general solution
consistent with the mathematics.". Well, the above (i.e.
x.conjugate()/(2*abs(x)) + x/(2*abs(x)) * exp(-2*I*theta)) is the most
general solution consistent with mathematics.

Of these options, only theta=0 gives the real derivative as a special
case, that's what the GiNaC proposal does.

Ondrej

Bill Page

unread,
Nov 18, 2014, 1:08:36 PM11/18/14
to sage-devel
On 18 November 2014 12:29, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Tue, Nov 18, 2014 at 9:28 AM, David Roe <roed...@gmail.com> wrote:
>> ...
>> Because derivative is not just used in the context of functions of a
>> complex variable (whether they are analytic or not). Probably more
>> than 90% of Sage users don't know any complex analysis (as frequently
>> lamented by rtf). I will certainly acknowledge that people get things
>> wrong with regard to sqrt and log by not knowing about branch cuts.
>> But when it comes to the definition of derivative, we need to stay
>> consistent with the standard definition of lim_{h -> 0} (f(x + h) -
>> f(x))/h for functions of a real variable (or functions that many
>> people interpret as just functions of a real variable). Any other
>> decision will cause frustration for the vast majority of our users.
>
> Well, I think it doesn't matter if you know complex analysis or not.

I agree, but apparently for a different reason.

> The point is rather that there is a real derivative and a complex
> derivative. The complex derivative being a generalization of the
> real one (http://en.wikipedia.org/wiki/Derivative#Generalizations,
> http://en.wikipedia.org/wiki/Generalizations_of_the_derivative#Complex_analysis).
> As such, it must reduce to the real derivative as a special case when
> all variables are real, otherwise you get inconsistencies.
>

As I said in another email, I think this is highly dependent on one's
education and experience. Although I admit that it is very common
(almost ubiquitous) to teach calculus starting from the notion of
continuity and limits, in my opinion these references on Wikipedia are
especially biased. To me, from the point of view of computer algebra,
the algebraic properties of derivatives are more important.

For the sake of continuing the argument, from the point of view of
algebra why should we consider derivatives of complex functions as a
generalization of the real one, rather than the real derivative as
defined in terms of something more general? In particular, notice that
the so-called Wirtinger derivatives also make sense in the case of
quaternion analysis, so should we be expecting to view quaternion
calculus also as a "generalization" of real derivatives?

OK, maybe I am pushing this a little too far. I admit that the
argument from the point of view of limits and without any reference to
conjugate is quite convincing.

> ...
> Bill, you wrote "I think rather that one should strive for the most
> general solution
> consistent with the mathematics.". Well, the above (i.e.
> x.conjugate()/(2*abs(x)) + x/(2*abs(x)) * exp(-2*I*theta)) is the
> most general solution consistent with mathematics.
>
> Of these options, only theta=0 gives the real derivative as a special
> case, that's what the GiNaC proposal does.
>

Have you had a chance to consider the issue of the chain-rule yet?

Bill.

Ondřej Čertík

unread,
Nov 18, 2014, 1:42:01 PM11/18/14
to sage-...@googlegroups.com
I don't know that much about quaternions. Real numbers and real
analysis are essentially a subset of complex analysis. Is it true
that complex analysis is a subset of quaternion analysis? But I think
it's true that quaternion analysis is a generalization of real
analysis, so I would definitely expect the quaternion derivatives to
match the real ones.

>
> OK, maybe I am pushing this a little too far. I admit that the
> argument from the point of view of limits and without any reference to
> conjugate is quite convincing.
>
>> ...
>> Bill, you wrote "I think rather that one should strive for the most
>> general solution
>> consistent with the mathematics.". Well, the above (i.e.
>> x.conjugate()/(2*abs(x)) + x/(2*abs(x)) * exp(-2*I*theta)) is the
>> most general solution consistent with mathematics.
>>
>> Of these options, only theta=0 gives the real derivative as a special
>> case, that's what the GiNaC proposal does.
>>
>
> Have you had a chance to consider the issue of the chain-rule yet?

Yes. Very straightforward, as I suggested in my last email. Just start with:

D f / D z = df/dz + df/d conjugate(z) * e^{-2*i*theta}

and then consider the chain rule for Wirtinger derivatives
(http://en.wikipedia.org/wiki/Wirtinger_derivatives#Functions_of_one_complex_variable_2),
I am sure that can be proven quite easily. Then you just calculate
directly:

D f(g) / D z = df(g)/dz + df(g)/d conjugate(z) * e^{-2*i*theta} =

= (df/dg * dg/dz + df/d conjugate(g) * d conjugate(g)/dz)
  + (df/dg * dg/d conjugate(z)
     + df/d conjugate(g) * d conjugate(g)/d conjugate(z)) * e^{-2*i*theta} =

= df/dg * (dg/dz + dg/d conjugate(z) * e^{-2*i*theta})
  + df/d conjugate(g) * (d conjugate(g)/dz
     + d conjugate(g)/d conjugate(z) * e^{-2*i*theta}) =

= df/dg * Dg/Dz + df/d conjugate(g) * D conjugate(g)/Dz

So at the end, the theta dependence got absorbed into Dg/Dz and D
conjugate(g) / Dz, but it assumes that these directional derivatives
are taken with the same angle theta.

You can now use it to do the example from GiNaC:

cout << abs(log(z)).diff(z) << endl;
// (before) -> D[0](abs)(log(z))*z^(-1)
// (now) -> 1/2*(z^(-1)*conjugate(log(z))+log(z)*conjugate(z)^(-1))*abs(log(z))^(-1)

I.e. f(g) = |g|, g(z) = log(z) and you get:

D |log(z)| / D z = conjugate(g)/(2*|g|) * D log(z) / Dz + g / (2*|g|)
* D conjugate(log(z)) / Dz =

= conjugate(log(z)) / (2*|log(z)|) * 1/z + log(z) / (2*|log(z)|) *
1/conjugate(z) * e^{-2*i*theta}

= 1/(2*|log(z)|) * (conjugate(log(z)) / z + log(z) / conjugate(z) *
e^{-2*i*theta})

So it exactly agrees, except that there is a theta dependence in the
final answer and GiNaC implicitly chose theta=0. Everything in this
example should be straightforward except perhaps:

D conjugate(log(z)) / Dz = d conjugate(log(z)) / dz + d
conjugate(log(z)) / d conjugate(z) * e^{-2*i*theta}

where we first write conjugate(log(z)) = log|z| - I*arg(z) =
log|conjugate(z)| + I*arg(conjugate(z)) = log(conjugate(z)) and then
we can see that the first Wirtinger derivative is zero (no functional
dependence on "z"), and the second one is 1/conjugate(z). So the
answer is:

D conjugate(log(z)) / Dz = 1/conjugate(z) * e^{-2*i*theta}


I hope I didn't make a mistake somewhere, but it all looks
straightforward to me.
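The |log(z)| result can be spot-checked numerically as well; here is a sketch with my own helper names, using Python's cmath:

```python
import cmath

def directional_diff(f, z0, theta, eps=1e-6):
    # finite-difference derivative of f at z0 along direction theta
    h = eps * cmath.exp(1j * theta)
    return (f(z0 + h) - f(z0)) / h

def abslog_prime(z, theta):
    # 1/(2*|log(z)|) * (conjugate(log(z))/z + log(z)/conjugate(z)*e^{-2i*theta})
    g = cmath.log(z)
    return (g.conjugate() / z
            + g / z.conjugate() * cmath.exp(-2j * theta)) / (2 * abs(g))

z0 = 2.0 + 1.5j
for theta in (0.0, 0.8, 2.1):
    num = directional_diff(lambda w: abs(cmath.log(w)), z0, theta)
    assert abs(num - abslog_prime(z0, theta)) < 1e-4
```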

Ondrej

Bill Page

unread,
Nov 18, 2014, 2:14:40 PM11/18/14
to sage-devel
On 18 November 2014 13:41, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Tue, Nov 18, 2014 at 11:08 AM, Bill Page <bill...@newsynthesis.org> wrote:
>> ...
>> Have you had a chance to consider the issue of the chain-rule yet?
>
> Yes. Very straightforward, as I suggested in my last email. Just start with:
>
> D f / D z = df/dz + df/d conjugate(z) * e^{-2*i*theta}
>
> and then consider the chain rule for Wirtinger derivatives
> (http://en.wikipedia.org/wiki/Wirtinger_derivatives#Functions_of_one_complex_variable_2),
> I am sure that can be proven quite easily.

Let me make sure I understand your proposal. Are you saying that you
would introduce the symbolic expression

e^{-2*i*theta}

with theta undefined in the result of all derivatives? So that
diff(x) is always the sum of two terms. In particular

abs(x).diff(x)

would return the symbolic expression

conjugate(x)/(2*abs(x)) + conjugate(x)/(2*abs(x))* e^{-2*i*theta}

If you are, then clearly one can recover both Wirtinger derivatives
from this expression and the rest holds.

> Then you just calculate directly:
> ...
> So it exactly agrees, except that there is a theta dependence in the
> final answer and GiNaC implicitly chose theta=0.
>...
> I hope I didn't make some mistake somewhere, but it looks all
> straightforward to me.
>

It looks OK to me but I must say it probably seems rather peculiar
from the point of view expressed earlier by David Roe.

How can you explain the presence of the e^{-2*i*theta} term to someone
without experience in complex analysis or at least multi-variable
calculus?

I thought rather that what you were proposing was to set theta=0 from
the start. If you did that, then I think you still have problems with
the chain rule.

Bill.

Bill Page

unread,
Nov 18, 2014, 2:38:12 PM11/18/14
to sage-devel
On 18 November 2014 14:14, Bill Page <bill...@newsynthesis.org> wrote:
> On 18 November 2014 13:41, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> On Tue, Nov 18, 2014 at 11:08 AM, Bill Page <bill...@newsynthesis.org> wrote:
>>> ...
>>> Have you had a chance to consider the issue of the chain-rule yet?
>>
>> Yes. Very straightforward, as I suggested in my last email. Just start with:
>>
>> D f / D z = df/dz + df/d conjugate(z) * e^{-2*i*theta}
>>
>> and then consider the chain rule for Wirtinger derivatives
>> (http://en.wikipedia.org/wiki/Wirtinger_derivatives#Functions_of_one_complex_variable_2),
>> I am sure that can be proven quite easily.
> ...
> I thought rather that what you were proposing was to set theta=0 from
> the start. If you did that, then I think you still have problems with
> the chain rule.
>

Let me add that the kind of solution to this problem that I had
imagined was to implement two derivatives, for example both

f.diff(z) = df/dz + df/d conjugate(z)

and

f.diff2(z) = df/dz - df/d conjugate(z)

diff(z) would equal diff2(z) for all analytic functions and diff would
reduce to the derivative of real non-analytic functions as you desire.
Note that for abs we have

abs(z).diff2(z) = 0

but not in general. There would be no need to discuss this 2nd
derivative with less experienced users until they were ready to
consider more "advanced" mathematics.

Clearly we could implement the chain rule given these two derivatives.
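The two proposed derivatives correspond to directional derivatives along the real and imaginary axes; a small plain-Python illustration (function names are my own, purely illustrative) of how they agree for an analytic function but differ for abs:

```python
import cmath

def ddiff(f, z, theta, h=1e-6):
    # numeric directional derivative D f / D z along the direction e^{i*theta}
    d = cmath.exp(1j*theta)
    return (f(z + h*d) - f(z - h*d)) / (2*h*d)

z0 = 0.7 + 0.3j   # arbitrary test point

# analytic function: the value is independent of the direction theta
a0 = ddiff(cmath.exp, z0, 0.0)
a1 = ddiff(cmath.exp, z0, cmath.pi/2)
assert abs(a0 - a1) < 1e-6

# non-analytic abs: the two directional derivatives differ
b0 = ddiff(abs, z0, 0.0)
b1 = ddiff(abs, z0, cmath.pi/2)
assert abs(b0 - b1) > 1e-3
```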

Bill.

Ondřej Čertík

unread,
Nov 18, 2014, 3:19:20 PM11/18/14
to sage-...@googlegroups.com
On Tue, Nov 18, 2014 at 12:14 PM, Bill Page <bill...@newsynthesis.org> wrote:
> On 18 November 2014 13:41, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> On Tue, Nov 18, 2014 at 11:08 AM, Bill Page <bill...@newsynthesis.org> wrote:
>>> ...
>>> Have you had a chance to consider the issue of the chain-rule yet?
>>
>> Yes. Very straightforward, as I suggested in my last email. Just start with:
>>
>> D f / D z = df/dz + df/d conjugate(z) * e^{-2*i*theta}
>>
>> and then consider the chain rule for Wirtinger derivatives
>> (http://en.wikipedia.org/wiki/Wirtinger_derivatives#Functions_of_one_complex_variable_2),
>> I am sure that can be proven quite easily.
>
> Let me make sure I understand your proposal. Are you saying that you
> would introduce the symbolic expression
>
> e^{-2*i*theta}
>
> with theta undefined in the result of all derivatives? So that
> diff(x) is always the sum of two terms. In particular
>
> abs(x).diff(x)
>
> would return the symbolic expression
>
> conjugate(x)/(2*abs(x)) + conjugate(x)/(2*abs(x))* e^{-2*i*theta}

I think you made a mistake, the correct expression is:

conjugate(x)/(2*abs(x)) + x/(2*abs(x)) * e^{-2*i*theta}

>
> If you are, then clearly one can recover both Wirtinger derivatives
> from this expression and the rest holds.

For now I just wanted to get the math right in the most general case.
I wasn't even considering what a CAS should do.

>
>> Then you just calculate directly:
>> ...
>> So it exactly agrees, except that there is a theta dependence in the
>> final answer and GiNaC implicitly chose theta=0.
>>...
>> I hope I didn't make some mistake somewhere, but it looks all
>> straightforward to me.
>>
>
> It looks OK to me but I must say, it probably seems rather peculiar
> from the point of view expressed earlier by David Roe.
>
> How can you explain the presence of the e^theta term to someone
> without experience in complex analysis or at least multi-variable
> calculus?
>
> I thought rather that what you were proposing was to set theta=0 from
> the start. If you did that, then I think you still have problems with
> the chain rule.

For a CAS, I was leaning towards using theta=0. But given your
objections, I first needed to figure out the most general case that
covers everything. I think that's now sufficiently clarified.

> Let me add that the kind of solution to this problem that I did
> imagine was to implement two derivatives, for example both
>
> f.diff(z) = df/dz + df/d conjugate(z)
>
> and
>
> f.diff2(z) = df/dz - df/d conjugate(z)
>
> diff(z) would equal diff2(z) for all analytic functions and diff would
> reduce to the derivative of real non-analytic functions as you desire.

Right, diff() is for theta = 0. diff2() is for theta=pi/2, i.e. taking
the derivative along the imaginary axis.

> Note that for abs we have
>
> abs(z).diff2(z) = 0

Actually, for abs you have:

abs(z).diff2(z) = (conjugate(z)-z)/(2*abs(z))

> but not in general. There would be no need to discuss this 2nd
> derivative with less experienced users until they were ready to
> consider more "advanced" mathematics.
>
> Clearly we could implement the chain rule given these two derivatives.

So I think that functions can return their own correct derivative, for
example analytic functions just return the unique complex derivative,
for example:

log(z).diff(z) = 1/z

This holds for all cases. Non-analytic functions like abs(f) can return:

abs(f).diff(z) = (conjugate(f)*f.diff(z) +
f*conjugate(f).diff(z)*e^{-2*i*theta}) / (2*abs(f))

I think that's the correct application of the chain rule. We can set
theta=0, so we would just return:

abs(f).diff(z) = (conjugate(f)*f.diff(z) + f*conjugate(f).diff(z)) / (2*abs(f))

Which for real "f" (i.e. conjugate(f)=f) simplifies to (as a special case):

abs(f).diff(z) = (f*f.diff(z) + f*f.diff(z)) / (2*abs(f)) = f/abs(f) *
f.diff(z) = sign(f) * f.diff(z)

So it all works.

Unless there is some issue that I don't see, it seems to me we just
need to have one diff(z) function, no need for diff2().
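For the real special case, the reduction to sign(f)*f.diff(z) can be spot-checked with SymPy; `f` below is an arbitrary illustrative real expression, not anything specific from the thread:

```python
from sympy import Abs, Symbol, conjugate, sign

z = Symbol('z', real=True)   # real case, so conjugate(f) == f
f = z**3 - 2*z               # arbitrary illustrative real expression

# the proposed theta = 0 rule vs. the simplified real-case form
proposed = (conjugate(f)*f.diff(z) + f*conjugate(f).diff(z)) / (2*Abs(f))
expected = sign(f)*f.diff(z)

# spot-check numerically at a few points
for val in (1.5, -0.4, 2.0):
    assert abs(complex((proposed - expected).subs(z, val))) < 1e-12
```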

Ondrej

Ondřej Čertík

unread,
Nov 18, 2014, 3:46:37 PM11/18/14
to sage-...@googlegroups.com
Actually, I think I made a mistake. Let's do abs(f).diff(x) again for
the most general case. We use:

D f(g) / D z =

= df/dg * (dg/dz + dg/d conjugate(z) * e^{-2*i*theta})
  + df/d conjugate(g) * (d conjugate(g)/dz + d conjugate(g)/d conjugate(z) * e^{-2*i*theta}) =

= df/dg * Dg/Dz + df/d conjugate(g) * D conjugate(g) / Dz

Which we derived above. We have f(g) -> |g| and g(z) -> f(z). So we get:

D |f| / Dz = d|f|/df * Df/Dz + d|f|/d conjugate(f) * D conjugate(f) / Dz =

= (conjugate(f) * Df/Dz + f * D conjugate(f) / Dz) / (2*abs(f))

And then:

Df/Dz = f.diff(z)
D conjugate(f) / Dz = conjugate(f).diff(z)

So I think the formula:

abs(f).diff(z) = (conjugate(f)*f.diff(z) + f*conjugate(f).diff(z)) / (2*abs(f))

is the most general formula for any theta. The theta dependence is hidden
in conjugate(f).diff(z), since if "f" is analytic, like f=log(z), the
conjugate(f) is
not analytic, and so the derivative is theta dependent.
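As a sanity check on this formula with theta = 0 (differentiating along the real axis), it can be compared against a numeric directional derivative for an analytic f such as exp; a plain-Python sketch, with illustrative function names:

```python
import cmath

def numeric_dabs(f, z, h=1e-6):
    # numeric D|f|/Dz along the real axis (theta = 0)
    return (abs(f(z + h)) - abs(f(z - h))) / (2*h)

def formula_dabs(f, fprime, z):
    # (conjugate(f)*Df/Dz + f*D conjugate(f)/Dz) / (2*abs(f)); along the
    # real axis, D conjugate(f)/Dz for analytic f is conjugate(f'(z))
    fz, fpz = f(z), fprime(z)
    return (fz.conjugate()*fpz + fz*fpz.conjugate()) / (2*abs(fz))

z0 = 1.0 + 2.0j   # arbitrary test point
num = numeric_dabs(cmath.exp, z0)
sym = formula_dabs(cmath.exp, cmath.exp, z0)   # exp is its own derivative
assert abs(num - sym) < 1e-5
```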

The below holds though:

Bill Page

unread,
Nov 18, 2014, 4:50:20 PM11/18/14
to sage-devel
On 18 November 2014 15:19, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Tue, Nov 18, 2014 at 12:14 PM, Bill Page <bill...@newsynthesis.org> wrote:
>>
>> abs(x).diff(x)
>>
>> would return the symbolic expression
>>
>> conjugate(x)/(2*abs(x)) + conjugate(x)/(2*abs(x))* e^{-2*i*theta}
>
> I think you made a mistake, the correct expression is:
>
> conjugate(x)/(2*abs(x)) + x/(2*abs(x)) * e^{-2*i*theta}
>

Yes, sorry.

>> ...
>> I thought rather that what you were proposing was to set theta=0
>> from the start. If you did that, then I think you still have problems
>> with the chain rule.
>
> For a CAS, I was leaning towards using theta=0. But given your
> objections, I first needed to figure out the most general case that
> covers everything. I think that's now sufficiently clarified.
>

OK.

>> Let me add that the kind of solution to this problem that I did
>> imagine was to implement two derivatives, for example both
>>
>> f.diff(z) = df/dz + df/d conjugate(z)
>>
>> and
>>
>> f.diff2(z) = df/dz - df/d conjugate(z)
>>
>> diff(z) would equal diff2(z) for all analytic functions and diff would
>> reduce to the derivative of real non-analytic functions as you desire.
>
> Right, diff() is for theta = 0. diff2() is for theta=pi/2, i.e. taking
> the derivative along the imaginary axis.
>
>> Note that for abs we have
>>
>> abs(z).diff2(z) = 0
>
> Actually, for abs you have:
>
> abs(z).diff2(z) = (conjugate(z)-z)/(2*abs(z))
>

Yes again, sorry. Of course 0 only if conjugate(z)=z.

>> but not in general. There would be no need to discuss this 2nd
>> derivative with less experienced users until they were ready to
>> consider more "advanced" mathematics.
>>
>> Clearly we could implement the chain rule given these two derivatives.
> ...

On 18 November 2014 15:46, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Tue, Nov 18, 2014 at 1:19 PM, Ondřej Čertík <ondrej...@gmail.com> wrote:
> ..
>>
>> So I think that functions can return their own correct derivative, for
>> example analytic functions just return the unique complex derivative,
>> for example:
>>
>> log(z).diff(z) = 1/z
>>
>> This holds for all cases. Non-analytic functions like abs(f) can return:
>>
>> abs(f).diff(z) = (conjugate(f)*f.diff(z) +
>> f*conjugate(f).diff(z)*e^{-2*i*theta}) / (2*abs(f))
>
> Actually, I think I made a mistake. Let's do abs(f).diff(x) again for
> the most general case. We use:
>
> D f(g) / D z =
>
> = df/dg * (dg/dz + dg/d conjugate(z) * e^{-2*i*theta}) + df/d
> conjugate(g) * (d conjugate(g)/dz + d conjugate(g)/d conjugate(z) *
> e^{-2*i*theta}) =
>
> = df/dg Dg/Dz + df/d conjugate(g) D conjugate(g) / Dz
>
> Which we derived above. We have f(g) -> |g| and g(z) -> f(z). So we get:
>
> D |f| / Dz = d|f|/df * Df/Dz + d|f|/d conjugate(f) * D conjugate(f) / Dz =
>
> = (conjugate(f) * Df/Dz + f * D conjugate(f) / Dz) / (2*abs(f))
>
> And then:
>
> Df/Dz = f.diff(z)
> D conjugate(f) / Dz = conjugate(f).diff(z)
>
> So I think the formula:
>
> abs(f).diff(z) = (conjugate(f)*f.diff(z) + f*conjugate(f).diff(z)) / (2*abs(f))
>
> is the most general formula for any theta. The theta dependence is hidden
> in conjugate(f).diff(z), since if "f" is analytic, like f=log(z), the
> conjugate(f) is not analytic, and so the derivative is theta dependent.
>
> The below holds though:
>
>>
>> I think that's the correct application of the chain rule. We can set
>> theta=0, so we would just return:
>>
>> abs(f).diff(z) = (conjugate(f)*f.diff(z) + f*conjugate(f).diff(z)) / (2*abs(f))
>>
>> Which for real "f" (i.e. conjugate(f)=f) simplifies to (as a special case):
>>
>> abs(f).diff(z) = (f*f.diff(z) + f*f.diff(z)) / (2*abs(f)) = f/abs(f) *
>> f.diff(z) = sign(f) * f.diff(z)
>>
>> So it all works.
>>
>> Unless there is some issue that I don't see, it seems to me we just
>> need to have one diff(z) function, no need for diff2().
>>

Hmmm... So given only f(z).diff(z) as you have defined it above, how
do I get Df(z)/D conjugate(z), i.e. the other Wirtinger derivative?
Or are you claiming that this is not necessary in general in spite of
the Wirtinger formula for the chain rule?

Bill.

Ondřej Čertík

unread,
Nov 18, 2014, 5:40:04 PM11/18/14
to sage-...@googlegroups.com
In my notation, the Wirtinger derivatives are d f(z) / d z and d f(z) /
d conjugate(z). The Df(z) / Dz is the complex derivative taken in
the direction theta (which could be theta=0). The chain rule, as I
derived above using the chain rules for Wirtinger derivatives, is:

D f(g) / D z = df/dg Dg/Dz + df/d conjugate(g) D conjugate(g) / Dz

I don't see why you would need the isolated Wirtinger derivatives. The
method that implements the derivative of the given function, like
log(z) or abs(z) would simply return the correct formula, as I said
above, e.g.

log(z).diff(z) = 1/z

abs(f).diff(z) = (conjugate(f)*f.diff(z) + f*conjugate(f).diff(z)) / (2*abs(f))

Both formulas hold for any theta. I guess it depends on how the CAS is
implemented, maybe some CASes have general machinery for
derivatives. But I am pretty sure you can simply implement it as I
outlined.
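In SymPy, for example, this per-function approach can be sketched by giving a custom function its own derivative rule; `MyAbs` is a hypothetical stand-in for illustration, not SymPy's built-in `Abs`:

```python
from sympy import Function, Symbol, conjugate, simplify

class MyAbs(Function):
    # hypothetical abs implementing the theta = 0 rule from the thread
    def _eval_derivative(self, z):
        f = self.args[0]
        return (conjugate(f)*f.diff(z) + f*conjugate(f).diff(z)) / (2*MyAbs(f))

x = Symbol('x', real=True)
# for real x, conjugate(x) == x, so the rule reduces to x/MyAbs(x)
assert simplify(MyAbs(x).diff(x) - x/MyAbs(x)) == 0
```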

Let me know if you found any issue with this.

Ondrej

w huang

unread,
Nov 18, 2014, 8:43:05 PM11/18/14
to sage-...@googlegroups.com
Hi,

With Sage 6.3, I am getting:

sage: abs(x).diff(x)
x/abs(x)
sage: abs(I*x).diff(x)
-x/abs(I*x)

But abs(I*x) == abs(x). So also abs(x).diff(x) and abs(I*x).diff(x)
must be the same. But in the first case we get x/abs(x), and in the
second we got -x/abs(x).

In SymPy, the answer is:
------///

www.mathHandbook.com show
http://www.mathHandbook.com/input/?guess=d(abs(x))

Bill Page

unread,
Nov 18, 2014, 8:51:18 PM11/18/14
to sage-devel
On 18 November 2014 17:40, Ondřej Čertík <ondrej...@gmail.com> wrote:
>
> In my notation, the Wirtinger derivative is d f(z) / d z and d f(z) /
> d conjugate(z). The Df(z) / Dz is the complex derivative taking in
> direction theta (where it could be theta=0). Given the chain rule, as
> I derived above using chain rules for the Wirtinger derivative:
>
> D f(g) / D z = df/dg Dg/Dz + df/d conjugate(g) D conjugate(g) / Dz
>
> I don't see why you would need the isolated Wirtinger derivatives.

You mean that only the function being differentiated needs to the
Writinger derivatives (as part of the "formula" that it implements for
the chain rule)?

> The
> method that implements the derivative of the given function, like
> log(z) or abs(z) would simply return the correct formula, as I said
> above, e.g.
>
> log(z).diff(z) = 1/z
>
> abs(f).diff(z) = (conjugate(f)*f.diff(z) + f*conjugate(f).diff(z)) / (2*abs(f))
>

If the chain rule must be implemented by each function then I suppose
that you also have

log(f).diff(z) = f.diff(z) / f

right?

> Both formulas hold for any theta.

The generality provided by theta seems not to be of much interest.

> I guess it depends on how the CAS is implemented, maybe
> some CASes have a general machinery for derivatives. But
> I am pretty sure you can simply implemented it as I outlined.
>
> Let me know if you found any issue with this.
>

Is this how derivatives are implemented in sympy?

Bill.

Ondřej Čertík

unread,
Nov 18, 2014, 9:22:10 PM11/18/14
to sage-...@googlegroups.com
On Tue, Nov 18, 2014 at 6:51 PM, Bill Page <bill...@newsynthesis.org> wrote:
> On 18 November 2014 17:40, Ondřej Čertík <ondrej...@gmail.com> wrote:
>>
>> In my notation, the Wirtinger derivative is d f(z) / d z and d f(z) /
>> d conjugate(z). The Df(z) / Dz is the complex derivative taking in
>> direction theta (where it could be theta=0). Given the chain rule, as
>> I derived above using chain rules for the Wirtinger derivative:
>>
>> D f(g) / D z = df/dg Dg/Dz + df/d conjugate(g) D conjugate(g) / Dz
>>
>> I don't see why you would need the isolated Wirtinger derivatives.
>
> You mean that only the function being differentiated needs to the
> Writinger derivatives (as part of the "formula" that it implements for
> the chain rule)?

Did you mean to write "You mean that only the function being
differentiated needs to do the
Writinger derivatives..."?

Yes.

>
>> The
>> method that implements the derivative of the given function, like
>> log(z) or abs(z) would simply return the correct formula, as I said
>> above, e.g.
>>
>> log(z).diff(z) = 1/z
>>
>> abs(f).diff(z) = (conjugate(f)*f.diff(z) + f*conjugate(f).diff(z)) / (2*abs(f))
>>
>
> If the chain rule must be implemented by each function then I suppose
> that you also have
>
> log(f).diff(z) = f.diff(z) / f
>
> right?

Yes, correct.

>
>> Both formulas hold for any theta.
>
> The generality provided by theta seems not to be of much interest.

I agree, I think one can implicitly set theta=0 in a CAS. So that's
the option 2) in my email above.

>
>> I guess it depends on how the CAS is implemented, maybe
>> some CASes have a general machinery for derivatives. But
>> I am pretty sure you can simply implemented it as I outlined.
>>
>> Let me know if you found any issue with this.
>>
>
> Is this how derivatives are implemented in sympy?

It's a little more complicated in sympy, but that's how derivatives
are implemented in csympy (https://github.com/sympy/csympy).

Ondrej

Bill Page

unread,
Nov 19, 2014, 9:36:29 AM11/19/14
to sage-devel


On 18 November 2014 21:22, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Tue, Nov 18, 2014 at 6:51 PM, Bill Page <bill...@newsynthesis.org> wrote:
>> On 18 November 2014 17:40, Ondřej Čertík <ondrej...@gmail.com> wrote:
>>>
>>> In my notation, the Wirtinger derivative is d f(z) / d z and d f(z) /
>>> d conjugate(z). The Df(z) / Dz is the complex derivative taking in
>>> direction theta (where it could be theta=0). Given the chain rule, as
>>> I derived above using chain rules for the Wirtinger derivative:
>>>
>>> D f(g) / D z = df/dg Dg/Dz + df/d conjugate(g) D conjugate(g) / Dz
>>>
>>>
>>> abs(f).diff(z) = (conjugate(f)*f.diff(z) + f*conjugate(f).diff(z)) / (2*abs(f))
>>>
>>
>>>
>>> Let me know if you found any issue with this.
>>>

I implemented this in FriCAS and tried a few examples, e.g.

(4) -> D(abs(f(z,conjugate(z))),z)

            _     _        _         _
        f(z,z)f  (z,z) + f(z,z)f  (z,z)
               ,2               ,1
   (4)  -------------------------------
                           _
                  2abs(f(z,z))
                                                    Type: Expression(Integer)



where the ,1 and ,2 notation represents the derivative with respect to the first and second arguments of f, respectively.

Then I noticed that if we have f=z we get

  conjugate(z).diff(z)

which is 0.  So the 2nd term is 0 and the result is just the first Wirtinger derivative.

Perhaps I am misinterpreting something?

Bill.

Bill Page

unread,
Nov 19, 2014, 10:19:07 AM11/19/14
to sage-devel


On 2014-11-19 9:36 AM, "Bill Page" <bill...@newsynthesis.org> wrote:
> ...

> Then I noticed that if we have f=z we get
>
>   conjugate(z).diff(z)
>
> which is 0.  So the 2nd term is 0 and the result is just the first Wirtinger derivative.
>
> Perhaps I am misinterpreting something?
>

Oops, my fault.  According to your definition

  conjugate(z).diff(z) = 1

Bill.

Bill Page

unread,
Nov 19, 2014, 11:32:22 AM11/19/14
to sage-devel
OK, this looks better!

(1) -> D(abs(x),x)

         _
         x + x
   (1)  -------
        2abs(x)
                                                    Type: Expression(Integer)
(2) -> D(conjugate(x),y)

   (2)  0
                                                    Type: Expression(Integer)
(3) -> D(conjugate(x),x)

   (3)  1
                                                    Type: Expression(Integer)
(4) -> f:=operator 'f

   (4)  f
                                                          Type: BasicOperator
(5) -> D(abs(f(x)),x)

             , _      _  ,
        f(x)f (x) + f(x)f (x)

   (5)  ---------------------
              2abs(f(x))
                                                    Type: Expression(Integer)
(6) -> D(abs(log(x)),x)

        _    _
        xlog(x) + x log(x)
   (6)  ------------------
            _
          2xxabs(log(x))
                                                    Type: Expression(Integer)

Ondřej Čertík

unread,
Nov 19, 2014, 11:39:57 AM11/19/14
to sage-...@googlegroups.com
Right, because this "diff" is the total derivative in the direction
theta, so the first Wirtinger derivative is 0, the second one is 1 and
you get:

0 + 1*e^{-2*i*theta}

and if you implicitly set theta=0, then you get 1.

Ondrej

Ondřej Čertík

unread,
Nov 19, 2014, 11:42:27 AM11/19/14
to sage-...@googlegroups.com
That looks good, right? What about arg(z)? What are the Wirtinger
derivatives of arg(z)? Do you have other examples of non-analytic
functions?

Would you mind posting your patch to FriCAS somewhere? I would be
interested in how you implemented it.

Ondrej

Ondřej Čertík

unread,
Nov 19, 2014, 11:51:21 AM11/19/14
to sage-...@googlegroups.com
I'll try to compile FriCAS myself and apply your patch, so that I can
play with it. Can you also try:

abs(I*x)

1/abs(x)
1/abs(x)^2
x/abs(x)^3
abs(x)^2

The x/abs(x)^3 is a Coulomb's law in 1D.

Ondrej

Bill Page

unread,
Nov 19, 2014, 12:29:37 PM11/19/14
to sage-devel, fricas-devel
Since this mostly concerns FriCAS I am cross-posting to that group.  I will also post the patch there.  For reference on the FriCAS list, the original email thread is here:

https://groups.google.com/forum/#!topic/sage-devel/6j-LcC6tpkE

Here is the result of compiling the patch against the current SourceForge svn trunk:

wspage@opensuse:~> fricas
The directory for FriCAS, /usr/local/lib/fricas/target/x86_64-suse-linux, does not exist.
Goodbye.
wspage@opensuse:~> fricas
Checking for foreign routines
AXIOM="/usr/local/lib64/fricas/target/x86_64-suse-linux"
spad-lib="/usr/local/lib64/fricas/target/x86_64-suse-linux/lib/libspad.so"
foreign routines found
openServer result 0
                       FriCAS Computer Algebra System
                         Version: FriCAS 2014-11-14
                   Timestamp: Wed Nov 19 11:57:49 EST 2014
-----------------------------------------------------------------------------
   Issue )copyright to view copyright notices.
   Issue )summary for a summary of useful system commands.
   Issue )quit to leave FriCAS and return to shell.
-----------------------------------------------------------------------------

 

(1) -> D(abs(x),x)

         _
         x + x
   (1)  -------
        2abs(x)
                                                    Type: Expression(Integer)
(2) -> D(conjugate(x),x)

   (2)  1
                                                    Type: Expression(Integer)
(3) -> f:=operator 'f

   (3)  f
                                                          Type: BasicOperator
(4) -> D(abs(f(x)),x)


             , _      _  ,
        f(x)f (x) + f(x)f (x)

   (4)  ---------------------
              2abs(f(x))
                                                    Type: Expression(Integer)
(5) -> D(abs(log(x)),x)


        _    _
        xlog(x) + x log(x)
   (5)  ------------------
            _
          2xxabs(log(x))
                                                    Type: Expression(Integer)
(6) -> D(log(abs(x)),x)

          _
          x + x
   (6)  --------
               2
        2abs(x)
                                                    Type: Expression(Integer)
(7) -> D(abs(%i*x),x)

           _
           x + x
   (7)  ----------
        2abs(%i x)
                                           Type: Expression(Complex(Integer))
(8) -> D(1/abs(x),x)

           _
         - x - x
   (8)  --------
               3
        2abs(x)
                                                    Type: Expression(Integer)
(9) -> D(1/abs(x)^2,x)

          _
        - x - x
   (9)  -------
              4
        abs(x)
                                                    Type: Expression(Integer)
(10) -> D(x/abs(x)^3,x)

             _          2     2
         - 3xx + 2abs(x)  - 3x
   (10)  ----------------------
                       5
                2abs(x)
                                                    Type: Expression(Integer)
(11) -> D(abs(x)^2,x)

         _
   (11)  x + x
                                                    Type: Expression(Integer)

Bill.

kcrisman

unread,
Nov 19, 2014, 9:23:57 PM11/19/14
to sage-...@googlegroups.com

Since this mostly concerns FriCAS I am cross posting to that group.  I will also post the patch there.  For FriCAS list reference the original email thread is here:


But if you come up with a solution Sage (or Ginac, or whatever) can implement too, please let us know!

Bill Page

unread,
Nov 19, 2014, 9:36:25 PM11/19/14
to sage-devel
On 19 November 2014 21:23, kcrisman <kcri...@gmail.com> wrote:
>
>
>> Since this mostly concerns FriCAS I am cross posting to that group. I will also post the patch there. For FriCAS list reference the original email thread is here:
>>
>
> But if you come up with a solution Sage (or Ginac, or whatever) can implement too, please let us know!
>

Right now Ondrej's proposed definition is looking pretty good to me
but I think it needs more extensive testing. Apparently Ginac with
Vladimir V. Kisil's patch is able to compute at least some of the
results I showed with FriCAS. If someone has used Ginac and is able
to compile it with the patch, it would be good to have these results
for comparison.

Yes, certainly. We can also continue this thread.

Bill.

Ondřej Čertík

unread,
Nov 20, 2014, 1:54:32 AM11/20/14
to sage-...@googlegroups.com
What you posted looks good. But we need to test it for arg(z), re(z),
im(z) and any other non-analytic function that we can find.

Ondrej

Bill Page

unread,
Nov 20, 2014, 9:41:09 AM11/20/14
to sage-devel
On 20 November 2014 01:54, Ondřej Čertík <ondrej...@gmail.com> wrote:
>
> What you posted looks good. But we need to test it for arg(z), re(z),
> im(z) and any other non-analytic function that we can find.
>

(1) -> re(x)==(conjugate(x)+x)/2  
                                                                   Type: Void
(2) -> im(x)==%i*(conjugate(x)-x)/2
                                                                   Type: Void
(3) -> arg(x)==log(x/abs(x))/%i  
                                                                   Type: Void
(4) -> re %i
   Compiling function re with type Complex(Integer) -> Fraction(Complex
      (Integer))

   (4)  0
                                             Type: Fraction(Complex(Integer))
(5) -> im %i
   Compiling function im with type Complex(Integer) -> Fraction(Complex
      (Integer))

   (5)  1
                                             Type: Fraction(Complex(Integer))
(6) -> arg %i
   Compiling function arg with type Complex(Integer) -> Expression(
      Complex(Integer))

   (6)  - %i log(%i)
                                           Type: Expression(Complex(Integer))
(7) -> complexNumeric %

   (7)  1.5707963267_948966192
                                                         Type: Complex(Float)
(8) -> D(re(x),x)
   Compiling function re with type Variable(x) -> Expression(Integer)

   (8)  1
                                                    Type: Expression(Integer)
(9) -> D(im(x),x)
   Compiling function im with type Variable(x) -> Expression(Complex(
      Integer))

   (9)  0
                                           Type: Expression(Complex(Integer))
(10) -> D(arg(x),x)
   Compiling function arg with type Variable(x) -> Expression(Complex(
      Integer))

             _             2       2
         %i xx - 2%i abs(x)  + %i x
   (10)  ---------------------------
                           2
                  2x abs(x)
                                           Type: Expression(Complex(Integer))


I had a thought. I suppose that all non-analytic (nonholomorphic) functions of interest can be written in terms of conjugate and some analytic functions, e.g.

  abs(x)=sqrt(x*conjugate(x))

so perhaps all we really need is to know how to differentiate conjugate properly?

Bill

Bill Page

unread,
Nov 20, 2014, 9:53:05 AM11/20/14
to sage-devel
So here (20) is a simpler expression for derivative of arg:

(16) -> abs(x)==sqrt(x*conjugate(x))
   Compiled code for abs has been cleared.
   Compiled code for arg has been cleared.
   1 old definition(s) deleted for function or rule abs
                                                                   Type: Void
(17) -> arg(x)==log(x/abs(x))/%i  
   1 old definition(s) deleted for function or rule arg
                                                                   Type: Void
(18) -> arg %i                      
   Compiling function abs with type Complex(Integer) -> Expression(

      Complex(Integer))
   Compiling function arg with type Complex(Integer) -> Expression(
      Complex(Integer))

   (18)  - %i log(%i)
                                           Type: Expression(Complex(Integer))
(19) -> complexNumeric %          

   (19)  1.5707963267_948966192
                                                         Type: Complex(Float)
(20) -> D(arg(x),x)                
   Compiling function abs with type Variable(x) -> Expression(Integer)

   Compiling function arg with type Variable(x) -> Expression(Complex(
      Integer))

             _
         - %ix + %i x
   (20)  ------------
                _
              2xx
                                           Type: Expression(Complex(Integer))


In general I am a little uncertain if, how and when to deal with simplifications of expressions like abs that can be expressed in terms of more fundamental/elementary functions.  What do you think?

Bill.

On 20 November 2014 09:41, Bill Page <bill...@newsynthesis.org> wrote:
> ...
>

Ondřej Čertík

unread,
Nov 20, 2014, 11:08:01 AM11/20/14
to sage-...@googlegroups.com
I haven't thought of that, but I think you are right. It's definitely
true for abs(x), arg(x), re(x), im(x) and conjugate(x). Other
non-analytic functions are combinations of those. The only other way
to create some non-analytic functions is to define their real and
imaginary parts using "x" and "y", e.g.

f(x+iy) = (x^2+y^2) + i*(2*x*y)

You can imagine arbitrary complicated expressions. But then you just
substitute z, conjugate(z) for x, y.

So I think that for most things that people would use a CAS for, this is true.

>
> Bill

Ondřej Čertík

unread,
Nov 20, 2014, 11:16:39 AM11/20/14
to sage-...@googlegroups.com
The identity abs(z) = sqrt(z*conjugate(z)) is just the same problem as
for things like exp(x) = E^x, csc(x) = 1/sin(x), sinh(x) =
(exp(x)-exp(-x))/2, asin(x) = -i*log(i*x+sqrt(1-x^2)), asinh(x) =
log(x+sqrt(1+x^2)), ...

Essentially a lot of functions can be written using simpler functions.
The expression is sometimes simpler using abs() and sometimes simpler
using sqrt(x*conjugate(x)), and that is true for all the other cases
too. So a CAS needs to be able to handle both, and allow the user to
convert one to the other. For example in SymPy, we can do:

In [1]: sinh(x)**2 + 1
Out[1]: sinh(x)**2 + 1

In [2]: (sinh(x)**2 + 1).rewrite(exp)
Out[2]: (exp(x)/2 - exp(-x)/2)**2 + 1

In [3]: _.expand()
Out[3]: exp(2*x)/4 + 1/2 + exp(-2*x)/4


In general, my approach is that I try to define the derivative of
abs(x) in the simplest possible way, which seems to be in terms of
abs(x) as well, instead of sqrt(x*conjugate(x)). But the CAS needs to
be able to rewrite it later if needed, because sometimes things can
simplify.

Ondrej

>
> Bill.
>
> On 20 November 2014 09:41, Bill Page <bill...@newsynthesis.org> wrote:
>> ...
>>
>> I had a thought. I suppose that all non-analytic (nonholomorphic)
>> functions of interest can be written in terms of conjugate and some analytic
>> functions, e.g.
>>
>> abs(x)=sqrt(x*conjugate(x))
>>
>> so perhaps all we really need is to know how to differentiate conjugate
>> properly?
>>
>> Bill
>

Ondřej Čertík

unread,
Nov 20, 2014, 11:20:17 AM11/20/14
to sage-...@googlegroups.com
Or to put it differently, the reason we even have functions
like exp(x), csc(x), sinh(x), asin(x), asinh(x) is that things are
sometimes simpler if you use them instead of their definitions in
terms of more elementary functions. Perhaps with the exception of
csc(x) = 1/sin(x), where I personally don't see an advantage of
introducing a new function for just 1/sin(x). But all the other ones
do simplify things, sometimes. And so the art is to use them so as to
produce the simplest expression at the end. I think that's all there
is to it.

Ondrej

Bill Page

unread,
Nov 20, 2014, 11:59:20 AM11/20/14
to sage-devel
Perhaps this is more or less where Richardson's theorem enters.

http://en.wikipedia.org/wiki/Richardson%27s_theorem

We badly want a reliable way to determine when an expression is
identically zero. In general this is not possible, but if we restrict
ourselves to a subset of "elementary" functions, in particular if we
can avoid 'abs', then it is in principle decidable (notwithstanding
the possible undecidability of equality of constants). As I understand
it, FriCAS effectively relies on this as part of the machinery for
integration, e.g. in 'rischNormalize'. Waldek's challenge to me on
the FriCAS list, regarding my proposals related to conjugate and
this thread, was to show that it is possible to include 'conjugate' and
still have a decidable system given the complex equivalent of
Richardson's theorem.

So far I have not been able to meet this challenge or even to find any
specific relevant publications. Perhaps it is obvious that
this is not possible, given the definition of abs in terms of conjugate
and sqrt. I would be interested if anyone here has considered this
issue or might suggest some leads. Of course, this is likely not of
much interest in computer algebra systems that take a more
pragmatic approach than FriCAS/Axiom.

Bill.

Ondřej Čertík

unread,
Nov 20, 2014, 12:56:39 PM11/20/14
to sage-...@googlegroups.com
On Thu, Nov 20, 2014 at 9:59 AM, Bill Page <bill...@newsynthesis.org> wrote:
> Perhaps this is more or less where Richardson's theorem enters.
>
> http://en.wikipedia.org/wiki/Richardson%27s_theorem
>
> We badly want a reliable way to determine when an expression is
> identically zero. In general this is not possible, but if we restrict
> our selves to a subset of "elementary" functions, in particular if we
> can avoid 'abs', then it is in principle decidable (not withstanding
> the possible undecidability of equality of constants). As I understand
> it FriCAS effectively relies on this as part of the machinery for
> integration, e.g. in 'rischNormalize'. Waldek's challenge to me on
> the FriCAS list in regards to my proposals related to conjugate and
> this thread was to show that it is possible to include 'conjugate' and
> still have a decidable system given the complex equivalent of
> Richardson's theorem.
>
> So far I have not been able to meet this challenge or even to find any
> specific relevant related publications. Perhaps it is obvious that
> this is not possible given the definition of abs in terms of conjugate
> and sqrt. I would be interested in anyone here has considered this
> issue or might suggest some leads. Of course this is likely not of
> too much interest in computer algebra systems that take a more
> pragmatic approach than FriCAS/Axiom.

Can you give an example of an expression that cannot be decided by
Richardson's theorem? How does FriCAS do the zero testing? I.e. if you
give it

f(x) = sin(x)^2 + cos(x)^2-1

how does it decide that it is equal to 0?

Are we talking about functions of just one variable (f(x)) or more
(f(x, y, z, ...))?

Why can't you just use probabilistic testing, where you plug various
(complex) numbers into f(x) and test numerically that the result is
zero?

Ondrej

Bill Page

Nov 20, 2014, 9:53:28 PM
to sage-devel
On 20 November 2014 12:56, Ondřej Čertík <ondrej...@gmail.com> wrote:
> ...
> Can you give an example of an expression that cannot be decided by
> the Richardson's theorem?

Well, no, not exactly. Richardson's theorem is not about individual
expressions, it is about decidability, i.e. computability, in general.
Consider an expression of the form

f(x) - | f(x) |

where f(x) is composed from integers, the variable x, +, * and sin.
The question that Richardson's theorem answers is whether or not there
exists a program that can determine if f(x) - |f(x)| = 0 for all x.
This problem can be reduced to finding an algorithm to determine if
f(x) is everywhere non-negative. Richardson proves that no such
algorithm exists.

> How does FriCAS do the zero testing? I.e. if you
> give it
>
> f(x) = sin(x)^2 + cos(x)^2-1
>
> how does it decide that it is equal to 0?

This can be done by the function 'normalize' which first uses
'realElementary' to rewrite the expression using just 4 fundamental
real-valued elementary transcendental functions (kernels): log, exp,
tan, and arctan. E.g.

sin(x) = 2*tan(x/2)/(tan(x/2)^2+1)
cos(x) = (1-tan(x/2)^2)/(1+tan(x/2)^2)

For your example this suffices but if necessary it next rewrites the
result again using the minimum number of algebraically independent
kernels.
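[Editor's note: a quick numeric sanity check of these half-angle
rewrites in plain Python, at an arbitrary sample point; FriCAS of
course does this symbolically.]

```python
import math

x = 0.7  # arbitrary sample point
t = math.tan(x / 2)

# sin(x) = 2*tan(x/2)/(tan(x/2)^2 + 1)
assert math.isclose(math.sin(x), 2*t / (t**2 + 1))
# cos(x) = (1 - tan(x/2)^2)/(1 + tan(x/2)^2)
assert math.isclose(math.cos(x), (1 - t**2) / (1 + t**2))

# After the rewrite, sin(x)^2 + cos(x)^2 - 1 collapses to a rational
# function of t that is identically zero.
s, c = 2*t / (t**2 + 1), (1 - t**2) / (1 + t**2)
print(abs(s**2 + c**2 - 1))
```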

There is also a complex-valued version called 'complexElementary'
which uses only log and exp but may introduce the constant sqrt(-1).

>
> Are we talking about functions of just one variable (f(x)) or more
> (f(x, y, z, ...))?

In general more than one variable.

>
> Why cannot you just use the probabilistic testing, where you plug in
> various (complex) numbers into f(x) and test that it is equal to zero,
> numerically.
>

I suppose that might be pragmatic but I would not call it "computer
algebra" in the mathematical sense.

Bill.

Ondřej Čertík

Nov 20, 2014, 10:08:52 PM
to sage-...@googlegroups.com
On Thu, Nov 20, 2014 at 7:53 PM, Bill Page <bill...@newsynthesis.org> wrote:
> On 20 November 2014 12:56, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> ...
>> Can you give an example of an expression that cannot be decided by
>> the Richardson's theorem?
>
> Well, no not exactly. Richardson's theorem is not about individual
> expressions, it is about decidability, i.e. computability, in general.
> Consider an expression of the form
>
> f(x) - | f(x) |
>
> where f(x) is composed from integers, the variable x, +, * and sin.
> The question that Richardson's theorem answers is whether or not there
> exists a program that can determine if f(x) - |f(x)| = 0 for all x.
> This problem can be reduced to finding an algorithm to determine if
> f(x) is everywhere non-negative. Richardson proves that no such
> algorithm exists.

I see. But what does this have to do with the derivative of |f(x)| that
we are trying to figure out?

As you pointed out, the challenge is that if you include conjugate(x),
then you might be out of luck. But aren't you out of luck already if you
have abs(x) in the expression in the first place? I.e. taking a derivative
is not going to change anything, you are still out of luck.

>
>> How does FriCAS do the zero testing? I.e. if you
>> give it
>>
>> f(x) = sin(x)^2 + cos(x)^2-1
>>
>> how does it decide that it is equal to 0?
>
> This can be done by the function 'normalize' which first uses
> 'realElementary' to rewrite the expression using just 4 fundamental
> real-valued elementary transcendental functions (kernels): log, exp,
> tan, and arctan. E.g.
>
> sin(x) = 2*tan(x/2)/(tan(x/2)^2+1)
> cos(x) = (1-tan(x/2)^2)/(1+tan(x/2)^2)
>
> For your example this suffices but if necessary it next rewrites the
> result again using the minimum number of algebraically independent
> kernels.
>
> There is also a complex-valued version called 'complexElementary'
> which uses only log and exp but may introduce the constant sqrt(-1).

I see, clever.

>
>>
>> Are we talking about functions of just one variable (f(x)) or more
>> (f(x, y, z, ...))?
>
> In general more than one variable.
>
>>
>> Why cannot you just use the probabilistic testing, where you plug in
>> various (complex) numbers into f(x) and test that it is equal to zero,
>> numerically.
>>
>
> I suppose that might be pragmatic but I would not call it "computer
> algebra" in the mathematical sense.

Sure, this might be one of the many methods in a CAS.

Ondrej

Ondřej Čertík

Nov 21, 2014, 4:58:07 AM
to sage-...@googlegroups.com
I've written up all the equations from this thread together with a
detailed step-by-step derivation:

http://www.theoretical-physics.net/dev/math/complex.html

e.g. the derivatives are here:

http://www.theoretical-physics.net/dev/math/complex.html#complex-derivatives

Most of the examples from this thread are there, but a few are still
missing; I'll add them tomorrow.

Bill, please let me know if you have any feedback/comments on it.

Ondrej

Bill Page

Nov 21, 2014, 11:37:44 AM
to sage-devel
On 20 November 2014 22:08, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Thu, Nov 20, 2014 at 7:53 PM, Bill Page <bill...@newsynthesis.org> wrote:
> ...
>> This problem can be reduced to finding an algorithm to determine
>> if f(x) is everywhere non-negative. Richardson proves that no such
>> algorithm exists.
>
> I see. But what does this have to do with the derivative of |f(x)| that
> we are trying to figure out?
>

This has to do with 'conjugate' in general, not just derivatives of
expressions containing 'conjugate'. The problem is that 'conjugate'
is transcendental but it cannot be written in terms of log and exp.

> As you pointed out, the challenge is that if you include conjugate(x),
> then you might be out of luck. But aren't you out of luck already if
> you have abs(x) in the expression in the first place? I.e. taking a
> derivative is not going to change anything, you are still out of luck.
>

You are right about the derivative. But my limited understanding is
that the strategy is not to avoid 'abs(x)' but rather to avoid 'sin'.
We cannot similarly avoid 'conjugate' and in general the effect of
including 'conjugate' is apparently unknown. But one effect of
including 'conjugate' is that we can have expressions like

x+conjugate(x)

which is necessarily real-valued, rather like 'abs(x)' for x
real-valued is non-negative. So it would be nice to know whether, for
any expression composed of x, integers, +, *, sin, and conjugate,
there is an algorithm to determine if the expression is everywhere
real-valued.

Ondřej Čertík

Nov 21, 2014, 8:18:31 PM
to sage-...@googlegroups.com
I am still confused about one thing: was this issue already present
in FriCAS before your changes?
Because you can already use conjugate, sin, +, *, ..., even without
defining the derivative of abs(x). I fail to see how defining
abs(x).diff(x) the way you did can introduce issues that weren't
present in the first place.

-----

I have finished the writeup, it starts here (you might want to refresh
your browser
to see the latest changes):

http://www.theoretical-physics.net/dev/math/complex.html#complex-conjugate

and it was implemented with these two PRs:

https://github.com/certik/theoretical-physics/pull/39
https://github.com/certik/theoretical-physics/pull/40

I must say, one thing I like about the "theta" is that it tells
you immediately whether the function is analytic (if theta is
present it is not; if it is absent, the expression does not
depend on theta and thus is analytic). For example, for log(z), the
theta cancels, and so the result 1/z is analytic.

I found a bug in these results from FriCAS:

> (4) -> D(abs(f(x)),x)
>
> , _ _ ,
> f(x)f (x) + f(x)f (x)
>
> (4) ---------------------
> 2abs(f(x))
> Type:
> Expression(Integer)
> (5) -> D(abs(log(x)),x)
>
> _ _
> xlog(x) + x log(x)
> (5) ------------------
> _
> 2xxabs(log(x))
> Type:
> Expression(Integer)

The bar must be over the whole f(x) as well as log(x), because
conjugate(log(x)) is only equal to log(conjugate(x)) if x is not a
negative real number. See the example here:
http://www.theoretical-physics.net/dev/math/complex.html#id1 where I
have it explicitly worked out. You can also check this easily in
Python:

In [1]: from cmath import log

In [2]: x = -1+1j

In [3]: log(x).conjugate()
Out[3]: (0.34657359027997264-2.356194490192345j)

In [4]: log(x.conjugate())
Out[4]: (0.34657359027997264-2.356194490192345j)

In [5]: x = -1

In [6]: log(x).conjugate()
Out[6]: -3.141592653589793j

In [7]: log(x.conjugate())
Out[7]: 3.141592653589793j

In [8]: log(x.conjugate()) - 2*pi*1j
Out[8]: -3.141592653589793j


Where [3] and [4] are equal, but [6] and [7] are not (you need to
subtract 2*pi*i from [7], as in [8], in order to recover [6],
consistent with the formula in the writeup).


Apart from this issue, all other examples that you posted seem to be
correct and agree with my hand based step by step calculation in the
writeup.

Ondrej

Bill Page

Nov 22, 2014, 9:23:17 AM
to sage-devel
On 21 November 2014 at 20:18, Ondřej Čertík <ondrej...@gmail.com> wrote:
>
> I am still confused about one thing: is this issue is already
> present in FriCAS before your changes? Because you can
> already use conjugate, sin, +, *, ..., even without defining the
> derivative for abs(x). I fail to see how defining the abs(x).diff(x)
> in the way you did it can introduce issues that weren't present
> in the first place.
>

FriCAS currently does not implement a symbolic 'conjugate' operator.
The issue concerns whether adding 'conjugate' is a good idea and only
secondly how to differentiate it.

> -----
>
> I have finished the writeup, it starts here (you might want to refresh
> your browser to see the latest changes):
>
> http://www.theoretical-physics.net/dev/math/complex.html#complex-conjugate
>
> and it was implemented with these two PRs:
>
> https://github.com/certik/theoretical-physics/pull/39
> https://github.com/certik/theoretical-physics/pull/40
>

Thanks.

> I must say one thing that I like about the "theta" is that it tells
> you immediately if the function is analytic or not (if theta is
> present it is not, if it is not present, then the expression does not
> depend on theta, and thus is analytic). For example, for log(z),
> the theta cancels, and so the result 1/z is analytic.
>

Still looks ugly to me.

> I found a bug in these results from FriCAS:
>
>> (4) -> D(abs(f(x)),x)
>>
>> , _ _ ,
>> f(x)f (x) + f(x)f (x)
>>
>> (4) ---------------------
>> 2abs(f(x))
>> Type:
>> Expression(Integer)
>> (5) -> D(abs(log(x)),x)
>>
>> _ _
>> xlog(x) + x log(x)
>> (5) ------------------
>> _
>> 2xxabs(log(x))
>> Type:
>> Expression(Integer)
>
> The bar must be over the whole f(x) as well as log(x), because
> conjugate(log(x)) is only equal log(conjugate(x)) if x is not
> negative real number.

In FriCAS with my patch functions defined by

f := operator 'f

are currently assumed to be holomorphic, and log is holomorphic by definition, so

conjugate(log(x)) = log(conjugate(x))

Perhaps you are considering the wrong branch.

> See the example here:
> http://www.theoretical-physics.net/dev/math/complex.html#id1 where I
> have it explicitly worked out. You can also check that easily in
> Python:
>
> In [1]: from cmath import log
>
> In [2]: x = -1+1j
>
> In [3]: log(x).conjugate()
> Out[3]: (0.34657359027997264-2.356194490192345j)
>
> In [4]: log(x.conjugate())
> Out[4]: (0.34657359027997264-2.356194490192345j)
>
> In [5]: x = -1
>
> In [6]: log(x).conjugate()
> Out[6]: -3.141592653589793j
>
> In [7]: log(x.conjugate())
> Out[7]: 3.141592653589793j
>
> In [8]: log(x.conjugate()) - 2*pi*1j
> Out[8]: -3.141592653589793j
>
>
> Where [3] and [4] are equal, but [6] and [7] are not (you need to
> subtract 2*pi*i from [7], as in [8], in order to recover [6],
> consistent with the formula in the writeup).
>

Complex 'log' is multi-valued like 'sqrt', so you need to consider
more than one branch.

Bill.

Bill Page

Nov 22, 2014, 11:13:00 AM
to sage-devel
On 21 November 2014 at 20:18, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Fri, Nov 21, 2014 at 9:37 AM, Bill Page <bill...@newsynthesis.org> wrote:
>>
>> You are right about the derivative. But my limited understanding
>> is that the strategy is not to avoid 'abs(x)' but rather to avoid 'sin'.
>> We cannot similarly avoid 'conjugate' and in general the effect
>> of including 'conjugate' is apparently unknown. But one effect
>> of including 'conjugate' is that we can have expressions like
>>
>> x+conjugate(x)
>>
>> which is necessarily real-valued, rather like 'abs(x)' for x
>> real-valued is non-negative.
> ...

On reconsidering "my limited understanding", I see that, contrary to
what I said, rewriting

sin(x) = 2*tan(x/2)/(tan(x/2)^2+1)

does not avoid Richardson's theorem. Rather I think what is really
going on in FriCAS is rewriting

abs(x) = sqrt(x^2)

or in my case

abs(x) = sqrt(x*conjugate(x))

taking 'sqrt' as "algebraic", i.e. the solution to y^2 = x, without
selecting a specific branch.
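[Editor's note: a small numeric illustration of the difference between
the two rewrites, using Python's principal branches. For real x,
sqrt(x^2) recovers |x|, but for complex z only the conjugate form does.]

```python
import cmath
import math

x = -3.0
# For real x, the principal sqrt(x^2) recovers |x|.
assert cmath.sqrt(x**2).real == abs(x)

z = 1 - 2j
# For complex z, the principal sqrt(z^2) returns z itself here, not |z|...
assert not cmath.isclose(cmath.sqrt(z**2), abs(z))
# ...while sqrt(z * conjugate(z)) = sqrt(|z|^2) always gives |z|.
assert math.isclose(cmath.sqrt(z * z.conjugate()).real, abs(z))
```

Treating sqrt as algebraic, i.e. not fixing a branch, sidesteps this
choice entirely, which is the point made above.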

Bill.

Ondřej Čertík

Nov 22, 2014, 12:34:49 PM
to sage-...@googlegroups.com
On Sat, Nov 22, 2014 at 7:23 AM, Bill Page <bill...@newsynthesis.org> wrote:
> On 21 November 2014 at 20:18, Ondřej Čertík <ondrej...@gmail.com> wrote:
>>
>> I am still confused about one thing: is this issue is already
>> present in FriCAS before your changes? Because you can
>> already use conjugate, sin, +, *, ..., even without defining the
>> derivative for abs(x). I fail to see how defining the abs(x).diff(x)
>> in the way you did it can introduce issues that weren't present
>> in the first place.
>>
>
> FriCAS currently does not implement a symbolic 'conjugate' operator.
> The issue concerns whether adding 'conjugate' is a good idea and only
> secondly how to differentiate it.

Ah, I had no idea that FriCAS does not implement conjugate(x). How do
you handle complex numbers then?
In SymPy and Sage, conjugate(x) is built in, so adding a derivative
of abs(x) does not make things worse.
Well, you are right that in theory you define log(z) as
log(z) = log|z| + i*arg(z), with arg(z) multivalued, i.e. you can add
2*pi*n to it, and thus add 2*pi*i*n to log(z). Since [6] and [7]
differ by 2*pi*i, they are indeed the same number.
However, this definition quickly becomes impractical, because you need
to be able to numerically evaluate symbolic expressions, and you would
need to carry the symbolic term 2*pi*i*n around. This multivalued
approach has always been very confusing to me. But it is a valid
approach (see http://en.wikipedia.org/wiki/Riemann_surface), so
let's call it approach (A).

The other approach, let's call it approach (B), is that languages like
Fortran, C, Python, and CAS like Mathematica, SymPy, Sage all pick a
branch cut, and all of them (as far as I know) pick it along the
negative real axis. For this example I think it doesn't matter where
you choose the branch cut, as the conjugate of log(-1) simply flips
the sign of it, so it won't be equal to log(-1) anymore. In this
approach you need to carry the corrections for branch cuts.

Some examples of identities valid in each approach:

(A) conjugate(log(z)) = log(conjugate(z))
(B) conjugate(log(z)) = log(conjugate(z)) -2*pi*i*floor((arg(z)+pi)/(2*pi))

or

(A) log(a*b) = log(a) + log(b)
(B) log(a*b) = log(a) + log(b) + 2*pi*i*floor((pi-arg(a)-arg(b))/(2*pi))

And so on. I have written a Python script to check many of these
identities in (B), available here:

http://www.theoretical-physics.net/dev/math/complex.html#testing-identities-using-computer-code

and it works like a charm, i.e. those identities are valid for any
complex numbers. On that page, I have also derived those identities
step by step, i.e. you simply define arg(z) = atan2(Im z, Re z) and go
from there, the floor() function comes from properties of the atan2()
function.
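[Editor's note: the conjugate identity (B) above can also be checked
directly in a few lines of Python, including at a point on the branch
cut where the naive identity (A) fails numerically.]

```python
from cmath import log, isclose
from math import floor, pi

def arg(z):
    return log(z).imag  # principal argument, in (-pi, pi]

# -1 lies on the branch cut; the other points are generic.
for z in (-1, 1 + 1j, -2 - 3j):
    lhs = log(z).conjugate()
    # (B): conjugate(log(z)) = log(conjugate(z)) - 2*pi*i*floor((arg(z)+pi)/(2*pi))
    rhs = log(z.conjugate()) - 2*pi*1j*floor((arg(z) + pi) / (2*pi))
    assert isclose(lhs, rhs)
```

Note the floor correction is only nonzero at z = -1, exactly where
conjugate(log(z)) and log(conjugate(z)) disagree.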

I don't think you can mix and match (A) with (B). You have to make a
decision and be consistent everywhere.

Bill, why don't you check whether FriCAS is using approach (A) or (B)?
This is very simple to do: just check the left-hand side and right-hand
side of any of these identities. Since, as you said, FriCAS doesn't
support conjugate(), use the log(a*b) case. Here is how
you can check this in Python:

>>> from cmath import log
>>> a = -1
>>> b = -1
>>> log(a*b)
0j
>>> log(a)+log(b)
6.283185307179586j

So you can see that the left hand side log(a*b) does not equal the
right hand side log(a)+log(b), so Python is using the approach (B).
You can see the script above, where I check this, but for clarity,
let's just verify that the (B) formula works in Python for this
particular case:

>>> def arg(x): return log(x).imag
...
>>> from math import floor, pi
>>> I = 1j
>>> log(a)+log(b)+2*pi*I*floor((pi-arg(a)-arg(b))/(2*pi))
0j



I would assume that FriCAS is also using approach (B), and thus
conjugate(log(z)) is not equal to log(conjugate(z)), but let's wait
and see what you find.

Ondrej

Bill Page

Nov 24, 2014, 3:57:30 PM
to sage-devel
On 22 November 2014 at 12:34, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Sat, Nov 22, 2014 at 7:23 AM, Bill Page <bill...@newsynthesis.org> wrote:
>> ...
>> FriCAS currently does not implement a symbolic 'conjugate' operator.
>> The issue concerns whether adding 'conjugate' is a good idea and only
>> secondly how to differentiate it.
>
> Ah, I had no idea that FriCAS does not implement conjugate(x).
> How do you handle complex numbers then?

Sorry, I gave you the wrong impression. I specifically referred to the
lack of a symbolic 'conjugate' operator. By that I meant that the
'Expression' functor does not export a 'conjugate' operator. My patch
adds such an operator to Expression. But FriCAS has many domains of
computation besides those constructed by 'Expression' and some of them
do include 'conjugate'. For example the 'Complex' functor includes
'conjugate' so we can write:

(1) -> x:Complex Expression Integer := a + %i*b

(1) a + b %i
Type: Complex(Expression(Integer))
(2) -> conjugate(x)

(2) a - b %i
Type: Complex(Expression(Integer))

This effectively and implicitly treats symbols as real valued.

(3) -> D(x,a)

(3) 1
Type: Complex(Expression(Integer))
(4) -> D(x,b)

(4) %i
Type: Complex(Expression(Integer))
(5) -> y:=log(x)

2 2
log(b + a ) b
(5) ------------ + 2atan(--------------)%i
2 +-------+
| 2 2
\|b + a + a
Type: Complex(Expression(Integer))
(6) -> conjugate(y)

2 2
log(b + a ) b
(6) ------------ - 2atan(--------------)%i
2 +-------+
| 2 2
\|b + a + a
Type: Complex(Expression(Integer))

But not this:

(7) -> D(y,x)
...
Cannot find a definition or applicable library operation named D
with argument type(s)
Complex(Expression(Integer))
Complex(Expression(Integer))

So there is no "complex derivative" as such.

We can also define things this way:

(8) -> z:Expression Complex Integer := a + %i*b

(8) %i b + a
Type: Expression(Complex(Integer))
(9) -> D(z,a)

(9) 1
Type: Expression(Complex(Integer))
(10) -> D(z,b)

(10) %i
Type: Expression(Complex(Integer))
(11) -> w:=log(z)

(11) log(%i b + a)
Type: Expression(Complex(Integer))

But now we get:

(12) -> conjugate(z)
...
Cannot find a definition or applicable library operation named
conjugate with argument type(s)
Expression(Complex(Integer))

Perhaps you should use "@" to indicate the required return type,
or "$" to specify which version of the function you need.

(13) -> D(w,z)
...
Cannot find a definition or applicable library operation named D
with argument type(s)
Expression(Complex(Integer))
Expression(Complex(Integer))

The FriCAS 'Expression' functor extends multivariate rational
functions over a specified domain with a large number of
transcendental kernels (symbolic functions) as well as differentiation
and integration operators. No explicit assumption is made about the
domain of the variables. My proposed patch to FriCAS adds 'conjugate'
as another kernel function and provides a 'conjugate' operator.

> In SymPy and Sage, conjugate(x) is in it, so then adding a derivative
> of abs(x) does not make things worse.
>

In FriCAS, 'abs' is already a kernel function, and the derivative of
'abs' was implemented even before my proposed patch, but I think the
current definition is wrong:

(14) -> D(abs(x),x)

abs(x)
(14) ------
x
Type: Expression(Integer)


>>
>> In FriCAS with my patch functions defined by
>>
>> f := operator 'f
>>
>> are currently assume to be holomorphic and log is holomorphic by definition so
>>
>> conjugate(log(x)) = log(conjugate(x))
>>
>> Perhaps you are considering the wrong branch.
>> ...
>> Complex 'log' is a multi-valued like 'sqrt' so you need to consider
>> more than one branch.
>
> Well, you are right that in theory you define log(z) as
> log(z)=log|z|+i*arg(z), and you define arg(z) as multivalued,
> i.e. you can add 2*pi*n to it, then you can add 2*pi*i*n to log(z).
> Since [6] and [7] differs by 2*pi*i, they are indeed the same number.

I would not say that they are the same "number". Also I don't want to
define log(z) the way you suggest. Rather, I think the correct
definition of 'log(z)' is the solution of

z = exp(?)

So we can write z=exp(log(z)) by definition. This is exactly analogous
to the treatment of 'sqrt(z)' as the solution to

z = ? * ?

> However, this definition quickly becomes impractical, because you
> need to be able to numerically evaluate symbolic expressions, and
> you would need to carry the symbolic term 2*pi*i*n around.

We do not need an extra term. We only need axioms for the correct
behavior of the expression 'log(z)'. But 'log(z)' does not denote a
function in the sense of a many-to-one mapping. The inverse of a
function is only a function (possibly partial) if the function is
injective (one-to-one).

> This multivalued approach has always been very confusing to me. But
> it is a valid approach (i.e. see http://en.wikipedia.org/wiki/Riemann_surface),
> so let's call this is an approach (A).

The Riemann surface is an important tool in complex analysis but I
have yet to see it used explicitly in any computer algebra system as a
representation of complex functions.

>
> The other approach, let's call it approach (B), is that languages like
> Fortran, C, Python, and CAS like Mathematica, SymPy, Sage all pick
> a branch cut, and all of them (as far as I know) pick it along the
> negative real axis. For this example I think it doesn't matter where
> you choose the branch cut, as the conjugate of log(-1) simply flips
> the sign of it, so it won't be equal to log(-1) anymore. In this
> approach you need to carry the corrections for branch cuts.
>

In order to numerically evaluate a symbolic expression it is indeed
necessary to choose a branch in the case of "multi-valued functions",
i.e. expressions like 'sqrt(x)' and 'log(x)'. But this choice should
not affect the axiomatic properties of these expressions.

> Some examples of identities valid in each approach:
>
> (A) conjugate(log(z)) = log(conjugate(z))
> (B) conjugate(log(z)) = log(conjugate(z)) -2*pi*i*floor((arg(z)+pi)/(2*pi))
>
> or
>
> (A) log(a*b) = log(a) + log(b)
> (B) log(a*b) = log(a) + log(b) + 2*pi*i*floor((pi-arg(a)-arg(b))/(2*pi))
>
> Bill, why don't you check if FriCAS is using approach (A) or (B)?

In FriCAS we have

(2) -> normalize(log(a*b)-log(a)-log(b))

(2) 0
Type: Expression(Integer)

and with my proposed patch we also have 'conjugate(log(z)) =
log(conjugate(z))' by definition. So this is like your (A).

>
> I would assume that FriCAS is also using the approach (B), and thus
> conjugate(log(z)) is not equal to log(conjugate(z)), but let's wait
> until what you find.
>

For numeric domains which support 'conjugate' FriCAS behaves in the
same way (B) as Python and the other languages you mentioned. For
example:

(3) -> z:Complex Float := 1 - %i

(3) 1.0 - %i
Type: Complex(Float)
(4) -> test( conjugate(log(z)) = log(conjugate(z)) -
2*%pi*%i*floor((argument(z)+%pi)/(2*%pi)) )

(4) true
Type: Boolean
(5) -> a:Complex Float := -1

(5) - 1.0
Type: Complex(Float)
(6) -> b:Complex Float := -2

(6) - 2.0
Type: Complex(Float)
(7) -> test( log(a*b) = log(a)+log(b) +
2*%pi*%i*floor((%pi-argument(a)-argument(b))/(2*%pi)) )

(7) true
Type: Boolean

Now I am a little unclear on what you are proposing. Are you
suggesting that symbolic computations (such as those using
'Expression' in FriCAS) should somehow introduce an independent
'argument' function instead of defining it as

argument(x) = log(x/abs(x))/%i

or that 'abs' is somehow an important part of these equalities? Note
that 'normalize' does not have to introduce these sorts of terms in
order to return 0 in result (2) above.

Bill.

Ondřej Čertík

Nov 24, 2014, 5:43:57 PM
to sage-...@googlegroups.com
Ok, thanks for the clarification.

>
>> In SymPy and Sage, conjugate(x) is in it, so then adding a derivative
>> of abs(x) does not make things worse.
>>
>
> In FriCAS 'abs' is already a kernel function and it implemented the
> derivative of 'abs' even before my proposed patch but I think the
> current definition is wrong:
>
> (14) -> D(abs(x),x)
>
> abs(x)
> (14) ------
> x
> Type: Expression(Integer)

I think that's correct for real numbers, i.e. x/abs(x) = abs(x) / x.

>
>
>>>
>>> In FriCAS with my patch functions defined by
>>>
>>> f := operator 'f
>>>
>>> are currently assume to be holomorphic and log is holomorphic by definition so
>>>
>>> conjugate(log(x)) = log(conjugate(x))
>>>
>>> Perhaps you are considering the wrong branch.
>>> ...
>>> Complex 'log' is a multi-valued like 'sqrt' so you need to consider
>>> more than one branch.
>>
>> Well, you are right that in theory you define log(z) as
>> log(z)=log|z|+i*arg(z), and you define arg(z) as multivalued,
>> i.e. you can add 2*pi*n to it, then you can add 2*pi*i*n to log(z).
>> Since [6] and [7] differs by 2*pi*i, they are indeed the same number.
>
> I would not say that they are the same "number". Also I don't want to
> define log(z) the way to suggest. Rather, I think the correct
> definition of 'log(z)' is the solution of
>
> z = exp(?)
>
> So we can write z=exp(log(z)) by definition.

Indeed, exp(log(z))=z always, see the formula (2) here:

http://www.theoretical-physics.net/dev/math/complex.html#logarithm

Just to clarify, I suggested the following definitions for (A) and (B):

(A) log(z) = log|z| + i*arg(z) + 2*pi*i*n

(B) log(z) = log|z| + i*arg(z)

In both, -pi < arg(z) <= pi, and "n" is an integer. Note that in (A)
I've seen people fold the 2*pi*n term into arg(z), so that arg(z)
itself is multivalued, and then you get just (A) log(z) = log|z| +
i*arg(z). In either case, in (A) there is an explicit or implicit
dependence on "n", which denotes all the multiple values of log(z). In
this email I assume that arg(z) is single valued and I write the "n"
dependence in all equations explicitly.

In both (A) and (B),
it is true that z = exp(log(z)). However, these are different:

(A) log(exp(z)) = z + 2*pi*i*n
(B) log(exp(z)) = z + 2*pi*i*floor((pi - Im z)/(2*pi))

In (B), you get a single value, but in (A) you get multiple values,
one for each "n".

I am very familiar with the approach (B) and I think I understand
exactly what follows from what and how to derive all the formulas. But
I am not 100% sure with (A), I was hoping you can help, since that's
the approach that you want to use in FriCAS. I think what I wrote is
correct for (A), but please correct me if I am wrong.
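[Editor's note: the single-valued (B) identity for log(exp(z)) is easy
to confirm numerically in Python, which uses the principal branch.]

```python
from cmath import exp, log, isclose
from math import floor, pi

# Check (B): log(exp(z)) = z + 2*pi*i*floor((pi - Im z)/(2*pi))
# at points whose imaginary part lies outside (-pi, pi].
for z in (1 + 10j, -2 - 7j, 3 + 0j):
    lhs = log(exp(z))
    rhs = z + 2*pi*1j*floor((pi - z.imag) / (2*pi))
    assert isclose(lhs, rhs)
```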

> This is exactly analogous
> to the treatment of 'sqrt(z)' as the solution to
>
> z = ? * ?
>
>> However, this definition quickly becomes impractical, because you
>> need to be able to numerically evaluate symbolic expressions, and
>> you would need to carry the symbolic term 2*pi*i*n around.
>
> We do not need an extra term. We only need axioms for the correct
> behavior of the expression 'log(z)'. But 'log(z)' does not denote a
> function in the sense of a many-to-one mapping. The inverse of a
> function is only a function (possibly partial) if the function is
> injective (one-to-one).

Sure, that's why you have "n" in the formula for log(z) in (A), and
the function is multivalued over all "n".

>
>> This multivalued approach has always been very confusing to me. But
>> it is a valid approach (i.e. see http://en.wikipedia.org/wiki/Riemann_surface),
>> so let's call this is an approach (A).
>
> The Riemann surface is an important tool in complex analysis but I
> have yet to see it used explicitly in any computer algebra system as a
> representation of complex functions.

I thought the Riemann surface gives you a way to couple "n" in
log(a*b), log(a) and log(b) in such a way that you get exactly
log(a*b)-log(a)-log(b)=0. I.e. that the Riemann surface is (A). I
agree that I haven't seen it used in a CAS. But I think it must be
implicitly used in FriCAS somehow. Maybe you can clarify that.

>
>>
>> The other approach, let's call it approach (B), is that languages like
>> Fortran, C, Python, and CAS like Mathematica, SymPy, Sage all pick
>> a branch cut, and all of them (as far as I know) pick it along the
>> negative real axis. For this example I think it doesn't matter where
>> you choose the branch cut, as the conjugate of log(-1) simply flips
>> the sign of it, so it won't be equal to log(-1) anymore. In this
>> approach you need to carry the corrections for branch cuts.
>>
>
> In order to numerically evaluate a symbolic expression it is indeed
> necessary to choose a branch in the case of "multi-valued functions",
> i.e. expressions like 'sqrt(x)' and 'log(x)'. But this choice should
> not effect the axiomatic properties of these expressions.

I see. I think in the definition:

(A) log(z) = log|z| + i*arg(z) + 2*pi*i*n

the result is multivalued, and you obtain all the numerical values by
plugging different integers for "n".

>
>> Some examples of identities valid in each approach:
>>
>> (A) conjugate(log(z)) = log(conjugate(z))
>> (B) conjugate(log(z)) = log(conjugate(z)) -2*pi*i*floor((arg(z)+pi)/(2*pi))
>>
>> or
>>
>> (A) log(a*b) = log(a) + log(b)
>> (B) log(a*b) = log(a) + log(b) + 2*pi*i*floor((pi-arg(a)-arg(b))/(2*pi))
>>
>> Bill, why don't you check if FriCAS is using approach (A) or (B)?
>
> In FrCAS we have
>
> (2) -> normalize(log(a*b)-log(a)-log(b))
>
> (2) 0
> Type: Expression(Integer)
>
> and with my proposed patch we also have 'conjugate(log(z)) =
> log(conjugate(z))' by definition. So this is like your (A).

What does Expression(Integer) mean? Does it mean that "a" and "b" are
integers? I.e. does the above hold if a=b=-1?

This is precisely the part that I don't understand with the approach
(A). log(a*b), log(a) and log(b) are all multivalued, so you would
naively think, that log(a*b)-log(a)-log(b) = 0 + 2*pi*i*n, for all
"n". But I think this is not the case, I think the "n" in log(a*b) is
coupled to the implicit "n" in log(a) and log(b) in such a way, that
the result is exactly 0. Can you clarify exactly how this works?

Using the approach (B), with a=b=-1, we get:

log(a*b) = 0
log(a) = log(b) = i*pi

and finally, in the equation log(a*b) = log(a) + log(b) +
2*pi*i*floor((pi-arg(a)-arg(b))/(2*pi)), the floor() is "-1", so we
get -2*pi*i as the correction and the result is 0. So it all works and
everything is single valued.
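
The corrected identity from approach (B) can be verified numerically for the a = b = -1 case above and for other points (a sketch in plain Python, using the principal branch that `cmath` implements; the extra sample points are arbitrary):

```python
import cmath
import math

def log_product_correction(a, b):
    """Correction term from approach (B):
    log(a*b) = log(a) + log(b) + 2*pi*i*floor((pi - arg(a) - arg(b)) / (2*pi))."""
    n = math.floor((math.pi - cmath.phase(a) - cmath.phase(b)) / (2 * math.pi))
    return 2j * math.pi * n

for a, b in [(-1, -1), (1j, 1j), (2 + 3j, -1 - 4j)]:
    lhs = cmath.log(a * b)
    rhs = cmath.log(a) + cmath.log(b) + log_product_correction(a, b)
    assert abs(lhs - rhs) < 1e-12
```

For a = b = -1 the floor term is -1, so the correction is -2*pi*i, exactly as in the worked example above.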
I think that the following equations hold, at least in (B), so I think
you can use any of them to define arg(x):

arg(x) = log(x/abs(x)) / i
arg(x) = Im(log(x))
arg(x) = atan2(Im x, Re x)
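
With the principal branch of (B), the three definitions agree, and this can be checked numerically (a sketch in plain Python; `cmath.phase` plays the role of atan2(Im x, Re x)):

```python
import cmath
import math

# With the principal branch (B), all three definitions of arg coincide.
for x in [3 + 4j, -2 + 1j, -1 - 1j, 5j]:
    a1 = (cmath.log(x / abs(x)) / 1j).real  # arg(x) = log(x/abs(x)) / i
    a2 = cmath.log(x).imag                  # arg(x) = Im(log(x))
    a3 = math.atan2(x.imag, x.real)         # arg(x) = atan2(Im x, Re x)
    assert abs(a1 - a2) < 1e-12
    assert abs(a2 - a3) < 1e-12
```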

In approach (B), I think there are multiple ways to define/derive
things. I chose one such way: I start from arg(x) = atan2(Im x, Re x)
and derive everything else. But I am pretty sure you can also start
from log(z) and derive arg(z), and there are probably a couple of
other approaches. In the approach that I chose, it all becomes simple
algebra: you don't need to think at all, it's just plugging one
expression into another, everything is single valued, and you always
obtain correct expressions. So all I am saying is that in (B) you can
directly evaluate things numerically and everything is
self-consistent.

>
> or that 'abs' is somehow an important part of these equalities?

I am not sure what you mean by this question.

> Note
> that 'normalize' does not have to introduce these sorts of terms in
> order to return 0 in result (2) above.

See my questions above about this.

Ondrej

Bill Page

unread,
Nov 25, 2014, 12:23:41 AM11/25/14
to sage-devel
On 24 November 2014 at 17:43, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Mon, Nov 24, 2014 at 1:57 PM, Bill Page <bill...@newsynthesis.org> wrote:
> ...
>>
>> In FriCAS 'abs' is already a kernel function and it implemented the
>> derivative of 'abs' even before my proposed patch but I think the
>> current definition is wrong:
>>
>> (14) -> D(abs(x),x)
>>
>> abs(x)
>> (14) ------
>> x
>> Type: Expression(Integer)
>
> I think that's correct for real numbers, i.e. x/abs(x) = abs(x) / x.
>

I am not very interested in real numbers. I am interested in the
algebra. Would you say that

sqrt(x^2).diff(x) = sqrt(x^2)/x

is OK?

>> Rather, I think the correct definition of 'log(z)' is the solution of
>>
>> z = exp(?)
>>
>> So we can write z=exp(log(z)) by definition.
>
> Indeed, exp(log(z))=z always,

Not just "true always". It is the definition of 'log(z)'.

>
> In both (A) and (B),
> it is true that z = exp(log(z)). However, these are different:
>
> (A) log(exp(z)) = z + 2*pi*i*n
> (B) log(exp(z)) = z + 2*pi*i*floor((pi - Im z) / (2*pi))
>
> In (B), you get a single value, but in (A) you get multiple values,
> one for each "n".
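
The single-valued formula in (B) can be spot-checked against the principal branch that `cmath` computes (a sketch; the floor term picks the "n" that lands the imaginary part in (-pi, pi], and the sample points are arbitrary):

```python
import cmath
import math

def log_exp(z):
    # Approach (B): single-valued closed form for log(exp(z)),
    # log(exp(z)) = z + 2*pi*i*floor((pi - Im z) / (2*pi))
    n = math.floor((math.pi - z.imag) / (2 * math.pi))
    return z + 2j * math.pi * n

# Agrees with computing exp followed by the principal-branch log.
for z in [0.5 + 4j, -1 - 9j, 2 + 0.3j]:
    assert abs(log_exp(z) - cmath.log(cmath.exp(z))) < 1e-12
```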
>

Better to say (A) is true "\forall n \in Integer"

> I am very familiar with the approach (B) and I think I understand
> exactly what follows from what and how to derive all the formulas.

(B) is about choosing a particular "branch" of 'log' - the principal
value. But I don't want to be forced to make a choice of branch until
I actually need to evaluate an expression numerically.

> But I am not 100% sure with (A), I was hoping you can help, since
> that's the approach that you want to use in FriCAS.

(A) is not quite the approach I want to use for 'Expression' in
FriCAS. I want 'log(exp(z))' to be the solution of

exp(z) = exp(?)

The best way to say that algebraically (symbolically) is just
'log(exp(z))', i.e. without evaluation. This is what FriCAS already
does in the case of

(1) -> z:Expression Complex Integer
Type: Void
(2) -> exp(log(z))

(2) z
Type: Expression(Complex(Integer))
(3) -> log(exp(z))

z
(3) log(%e )
Type: Expression(Complex(Integer))

but unfortunately not in the case of 'Expression Integer'.

(4) -> z:Expression Integer
Type: Void
(5) -> log(exp(z))

(5) z
Type: Expression(Integer)

I think that what it currently does for 'Expression Integer' should be
considered a bug.

> I think what I wrote is correct for (A), but please correct me if I am wrong.
>
>> This is exactly analogous
>> to the treatment of 'sqrt(z)' as the solution to
>>
>> z = ? * ?
>>

We have

sqrt(z)^2 = z

but

sqrt(z^2) = sqrt(z^2)

>>> However, this definition quickly becomes impractical, because you
>>> need to be able to numerically evaluate symbolic expressions, and
>>> you would need to carry the symbolic term 2*pi*i*n around.
>>
>> We do not need an extra term. We only need axioms for the correct
>> behavior of the expression 'log(z)'. But 'log(z)' does not denote a
>> function in the sense of a many-to-one mapping. The inverse of a
>> function is only a function (possibly partial) if the function is
>> injective (one-to-one).
>
> Sure, that's why you have "n" in the formula for log(z) in (A), and
> the function is multivalued over all "n".
>

No. As you have written it, it is a function of two variables with
one value for each z and n.

f(z,n) = z + 2*pi*i*n

I think what you are trying to say is

(A) log(exp(z)) = { z + 2*pi*i*n | for all n in Integer}

and

sqrt(z) = { x | x^2 = z }

Although it may seem simple in this case, in general implementing sets
with comprehension like this requires logic and takes us outside of
algebra as such into the realm of theorem proving.

>>
>> The Riemann surface is an important tool in complex analysis but I
>> have yet to see it used explicitly in any computer algebra system as
>> a representation of complex functions.
>
> I thought the Riemann surface gives you a way to couple "n" in
> log(a*b), log(a) and log(b) in such a way that you get exactly
> log(a*b)-log(a)-log(b)=0. I.e. that the Riemann surface is (A).
> I agree that I haven't seen it used in CAS. But I think it must be
> implicitly used in FriCAS somehow. Maybe you can clarify that.

I don't think it is used in FriCAS at all. A Riemann surface is a
manifold. At each point in the manifold we need to evaluate a
function, but how would you propose to represent this surface
algebraically? For example, the surface of a sphere is a manifold.
For that we need a co-ordinate system that labels each point on the
surface. In a sense the important thing about a Riemann surface is
its global topology and this is different for each complex holomorphic
function. For 'exp' it might make sense to include 'n' as one of
these co-ordinates. But this is not what FriCAS does.

>> ...
>> In order to numerically evaluate a symbolic expression it is indeed
>> necessary to choose a branch in the case of "multi-valued functions",
>> i.e. expressions like 'sqrt(x)' and 'log(x)'. But this choice should
>> not affect the axiomatic properties of these expressions.
>
> I see. I think in the definition:
>
> (A) log(z) = log|z| + i*arg(z) + 2*pi*i*n
>
> the result is multivalued, and you obtain all the numerical values by
> plugging different integers for "n".
>

Sure, but "plugging different integers for n" makes n a parameter.

>>
>> In FriCAS we have
>>
>> (2) -> normalize(log(a*b)-log(a)-log(b))
>>
>> (2) 0
>> Type: Expression(Integer)
>>
>> and with my proposed patch we also have 'conjugate(log(z)) =
>> log(conjugate(z))' by definition. So this is like your (A).
>
> What does Expression(Integer) mean? Does it mean that "a" and "b"
>are integers? I.e. does the above hold if a=b=-1?

No. Integer is the domain of the coefficients of the rational
function. In other words Expression Integer is a ratio of two
polynomials with coefficients from Integer and variables from Symbol
or kernels over Expression Integer (recursively).

>
> This is precisely the part that I don't understand with the approach
> (A). log(a*b), log(a) and log(b) are all multivalued, so you would
> naively think, that log(a*b)-log(a)-log(b) = 0 + 2*pi*i*n, for all
> "n". But I think this is not the case, I think the "n" in log(a*b) is
> coupled to the implicit "n" in log(a) and log(b) in such a way, that
> the result is exactly 0. Can you clarify exactly how this works?

Try it this way:

a*b = exp(?1)
a = exp(?2)
b = exp(?3)

I think 'normalize' is saying that there is a solution that makes

?1 - ?2 - ?3 = 0.

Maybe there is a way to think of this as a kind of coupling of n's if
that was the way that log was represented but I am quite sure that
this is not what 'normalize' is doing.

>>
>> Now I am a little unclear on what you are proposing. Are you
>> suggesting that symbolic computations (such as those using
>> 'Expression' in FriCAS) should somehow introduce an independent
>> 'argument' function instead of defining it as
>>
>> argument(x) = log(x/abs(x))/%i
> ...
>>
>> or that 'abs' is somehow an important part of these equalities?
>
> I am not sure what you mean by this question.
>

I meant that I did not understand what you are proposing for how to
represent the value of 'log(z)' symbolically, i.e. when the value of z
is unknown.

>> Note
>> that 'normalize' does not have to introduce these sorts of terms
>> in order to return 0 in result (2) above.
>
> See my questions above about this.
>

'normalize' re-writes Expression using the minimum number of
algebraically independent transcendental kernels. In the case above
that is just 0.

--

Let's take a step back: Is this discussion going any place useful for
you? We seem to be talking mostly about FriCAS. Does this discussion
really belong here on the sage-devel mailing list?

Bill.

Ondřej Čertík

unread,
Nov 25, 2014, 1:11:49 AM11/25/14
to sage-...@googlegroups.com
On Mon, Nov 24, 2014 at 10:23 PM, Bill Page <bill...@newsynthesis.org> wrote:
> On 24 November 2014 at 17:43, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> On Mon, Nov 24, 2014 at 1:57 PM, Bill Page <bill...@newsynthesis.org> wrote:
>> ...
>>>
>>> In FriCAS 'abs' is already a kernel function and it implemented the
>>> derivative of 'abs' even before my proposed patch but I think the
>>> current definition is wrong:
>>>
>>> (14) -> D(abs(x),x)
>>>
>>> abs(x)
>>> (14) ------
>>> x
>>> Type: Expression(Integer)
>>
>> I think that's correct for real numbers, i.e. x/abs(x) = abs(x) / x.
>>
>
> I am not very interested in real numbers. I am interested in the
> algebra. Would you say that
>
> sqrt(x^2).diff(x) = sqrt(x^2)/x
>
> is OK?

I think so, using the following calculation:

sqrt(x^2).diff(x) = exp(1/2*log(x^2)).diff(x)
                  = exp(1/2*log(x^2)) * 1/2 * 1/x^2 * 2*x
                  = sqrt(x^2)/x

The function exp(1/2*log(x^2)) that we differentiate is analytic, so I
don't see any issue here.
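
Numerically this checks out away from the branch cut of sqrt (a sketch in plain Python, comparing a central difference of sqrt(x^2) with x/sqrt(x^2); the sample points are arbitrary and chosen off the cut):

```python
import cmath

def numdiff(g, x, h=1e-6):
    # central difference along the real direction
    return (g(x + h) - g(x - h)) / (2 * h)

f = lambda x: cmath.sqrt(x**2)

# The cut is hit when x**2 is negative real, i.e. x purely imaginary;
# away from it, d/dx sqrt(x^2) agrees with x/sqrt(x^2).
for x in [1 + 2j, -3 + 0.5j, 2.0 + 0j]:
    assert abs(numdiff(f, x) - x / cmath.sqrt(x**2)) < 1e-6
```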

>
>>> Rather, I think the correct definition of 'log(z)' is the solution of
>>>
>>> z = exp(?)
>>>
>>> So we can write z=exp(log(z)) by definition.
>>
>> Indeed, exp(log(z))=z always,
>
> Not just "true always". It is the definition of 'log(z)'.
>
>>
>> In both (A) and (B),
>> it is true that z = exp(log(z)). However, these are different:
>>
>> (A) log(exp(z)) = z + 2*pi*i*n
>> (B) log(exp(z)) = z + 2*pi*i*floor((pi - Im z) / (2*pi))
>>
>> In (B), you get a single value, but in (A) you get multiple values,
>> one for each "n".
>>
>
> Better to say (A) is true "\forall n \in Integer"

Yes.

>
>> I am very familiar with the approach (B) and I think I understand
>> exactly what follows from what and how to derive all the formulas.
>
> (B) is about choosing a particular "branch" of 'log' - the principal
> value.

Correct.

> But I don't want to be forced to make a choice of branch until
> I actually need to evaluate an expression numerically.

I understand that's what you want. I am just trying to understand how
exactly this works.
>
> ...
> I think what you are trying to say is
>
> (A) log(exp(z)) = { z + 2*pi*i*n | for all n in Integer}

Exactly, that's what I meant.

>
> and
>
> sqrt(z) = { x | x^2 = z }

Sure, though I would just define

sqrt(z) = exp(1/2 * log(z))

>
> Although it may seem simple in this case, in general implementing sets
> with comprehension like this requires logic and takes us outside of
> algebra as such into the realm of theorem proving.

Sure. But that's what you want, correct?

>
>>>
>>> The Riemann surface is an important tool in complex analysis but I
>>> have yet to see it used explicitly in any computer algebra system as
>>> a representation of complex functions.
>>
>> I thought the Riemann surface gives you a way to couple "n" in
>> log(a*b), log(a) and log(b) in such a way that you get exactly
>> log(a*b)-log(a)-log(b)=0. I.e. that the Riemann surface is (A).
>> I agree that I haven't seen it used in CAS. But I think it must be
>> implicitly used in FriCAS somehow. Maybe you can clarify that.
>
> I don't think it is used in FriCAS at all. A Riemann surface is a
> manifold. At each point in the manifold we need to evaluate a
> function, but how would you propose to represent this surface
> algebraically? For example, the surface of a sphere is a manifold.
> For that we need a co-ordinate system that labels each point on the
> surface. In a sense the important thing about a Riemann surface is
> its global topology and this is different for each complex holomorphic
> function. For 'exp' it might make sense to include 'n' as one of
> these co-ordinates. But this is not what FriCAS does.

Ok. I don't know what FriCAS does.

>
>>> ...
>>> In order to numerically evaluate a symbolic expression it is indeed
>>> necessary to choose a branch in the case of "multi-valued functions",
>>> i.e. expressions like 'sqrt(x)' and 'log(x)'. But this choice should
> >> not affect the axiomatic properties of these expressions.
>>
>> I see. I think in the definition:
>>
>> (A) log(z) = log|z| + i*arg(z) + 2*pi*i*n
>>
>> the result is multivalued, and you obtain all the numerical values by
>> plugging different integers for "n".
>>
>
> Sure but "plugging different integers for n" makes n a parameter.

We clarified this above; what I meant is the set of all values as n
ranges over the integers.

>
>>>
>>> In FriCAS we have
>>>
>>> (2) -> normalize(log(a*b)-log(a)-log(b))
>>>
>>> (2) 0
>>> Type: Expression(Integer)
>>>
>>> and with my proposed patch we also have 'conjugate(log(z)) =
>>> log(conjugate(z))' by definition. So this is like your (A).
>>
>> What does Expression(Integer) mean? Does it mean that "a" and "b"
>>are integers? I.e. does the above hold if a=b=-1?
>
> No. Integer is the domain of the coefficients of the rational
> function. In other words Expression Integer is a ratio of two
> polynomials with coefficients from Integer and variables from Symbol
> or kernels over Expression Integer (recursively).

Ok.

>
>>
>> This is precisely the part that I don't understand with the approach
>> (A). log(a*b), log(a) and log(b) are all multivalued, so you would
>> naively think, that log(a*b)-log(a)-log(b) = 0 + 2*pi*i*n, for all
>> "n". But I think this is not the case, I think the "n" in log(a*b) is
>> coupled to the implicit "n" in log(a) and log(b) in such a way, that
>> the result is exactly 0. Can you clarify exactly how this works?
>
> Try it this way:
>
> a*b = exp(?1)
> a = exp(?2)
> b = exp(?3)
>
> I think 'normalize' is saying that there is a solution that makes
>
> ?1 - ?2 - ?3 = 0.

Ok, but why wouldn't normalize return 2*pi*i instead? Or 4*pi*i?
Surely there is a solution so that

?1 - ?2 - ?3 = 2*pi*i

and a different solution so that

?1 - ?2 - ?3 = 4*pi*i

In other words, how exactly are the operations on the multivalued sets
log(x) defined? Things like log(a*b) - log(a) in this case. In the
case (B), it is defined exactly by the branch cut, so there is no
ambiguity.

>
> Maybe there is a way to think of this as a kind of coupling of n's if
> that was the way that log was represented but I am quite sure that
> this is not what 'normalize' is doing.
>
>>>
>>> Now I am a little unclear on what you are proposing. Are you
>>> suggesting that symbolic computations (such as those using
>>> 'Expression' in FriCAS) should somehow introduce an independent
>>> 'argument' function instead of defining it as
>>>
>>> argument(x) = log(x/abs(x))/%i
>> ...
>>>
>>> or that 'abs' is somehow an important part of these equalities?
>>
>> I am not sure what you mean by this question.
>>
>
> I meant that I did not understand what you are proposing for how to
> represent the value of 'log(z)' symbolically, i.e. when the value of z
> is unknown.

Ah ok. I would represent it by the approach (B). But then, as we
talked about, it's not true that
conjugate(log(z)) = log(conjugate(z)). Since you want this property to
hold, then the approach (B)
does not work for you, obviously. So I am trying to understand how
exactly are all the operations
defined in your approach. You said your approach is not (A) exactly.
So I am just trying to understand.

>
>>> Note
>>> that 'normalize' does not have to introduce these sorts of terms
>>> in order to return 0 in result (2) above.
>>
>> See my questions above about this.
>>
>
> 'normalize' re-writes Expression using the minimum number of
> algebraically independent transcendental kernels. In the case above
> that is just 0.
>
> --
>
> Let's take a step back: Is this discussion going any place useful for
> you? We seem to be talking mostly about FriCAS. Does this discussion
> really belong here on the sage-devel mailing list?

This discussion is about how a CAS should handle (complex)
differentiation. Since it started here, I would finish it here, so
that the whole thread is in one mailing list for future reference.

Ondrej

Bill Page

unread,
Nov 25, 2014, 1:30:36 PM11/25/14
to sage-devel
On 25 November 2014 at 01:11, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Mon, Nov 24, 2014 at 10:23 PM, Bill Page <bill...@newsynthesis.org> wrote:
>> ...
>> I am not very interested in real numbers. I am interested in the
>> algebra. Would you say that
>>
>> sqrt(x^2).diff(x) = sqrt(x^2)/x
>>
>> is OK?
>
> I think so, using the following calculation:
>
> sqrt(x^2).diff(x) = exp(1/2*log(x^2)).diff(x) = exp(1/2*log(x^2)) *
> 1/2 * 1/x^2 * 2*x = sqrt(x^2)/x
>
> The function exp(1/2*log(x^2)) that we differentiate is analytic, so I
> don't see any issue here.
>

I did not ask whether it was technically correct or not. What I meant
was: is this expression what you would expect, given the rest of the
machinery of differentiation in any given computer algebra system?

>
>> But I don't want to be forced to make a choice of branch until
>> I actually need to evaluate an expression numerically.
>
> I understand that's what you want. I am just trying to understand how
> exactly this works.
>

OK.

>> ...
>> I think what you are trying to say is
>>
>> (A) log(exp(z)) = { z + 2*pi*i*n | for all n in Integer}
>
> Exactly, that's what I meant.
> ...
>>
>> Although it may seem simple in this case, in general implementing
>> sets with comprehension like this requires logic and takes us
>> outside of algebra as such into the realm of theorem proving.
>
> Sure. But that's what you want, correct?
>

No, not at all. I want this to be "algebraic", not some theorem of
predicate calculus. That is what I meant by taking

x + conjugate(x)

as the definition of a real valued variable.

> ...
>>
>>>
>>> This is precisely the part that I don't understand with the approach
>>> (A). log(a*b), log(a) and log(b) are all multivalued, so you would
>>> naively think, that log(a*b)-log(a)-log(b) = 0 + 2*pi*i*n, for all
>>> "n". But I think this is not the case, I think the "n" in log(a*b) is
>>> coupled to the implicit "n" in log(a) and log(b) in such a way, that
>>> the result is exactly 0. Can you clarify exactly how this works?
>>
>> Try it this way:
>>
>> a*b = exp(?1)
>> a = exp(?2)
>> b = exp(?3)
>>
>> I think 'normalize' is saying that there is a solution that makes
>>
>> ?1 - ?2 - ?3 = 0.
>
> Ok, but why wouldn't normalize return 2*pi*i instead? Or 4*pi*i?

These are equivalent in the sense of having the same number of
algebraically independent transcendental kernels, i.e. none.

>
> In other words, how exactly are the operations on the multivalued
> sets log(x) defined?

FriCAS does not perform operations on multivalued sets to determine the above.

>>
>> I meant that I did not understand what you are proposing for how to
>> represent the value of 'log(z)' symbolically, i.e. when the value of z
>> is unknown.
>
> Ah ok. I would represent it by the approach (B). But then, as we
> talked about, it's not true that conjugate(log(z)) = log(conjugate(z)).
> Since you want this property to hold, then the approach (B) does
> not work for you, obviously. So I am trying to understand how
> exactly all the operations are defined in your approach. You said
> your approach is not (A) exactly. So I am just trying to understand.
>

OK.

>
> This discussion is about how a CAS should handle (complex)
> differentiation. Since it started here, I would finish it here, so
> that the whole thread is in one mailing list for future reference.
>

OK. It would be nice to know if other sage-devel subscribers actually
remain interested...

Let's return to differentiation for a moment. Using your definitions
what would you say is the correct result for

log(exp(z-conjugate(z))).diff(z)

My patched version of FriCAS based on your definition in this thread
currently returns 0. Do you get the same result?

Since the derivative is 0, would we want to say therefore that

log(exp(z-conjugate(z)))

is a constant? If not, isn't this an argument for needing another
derivative? The result of this test currently causes a problem during
manipulations of expressions of this form. Check the two Wirtinger
derivatives for this case. If we have both derivatives we can avoid
this problem quite easily as my previous version of the patch showed.

Bill.

kcrisman

unread,
Nov 25, 2014, 1:38:52 PM11/25/14
to sage-...@googlegroups.com
>
> This discussion is about how a CAS should handle (complex)
> differentiation. Since it started here, I would finish it here, so
> that the whole thread is in one mailing list for future reference.
>

OK.  It would be nice to know if other sage-devel subscribers actually
remain interested...



In the hopes that eventually something correct gets into Sage, absolutely. 

Ondřej Čertík

unread,
Nov 25, 2014, 2:51:21 PM11/25/14
to sage-...@googlegroups.com
On Tue, Nov 25, 2014 at 11:30 AM, Bill Page <bill...@newsynthesis.org> wrote:
> On 25 November 2014 at 01:11, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> On Mon, Nov 24, 2014 at 10:23 PM, Bill Page <bill...@newsynthesis.org> wrote:
>>> ...
>>> I am not very interested in real numbers. I am interested in the
>>> algebra. Would you say that
>>>
>>> sqrt(x^2).diff(x) = sqrt(x^2)/x
>>>
>>> is OK?
>>
>> I think so, using the following calculation:
>>
>> sqrt(x^2).diff(x) = exp(1/2*log(x^2)).diff(x) = exp(1/2*log(x^2)) *
>> 1/2 * 1/x^2 * 2*x = sqrt(x^2)/x
>>
>> The function exp(1/2*log(x^2)) that we differentiate is analytic, so I
>> don't see any issue here.
>>
>
> I did not ask whether it was technically correct or not. What I meant
> was is this expression what you would expect given the rest of the
> machinery of differentiation in any given computer algebra system?

Ah ok. I would actually expect to get x/sqrt(x^2), which is equivalent.

>
>>
>>> But I don't want to be forced to make a choice of branch until
>>> I actually need to evaluate an expression numerically.
>>
>> I understand that's what you want. I am just trying to understand how
>> exactly this works.
>>
>
> OK.
>
>>> ...
>>> I think what you are trying to say is
>>>
>>> (A) log(exp(z)) = { z + 2*pi*i*n | for all n in Integer}
>>
>> Exactly, that's what I meant.
>> ...
>>>
>>> Although it may seem simple in this case, in general implementing
>>> sets with comprehension like this requires logic and takes us
>>> outside of algebra as such into the realm of theorem proving.
>>
>> Sure. But that's what you want, correct?
>>
>
> No, not at all. I want this to be "algebraic", not some theorem of
> predicate calculus. That is what I meant by taking
>
> x + conjugate(x)
>
> as the definition of a real valued variable.

Ok.

>
>> ...
>>>
>>>>
>>>> This is precisely the part that I don't understand with the approach
>>>> (A). log(a*b), log(a) and log(b) are all multivalued, so you would
>>>> naively think, that log(a*b)-log(a)-log(b) = 0 + 2*pi*i*n, for all
>>>> "n". But I think this is not the case, I think the "n" in log(a*b) is
>>>> coupled to the implicit "n" in log(a) and log(b) in such a way, that
>>>> the result is exactly 0. Can you clarify exactly how this works?
>>>
>>> Try it this way:
>>>
>>> a*b = exp(?1)
>>> a = exp(?2)
>>> b = exp(?3)
>>>
>>> I think 'normalize' is saying that there is a solution that makes
>>>
>>> ?1 - ?2 - ?3 = 0.
>>
>> Ok, but why wouldn't normalize return 2*pi*i instead? Or 4*pi*i?
>
> These are equivalent in the sense of having the same number of
> algebraically independent transcendental kernels, i.e. none.

I don't understand that. Is the result of normalize() multivalued?
Or how else could 0 be equivalent to 2*pi*i or 4*pi*i?

>
>>
>> In other words, how exactly are the operations on the multivalued
>> sets log(x) defined?
>
> FriCAS does not perform operations on multivalued sets to determine the above.

Ok. Though my question stands, how are the operations defined in your approach?

>
>>>
>>> I meant that I did not understand what you are proposing for how to
>>> represent the value of 'log(z)' symbolically, i.e. when the value of z
>>> is unknown.
>>
>> Ah ok. I would represent it by the approach (B). But then, as we
>> talked about, it's not true that conjugate(log(z)) = log(conjugate(z)).
>> Since you want this property to hold, then the approach (B) does
>> not work for you, obviously. So I am trying to understand how
>> exactly all the operations are defined in your approach. You said
>> your approach is not (A) exactly. So I am just trying to understand.
>>
>
> OK.
>
>>
>> This discussion is about how a CAS should handle (complex)
>> differentiation. Since it started here, I would finish it here, so
>> that the whole thread is in one mailing list for future reference.
>>
>
> OK. It would be nice to know if other sage-devel subscribers actually
> remain interested...
>
> Let's return to differentiation for a moment. Using your definitions
> what would you say is the correct result for
>
> log(exp(z-conjugate(z))).diff(z)
>
> My patched version of FriCAS based on your definition in this thread
> currently returns 0. Do you get the same result?

No, the derivative is most definitely not zero:

log(exp(z-conjugate(z))).diff(z)
  = exp(z-conjugate(z))/exp(z-conjugate(z)) * [1 - exp(-2*i*theta)]
  = 1 - exp(-2*i*theta)

In other words, the two Wirtinger derivatives are 1 and -1. You can
easily check numerically that this formula is correct for all complex
"z" and angles theta; I've done it here:

https://github.com/certik/theoretical-physics/blob/f9406a02ef8e04b2daa669f444148186b6b892e8/src/math/code/test_complex_diff.py#L118

and it works.
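
The same check can be reproduced in a few lines of plain Python (a sketch independent of the linked script; theta is the direction of the step in the complex plane, and the base point is arbitrary):

```python
import cmath

def f(z):
    # f(z) = log(exp(z - conjugate(z))); note z - conjugate(z) = 2i*Im(z)
    return cmath.log(cmath.exp(z - z.conjugate()))

def directional_derivative(g, z, theta, h=1e-6):
    # difference quotient along the direction exp(i*theta)
    d = h * cmath.exp(1j * theta)
    return (g(z + d) - g(z)) / d

z = 0.3 + 0.2j
for theta in [0.0, 0.5, 1.0, 2.5]:
    expected = 1 - cmath.exp(-2j * theta)
    assert abs(directional_derivative(f, z, theta) - expected) < 1e-6
```

In particular the theta = 0 direction gives 0, matching the FriCAS result, while other directions do not, so the function is not constant.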

>
> Since the derivative is 0 would we want to say therefore that
>
> log(exp(z-conjugate(z)))
>
> is a constant?

If you got 0, then I think you can say that the function is constant.
We didn't get 0, so the function is not constant.

> If not, isn't this an argument for needing another
> derivative?

In some of your previous emails you wrote that this theta factor
"still looks ugly to you". Maybe it's ugly, but it's correct, as you
fell into this trap yourself: if you omit theta and implicitly assume
theta=0, then you don't know if what you got is analytic or not.

Perhaps this comment in a sympy issue might help:

https://github.com/sympy/sympy/issues/8502#issuecomment-64415017

Essentially the formula with theta is equivalent to just returning a
tuple of the two Wirtinger derivatives. So what holds for one approach
holds for the other one.
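
One way to see the equivalence: the theta-dependent derivative packages the Wirtinger pair as d_theta f = df/dz + exp(-2*i*theta) * df/dzbar. A numerical sketch for an arbitrarily chosen non-analytic function f(z) = z^2 * conjugate(z), whose Wirtinger derivatives are 2*z*conjugate(z) and z^2:

```python
import cmath

def dtheta(g, z, theta, h=1e-6):
    # central difference quotient along the direction exp(i*theta)
    d = h * cmath.exp(1j * theta)
    return (g(z + d) - g(z - d)) / (2 * d)

f = lambda z: z**2 * z.conjugate()
df_dz = lambda z: 2 * z * z.conjugate()  # Wirtinger d/dz
df_dzbar = lambda z: z**2                # Wirtinger d/d(zbar)

z = 1.1 - 0.6j
for theta in [0.0, 0.7, 2.0]:
    expected = df_dz(z) + cmath.exp(-2j * theta) * df_dzbar(z)
    assert abs(dtheta(f, z, theta) - expected) < 1e-8
```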

> The result of this test currently causes a problem during
> manipulations of expressions of this form. Check the two Wirtinger
> derivatives for this case. If we have both derivatives we can avoid
> this problem quite easily as my previous version of the patch showed.

My current best solution is to define a function `diff(x, theta=0)`,
where the theta argument is 0 by default, but you can pass any angle
into it, or a symbol theta if you want. That way you won't get the
theta factors by default, but if in doubt, you can always get them.

Let me know if you have a better proposal.

Ondrej

>
> Bill.

Erik Massop

unread,
Nov 25, 2014, 3:15:05 PM11/25/14
to sage-...@googlegroups.com, Bill Page
On Tue, 25 Nov 2014 13:30:33 -0500
Bill Page <bill...@newsynthesis.org> wrote:

> On 25 November 2014 at 01:11, Ondřej Čertík <ondrej...@gmail.com> wrote:
> > On Mon, Nov 24, 2014 at 10:23 PM, Bill Page <bill...@newsynthesis.org> wrote:
...
> >> But I don't want to be forced to make a choice of branch until
> >> I actually need to evaluate an expression numerically.
> >
> > I understand that's what you want. I am just trying to understand how
> > exactly this works.
>
> OK.

Without a choice of branch for sqrt, I cannot answer this question:
* Is there a complex number x such that x*conjugate(x) equals sqrt(2)?
This seems a non-numerical question to me. It seems to me that sqrt
without a choice of branch is ill-defined, but perhaps it is
sufficiently well-defined if you restrict to a certain kind of
question? If so, what questions can I ask? I think I know too
little about the subject of this thread and of FriCAS.

> >> ...
> >> I think what you are trying to say is
> >>
> >> (A) log(exp(z)) = { z + 2*pi*i*n | for all n in Integer}
> >
> > Exactly, that's what I meant.
> > ...
> >>
> >> Although it may seem simple in this case, in general implementing
> >> sets with comprehension like this requires logic and takes us
> >> outside of algebra as such into the realm of theorem proving.
> >
> > Sure. But that's what you want, correct?
> >
>
> No, not at all. I want this to be "algebraic", not some theorem of
> predicate calculus. That is what I meant by taking
>
> x + conjugate(x)
>
> as the definition of a real valued variable.

Do you mean that z is considered real-valued when there is an x such
that x + conjugate(x) is z? I got lost in this part of the thread.

> > ...
> >>
> >>>
> >>> This is precisely the part that I don't understand with the approach
> >>> (A). log(a*b), log(a) and log(b) are all multivalued, so you would
> >>> naively think, that log(a*b)-log(a)-log(b) = 0 + 2*pi*i*n, for all
> >>> "n". But I think this is not the case, I think the "n" in log(a*b) is
> >>> coupled to the implicit "n" in log(a) and log(b) in such a way, that
> >>> the result is exactly 0. Can you clarify exactly how this works?
> >>
> >> Try it this way:
> >>
> >> a*b = exp(?1)
> >> a = exp(?2)
> >> b = exp(?3)
> >>
> >> I think 'normalize' is saying that there is a solution that makes
> >>
> >> ?1 - ?2 - ?3 = 0.
> >
> > Ok, but why wouldn't normalize return 2*pi*i instead? Or 4*pi*i?
>
> These are equivalent in the sense of having the same number of
> algebraically independent transcendental kernels, i.e. none.

Am I understanding correctly that normalize picks some arbitrary
representative of an equivalence class of answers? That seems scary to
me, but perhaps it is sufficiently well-defined for some questions?

...
> > This discussion is about how a CAS should handle (complex)
> > differentiation. Since it started here, I would finish it here, so
> > that the whole thread is in one mailing list for future reference.
>
> OK. It would be nice to know if other sage-devel subscribers actually
> remain interested...

Yes, I find this thread casually interesting. However, I know little
of the subject or of FriCAS, which is also the reason I did not write
before.

> Let's return to differentiation for a moment. Using your definitions
> what would you say is the correct result for
>
> log(exp(z-conjugate(z))).diff(z)
>
> My patched version of FriCAS based on your definition in this thread
> currently returns 0. Do you get the same result?

I'm not interested enough to calculate this by hand, sorry.

> Since the derivative is 0 would we want to say therefore that
>
> log(exp(z-conjugate(z)))
>
> is a constant? If not, isn't this an argument for needing another
> derivative? The result of this test currently causes a problem during
> manipulations of expressions of this form. Check the two Wirtinger
> derivatives for this case. If we have both derivatives we can avoid
> this problem quite easily as my previous version of the patch showed.

The Wikipedia page suggests that df/d conjugate(z) is
conjugate(conjugate(f).diff(z)). If that is indeed the case, then it
seems that df/d conjugate(z) might be handled without implementing a
second diff-method.
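As a quick sanity check (an editorial addition, not part of the original exchange), the identity quoted above, df/d conjugate(z) = conjugate(conjugate(f).diff(z)), can be verified numerically with central differences, assuming the usual Wirtinger conventions d/dz = (d/dx - i*d/dy)/2 and d/dconj(z) = (d/dx + i*d/dy)/2:

```python
# Central-difference approximations of the two Wirtinger derivatives.
def wirtinger(f, z, h=1e-6):
    """d f / d z, using d/dz = (d/dx - i*d/dy)/2."""
    fx = (f(z + h) - f(z - h)) / (2 * h)
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (fx - 1j * fy)

def wirtinger_bar(f, z, h=1e-6):
    """d f / d conjugate(z), using d/dconj(z) = (d/dx + i*d/dy)/2."""
    fx = (f(z + h) - f(z - h)) / (2 * h)
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (fx + 1j * fy)

# Non-analytic test function f(w) = w^2 * conjugate(w);
# analytically df/dconj(z) = z^2.
f = lambda w: w**2 * w.conjugate()
g = lambda w: f(w).conjugate()           # conjugate(f)

z0 = 1 + 2j
lhs = wirtinger_bar(f, z0)               # df/dconj(z)
rhs = wirtinger(g, z0).conjugate()       # conjugate(d(conjugate(f))/dz)
assert abs(lhs - rhs) < 1e-6
assert abs(lhs - z0**2) < 1e-6           # z0^2 = -3+4j
```

The agreement of the two quantities at a non-analytic point is what makes the second diff-method dispensable.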


Regards,

Erik Massop

Bill Page

unread,
Nov 26, 2014, 12:18:05 PM11/26/14
to sage-devel
On 25 November 2014 at 14:51, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Tue, Nov 25, 2014 at 11:30 AM, Bill Page <bill...@newsynthesis.org> wrote:
...
>>>> Try it this way:
>>>>
>>>> a*b = exp(?1)
>>>> a = exp(?2)
>>>> b = exp(?3)
>>>>
>>>> I think 'normalize' is saying that there is a solution that makes
>>>>
>>>> ?1 - ?2 - ?3 = 0.
>>>
>>> Ok, but why wouldn't normalize return 2*pi*i instead? Or 4*pi*i?
>>
>> These are equivalent in the sense of having the same number of
>> algebraically independent transcendental kernels, i.e. none.
>
> I don't understand that. Is the result of normalize() multivalued?

No.

> Or how else could 0 be equivalent to 2*pi*i or 4*pi*i?

It is not equality; it is an equivalence relation, i.e. "modulo
constants". To dig deeper on this I think one would need to consult the
source code and someone who is much more of an expert in this subject:
Waldek Hebisch.

>>> In other words, how exactly are the operations on the multivalued
>>> sets log(x) defined?
>>
>> FriCAS does not perform operations on multivalued sets to determine
> the above.
>
> Ok. Though my question stands, how are the operations defined in your
> approach?
>

Does it help if I say the operations are defined "symbolically"?
Maybe we need to define exactly what operations we are talking about.

> ...
> Essentially the [derivative] formula with theta is equivalent to just
> returning a tuple of the two Wirtinger derivatives. So what holds for
> one approach holds for the other one.
>

Yes, so we agree that in general more than one derivative operator is necessary.

> ...
> My current best solution is to define a function `diff(x, theta=0)`,
> where the theta argument is 0 by default, but you can pass any
> angle into it, or a symbol theta if you want. That way you won't get
> the theta factors by default, but if in doubt, you can always get them.
>

It seems that you prefer an "infinite" number of derivative operators
while I still think it is best to define only two.
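For what it's worth (an editorial sketch, not from the original thread), the "infinite" family of derivatives can be demonstrated numerically. The convention assumed here is that the difference quotient of f along the direction e^(i*theta) equals df/dz + exp(-2i*theta)*df/dconj(z), so each theta yields a different value for a non-analytic f:

```python
import cmath
import math

# Difference quotient of f along the direction e^(i*theta). For a
# non-analytic f this depends on theta. The example f(w) = conjugate(w)
# has df/dz = 0 and df/dconj(z) = 1, so the directional derivative
# should be exactly exp(-2i*theta).
def directional(f, z, theta, h=1e-6):
    step = h * cmath.exp(1j * theta)
    return (f(z + step) - f(z - step)) / (2 * step)

f = lambda w: w.conjugate()

z0 = 0.3 + 0.7j
for theta in (0.0, math.pi / 3, 1.0):
    d = directional(f, z0, theta)
    expected = cmath.exp(-2j * theta)
    assert abs(d - expected) < 1e-9
```

With both Wirtinger derivatives in hand, any of these theta-derivatives is a one-line combination, which is the sense in which the two positions coincide.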

> Let me know if you have a better proposal.
>

After continued thinking about this and my current experiments in
FriCAS I am still of the opinion that the best option is to implement
just the Wirtinger derivative (only one since the other can be
obtained by 'conjugate'). This has the effect of making the derivative
of non-analytic functions subtly different from what you call the
conventional "real derivative" (e.g. factor of 1/2 in derivative of
'abs'). I have decided that I would prefer to explain this difference
to a less experienced user, rather than to get into a discussion of
theta and directional derivatives.

Bill.

Bill Page

unread,
Nov 26, 2014, 12:34:59 PM11/26/14
to Erik Massop, sage-devel
On 25 November 2014 at 15:14, Erik Massop <e.ma...@hccnet.nl> wrote:
> On Tue, 25 Nov 2014 13:30:33 -0500
> Bill Page <bill...@newsynthesis.org> wrote:
>
>> On 25 November 2014 at 01:11, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> > On Mon, Nov 24, 2014 at 10:23 PM, Bill Page <bill...@newsynthesis.org> wrote:
> ...
>> >> But I don't want to be forced to make a choice of branch until
>> >> I actually need to evaluate an expression numerically.
>> >
>> > I understand that's what you want. I am just trying to understand how
>> > exactly this works.
>>
>> OK.
>
> Without a choice of branch for sqrt, I cannot answer this question:
> * Is there a complex number x such that x*conjugate(x) equals sqrt(2)?
> This seems a non-numerical question to me. It seems to me that
> sqrt without a choice of branch is ill-defined, but perhaps it is
> sufficiently well-defined if you restrict to a certain kind of
> questions? If so, what questions can I ask? I think I know too
> little about the subject of this thread and of FriCAS.
>

It seems to me that your comment and example are quite appropriate
although for discussion of FriCAS I do recommend the fricas-devel
email list. I think you are right that one must restrict the kind of
questions. In particular I think one needs to be very careful to
define what one means by "equal". Usually this means that we can only
answer questions up to some equivalence relation.

>> ... I want this to be "algebraic", not some theorem of
>> predicate calculus. That is what I meant by taking
>>
>> x + conjugate(x)
>>
>> as the definition of a real valued variable.
>
> Do you mean that z is considered real-valued when there is x such that x
> + conjugate(x) is z? I got lost in this part of the thread.
>

Yes exactly.

>> > ...
>> >> Try it this way:
>> >>
>> >> a*b = exp(?1)
>> >> a = exp(?2)
>> >> b = exp(?3)
>> >>
>> >> I think 'normalize' is saying that there is a solution that makes
>> >>
>> >> ?1 - ?2 - ?3 = 0.
>> >
>> > Ok, but why wouldn't normalize return 2*pi*i instead? Or 4*pi*i?
>>
>> These are equivalent in the sense of having the same number of
>> algebraically independent transcendental kernels, i.e. none.
>
> Am I understanding correctly that normalize picks some arbitrary
> representative of an equivalence class of answers? That seems scary
> to me, but perhaps it is sufficiently well-defined for some questions?
>

Yes. More specifically FriCAS 'normalize' is an important part of the
machinery for integration but has other uses.

> ...
>> > This discussion is about how a CAS should handle (complex)
>> > differentiation. Since it started here, I would finish it here, so
>> > that the whole thread is in one mailinglist for future reference.
>>
>> OK. It would be nice to know if other sage-devel subscribers actually
>> remain interested...
>
> Yes, I find this thread casually interesting. However, I know little of
> the subject or of FriCAS, which is also the reason I did not write
> before.

No problem. I am happy to continue this discussion in whatever
direction and wherever (fricas-devel?) you like.

> ...
> The Wikipedia page suggests that df/d conjugate(z) is
> conjugate(conjugate(f).diff(z)). If that is indeed the case, then it
> seems that df/d conjugate(z) might be handled without implementing
> a second diff-method.
>

You are right. In fact that is exactly the proposal with which I
initially continued Ondřej's original thread. The main sticking point I
think is that the resulting derivative is subtly different for
non-holomorphic functions and that in this case using both Wirtinger
derivatives (or just one and 'conjugate') is necessary.

Bill.

Ondřej Čertík

unread,
Nov 26, 2014, 12:58:25 PM11/26/14
to sage-...@googlegroups.com
All I want is for you to give me an algorithm for your approach in
sufficient detail, so that it can be implemented by me on a computer.
And by "your approach", I mean an approach, where conjugate(log(x)) =
log(conjugate(x)) for all x.

I have provided all the details of the algorithm (B). In approach (B),
it is not true that
conjugate(log(x)) = log(conjugate(x)) for all x.

This equation (when conjugate(log(x)) = log(conjugate(x)) holds)
started this whole discussion.
So I was trying to understand your approach how to make this hold for
all "x", and I suggested various ways how maybe it could be
implemented, and to most of it you said "that's not how FriCAS does
it". At this point I don't have any more ideas how it could be done,
so I don't know how to implement your approach. Which is sad -- even
though I am not advocating for your approach, I wanted to really
understand it, so that I can make my own opinion on the pros and cons.

> Maybe we need to define exactly what operations we are talking about.

Sure. Let's just stick to one example, let me just copy & paste it
from my previous email:

>>> from cmath import log
>>> a = -1
>>> b = -1
>>> log(a*b)
0j
>>> log(a)+log(b)
6.283185307179586j

>>> def arg(x): return log(x).imag
...
>>> from math import floor, pi
>>> I = 1j
>>> log(a)+log(b)+2*pi*I*floor((pi-arg(a)-arg(b))/(2*pi))
0j


As you confirmed, even if you evaluate this in FriCAS, log(a*b) is not
equal to log(a) + log(b), when a=b=-1.
However, you claim that "symbolically" it is true that log(a*b) =
log(a) + log(b) for all "a" and "b" and you provided a FriCAS function
"normalize" that does it, but you said that for deeper understanding
you would need to consult Waldek Hebisch. Can you explain the
discrepancy/inconsistency?

How exactly are the operations in log(a*b) = log(a) + log(b) defined,
so that this equation holds, even though when you put in a=b=-1, you
get a different number on the LHS and RHS, as confirmed by FriCAS?

Once we resolve this, we can get back to conjugate(log(x)) =
log(conjugate(x)) which also clearly doesn't hold for x=-1 for the
same reason, and so you must be able to somehow extend the operations
so that this equation holds even for x=-1 in your approach.

>
>> ...
>> Essentially the [derivative] formula with theta is equivalent to just
>> returning a tuple of the two Wirtinger derivatives. So what holds for
>> one approach holds for the other one.
>>
>
> Yes, so we agree that in general more than one derivative operator is necessary.
>
>> ...
>> My current best solution is to define a function `diff(x, theta=0)`,
>> where the theta argument is 0 by default, but you can pass any
>> angle into it, or a symbol theta if you want. That way you won't get
>> the theta factors by default, but if in doubt, you can always get them.
>>
>
> It seems that you prefer an "infinite" number of derivative operators
> while I still think it is best to define only two.

The two approaches are equivalent, as I just pointed out. Even if you
define only the two Wirtinger derivatives, nothing stops you from
adding the theta factor and you also obtain the "infinite" number of
derivatives.

>
>> Let me know if you have a better proposal.
>>
>
> After continued thinking about this and my current experiments in
> FriCAS I am still of the opinion that the best option is to implement
> just the Wirtinger derivative (only one since the other can be
> obtained by 'conjugate'). This has the effect of making the derivative
> of non-analytic functions subtly different from what you call the
> conventional "real derivative" (e.g. factor of 1/2 in derivative of
> 'abs'). I have decided that I would prefer to explain this difference
> to a less experienced user, rather than to get into a discussion of
> theta and directional derivatives.

Cool, thanks. So you propose that abs(x).diff(x) returns
conjugate(x)/(2*abs(x)) ?
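For the record, a quick numerical check (an editorial addition, not FriCAS output) that the Wirtinger derivative of abs really is conjugate(x)/(2*abs(x)), under the convention d/dz = (d/dx - i*d/dy)/2:

```python
# Central-difference Wirtinger derivative, d/dz = (d/dx - i*d/dy)/2.
def wirtinger(f, z, h=1e-6):
    fx = (f(z + h) - f(z - h)) / (2 * h)
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (fx - 1j * fy)

z0 = 1 + 2j
d = wirtinger(abs, z0)
# Compare with the closed form conjugate(z)/(2*abs(z)).
assert abs(d - z0.conjugate() / (2 * abs(z0))) < 1e-6
```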

I personally don't think that's a good idea for a CAS, if this was the
case, then I would prefer "diff" to return unevaluated, and create a
new function diff_wirtinger() that returns conjugate(x)/(2*abs(x)).
Then there is no issue.

Ondrej

Bill Page

unread,
Nov 27, 2014, 12:27:52 AM11/27/14
to sage-devel
On 26 November 2014 at 12:58, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Wed, Nov 26, 2014 at 10:17 AM, Bill Page <bill...@newsynthesis.org> wrote:
>>
>> Does it help if a say the operations are defined "symbolically"?
>
> All I want is if you can give me an algorithm of your approach
> in sufficient detail, so that it can be implemented by me on a
> computer. And by "your approach", I mean an approach, where
> conjugate(log(x)) = log(conjugate(x)) for all x.
>

I am sorry, we seem to be having some trouble communicating. Is that
something infecting this email list? :)

Making "conjugate(log(x)) = log(conjugate(x)) for all x" is trivial
so long as it is treated symbolically: the 'conjugate' operation is
just defined to rewrite itself (auto-simplify) when applied to any
operand of the form log(_), so 'conjugate(log(_))' is evaluated as
'log(conjugate(_))', where _ stands for any element of the domain
Expression. This is what I meant when I said it was considered true
by definition, i.e. by definition of the symbolic 'conjugate'
operation. Exactly the same sort of thing happens when the
'conjugate' operation acts on 'conjugate' so that
'conjugate(conjugate(x))' is simply rewritten as 'x'.

> I have provided all the details of the algorithm (B). In approach (B),
> it is not true that
> conjugate(log(x)) = log(conjugate(x)) for all x.
>
> This equation (when conjugate(log(x)) = log(conjugate(x)) holds)
> started this whole discussion.

That

log(a*b) = log(a) + log(b)

is considerably less trivial than the case of 'conjugate'. From my
point of view that is what actually started this branch of the
"fabric" of this discussion. That is where 'normalize' comes in.

> So I was trying to understand your approach how to make this hold
> for all "x", and I suggested various ways how maybe it could be
> implemented, and to most of it you said "that's not how FriCAS does
> it". At this point I don't have any more ideas how it could be done,
> so I don't know how to implement your approach. Which is sad --
> even though I am not advocating for your approach, I wanted to
> really understand it, so that I can make my own opinion on the pros
> and cons.

Thank you for attempting to understand.

I think I only used the phrase "that's not how FriCAS does it" in the
context of multi-valued functions. My point is that FriCAS makes no
attempt to evaluate a multi-valued function symbolically. But FriCAS
does rewrite expressions involving multi-valued functions in some
cases automatically and in others when asked to do so by operators
like 'normalize'.

>
>> Maybe we need to define exactly what operations we are talking about.
>
> Sure. Let's just stick to one example, let me just copy & paste it
> from my previous email:
>
>>>> from cmath import log
>>>> a = -1
>>>> b = -1
>>>> log(a*b)
> 0j
>>>> log(a)+log(b)
> 6.283185307179586j
>
>>>> def arg(x): return log(x).imag
> ...
>>>> from math import floor, pi
>>>> I = 1j
>>>> log(a)+log(b)+2*pi*I*floor((pi-arg(a)-arg(b))/(2*pi))
> 0j
>
> As you confirmed, even if you evaluate this in FriCAS, log(a*b) is not
> equal to log(a) + log(b), when a=b=-1.

Yes, I showed that, as expected, this was not equal when 'log' is
evaluated in a numeric domain, but I am talking about a domain
constructed by 'Expression', which is a "symbolic" domain.

> However, you claim that "symbolically" it is true that log(a*b) =
> log(a) - log(b) for all "a" and "b" and you provided a FriCAS function
> "normalize" that does it,

No not exactly. I am sorry that I did not express myself more
clearly. Actually if I evaluate

test ( log(a*b) = log(a)+log(b) )

FriCAS returns 'false' since no automatic simplifications apply here
and these are obviously two different expressions. What I showed was
that

normalize(log(a*b)-log(a)-log(b))

returns 0.

> but you said that for deeper understanding you would need to consult
> Waldek Hebisch. Can you explain the discrepancy/inconsistency?

Well, um, what I tried to say was that for a deeper understanding of
'normalize' we would have to either read the source code of
'normalize' or talk with Waldek, who has studied the source code more
carefully and thoroughly than I have. 'normalize' was written by Manuel
Bronstein. There is no specific documentation except for that
contained in the source code:

https://github.com/fricas/fricas/blob/master/src/algebra/efstruc.spad#L83

and unfortunately Manuel Bronstein is dead. Bronstein did however
publish several books and numerous articles. In particular 'normalize'
is part of his implementation of the "Risch structure theorem". E.g.
http://dl.acm.org/citation.cfm?id=74566 As I recall there was some
Google Summer of Code work on sympy related to this.

But Waldek has made a number of important recent changes to this package.

>
>> How exactly are the operations in log(a*b) = log(a) + log(b) defined,
> so that this equation holds, even though when you put in a=b=-1, you
> get a different number on the LHS and RHS, as confirmed by FriCAS?
>

My admittedly primitive understanding of how 'normalize' operates in
this case is that it is similar in principle to what one does to show
for example that '(a*b)/(a*c) - b/c = 0', i.e. by rewriting the
expression to a canonical equivalent form (although of course this is
actually done by automatic simplifications in FriCAS). It is my
intention to continue to work toward improving my understanding of
this part of FriCAS especially since Waldek has expressed doubts about
the soundness of introducing 'conjugate' into Expression in the
context of this function.

> ...

Bill.

Ondřej Čertík

unread,
Dec 5, 2014, 3:20:41 PM12/5/14
to sage-...@googlegroups.com
Hi Bill,

I thought about this a lot (essentially I studied complex analysis
from several books as well as consulted with many colleagues) and I
figured out some answers to my questions.

In the approach (A), you have:

log(a*b) = log(a) + log(b)

What that means is that log() is multivalued, so you can add 2*pi*i*n
for all "n". The way to do arithmetic and compare multivalued
functions is simply to make sure that the infinite (sometimes it could
be finite) set of values on the left is equivalent to the infinite set
of values on the right. In other words, pick a value on the left; for
the sake of argument let's say a=b=-1 and n=5, so we get log(a*b) =
log(1) = 0 + 2*pi*i*5 = 10*pi*i. If you can find a combination of
values on the right hand side equal to 10*pi*i, and can do this for
every integer "n", and if you can also do the opposite, i.e. pick any
combination of values on the right hand side and find a value on the
left hand side equal to it, then you have proved the equality. I.e.
you have proved that the infinite sets of multivalues on the left hand
side and the right hand side are equal.
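This set equality can be spot-checked on a finite window of branches (an editorial sketch; the window size N is arbitrary). For a=b=-1, each value is encoded as its integer coefficient of i*pi, so log(1) = 2*pi*i*n gives the even coefficients 2n, and log(-1)+log(-1) = 2*pi*i*(1+n+m) gives even coefficients as well:

```python
# Truncated multivalue sets for log(a*b) and log(a)+log(b) at a = b = -1,
# each value encoded as its integer coefficient of i*pi.
N = 50
lhs = {2 * n for n in range(-N, N + 1)}               # log(1) = 2*pi*i*n
rhs = {2 * (1 + n + m)                                # log(-1) + log(-1)
       for n in range(-N, N + 1) for m in range(-N, N + 1)}

assert lhs <= rhs                                     # every LHS value occurs on the RHS
assert {v for v in rhs if -2 * N <= v <= 2 * N} == lhs  # and vice versa, within the window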

Once we have an understanding how log(z) works, we simply can derive
all kinds of formulas in the approach (A). The way it works is that
you put in the 2*pi*n factors, i.e. you explicitly enumerate all
possibilities, then you derive some formulas, and at the end you
absorb the 2*pi*n factors into the multivalued functions, i.e. you can
always absorb 2*pi*i*n into log(). But sometimes it might not be
possible to completely absorb all these factors.

Now let's apply this to the problems below:

On Wed, Nov 26, 2014 at 10:27 PM, Bill Page <bill...@newsynthesis.org> wrote:
> On 26 November 2014 at 12:58, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> On Wed, Nov 26, 2014 at 10:17 AM, Bill Page <bill...@newsynthesis.org> wrote:
>>>
>>> Does it help if a say the operations are defined "symbolically"?
>>
>> All I want is for you to give me an algorithm for your approach
>> in sufficient detail, so that it can be implemented by me on a
>> computer. And by "your approach", I mean an approach, where
>> conjugate(log(x)) = log(conjugate(x)) for all x.
>>
>
> I am sorry, we seem to be having some trouble communicating. Is that
> something infecting this email list? :)
>
> Making "conjugate(log(x)) = log(conjugate(x)) for all x" is trivial
> so long as it is treated symbolically: the 'conjugate' operation is
> just defined to rewrite itself (auto-simplify) when applied to any
> operand of the form log(_), so 'conjugate(log(_))' is evaluated as
> 'log(conjugate(_))', where _ stands for any element of the domain
> Expression. This is what I meant when I said it was considered true
> by definition, i.e. by definition of the symbolic 'conjugate'
> operation. Exactly the same sort of thing happens when the
> 'conjugate' operation acts on 'conjugate' so that
> 'conjugate(conjugate(x))' is simply rewritten as 'x'.

Sure, on this level you can implement it. I was thinking on a deeper
level, i.e. imagining putting a number x=-1 in and see how could this be true:

conjugate(log(-1)) = log(conjugate(-1))

The answer that I was looking for is this:

LHS: conjugate(log(-1)) = conjugate(i*pi + 2*pi*i*n) = -i*pi-2*pi*i*n
RHS: log(conjugate(-1)) = log(-1) = i*pi + 2*pi*i*m

If we pick n=-m-1, we always get LHS=RHS, so the two infinite set of
multivalues are equivalent, and the relation conjugate(log(-1)) =
log(conjugate(-1)) holds.
When you evaluate log(-1), you cannot just give i*pi, you need to give
all the multivalues. But otherwise it works.
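The pairing argument above can be brute-forced (an editorial addition): writing each value as i*pi times an odd integer coefficient, the two truncated branch sets coincide, and the substitution n = -m-1 matches them up value by value:

```python
# Branch sets for conjugate(log(-1)) = log(conjugate(-1)), each value
# written as i*pi times an odd integer coefficient:
#   LHS: conjugate(log(-1)) = -i*pi - 2*pi*i*n  ->  coefficient -1 - 2*n
#   RHS: log(conjugate(-1)) =  i*pi + 2*pi*i*m  ->  coefficient  1 + 2*m
N = 50
lhs = {-1 - 2 * n for n in range(-N, N)}
rhs = {1 + 2 * m for m in range(-N, N)}
assert lhs == rhs                          # same truncated set of branches
# The explicit pairing n = -m - 1 identifies the values one by one:
assert all(-1 - 2 * (-m - 1) == 1 + 2 * m for m in range(-N, N))
```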

>
>> I have provided all the details of the algorithm (B). In approach (B),
>> it is not true that
>> conjugate(log(x)) = log(conjugate(x)) for all x.
>>
>> This equation (when conjugate(log(x)) = log(conjugate(x)) holds)
>> started this whole discussion.
>
> That
>
> log(a*b) = log(a) + log(b)
>
> is considerably less trivial that the case of 'conjugate'. From my
> point of view that is what actually started this branch of the
> "fabric" of this discussion. That is where 'normalize' comes in.

I think the above answers both, it all works and is consistent in the
approach (A). You just need to remember that if a function is
multivalued, e.g. log(z), then you always need to enumerate all the
values and prove that the LHS is equivalent to RHS.

There is a theorem that says that, actually, if you give me a complex
function's values on just one branch, I can reconstruct the function in
all branches. So it is probably the case that you only need to find
one set of "n", "m" and "k" to satisfy the equation and it will then
hold for the other values as well. But for clarity, I always prove it
for all values.

>
>> So I was trying to understand your approach how to make this hold
>> for all "x", and I suggested various ways how maybe it could be
>> implemented, and to most of it you said "that's not how FriCAS does
>> it". At this point I don't have any more ideas how it could be done,
>> so I don't know how to implement your approach. Which is sad --
>> even though I am not advocating for your approach, I wanted to
>> really understand it, so that I can make my own opinion on the pros
>> and cons.
>
> Thank you for attempting to understand.
>
> I think I only used the phrase "that's not how FriCAS does it" in the
> context of multi-valued functions. My point is that FriCAS makes no
> attempt to evaluate a multi-valued function symbolically. But FriCAS
> does rewrite expressions involving mutli-valued functions in some
> cases automatically and in others when asked to do so by operators
> like 'normalize'.

Yes, log(a*b) can always be rewritten to log(a)+log(b) as long as
everything is multivalued.

>
>>
>>> Maybe we need to define exactly what operations we are talking about.
>>
>> Sure. Let's just stick to one example, let me just copy & paste it
>> from my previous email:
>>
>>>>> from cmath import log
>>>>> a = -1
>>>>> b = -1
>>>>> log(a*b)
>> 0j
>>>>> log(a)+log(b)
>> 6.283185307179586j
>>
>>>>> def arg(x): return log(x).imag
>> ...
>>>>> from math import floor, pi
>>>>> I = 1j
>>>>> log(a)+log(b)+2*pi*I*floor((pi-arg(a)-arg(b))/(2*pi))
>> 0j
>>
>> As you confirmed, even if you evaluate this in FriCAS, log(a*b) is not
>> equal to log(a) + log(b), when a=b=-1.
>
> Yes, I showed that as expected this was not equal when 'log' is
> evaluated in a numeric domain but I am talking about a domain
> constructed by 'Expression' which is a "symbolic" domain.

Right, so the point is that when you evaluate numerically, you *need*
to implicitly add the 2*pi*i*n factor and only compare the infinite
set of values.
Then there is no issue.

>
I think we don't need to know how normalize works anymore, since
obviously in the approach (A),
log(a*b) = log(a) + log(b).

> But Waldek has made a number if important recent changes to this package.
>
>>
>> How exactly are the operations in log(a*b) = log(a) - log(b) defined,
>> so that this equation holds, even though when you put in a=b=-1, you
>> get a different number on the LHS and RHS, as confirmed by FriCAS?
>>
>
> My admittedly primitive understanding of how 'normalize' operates in
> this case is that it is similar in principle to what one does to show
> for example that '(a*b)/(a*c) - b/c = 0', i.e. by rewriting the
> expression to a canonical equivalent form (although of course this is
> actually done by automatic simplifications in FriCAS). It is my
> intention to continue to work toward improving my understanding of
> this part of FriCAS especially since Waldek has expressed doubts about
> the soundness of introducing 'conjugate' into Expression in the
> context of this function.

I played with various formulas for multivalued functions and it's all
consistent, and for example these definitely hold:

log(a*b) = log(a)+log(b)
conjugate(log(x)) = log(conjugate(x))

But then I tried:

(x^a)^b = ( e^(a*log(x)) )^b = e^(b*log(e^(a*log(x)))) =
e^(b*(a*log(x) + 2*pi*i*n)) = e^(a*b*log(x) + b*2*pi*i*n) = x^(a*b) *
e^(b*2*pi*i*n)

I was just using the definitions and put the 2*pi*i factors in at
appropriate places. As you can see, in this case, the 2*pi*i factor
can't be absorbed. So this is ugly. But it works, i.e.

sqrt(x^2) = (x^2)^(1/2) = x^(2*1/2) * e^(1/2 * 2*pi*i*n) = x *
e^(pi*i*n) = x * (-1)^n

We are still using the approach (A), so everything is multivalued. In
this case, we have only 2 values, +x and -x, but it's still a
multivalued function with both of these values holding at the same
time.
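A numerical illustration of this two-valuedness (an editorial addition): the principal value that Python's cmath returns for sqrt(x^2) always lands on one of the two branch values +x or -x, but which one depends on x:

```python
import cmath

# sqrt(x^2) is two-valued, +x or -x; cmath's principal value always
# lands on one of those two branches.
for x in (3, -3, 1 + 1j, -1 + 1j, -2 - 0.7j):
    s = cmath.sqrt(x * x)
    assert min(abs(s - x), abs(s + x)) < 1e-9
```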

It should now be clear that it is *not* true that (x^a)^b = x^(a*b),
because then for a=2, b=1/2, you would get:

sqrt(x^2) = x

But the function on the left is multivalued (with values/branches +x
and -x), while the function on the right is single valued with only
one value "x". The only way you could make this work is if you say
that it is possible to find a branch on the left (+x) that agrees with
the single value on the right. But for a CAS, it would be a mistake to
simplify sqrt(x^2) to x. It would be ok to simplify sqrt(x^2) to
x*(-1)^n, but it's ugly, since now you have "n" in there.

For this reason, the approach (A) is not very well suited for a CAS
and I think approach (B) is much better. The approach (B) follows from
(A) by simply choosing such "n", that picks the principal branch. So
for example for sqrt(x^2), it picks n = floor((pi-2*arg(x)) / (2*pi)).
As an added bonus, since numerical evaluations of log(z) and other
functions also returns the principal branch, all the formulas are
consistent and no need to worry about any 2*pi*n factors.

Ondrej

Ondřej Čertík

unread,
Dec 5, 2014, 4:32:58 PM12/5/14
to sage-...@googlegroups.com
But there is one issue actually. We prove that

log(a*b) = log(a) + log(b)

in the sense that the set of multivalues on the LHS is equal to the
set of multivalues on the RHS. So we can write:

log(a*b) - log(a) - log(b) = 2*pi*i*n

I.e. the LHS = log(a*b) - log(a) - log(b) is a multivalued function,
it is not zero. The set of values is equal to RHS = 2*pi*i*n.
So if you write:

log(a*b) - log(a) - log(b) = 0

Then it only holds in a sense, that you can pick "n" or a branch on
the LHS such that it is equal to the RHS, i.e. 0. But nothing stops
you from picking a different branch, let's say n=5, i.e. LHS =
2*pi*i*5 = 10*pi*i, and that is most definitely not equal to 0. It is
the same as with the case sqrt(x^2) = x*(-1)^n above. You can write it
as sqrt(x^2) = x, but then it only holds in the sense that you can
always (i.e. for any 'x') pick a branch of sqrt(x^2) such that it is
equal to 'x'. But the problem is that this branch pick depends on "x"
or "a,b", i.e. for some values of "x" or "a,b" you have to pick one
branch, but for other values you have to pick a different branch. This
is obvious for sqrt(x^2) = x, i.e. for x=3, you need to pick the +x
branch, but for x=-3, you need to pick the -x branch. The same for
log(a*b) - log(a) - log(b) = 0, i.e. for a=b=1 you pick the branch
where log(1) = 0, but for a=b=-1, you have to pick log(-1) = i*pi, but
log(1) = 2*pi*i (those are two different branches), so that log(1) -
log(-1) - log(-1) = 0.

In other words, if you like that a CAS simplifies / "normalizes"

1) log(a*b) - log(a) - log(b) = 0

then you should also want the CAS to normalize:

2) sqrt(x^2) = x

Do you agree with me, based on the above analysis, that these two
cases 1) and 2) are exactly equivalent? I.e. in both you need to pick
a specific branch on the LHS to make it equal the RHS and the branch
pick depends on the values of "a,b" or "x".


Assuming you agree, the next step is to realize that the
simplification 2), i.e. sqrt(x^2) = x is especially problematic, since
every high schooler knows that for real numbers, we have sqrt(x^2) =
|x|, not sqrt(x^2) = x. Yes, you can make 2) work, and above I
described in detail how it works, but this is not what most people
use. Just google the internet for sqrt(x^2), e.g. here:

http://math.stackexchange.com/a/961795/30944

Everybody will tell you that sqrt(x^2) = -x for negative "x", i.e.
that sqrt(x^2) = |x|. I can't even imagine doing any kinds of
calculations with assuming sqrt(x^2) = x, that just quickly leads to
wrong answers so easily. In fact, the stackexchange question assumed
sqrt(x^2) = x and it led to a wrong answer (yes, the poster should
have picked all the branches consistently if he wanted to use
sqrt(x^2) = x).

But let me know if you have any arguments why we should even
entertain the cases 1) and 2), i.e. equating a multivalued function to
a single-valued function (i.e. 0 or "x").

What I think can be made to work is to simply always equate
multivalued functions to multivalued ones, e.g.:

log(a*b) - log(a) - log(b) = 2*pi*i*n
sqrt(x^2) = x*(-1)^n

That works and the chances of mistakes are quite low. So we have three
approaches to complex analysis:

(A) multivalued approach, e.g.:

sqrt(x^2) = x*(-1)^n
log(a*b) = log(a) + log(b)
log(a*b) - log(a) - log(b) = 2*pi*i*n
conjugate(log(z)) = log(conjugate(z))
conjugate(log(z)) - log(conjugate(z)) = 2*pi*i*n

(A') multivalued approach, when you are allowed to equate multivalued
functions to single-valued ones by picking a specific branch (but not
always the same branch for all "x" or "a,b"), like

sqrt(x^2) = x
log(a*b) = log(a) + log(b)
log(a*b) - log(a) - log(b) = 0
conjugate(log(z)) - log(conjugate(z)) = 0

(B) single valued approach on a principal branch (the same branch for
all "x" or "a,b"), like

sqrt(x^2) = x * (-1)^floor((pi-2*arg(x)) / (2*pi))
log(a*b) = log(a) + log(b) + 2*pi*i*floor((pi-arg(a)-arg(b))/(2*pi))
log(a*b) - log(a) - log(b) = 2*pi*i*floor((pi-arg(a)-arg(b))/(2*pi))
conjugate(log(z)) = log(conjugate(z)) -2*pi*i*floor((arg(z)+pi)/(2*pi))
conjugate(log(z)) - log(conjugate(z)) = -2*pi*i*floor((arg(z)+pi)/(2*pi))
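These (B) formulas can be checked directly against Python's cmath, which evaluates everything on the principal branch (an editorial check; note that points exactly on the branch cut need care, since IEEE signed zeros would otherwise make conjugate(-1+0j) carry arg -pi):

```python
import cmath
from math import floor, pi

arg = cmath.phase

def conj(z):
    # Keep +0.0 imaginary parts so that conjugate(-1+0j) stays on the
    # arg = +pi side of the cut (plain .conjugate() would give -0.0).
    return complex(z.real, 0.0 - z.imag)

# log(a*b) = log(a) + log(b) + 2*pi*i*floor((pi - arg(a) - arg(b))/(2*pi))
for a, b in [(-1 + 0.5j, -2 - 0.3j), (2 + 3j, -0.5 + 0.1j), (-1 + 0j, -1 + 0j)]:
    n = floor((pi - arg(a) - arg(b)) / (2 * pi))
    assert abs(cmath.log(a * b) - (cmath.log(a) + cmath.log(b) + 2j * pi * n)) < 1e-12

# conjugate(log(z)) = log(conjugate(z)) - 2*pi*i*floor((arg(z) + pi)/(2*pi))
for z in (-1 + 0.5j, 2 + 3j, -2 - 0.7j, -1 + 0j):
    n = floor((arg(z) + pi) / (2 * pi))
    assert abs(conj(cmath.log(z)) - (cmath.log(conj(z)) - 2j * pi * n)) < 1e-12

# sqrt(x^2) = x * (-1)^floor((pi - 2*arg(x))/(2*pi))
for x in (3 + 0j, -3 + 0j, 1 + 1j, -1 + 1j):
    n = floor((pi - 2 * arg(x)) / (2 * pi))
    assert abs(cmath.sqrt(x * x) - x * (-1) ** n) < 1e-12
```

All three identities hold for every test point with no leftover 2*pi*i*n ambiguity, which is the "added bonus" of approach (B) mentioned above.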



These examples should very clearly illustrate the differences between
the three approaches.

You can clearly see that (B) is just (A) where we pick a specific
"n" (depending on "x" or "a,b") that always selects the principal
branch (i.e. the resulting branch choice is the same for every "x" or
"a,b": always the principal branch). All formulas in (B) thus have a specific "n" in
them, usually in terms of the floor() function. All formulas hold for
all "x", "a,b" as they are (no further branch picking is necessary,
one can directly evaluate them numerically).

In (A), some of the formulas look "nice", because the "n" dependence
is absorbed in the multivalued functions, but some other formulas have
an explicit "n" dependence. There is no way around it. All formulas
hold for all "x" or "a,b", but when evaluating numerically, one needs
to keep the "n" dependence in it and be able to treat them as a
collection of values (multivalued).

Finally, approach (A') results from (A) by picking "n" which is
*independent* of "x" or "a,b", typically just n=0. From (B) it follows
that this approach inevitably makes the branch pick dependent on "x"
or "a,b", which means that for some "x" it picks one branch, but for
some other "x" it picks another branch. As such, some of these
formulas do *not* hold for all "x" or "a,b" with the same branch, but
rather depending on "x" or "a,b", one needs to pick the right branch
when evaluating numerically. Some other formulas are the same as in
(A), so those hold for all "x" or "a,b".


Ondrej

kcrisman

unread,
Mar 24, 2015, 9:08:29 PM3/24/15
to sage-...@googlegroups.com
A related question just popped up on ask.sagemath:
Not exactly the same but I think it gets at the same underlying issues of "is it a complex variable or isn't it".

Bill Page

unread,
Mar 24, 2015, 10:53:35 PM3/24/15
to sage-devel
I am still working on this in FriCAS and currently have an operational
version 0.2. Although this is the Sage list, I would be glad to
continue the discussion and especially with someone willing to review
what I have developed so far in FriCAS. I think I now understand all
this a bit better than I did in December. From my point of view the
issue really is: What is a "real variable" from an algebraic
perspective? The usual treatment is more topological or at least
geometric than it is algebraic. It seems to me that this is ultimately
what has led to radical differences and ad hoc solutions in most
computer algebra systems to date. In short, to define what we mean by
real algebraically I think it is first necessary to define conjugate,
or more specifically involutive (star)-algebra.
Reply all
Reply to author
Forward
0 new messages