Function defined with matrices won't plug in values for function


saad khalid

Nov 30, 2018, 4:59:40 PM11/30/18
to sage-support
Hi all,

I'm not sure what is happening, but I defined some matrices:
Mz = matrix([[0,1],[1,0]])
Mx = matrix([[1,0],[0,-1]])
M1 = matrix([[1,0],[0,1]])

And then I tried defining a function where I multiply these matrices by some variable and add them together:

h(s) = M1 + s*Mx
h(.1)
However, when I compute h(.1), the value .1 is not substituted for s; instead it returns:

[ s + 1      0]
[     0 -s + 1]

What exactly is happening?

Dima Pasechnik

Nov 30, 2018, 5:44:28 PM11/30/18
to sage-s...@googlegroups.com
It seems that doing this kind of thing over SR is not implemented.
However, you can do something like this:
R.<s>=QQ[]
Mz = matrix(R,[[0,1],[1,0]])
Mx = matrix(R,[[1,0],[0,-1]])
M1 = matrix(R,[[1,0],[0,1]])
h = M1 + s*Mx

sage: h(s=1/2)
[3/2   0]
[  0 1/2]

Here h is a matrix with entries in R, and substitution works...
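
And since substitution lands back in an ordinary matrix, follow-up linear algebra should just work, e.g. (a quick sketch):

sage: h(s=1/2).change_ring(QQ).eigenvalues()  # should give [3/2, 1/2] (possibly in another order)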

saad khalid

Nov 30, 2018, 9:22:58 PM11/30/18
to sage-support
I'm quite confused: why is this not implemented? It seems like the kind of thing that is so simple it should "just work." Is there some strange complication that could arise from defining it over SR that I'm not thinking of? I feel like it should be possible to define the function and have it at least try to plug in the input value.

Nils Bruin

Dec 1, 2018, 12:22:40 AM12/1/18
to sage-support
On Friday, November 30, 2018 at 1:59:40 PM UTC-8, saad khalid wrote:
However, when I compute h(.1), the value .1 is not substituted for s; instead it returns:

[ s + 1      0]
[     0 -s + 1]

What exactly is happening?

The problem is that a matrix over SR is not a symbolic expression itself:

sage: parent(M1+s*Mx)
Full MatrixSpace of 2 by 2 dense matrices over Symbolic Ring

The effect of "h(s)=..." is (you can check by calling 'preparse("h(s)=...")'):

 h = symbolic_expression(M1+s*Mx).function(s)

which "forces" the sage object M1+s*Mx into the symbolic ring. Nearly everything can be stuffed into the symbolic ring, but in this case it basically becomes a constant:

sage: symbolic_expression(M1+s*Mx).variables()
()
sage: symbolic_expression(M1+s*Mx).is_constant()
True

Once stuffed into SR, it's an opaque entity, and the occurrence of s in it is not visible. So turning it into a function of s is straightforward: it's a constant function!

The solution here is to use the fact that matrices over symbolic rings do inherit some properties of symbolic expressions. One is substitution, so

sage: h=M1+s*Mx
sage: h(s=1.0)
[ 2.00000000000000                 0]
[                0 0.000000000000000]

works just fine.

If you want to turn it into something for which h(1.0) works, you can do

h=lambda s0: (M1+s*Mx)(s=s0)
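
Then evaluation behaves as you'd expect, and the wrapper composes with tools like plot that accept any callable (a quick sketch):

sage: h(1.0)
[ 2.00000000000000                 0]
[                0 0.000000000000000]
sage: plot(lambda t: h(t)[0,0], 0, 1)  # entry (0,0) as a function of s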

Carl Eberhart

Dec 1, 2018, 8:41:17 PM12/1/18
to sage-support
I am using the lambda notation for matrix functions. It works very well:

m1=matrix([[2,3],[1,6]]);m2=matrix([[3,6],[9,1]])
f=lambda s:m1+s*m2
f(.4) 
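
And since plot() accepts any Python callable, the same lambda style should carry over, e.g. (a sketch):

plot(lambda s: f(s)[0,0], (0, 1))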

saad khalid

Dec 2, 2018, 1:22:25 PM12/2/18
to sage-support
I appreciate the help with solving my problem, though I do have some comments/questions regarding usability for new users:

Why can't matrices be allowed in SR in a nontrivial way? Or, for some function H(s), why can't the behavior H(1) = H(s=1) be the default? I guess my issue is that there doesn't seem to be a simple/consistent way to define a function that works, and I believe there should be. Mathematica manages this functionality somehow; what makes it difficult here? I am certainly a fan of using lambda functions, but iirc they are from Python and are not inherently Sage/mathematical objects, right? Lambda functions aren't the "suggested" way to make functions in the Sage tutorial. Perhaps it is just not known to me, but I would love for there to be some simple, consistent standard for defining functions such that, if I define one in this way and try to plot it or plug in values with a certain notation, it will always work and not give me something weird like this. Does such a consistent notation exist that I haven't seen? The page "Some Common Issues with Functions" in the documentation essentially acknowledges this issue.

Thanks for all of the help so far, I appreciate it!

Nils Bruin

Dec 2, 2018, 4:38:09 PM12/2/18
to sage-support
On Sunday, December 2, 2018 at 10:22:25 AM UTC-8, saad khalid wrote:
I appreciate the help with solving my problem, though I do have some comments/questions regarding usability for new users:

Why can't matrices be allowed in SR in a nontrivial way?

From a mathematical point of view, you DO want them to have different parents: one generally first defines a ring (SR in this case) and THEN considers the space of n-by-m matrices over that ring. It's very natural for them NOT to have the same parent (e.g., a square matrix has a "determinant" method -- a method generally not available on SR elements). Hence:

sage: M=matrix(2,2,[1,0,0,x])
sage: M.parent()
Full MatrixSpace of 2 by 2 dense matrices over Symbolic Ring
 
Many "symbolic" methods are now available on this object:

sage: M.variables()
(x,)
sage: M.simplify_trig()
[1 0]
[0 x]

but also matrix-specific ones, including some confusing ones:

sage: M.characteristic_polynomial()
x^2 + (-x - 1)*x + x
sage: M.characteristic_polynomial('t')
t^2 + (-x - 1)*t + x

(note the two different objects that print as "x" in the first expression)

To see the problem that arises from putting matrices into SR, consider for instance

sage: A=matrix(2,2,[0,M,M,0])
sage: B=matrix(2,2,[0,SR(M),SR(M),0])

sage: parent(A)
Full MatrixSpace of 2 by 2 dense matrices over Full MatrixSpace of 2 by 2 dense matrices over Symbolic Ring
sage: parent(B)
Full MatrixSpace of 2 by 2 dense matrices over Symbolic Ring

Especially when you take into account that SR.is_commutative() == True, you see that "B" has no chance of acting appropriately, whereas A would (matrices over non-commutative rings haven't received much attention in Sage yet, so I'd tread carefully, though).

Or, for some function H(s), why can't the behavior H(1) = H(s=1) be the default?

sage: var('x,y')
(x, y)
sage: f=x+2*y

Should f(1) return "1 + 2*y" or "x + 2"? It's ambiguous, and therefore deprecated. Compare instead:

sage: g=f.function(x)
sage: parent(g)
Callable function ring with argument x

Now it's completely clear what g(1) should return.
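
Continuing the example (a quick check of this):

sage: g(1)
2*y + 1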

I guess my issue is that there doesn't seem to be a simple/consistent way to define a function that works, and I believe there should be. Mathematica manages this functionality somehow; what makes it difficult here? I am certainly a fan of using lambda functions, but iirc they are from Python and are not inherently Sage/mathematical objects, right?

Sage is built on top of Python, so there is nothing wrong with using Python constructs in Sage. In fact, that's part of the design: it means that Sage didn't have to develop all the usual programming-language infrastructure. The functionality provided by "symbolic functions" that isn't already covered by symbolic expressions and "lambda" functions is really quite limited. In principle, "M.function(x)" could be implemented, returning a "callable matrix over SR with argument x", but there hasn't been sufficient demand for it.

The most compelling use for "callable expressions" comes from differential operators, and doing that for vector-valued functions is quite a different beast -- "sage manifolds" deals with that in a more general setting.

Lambda functions aren't the "suggested" way to make functions in the Sage tutorial. Perhaps it is just not known to me, but I would love for there to be some simple, consistent standard for defining functions such that, if I define one in this way and try to plot it or plug in values with a certain notation, it will always work and not give me something weird like this.

I'd think "lambda functions". There you know exactly what happens when you plug something in, because you specify it!
 
Does such a consistent notation exist that I haven't seen? The page "Some Common Issues with Functions" in the documentation essentially acknowledges this issue.

Indeed, that page addresses a lot of the issues. Whenever you run into a question where "lambda functions" wouldn't do what you want (because of the presence of matrices, for instance), there's a good chance that a "callable expression" wouldn't do what you want either, because likely you're trying to take a derivative or something like that, rather than just evaluate the function.

These kinds of problems are fundamental to using a computer algebra system, and different systems deal with them in different ways. This means that there will be some differences between how "easily" certain tasks are completed with different systems. Usually, once you get used to the gotchas of a specific system (there always are!) it becomes easier to deal with them.

Simon King

Dec 2, 2018, 5:36:26 PM12/2/18
to sage-s...@googlegroups.com
Hi Saad,

as an algebraist I may be biased, but I have never understood why so
many people try to use symbolics when they are not doing calculus.
Calculus is a pancake, not a panacea (sorry for the bad joke).

Joking aside: quite a few user questions can be answered in this way: "You try to achieve XY by using symbolic variables, but that's not what they are made for; use one of the following specialised tools for XY: ..."

On 2018-12-02, saad khalid <saad...@gmail.com> wrote:
> Why can't matrices be allowed in SR in a nontrivial way?

I guess the shortest answer is: A matrix knows what to answer when the user
does "M[x,y]". SR is the ring of symbolic variables, and symbolic variables
don't know what a user expects from a matrix. In fact, matrices are a
construction taken from linear algebra, not from calculus. So, it isn't a
surprise that SR can't handle it.

Let's get back to your original post:

>> I'm not sure what is happening, but I defined some matrices:
>> Mz = matrix([[0,1],[1,0]])
>> Mx = matrix([[1,0],[0,-1]])
>> M1 = matrix([[1,0],[0,1]])
>>
>> And then I tried defining a function where I multiply these matrices by
>> some variable and add them together:
>>
>> h(s) = M1 + s*Mx
>> h(.1)

Apparently what you want is not a function that maps a number s to a
matrix. What you want is a matrix with a parameter s. And in fact all
matrix entries are polynomial in s.

SR is a very general tool, but being "very general" implies being
"sub-optimal for most purposes". So, when you work with polynomials then
define a polynomial ring, and when you work with matrices then define
matrices over that polynomial ring:

sage: R.<s> = QQ[]
sage: Mx = matrix(R,[[1,0],[0,-1]])
sage: M1 = matrix(R, [[1,0],[0,1]])
sage: h = M1 + s*Mx

Calling that matrix specialises the parameter s in the way you expected
(in particular, the syntax h(s=.1) isn't needed):

sage: h(.1)
[ 1.10000000000000 0.000000000000000]
[0.000000000000000 0.900000000000000]

And you can still do calculus, to some extent:

sage: h.derivative(s)
[ 1  0]
[ 0 -1]

Summary: if you use tools that were designed for the mathematical notions you are working with (matrices over polynomial rings),...
sage: type(h)
<type 'sage.matrix.matrix_polynomial_dense.Matrix_polynomial_dense'>
... then things are more likely to work in the way you expect.
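
When you eventually want floating-point eigenvalues, one idiom that should work (a sketch) is to substitute first, then move to a double-precision ring before calling numerically sensitive routines:

sage: h(.1).change_ring(CDF).eigenvalues()  # substitute, then use CDF's numerical routines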

Best regards,
Simon


saad khalid

Dec 5, 2018, 1:15:52 PM12/5/18
to sage-support
Thank you all for the thoughtful responses, your comments helped me a lot. One last bit of trouble I've been having related to this topic. This is the code I was running:

Mz = matrix([[0,1],[1,0]])
Mx = matrix([[1,0],[0,-1]])
M1 = Matrix([[1,0],[0,1]])
import numpy
#R.<s> = QQ[]
_ = var('J,hx,alpha')
alphas = numpy.linspace(0,1,10)
Bp = Mz.tensor_product(M1).tensor_product(M1).tensor_product(M1) * M1.tensor_product(Mz).tensor_product(M1).tensor_product(M1) * M1.tensor_product(M1).tensor_product(Mz).tensor_product(M1) * M1.tensor_product(M1).tensor_product(M1).tensor_product(Mz)
Plaqz = Mx.tensor_product(M1).tensor_product(M1).tensor_product(M1)
F = lambda s: -(1-s)*Bp - s*Plaqz

show(F(.1).eigenvalues())

If I run it, I get results along with the warning:

UserWarning: Using generic algorithm for an inexact ring, which will probably give incorrect results due to numerical precision issues.

I expected this to amount to some minute difference in the far-out decimal places, like I so often see with complex numbers, but the answer I get is actually quite a bit off from the correct answer. Upon googling, I found that this can be fixed by passing CDF to every matrix constructor, like this:

Mz = matrix(CDF,[[0,1],[1,0]])
Mx = matrix(CDF,[[1,0],[0,-1]])
M1 = Matrix(CDF,[[1,0],[0,1]])
import numpy
#R.<s> = QQ[]
_ = var('J,hx,alpha')
alphas = numpy.linspace(0,1,10)
Bp = Mz.tensor_product(M1).tensor_product(M1).tensor_product(M1) * M1.tensor_product(Mz).tensor_product(M1).tensor_product(M1) * M1.tensor_product(M1).tensor_product(Mz).tensor_product(M1) * M1.tensor_product(M1).tensor_product(M1).tensor_product(Mz)
Plaqz = Mx.tensor_product(M1).tensor_product(M1).tensor_product(M1)
F = lambda s: -(1-s)*Bp - s*Plaqz

show(F(.1).eigenvalues())

My question is: why is the default an algorithm that seems to give incorrect answers, instead of just defaulting to CDF?
Also, is there a better way of doing this besides passing CDF every time?

Thank you!

Simon King

Dec 15, 2018, 6:53:40 AM12/15/18
to sage-s...@googlegroups.com
Hi Saad,

sorry that it took so long to answer; I thought others would reply sooner than I did.

On 2018-12-05, saad khalid <saad...@gmail.com> wrote:
> If I run it, I get results along with the warning:
>
> UserWarning: Using generic algorithm for an inexact ring, which will probably give incorrect results due to numerical precision issues.

It's not an error but a warning. The absence of a warning doesn't mean
that the result is more trustworthy than a result of a computation that
doesn't create a warning. See below: When doing numerical computations,
it is advisable to assess the results by cross verifications and
consistency checks.

> I expected this to amount to some minute difference in the far-out decimal places, like I so often see with complex numbers, but the answer I get is actually quite a bit off from the correct answer.

No surprise. That's in the nature of numerical computations.

> My question is: why is the default an algorithm that seems to give incorrect answers, instead of just defaulting to CDF?
> Also, is there a better way of doing this besides passing CDF every time?

The default complex field CC has a precision of 53 bits. In principle
you can create a complex field with arbitrary finite precision, by using
ComplexField(bitwise_precision)

CDF is just a wrapper around machine double-precision complex numbers, which have a fixed precision of 53 bits as well. However, computations in CDF are based on the GNU Scientific Library (GSL). Apparently CC uses a generic algorithm to compute eigenvalues --- which is both slower and more prone to rounding effects than the algorithm in GSL.

So, as usual when doing numerical computations, one needs to make the best of the available resources. For example, if in some part of your computation you do "ordinary" arithmetic and need very high precision, you could create a complex field of the required precision and build your matrix M on top of it; and when, in another part of your computation, you need a stable numerical algorithm that isn't offered by CC, you can always change to CDF.

For example:
sage: C = ComplexField(500)
sage: M = random_matrix(C, 10)
sage: M2 = M.change_ring(CDF)
sage: abs(M.determinant()^20 - (M^20).determinant())
3.22553502188764194905953565908366462673376034391868764719927365055487394028435964347935057202262954253087699368051026966711621636866859093419028355926e-59
sage: abs(M2.determinant()^20 - (M2^20).determinant())
2.1231804490158553e+41

Of course, the absolute value of the difference should theoretically be zero.
So, as you can see here, in my example it is many orders of magnitude better
to use a high precision field instead of CDF.

Also, the eigenvalues and characteristic polynomial are more consistent than
those computed in CDF: Let's compare the characteristic polynomial with the
product of (x-eigenvalue):
sage: R.<x> = C[]
sage: max(abs(c) for c in (M.charpoly()-prod((x-ev) for ev in M.eigenvalues())).list())
/home/king/Sage/git/sage/src/bin/sage-ipython:1: UserWarning: Using generic algorithm for an inexact ring, which will probably give incorrect results due to numerical precision issues.
#!/usr/bin/env sage-python23
4.69238225433539279163184034797611678687652260951909375574059096465606393834197062303005271312591139060256023453735495755056822286510738017751316446341e-148
sage: max(abs(c) for c in (M2.charpoly()-prod((x-ev) for ev in M2.eigenvalues())).list())
3.096434044909977e-12

But if the precision of the field C is only marginally larger than the
precision of CDF, the picture changes:
sage: C = ComplexField(60)
sage: R.<x> = C[]
sage: M = random_matrix(C, 10)
sage: M2 = M.change_ring(CDF)
sage: abs(M.determinant()^20 - (M^20).determinant())
7.9015802853808835e72
sage: abs(M2.determinant()^20 - (M2^20).determinant())
4.509812763227185e+43
sage: max(abs(c) for c in (M.charpoly()-prod((x-ev) for ev in M.eigenvalues())).list())
/home/king/Sage/git/sage/src/bin/sage-ipython:1: UserWarning: Using generic algorithm for an inexact ring, which will probably give incorrect results due to numerical precision issues.
#!/usr/bin/env sage-python23
9.9301366129890920e-16
sage: max(abs(c) for c in (M2.charpoly()-prod((x-ev) for ev in M2.eigenvalues())).list())
3.5482789005657048e-12

This time, determinant and matrix multiplication are more consistent in CDF than in C (although the precision used in C is still slightly better than in CDF!). But the consistency of the eigenvalue and charpoly computation is still better in C than in CDF.

However, the fact that eigenvalues and charpoly are more *consistent* in CC than in CDF does not mean that the eigenvalues are closer to reality. So, let's start with a matrix with known eigenvalues, then compare computations in precision 2000, in CC, and in CDF:
sage: EV = [CC.random_element() for _ in range(10)]
sage: C2000 = ComplexField(2000)
sage: EV = [C2000.random_element() for _ in range(10)]
sage: S = random_matrix(C2000,10)
sage: A = S*diagonal_matrix(EV)*~S

Eigenvalues in precision 2000:
sage: max(abs(e1-e2) for (e1,e2) in zip(sorted(EV),sorted(A.eigenvalues())))
<WARNING>
1.4520029484851043459...e-598
Eigenvalues in CC:
sage: max(abs(e1-e2) for (e1,e2) in zip(sorted(EV),sorted(A.change_ring(CC).eigenvalues())))
<WARNING>
1.23127592121274e-12
Eigenvalues in CDF:
sage: max(abs(e1-e2) for (e1,e2) in zip(sorted(EV),sorted(A.change_ring(CDF).eigenvalues())))
<NO WARNING>
2.504777312823609e-15

I guess it's general advice in numerical computations to do cross verifications in order to assess the validity of the result.
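
For instance, the charpoly-vs-eigenvalues comparison above is easy to wrap into a small helper (a sketch, using only the calls already shown):

def charpoly_gap(A):
    # max |coefficient| of charpoly(A) - prod(x - ev over the eigenvalues);
    # a value near zero suggests the eigenvalue computation is self-consistent
    R.<x> = A.base_ring()[]
    return max(abs(c) for c in (A.charpoly() - prod(x - ev for ev in A.eigenvalues())).list())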

Best regards,
Simon

Simon King

Dec 15, 2018, 6:57:25 AM12/15/18
to sage-s...@googlegroups.com
On 2018-12-15, Simon King <simon...@uni-jena.de> wrote:
> It's not an error but a warning. The absence of a warning doesn't mean
> that the result is more trustworthy than a result of a computation that
> doesn't create a warning.

Oops, one negation too many. I meant to say: "The absence of a warning doesn't mean that the result is more trustworthy than the result of a computation that DOES create a warning." Sorry.

Nils Bruin

Dec 15, 2018, 4:23:24 PM12/15/18
to sage-support
On Wednesday, December 5, 2018 at 10:15:52 AM UTC-8, saad khalid wrote:
Thank you all for the thoughtful responses, your comments helped me a lot. One last bit of trouble I've been having related to this topic. This is the code I was running:

Mz = matrix([[0,1],[1,0]])
Mx = matrix([[1,0],[0,-1]])
M1 = Matrix([[1,0],[0,1]])
import numpy
#R.<s> = QQ[]
_ = var('J,hx,alpha')
alphas = numpy.linspace(0,1,10)
Bp = Mz.tensor_product(M1).tensor_product(M1).tensor_product(M1) * M1.tensor_product(Mz).tensor_product(M1).tensor_product(M1) * M1.tensor_product(M1).tensor_product(Mz).tensor_product(M1) * M1.tensor_product(M1).tensor_product(M1).tensor_product(Mz)
Plaqz = Mx.tensor_product(M1).tensor_product(M1).tensor_product(M1)
F = lambda s: -(1-s)*Bp - s*Plaqz

show(F(.1).eigenvalues())

If I run it, I get results along with the warning:

UserWarning: Using generic algorithm for an inexact ring, which will probably give incorrect results due to numerical precision issues.

I expected this to amount to some minute difference in the far-out decimal places, like I so often see with complex numbers, but the answer I get is actually quite a bit off from the correct answer. Upon googling, I found that this can be fixed by passing CDF to every matrix constructor, like this:
 
My question is: why is the default an algorithm that seems to give incorrect answers, instead of just defaulting to CDF?

Also, is there a better way of doing this besides passing CDF every time?

This is actually a real omission in the code. Using ComplexField(...) instead of CDF is slower but provides more options, for instance working with more than 53 bits. While computing eigenvalues of matrices numerically is fundamentally a tricky operation that generally needs care and error control, there are reasonable strategies to do a lot better than what happens with ComplexField by default (as you can see with CDF). It would be really good if matrices over ComplexField and RealField used algorithms that at least apply some common sense. The warning about imprecise answers would still apply, but to a much lesser degree. (Technical note: echelonization for CC and RR presently doesn't even use partial pivoting. If we at least did that, we would get something a little closer to the "best effort" result people can reasonably expect.)
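
For the curious: "partial pivoting" just means swapping in, at each elimination step, the row whose pivot entry is largest in absolute value, so that the elimination multipliers stay at most 1 in magnitude and round-off is not amplified. A toy sketch over RDF (illustrative only -- not Sage's actual internals, and assuming a nonsingular square matrix):

def toy_elimination_with_partial_pivoting(A):
    # Gaussian elimination with partial pivoting over machine doubles
    A = A.change_ring(RDF)
    n = A.nrows()
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i, k]))  # largest pivot in column k
        A.swap_rows(k, p)
        for i in range(k + 1, n):
            A.add_multiple_of_row(i, k, -A[i, k]/A[k, k])
    return A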

Of course, by the time you're using numpy anyway, you should probably prefer the state-of-the-art numerical tools offered there. Only if you need more than 53 bits should you consider the Sage types.
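
For example, something along these lines should do (a sketch, reusing the Bp, Plaqz and F from your code; .numpy() converts a Sage matrix to a numpy array):

import numpy as np
evals = np.linalg.eigvals(F(0.1).numpy(dtype=float))  # LAPACK-backed eigenvalues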
 