Matrix inconsistency with decimal/fraction representations


Michael Orlitzky

Nov 1, 2009, 12:34:28 PM
to sage-s...@googlegroups.com
This one had me stumped for a while. I'm using 4.1.1 here, but found the
same results in a 4.1.2 notebook. The solve_foo() methods are broken
too, probably as a consequence.


# Good

sage: m = matrix([ [(-3/10), (1/5), (1/10)],
[(1/5), (-2/5), (2/5)],
[(1/10), (1/5), (-1/2)] ])

sage: m.echelon_form()

[ 1 0 -3/2]
[ 0 1 -7/4]
[ 0 0 0]


# Bad

sage: n = matrix([ [-0.3, 0.2, 0.1],
[0.2, -0.4, 0.4],
[0.1, 0.2, -0.5] ])

sage: n.echelon_form()

[ 1.00000000000000 0.000000000000000 0.000000000000000]
[0.000000000000000 1.00000000000000 0.000000000000000]
[0.000000000000000 0.000000000000000 1.00000000000000]


# Ugly

sage: m == n
True

John Cremona

Nov 2, 2009, 5:18:20 AM
to sage-support
sage: n.det()
1.04083408558608e-17
sage: n.parent()
Full MatrixSpace of 3 by 3 dense matrices over Real Field with 53 bits
of precision

So to the given precision, n is invertible, and its echelon form is the
identity. But if you convert m to a very high-precision RealField:
MatrixSpace(RealField(10000),3)(m).echelon_form()
still gives the identity matrix, so this looks bad. (The determinant
is 0 to 10^4 decimal places in that example!)
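For what it's worth, the exact singularity is easy to confirm even outside Sage; a minimal sketch in plain Python using the standard fractions module (the det3 helper is ad hoc, cofactor expansion along the first row):

```python
from fractions import Fraction as F

# The matrix as the user meant it: exact rationals, no rounding.
m = [[F(-3, 10), F(1, 5), F(1, 10)],
     [F(1, 5), F(-2, 5), F(2, 5)],
     [F(1, 10), F(1, 5), F(-1, 2)]]

def det3(a):
    # Cofactor expansion along the first row; exact over the rationals.
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
          - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
          + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

print(det3(m))  # 0 -- exactly singular, no "to the given precision" caveat
```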

>
> # Ugly
>
> sage: m == n
> True

I think this is a coercion issue. I agree that the result is not
mathematically nice at all:
sage: m==n
True
sage: m.rank(), n.rank()
(2, 3)

John Cremona

William Stein

Nov 2, 2009, 11:33:10 AM
to sage-s...@googlegroups.com

This is really no different than:

sage: n = 5; m = Mod(5, 19)
sage: n == m
True
sage: n.additive_order()
+Infinity
sage: m.additive_order()
19

I don't consider this a bug. The original poster's issues might stem
from a misunderstanding of how floating point numbers are used and
abused in Sage (or MATLAB or any other math software, for that
matter).

William

Michael Orlitzky

Nov 3, 2009, 7:45:22 PM
to sage-s...@googlegroups.com
William Stein wrote:
>
> This is really no different than:
>
> sage: n = 5; m = Mod(5, 19)
> sage: n == m
> True
> sage: n.additive_order()
> +Infinity
> sage: m.additive_order()
> 19
>
> I don't consider this a bug. The original poster's issues might stem
> from misunderstanding about how floating point numbers are used and
> abused in Sage (or MATLAB or any other math software, for that
> matter).
>
> William

You're going to have a rather hard time convincing me that, upon
encountering a decimal point, any other math software is going to flip
out and silently return incorrect results for basic matrix operations.

Recall:

sage: m = matrix([ [-0.3, 0.2, 0.1],
                   [0.2, -0.4, 0.4],
                   [0.1, 0.2, -0.5] ])

sage: m.echelon_form()

[ 1.00000000000000 0.000000000000000 0.000000000000000]
[0.000000000000000 1.00000000000000 0.000000000000000]
[0.000000000000000 0.000000000000000 1.00000000000000]

And for any other math software:

1. http://michael.orlitzky.com/screenshots/derive-v6.10.png
2. http://michael.orlitzky.com/screenshots/maple-v13.png
3. http://michael.orlitzky.com/screenshots/mathcad-v14.png
4. http://michael.orlitzky.com/screenshots/mathematica-v7.png
5. http://michael.orlitzky.com/screenshots/matlab-v2009b.png
6. http://michael.orlitzky.com/screenshots/minitab-v15.1.1.png

Jason Grout

Nov 3, 2009, 8:52:11 PM
to sage-s...@googlegroups.com
Michael Orlitzky wrote:
>
> You're going to have a rather hard time convincing me that, upon
> encountering a decimal point, any other math software is going to flip
> out and silently return incorrect results for basic matrix operations.
>
> Recall:
>
> sage: m = matrix([ [-0.3, 0.2, 0.1],
> [0.2, -0.4, 0.4],
> [0.1, 0.2, -0.5] ])
>
> sage: m.echelon_form()
>
> [ 1.00000000000000 0.000000000000000 0.000000000000000]
> [0.000000000000000 1.00000000000000 0.000000000000000]
> [0.000000000000000 0.000000000000000 1.00000000000000]
>


The problem here is that no one has really implemented a numerically
stable version of echelon_form for RR. I thought we called scipy for
rank calculations over RDF, but apparently we don't even do that! Scipy
answers correctly:

sage: import scipy
sage: scipy.rank(m.numpy())
2


Very little attention has been paid to numerically stable linear algebra
for non-exact rings. We do get some numerically stable things for
matrices over RDF and CDF because those are based on numpy and scipy
(which call blas and lapack routines). However, apparently we don't
call numpy/scipy to calculate the rank, but instead rely on our unstable
echelon form computation!

Scipy and numpy don't have facilities for calculating the echelon form
because it does not make sense to calculate the echelon form in
numerical settings (paraphrasing their words to the best of my memory).
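In modern numpy the SVD-with-tolerance approach is exposed directly as numpy.linalg.matrix_rank; a sketch of the idea (plain numpy, not the Sage session above):

```python
import numpy as np

n = np.array([[-0.3, 0.2, 0.1],
              [0.2, -0.4, 0.4],
              [0.1, 0.2, -0.5]])

# matrix_rank counts singular values above a tolerance scaled by the
# matrix size and machine epsilon, instead of running Gaussian elimination.
print(np.linalg.matrix_rank(n))  # 2

# The tolerance is what separates "really zero" from round-off noise:
s = np.linalg.svd(n, compute_uv=False)
print(s[-1])  # ~3e-17, far below the default cutoff
```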

That said, there is plenty of work for someone who would like to
implement numerically stable, arbitrary precision linear algebra in
Sage. I've talked with at least one person who has worked on such a C
library (arbitrary precision, numerically stable linear algebra) about
contributing to Sage in the last year, but politics have held them up
from contributing any code. We don't even need huge experts in the
area; even someone implementing basic textbook algorithms would be a
huge improvement, I think.

If someone wanted to make a good library of numerically stable
linear algebra routines based on mpfr, I think it would be absolutely
awesome.

Thanks,

Jason

--
Jason Grout

Michael Orlitzky

Nov 3, 2009, 10:22:12 PM
to sage-s...@googlegroups.com
Jason Grout wrote:

> The problem here is that no one has really implemented a numerically
> stable version of echelon_form for RR. I thought we called scipy for
> rank calculations over RDF, but apparently we don't even do that! Scipy
> answers correctly:
>
> sage: import scipy
> sage: scipy.rank(m.numpy())
> 2
>
>
> Very little attention has been paid to numerically stable linear algebra
> for non-exact rings. We do get some numerically stable things for
> matrices over RDF and CDF because those are based on numpy and scipy
> (which call blas and lapack routines). However, apparently we don't
> call numpy/scipy to calculate the rank, but instead rely on our unstable
> echelon form computation!

This was hinted at by the (m == n) example in my original message:
if I can type a number in, it isn't irrational. Since any decimal
number entered manually is guaranteed to be rational, it can be
represented as a fraction, and fractions are handled correctly.

This is probably already implemented to some extent, or else (m == n)
would not have returned True.
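The same distinction shows up in plain Python's fractions module, which makes a nice illustration of the point (not Sage code):

```python
from fractions import Fraction

# Parsing the literal as text preserves the value the user typed:
print(Fraction("0.3"))  # 3/10

# Converting the already-stored binary float does not:
print(Fraction(0.3))    # 5404319552844595/18014398509481984
```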


> If someone wanted to make a good library of numerically stable
> linear algebra routines based on mpfr, I think it would be absolutely
> awesome.

Irrationals have to be entered symbolically, and SAGE can already
manipulate symbols well enough. I would like to say, "just compute the
rref symbolically," but that idea breaks down somewhat given two
irrationals that are linearly dependent.

I should review my algebra before saying something dumb, but I think
that arbitrary precision would be useful in that case.

Jason Grout

Nov 3, 2009, 11:13:23 PM
to sage-s...@googlegroups.com

I don't think it's an issue of irrational versus rational. It's an
issue of numerical precision and inexact floating-point numbers. This
matrix is terribly ill-conditioned. It is right on the borderline
between being invertible and not, numerically speaking:

sage: m.change_ring(RDF).SVD()[1]

[ 0.772642968023 0.0 0.0]
[ 0.0 0.450580563234 0.0]
[ 0.0 0.0 3.13289758759e-17]

As you probably know, the ratio between the largest and smallest
singular values being so high is an indication that this matrix is
really a messy one numerically, and deserves the strictest of attention.
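The ratio in question is the condition number, which numpy can report directly (a quick check outside the Sage session; 1/eps for doubles is about 4.5e15):

```python
import numpy as np

n = np.array([[-0.3, 0.2, 0.1],
              [0.2, -0.4, 0.4],
              [0.1, 0.2, -0.5]])

# cond() defaults to the 2-norm condition number:
# largest singular value divided by the smallest.
print(np.linalg.cond(n))  # ~2.5e16, i.e. numerically singular
```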

If we are allowed to do the computations with exact arithmetic (e.g.,
using fractions instead of decimals), then we can compute its rank and
echelon form exactly. You see the same problems in other math software
too, with different matrices. If pressed, I could come up with a
very nice-looking matrix example from a linear algebra textbook problem
that matlab totally gave the wrong answer to because it was a very
ill-conditioned matrix.

Robert Bradshaw

Nov 4, 2009, 12:39:01 AM
to sage-s...@googlegroups.com

I'd also like to point out that we don't just want to fall back and do
everything over the rationals (even though any finite decimal
expansion is rational) as things get much slower due to coefficient
explosion. For example

sage: m = random_matrix(ZZ, 10); m

[ -1 -1 -1 -3 -23 -1 -1 0 1 0]
[ -1 -1 -1 -1 3 -2 3 2 -1 0]
[ 0 13 -1 0 -25 -4 -2 -1 1 -1]
[ 1 7 -1 -1 0 -1 3 0 1 2]
[ 2 4 -2 1 3 0 -3 1 -2 0]
[ -7 -1 6 -3 0 1 -1 -60 0 -1]
[ 0 2 0 -3 1 -1 -1 46 -1 -46]
[ -19 -380 1 -6 2 1 3 1 -2 -1]
[ 0 -1 0 -2 -2 0 -2 -5 0 0]
[ 1 8 -2 1 2 0 2 0 0 8]

Looks tame enough, right?

sage: m^-1

[640562611/3385102976 -63055861/846275744 -549008479/3385102976 1157497801/1692551488 780319975/1692551488 -140389705/3385102976 -208906825/1692551488 -2793317/423137872 -1842638119/3385102976 -1535072739/1692551488]
[-5600347/1692551488 3587565/423137872 4483047/1692551488 -25271681/846275744 -16141679/846275744 2475841/1692551488 3754913/846275744 -559627/211568936 36871727/1692551488 28063787/846275744]
[523145421/3385102976 160100405/846275744 -550981409/3385102976 306090295/1692551488 164983577/1692551488 -267861687/3385102976 -279353655/1692551488 -9830251/423137872 -1509145049/3385102976 -1738898909/1692551488]
[-360747883/3385102976 -118724611/846275744 405509879/3385102976 -128430449/1692551488 29384129/1692551488 116143857/3385102976 44713841/1692551488 3489709/423137872 -824715777/3385102976 323560411/1692551488]
[-140903663/1692551488 324025/423137872 46587723/1692551488 -36980957/846275744 -50688435/846275744 -1387523/1692551488 13708285/846275744 360201/211568936 240384563/1692551488 91072991/846275744]
[447725143/1692551488 -73220577/423137872 -377118163/1692551488 -11011595/846275744 81470715/846275744 62756171/1692551488 15510539/846275744 -964865/211568936 -478364923/1692551488 71808441/846275744]
[649005011/3385102976 51552491/846275744 -463894527/3385102976 487440489/1692551488 278630215/1692551488 59739863/3385102976 -67119209/1692551488 -1656309/423137872 -1667806919/3385102976 -533883395/1692551488]
[-169891/1692551488 12587301/423137872 -7854769/1692551488 -51955289/846275744 -38099159/846275744 -35116903/1692551488 -1753223/846275744 -398835/211568936 56466071/1692551488 22643/846275744]
[-185218317/423137872 -41492209/105784468 131301281/423137872 -25994167/211568936 -111115473/211568936 8319751/423137872 30984231/211568936 1213689/52892234 335120761/423137872 193991029/211568936]
[15210787/3385102976 42522395/846275744 -36094351/3385102976 -104937127/1692551488 -71997097/1692551488 -83127577/3385102976 -46898457/1692551488 -1191381/423137872 179148969/3385102976 -39909971/1692551488]

This is the typical behavior.

Note also that 0.3 and 3/10 have different behaviors in Sage,
sometimes you want one, sometimes you want the other. For example

sage: 0.3^1000 # fast, approximate to a fixed number of digits
1.32207081948076e-523
sage: (3/10)^1000 # exact and memory intensive
132207081948080663689045525975214436596542203275214816766492036822682859
734670489954077831385060806196390977769687258235595095458210061891186534
272525795367402762022519832080387801477422896484127439040011758861804112
894781562309443806156617305408667449050617812548034440554705439703889581
746536825491613622083026856377858229022841639830788789691855640408489893
760937324217184635993869551676501894058810906042608967143886410281435038
5648747165832010614366132173102768902855220001/1000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Note that 0.3 can't even be represented exactly as a (binary) floating
point number.
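Plain Python, which uses the same IEEE-754 doubles, makes this easy to verify:

```python
# The double nearest to 0.3, written as an exact ratio of integers:
print((0.3).as_integer_ratio())
# (5404319552844595, 18014398509481984)

# A familiar consequence of the same representation error:
print(0.1 + 0.2 == 0.3)  # False
```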

- Robert

Michael Orlitzky

Nov 4, 2009, 2:08:49 PM
to sage-s...@googlegroups.com
Jason Grout wrote:
>
> I don't think it's an issue of irrational versus rational. It's an
> issue of numerical precision and inexact floating-point numbers. This
> matrix is terribly ill-conditioned. It is right on the borderline
> between being invertible and not, numerically speaking:

No, it isn't. *My* matrix contains nine integer multiples of 0.1, which
any high school student can show represents a linearly-dependent system
in a few seconds.


> sage: m.change_ring(RDF).SVD()[1]
>
> [ 0.772642968023 0.0 0.0]
> [ 0.0 0.450580563234 0.0]
> [ 0.0 0.0 3.13289758759e-17]
>
> As you probably know, the ratio between the smallest and largest
> eigenvalues being so high gives us an indication that this matrix is
> really a messy one numerically, and deserves the strictest of attention.

I don't know that. Why would I know that? Why would I compute the
smallest and largest eigenvalues when I just want the RREF of the thing?
The original matrix is *not* messy. It's only messy once SAGE handles it
in a messy way.

What's worse is that there is no indication that the results of
echelon_form() are incorrect. If I didn't know that matrix was singular,
I would have happily continued my work with results that were
essentially made up. Sure, I could check each matrix by hand, but
/that's why I'm using SAGE in the first place/.


> If we are allowed to do the computations with exact arithmetic (e.g.,
> using fractions instead of decimals), then we can compute its rank and
> echelon form exactly. You see the same problems in other math software
> too, with different matrices. If pressed, I could come up with a
> very nice-looking matrix example from a linear algebra textbook problem
> that matlab totally gave the wrong answer to because it was a very
> ill-conditioned matrix.

I concede that MATLAB et al. can under certain circumstances produce
incorrect results, but that doesn't mean that SAGE should give up on
correctness completely. No other software fails on this example.

Michael Orlitzky

Nov 4, 2009, 2:25:18 PM
to sage-s...@googlegroups.com
Robert Bradshaw wrote:
>
> I'd also like to point out that we don't just want to fall back and do
> everything over the rationals (even though any finite decimal
> expansion is rational) as things get much slower due to coefficient
> explosion. For example

Who cares about speed when the answer is wrong?

If I don't have to produce the correct answer, I can solve any problem
instantly. The current implementation over the reals might as well be,

def echelon_form(self):
    return matrix([ [1,0,0], [0,1,0], [0,0,1] ])

and it doesn't get much faster than that.


> Note also that 0.3 and 3/10 have different behaviors in Sage,
> sometimes you want one, sometimes you want the other.

Retarded.


> Note that 0.3 can't even be represented exactly as a (binary) floating
> point number.

And that's why.

Correctness should be the default. If using n(3/10) is going to be a
couple of milliseconds faster, let the guy who cares about that type a
few 'n's. I shouldn't have to solve my problem by hand before feeding it
to SAGE just to get the correct answer.

Robert Bradshaw

Nov 4, 2009, 2:33:01 PM
to sage-s...@googlegroups.com
On Nov 4, 2009, at 11:08 AM, Michael Orlitzky wrote:

>
> Jason Grout wrote:
>>
>> I don't think it's an issue of irrational versus rational. It's an
>> issue of numerical precision and inexact floating-point numbers.
>> This matrix is terribly ill-conditioned. It is right on the
>> borderline between being invertible and not, numerically speaking:
>
> No, it isn't. *My* matrix contains nine integer multiples of 0.1,
> which
> any high school student can show represents a linearly-dependent
> system
> in a few seconds.

Technically, your matrix does not contain integer multiples of 0.1; it
contains approximations to integer multiples of 0.1, represented in
base 2 and truncated to 53 bits of precision.

sage: (0.1).exact_rational()
3602879701896397/36028797018963968

As you pointed out, you could do this over the rationals (where they
are actual integer multiples of 0.1) and it works fine. Doing it with
floating point numbers introduces rounding error, whence the
discrepancy. I would highly recommend

http://docs.sun.com/source/806-3568/ncg_goldberg.html

at least the first couple of sections.

>
>> sage: m.change_ring(RDF).SVD()[1]
>>
>> [ 0.772642968023 0.0 0.0]
>> [ 0.0 0.450580563234 0.0]
>> [ 0.0 0.0 3.13289758759e-17]
>>
>> As you probably know, the ratio between the smallest and largest
>> eigenvalues being so high gives us an indication that this matrix is
>> really a messy one numerically, and deserves the strictest of
>> attention.
>
> I don't know that. Why would I know that? Why would I compute the
> smallest and largest eigenvalues when I just want the RREF of the
> thing?
> The original matrix is *not* messy. It's only messy once SAGE
> handles it
> in a messy way.

It's messy the instant you type it in with decimal points, as it
starts using floating point numbers internally.

> What's worse is that there is no indication that the results of
> echelon_form() are incorrect. If I didn't know that matrix was
> singular,
> I would have happily continued my work with results that were
> essentially made up. Sure, I could check each matrix by hand, but
> /that's why I'm using SAGE in the first place/.

In your case I would recommend staying away from linear algebra over
inexact fields (like the reals) and do everything exactly until you
have a good sense of how rounding errors can be controlled, especially
if you're dealing with possibly singular matrices (which are ill
conditioned as indicated above). If you want exact answers, work over
an exact ring, and the (well known) issues of rounding errors won't
bite you like this.

>> If we are allowed to do the computations with exact arithmetic (e.g.,
>> using fractions instead of decimals), then we can compute its rank
>> and
>> echelon form exactly. You see the same problems in other math
>> software
>> too, with different matrices. If pressed, I could come up with a
>> very nice-looking matrix example from a linear algebra textbook
>> problem that matlab totally gave the wrong answer to because it was
>> a very ill-conditioned matrix.
>
> I concede that MATLAB et al. can under certain circumstances produce
> incorrect results, but that doesn't mean that SAGE should give up on
> correctness completely. No other software fails on this example.

Eventually, using floating point, they're going to give incorrect
results. It's unclear whether it's better to fail consistently
right away, or to work for some simple cases but give bad results later on.

- Robert


Robert Bradshaw

Nov 4, 2009, 2:53:13 PM
to sage-s...@googlegroups.com
On Nov 4, 2009, at 11:25 AM, Michael Orlitzky wrote:

> Robert Bradshaw wrote:
>>
>> I'd also like to point out that we don't just want to fall back and
>> do
>> everything over the rationals (even though any finite decimal
>> expansion is rational) as things get much slower due to coefficient
>> explosion. For example
>
> Who cares about speed when the answer is wrong?

Lots of people, if you can precisely put a finger on how wrong an
answer is. In fact, I'm working on a problem right now where I only
care about the first couple of digits of my answer being right. The
rest may be "wrong" but I don't care because I know by how much.

The reason SVD was even brought up in the last email is it tells you
"how wrong" the answer might be if working over the reals, and in this
case it told you the answer could be "very wrong."

> If I don't have to produce the correct answer, I can solve any problem
> instantly. The current implementation over the reals might as well be,
>
> def echelon_form(self):
>     return matrix([ [1,0,0], [0,1,0], [0,0,1] ])
>
> and it doesn't get much faster than that.

Some would argue that's correct for floating point matrices, as every
square matrix over the reals is infinitely close to an invertible
matrix.
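That observation is easy to illustrate with numpy (a sketch; the matrix and the size of the perturbation are arbitrary):

```python
import numpy as np

a = np.array([[1., 2.],
              [2., 4.]])           # exactly singular: row 2 = 2 * row 1
print(np.linalg.matrix_rank(a))    # 1

a[0, 0] += 1e-12                   # nudge one entry
print(np.linalg.matrix_rank(a))    # 2: now (barely) invertible
```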

>> Note also that 0.3 and 3/10 have different behaviors in Sage,
>> sometimes you want one, sometimes you want the other.
>
> Retarded.

Just because you don't understand floating point numbers doesn't mean
we shouldn't support them. In fact, far more people (by a factor of 10
at least, I'd guess even a factor of 100) care about floating point
arithmetic than working exactly over the rationals.

>> Note that 0.3 can't even be represented exactly as a (binary)
>> floating
>> point number.
>
> And that's why.
>
> Correctness should be the default. If using n(3/10) is going to be a
> couple of milliseconds faster, let the guy who cares about that type a
> few 'n's.

Or he can write 0.3. Note the decimal point. You can write 3/10. This
is a common convention:

sage: mathematica('0.3^1000')
1.3220708194807853*^-523

sage: maxima('0.3^1000')
0.0

sage: maxima('0.3^100')
5.153775207320102e-53

sage: maple('0.3^1000;')
.1322070819e-522

sage: matlab('0.3^1000')
0

sage: matlab('0.3^100')
5.1538e-53


> I shouldn't have to solve my problem by hand before feeding it
> to SAGE just to get the correct answer.

No, you shouldn't. You should either learn about or avoid floating
point numbers.

- Robert

Michael Orlitzky

Nov 4, 2009, 2:55:01 PM
to sage-s...@googlegroups.com
Robert Bradshaw wrote:
>
>
> Technically, your matrix does not contain integer multiples of 0.1, it
> contains approximations to integer multiples of 0.1, represented base
> 2, truncated to 53 bits of precision.
>
> sage: (0.1).exact_rational()
> 3602879701896397/36028797018963968
>
> As you pointed out, you could do this over the rationals (where they
> are actual integer multiples of 0.1) and it works fine. Doing it with
> floating point numbers introduces rounding error, whence the
> discrepancy. I would highly recommend
>
> http://docs.sun.com/source/806-3568/ncg_goldberg.html
>
> at least the first couple of sections.
>

So when I'm done reading that 76-page document, I'll be able to do my
linear algebra homework? I happen to have read it before, but it's an
absurd thing to expect of end users.

Furthermore, my knowledge of floating point arithmetic in no way led me
to believe that SAGE would use it in lieu of the alternatives.


> It's messy the instant you type it in with decimal points

Not the end user's fault.


> it starts using floating point numbers internally.

I didn't tell it to do that.


> In your case I would recommend staying away from linear algebra over
> inexact fields (like the reals) and do everything exactly until you
> have a good sense of how rounding errors can be controlled, especially
> if you're dealing with possibly singular matrices (which are ill
> conditioned as indicated above). If you want exact answers, work over
> an exact ring, and the (well known) issues of rounding errors won't
> bite you like this.
>

Ok, but (assuming it can be done) how do you propose I convert my
problem to an exact field? By hand? If only there were some sort of
program...


> Eventually, using floating point, they're going to give incorrect
> results. It's unclear whether its better to have consistent failure
> right away, or work for some simple cases but give bad results later on.

This is one argument I can understand. If it's not possible to perform a
calculation exactly, and the user hasn't specified that he or she would
accept an approximation, SAGE should emit a warning, preferably
accompanied by blinking Christmas lights and/or sirens.
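A guard along those lines could be as simple as checking the condition number before elimination. A hypothetical sketch in Python/numpy (guarded_rank and the 1e12 threshold are invented for illustration; this is not an existing Sage or numpy facility):

```python
import warnings

import numpy as np

def guarded_rank(a, cond_limit=1e12):
    """Estimate the rank, warning loudly when the matrix is so
    ill-conditioned that floating point elimination is untrustworthy."""
    a = np.asarray(a, dtype=float)
    c = np.linalg.cond(a)
    if c > cond_limit:
        warnings.warn("condition number %.2e exceeds %.0e; "
                      "floating point results may be meaningless"
                      % (c, cond_limit))
    return np.linalg.matrix_rank(a)

n = [[-0.3, 0.2, 0.1], [0.2, -0.4, 0.4], [0.1, 0.2, -0.5]]
print(guarded_rank(n))  # warns, then prints 2
```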

Jason Grout

Nov 4, 2009, 4:40:18 PM
to sage-s...@googlegroups.com
Michael Orlitzky wrote:


>> It's messy the instant you type it in with decimal points
>
> Not the end user's fault.
>
>
>> it starts using floating point numbers internally.
>
> I didn't tell it to do that.
>

I think there are several points here:

1. The moment Sage sees a decimal, it starts using approximate, floating
point arithmetic. If you don't want that, then don't use decimals; use
fractions. This is consistent with most mathematical software (though
other software may have better algorithms that minimize the effect of
the round-off errors).

2. Sage currently does not deal very well with round-off error in linear
algebra. This needs to be fixed. Furthermore, Sage doesn't tell you
(like matlab *sometimes* does) when round-off error might be a big
issue. For your matrix, Sage probably should have warned you that
round-off error might lead to an incorrect response.

I've opened up http://trac.sagemath.org/sage_trac/ticket/7392 and
http://trac.sagemath.org/sage_trac/ticket/7391 to address specific
issues that came up in this thread. Implementing numerically stable
arbitrary precision linear algebra is a much bigger project, so I'm
holding off on opening up a ticket until someone starts making it happen.


> Ok, but (assuming it can be done) how do you propose I convert my
> problem to an exact field? By hand? If only there were some sort of
> program...
>


Don't use decimals; use fractions. Then Sage does everything exactly,
with no round-off error.


Thanks,

Jason

Michael Orlitzky

Nov 4, 2009, 5:27:08 PM
to sage-s...@googlegroups.com
Jason Grout wrote:
>
> I think there are several points here:
>
> 1. The moment Sage sees a decimal, it starts using approximate, floating
> point arithmetic. If you don't want that, then don't use decimals; use
> fractions. This is consistent with most mathematical software (though
> other software may have better algorithms that minimize the effect of
> the round-off errors).
>
> 2. Sage currently does not deal very well with round-off error in linear
> algebra. This needs to be fixed. Furthermore, Sage doesn't tell you
> (like matlab *sometimes* does) when round-off error might be a big
> issue. For your matrix, Sage probably should have warned you that
> round-off error might lead to an incorrect response.
>
> I've opened up http://trac.sagemath.org/sage_trac/ticket/7392 and
> http://trac.sagemath.org/sage_trac/ticket/7391 to address specific
> issues that came up in this thread. Implementing numerically stable
> arbitrary precision linear algebra is a much bigger project, so I'm
> holding off on opening up a ticket until someone starts making it happen.

Thanks. If there's a warning, this is a lot less dangerous.


> Don't use decimals; use fractions. Then Sage does everything exactly,
> with no round-off error.

But how would you go about that? You'd have to convert them all by hand,
which sort of defeats the purpose. It would be nice if we could, say,
map (lambda x: x.exact_rational()) the matrix, but once the decimal
number is stored as floating point, its representation is already
incorrect. For example,

sage: y = 0.3
sage: y.exact_rational()
5404319552844595/18014398509481984

looks like nonsense unless one understands the internals (and most
people shouldn't have to).

I admit to not searching very hard, but is there even any mention in the
docs that this is going to happen as soon as you introduce a decimal point?

Simon King

Nov 4, 2009, 5:34:47 PM
to sage-support
Hi Michael!

On 4 Nov., 20:55, Michael Orlitzky <mich...@orlitzky.com> wrote:
[...]
> > it starts using floating point numbers internally.
>
> I didn't tell it to do that.

You did. 0.5 is a floating point number.

> Ok, but (assuming it can be done) how do you propose I convert my
> problem to an exact field? By hand? If only there were some sort of
> program...

There is a simple program (namely the method change_ring()). Starting
with your example:

sage: n = matrix([ [-0.3, 0.2, 0.1], [0.2, -0.4, 0.4], [0.1, 0.2, -0.5] ])
sage: m = n.change_ring(QQ)
sage: m
[-3/10 1/5 1/10]
[ 1/5 -2/5 2/5]
[ 1/10 1/5 -1/2]

So, n.change_ring(QQ) tries to interpret the approximate data as
exact fractions. It might not always be this easy, but here it works:
sage: m.echelon_form()
[ 1 0 -3/2]
[ 0 1 -7/4]
[ 0 0 0]

Cheers,
Simon

Michael Orlitzky

Nov 4, 2009, 6:01:25 PM
to sage-s...@googlegroups.com
Simon King wrote:
> Hi Michael!
>
> On 4 Nov., 20:55, Michael Orlitzky <mich...@orlitzky.com> wrote:
> [...]
>>> it starts using floating point numbers internally.
>> I didn't tell it to do that.
>
> You did. 0.5 is a floating point number.

I guess it comes down to that, when I say 0.3 I mean 0.3, and SAGE
assumes that I mean (5404319552844595/18014398509481984).

My expectation was misguided, I see, but I think it's fairly standard.
And the first warning to that effect shouldn't be wildly incorrect
results (already discussed).

Jason Grout

Nov 4, 2009, 6:03:17 PM
to sage-s...@googlegroups.com


Apparently this tries harder to make a "nice" fraction:

sage: QQ(0.3)
3/10

So you could do this like Simon says in his post, or like this:

sage: n = matrix(QQ, [ [-0.3, 0.2, 0.1],
                       [0.2, -0.4, 0.4],
                       [0.1, 0.2, -0.5] ])


or

sage: n.apply_map(lambda x: QQ(x))

[-3/10 1/5 1/10]
[ 1/5 -2/5 2/5]
[ 1/10 1/5 -1/2]
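For comparison, plain Python's fractions module can do the same "nice fraction" recovery via limit_denominator, which returns the simplest fraction near the stored float (a sketch; Sage's QQ conversion may use a different algorithm internally):

```python
from fractions import Fraction

def nice_fraction(x, max_den=10**6):
    # Simplest rational within the float's rounding error, assuming the
    # intended denominator was at most max_den.
    return Fraction(x).limit_denominator(max_den)

row = [-0.3, 0.2, 0.1]
print([nice_fraction(x) for x in row])
# [Fraction(-3, 10), Fraction(1, 5), Fraction(1, 10)]
```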

>

> I admit to not searching very hard, but is there even any mention in the
> docs that this is going to happen as soon as you introduce a decimal point?


That's a good question. If you find a place in the docs where it would
be natural to talk about floating point numbers and computers (which is
a very general subject), that would probably be the place to mention
this. Feel free to send in a patch, or even just the text that should
be added!

Thanks,

Jason


Robert Bradshaw

Nov 4, 2009, 10:42:07 PM
to sage-s...@googlegroups.com

I'm pretty sure it didn't come through in my other emails, but I
certainly agree there is a lot of room for improvement here, in
documentation, in warnings, and in more numerically stable algorithms.

- Robert

Matt Rissler

Nov 5, 2009, 1:03:51 AM
to sage-support
I would second much of the discussion here. If I'm teaching a basic
first-semester Linear Algebra course, the internals of numerical
approximation are at best tangential to the vast majority of the
course. Students expect it to just work, and on a simple nxn case
where n<10, I'm not sure that's unmerited. Yes, it would be great if
they had a better understanding of floating point numbers, but I'm not
sure it's appropriate for a program that wants to be an alternative
to Mathematica or Maple to expect that level of maturity from its
users.

Maybe that's just a bump for a student mode that makes some
assumptions about how we want internals to work. What's the status of
that?

Thanks,

Matt

Jason Grout

Nov 5, 2009, 1:27:19 AM
to sage-s...@googlegroups.com


Interesting. I consider a brief discussion of numerical instability an
essential feature of a first-semester linear algebra course. These
students will most likely be using numerical linear algebra in real-life
problems, and it will help a *lot* to have at least had enough exposure
to realize that the computer may not be all-knowing and all-powerful
as they had supposed.

That said, I almost always define my matrices over QQ if I'm trying to
mirror a student's work that they've done by hand.
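
[As an illustration of why the over-QQ approach behaves well: a hypothetical Python analogue using the standard-library `fractions.Fraction` in place of Sage's QQ. An editorial sketch, not Sage's implementation.]

```python
from fractions import Fraction

def exact_rank(rows):
    """Rank by Gaussian elimination over exact rationals. Zero tests
    are exact, so no numerical-instability question arises."""
    A = [row[:] for row in rows]
    nrows, ncols = len(A), len(A[0])
    r = 0
    for col in range(ncols):
        if r == nrows:
            break
        # any exact nonzero entry serves as a pivot
        pivot = next((i for i in range(r, nrows) if A[i][col] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, nrows):
            f = A[i][col] / A[r][col]
            for j in range(col, ncols):
                A[i][j] -= f * A[r][j]
        r += 1
    return r

# The original poster's matrix, entered exactly (the analogue of QQ entries)
m = [[Fraction(-3, 10), Fraction(1, 5),  Fraction(1, 10)],
     [Fraction(1, 5),   Fraction(-2, 5), Fraction(2, 5)],
     [Fraction(1, 10),  Fraction(1, 5),  Fraction(-1, 2)]]
print(exact_rank(m))  # 2: the exact matrix really is singular
```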

Simon King

unread,
Nov 5, 2009, 4:30:35 AM11/5/09
to sage-support
Hi Jason!

On Nov 5, 6:27 am, Jason Grout <jason-s...@creativetrax.com> wrote:
[...]
> > Maybe that's just a bump for a student mode that makes some
> > assumptions about how we want internals to work.  What's the status of
> > that?
>
> Interesting.  I consider a brief discussion of numerical instability an
> essential feature of a first-semester linear algebra course.

I am sorry if my ideas on higher education are a bit too old-
fashioned.

I *never* got a detailed account of numerical instability when I was
a student. In the good old times in Germany (15 years ago) students had
much freedom to choose their subjects, and if one was more interested
in pure mathematics then it used to be possible to not attend any
class on numerics. And numerical instability was certainly not
considered a subject for linear algebra.

But after all, we were not supposed to use a computer. Now times have
changed.

Since nowadays computers are used, it is essential that the students
know from the very beginning that "the same" data can be represented
by different data types. They should be aware that it is very easy to
generate nonsense results (e.g., some simple calculators fail on (1/3)
*3, since they use floating point).
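
[A quick illustration of the failure mode Simon describes, in plain Python rather than Sage, with the standard-library `fractions` module playing the role of exact arithmetic:]

```python
from fractions import Fraction

# Binary floating point cannot represent 1/10 exactly:
print(0.1 + 0.2 == 0.3)   # False
print(0.1 + 0.2)          # 0.30000000000000004

# Exact rational arithmetic behaves the way a student expects:
print(Fraction(1, 3) * 3 == 1)                               # True
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True
```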

They should also know that for a singular matrix the slightest
perturbation suffices to make it non-singular. This can even be a
rigorous computation for homework: You start with a singular 3x3
integer matrix, and you replace one entry, say a 3, by 3+x, and let the
students show that the matrix becomes non-singular for any x != 0. Now
the students should appreciate that floating point numbers can yield
fishy results.
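
[Simon's homework problem can be checked mechanically. The sketch below uses the classic singular matrix [[1,2,3],[4,5,6],[7,8,9]] with its top-right 3 replaced by 3+x; the matrix is an editorial choice, not one taken from the thread. Its perturbed determinant works out to -3x, nonzero for every x != 0.]

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along row 0."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def perturbed(x):
    # singular integer matrix with the (0, 2) entry 3 replaced by 3 + x
    return [[1, 2, 3 + x],
            [4, 5, 6],
            [7, 8, 9]]

print(det3(perturbed(0)))                   # 0: the unperturbed matrix is singular
print(det3(perturbed(Fraction(1, 1000))))   # -3/1000: any x != 0 makes it invertible
```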

Moreover, I do think that students should still be able to do certain
computations by hand -- and they should certainly do so using "proper"
fractions, not decimal fractions!

IIRC, at high school we were not allowed to give the answer in the form
of a decimal fraction if the problem was formulated in terms of proper
fractions. And at university, decimal fractions were used in physics,
but certainly not in mathematics. So, I am somehow surprised that
apparently you would allow your students to use "0.3" instead of
"3/10".

Nonetheless, if numerically stable algorithms are available, Sage
should implement them.

Cheers,
Simon

Michael Orlitzky

unread,
Nov 5, 2009, 1:31:18 PM11/5/09
to sage-s...@googlegroups.com
Jason Grout wrote:
>
> Interesting. I consider a brief discussion of numerical instability an
> essential feature of a first-semester linear algebra course. These
> students will most likely be using numerical linear algebra in real-life
> problems, and it will help a *lot* to at least have had exposure to
> realize that the computer may not be all-knowing and all-powerful like
> they had supposed.
>
> That said, I almost always define my matrices over QQ if I'm trying to
> mirror a student's work that they've done by hand.

I don't mean to beat a dead horse, but this problem isn't something that
can be overcome by teaching numerical (in)stability. I understand the
floating point issues well enough; my question was never,

Why does this happen with matrices of floats?

but rather,

If we know matrices of floats return junk, why does SAGE parse my
(matrix) inputs as floats?

The answer "because you told it to" isn't very satisfying, because it
relies on my knowing the nowhere-documented[1] internal detail that
inputs containing decimals will be stored as floating point numbers,
even in cases (e.g. matrices) where the outcome will be less than desirable.

Now if you were to ask me a week ago, "what would happen if you entered
floating point approximations of fractions in to a matrix?" then my
answer would have rightly been, "all hell is going to break loose."

The students would hopefully answer similarly after a lecture on
numerical stability, but hopefully you can see how from their viewpoint,
that doesn't make it any easier to predict SAGE's behavior.

[1] I spent some more time looking, but still haven't found anything.
I could be proven wrong.

William Stein

unread,
Nov 5, 2009, 2:16:53 PM11/5/09
to sage-s...@googlegroups.com
On Thu, Nov 5, 2009 at 10:31 AM, Michael Orlitzky <mic...@orlitzky.com> wrote:
>
> Jason Grout wrote:
>>
>> Interesting.  I consider a brief discussion of numerical instability an
>> essential feature of a first-semester linear algebra course.  These
>> students will most likely be using numerical linear algebra in real-life
>> problems, and it will help a *lot* to at least have had exposure to
>> realize that the computer may not be all-knowing and all-powerful like
>> they had supposed.
>>
>> That said, I almost always define my matrices over QQ if I'm trying to
>> mirror a student's work that they've done by hand.
>
> I don't mean to beat a dead horse, but this problem isn't something that
> can be overcome by teaching numerical (in)stability. I understand the
> floating point issues well enough; my question was never,
>
>   Why does this happen with matrices of floats?
>
> but rather,
>
>   If we know matrices of floats return junk, why does SAGE parse my
>   (matrix) inputs as floats?
>
> The answer "because you told it to," isn't very satisfying, because it
> relies on my knowing the nowhere-documented[1] internal detail that
> inputs containing decimals will be stored as floating point numbers,
> even in cases (e.g. matrices) where the outcome will be less than desirable.

So we document it. Problem solved.

William
