RealField isn't doing it right

aw

Apr 15, 2023, 5:39:40 PM
to sage-devel
Guys, this is serious. Any dev who starts reading this, give it a hard look, ok?

Below I give some examples where Sage gives the wrong answer, with no error message or any indication that the answer is not correct.

Probably the single most important rule in software is, never ever give users the wrong answer.

Either give them the right answer, or give them no answer along with an error message explaining why the right answer can't be given.

Why the heck is Sage not following this rule?

After finding these examples, it's hard for me to trust any answer I get from Sage, unless I check it against other software, like Wolfram Alpha.

One possible response to my examples below is, I am not using Sage functions in an approved way.

But that's not a valid objection. A user should be able to send any argument to a function, and it's the responsibility of the programmer to make sure one of two things happen:

(a) the user is given the right answer;
or (b) the user is given no answer and an error message.

Sage is failing to do this.

On to the examples.

Example 1, straight from the sage RealField docs:

RealField(120)(RR(13/10))
1.3000000000000000444089209850062616

RR has default 53 bits, so this is really:

RealField(120)(RealField(53)(13/10))

Here's the same thing with less precision to make it clear that this has nothing to do with Python's single or double precision floats:

RealField(18)(RealField(12)(13/10))                     (A)

What should happen here is this:
RealField(12) should eval 13/10 as 1.30
RealField(12) should pass to RealField(18) only those digits, 1.30
RealField(18) should then zero-pad that to 18 bits, yielding 1.3000

Here's what sage does:

RealField(18)(RealField(12)(13/10))
1.2998
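
For the record, we can ask each field exactly which number it is holding (using as_integer_ratio(), which Sage's MPFR reals provide, if I'm reading the docs right):

RealField(12)(13/10).as_integer_ratio()
(1331, 1024)

RealField(18)(RealField(12)(13/10)).as_integer_ratio()
(1331, 1024)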

Let's check the types of all args:

type(13/10)
sage.rings.rational.Rational

type(18)    # type(12) is the same
sage.rings.integer.Integer

All args to RealField in (A) are Sage types.
Hence everything in the calculation (A) is under the complete control of Sage.
Hence Sage could easily do the right thing here.

But it doesn't.

You might say, this is an artificial example, passing the output of one RealField to another in a non-sanctioned way.
Ok then, let's just pass arguments to one RealField.
Can we get it to give wrong answers?

Yes, very easily, by passing it an arithmetic expression.

Sending in arithmetic expressions should be totally fine, because the RealField docs say: "This is a binding for the MPFR arbitrary-precision floating point library."

The purpose of MPFR is to be able to do basic arithmetic in arbitrary precision.
Basic arithmetic means any expression using the following ops (and maybe some others): +, -, *, /, ^  

So let's pass an arithmetic expression to RealField.
The whole point of MPFR is to always give the right answer for the value of such an expression, for any expression and any requested precision.
The docs say RealField is a binding for MPFR.
So RealField should only give right answers.
But in fact it sometimes gives the wrong answer.

Here's one:

RealField(200)(2*1.1)
2.2000000000000001776356839400250464677810668945312500000000

Let's check the types:

type(2)
<class 'sage.rings.integer.Integer'>

type(1.1)
<class 'sage.rings.real_mpfr.RealLiteral'>

type(2*1.1)
<class 'sage.rings.real_mpfr.RealNumber'>

Here the expression 2*1.1 is not only a Sage type, it's a Sage type designed to work with MPFR. Hence passing 2*1.1 to RealField should yield the right answer.

But it doesn't.

How the heck was this not noticed earlier?

I think I have a workaround for expressions such as 2*1.1: enter the 1.1 as a sage rational, 11/10:

RealField(200)(2*11/10)
2.2000000000000000000000000000000000000000000000000000000000

That works for this expression, but I worry that with other kinds of expression I could still get wrong answers.

By the way, in this example I used numbers where we know the answer in advance, so it's easy to see that the answer given is wrong.

Here's a very similar example where we don't know the answer in advance:

RealField(200)(e^(1.1))
3.0041660239464333947978502692421898245811462402343750000000

RealField(200)(e^(11/10))
3.0041660239464331120584079535886723932826810260162727621298

Wolfram Alpha says the second one is correct.

Let's check the types for the wrong one:

type(e)
<class 'sage.symbolic.constants_c.E'>

type(1.1)
<class 'sage.rings.real_mpfr.RealLiteral'>

type(e^(1.1))
<class 'sage.rings.real_mpfr.RealNumber'>

All are sage types, and the expression as a whole is a sage type designed to work with MPFR. Sage should give the right answer here, but it doesn't.

I like the idea of Sage. And I know the Sage devs must be solid professionals. But giving users junk answers makes Sage look bush-league.

The MPFR devs would never allow their library to give a junk answer.

Wolfram would never allow Mathematica or Alpha to give a junk answer.

Why the heck are the Sage devs allowing Sage to give junk answers?

-aw


William Stein

Apr 15, 2023, 5:55:24 PM
to sage-...@googlegroups.com
https://github.com/sagemath/sage/blob/develop/CODE_OF_CONDUCT.md



--
William (http://wstein.org)

David Roe

Apr 15, 2023, 6:20:57 PM
to sage-...@googlegroups.com
I agree with William that you should refrain from insulting the Sage developers, especially when the underlying problem comes from your misunderstanding of how floating point arithmetic works.  

To respond to the content of your message, finite precision real fields in Sage are implemented using floating point arithmetic at a precision set when creating the field.  As such, they can only exactly represent rational numbers whose denominators are powers of 2 (and that lie within a certain range, but that's not the issue here).  So 1.1 is rounded to the closest number that can be represented in 53-bit arithmetic:
sage: x = 1.1
sage: x.as_integer_ratio()
(2476979795053773, 2251799813685248)
sage: factor(2251799813685248)
2^51

When you then try to cast to a field of higher precision, you only have that rounded value, and the result is the same in the higher precision field, but when you print it you get more digits since you explicitly asked for a higher precision value.

There are several ways to avoid these kinds of issues.  First, you should almost always refrain from working at a lower precision and then increasing the precision: the result will not be accurate.  Second, you can try to work in exact fields (like the rationals) as long as you can.  Third, you can work with Sage's RealIntervalField and RealBallField to get actual error bounds in your computations.
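
For example, here is a small sketch with RealIntervalField (interval endpoints via lower() and upper(); the true value is always enclosed, even after increasing the precision):

sage: x = RealIntervalField(53)(13/10)     # interval guaranteed to contain 13/10
sage: x.lower() <= 13/10 <= x.upper()
True
sage: y = RealIntervalField(200)(x)        # widening the precision keeps the enclosure honest
sage: y.lower() <= 13/10 <= y.upper()
True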

If you have questions about this, we might be able to help provide references, but we're not obligated to instruct you when you are incorrectly insisting that Sage is doing it all wrong.
David


Michael Orlitzky

Apr 15, 2023, 7:25:27 PM
to sage-...@googlegroups.com
On Sat, 2023-04-15 at 18:20 -0400, David Roe wrote:
> I agree with William that you should refrain from insulting the Sage
> developers, especially when the underlying problem comes from your
> misunderstanding of how floating point arithmetic works.

I was given this response many times, and I think it's condescending
(no offense). My favorite permabug:

sage: A = matrix([[-3, 2, 1 ],
....: [ 2,-4, 4 ],
....: [ 1, 2,-5 ]])
sage: B = (2 * 0.5 * A)
sage: B == A
True
sage: B.rank() == A.rank()
False

I promise you that I know what a floating point number is, and that
they aren't the problem. The problem is that reasonable expectations
are not met by the Sage user interface. It shouldn't be that easy to
get the computer to lie to me. An explanation is of course possible,
but only after the fact, and only with a level of knowledge about the
preprocessor that is beyond most users.

William Stein

Apr 15, 2023, 7:57:20 PM
to sage-...@googlegroups.com
On Sat, Apr 15, 2023 at 4:25 PM Michael Orlitzky <mic...@orlitzky.com> wrote:
>
> On Sat, 2023-04-15 at 18:20 -0400, David Roe wrote:
> > I agree with William that you should refrain from insulting the Sage
> > developers, especially when the underlying problem comes from your
> > misunderstanding of how floating point arithmetic works.
>
> I was given this response many times, and I think it's condescending
> (no offense).

I agree with you that it's best to assume that the original poster "aw" does
understand the semantics of floating point numbers in Sage, and just doesn't
like them. I think you also understand the semantics of floating point in
Sage as well, and you also don't like them.

We are definitely not changing these semantics. If you have questions about
how to use Sage's various floating point models as they are to solve problems,
we would be happy to answer them. However, insulting the sage developers
and demanding that the semantics change is not appropriate for this
mailing list.
Such comments would be appropriate and welcome at

https://groups.google.com/g/sage-flame

-- William


--
William (http://wstein.org)

Oscar Benjamin

Apr 15, 2023, 8:05:27 PM
to sage-...@googlegroups.com
On Sat, 15 Apr 2023 at 22:39, aw <aw.phon...@gmail.com> wrote:
>
>A user should be able to send any argument to a function, and it's the responsibility of the programmer to make sure one of two things happen:
>
> (a) the user is given the right answer;
> or (b) the user is given no answer and an error message.

That is quite a strong requirement. I guess the programmer needs to
build a program that ignores what the user actually types in and
instead understands what they want some other way (psychically?). That
way the user can call any functions with any arguments but the program
will always do what the user wanted it to do regardless of what code
they entered.

Seriously though, floating point is not something where responsibility
can be *entirely* outsourced at *any* level. Anyone who wants to use
floating point types/routines directly or indirectly needs to have
some level of understanding of what the implications are even if that
is as basic as just knowing that rounding errors can happen and
knowing which operations might be affected by them. Explicitly mixing
or converting between precisions as in these examples requires more
care and understanding and it is just not possible for the
"programmer" to eliminate that responsibility from the "user"
altogether.

> RealField(18)(RealField(12)(13/10)) (A)
>
> What should happen here is this:
> RealField(12) should eval 13/10 as 1.30
> RealField(12) should pass to RealField(18) only those digits, 1.30
> RealField(18) should then zero-pad that to 18 bits, yielding 1.3000

That is what happens except that it happens in binary rather than
decimal floating point. Binary floating point means that 13/10 in
RealField(12) is already not exactly 1.3 which is why padding it with
zero bits also gives something that is not exactly 1.3. The most
accurate representation of 13/10 in RealField(18) is not just a
zero-padded version of the result from RealField(12) so the conversion
to higher precision does not recover the accuracy that is already
lost:

sage: RealField(12)(13/10)
1.30
sage: RealField(12)(13/10).as_integer_ratio()
(1331, 1024)
sage: 1331./1024
1.29980468750000
sage: RealField(18)(RealField(12)(13/10))
1.2998
sage: RealField(18)(RealField(12)(13/10)).as_integer_ratio()
(1331, 1024)

Notice as_integer_ratio gives the same result for the number in
RealField(18) as in RealField(12). They are both the exact same number
but they display differently because the decimal display
representation depends on the precision of the field.

--
Oscar

Michael Orlitzky

Apr 15, 2023, 8:17:37 PM
to sage-...@googlegroups.com
On Sat, 2023-04-15 at 16:56 -0700, William Stein wrote:
>
> I agree with you that it's best to assume that the original poster "aw" does
> understand the semantics of floating point numbers in Sage, and just doesn't
> like them. I think you also understand the semantics of floating point in
> Sage as well, and you also don't like them.

This misses the point. The semantics are fine when you're expecting
them, but the expectation is key.

Nils Bruin

Apr 15, 2023, 10:11:22 PM
to sage-devel
On Saturday, 15 April 2023 at 16:25:27 UTC-7 Michael Orlitzky wrote:
sage: A = matrix([[-3, 2, 1 ],
....: [ 2,-4, 4 ],
....: [ 1, 2,-5 ]])
sage: B = (2 * 0.5 * A)
sage: B == A
True
sage: B.rank() == A.rank()
False

I promise you that I know what a floating point number is, and that
they aren't the problem. The problem is that reasonable expectations
are not met by the Sage user interface.
 
I fail to see what the reasonable expectations are here. As soon as you multiply by "0.5" you now have an "imprecise" result. When people learn to use a scientific calculator properly they are very quickly confronted with the reality that "dots in numbers" lead to arithmetic that isn't quite what we learn in the lower grades of primary school.

Do you think we should have "B == A" return false? Indeed, any time equality comparisons are performed between floats, it's almost surely a bug, but if we were to break away from IEEE on this, we'd be failing to meet a whole different array of "reasonable" expectations.

Indeed, in Magma "A == B" would lead to an error rather than a True/False, but unfortunately, Python blows up rather horribly if equality tests don't work (look up the problems with float('nan') in Python, which also spawn from IEEE conventions that Python does implement for floats and then breaks its own expectations).

A more reasonable solution might be to have B.rank() return an error, since no useful information will be obtained from computing the rank of a float matrix; particularly because it actually doesn't even try to do the row reduction in a vaguely numerically stable way. The only reason why B.rank() works is because there's a generic algorithm on the matrix class that isn't actively intercepted when the base ring is floaty. A reason to *not* block it is that on interval rings (and p-adics and power series etc. that keep track of their error), the computation may still have some meaning.
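
For what it's worth, that last point can be seen with ball arithmetic (a quick sketch; RBF elements carry error bounds, so the rank computation can still mean something):

sage: A = matrix([[-3, 2, 1], [2, -4, 4], [1, 2, -5]])
sage: B = 2 * 0.5 * A
sage: matrix(RBF, B).rank()
2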


Nils Bruin

Apr 15, 2023, 11:12:09 PM
to sage-devel
On Saturday, 15 April 2023 at 14:39:40 UTC-7 aw wrote:
Here's one:

RealField(200)(2*1.1)
2.2000000000000001776356839400250464677810668945312500000000

Let's check the types:

type(2)
<class 'sage.rings.integer.Integer'>

type(1.1)
<class 'sage.rings.real_mpfr.RealLiteral'>

type(2*1.1)
<class 'sage.rings.real_mpfr.RealNumber'>

The type on that last result gives you a hint that something has changed by multiplying by 2:

sage: parent(2*1.1)
Real Field with 53 bits of precision

As you saw, "1.1" led to a "real literal". Doing arithmetic with it will result in something that's not a literal anymore (the literal is mainly there so that "RealField(100)(1.1)" does what you'd expect). It has to choose a default precision which, without further information, is taken to be 53 bits (because that agrees with Python's float and the IEEE standard for "double" floats). Since the literal (but not the real number that results from multiplying by 2) does remember the string it was created from, you can actually imply a different precision:

sage: parent(2*1.1000000000000000000000000000000)
Real Field with 107 bits of precision

The representation is still not exactly 11/5, because that number does not admit an exact representation in binary floating point, for any precision, but it will result in a binary floating point that lies closer to 11/5.
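
To see the two behaviours side by side (the second value is the 53-bit rounding of 11/10, re-printed at 200 bits):

sage: RealField(200)(1.1)       # the literal re-parses its string at 200 bits
1.1000000000000000000000000000000000000000000000000000000000
sage: RealField(200)(1*1.1)     # arithmetic first: the result is an ordinary 53-bit RealNumber
1.1000000000000000888178419700125232338905334472656250000000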

The design decision here to let the multiplication of a "RealLiteral" by a sage integer result in a "RealNumber" is because your use of RealLiteral was taken to signal intent to use floats over exact types. If you wanted exact results you could have used 2*11/10 . The usual reason for preferring floats is either because you're interested in doing computations that lead to transcendental numbers (which are problematic to compute with exactly) or because you want to avoid the explosion-of-digits that arises from exact arithmetic. Compare:

sage: %time a=factorial(10000000)
CPU times: user 2.22 s, sys: 41 ms, total: 2.26 s
Wall time: 2.27 s
sage: %time b=factorial(10000000.0)
CPU times: user 396 µs, sys: 0 ns, total: 396 µs
Wall time: 414 µs

As you can see, the results coming from sagemath may not meet *your* expectations, but the design decisions leading to the behaviour are actually quite well-motivated. It's entirely possible the design can be refined and improved even further, but you didn't make a convincing case that your examples would lead to such an improvement.

Georgi Guninski

Apr 16, 2023, 2:46:34 AM
to sage-...@googlegroups.com
On Sun, Apr 16, 2023 at 2:25 AM Michael Orlitzky <mic...@orlitzky.com> wrote:
>

> sage: A = matrix([[-3, 2, 1 ],
> ....: [ 2,-4, 4 ],
> ....: [ 1, 2,-5 ]])
> sage: B = (2 * 0.5 * A)
> sage: B == A
> True
> sage: B.rank() == A.rank()
> False
>

Doesn't this complicate citing sagemath in papers?
e.g. Proof: according to \cite{sage} we have A != B, QED.
The testcases appear plausible to me.

Michael Orlitzky

Apr 16, 2023, 7:40:50 AM
to sage-...@googlegroups.com
On Sat, 2023-04-15 at 19:11 -0700, Nils Bruin wrote:
>
> I fail to see what the reasonable expectations are here. As soon as you
> multiply by "0.5" you now have an "imprecise" result. When people learn to
> use a scientific calculator properly they are very quickly confronted with
> the reality that "dots in numbers" lead to arithmetic that isn't quite what
> we learn in the lower grades of primary school.

It's reasonable to expect that multiplying by one won't cause a viable
alternative to Mathematica et al. to go bonkers. I haven't declared a
float variable, and I haven't type-cast anything to float. The
expression "0.5" is, a priori, quite equal to 1/2. Mathematica knows
it, my old Casio calculator knows it, and in fact, Sage knows it:

sage: 2 * 0.5 == ZZ(1)
True

We're pre-parsing the user's input, so there is an opportunity to do
the right thing. But instead, we silently convert 2*0.5 to float, and
then silently convert the entire matrix to a float matrix, on which
half the methods in sage are completely meaningless.


> Do you think we should have "B == A" return false? ...
> A more reasonable solution might be to have B.rank() return an error

Yes, ultimately, I think it would be better to have a class hierarchy
that can distinguish between inexact rings where these operations are
meaningful and those where they're not. But these are Sage library
issues and that's not my main complaint.

If you're writing python code, you should expect 2*0.5 to return a
float. But if you're learning linear algebra for the first time and
typing a matrix into the Sage notebook, typing 0.5 instead of 1/2
should not ruin the entire assignment without so much as a warning.

Dima Pasechnik

Apr 16, 2023, 7:47:26 AM
to sage-...@googlegroups.com
perhaps the preparser should have an option to convert the floating
point input into
rationals.


Dima Pasechnik

Apr 16, 2023, 7:53:36 AM
to sage-...@googlegroups.com
On Sun, Apr 16, 2023 at 12:25 AM Michael Orlitzky <mic...@orlitzky.com> wrote:
>
> On Sat, 2023-04-15 at 18:20 -0400, David Roe wrote:
> > I agree with William that you should refrain from insulting the Sage
> > developers, especially when the underlying problem comes from your
> > misunderstanding of how floating point arithmetic works.
>
> I was given this response many times, and I think it's condescending
> (no offense). My favorite permabug:
>
> sage: A = matrix([[-3, 2, 1 ],
> ....: [ 2,-4, 4 ],
> ....: [ 1, 2,-5 ]])
> sage: B = (2 * 0.5 * A)
> sage: B == A
> True
> sage: B.rank() == A.rank()
> False
>

that's a bug in B.rank() - in my book at least. It's perfectly
possible to either reliably derive that rank(B)=2,
as far as it makes sense in the given precision, or throw an error.
It's just not implemented.



Trevor Karn

Apr 16, 2023, 10:52:02 AM
to sage-devel
I don't have much understanding of how floating point arithmetic works, but the argument 

>If you're writing python code, you should expect 2*0.5 to return a
>float. But if you're learning linear algebra for the first time and
>typing a matrix into the Sage notebook, typing 0.5 instead of 1/2
>should not ruin the entire assignment without so much as a warning.

seems important to me. I also sometimes worry that Sage seems to be written for users who are both experts in [insert mathematical field here] and Python. This prevents (a) mathematical experts who don't know Python from using Sage and (b) more junior students from using Sage. Maybe one can make the argument that mathematical experts should just learn Python (although I would disagree), but I think (b) is a genuine problem. Mathematica doesn't seem to require the user to be expert in either math or Wolfram language (or maybe I haven't used it enough). I don't have an actionable complaint here, but it is something I encounter when helping folks around my math department with their code. I'm trying to keep it in mind as I develop code.

I see that

sage: A = matrix([[-3, 2, 1 ],[ 2,-4, 4 ], [ 1, 2,-5 ]])
sage: B = (2*0.5*A)
sage: rank(B)
3

What about floating point arithmetic makes this break? I print (B) and see 

[-3.00000000000000  2.00000000000000  1.00000000000000]
[ 2.00000000000000 -4.00000000000000  4.00000000000000]
[ 1.00000000000000  2.00000000000000 -5.00000000000000]

which has no obvious reason not to be rank 2. Could someone ELI5?

It is also not unprecedented for an error to be thrown for the rank method. From the documentation:

      sage: m = matrix(Integers(4), 2, [2,2,2,2])
      sage: m.rank()
      Traceback (most recent call last):
      ...
      NotImplementedError: Echelon form not implemented over 'Ring of integers modulo 4'.

It seems reasonable to me that a similar NotImplementedError could be thrown for generic dense matrices. Is that an acceptable solution @Dima and @Michael? It seems easy enough to add a `.rank()` method in `matrix/matrix_generic_dense.pyx` which throws the error. I'd be happy to make this PR if that is reasonable to folks.
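
Roughly something like this (a hypothetical sketch, not actual Sage code; it assumes base rings report exactness via is_exact(), and as Nils notes this guard would be too crude for interval/ball rings, which can still give meaningful ranks):

# hypothetical sketch for matrix/matrix_generic_dense.pyx, not Sage's actual code
from sage.matrix.matrix_dense import Matrix_dense

class Matrix_generic_dense(Matrix_dense):
    def rank(self):
        # guard: over an inexact base ring, generic row reduction
        # gives numerically meaningless ranks
        if not self.base_ring().is_exact():
            raise NotImplementedError(
                "rank not implemented over inexact rings; "
                "try an exact ring, or RBF/RIF for certified bounds")
        return super().rank()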

Dima Pasechnik

Apr 16, 2023, 11:22:01 AM
to sage-devel


On Sun, 16 Apr 2023, 15:52 Trevor Karn, <trevor...@gmail.com> wrote:

I see that

sage: A = matrix([[-3, 2, 1 ],[ 2,-4, 4 ], [ 1, 2,-5 ]])
sage: B = (2*0.5*A)
sage: rank(B)
3

What about floating point arithmetic makes this break? I print (B) and see 

[-3.00000000000000  2.00000000000000  1.00000000000000]
[ 2.00000000000000 -4.00000000000000  4.00000000000000]
[ 1.00000000000000  2.00000000000000 -5.00000000000000]

which doesn't have any obvious reason to be not rank-2. Could someone ELI5?

probably because B.echelon_form() returns the identity matrix - obviously wrong too.


Oscar Benjamin

Apr 16, 2023, 1:03:09 PM
to sage-...@googlegroups.com
On Sun, 16 Apr 2023 at 15:52, Trevor Karn <trevor...@gmail.com> wrote:
>
> I don't have much understanding of how floating point arithmetic works, but the argument
>
> >If you're writing python code, you should expect 2*0.5 to return a
> >float. But if you're learning linear algebra for the first time and
> >typing a matrix into the Sage notebook, typing 0.5 instead of 1/2
> >should not ruin the entire assignment without so much as a warning.
>
> seems important to me. I also sometimes worry that Sage seems to be written for users who are both experts in [insert mathematical field here] and Python. This prevents (a) mathematical experts who don't know python from using Sage and (b) more junior students from using sage. Maybe one can make the argument that mathematical experts should just learn python (although I would disagree), but I think (b) is a genuine problem. Mathematica doesn't seem to require the user to be expert in either math or Wolfram language (or maybe I haven't used it enough). I don't have an actionable complaint here, but it is something I encounter when helping folks around my math department with their code. I'm trying to keep it in mind as I develop code.

This hardly requires expert knowledge of Python but rather just the
understanding that 0.5 means a floating point number and then some
understanding of the implications of floating point numbers on
exactness. This is not uniquely a feature of Python by any stretch.
The use of a decimal point to denote approximate or floating point
numbers is the same in Mathematica, Maple, Maxima, Matlab, and not
just Python but most general purpose programming languages. You can
see the Mathematica and Maple docs explaining this here:

https://www.maplesoft.com/support/help/maple/view.aspx?path=float
https://reference.wolfram.com/language/howto/ChangeTheFormatOfNumbers.html

Select quotes from there:

> > > Enter an approximate real number by explicitly including a decimal point:
> > > The presence of a floating-point number in an expression generally implies that the computation will use floating-point evaluation.
> > > any calculation mixing exact and approximate numbers gives an approximate result

The fact that writing 0.5 (rather than 1/2) implies the use of
floating point which in turn implies rounding errors and inexactness
is certainly something that can surprise beginners. In the context of
Sage it could have been a valid choice not to do that and to have 0.5
mean an exact rational number when it appears as a literal in the
user's code. There is no difficulty in making that work any more than
making 1/2 return a rational number (since in Python itself 1/2 is a
float). This is a design choice though and using decimal literals for
floats is useful for anyone who actually wants to do calculations with
floats since it means that there is a succinct syntax for floats as
well as a separate syntax for exact rationals.
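
You can see that design choice directly in Sage's preparser (assuming I remember the quoting of its output correctly):

sage: preparse('0.5')
"RealNumber('0.5')"
sage: preparse('1/2')
'Integer(1)/Integer(2)'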

> I see that
>
> sage: A = matrix([[-3, 2, 1 ],[ 2,-4, 4 ], [ 1, 2,-5 ]])
> sage: B = (2*0.5*A)
> sage: rank(B)
> 3
>
> What about floating point arithmetic makes this break? I print (B) and see
>
> [-3.00000000000000 2.00000000000000 1.00000000000000]
> [ 2.00000000000000 -4.00000000000000 4.00000000000000]
> [ 1.00000000000000 2.00000000000000 -5.00000000000000]
>
> which doesn't have any obvious reason to be not rank-2. Could someone ELI5?

Arithmetic is needed to be able to compute the rank. With floats that
arithmetic is going to be approximate and will have some rounding
error. Knowing that the rank is 2 here requires knowing that the third
row is *exactly* equal to some linear combination of the first and
second but rounding errors will make the linear combinations not be
exactly equal.

It is basically impossible to compute the rank of a matrix using
inexact arithmetic since the tiniest rounding errors will make most
matrices appear to be full rank. That does not stop there being
floating point routines for computing the rank but they work
differently, are inherently heuristic and need thresholds as discussed
here:
https://numpy.org/doc/stable/reference/generated/numpy.linalg.matrix_rank.html

NumPy's matrix_rank function does give the expected answer here and is
probably a better implementation of this for machine precision:

sage: import numpy as np
sage: np.linalg.matrix_rank(B)
2

The Mathematica docs have a caveat about computing the rank of a
matrix of floats:
https://reference.wolfram.com/language/ref/MatrixRank.html
Under the "possible issues" section it says:

> > > Use machine arithmetic. Machine numbers cannot distinguish between 9 and 9 + 1e-20

It then shows an example in which the rank of a matrix of floats is
computed incorrectly. With that example NumPy fails for the same
reason:

sage: A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9 + 1e-20]])
sage: A
array([[1., 2., 3.],
[4., 5., 6.],
[7., 8., 9.]])
sage: np.linalg.matrix_rank(A)
2

It is not that the matrix_rank function gets anything wrong here: with
floats, 9 + 1e-20 is already equal to 9, so the matrix A has the wrong
values before matrix_rank is called.
The matrix_rank function returned the exact correct answer for the
inputs that it was given. This also shows why converting all floats to
rational and then computing the rank does not work in general. If the
matrix arrived at the rank function with floats in it then most likely
rounding errors have already happened. In that case computing the
exact rank based on the exact values of the floats in the matrix is
probably not what the user intended.

--
Oscar

Nils Bruin

Apr 16, 2023, 1:50:36 PM
to sage-devel
On Sunday, 16 April 2023 at 04:40:50 UTC-7 Michael Orlitzky wrote:

It's reasonable to expect that multiplying by one won't cause a viable
alternative to Mathematica et al. to go bonkers. I haven't declared a
float variable, and I haven't type-cast anything to float. The
expression "0.5" is, a priori, quite equal to 1/2. Mathematica knows
it, my old Casio calculator knows it, and in fact, Sage knows it:

sage: 2 * 0.5 == ZZ(1)
True
 
That's a facetious example that sticks to values that can be exactly represented in binary floats. These identities don't hold generally:

sage: 49*(1.0/49) == ZZ(1)
False

(it depends on your working precision whether rounding makes this come out as an exact match or not, but at 53 bits this one fails to give equality. In all cases, there are values where this breaks; necessarily)

 

aw

Apr 16, 2023, 5:31:43 PM
to sage-devel
Awesome, let's talk about floating point semantics.

There are two main kinds, that I can see.

One is "ordinary person" semantics, the other is IEEE-like semantics.

Ordinary-person semantics is what you learned in grade school.

In grade school, suppose we want to add 1.1 to 2.8324, using pencil & paper.

Here's what we do:

  1.1000
+ 2.8324
--------
= 3.9324

We zero-pad the 1.1 to whatever length is needed to match the other number.
Because we see 1.1 as a shorthand for 1.1000000000000....  (infinitely many zeros)

That's the ordinary-person semantics of the string "1.1".

IEEE-like semantics is a bit different. There, the interpretation of "1.1" is something like:

1.1000000 * 10^0

where the digits in the mantissa beyond the shown ones are unspecified and the general expectation is that for real-world quantities coming from physical measurements, those digits will be random-ish, especially once you get past the first unspecified one.
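
In Sage terms you can see that IEEE-style reading directly, with the same as_integer_ratio() call David used above: "1.1" is stored as the nearest 53-bit binary rational, not as 11/10:

RR(1.1).as_integer_ratio()
(2476979795053773, 2251799813685248)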

Ok, so when a user enters the expression 2*1.1 into Sage, which semantics do they want?

I claim that almost every user entering that expression wants the ordinary-person semantics, where 1.1 is a shorthand for 1.10000000... =11/10, exactly.

I claim those users continue to want the ordinary-person semantics even in expressions with transcendental quantities like e or pi. When they type e^1.1, they expect the 1.1 to be exactly 11/10, and if they ask for the value of e^1.1 to 500 decimal places, they expect every digit given to be correct.

I claim those users continue to want the ordinary-person semantics even in expressions where some other quantity has an inexact IEEE-type floating point value coming from a physical measurement, eg x=3.437621 * 10^-8. When they type 1.1*x, they are thinking of the 1.1 as exactly 11/10, but they also know the value of 1.1*x will be in IEEE floating point because x is, and of course the answer should be in the less precise format.

To meet the semantic expectations of the vast majority of users (I would guess well over 99%), Sage should interpret float literals using ordinary-person semantics, i.e. as rationals.

1.1 should mean 11/10, unless the user wants to override that by explicitly requesting the value to be an IEEE float or some other kind of float.

This is not some fancy technical issue, it's just common sense.

Remember that a big chunk of your users are beginners or students. The default behavior of Sage should be designed for them, not for experts. Experts can easily override default behavior to get whatever they want; beginners or students don't know how to do that, so the default semantics should be exactly what *they* expect.

Wolfram Alpha also has beginners and students as a big chunk of its user base, and they get the default semantics exactly right.

If you enter "100 digits of 1.1" into Alpha, the answer is the list [1,1,0,0,0,...,0], all zeros after the 1,1. That's the ordinary-person semantics.

I'm going to quote your own marketing materials back at you:

from: https://www.sagemath.org/library-stories.html
"Success Stories and Testimonials"

"SAGE is doing remarkably well at keeping a balance between ease-of-use for beginners and high-end users.  -David Kohel, 2008-11-03"

RealField is not doing this well. When a beginner wants to know e^1.1 to many decimal places, they enter something like this:

RealField(200)(e^1.1)

and they get this:

3.0041660239464333947978502692421898245811462402343750000000

where everything past the 16th digit is junk, crap, nonsense, not true.

Why aren't you embarrassed that for years you've been giving beginners and students these junk digits, looking for all the world as if they are the right ones? Those people put their trust in you to always give them the right answer, and here you are betraying that trust.

You don't need to change the semantics for your IEEE-type floats. You just need to acknowledge and respect the ordinary-person semantics for strings like "1.1" that users type in themselves.

One way to do that would be to adjust the pre-processing of Sage args to represent float literals like 1.1 as rationals like 11/10, unless overridden by the user.
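
In fact something close to that already works by hand, if I understand QQ's conversion correctly (it appears to pick the simplest rational consistent with the float):

QQ(1.1)
11/10

RealField(200)(2*QQ(1.1))
2.2000000000000000000000000000000000000000000000000000000000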

I have a lot more to say on this topic, but for now I'll end there.

-aw

Nils Bruin

Apr 16, 2023, 6:44:56 PM
to sage-devel
On Sunday, 16 April 2023 at 14:31:43 UTC-7 aw wrote:
Awesome, let's talk about floating point semantics. [...]

We zero-pad the 1.1 to whatever length is needed to match the other number.
Because we see 1.1 as a shorthand for 1.1000000000000....  (infinitely many zeros)

That's the ordinary-person semantics of the string "1.1".

That's not floating-point semantics. That's just a funny way of writing 11/10. Those are also not the semantics that scientific calculators use (or excel for that matter), so I suspect that "ordinary persons" are already quite used to 1.1 not meaning an exact  rational number but a possible approximation to something quite close to 11/10. In fact, I've seen people use this as an argument why standard units are more precise than metric: one would talk about 5/16 in, but that is only approximately 0.79375 cm (I'm not saying I'm convinced by the argument, but it is a good illustration that ordinary people are quite aware of rounding errors in decimal fraction notation)

The fact that computers often don't use base-10 floating point but base-2 floating point is a more subtle trap, but once you realize that doing floating point arithmetic quickly leads to not all your digits being accurate, those differences tend to be only in digits that you shouldn't be trusting anyway.

Perhaps you're interested in exploring RealLazyField. It actually attempts to provide a framework to compute with (some) real numbers exactly. I don't think it's ready to be the default just yet, if ever.

Edgar Costa

Apr 16, 2023, 6:56:05 PM
to sage-...@googlegroups.com
It's reasonable to expect that multiplying by one won't cause a viable
alternative to Mathematica et al. to go bonkers. I haven't declared a
float variable, and I haven't type-cast anything to float. The
expression "0.5" is, a priori, quite equal to 1/2. Mathematica knows
it

I believe you misunderstand Mathematica semantics.

Mathematica 12.0.0 Kernel for Mac OS X x86 (64-bit)
Copyright 1988-2019 Wolfram Research, Inc.

In[1]:= 2*0.5 === 1
Out[1]= False
In[2]:= 2*1/2 === 1
Out[2]= True
In[3]:= 2*0.5 == 1
Out[3]= True
In[4]:= 3004166023946433/10^15 == E^1.1
Out[4]= True

Wolfram Alpha also has beginners and students as a big chunk of its user base, and they get the default semantics exactly right.



aw

Apr 16, 2023, 10:46:32 PM
to sage-devel
On Sunday, April 16, 2023 at 4:56:05 PM UTC-6 Edgar Costa wrote:

Wolfram Alpha also has beginners and students as a big chunk of its user base, and they get the default semantics exactly right.


Yeah, good catch, so even Wolfram screws up sometimes!

But that doesn't give you Sage devs a free pass to screw up as much as you want.
You and Wolfram should both be trying to eliminate all such mistakes, right?

-aw

aw

Apr 16, 2023, 10:46:57 PM
to sage-devel
On Sunday, April 16, 2023 at 4:44:56 PM UTC-6 Nils Bruin wrote:
On Sunday, 16 April 2023 at 14:31:43 UTC-7 aw wrote:
Awesome, let's talk about floating point semantics. [...]

We zero-pad the 1.1 to whatever length is needed to match the other number.
Because we see 1.1 as a shorthand for 1.1000000000000....  (infinitely many zeros)

That's the ordinary-person semantics of the string "1.1".

That's not floating-point semantics. That's just a funny way of writing 11/10. Those are also not the semantics that scientific calculators use (or excel for that matter), so I suspect that "ordinary persons" are already quite used to 1.1 not meaning an exact  rational number but a possible approximation to something quite close to 11/10.

You've got to be kidding. The only people who would look at "1.1" and see a value different than 11/10 are programmers who are up to their necks in the details of floating point implementations. Everybody else in the world looks at "1.1" and sees the value 11/10.

As for the 11/10 interpretation not being "floating-point semantics": I don't care what you call it, it is one of Sage's big problems here. Call it float literal semantics, finite decimal string semantics, it doesn't matter what you call it. It is the right way to interpret "1.1", for 99%+ of Sage's users. So it's the interpretation you guys should be implementing as the default.

-aw 

aw

Apr 16, 2023, 10:47:15 PM
to sage-devel
On Saturday, April 15, 2023 at 9:12:09 PM UTC-6 Nils Bruin wrote:
The design decision here to let the multiplication of a "RealLiteral" by a sage integer result in a "RealNumber" is because your use of RealLiteral was taken to signal intent to use floats over exact types. If you wanted exact results you could have used 2*11/10 .

Even in an expression using floats, when a user types "1.1", they intend that to mean 11/10, exactly. They don't expect it to get changed to a different number, as RealField does.

Here's what users expect when they type an expression into a higher-precision environment: they expect the answer to be the exact answer, truncated to the precision of that environment. Period. This is not negotiable. There is no latitude for the software to deviate from that.

When I type 2*1.1 into an environment with a precision of 200 bits, I expect the answer to be the exact answer truncated to 200 bits, or about 59 digits.

Here's what Sage gives me:

RealField(200)(2*1.1)
2.2000000000000001776356839400250464677810668945312500000000

How the heck can you defend that? Look at that junk!

This violates some of the most basic expectations of users. No user expects junk like that to be in their answer.

RealField is doing something completely unanticipated here, and worse, is doing it silently: it interprets "1.1" as the nearest 53-bit binary rational, which is not 1.1 but some other value. Hence the result is not 2.2 but some other value, a value with junk in it.

Users should never be handed junk.
And defaults should always be what most users expect (especially when it's 99%+ of users, as it is in the case).

-aw

aw

Apr 16, 2023, 10:47:31 PM
to sage-devel
On Saturday, April 15, 2023 at 5:25:27 PM UTC-6 Michael Orlitzky wrote:
My favorite permabug:

sage: A = matrix([[-3, 2, 1 ],
....: [ 2,-4, 4 ],
....: [ 1, 2,-5 ]])
sage: B = (2 * 0.5 * A)
sage: B == A
True
sage: B.rank() == A.rank()
False
 
Unbelievable.
This is failing a basic consistency check: if x==y, then we should have f(x)==f(y) for any function f.

The problem here is that float literals are being mishandled. The string "0.5" should be interpreted as 1/2, unless the user overrides that.

Can you post the other "permabugs" you know of that involve float literals?

-aw

Nils Bruin

Apr 17, 2023, 12:42:01 AM
to sage-devel
On Sunday, 16 April 2023 at 19:47:15 UTC-7 aw wrote:

Here's what users expect when they type an expression into a higher-precision environment: they expect the answer to be the exact answer, truncated to the precision of that environment. Period. This is not negotiable. There is no latitude for the software to deviate from that.

When I type 2*1.1 into an environment with a precision of 200 bits, I expect the answer to be the exact answer truncated to 200 bits, or about 59 digits.

 
Here's what Sage gives me:

RealField(200)(2*1.1)
2.2000000000000001776356839400250464677810668945312500000000

As far as I can see, when you specify 1.1, I'd expect about 2 digits, i.e., about 7 bits of precision. If you want 59 digits of precision then write it!
sage: a=1.1000000000000000000000000000000000000000000000000000000000
sage: RealField(200)(2*a)
2.2000000000000000000000000000000000000000000000000000000000

That's what I was taught in school about implied precision in decimal fractions, so the fact that sage does the same thing actually surprises me pleasantly here. I'm not talking university here. Just high school: physics, chemistry, basic maths etc.

We discussed before that the interpreter sees "2*1.1" before it ever gets near the RealField(200), so as far as "2*1.1" is concerned, there is no precision implied there other than the approximately 7 bits of precision in 1.1.

At this point I think it's clear that *your* expectation when  you write 1.1 is rational arithmetic. That's not what it means in python and that's not what it means in sage.

And defaults should always be what most users expect (especially when it's 99%+ of users, as it is in the case).

I think more than 1% of sage's user base received an education similar to mine (no value judgement in either direction implied!) so I don't agree with your 99% claim there.

If you prefer, you can make sage behave in the way you like by just including a little definition in your configuration file:

old_RealNumber=RealNumber
def RealNumber(*args,*kwargs):
    return QQ(old_RealNumber(*args,*kwargs))

With that you'll be good to go. The preparser will do exactly what you expect. There are always going to be differences in preference and expectation for what the user interface provides (by default). So rather than argue about tastes, perhaps just tailor your environment to your preferences.

Give it a try! Perhaps it works for you. Or perhaps you find it makes other things work poorly, in which case you may gain some insight why the design decision in sage was made as it is.

Nils Bruin

Apr 17, 2023, 1:46:16 AM
to sage-devel
On Sunday, 16 April 2023 at 21:42:01 UTC-7 Nils Bruin wrote:

old_RealNumber=RealNumber
def RealNumber(*args,**kwargs):
    return QQ(old_RealNumber(*args,**kwargs))

Apologies, typos corrected in code above (note the double star in **kwargs)

Michael Orlitzky

Apr 17, 2023, 6:53:42 AM
to sage-...@googlegroups.com
On Sun, 2023-04-16 at 10:50 -0700, Nils Bruin wrote:
>
> That's a facetious example that sticks to values that can be exactly
> represented in binary floats. These identities don't hold generally:
>
> sage: 49*(1.0/49) == ZZ(1)
> False
>
> (it depends on your working precision whether rounding makes this come out as an
> exact match or not, but at 53 bits this one fails to give equality. In all
> cases, there are values where this breaks; necessarily)

I didn't make that example up to make a point, it's a real problem I
reported ~15 years ago in undergrad. And the fact that it's an easy
case only makes it *more* surprising that sage doesn't handle it.

We don't need a perfect solution to get it right most of the time. How
often does someone type 1/49 = 0.02040816326530612... into a matrix?
But Sage actually knows that number too:

$ sage -python
Python 3.11.2 (main, Mar 16 2023, 15:12:59) [GCC 12.2.1 20230121] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from sage.all import *
>>> QQ(0.02040816326530612)
1/49


Michael Orlitzky

Apr 17, 2023, 7:02:50 AM
to sage-...@googlegroups.com
On Sun, 2023-04-16 at 12:47 +0100, Dima Pasechnik wrote:
>
> perhaps the preparser should have an option to convert the floating
> point input into rationals.

We should try ZZ first, but yeah, something like that. If Sage thinks
QQ(x) == float(x), then the former should be preferred. Or maybe it
only needs to kick in when an exact value would be coerced to an
inexact one (with the goal of avoiding that). I don't think it would be
easy; there would be lots of fine-tuning and unintended consequences,
but none of that makes the complaint invalid.

John Cremona

Apr 17, 2023, 7:13:51 AM
to sage-...@googlegroups.com
Even in the light of the detailed explanations already given, I do find it hard to see how these outputs are so different:

sage: RealField(200)(RR(11/10))
1.1000000000000000888178419700125232338905334472656250000000
sage: RealField(200)(RR(1.1))
1.1000000000000000000000000000000000000000000000000000000000

You can increase 200 to 1000: the second one gives all zeros (hooray!) while the first one gives those extra digits which are exactly 1/(5 * 2^51).  Both RR(11/10) and RR(1.1) have parent Real Field with 53 bits of precision, where they compare equal, but pushing them into RealField(200) gives different results.  And the "better" result (all 0s) comes from the input with the literal 1.1, not from the "rational" input 11/10.  To me, that is mysterious.  Can someone explain?

John

PS It is a red herring to give examples using the equality relation ==, as that is a separate issue.  In Python == is not even associative.


Michael Orlitzky

Apr 17, 2023, 7:20:15 AM
to sage-...@googlegroups.com
On Sun, 2023-04-16 at 19:46 -0700, aw wrote:
>
> Unbelievable.
> This is failing a basic consistency check: if x==y, then we should have
> f(x)==f(y) for any function f.
>
> The problem here is that float literals are being mishandled. The string
> "0.5" should be interpreted as 1/2, unless the user overrides that.
>
> Can you post the other "permabugs" you know of that involve float literals?
>

They're all variations on a theme. Floating point sucks, and Sage
inherits that:

sage: (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)
False

Whenever you perform an operation with two different types of "number,"
Sage essentially picks the worst of the two and performs the operation
there. Every example is basically,

1. Something returns a float where you don't expect it
2. You use that result with some other object
3. Now the other object is made of float
4. Nothing works in floating point

The conversion to float is consistent with all of the other intelligent
coercions that Sage can do (embedding ZZ into QQ, for example), so it
goes off silently. But floating point is a uniquely bad base ring for
most of Sage.
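
A one-line illustration of that silent coercion:

sage: parent(1/2 + 0.5)   # exact rational meets float; the float side wins
Real Field with 53 bits of precision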

Dima Pasechnik

Apr 17, 2023, 7:30:48 AM
to sage-...@googlegroups.com
for some tasks, at least, the built-in RBF and CBF do wonders, e.g. for the example earlier in the thread:


sage: A = matrix([[-3, 2, 1 ],
....: [ 2,-4, 4 ],
....: [ 1, 2,-5 ]])
sage: B = (2 * 0.5 * A)
sage: B.rank() # bad
3
sage: matrix(RBF,B).rank() # right!
2
sage: type(matrix(RBF,B)) # this is not even using what's available in arb for matrices
<class 'sage.matrix.matrix_generic_dense.Matrix_generic_dense'>


Emmanuel Briand

Apr 17, 2023, 8:04:00 AM
to sage-...@googlegroups.com
Interesting. The two inputs have different types

type(RR(11/10))
<class 'sage.rings.real_mpfr.RealNumber'>

type(RR(1.1))
<class 'sage.rings.real_mpfr.RealLiteral'>

Then from the documentation at https://doc.sagemath.org/html/en/reference/rings_numerical/sage/rings/real_mpfr.html


class sage.rings.real_mpfr.RealLiteral

Bases: RealNumber

Real literals are created in preparsing and provide a way to allow casting into higher precision rings.


Shouldn't RR(11/10) have the same fate?


Emmanuel



Nils Bruin

Apr 17, 2023, 11:59:38 AM
to sage-devel
On Monday, 17 April 2023 at 04:13:51 UTC-7 John Cremona wrote:
Even in the light of the detailed explanations already given, I do find it hard to see how these outputs are so different:

sage: RealField(200)(RR(11/10))
1.1000000000000000888178419700125232338905334472656250000000
sage: RealField(200)(RR(1.1))
1.1000000000000000000000000000000000000000000000000000000000

You can increase 200 to 1000: the second one gives all zeros (hooray!) while the first one gives those extra digits which are exactly 1/(5 * 2^51).  Both RR(11/10) and RR(1.1) have parent Real Field with 53 bits of precision, where they compare equal, but pushing them into RealField(200) gives different results.  And the "better" result (all 0s) comes from the input with the literal 1.1, not from the "rational" input 11/10.  To me, that is mysterious.  Can someone explain?

That one is surprising to me as well. The mechanism is easy to uncover:

sage: type(RR(1.1))
<class 'sage.rings.real_mpfr.RealLiteral'>
 
I suspect it's because the literal 1.1 advertises its "default" precision as 53 (as you remark) and therefore the coercion system probably concludes that for the conversion R(1.1) no further action is required. Hence, the same literal is returned with the funny side-effect that you mention.

We can ruin this effect quite easily by specifying the literal in such a way that it reports a different parent:

sage: RealField(200)(RR(1.100000000000000))
1.1000000000000000888178419700125232338905334472656250000000

and now it occurs with a different "magic intermediate precision":

sage: RealField(200)(RealField(54)(1.100000000000000))
1.1000000000000000000000000000000000000000000000000000000000

This is a rather complicated edge case that follows from the way literals are implemented. I'm not sure it's worth fixing; especially because the edge case is probably caused by the coercion framework: the implementation of RealLiteral probably doesn't even get a chance to do something.

One possible fix would be to change the reported parent of a RealLiteral. That might have performance consequences.
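
For what it's worth, the mechanism is visible on the element itself (the attribute name "literal" is from memory, so double-check it):

sage: x = 1.1                  # preparsed into RealNumber('1.1')
sage: type(x)
<class 'sage.rings.real_mpfr.RealLiteral'>
sage: x.literal                # the original string is kept on the element
'1.1'
sage: RealField(200)(x)        # so it can be re-read at the higher precision
1.1000000000000000000000000000000000000000000000000000000000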

kcrisman

Apr 17, 2023, 12:02:46 PM
to sage-devel
This has been an interesting discussion, because it brings up again the inherent tradeoffs present in writing general-purpose software.  We (not just Sage, but other similar systems) want something that is usable by those who need floating-point to do what it must do, as well as be intelligible to those who write 1/3 = 0.333 and assume it is correct.

Although the discussion is not about teaching, it seems that the worst-case scenario of users are in that context.  I think the compromise reached in Sage is acceptable for all teaching purposes I have ever had to encounter, for two reasons.

1) For many things, instructors need to provide an extra layer of sugar anyway.  I have a nice activity I stole from someone else for showing the effect of various matrices on simple graphics, and for students just seeing matrices for the first time, I am not going to expect them to program the whole thing yet.  In such cases, an instructor should understand at least the basics that can "go wrong" and shield users if necessary.  The rank example falls into this category - as soon as we are doing actual matrix computations "for realsies", students MUST be led to understand that numerical error is inherent.

Michael said, "But if you're learning linear algebra for the first time and typing a matrix into the Sage notebook, typing 0.5 instead of 1/2 should not ruin the entire assignment without so much as a warning."  Unfortunately, having taught this material to complete beginners many times, I have to say that anything that goes beyond integer values to learn the terminology should be mentioning this issue in class, even in an Excel-based math for business course (for example).  What level we go to in explaining why they should be careful will depend on their background, but explaining that Excel (or Sage, or whatever) has this limit is very important, even if we then say, "our examples will be selected to avoid this in our beginners' course, but be sure to work with your quantitative people/statisticians if you get weird answers."  Maybe that means Nils' allusion is right and rank needs to be disabled (or yield an error) on non-exact matrices?  I'd be loath to do that; I'd rather have to deal with the issue in class (and have done so, answering exactly these kinds of questions).  I have to talk about NOT using Excel to get determinants for this reason :-)


2) For other things, anyone using them should know the conventions.  The RealField(120)(RR(13/10)) is an example of this - no one who doesn't know what RealField is doing should be put in a position to even worry about this.  This particular example was discussed very fully above, but my point on it is that if someone encounters this before a numerical analysis course (in which case it is expected behavior), then someone is asking students to do a computation in a way that isn't appropriate for that course.  (That doesn't mean students shouldn't see this kind of thing earlier - there are some very nice examples with things like limits [1] which are very appropriate to talk about in calculus, but we don't talk about floating-point precision in detail, just that we have to be careful "trusting" the computer instead of using analytical tools.)


I used to feel more like the OP about " A user should be able to send any argument to a function, and it's the responsibility of the programmer to make sure one of two things happen ..." but when I found out why PARI does not return an error message when you ask for the primitive root of a number that doesn't have a primitive root (namely, efficiency in avoiding the process of checking for bad input), I modified that to take into account who the intended end users are.  Anyone using RealField(120) should be aware of these issues.  It's possible that the conventions are wrong, but they are spelled out.

Nils Bruin

Apr 17, 2023, 12:05:16 PM
to sage-devel
On Monday, 17 April 2023 at 05:04:00 UTC-7 Emmanuel Briand wrote:
Real literals are created in preparsing and provide a way to allow casting into higher precision rings.

Shouldn't RR(11/10) have the same fate?

No, because RR(11/10) is not a literal. In principle 11/10 could be a "rational literal", but we don't have that in Sage: a rational literal would bring no benefit, since rationals are already exact. By the time you apply a conversion to it (and in this case the conversion needs to do something!) it's definitely not a literal any more.

If you want a "real number" 11/10 that has these kinds of exact properties, RealLazyField()(11/10) is probably the way to go, but that field will have quite a different performance profile than RR does.
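
For example (a sketch; the printed digits should just be 11/10 correctly rounded at 200 bits):

sage: L = RealLazyField()
sage: t = L(11/10)           # kept exact under the hood
sage: RealField(200)(t)      # refined on demand; no 53-bit bottleneck
1.1000000000000000000000000000000000000000000000000000000000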

Dima Pasechnik

Apr 17, 2023, 12:14:12 PM
to sage-devel
I am beginning to think that more ball/interval arithmetic should be involved here, in particular since RealField is not much faster than RealBallField at the same precision, while the latter allows a degree of control over error propagation.
(and perhaps the default 53 bits ought to be bumped up to something more appropriate for the hardware in 2023).

If one wants fast floats, then RDF may be used.
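
For instance (repr from memory):

sage: RDF(2).sqrt()
1.4142135623730951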

Dima



aw

Apr 17, 2023, 5:12:43 PM
to sage-devel
To the folks defending RealField's behavior: the reasons you cite for why it's acceptable would also apply to low-precision environments like single or double precision.

But very high precision environments have a different purpose than those low-precision environments. So you shouldn't be thinking about them in the same way as low-precision environments.

In this post I'm going to highlight two critical things that are specific to very high precision environments, which I think are getting overlooked. I'll label them A and B.

(A), exact vs inexact:

The purpose of a very high precision environment like RealField(200) is to compute with exact quantities - some combination of ints, rationals, exact rationals like sqrt(2), or exact transcendentals like pi.

It would be nuts to pass RealField(200) an inexact quantity x that is knowable only to a much smaller number of bits, for example 23 bits, because you can never get more than 23 good bits from any expression involving x. There's no point in asking for anything more than 23, so it would be crazy and wrong to ask for 200.

In other words, when used correctly, very high precision environments like RealField(200) are for computing exact quantities, not approximate floating point quantities.

They are for exact computation, not inexact computation.

The answer they give is the exact answer truncated to a finite precision that we specify.

This is completely different from an inexact computation limited to a finite precision by the inexactness of its inputs.

I think you guys may have failed to appreciate this fundamental distinction, and that's what led you to mishandle float literals in RealField's input.

You guys were thinking, RealField is for inexact floating point, so we should treat the float literal as an inexact floating point quantity. What you should have been thinking is, RealField is for exact values that we will truncate at the end, so we should treat the float literal as an exact rational.

By the way, if you want some corroboration for my claim that very high precision environments are for exact computation, just look at how they are implemented.

The implementation of a very high precision environment will use integers *only*, so that the calculation can be exact. This says the environment is intended for doing exact computation with exact quantities.

Contrast that with the implementation of a lower precision environment like single or double precision: this will use machine floats, because the calculation doesn't need to be exact. These environments are intended for doing inexact computation with inexact quantities.


(B), required behavior:

I'm gonna get math-y for this one.

Let X be a high precision environment.
We can see X as a function which takes an input expression S along with a parameter P specifying the desired precision for the output. It produces an output value V, with that precision.

I claim that for X to be working correctly, the input-output pairs (S,V) must satisfy the following condition:
 
V = exact value of S truncated to precision P

An easy corollary of this is the following:

V is independent of the internal representation used by X

Another way to put that is, the internal representation used by X should never be visible in X's output.

This is required behavior. Implementations are not allowed to deviate from this. Any deviation is a mistake.


Ok, that was two critical things, (A) and (B), about very high precision environments.

Now let's re-visit the 2*1.1 example:

RealField(200)(2*1.1)
2.2000000000000001776356839400250464677810668945312500000000

There are two separate problems with this, corresponding to (A) and (B):

(1) the literal 1.1 is not treated as the exact rational 11/10
(2) the values of the junk digits in the middle of the output depend on the internal representation

This says there are at least two separate problems in RealField, as a very high precision environment:

(1') RealField can mishandle float literals passed to it
(2') RealField can wrongly allow its implementation details to get into its output

RealField's behavior doesn't meet the requirements for a very high precision environment.

I'm scratching my head trying to understand why you guys are defending RealField's behavior, instead of wanting to fix it.

Here's a question that might help: what kind of user would look at the output of RealField(200)(2*1.1) and say, "Yeah, that's awesome, that's exactly what I was looking for! Thanks, Sage!"

Describe the kind of user who would say that.

And then ask yourself, what percentage of Sage users are like that?

How about this: what percentage of beginner or student Sage users are like that?

-aw

aw

Apr 17, 2023, 5:58:49 PM
to sage-devel
fixing a typo: it should be "exact irrationals like sqrt(2)"

Nils Bruin

Apr 17, 2023, 6:05:56 PM
to sage-devel
On Monday, 17 April 2023 at 14:12:43 UTC-7 aw wrote:
(A), exact vs inexact:

The purpose of a very high precision environment like RealField(200) is to compute with exact quantities - some combination of ints, rationals, exact rationals like sqrt(2), or exact transcendentals like pi.
 But you WON'T be computing with exact quantities in RealField(200) unless your number can be expressed as +- an unsigned 200-bit integer times a power of two, so you're very fundamentally using the wrong tool if you're interested in exact computation.

If you're interested in exact arithmetic then do so! Go for symbolic in SR or go for RealLazyField. The ideas behind RealLazyField come a lot closer to what you seem to want: it uses (fast) floating point approximations when that allows it to decide what is asked and keeps symbolic relations at hand to use when those are required. Of course, bear in mind that not all questions are decidable symbolically; much less so by a specific implementation.

The main place where I know high-precision computation really finds an application is in transcendental computations where, for some reason, you know the number comes out algebraic. With very high precision computation you can then recognize the algebraic number. See for instance https://github.com/nbruin/RiemannTheta.

A baby example:

sage: pi200=RealField(200).pi()
sage: cos(pi200/13).algdep(8)
64*x^6 - 32*x^5 - 80*x^4 + 32*x^3 + 24*x^2 - 6*x - 1

This shows that cos(pi200/13) is remarkably close to satisfying a sextic equation with remarkably small coefficients (and the routine found a sextic equation even though it was asked to find one of degree at most 8). Doing this example with 53 bits of precision does not yield enough bits to recover the sextic relation given here.
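
One can sanity-check the relation by substituting back; the bound below is deliberately loose, and the actual residual should sit near the 200-bit noise floor:

sage: R.<x> = ZZ[]
sage: f = 64*x^6 - 32*x^5 - 80*x^4 + 32*x^3 + 24*x^2 - 6*x - 1
sage: f(cos(pi200/13)).abs() < 1e-50
True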

---
Concerning your literal example: Just write 2*RealField(200)(1.1) and you're good to go. You're still not representing 11/5 exactly because RealField(200) does not admit an exact representative,  but at least you'll have all the bits you can.

By writing RealField(200)(2*1.1) you're forcing python's hand to choose a precision, because 2*1.1 is NOT going to be a literal anymore. If you're not interested in getting floating point approximations then don't involve RealField(200).

Nils Bruin

Apr 17, 2023, 6:21:51 PM
to sage-devel
On Monday, 17 April 2023 at 14:58:49 UTC-7 aw wrote:
fixing a typo: it should be "exact irrationals like sqrt(2)"

Ah, NOW I see what the problem likely is: I think you're misled by the field being called "RealField". So ... yeah ... it's not. It should probably be called RealFloats or something like that. This has come up before, but would be a painful change to make : https://groups.google.com/g/sage-devel/c/ba9u_As3T4s/m/oXNgGozmAQAJ

With modern developments like RealLazyField, eventually Sage *may* have something that actually models computations in the real numbers. If you're interested in algebraic numbers you can already have them with AlgebraicField() and AlgebraicRealField() -- those actually work pretty well. Do be aware that testing equality in such fields may be unexpectedly expensive (but since in floats testing for equality is generally nonsensical, that's probably still a win).

These things are still rather experimental, though, so probably not quite ready to be defaults in SageMath.

Note that "real literals" like "1.1" are not suitable vehicles to enter any interesting exact real number anyway: you can't specify sqrt(2) exactly in that way. So you end up writing things like AA(2).sqrt(). Or AA(sqrt(2)) (the latter making intermediate use of SR). So I'm not so sure that decimal literals would coerce to exact decimal fractions even in a system where a "true" real field is available: quick and dirty floating point is historically just something that a lot of people expect to be available and people might start complaining if arithmetic with 1.1 starts blowing up in memory use.
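
For instance (the digits shown should agree with the well-known expansion of sqrt(2)):

sage: a = AA(2).sqrt()       # an exact algebraic real
sage: a^2 == 2               # equality is decided exactly, not numerically
True
sage: RealField(200)(a)      # approximations of any precision on demand
1.4142135623730950488016887242096980785696718753769480731767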

aw

Apr 17, 2023, 7:39:01 PM
to sage-devel
On Monday, April 17, 2023 at 4:05:56 PM UTC-6 Nils Bruin wrote:
 But you WON'T be computing with exact quantities in RealField(200) unless your number can be expressed as +- an unsigned 200-bit integer times a power of two, so you're very fundamentally using the wrong tool if you're interested in exact computation.

If properly implemented, it can emulate exact computation followed by a truncation to finite precision.

When I say a very high precision environment is for doing exact computation, I don't mean that it should handle infinite digit strings. I mean that the input-output function for such an environment should be identical to the input-output function of a hypothetical environment that *does* do the full exact computation and then truncates the answer to the requested precision.

In other words, a very high precision environment should emulate an environment that does the full exact computation and truncates the result to finite precision before giving it to the user. And, this emulation should be perfect.

That is the behavior that a very high precision environment is required to have.

For input expressions that don't have float literals in them, it looks like RealField does have the required behavior!

Example:

RealField(200)(sqrt(2))
1.4142135623730950488016887242096980785696718753769480731767

actual digits, from https://apod.nasa.gov/htmltest/gifcity/sqrt2.1mil:
1.41421356237309504880168872420969807856967187537694807317667...

RealField here gives the exact value of sqrt(2) rounded to 59 digits, which is 196 bits.

196 bits is slightly under the requested 200, but let's not get sidetracked by that here.

By the way, when I say "truncated", I mean any acceptable truncation method. Rounding is one such method.

The main point is that based on this and other similar examples involving algebraic irrationals or transcendentals like e and pi, RealField looks like it is trying hard to be a good high precision environment.

But for some input expressions, it fails to have the required behavior. It gives an output that is not any kind of truncation of the exact value.

Here's a question: if in pre-processing we interpret all float literals as rationals, as Dima Pasechnik suggested could be an option in an earlier post, do these failures all disappear? Does RealField then fully meet the behavior requirements for a very high precision environment?

If that one tiny change would make it a good environment, whose answers are always the exact answers truncated to the requested precision, then why the heck would you not make that change?

It would give you something more useful than you have now, at no cost.

-aw

 

aw

Apr 17, 2023, 7:39:01 PM
to sage-devel
On Monday, April 17, 2023 at 4:05:56 PM UTC-6 Nils Bruin wrote:
sage: pi200=RealField(200).pi()
sage: cos(pi200/13).algdep(8)
64*x^6 - 32*x^5 - 80*x^4 + 32*x^3 + 24*x^2 - 6*x - 1

This shows that cos(pi200/13) is remarkably close to satisfying a sextic equation with remarkably small coefficients
 
Very cool fact, thanks!

-aw 

Nils Bruin

Apr 17, 2023, 8:24:13 PM
to sage-devel
On Monday, 17 April 2023 at 16:39:01 UTC-7 aw wrote:
If properly implemented, it can emulate exact computation followed by a truncation to finite precision.

When I say a very high precision environment is for doing exact computation, I don't mean that it should handling infinite digit strings. I mean that the input-output function for such an environment should be identical to the input-output function of a hypothetical environment that *does* do the full exact computation and then truncates the answer to the requested precision.

In other words, a very high precision environment should emulate an environment that does the full exact computation and truncates the result to finite precision before giving it to the user. And, this emulation should be perfect.

Unfortunately, not for a fixed precision. If you're working with 200 bits precision, then you cannot express the roots of
x^2-3 and x^2-(3+2^-200) as different floating point numbers. They are too close for that. If you know that two numbers are roots of polynomials with integer coefficients, you can compute a lower bound (in terms of the coefficients) on their difference if they are distinct at all. With such a bound you can indeed conclude the numbers are equal if their approximations to the required accuracy agree. This has been implemented in AlgebraicField. So if you're interested in exact arithmetic with algebraic numbers you should use that. To some extent these techniques can be extended to certain transcendental numbers as well. This is more experimental, but an implementation is available in RealLazyField.

Fixed precision floating point is fundamentally unable to do this: the number of real numbers it can represent in the interval [1,2] is bounded (by the set precision), whereas that interval contains infinitely many rational, algebraic, and transcendental real numbers. So there are pairs of numbers in there it won't be able to represent with distinct representations.

(of course computers have only finite memory so they can only represent finitely many numbers as well, so things like AlgebraicField() will run out of memory for some computations. But the bounds that fixed-precision floating point imposes are much more rigid).

David Roe

Apr 17, 2023, 11:44:36 PM
to sage-...@googlegroups.com
It's certainly not no cost.  Here are several problems with this proposal:
* If you simply interpret float literals as rationals, they will print as rationals.  I would find this extremely surprising and annoying; it makes Sage more difficult to use as a basic calculator (I use Sage for my taxes for example).
* Something that Nils has alluded to a bit is performance.  The above issue could be resolved with some work by making a new kind of real field parent that stored elements internally as rational numbers (at least until you applied a transcendental function) but printed as decimals.  But it would then be susceptible to coefficient blowup, which can make some computations with rationals (especially in linear algebra) drastically slower than computations with floating point values.  Hiding the fact that real literals are implemented as rationals means that even people who are aware of such issues wouldn't be expecting them.  And you still face similar problems as soon as you apply a transcendental function, unless you go fully over to the symbolic ring.

So I think a concrete version of Dima's suggestion would be
1. Implement a new field that stores elements internally as rationals, printing as decimals to some precision (you could get a higher precision by casting explicitly to RealField(prec), or by using .n(prec)). A minimal sketch of this follows the list.
2. Real literals would land in this field, which would have coercion maps from Z and Q so arithmetic would be exact under the hood.
3. Coercion would go to the symbolic ring, so e^1.1 would live in the symbolic ring (rather than RealField(53) where it lives now).  To avoid annoyance it should probably still print as a 53-bit decimal, but you could cast it to RealField(200) and it would give the correct result.  I guess other transcendental functions (log, trig) would also land in the symbolics.
4. Live with the performance implications.  As Dima says, you can use RDF if you want fast floating point arithmetic.  But it's already the case that the symbolic ring is a performance trap for unwary users, and I'm not sure it's a good idea encouraging more use of it implicitly.
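
To make point 1 concrete, here is a deliberately naive sketch (the class name is made up, and a real version would be a proper Parent/Element pair with coercion maps):

class RationalWithDecimalRepr:
    def __init__(self, q, prec=53):
        self.q = QQ(q)        # the value is stored exactly as a rational
        self.prec = prec      # the precision is used only for printing
    def __repr__(self):
        return str(RealField(self.prec)(self.q))
    def __mul__(self, other):
        # naive: only handles products with its own type
        return RationalWithDecimalRepr(self.q * other.q, self.prec)

sage: x = RationalWithDecimalRepr(11/10)
sage: x                       # prints like a float
1.10000000000000
sage: x.q                     # but the underlying value is exact
11/10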

This is a nontrivial project, though not unreasonable.  I'm certainly not going to do it.  aw, if there was general agreement on an approach like this (I'm not sure if I think it's a good idea or not, but I could be convinced) would you be interested in implementing it?
David




Nils Bruin

Apr 18, 2023, 12:11:03 AM
to sage-devel
On Monday, 17 April 2023 at 20:44:36 UTC-7 David Roe wrote:

So I think a concrete version of Dima's suggestion would be
1. Implement a new field that stores elements internally as rationals, printing as decimals to some precision (you could get a higher precision by casting explicitly to RealField(prec), or by using .n(prec)).
2. Real literals would land in this field, which would have coercion maps from Z and Q so arithmetic would be exact under the hood.
3. Coercion would go to the symbolic ring, so e^1.1 would live in the symbolic ring (rather than RealField(53) where it lives now).  For lack of annoyance it should probably still print as a 53-bit decimal, but you could cast it to RealField(200) and it would give the correct result.  I guess other transcendental functions (log, trig) would also land in the symbolics.
4. Live with the performance implications.  As Dima says, you can use RDF if you want fast floating point arithmetic.  But it's already the case that the symbolic ring is a performance trap for unwary users, and I'm not sure it's a good idea encouraging more use of it implicitly.

We don't need to implement something new for this. AlgebraicField() does this for algebraic (and hence rational) numbers and RealLazyField tries to do this for a reasonable class of computable real numbers. AlgebraicRealField is not a reasonable field to use for a default "real field" because it misses common transcendental elements. But RealLazyField might be.

I'm not so sure floating point literals should land in it, though, because to me writing a number as a decimal fraction implies it is meant as an imprecise quantity (and the number of digits written is an indication of how many significant digits it has). I'm pretty sure I'm not alone in that expectation, so there is going to be a subset of users who would be upset that their floats end up not being floats any more.

I don't think there are interesting elements in RealLazyField that could be easily constructed from a decimal fraction that can't easily be constructed in a different way, and a lot of basic elements (like 1/3) cannot be constructed via decimal fractions, so I don't think using decimal fraction literals for RealLazyField elements makes sense. You'd teach people antipatterns that will paint them in a corner fairly soon.

It's certainly reasonable to not call a floating point field "RealField". C and python don't even do that: they call such elements floats. I'm less sure whether such a RealFloats field should be any less prominent than it is now. Plus the whole renaming transition would be super painful.

William Stein

Apr 18, 2023, 12:31:17 AM
to sage-...@googlegroups.com
On Mon, Apr 17, 2023 at 9:11 PM Nils Bruin <nbr...@sfu.ca> wrote:

It's certainly reasonable to not call a floating point field "RealField". C and python don't even do that: they call such elements floats. I'm less sure whether such a RealFloats field should be any less prominent than it is now. Plus the whole renaming transition would be super painful.

 I called it "RealField" in Sage for compatibility with Magma:


Dima Pasechnik

Apr 18, 2023, 9:25:07 AM
to sage-...@googlegroups.com
On Mon, Apr 17, 2023 at 3:47 AM aw <aw.phon...@gmail.com> wrote:
>
> On Saturday, April 15, 2023 at 9:12:09 PM UTC-6 Nils Bruin wrote:
>
> The design decision here to let the multiplication of a "RealLiteral" by a sage integer result in a "RealNumber" is because your use of RealLiteral was taken to signal intent to use floats over exact types. If you wanted exact results you could have used 2*11/10 .
>
>
> Even in an expression using floats, when a user types "1.1", they intend that to mean 11/10, exactly. They don't expect it to get changed to a different number, as RealField does.
>
> Here's what users expect when they type an expression into a higher-precision environment: they expect the answer to be the exact answer, truncated to the precision of that environment. Period. This is not negotiable. There is no latitude for the software to deviate from that.

No. A floating point environment is always fragile, due to
accumulation of rounding errors.
If you want to make sure you have control over error propagation, you
need ball arithmetic (RBF, RealBallField, in Sage), or interval
arithmetic (RIF in Sage).

>
> When I type 2*1.1 into an environment with a precision of 200 bits, I expect the answer to be the exact answer truncated to 200 bits, or about 59 digits.

How about 1.999999999999999999999999....99 ?
Would you also fall off your high horse in surprise?

>
> Here's what Sage gives me:
>
> RealField(200)(2*1.1)
> 2.2000000000000001776356839400250464677810668945312500000000
>
> How the heck can you defend that? Look at that junk!

that's what a system which does eager evaluation (e.g. Sage/Python)
will give you - assuming you accept that
2*1.1 is a floating point number. At the point when 2*1.1 is computed,
it does not know you want precision higher than the default.

If you want to delay evaluation of 2*1.1, you better use a compiler,
not an interpreter, or
an interpreter with lazy evaluation (note that both of these options
are slower to run, and harder to implement). Indeed,
a compiler would be able to see the precision needed here; lazy
evaluation would start with
"RealField(200)(...)" and (hopefully) propagate the precision to the
argument, "2*1.1".

>
> This violates some of the most basic expectations of users. No user expects junk like that to be in their answer.
>
> RealField is doing something completely unanticipated here, and worse, is doing it silently: it silently interprets "1.1" as the nearest 53-bit binary rational, which is not 1.1 but some other value. Hence the result is not 2.2, but some other value, a value with junk in it.
>
> Users should never be handed junk.
> And defaults should always be what most users expect (especially when it's 99%+ of users, as it is in the case).
>
> -aw
>

Dima Pasechnik

Apr 18, 2023, 9:38:46 AM
to sage-...@googlegroups.com
On Mon, Apr 17, 2023 at 10:12 PM aw <aw.phon...@gmail.com> wrote:
>
> To the folks defending RealField's behavior: the reasons you cite for why it's acceptable would also apply to low-precision environments like single or double precision.
>
> But very high precision environments have a different purpose than those low-precision environments. So you shouldn't be thinking about them in the same way as low-precision environments.
>
> In this post I'm going to highlight two critical things that are specific to very high precision environments, which I think are getting overlooked. I'll label them A and B.
>
> (A), exact vs inexact:
>
> The purpose of a very high precision environment like RealField(200) is to compute with exact quantities - some combination of ints, rationals, exact rationals like sqrt(2), or exact transcendentals like pi.
>
> It would be nuts to pass RealField(200) an inexact quantity x that is knowable only to a much smaller number of bits, for example 23 bits, because you can never get more than 23 good bits from any expression involving x. There's no point in asking for anything more than 23, so it would be crazy and wrong to ask for 200.
>
> In other words, when used correctly, very high precision environments like RealField(200) are for computing exact quantities, not approximate floating point quantities.
>
> They are for exact computation, not inexact computation.
>
> The answer they give is the exact answer truncated to a finite precision that we specify.
>
> This is completely different from an inexact computation limited to a finite precision by the inexactness of its inputs.
>
> I think you guys may have failed to appreciate this fundamental distinction, and that's what led you to mishandle float literals in RealField's input.
>
> You guys were thinking, RealField is for inexact floating point, so we should treat the float literal as an inexact floating point quantity. What you should have been thinking is, RealField is for exact values that we will truncate at the end, so we should treat the float literal as an exact rational.
>
> By the way, if you want some corroboration for my claim that very high precision environments are for exact computation, just look at how they are implemented.
>
> The implementation of a very high precision environment will use integers *only*, so that the calculation can be exact. This says the environment is intended for doing exact computation with exact quantities.


Hardware floats are no less exact than hardware integers. Arithmetic on hardware integers suffers from the same inevitable loss of precision or overflow, and in fact nowadays computations with integer data are in some cases done with hardware floats - they are often much faster, as they have parallel hardware capabilities.

>
> Contrast that with the implementation of a lower precision environment like single or double precision: this will use machine floats, because the calculation doesn't need to be exact. These environments are intended for doing inexact computation with inexact quantities.
>
>
> (B), required behavior:
>
> I'm gonna get math-y for this one.
>
> Let X be a high precision environment.
> We can see X as a function which takes an input expression S along with a parameter P specifying the desired precision for the output. It produces an output value V, with that precision.
>
> I claim that for X to be working correctly, the input-output pairs (S,V) must satisfy the following condition:
>
> V = exact value of S truncated to precision P
>
> An easy corollary of this is the following:
>
> V is independent of the internal representation used by X
>
> Another way to put that is, the internal representation used by X should never be visible in X's output.
>
> This is required behavior. Implementations are not allowed to deviate from this. Any deviation is a mistake.
>
>
> Ok, that was two critical things, (A) and (B), about very high precision environments.
>
> Now let's re-visit the 2*1.1 example:
>
> RealField(200)(2*1.1)
> 2.2000000000000001776356839400250464677810668945312500000000
>
> There are two separate problems with this, corresponding to (A) and (B):
>
> (1) the literal 1.1 is not treated as the exact rational 11/10
> (2) the values of the junk digits in the middle of the output depend on the internal representation
>
> This says there are at least two separate problems in RealField, as a very high precision environment:
>
> (1') RealField can mishandle float literals passed to it
> (2') RealField can wrongly allow its implementation details to get into its output
>
> RealField's behavior doesn't meet the requirements for a very high precision environment.
>
> I'm scratching my head trying to understand why you guys are defending RealField's behavior, instead of wanting to fix it.

you cannot get the behavior you want with eager evaluation, full stop.

>
> Here's a question that might help: what kind of user would look at the output of RealField(200)(2*1.1) and say, "Yeah, that's awesome, that's exactly what I was looking for! Thanks, Sage!"
>
> Describe the kind of user who would say that.
>
> And then ask yourself, what percentage of Sage users are like that?
>
> How about this: what percentage of beginner or student Sage users are like that?
>
> -aw
>

aw

Apr 18, 2023, 2:33:38 PM
to sage-devel
On Monday, April 17, 2023 at 6:24:13 PM UTC-6 Nils Bruin wrote:
On Monday, 17 April 2023 at 16:39:01 UTC-7 aw wrote:
If properly implemented, it can emulate exact computation followed by a truncation to finite precision.

When I say a very high precision environment is for doing exact computation, I don't mean that it should handle infinite digit strings. I mean that the input-output function for such an environment should be identical to the input-output function of a hypothetical environment that *does* do the full exact computation and then truncates the answer to the requested precision.

In other words, a very high precision environment should emulate an environment that does the full exact computation and truncates the result to finite precision before giving it to the user. And, this emulation should be perfect.

Unfortunately, not for a fixed precision. If you're working with 200 bits precision, then you cannot express the roots of
x^2-3 and x^2-(3+2^-200) as different floating point numbers. They are too close for that.

In that case the environment sees the two roots as having the same value, which to 200 bits may be absolutely correct. In that case the environment *is* perfectly emulating an exact calculation truncated or rounded to 200 bits, as it is required to do.

I think your comment here is based on a subtle mistake in thinking about the precision parameter. You're thinking of it as the precision of the internal calculations in the implementation of the environment. In effect, you're assuming a certain kind of implementation, something like a fixed grid of binary fractions. But the environment can use any implementation, not just that one.

For general environments, the right way to think of the precision parameter is, it's the required precision in the output. The internals of the environment are free to work in any precision, including infinite precision. Integer calculations are infinite precision. If the environment internals use only integer math to evaluate an expression and then convert the resulting value to finite precision to serve to the user, the only place the precision parameter was used was for formatting the output. The calculation itself was independent of the precision parameter.

Using infinite precision internally is ideal and should always be preferred. For expressions using only the operations *,/,+,- and only rational arguments, you can treat the rationals as pairs of ints and just do integer math, which is infinite precision. But for other expressions this isn't possible, eg a non-square integer to the power 1/2. In that case you have to use some kind of series expansion evaluated to only finitely many terms, or some kind of iterated map that converges to the right answer, but you only iterate it a finite number of times. This is working to only finite precision. In effect you choose an internal precision, by choosing the number of terms to evaluate in the expansion, or the number of times to iterate the map. You choose this internal precision to be *higher* than the user's requested precision, so that you can round your internal result to the user's requested precision and it is correct to that precision.
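
Here's the guard-bit pattern in miniature (the 20 extra bits are just an illustrative choice; MPFR does something similar internally for its transcendental functions):

sage: prec = 200
sage: guarded = RealField(prec + 20)(2).sqrt()   # work at a higher internal precision
sage: RealField(prec)(guarded)                   # then round to the requested precision
1.4142135623730950488016887242096980785696718753769480731767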
 
On Monday, April 17, 2023 at 6:24:13 PM UTC-6 Nils Bruin wrote:
Fixed precision floating point is fundamentally unable to do this: the number of real numbers it can represent in the interval [1,2] is bounded (by the set precision), whereas that interval contains infinitely many rational, algebraic, and transcendental real numbers. So there are pairs of numbers in there it won't be able to represent with distinct representations.

For each value of the precision parameter, there are only finitely many distinct numbers that the environment can output. But taking the union over all precision parameters, the environment can output infinitely many numbers - in fact, a dense subset of the rationals, hence a dense subset of the reals.

The fact that this set is dense in the reals is the whole basis for using finite precision strings of digits as arbitrarily good approximations to arbitrary real numbers!

Specify enough digits, and you can approximate a real number as closely as you like.

Set the precision of your environment high enough, and you can approximate exact real quantities as closely as you like.

RealField(200) alone can't do that.
But RealField() can.

And I think it does, except when I pass it an expression with a float literal, because in that case you guys change the literal's value to a 53-bit approximation, and that trashes, for all n>53, the n-bit approximation that RealField(n) is supposed to give us.

-aw
 

 

Dima Pasechnik

Apr 18, 2023, 3:01:03 PM
to sage-devel


On Tue, 18 Apr 2023, 19:33 aw, <aw.phon...@gmail.com> wrote:


On Monday, April 17, 2023 at 6:24:13 PM UTC-6 Nils Bruin wrote:
On Monday, 17 April 2023 at 16:39:01 UTC-7 aw wrote:
If properly implemented, it can emulate exact computation followed by a truncation to finite precision.

When I say a very high precision environment is for doing exact computation, I don't mean that it should handle infinite digit strings. I mean that the input-output function for such an environment should be identical to the input-output function of a hypothetical environment that *does* do the full exact computation and then truncates the answer to the requested precision.

In other words, a very high precision environment should emulate an environment that does the full exact computation and truncates the result to finite precision before giving it to the user. And, this emulation should be perfect.

Unfortunately, not for a fixed precision. If you're working with 200 bits precision, then you cannot express the roots of
x^2-3 and x^2-(3+2^-200) as different floating point numbers. They are too close for that.

In that case the environment sees the two roots as having the same value, which to 200 bits may be absolutely correct. In that case the environment *is* perfectly emulating an exact calculation truncated or rounded to 200 bits, as it is required to do.

I think your comment here is based on a subtle mistake in thinking about the precision parameter. You're thinking of it as the precision of the internal calculations in the implementation of the environment. In effect, you're assuming a certain kind of implementation, something like a fixed grid of binary fractions. But the environment can use any implementation, not just that one.

For general environments, the right way to think of the precision parameter is, it's the required precision in the output. The internals of the environment are free to work in any precision, including infinite precision. Integer calculations are infinite precision. If the environment internals use only integer math to evaluate an expression and then convert the resulting value to finite precision to serve to the user, the only place the precision parameter was used was for formatting the output. The calculation itself was independent of the precision parameter.

Using infinite precision internally is ideal and should always be preferred. For expressions using only the operations *,/,+,- and only rational arguments, you can treat the rationals as pairs of ints and just do integer math, which is infinite precision.


There are plenty of examples where it's impossible in practice to do this, as algebraic operations cause memory blowup, and unacceptable slowdown.
Sage or any other system is not running on an ideal all-powerful computer; there are resource limitations.

Are you familiar with computational complexity theory? Before accusing us of making "subtle mistakes", I suggest you familiarise yourself with it.




But for other expressions this isn't possible, eg a non-square integer to the power 1/2. In that case you have to use some kind of series expansion evaluated to only finitely many terms, or some kind of iterated map that converges to the right answer, but you only iterate it a finite number of times. This is working to only finite precision. In effect you choose an internal precision, by choosing the number of terms to evaluate in the expansion, or the number of times to iterate the map. You choose this internal precision to be *higher* than the user's requested precision, so that you can round your internal result to the user's requested precision and it is correct to that precision.
 
On Monday, April 17, 2023 at 6:24:13 PM UTC-6 Nils Bruin wrote:
Fixed precision floating point is fundamentally unable to do this: the number of real numbers it can represent in the interval [1,2] is bounded (by the set precision), whereas that interval contains infinitely many rational, algebraic, and transcendental real numbers. So there are pairs of numbers in there it won't be able to represent with distinct representations.

For each value of the precision parameter, there are only finitely many distinct numbers that the environment can output. But taking the union over all precision parameters, the environment can output infinitely many numbers - in fact, a dense subset of the rationals, hence a dense subset of the reals.

The fact that this set is dense in the reals is the whole basis for using finite precision strings of digits as arbitrarily good approximations to arbitrary real numbers!

Specify enough digits, and you can approximate a real number as closely as you like.

Set the precision of your environment high enough, and you can approximate exact real quantities as closely as you like.

RealField(200) alone can't do that.
But RealField() can.

And I think it does, except when I pass it an expression with a float literal, because in that case you guys change the literal's value to a 53-bit approximation, and that trashes, for all n>53, the n-bit approximation that RealField(n) is supposed to give us.

-aw
 

 


Nils Bruin

Apr 18, 2023, 3:31:33 PM
to sage-devel
On Tuesday, 18 April 2023 at 11:33:38 UTC-7 aw wrote:
On Monday, April 17, 2023 at 6:24:13 PM UTC-6 Nils Bruin wrote:
On Monday, 17 April 2023 at 16:39:01 UTC-7 aw wrote:
If properly implemented, it can emulate exact computation followed by a truncation to finite precision.

When I say a very high precision environment is for doing exact computation, I don't mean that it should handle infinite digit strings. I mean that the input-output function for such an environment should be identical to the input-output function of a hypothetical environment that *does* do the full exact computation and then truncates the answer to the requested precision.

In other words, a very high precision environment should emulate an environment that does the full exact computation and truncates the result to finite precision before giving it to the user. And, this emulation should be perfect.

Unfortunately, not for a fixed precision. If you're working with 200 bits precision, then you cannot express the roots of
x^2-3 and x^2-(3+2^-200) as different floating point numbers. They are too close for that.

In that case the environment sees the two roots as having the same value, which to 200 bits may be absolutely correct. In that case the environment *is* perfectly emulating an exact calculation truncated or rounded to 200 bits, as it is required to do.

I think your comment here is based on a subtle mistake in thinking about the precision parameter. You're thinking of it as the precision of the internal calculations in the implementation of the environment. In effect, you're assuming a certain kind of implementation, something like a fixed grid of binary fractions. But the environment can use any implementation, not just that one.

In such a computational model, there is no reason to carry around the 200 through the computations. In that case you just compute with the objects "lazily" and once you want to get 200 digits approximating the number you're interested in, you just ask "what is the 200 digit approximation to this number". You can then turn around and ask for the 400 digit approximation as well and instead of having to start from scratch, the system can just continue refining its approximation using the work it's already done. This is implemented already. In AlgebraicField for algebraic numbers and in RealLazyField for loads of computable transcendental numbers too.

The difficult question in computation doesn't tend to be "what is the 200 digit approximation to this number" it's "are these two quantities obtained through different computations actually equal". AlgebraicField and RealLazyField try to do that as well (but I suspect with limitations for the latter).

What is presently called RealField is fundamentally not that. In the medium future, it may well be the case that RealLazyField will feature much more prominently in sagemath as a computational model for real numbers, but the current implementation is still under quite active development and, while usable, not quite ready to be a *default* in sagemath.

Example:

sage: R=RealLazyField()
sage: a=cos(R.pi()/13)
sage: a.numerical_approx(200)
0.97094181742605202715698227629378922724986510573900358858764
sage: a.numerical_approx(400)
0.970941817426052027156982276293789227249865105739003588587644526477041708760006676932574606408384261554438205497500991925
sage: a.numerical_approx(800)
0.970941817426052027156982276293789227249865105739003588587644526477041708760006676932574606408384261554438205497500991925446304849693046415555347324930136179166290724598721026150846791024698868142304028116956960541355031116767265861274416574

Once again, someone who's looking for standard scientific computation tools and gets confronted with this instead will be very disappointed because performance will be way different and generally MUCH slower. So a general purpose computer algebra system would need both, as sagemath has.

aw

Apr 18, 2023, 5:20:04 PM
to sage-devel
On Tuesday, April 18, 2023 at 1:31:33 PM UTC-6 Nils Bruin wrote:
sage: R=RealLazyField()
sage: a=cos(R.pi()/13)
sage: a.numerical_approx(200)
0.97094181742605202715698227629378922724986510573900358858764
sage: a.numerical_approx(400)
0.970941817426052027156982276293789227249865105739003588587644526477041708760006676932574606408384261554438205497500991925
sage: a.numerical_approx(800)
0.970941817426052027156982276293789227249865105739003588587644526477041708760006676932574606408384261554438205497500991925446304849693046415555347324930136179166290724598721026150846791024698868142304028116956960541355031116767265861274416574

Nils, eager versus lazy is not the problem.

The problem is very simple: decimal literals like "1.1" are being mishandled when passed in expressions to RealField *or* RealLazyField.

Here, look:

RealField(200)(e^1.1)
3.0041660239464333947978502692421898245811462402343750000000

RealLazyField(e^1.1).numerical_approx(200)
3.0041660239464333947978502692421898245811462402343750000000

RealLazyField()((e^1.1)).numerical_approx(400)
3.00416602394643339479785026924218982458114624023437500000000000000000000000000000000000000000000000000000000000000000000

These outputs purport to be the value of e^1.1 to 200 or 400 bits, but they are not.

Any normal mathematician who writes "e^1.1" intends that to mean "e^(11/10)". They write "1.1" because it is less writing than 11/10. On a computer, 1.1 is 3 characters of typing, 11/10 is 5 characters of typing. So you find yourself typing 1.1 instead of 11/10.

It really is that simple.

Decimal literals, when typed into high-precison environments, are just a shorthand way of writing exact rationals.

You guys are silently giving "1.1" a different value, the nearest 53-bit approximation.

That value has no canonical relation to the exact rational.

That value has no canonical relation to the precision the user has requested.

It is an arbitrary value that no normal mathematician would choose themselves.

That is just a mistake. Plain and simple.

It is simple to fix.

So it should be fixed.

Here's how easy it is:

RealField(200)(e^(11/10))
3.0041660239464331120584079535886723932826810260162727621298

RealLazyField()((e^(11/10))).numerical_approx(200)
3.0041660239464331120584079535886723932826810260162727621298

RealLazyField()((e^(11/10))).numerical_approx(400)
3.00416602394643311205840795358867239328268102601627276212975286052863219354994639393702584656011004248292917244402984647

-aw

Dima Pasechnik

Apr 18, 2023, 5:29:03 PM
to sage-devel


On Tue, 18 Apr 2023, 22:20 aw, <aw.phon...@gmail.com> wrote:
On Tuesday, April 18, 2023 at 1:31:33 PM UTC-6 Nils Bruin wrote:
sage: R=RealLazyField()
sage: a=cos(R.pi()/13)
sage: a.numerical_approx(200)
0.97094181742605202715698227629378922724986510573900358858764
sage: a.numerical_approx(400)
0.970941817426052027156982276293789227249865105739003588587644526477041708760006676932574606408384261554438205497500991925
sage: a.numerical_approx(800)
0.970941817426052027156982276293789227249865105739003588587644526477041708760006676932574606408384261554438205497500991925446304849693046415555347324930136179166290724598721026150846791024698868142304028116956960541355031116767265861274416574

Nils, eager versus lazy is not the problem.

It is a problem, as e^1.1 cannot be represented exactly, and it is evaluated eagerly. To what precision should it be evaluated? To 200 bits?
Then you will complain that you can't get what you want at 400 bits. Etc etc.



aw

Apr 18, 2023, 5:59:30 PM
to sage-devel
On Tuesday, April 18, 2023 at 3:29:03 PM UTC-6 Dima Pasechnik wrote:

It is a problem, as e^1.1 cannot be represented exactly, and it is evaluated eagerly. To what precision should it be evaluated? To 200 bits?
Then you will complain that you can't get what you want at 400 bits. Etc etc.

"1.1" doesn't need to be evaluated, it needs to be replaced by "11/10", in pre-processing.

I don't know the code involved, but I'll take a stab at one way it could be done, something like this: when the Sage interpreter is looking at the user-supplied string "RealLazyField(200)(e^1.1)", it knows that it is about to pass a parse of the string "e^1.1" on to RealLazyField. In this situation, where the receiver of the parse is RealLazyField, the interpreter can easily change its usual parsing of "1.1". Instead of some kind of float, it can parse it as the exact rational 11/10, and send the resulting parse of the whole expression to RealLazyField for evaluation. To RealLazyField, it's as if the user supplied string was "RealLazyField(200)(e^(11/10))".
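
For what it's worth, the preparser already isolates the literal as a string before Python ever sees it, which looks like the natural hook (a quick check in a session):

sage: preparse("RealField(200)(e^1.1)")
"RealField(Integer(200))(e**RealNumber('1.1'))"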

The details of how this gets done, don't matter.

All that matters is that "1.1" gets replaced by "11/10".

Do that any way you want.

-aw

David Roe

Apr 18, 2023, 6:14:41 PM
to sage-...@googlegroups.com
Did you read my message from last night?  I highlighted exactly the problems with what you're suggesting.
David


aw

Apr 18, 2023, 6:16:00 PM
to sage-devel
On Tuesday, April 18, 2023 at 3:59:30 PM UTC-6 aw wrote:
... the user-supplied string "RealLazyField(200)(e^1.1)"...

Whoops, typo: initially I had RealField there, and that syntax is right.
Then I changed RealField to RealLazyField, to emphasize that the problem applies equally to it.
But I forgot to change the syntax.

Instead of "RealLazyField(200)(e^1.1)", it should be "RealLazyField()(e^1.1).numerical_approx(200)".
Instead of "RealLazyField(200)(e^(11/10))", it should be "RealLazyField()(e^(11/10)).numerical_approx(200)".

Sorry for that, was in a rush.

-aw

aw

unread,
Apr 18, 2023, 6:16:03 PM4/18/23
to sage-devel
On Tuesday, April 18, 2023 at 3:20:04 PM UTC-6 aw wrote:

RealLazyField(e^1.1).numerical_approx(200)

Whoops, typo. Should be:

RealLazyField()(e^1.1).numerical_approx(200)

Sorry.

-aw
 

Nils Bruin

unread,
Apr 18, 2023, 6:19:45 PM4/18/23
to sage-devel
On Tuesday, 18 April 2023 at 14:59:30 UTC-7 aw wrote:
On Tuesday, April 18, 2023 at 3:29:03 PM UTC-6 Dima Pasechnik wrote:

It is a problem, as e^1.1 cannot be represented exactly, and it is evaluated eagerly. To what precision should it be evaluated? To 200 bits?
Then you will complain that you can't get what you want at 400 bits. Etc etc.

"1.1" doesn't need to be evaluated, it needs to be replaced by "11/10", in pre-processing.

That's not going to change by default. There is already a way to spell 11/10: as "11/10". The present default is to use "1.1" as a designation of a floating-point literal, with an implied precision derived from the number of digits used to write down the mantissa. There are definitely people who are happy with that default, so the incentive to change it is rather low. In fact, I've never heard before the expectation that 1.1 would signify an exact quantity, and it was always drilled into me that one should write the appropriate number of significant digits when writing a decimal fraction, to indicate the number of digits that are claimed to be correct. It's unfortunate that you're not among the people who are happy with the convention used in sagemath (and maple, and maxima, and magma).

It may not be the default, but you can still have it! As referenced before, just execute upon startup:

old_RealNumber=RealNumber
def RealNumber(*args, **kwargs):
    return QQ(old_RealNumber(*args, **kwargs))

You can place it in a startup file so that it is in force for all your subsequent sessions. See: https://doc.sagemath.org/html/en/reference/repl/startup.html
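
For example, a sketch of the expected effect (the 200-bit digits below are the ones quoted elsewhere in this thread for e^(11/10)):

sage: old_RealNumber = RealNumber
sage: def RealNumber(*args, **kwargs):
....:     return QQ(old_RealNumber(*args, **kwargs))
sage: 1.1                       # every decimal literal is now an exact rational
11/10
sage: RealField(200)(e^1.1)     # now the same as e^(11/10)
3.0041660239464331120584079535886723932826810260162727621298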

Dima Pasechnik

unread,
Apr 18, 2023, 6:23:10 PM4/18/23
to sage-devel


On Tue, 18 Apr 2023, 22:59 aw, <aw.phon...@gmail.com> wrote:
On Tuesday, April 18, 2023 at 3:29:03 PM UTC-6 Dima Pasechnik wrote:

It is a problem, as e^1.1 cannot be represented exactly, and it is evaluated eagerly. To what precision should it be evaluated? To 200 bits?
Then you will complain that you can't get what you want at 400 bits. Etc etc.

"1.1" doesn't need to be evaluated, it needs to be replaced by "11/10", in pre-processing.

I don't know the code involved, but I'll take a stab at one way it could be done, something like this: when the Sage interpreter is looking at the user-supplied string "RealLazyField(200)(e^1.1)", it knows that it is about to pass a parse of the string "e^1.1" on to RealLazyField.


what you propose is lazy evaluation.
(there is no Sage interpreter; it's basically Python, which does not do lazy evaluation in such cases)

And this is better thought of as:

1) compute t=e^1.1
2) compute RealLazyField(200)(t)

When you do step 1), you need, in Python, to set a precision, implicitly or explicitly. (Same in Mathematica, by the way, you can do SetPrecision[...]).

When you do 2), and the precision chosen in 1) was not enough, tough luck (in Mathematica, too).
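
You can watch the two steps happen in a session (digits as quoted elsewhere in this thread; I write RealField(200) for the eager conversion):

sage: t = e^1.1                    # step 1: evaluated eagerly, at 53 bits
sage: t
3.00416602394643
sage: RealField(200)(t)            # step 2: the 53-bit value is only padded
3.0041660239464333947978502692421898245811462402343750000000
sage: RealField(200)(e^(11/10))    # exact input: correct at any precision
3.0041660239464331120584079535886723932826810260162727621298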



In this situation, where the receiver of the parse is RealLazyField, the interpreter can easily change its usual parsing of "1.1". Instead of some kind of float, it can parse it as the exact rational 11/10, and send the resulting parse of the whole expression to RealLazyField for evaluation. To RealLazyField, it's as if the user-supplied string were "RealLazyField(200)(e^(11/10))".

The details of how this gets done don't matter.

All that matters is that "1.1" gets replaced by "11/10".

Do that any way you want.

-aw


aw

unread,
Apr 18, 2023, 8:05:49 PM4/18/23
to sage-devel
On Tuesday, April 18, 2023 at 4:23:10 PM UTC-6 Dima Pasechnik wrote:

1) compute t=e^1.1
2) compute RealLazyField(200)(t)

When you do step 1), you need, in Python, to set a precision, implicitly or explicitly. (Same in Mathematica, by the way, you can do SetPrecision[...]).


If Python is evaluating e^1.1, then it is also evaluating e^(11/10), right?

If that's how it worked, we would never be able to get more than 53 correct bits (the mantissa of a Python double) from RealLazyField.

But we can: if we send in e^(11/10), we can get any number of correct bits, just as expected.

Here's another angle on it:

Python interprets 1.1 and 11/10 to have the same value. In a plain Python interpreter, version 3.10, no Sage involved:

1.1 == 11/10
True

from math import *
pow(e,1.1)==pow(e,11/10)
True

Hence if Python were evaluating e^1.1 and e^(11/10) and sending the result to RealLazyField, we should get the same output from RealLazyField. But we don't:

RealLazyField()((e^1.1)).numerical_approx(200)
3.0041660239464333947978502692421898245811462402343750000000

RealLazyField()((e^(11/10))).numerical_approx(200)
3.0041660239464331120584079535886723932826810260162727621298

Different answers here mean Sage *is* getting its hands on e^1.1 and e^(11/10).

By the way, I'm running these Sage commands in a Jupyter notebook. That notebook seems to think 1.1 and e are both Sage types:

type(1.1)
<class 'sage.rings.real_mpfr.RealLiteral'>

type(e)
<class 'sage.symbolic.constants_c.E'>

That makes it look like Sage has complete control over what happens here.

Is that not true?
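
Here is a quick check that separates the two effects (outputs as the MPFR padding should produce them). A bare literal keeps its decimal digits, because RealLiteral stores the source string; it is arithmetic on the literal that collapses to a plain 53-bit RealNumber, which then only gets zero-padded in binary:

sage: RealField(200)(1.1)          # the literal's digits are re-read at 200 bits
1.1000000000000000000000000000000000000000000000000000000000
sage: type(1.1 * 1)                # any arithmetic returns a plain RealNumber
<class 'sage.rings.real_mpfr.RealNumber'>
sage: RealField(200)(1.1 * 1)      # ...so only the 53-bit value survives
1.1000000000000000888178419700125232338905334472656250000000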

If for some reason Sage does not have full control, a workaround would be to pass expressions to RealField and RealLazyField as strings, and let those functions themselves parse the strings.

Parsing code for arithmetic expressions is standard and would be simple to copy into RealField and RealLazyField, or any other Sage function over whose interpretation of user-supplied expressions we want full control.
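
A sketch of that workaround (the helper name is made up; sage_eval is Sage's existing string evaluator):

def eval_exact(expr_string, prec):
    # Replace decimal literals with exact rationals, then evaluate the
    # whole expression symbolically and approximate at `prec` bits.
    import re
    from sage.misc.sage_eval import sage_eval
    exact = re.sub(r'(\d+)\.(\d+)',
                   lambda m: "(%d/%d)" % (int(m.group(1) + m.group(2)),
                                          10 ** len(m.group(2))),
                   expr_string)
    return sage_eval(exact).numerical_approx(prec)

eval_exact("e^1.1", 200)    # evaluates e^(11/10) to 200 bits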

By the way, Wolfram Alpha is not affected by this problem, so you can't really say it's impossible to do, right?

Alpha:
enter "e^1.1", press "more digits"
3.0041660239464331120584079535886723932826810260162727621297528605...

Sage:
RealLazyField()((e^1.1)).numerical_approx(200)
3.0041660239464333947978502692421898245811462402343750000000

Why is interpreting the 1.1 as 11/10 easy for Alpha, but hard for you guys?

-aw



aw

unread,
Apr 18, 2023, 8:06:03 PM4/18/23
to sage-devel
On Tuesday, April 18, 2023 at 4:14:41 PM UTC-6 David Roe wrote:
Did you read my message from last night?  I highlighted exactly the problems with what you're suggesting.
David

In that post you outlined some pretty heavy ways of doing it. Those would be a lot of work, with performance implications, integration implications, etc.

I don't want you guys to have to do a lot of work. I'm suggesting a simple pre-processing solution that would take almost no work, and would have zero performance implications or integration implications.

-aw

David Roe

unread,
Apr 18, 2023, 8:15:01 PM4/18/23
to sage-...@googlegroups.com
That's what I tried to explain in that post, and what Nils and Dima have explained as well.  You are incorrect that there are no performance or integration implications.  You are literally suggesting that we interpret 1.1 as 11/10, and I explained why that was problematic.
David



aw

unread,
Apr 18, 2023, 8:15:39 PM4/18/23
to sage-devel
On Tuesday, April 18, 2023 at 4:19:45 PM UTC-6 Nils Bruin wrote:

The present default is to use "1.1" as a designation of a floating-point literal, with an implied precision derived from the number of digits used to write down the mantissa.

This is true in some context, but not others.

In a physics or engineering context, everyone would see 1.1 as having only two significant figures.

In a *math* context, no one would see it as having only two significant figures.

Here's a hypothetical test you could do:

Suppose someone scrawls "1.1" on the blackboard in the math common room, and a mathematician walks in and sees it. What does he think its value is?

If you ask him to paraphrase it, he would say "11/10", "1 plus 1/10", or something else equivalent to 11/10.

It would not occur to him that it could also be a finite-precision floating point quantity, unless you told him that 11/10 was not the right interpretation.

In other words, the mathematician's default semantics for "1.1" is 11/10.

The physicist's default semantics is 1.1 * 10^0, only two significant figures.

Which semantics should Sage use?

I would say Sage is for math, so it should use the math semantics.

In high-precision environments like RealField(1000), Sage should *definitely* use the math semantics, because physics people, or engineers, or any other applied type folks, have zero use for 1000 bits of precision in anything that they do.

Only a math person, doing math, has a use for 1000 bits.

So RealField and RealLazyField should use the math semantics.

They should see 1.1 as 11/10.

-aw

aw

unread,
Apr 18, 2023, 8:15:53 PM4/18/23
to sage-devel
On Tuesday, April 18, 2023 at 4:19:45 PM UTC-6 Nils Bruin wrote:

It may not be the default, but you can still have it! As referenced before, just execute upon startup:

old_RealNumber=RealNumber
def RealNumber(*args, **kwargs):
    return QQ(old_RealNumber(*args, **kwargs))

You can place it in a startup file so that it is in force for all your subsequent sessions. See: https://doc.sagemath.org/html/en/reference/repl/startup.html

Thanks for this, but that would be a global substitution, right?

I'm arguing just for a local substitution, in the high-precision environments only.

-aw

 

David Roe

unread,
Apr 18, 2023, 8:18:17 PM4/18/23
to sage-...@googlegroups.com
It's not possible in Python to interpret 1.1 as 11/10 in a high-precision environment and as a normal float otherwise, unless you do something that "takes a lot of work," which you claimed you weren't trying to suggest.
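The closest you can get is a local wrapper, and even a sketch of one (hypothetical name, illustration only) shows why it buys nothing: it can only act on what it actually receives.

from sage.rings.real_mpfr import RealLiteral

def ExactRF(prec):
    R = RealField(prec)
    def convert(x):
        if isinstance(x, RealLiteral):
            return R(QQ(x))   # a bare literal: but R(x) already re-reads its
                              # decimal string, so this adds nothing new
        return R(x)           # anything else, like e^1.1, was already rounded
    return convert            # to 53 bits before this function was called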
David




William Stein

unread,
Apr 18, 2023, 8:27:50 PM4/18/23
to sage-...@googlegroups.com
On Tue, Apr 18, 2023 at 5:15 PM aw <aw.phon...@gmail.com> wrote:
> In high-precision environments like RealField(1000), Sage should *definitely* use the math semantics, because physics people, or engineers, or any other applied type folks, have zero use for 1000 bits of precision in anything that they do.

Search for the word "physics" here: https://www.mpfr.org/pub.html

Edgar Costa

unread,
Apr 18, 2023, 8:29:44 PM4/18/23
to sage-...@googlegroups.com
You keep saying

> Any normal mathematician who writes "e^1.1" intends that to mean "e^(11/10)".

To be honest, I don't think that is the case: if a paper presented you with some constant gamma = 1.1, you would not assume that gamma is 11/10, but rather that "1.1" is an approximation of gamma to 2 decimal digits.
That is the meaning of a finite decimal expansion!

A valid complaint could perhaps be: why are you casting gamma to 53 bits when I didn't give you that many? I think you know the answer to that question.

aw

unread,
Apr 18, 2023, 10:35:16 PM4/18/23
to sage-devel
Bill,

I flipped through several of those papers, and from what I could see, the amount of physics work that uses more than 256 bits is still small.

It may get more popular in future, but at the moment it's small.

The amount of work being done in 1000 bits is very small; in 10000 bits, probably zero.

So what I said was correct to zeroth-order, and probably to first order. Maybe I start to be wrong at second order?

When talking about how things are in the world, or what the ideal thing to do is when there are many options to choose from, it's hard to be exactly right. The most important thing is to get the zeroth-order right.

I think all of my posts here have gotten the zeroth-order right.

Almost all of the replies to my posts, including this one of yours, have nitpicked something at first-order or second-order and ignored the main, zeroth-order point I was making.

I'm not trying to insult you here, I'm just making a sociological observation, like a sociologist might make. You and your people have a *very strong* tendency to nitpick details instead of staying focused on the big picture.

Here's one part of that big picture:


Alpha:
enter "e^1.1", press "more digits"
3.0041660239464331120584079535886723932826810260162727621297528605...

Sage:
RealLazyField()((e^1.1)).numerical_approx(200)
3.0041660239464333947978502692421898245811462402343750000000

What do people think about Sage, when they do some checking and confirm that the Alpha answer is right and the Sage one is wrong?

Do they think good things about Sage, or bad things?

That's big picture stuff.

Your devs have been nitpicking me to death saying it can't be done, performance considerations, not possible in Python, etc etc.

Wolfram did it, so why not you guys?

It's funny to even say it that way, "Wolfram did it", because it's such a tiny, trivial thing. I'm pretty sure you guys could do it, you just don't want to, because of various first-order and second-order reasons.

The zeroth-order should override the first and second-order.

Because the zeroth-order matters more.

Anyway, I'm just trying to help here.

Like I said in my first post in this thread, I like the idea of Sage. I want it to do well. But it's pretty hard to be a booster for Sage when I look at the above two outputs, Alpha vs Sage.

Fix it or don't fix it. Up to you.

By the way, if you haven't already, you really should read all of my posts. There's a lot of good stuff in them, from various angles.

I have to get back to my main work.

Probably we will run into each other at some future date.

Later alligator,
-aw

William Stein

unread,
Apr 19, 2023, 12:37:44 AM4/19/23
to sage-...@googlegroups.com
On Tue, Apr 18, 2023 at 7:35 PM aw <aw.phon...@gmail.com> wrote:
> [...] You and your people have a *very strong* tendency to nitpick details instead of staying focused on the big picture.

You're right, respecting details is indeed a very strong
characteristic of professional mathematicians.

--
William (http://wstein.org)

G. M.-S.

unread,
Apr 19, 2023, 7:06:04 AM4/19/23
to sage-...@googlegroups.com

Dear SageMath developers,

Just to say a big Thank You! to you all for being utterly patient and polite and positive, even when we do not deserve it (I speak for myself).

Best,

Guillermo

kcrisman

unread,
Apr 19, 2023, 8:55:39 AM4/19/23
to sage-devel
Just to say a big Thank You! to you all for being utterly patient and polite and positive

 +100

kcrisman

unread,
Apr 19, 2023, 10:49:47 AM4/19/23
to sage-devel
Alpha:
enter "e^1.1", press "more digits"
3.0041660239464331120584079535886723932826810260162727621297528605...

Sage:
RealLazyField()((e^1.1)).numerical_approx(200)
3.0041660239464333947978502692421898245811462402343750000000


Sage is not Wolfram Alpha, and neither is Mathematica. Alpha is trying to get at something different.  It would be GREAT if there were an open source natural language processing thing of that type (a few attempts have been made, not necessarily with Sage, if I recall correctly), but to my knowledge there isn't.  But that is not Sage's goal.   Sage can obviously do this computation:

sage: z = e^(11/10)
sage: RealField(1000)(z)
3.0041660239464331120584079535886723932826810260162727621297528605...

The real issue is that you want 1.1 to be 11/10.  I think this is a reasonable thing to talk about, but it's also reasonable to recognize that it's not inherently wrong to tell the software users they aren't the same thing.

We have to have some convention (and here, "we" is more than just Sage) for how decimals are interpreted, and the convention for most such software seems to be that it is interpreted as a float of some kind.  I'm super careful to tell students the difference between what their calculator says and the fractions they may or may not represent, and that includes with material below that of calculus, and I certainly wouldn't encourage them to treat 1.1 the same as 11/10 in a system beyond a hand-held calculator.  1.1^x is not the same as (11/10)^x in a modeling situation; the latter implies you know the exact growth rate, the former is a model.  (I do not tell them about floating-point unless they are computer science majors or are likely to take additional mathematics where this becomes relevant, such as linear algebra.)

I strongly doubt I am the only teaching mathematician who does that.  Though it seems Dima has a potential approach, maybe we just have to agree to disagree on this.  To bring reasonably independent evidence, here is a quote from a book on Sage NOT written by a developer (Gregory Bard's "Sage for Undergraduates"):

"The number 0.5 is already a decimal and thus Sage assumes that it is a mere numerical approximation.  However, the number 1/2 ... " and (a few paragraphs earlier), "However, if our project becomes an important part in a larger program ... then wasting half the computation time would be exceptionally unwise."

That seems appropriate for this level in explaining the issue at hand.  This text is intended for high school and (mostly) college educators and their students, and is written by someone with plenty of non-Sage programming experience, but also plenty of practical teaching experience at an engineering school.  I hope it is clear from this example that reasonable people who care about end users can come to different conclusions about whether 1.1 should be treated the same as 11/10, and whether this is (at least in the educational context) something that can be effectively communicated to students.