Python integer divisions should be fractions


Martin Teichmann

May 14, 2021, 1:26:29 PM
to sympy
Hi fans of sympy,

often when I am using sympy I accidentally just type "1/2" to mean one half. All of you know that sympy does not like that... So I asked myself, does it need to be like that? So I played around a bit with the Python interpreter, and just made integer divisions return fractions instead of floats. That should keep everybody happy: fractions can easily be converted to floats, they actually are automatically converted in most cases, but still sympy happily can do symbolic math!


Maybe somebody is interested and would like to contribute to the discussion.

To show what it looks like, here is an example interaction (this is the patched interpreter running with stock sympy; no modifications to sympy were necessary):

    >>> from sympy import symbols, sqrt, sin
    >>> x = symbols("x")
    >>> sin(1/2)
    sin(1/2)
    >>> 1/2 + x
    x + 1/2
    >>> 2/3 * x
    2*x/3
    >>> sqrt(1/4)
    1/2

Cheers

Martin

Aaron Meurer

May 14, 2021, 3:00:19 PM
to sympy
The problem is that this would affect all integer divisions anywhere
in Python, which would be a huge compatibility break. I'm sure not
even SymPy's own test suite would pass all tests under this patched
interpreter.

Even if you don't care about backwards compatibility, it's generally a
bad idea to make integer division give fractions. In most use cases,
you do want floats. The problem with fractions is that they tend to
grow unwieldy if you aren't careful. Here's an anecdote from Guido on
why he didn't make Python work like this in the first place, even
though ABC, the language Python is based on, does (from
https://python-history.blogspot.com/2009/02/early-language-design-and-development.html)

"Numbers are one of the places where I strayed most from ABC. ABC had
two types of numbers at run time; exact numbers which were represented
as arbitrary precision rational numbers and approximate numbers which
were represented as binary floating point with extended exponent
range. The rational numbers didn’t pan out in my view. (Anecdote: I
tried to compute my taxes once using ABC. The program, which seemed
fairly straightforward, was taking way too long to compute a few
simple numbers. Upon investigation it turned out that it was doing
arithmetic on numbers with thousands of digits of precision, which were
to be rounded to guilders and cents for printing.) For Python I
therefore chose a more traditional model with machine integers and
machine binary floating point. In Python's implementation, these
numbers are simply represented by the C datatypes of long and double
respectively."
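
The growth described in the anecdote is easy to reproduce. The following is a minimal sketch using the standard library's `fractions.Fraction` as a stand-in for exact rational arithmetic (it is not ABC and not SymPy); the harmonic sum is just an illustrative choice of workload:

```python
from fractions import Fraction

# Add 1/1 + 1/2 + ... + 1/100 exactly, with no rounding at any step.
total = Fraction(0)
for k in range(1, 101):
    total += Fraction(1, k)

# The exact result drags along a very long denominator,
# while the float form of the same value fits in a machine double.
denominator_digits = len(str(total.denominator))
approx = float(total)
print(denominator_digits, approx)
```

Every further operation on `total` has to carry those digits along, which is the cost the anecdote is about.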

A more reasonable approach if you want to make 1/2 give a rational is
to only parse user-entered expressions. This can be done using
sympy.init_session(auto_int_to_Integer=True), which will cause integer
literals entered by the user to automatically be wrapped with
sympy.Integer.
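
To illustrate the idea of wrapping integer literals before the code runs, here is a rough sketch using only the standard library. It rewrites the token stream so that every integer literal becomes `Fraction(<literal>)`; `Fraction` stands in for `sympy.Integer`, and the helper name `wrap_int_literals` is hypothetical. This is not presented as SymPy's actual implementation.

```python
import io
import tokenize
from fractions import Fraction


def _is_int_literal(text):
    """Return True for integer literals such as 12, 0x1f or 1_000."""
    try:
        int(text, 0)
    except ValueError:
        return False
    return True


def wrap_int_literals(source):
    """Rewrite source so each integer literal n becomes Fraction(n)."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NUMBER and _is_int_literal(tok.string):
            out.extend([
                (tokenize.NAME, "Fraction"),
                (tokenize.OP, "("),
                (tokenize.NUMBER, tok.string),
                (tokenize.OP, ")"),
            ])
        else:
            out.append((tok.type, tok.string))
    return tokenize.untokenize(out)


# The division now happens between Fraction objects, so nothing is rounded.
exact = eval(wrap_int_literals("1/2 + 1/3"), {"Fraction": Fraction})
# Float literals are left alone, so mixed input still produces a float.
mixed = eval(wrap_int_literals("0.5 + 1/2"), {"Fraction": Fraction})
print(exact, mixed)
```

Because the rewrite is applied only to text the user typed, library code that relies on ordinary float division is unaffected.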

Aaron Meurer
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/a8836626-dc1a-4c14-9c36-7b30672d5b3en%40googlegroups.com.

gu...@uwosh.edu

May 14, 2021, 3:59:19 PM
to sympy

I want to second Aaron's comment. Please just use `sympy.init_session(auto_int_to_Integer=True)` if you want that behavior. As a scientist who uses python to process large data sets I do not want it to bog down trying to do exact calculations. Most datasets only have a few significant figures anyway. Python should not be changed.

However, I think there could be a fruitful discussion of whether sympy should by default cast integer input to sympy integers. I personally prefer that behavior, so set it when I am doing something where I care.

Jonathan

Aaron Meurer

May 14, 2021, 4:33:54 PM
to sympy
On Fri, May 14, 2021 at 1:59 PM gu...@uwosh.edu <gu...@uwosh.edu> wrote:
>
>
> I want to second Aaron's comment. Please just use `sympy.init_session(auto_int_to_Integer=True)` if you want that behavior. As a scientist who uses python to process large data sets I do not want it to bog down trying to do exact calculations. Most datasets only have a few significant figures anyway. Python should not be changed.
>
> However, I think there could be a fruitful discussion of whether sympy should by default cast integer input to sympy integers. I personally prefer that behavior, so set it when I am doing something where I care.

SymPy already does pretty aggressively cast all input types into SymPy
types. The issue is that in something like 2/3*x, Python evaluates
this as (2/3)*x, where the 2/3 part is completely Python types, so it
gets evaluated before we have any control over it. It is only when the
"*x" part happens that SymPy is able to cast it to a SymPy type, but
by then, it has already gone through the inexact division. This is why
instead doing 2*x/3 does create an exact result, because this is
evaluated as first 2*x, where the Symbol('x') is able to cast the 2 to
a SymPy Integer type, then the /3 is able to be divided by this SymPy
type and SymPy can do the division exactly.
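
The evaluation order can be made visible without SymPy. In this small sketch, `Probe` is a hypothetical stand-in for a symbol that does nothing except record the operands Python hands to it:

```python
class Probe:
    """Hypothetical stand-in for a symbol; it only records its operands."""

    def __init__(self, seen=()):
        self.seen = tuple(seen)

    def __rmul__(self, other):      # called for `other * probe`
        return Probe(self.seen + (other,))

    def __truediv__(self, other):   # called for `probe / other`
        return Probe(self.seen + (other,))


x = Probe()
float_first = 2 / 3 * x    # parsed as (2 / 3) * x: the division runs on plain ints
symbol_first = 2 * x / 3   # parsed as (2 * x) / 3: the probe receives both ints

print(float_first.seen)
print(symbol_first.seen)
```

In the first expression the probe only ever sees a single float; in the second it sees the integers 2 and 3, which is why a library can keep that form exact.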

This is only an issue for division of integers because it's the only
operation on Python types that loses precision. Something like 2*3*x
also works the same way where the 2*3 is evaluated by Python first
before it gets to SymPy, but in that case, it doesn't matter because
the answer is exact, even for large integers.
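
A quick way to check this claim with the standard library alone (a sketch; `Fraction(some_float)` is used here only because it exposes the exact value a float stores):

```python
from fractions import Fraction

# Integer multiplication stays exact no matter how large the operands get.
product = 2 * 3 * 10**40

# Integer division produces a binary float, which cannot hold 2/3 exactly.
quotient = 2 / 3
stored = Fraction(quotient)            # the exact value the float holds
is_exact = stored == Fraction(2, 3)

# The stored value is some integer over a power of two, never 2/3 itself.
d = stored.denominator
denominator_is_power_of_two = (d & (d - 1)) == 0
print(product, is_exact, denominator_is_power_of_two)
```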

This is just the way Python works. It is an eagerly evaluated
language. There's not much we can do about it as a library.

Aaron Meurer

Oscar Benjamin

May 14, 2021, 5:04:12 PM
to sympy
On Fri, 14 May 2021 at 21:33, Aaron Meurer <asme...@gmail.com> wrote:
> SymPy already does pretty aggressively cast all input types into SymPy
> types. The issue is that in something like 2/3*x, Python evaluates
> this as (2/3)*x, where the 2/3 part is completely Python types, so it
> gets evaluated before we have any control over it. It is only when the
> "*x" part happens that SymPy is able to cast it to a SymPy type, but
> by then, it has already gone through the inexact division. This is why
> instead doing 2*x/3 does create an exact result, because this is
> evaluated as first 2*x, where the Symbol('x') is able to cast the 2 to
> a SymPy Integer type, then the /3 is able to be divided by this SymPy
> type and SymPy can do the division exactly.
>
> This is only an issue for division of integers because it's the only
> operation on Python types that loses precision. Something like 2*3*x
> also works the same way where the 2*3 is evaluated by Python first
> before it gets to SymPy, but in that case, it doesn't matter because
> the answer is exact, even for large integers.
>
> This is just the way Python works. It is an eagerly evaluated
> language. There's not much we can do about it as a library.

There is an obvious fix that SymPy could make, but it wouldn't be backwards compatible. I think it would be better if sympifying a float used nsimplify to convert the float to a Rational. Users are stung by the problems of Float all the time, not only from using integer division but also from thinking that an expression like 0.5 or 0.1 should be exact. It's not hard to make this change in SymPy:

diff --git a/sympy/core/sympify.py b/sympy/core/sympify.py
index ed5ba267d9..23250877e5 100644
--- a/sympy/core/sympify.py
+++ b/sympy/core/sympify.py
@@ -353,6 +353,12 @@ def sympify(a, locals=None, convert_xor=True, strict=False, rational=False,
 
     if isinstance(a, CantSympify):
         raise SympifyError(a)
+
+    if isinstance(a, float):
+        from sympy import nsimplify
+        from sympy.core.numbers import Float
+        return nsimplify(Float(a))
+
     cls = getattr(a, "__class__", None)
     if cls is None:
         cls = type(a)  # Probably an old-style class


With that we get e.g.:

In [2]: 1/2*x
Out[2]:
x
─
2

In [3]: 0.5*x
Out[3]:
x
─
2

In [4]: from math import sqrt

In [5]: sqrt(2)
Out[5]: 1.4142135623730951

In [6]: sqrt(2)*x
Out[6]: √2⋅x



--
Oscar

Aaron Meurer

May 14, 2021, 5:27:22 PM
to sympy
On Fri, May 14, 2021 at 3:04 PM Oscar Benjamin
<oscar.j....@gmail.com> wrote:
> There is an obvious fix that SymPy could do but it wouldn't be backwards compatible. I think it would be better if sympifying a float used nsimplify to convert the float to a Rational. Users are stung by the problems of Float all the time just from either using integer division but also from thinking that an expression like 0.5 or 0.1 should be exact. It's not hard to make this change in SymPy:

This would work for simple cases, but since the float is an
approximation, there would always be cases where it would guess the
input incorrectly. From the Zen of Python, "In the face of ambiguity,
refuse the temptation to guess."
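
A small illustration of how such a guess can go wrong. This sketch uses the standard library's `Fraction.limit_denominator` as a stand-in for any "find a nearby simple fraction" heuristic; it is not `nsimplify`, and the bound of 10 is an arbitrary choice for the example:

```python
from fractions import Fraction

# Suppose the user really meant the four-digit decimal 0.3333,
# for example a measured value.
measured = 0.3333
intended = Fraction("0.3333")          # 3333/10000

# A heuristic that prefers small denominators silently "improves" it to 1/3.
guess = Fraction(measured).limit_denominator(10)
print(guess, intended)
```

Raising the bound changes which inputs are misread, but some inputs are always misread, because the float alone does not say what the user meant.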

Here's an example of a handheld calculator that does this sort of
thing, and the sorts of weirdness it can cause
https://www.youtube.com/watch?v=7LKy3lrkTRA.

Aaron Meurer

Oscar Benjamin

May 14, 2021, 6:01:46 PM
to sympy
That's a great video but it's describing something very different. There he types exact input into the calculator and then the calculator somehow makes a numeric/symbolic approximation that goes wrong. I'm not talking about changing anything for exact input. I'm only referring to the first step of passing a float to sympy. Maybe we need a more limited version of nsimplify for this. I can certainly see the reasons for not trying to convert float to Rational but I can also see how much of a problem it is for users. The "power users" will not have much of a problem converting their input to Float if that's what they meant.

--
Oscar

Aaron Meurer

May 14, 2021, 6:36:53 PM
to sympy
On Fri, May 14, 2021 at 4:01 PM Oscar Benjamin
I see it as the same thing. If you write 2/3*x, the *input* is exact,
but the output is 0.666666666666667*x. The exact input gets
approximated somewhere in the process. That approximation is
necessarily lossy, and any kind of nsimplify would be inexact. It gets
even worse if we start trying to guess more than just rationals, like
sqrt(2.) or pi/2, which is what the calculator in the video is trying
to do. The Casio calculator has a precise way to enter exact inputs,
but clearly doesn't store those exactly, as that would require some
sort of computer algebra system, which it doesn't have. So it does
something very similar to what Python does, and evaluates the
expression to a single floating point number before processing it.

Ultimately, the problem is "refuse the temptation to guess". This
would apparently fix one gotcha, but introduce a whole class of new
gotchas. And the difference is that the new ones would be much more
subtle and surprising, even to advanced users. The existing behavior
is at least understandable once you understand how Python's evaluation
model works.

What would be useful is if Python somehow stored the exact input that was
used to create an expression, which SymPy could then use to reverse
the approximate result into an exact output. I don't follow the Python
core development closely enough to say if this has been discussed
before.

Aaron Meurer


David Bailey

May 15, 2021, 5:02:11 PM
to sy...@googlegroups.com
On 14/05/2021 23:01, Oscar Benjamin wrote:

> That's a great video but it's describing something very different. There he types exact input into the calculator and then calculator somehow makes a numeric/symbolic approximation that goes wrong. I'm not talking about changing anything for exact input. I'm only referring to the first step of passing a float to sympy. Maybe we need a more limited version of nsimplify for this. I can certainly see the reasons for not trying to convert float to Rational but I can also see how much of a problem it is for users. The "power users" will not have so much of a problem converting their input to Float if that's what they meant.


Yes, but surely that would just lead to a new set of anomalies because the float has finite precision. I mean it would be possible to translate floats back into fractions for simple fractions like 3/4, but what about 123456789123456788/123456789123456789? Also, people sometimes pass intentional floats to SymPy without being power users!
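
The fraction mentioned above makes a concrete test case. In this sketch, `Fraction.limit_denominator` again stands in for a float-to-rational recovery step (an assumption for illustration, not SymPy's behaviour):

```python
from fractions import Fraction

exact = Fraction(123456789123456788, 123456789123456789)

# Ordinary Python division rounds the quotient to the nearest double.
as_float = 123456789123456788 / 123456789123456789

# Even a very generous denominator bound cannot bring the original back,
# because the information was already lost in the division.
recovered = Fraction(as_float).limit_denominator(10**18)
print(as_float, recovered, recovered == exact)
```

The quotient differs from 1 by less than half the spacing between adjacent doubles near 1, so the float is exactly `1.0` and any recovery step can only return 1.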

Wouldn't it be possible to persuade the Python team to implement auto_int_to_Integer outside of IPython - it can't be hard to do?

David




Aaron Meurer

May 16, 2021, 12:43:13 AM
to sympy
My argument is that it's really only a good idea to do this in an
interactive environment. In that case, you are probably using
something like IPython or Jupyter notebook.

It would in theory be possible to make it work in a Python script
using a codec trick
(https://rahul.gopinath.org/post/2019/12/25/python-macros/). At that
point, you are more or less creating your own DSL.
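
As a rough picture of what preprocessing a script's source could look like, here is a sketch that uses an `ast.NodeTransformer` rather than the codec trick from the linked post (a different mechanism, chosen because it fits in a few lines). `Fraction` stands in for a SymPy number type, and `WrapIntLiterals` and `eval_exact` are hypothetical names:

```python
import ast
from fractions import Fraction


class WrapIntLiterals(ast.NodeTransformer):
    """Replace every integer literal n in the tree with the call Fraction(n)."""

    def visit_Constant(self, node):
        if type(node.value) is int:          # excludes bool, float, str, ...
            call = ast.Call(
                func=ast.Name(id="Fraction", ctx=ast.Load()),
                args=[node],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def eval_exact(source):
    """Evaluate an expression after wrapping its integer literals."""
    tree = ast.parse(source, mode="eval")
    tree = ast.fix_missing_locations(WrapIntLiterals().visit(tree))
    return eval(compile(tree, "<exact>", "eval"), {"Fraction": Fraction})


print(eval_exact("2/3 + 1/6"))
print(eval_exact("0.25 + 1/4"))
```

Code run this way no longer has standard Python semantics for `/`, which is the sense in which it becomes its own small DSL.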

For any kind of upstream change to Python itself, you have to remember
that 99% of Python users don't want this feature. So you'd have to
implement it in a way that doesn't affect them.

Aaron Meurer


Chris Smith

May 16, 2021, 1:05:58 AM
to sympy
There is an `nsimplify` function that you can link up to a function that will change floats to Rationals:

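from sympy import Float, nsimplify, symbols

x = symbols('x')

def nrat(e, maxden=10):
    """Replace Floats in e by Rationals whose denominator is at most maxden."""
    reps = {}
    for i in e.atoms(Float):
        r = nsimplify(i, rational=True)
        if r.q <= maxden:  # leave the Float alone if the guess is not a simple fraction
            reps[i] = r
    return e.xreplace(reps)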

>>> nrat(1/2 + 13/17*x)
0.764705882352941*x + 1/2
>>> nrat(1/2 + 13/17*x, 20)
13*x/17 + 1/2
>>> nrat(x**(1/2))
sqrt(x)

/c

David Bailey

May 16, 2021, 7:20:58 AM
to sy...@googlegroups.com
On 16/05/2021 05:42, Aaron Meurer wrote:
> My argument is that it's really only a good idea to do this in an
> interactive environment. In that case, you are probably using
> something like IPython or Jupyter notebook.
>
> It would in theory be possible to make it work in a Python script
> using a codec trick
> (https://rahul.gopinath.org/post/2019/12/25/python-macros/). At that
> point, you are more or less creating your own DSL.
>
> For any kind of upstream change to Python itself, you have to remember
> that 99% of Python users don't want this feature. So you'd have to
> implement it in a way that doesn't affect them.
>
> Aaron Meurer

Well to be clear, I was primarily disagreeing with Oscar's suggestion,
which (very unusually!) seemed a seriously retrograde step - trying to
recover fractions from floating point representations.

I suppose I find it frustrating that while SymPy run from Python
provides something very, very close to a general algebraic manipulation
language, it has problems inputting expressions like 2/3*x. I could
imagine some potential users encountering that 'feature' on their first
experiments with SymPy, and just abandoning SymPy without further thought.

If Python gave users a hook that would let them preprocess every line of
input, that would solve this problem and perhaps be really useful for
some other Python based applications. I really don't know the politics
of whether such a change would be accepted or not - so it may not be
worth further discussion.

David

gu...@uwosh.edu

May 16, 2021, 12:09:34 PM
to sympy
I agree with Aaron that trying to keep exact expressions makes the most sense in an interactive environment (IPython, Jupyter, etc.). SageMath uses a preparser before passing the code to Python. I'm not sure if SymPy could adopt this.

Jonathan