unexpected int division


michel paul

Oct 29, 2011, 11:09:20 AM
to sage...@googlegroups.com
Here's a relatively minor issue that might not be minor for someone new to Sage.

In illustrating very simple probability as len(outcomes)/len(sample_space), integer division occurs, so the probability becomes 0.

Easy enough to correct - but it prompts discussion of why do we even have to bother with that?

In a class where we're interested also in pure bare bones Pythonic expression of ideas, this provides a nice example of how Python 2 and Python 3 differ, but for classes not interested in programming per se, just in 'math', it might be seen as a glitch.

--
==================================
"What I cannot create, I do not understand."

- Richard Feynman
==================================
"Computer science is the new mathematics."

- Dr. Christos Papadimitriou
==================================

David Joyner

Oct 29, 2011, 11:11:34 AM
to sage...@googlegroups.com
On Sat, Oct 29, 2011 at 11:09 AM, michel paul <mpau...@gmail.com> wrote:
> Here's a relatively minor issue that might not be minor for someone new to
> Sage.
> In illustrating very simple probability as len(outcomes)/len(sample_space),
> integer division occurs, so the probability becomes 0.


Can you give a specific example please?


> Easy enough to correct - but it prompts discussion of why do we even have to
> bother with that?
> In a class where we're interested also in pure bare bones Pythonic
> expression of ideas, this provides a nice example of how Python 2 and Python
> 3 differ, but for classes not interested in programming per se, just in
> 'math', it might be seen as a glitch.

> --
> You received this message because you are subscribed to the Google Groups
> "sage-edu" group.
> To post to this group, send email to sage...@googlegroups.com.
> To unsubscribe from this group, send email to
> sage-edu+u...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/sage-edu?hl=en.
>

michel paul

Oct 29, 2011, 11:19:18 AM
to sage...@googlegroups.com
On Sat, Oct 29, 2011 at 8:11 AM, David Joyner <wdjo...@gmail.com> wrote:
> On Sat, Oct 29, 2011 at 11:09 AM, michel paul <mpau...@gmail.com> wrote:
>> Here's a relatively minor issue that might not be minor for someone new to
>> Sage.
>> In illustrating very simple probability as len(outcomes)/len(sample_space),
>> integer division occurs, so the probability becomes 0.
>
> Can you give a specific example please?

Sure - typical simple intro - throwing two dice:

S = [(die1, die2) for die1 in [1..6] for die2 in [1..6]]
E = [throw for throw in S if sum(throw) == 7]

len(E)/len(S) will produce 0.  We can very easily use 1.0*len(E)/len(S), or we can use Integer(len(E))/len(S).  I prefer the latter.  Either way, a student response will be, "Why do we have to do that here?"
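
For readers following along outside Sage, the same experiment can be sketched in plain Python 3. This is an assumption on my part: the thread itself uses Sage on Python 2, and `[1..6]` is Sage syntax. Here the standard-library `fractions.Fraction` stands in for Sage's `Integer` to keep the quotient exact:

```python
from fractions import Fraction

# Sample space for two dice; range(1, 7) stands in for Sage's [1..6].
S = [(die1, die2) for die1 in range(1, 7) for die2 in range(1, 7)]
# Event: the two dice sum to 7.
E = [throw for throw in S if sum(throw) == 7]

# Exact probability, analogous to Integer(len(E))/len(S) in Sage.
p_exact = Fraction(len(E), len(S))
# Floating-point probability, analogous to 1.0*len(E)/len(S).
p_float = 1.0 * len(E) / len(S)

print(p_exact)  # 1/6
print(p_float)
```

The exact form is probably the one a 'math' class wants to see; the float form only approximates 1/6.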

- Michel

David Joyner

Oct 29, 2011, 11:35:41 AM
to sage...@googlegroups.com
On Sat, Oct 29, 2011 at 11:19 AM, michel paul <mpau...@gmail.com> wrote:
> On Sat, Oct 29, 2011 at 8:11 AM, David Joyner <wdjo...@gmail.com> wrote:
>>
>> On Sat, Oct 29, 2011 at 11:09 AM, michel paul <mpau...@gmail.com> wrote:
>> > Here's a relatively minor issue that might not be minor for someone new
>> > to
>> > Sage.
>> > In illustrating very simple probability as
>> > len(outcomes)/len(sample_space),
>> > integer division occurs, so the probability becomes 0.
>>
>>
>> Can you give a specific example please?
>
> Sure - typical simple intro - throwing two dice:
>
> S = [(die1, die2) for die1 in [1..6] for die2 in [1..6]]
> E = [throw for throw in S if sum(throw) == 7]
>
> len(E)/len(S) will produce 0.  We can very easily use 1.0*len(E)/len(S), or
> we can use Integer(len(E))/len(S).  I prefer the latter.  Either way, a
> student response will be, "Why do we have to do that here?"

I see what you mean:

sage: S = [(die1, die2) for die1 in [1..6] for die2 in [1..6]]
sage: E = [throw for throw in S if sum(throw) == 7]
sage: len(E); len(S); len(E)/len(S); (1*len(E))/(1*len(S))
6
36
0
1/6

I would note that this "feature" in Python was "fixed" in Python 3.0.
Sage has not yet upgraded to 3.0.


> - Michel
>
>> > Easy enough to correct - but it prompts discussion of why do we even
>> > have to
>> > bother with that?
>> > In a class where we're interested also in pure bare bones Pythonic
>> > expression of ideas, this provides a nice example of how Python 2 and
>> > Python
>> > 3 differ, but for classes not interested in programming per se, just in
>> > 'math', it might be seen as a glitch.

kcrisman

Oct 29, 2011, 8:31:09 PM
to sage-edu

> > Sure - typical simple intro - throwing two dice:
>
> > S = [(die1, die2) for die1 in [1..6] for die2 in [1..6]]
> > E = [throw for throw in S if sum(throw) == 7]
>
> > len(E)/len(S) will produce 0.  We can very easily use 1.0*len(E)/len(S), or
> > we can use Integer(len(E))/len(S).  I prefer the latter.  Either way, a
> > student response will be, "Why do we have to do that here?"
>
> I see what you mean:
>
> sage: S = [(die1, die2) for die1 in [1..6] for die2 in [1..6]]
> sage: E = [throw for throw in S if sum(throw) == 7]
> sage: len(E); len(S); len(E)/len(S); (1*len(E))/(1*len(S))
> 6
> 36
> 0
> 1/6

This is a great question for sage-edu. Do others have thoughts on
this? I'm really curious.

Notice that using Python `set`s doesn't help, and even using Sage
`Set`s doesn't help!

sage: S1 = Set(S)
sage: type(S1)
<class 'sage.sets.set.Set_object_enumerated_with_category'>
sage: S1.cardinality()
36
sage: type(S1.cardinality())
<type 'int'>

At least *this* is a bug, I think. I feel like a Sage method on a
Sage object should return Sage objects as much as possible.

> I would suggest that this "feature" in Python, was "fixed" in Python 3.0.
> Sage has not yet upgraded to 3.0.

And given that we haven't been able to even get to 2.7 yet, won't for
quite some time, most likely.

- kcrisman

Jason Grout

Oct 29, 2011, 10:40:03 PM
to sage...@googlegroups.com
On 10/29/11 10:35 AM, David Joyner wrote:
> I would suggest that this "feature" in Python, was "fixed" in Python 3.0.
> Sage has not yet upgraded to 3.0.

Luckily, Python includes a time machine to import things from the future ;)

sage: from __future__ import division


sage: S = [(die1, die2) for die1 in [1..6] for die2 in [1..6]]
sage: E = [throw for throw in S if sum(throw) == 7]

sage: len(E)/len(S)
0.16666666666666666
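
As a rough sketch of what that import changes: in Python 3 the behavior is the default, so no import is needed there. Assuming a plain Python 3 interpreter rather than Sage, `/` on two ints gives true division and `//` gives the floor quotient:

```python
# Python 3 semantics; on Python 2 the same result for `/` requires
# `from __future__ import division` at the top of the module.
n_event, n_space = 6, 36  # len(E) and len(S) from the dice example

true_quotient = n_event / n_space    # true division: always a float
floor_quotient = n_event // n_space  # floor division: stays an int

print(type(true_quotient).__name__, true_quotient)
print(type(floor_quotient).__name__, floor_quotient)  # int 0
```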

Thanks,

Jason

kcrisman

Oct 29, 2011, 11:24:06 PM
to sage-edu
Yeah, but won't that screw up some other stuff in Sage?

Plus, the "right" answer is 1/6, not n(1/6), and I think it's
reasonable to want to show that answer without needing to use Integer
or 1* (though perhaps not currently possible).

What about the cardinality being an int instead of an Integer? Just
curious what you think.

- kcrisman

Jason Grout

Oct 30, 2011, 12:53:57 AM
to sage...@googlegroups.com
On 10/29/11 10:24 PM, kcrisman wrote:
>
>
> On Oct 29, 10:40 pm, Jason Grout<jason-s...@creativetrax.com> wrote:
>> On 10/29/11 10:35 AM, David Joyner wrote:
>>
>>> I would suggest that this "feature" in Python, was "fixed" in Python 3.0.
>>> Sage has not yet upgraded to 3.0.
>>
>> Luckily, Python includes a time machine to import things from the future ;)
>>
>> sage: from __future__ import division
>> sage: S = [(die1, die2) for die1 in [1..6] for die2 in [1..6]]
>> sage: E = [throw for throw in S if sum(throw) == 7]
>> sage: len(E)/len(S)
>> 0.16666666666666666
>
> Yeah, but won't that screw up some other stuff in Sage?

I don't think it would screw up internal stuff.

>
> Plus, the "right" answer is 1/6, not n(1/6), and I think it's
> reasonable to want to show that answer without needing to use Integer
> or 1* (though perhaps not currently possible).

Most likely, we aren't going to change len() to return Sage Integers, so
we're stuck with Python ints, whose division (with the __future__ import)
returns a floating-point number in this case.

>
> What about the cardinality being an int instead of an Integer? Just
> curious what you think.

I'm curious about the combinat people's reasons. I don't have an
opinion on that yet, since I know those guys think long and hard about
most issues, so I'm sure there must be a good reason.

Thanks,

Jason

john_perry_usm

Oct 30, 2011, 12:15:56 PM
to sage-edu
On Oct 29, 10:09 am, michel paul <mpaul...@gmail.com> wrote:
> Here's a relatively minor issue that might not be minor for someone new to
> Sage.
>
> In illustrating very simple probability as len(outcomes)/len(sample_space),
> integer division occurs, so the probability becomes 0.

The origin of this behavior is that the designers of the C language
(RIP DMR) decided to use the slash to compute the quotient of integer
division. Not all computer languages do this; Pascal used DIV, while
Eiffel used a double slash (//). Python's designer, Guido van Rossum,
mostly followed C syntax with operators, a choice which is nice for
things like +=, understandable for != and the distinction between =
and ==, and (as you have discovered) infelicitous for integer
division.

> Easy enough to correct - but it prompts discussion of why do we even have to
> bother with that?

Efficiency. Let me slightly change your example with E and S:

sage: %timeit for each in xrange(100000): len(S)/len(E)
25 loops, best of 3: 27 ms per loop
sage: %timeit for each in xrange(100000): float(len(S))/len(E)
5 loops, best of 3: 49 ms per loop
sage: %timeit for each in xrange(100000): float(len(S))/float(len(E))
5 loops, best of 3: 60.1 ms per loop
sage: %timeit for each in xrange(100000): 1.0*len(S)/len(E)
5 loops, best of 3: 959 ms per loop

The answer is _correct_ each time for this problem. Notice that doing
it with ints is much, much faster: your fix, for example, is roughly 35
times slower than just using ints (959 ms versus 27 ms), because you're
using sage.rings.real_mpfr.RealLiteral, which has a lot of software
overhead. Dividing ints and floats is mostly hardware.

For a lot of computations, efficiency is a sufficient concern that you
want to stick with ints. Computing the quotient from integer division
is actually useful behavior. You don't want to switch to floats.
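
The numbers above come from Sage on Python 2. A comparable measurement in plain Python 3 could be sketched as follows. `Fraction` is only a stand-in for Sage's heavier number types (an assumption, not what was timed above), and absolute timings will differ from machine to machine:

```python
import timeit
from fractions import Fraction

n_space, n_event = 36, 6  # len(S) and len(E) from the dice example

# Each candidate computes 36 divided by 6 in a different number type.
candidates = {
    "int floor division": lambda: n_space // n_event,
    "float division": lambda: float(n_space) / n_event,
    "exact Fraction": lambda: Fraction(n_space, n_event),
}

# Time every candidate over the same number of repetitions.
timings = {
    label: timeit.timeit(func, number=100_000)
    for label, func in candidates.items()
}

for label, seconds in timings.items():
    print(f"{label:>20}: {seconds * 1000:.1f} ms per 100000 calls")
```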

As others have noted, Python 3.0 is fixing this by, apparently,
introducing Eiffel's // for the integer quotient. ;-)

regards
john perry

PS: For some reason this reminds me of http://www.elsop.com/wrc/humor/unixhoax.htm

Luiz Felipe Martins

Oct 30, 2011, 5:43:33 PM
to sage...@googlegroups.com
Could there be a function slen() that returns a Sage integer (analogous
to srange())? Or you can simply define it in your own worksheet:

def slen(lst):
    return Integer(len(lst))
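
Outside Sage, a similar helper can be sketched with the standard library. The name `exact_len` is hypothetical, and `Fraction` is an assumed substitute for Sage's `Integer`. The point of the sketch is that wrapping only one operand is enough, since a `Fraction` divided by an int stays exact:

```python
from fractions import Fraction

def exact_len(seq):
    """Return len(seq) as a Fraction so that later division stays exact."""
    return Fraction(len(seq))

S = [(a, b) for a in range(1, 7) for b in range(1, 7)]
E = [throw for throw in S if sum(throw) == 7]

# Only the numerator is wrapped; len(S) is still a plain int.
probability = exact_len(E) / len(S)
print(probability)  # 1/6
```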

William Stein

Oct 30, 2011, 6:26:58 PM
to sage...@googlegroups.com

// is in Python 2.x for integer quotient. That's what we should
always use for floor division of integers in the Sage library. It'll
continue to work fine *when* we transition to Python 3, which I see
happening during 2012.

I would like to clarify something. When you wrote above that the reason we
don't have / being int floor division by default is "Efficiency.",
that might suggest that having it be anything else would be slower.
However, that is definitely not the case. See below, where when we
switch to division giving floats, the benchmark is unchanged (i.e.,
just as fast).

sage: S = [(die1, die2) for die1 in [1..6] for die2 in [1..6]]
sage: E = [throw for throw in S if sum(throw) == 7]

sage: timeit('len(E)/len(S)')
625 loops, best of 3: 278 ns per loop
sage: len(E)/len(S)
0


sage: from __future__ import division

sage: timeit('len(E)/len(S)')
625 loops, best of 3: 272 ns per loop
sage: len(E)/len(S)
0.16666666666666666

I was at a talk that Guido van Rossum gave at SciPy 2006 on P3k, and
somebody (me?) asked him about Python using floor division for integer
division by default, and he humbly called it a design mistake on his
part.

> regards
> john perry
>
> PS: For some reason this reminds me of http://www.elsop.com/wrc/humor/unixhoax.htm
>


--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

john_perry_usm

Oct 31, 2011, 12:29:04 AM
to sage-edu
William

On Oct 30, 5:26 pm, William Stein <wst...@gmail.com> wrote:
> // is in Python 2.x for integer quotient.

Ouch. I misread a webpage, badly.

I guess this quote from the Python website answers the original
question: "Because of severe backwards compatibility issues, not to
mention a major flamewar on c.l.py, we propose the following
transitional measures (starting with Python 2.2):" www.python.org/dev/peps/pep-0238/

> I would like to clarify something.  When you wrote above that the reason we
> don't have / being int floor division by default is "Efficiency."

Here I was thinking *only* about why the Sage interpreter doesn't
automatically cast, as the proposer had suggested, not about the use of
floats per se. I was actually surprised by the performance of
float(len(E))/float(len(S)) at first, and reasoned it was due to
overhead from the interpreter.

If I'm wrong here, too, please do correct me...

john

dimpase

Nov 13, 2011, 1:32:06 AM
to sage...@googlegroups.com
This is more a pedagogical issue. When I'm teaching a maths class with some programming, I always stress that one of the particular things to be learned is some programming, and that this is potentially very useful, perhaps more useful than the particular maths concepts I cover. (Given that the majority of our graduates don't get to use much maths in their jobs afterwards, this makes perfect sense, too.) YMMV.

Dmitrii