The opportunity of Python 3 migration


Thierry

Sep 1, 2019, 3:14:51 PM
to sage-...@googlegroups.com
Hi,

it seems to me that Python 3 migration should not only be a syntax
adaptation (like print('blah')), unicode, or the mitigation of issues
related to the fact that different objects are not always comparable.

It should also take into account some deep changes in the logic.


Lists vs Iterables
------------------

In Python 2, lists are a central type. In Python 3, many of them are
replaced by lazy iterables (iterators or views), including the very
important range():

Python 2:

In [1]: range(3)
Out[1]: [0, 1, 2]

Python 3:

In [1]: range(3)
Out[1]: range(0, 3)
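
A small sketch of what this change means in practice under Python 3. Strictly speaking, the object returned by range() is a lazy sequence rather than an iterator: no list is built, but it can still be indexed and traversed several times. The variable names below are only illustrative.

```python
from collections.abc import Iterator, Sequence

r = range(3)

# range objects are lazy sequences: no list is built up front
is_sequence = isinstance(r, Sequence)
is_iterator = isinstance(r, Iterator)

# They can be traversed more than once, unlike a one-shot iterator
first_pass = list(r)
second_pass = list(r)

# An explicit iterator over the range is exhausted after one pass
it = iter(r)
consumed = list(it)
leftover = list(it)

print(is_sequence, is_iterator)   # True False
print(first_pass, second_pass)    # [0, 1, 2] [0, 1, 2]
print(consumed, leftover)         # [0, 1, 2] []
```

Code that relied on range() being a list (for instance by appending to it) has to ask for one explicitly with list(range(3)).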


In Sage, there are a lot of blah() methods that return lists or tuples,
which have a blah_iterator() counterpart.

For me, in such situations, the Python 3 migration should let blah()
return an iterator and deprecate blah_iterator(). To follow this
Python 3 change, we should by default avoid returning lists when we can
return iterators.
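
As a rough illustration of that proposal, here is a minimal sketch in plain Python. The Container class and its methods are hypothetical stand-ins, not actual Sage code, and the standard-library warnings module is used only to keep the example self-contained (Sage has its own deprecation helpers).

```python
import warnings


class Container:
    """Hypothetical stand-in for a Sage object; not an actual Sage class."""

    def __init__(self, items):
        self._items = list(items)

    def blah(self):
        # Return a lazy iterator instead of building a new list
        return iter(self._items)

    def blah_iterator(self):
        # Old name kept for a transition period; warns and delegates to blah()
        warnings.warn(
            "blah_iterator() is deprecated, use blah() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.blah()


c = Container([1, 2, 3])
as_list = list(c.blah())  # callers who really need a list ask for one explicitly
```

The cost of this design is that callers who index or call len() on the result of blah() must wrap it in list(), which is the same adaptation Python 3 asks for with map() or zip().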


Copies vs Views
---------------

In Python 3, many basic methods no longer return copies of (part of) an
object, but views that remain linked to that object.

Here is an example with dicts:

Python 2:

In [1]: d = {1:2, 3:4}

In [2]: d
Out[2]: {1: 2, 3: 4}

In [3]: k = d.keys()

In [4]: k
Out[4]: [1, 3]

In [5]: d[5] = 6

In [6]: d
Out[6]: {1: 2, 3: 4, 5: 6}

In [7]: k
Out[7]: [1, 3]

Python 3:

In [1]: d = {1:2, 3:4}

In [2]: d
Out[2]: {1: 2, 3: 4}

In [3]: k = d.keys()

In [4]: k
Out[4]: dict_keys([1, 3])

In [5]: d[5] = 6

In [6]: d
Out[6]: {1: 2, 3: 4, 5: 6}

In [7]: k
Out[7]: dict_keys([1, 3, 5])


Our methods should also reflect this logic when migrating to Python 3.

A few examples: vertices() and edges() of graphs should not be lists,
but keep links to the graph itself. Similarly for rows and columns of
matrices, as is already the case in numpy:

In [1]: import numpy as np

In [2]: M = np.array([[1, 2], [3, 4]])

In [3]: M
Out[3]:
array([[1, 2],
       [3, 4]])

In [4]: row = M[0]

In [5]: row
Out[5]: array([1, 2])

In [6]: M[0,0] = 5

In [7]: M
Out[7]:
array([[5, 2],
       [3, 4]])

In [8]: row
Out[8]: array([5, 2])
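
To make the idea of a vertices() view concrete, here is a minimal sketch built on collections.abc.Set. TinyGraph and VertexView are hypothetical toy classes invented for this example, not Sage's Graph API; the point is only that the object returned by vertices() stays linked to the graph, like dict_keys stays linked to its dict.

```python
from collections.abc import Set


class VertexView(Set):
    """Read-only, live view on the vertices of a (hypothetical) graph."""

    def __init__(self, graph):
        self._graph = graph

    def __contains__(self, v):
        return v in self._graph._adj

    def __iter__(self):
        return iter(self._graph._adj)

    def __len__(self):
        return len(self._graph._adj)


class TinyGraph:
    """Toy graph storing adjacency in a dict; not Sage's Graph class."""

    def __init__(self):
        self._adj = {}

    def add_vertex(self, v):
        self._adj.setdefault(v, set())

    def vertices(self):
        # Return a view linked to the graph, not a copied list
        return VertexView(self)


G = TinyGraph()
G.add_vertex(1)
V = G.vertices()
G.add_vertex(2)      # added after the view was created
print(sorted(V))     # [1, 2]
```

As with dict views, a caller who wants a frozen snapshot can still write list(G.vertices()).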


One might argue that it will break backward compatibility, but
Sage-Python2 user code will break anyway, and users will have to work on
adapting their scripts when Python 3 becomes Sage's default.

Hence, more generally, I wonder whether one could take the opportunity
of Python 3 becoming the default to fix all the other inconsistencies of
Sage that are kept forever in the name of backward compatibility.

This would imply not making Python 3 the default too early; let us not
miss this opportunity!

Ciao,
Thierry

Luca De Feo

Sep 1, 2019, 4:15:26 PM
to sage-...@googlegroups.com
+1 to this!

mmarco

Sep 1, 2019, 5:23:04 PM
to sage-devel
I think the change you propose is reasonable. However, it sounds like a lot of work, and support for python2 ends in just a few months. We should aim to publish a python3-based release that passes all tests before the end of the year.

Do you think it is reasonable to do these deep changes in this short time? If it is, then I definitely vote for it.

If it is not, then I would propose to leave it as a longer-term goal (hopefully not much longer), and once we have a fully python3-based Sage, write the guidelines you gave down in the development guide so that new code follows this approach. Then we can port the old code to the new models you suggest, without the rush of a deadline.

Nils Bruin

Sep 1, 2019, 5:50:48 PM
to sage-devel
On Sunday, September 1, 2019 at 12:14:51 PM UTC-7, Thierry (sage-googlesucks@xxx) wrote:
> Hi,
>
> it seems to me that Python 3 migration should not only be a syntax
> adaptation (like print('blah')), unicode, or the mitigation of issues
> related to the fact that different objects are not always comparable.

Indeed, there are opportunities there. However, large overhaul projects have a mixed track record of succeeding, and py2-py3 migration needs to happen: we can't afford to screw it up. So I'm in favor of changing the logic in sage to better align with py3 philosophy but I'm not in favour of doing it together with migrating to py3. We should do that in whatever way makes for the best manageable transition. Once we're on py3, we don't have to worry about py2 compatibility anymore, so changing to py3 semantics will be easier to accomplish.

 

Luca De Feo

Sep 1, 2019, 7:23:20 PM
to sage-...@googlegroups.com
I think the Lists vs Iterables item is totally doable in time. The rest
is maybe more complicated, but even if it's only 50% done it's ok.

Luca

David Coudert

Sep 2, 2019, 2:34:17 AM
to sage-devel

> A few examples: vertices() and edges() of graphs should not be lists,
> but keep links to the graph itself.

Jeroen Demeyer

Sep 4, 2019, 2:43:29 PM
to sage-...@googlegroups.com
On Sun, Sep 1, 2019 at 9:14 PM Thierry <sage-goo...@lma.metelu.net> wrote:
> It should also take into account some deep changes in the logic.

I agree in general. But in my opinion, it's mostly independent of
porting Sage to Python 3. We could do those changes now or next year,
it doesn't really matter.

E. Madison Bray

Sep 18, 2019, 7:49:39 AM
to sage-devel
On Sun, Sep 1, 2019 at 9:14 PM Thierry <sage-goo...@lma.metelu.net> wrote:
>
Agreed, and that's just the beginning! E.g. there are so many
possibilities for use of type annotations in Sage :) See
https://docs.python.org/3/library/typing.html

Simon King

Sep 20, 2019, 8:39:20 AM
to sage-...@googlegroups.com
Hi!

Although late...

On 2019-09-01, mmarco <mma...@unizar.es> wrote:
> ...
>
> Do you think it is reasonable to do these deep changes in this short time?
> If it is , then i definitely vote for it.
>
> If it is not, then I would propose to leave it as a longer term goal
> (hopefully not much longer), and once we have a fully python3 bases sage,
> write the guidelines you gave down in the development guide so the new code
> follows this approach.

I think those things should definitely be written down in some guidelines!

Since I am currently migrating my main Sage project from python-2 to
python-"2 and 3", I know that not all parts of the change of user code
are trivial.
I'd find it helpful to have some documents explaining the state of the
art in Sage (similar to some documents we have for our coercion
framework).

I have a question on Cython, though: When functions/methods will more
often return an iterator instead of a list/tuple, what is the "cdef"
type of an iterator? I.e., how can one tell Cython that a particular
object is an iterator, so that Cython can produce faster code for it?

Best regards,
Simon

Nils Bruin

Sep 20, 2019, 10:18:10 AM
to sage-devel
I'm pretty sure cython can't. An iterator is a python object that adheres to a certain protocol (it provides "next" and "iter" is idempotent on it), not a particular type. There wouldn't be any optimizations that can be made with just that knowledge. You are of course free to make your own cdef class that adheres to the iterator protocol and in addition provides a c-level shortcut into its "next" functionality. That could save you some python function call overhead in cython code where you know that you'll have an iterator of that specific type.
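
The protocol described above can be sketched in plain Python 3, where the "next" method is spelled __next__. Countdown is a hypothetical class made up for this example; a Cython version would be a cdef class with an extra C-level method, which is not shown here.

```python
from collections.abc import Iterator


class Countdown:
    """Minimal iterator: counts down from n to 1."""

    def __init__(self, n):
        self._n = n

    def __iter__(self):
        # iter() on an iterator returns the iterator itself
        return self

    def __next__(self):
        if self._n <= 0:
            raise StopIteration
        value = self._n
        self._n -= 1
        return value


it = Countdown(3)
same_object = iter(it) is it
# Structural check: Countdown does not inherit from Iterator
recognised = isinstance(it, Iterator)
values = list(it)
print(same_object, recognised, values)  # True True [3, 2, 1]
```

The isinstance check succeeds purely because the two methods exist, which illustrates why "iterator" is a protocol and not a concrete type that a compiler could specialise on.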

In particular, the kind of "optimization" that iterators give over lists is a rather high-level one: by removing the requirement of O(N) memory usage, your code may be (asymptotically) more efficient because you don't need to claim so much memory. Whether it's worth it for special cases (and small to moderate N) depends on the particular application. On python level it seems to work rather well. I'm not so sure that this remains the case on cython level, where a whole slew of optimizations for list access are readily available.
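
A rough way to see the memory argument at the Python level is sketched below. N and the variable names are arbitrary choices for the illustration, and sys.getsizeof only measures the container object itself, not the integers it refers to, so the numbers are indicative rather than a benchmark.

```python
import sys

N = 100_000

as_list = [i * i for i in range(N)]   # all N results are held at once
as_gen = (i * i for i in range(N))    # results are produced one at a time

list_size = sys.getsizeof(as_list)    # grows with N
gen_size = sys.getsizeof(as_gen)      # small, independent of N

total = sum(as_gen)                   # consumes the generator in a single pass
print(gen_size < list_size)           # True
```

Note that the generator is exhausted after the call to sum(), whereas the list can be traversed again; that is the other half of the trade-off.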