Re: numpy vs. in-place optimizations


Robert Bradshaw

Oct 30, 2007, 4:48:09 PM
to Carl Witty, sage-...@googlegroups.com, sage-s...@googlegroups.com
On Oct 30, 2007, at 1:11 PM, Carl Witty wrote:

> On Tue, 2007-10-30 at 11:24 -0700, Robert Bradshaw wrote:
>> I don't see this one--could you send me a link?
>>
>> I didn't know we used SAGE objects in any numpy classes.
>
> Numpy provides a type of matrices of arbitrary Python objects.
>
> http://groups.google.com/group/sage-support/browse_thread/thread/96a5f3d336285146/a40540fd8a468ef3#a40540fd8a468ef3

Thanks. I really should subscribe to sage-support I guess, but I'm
cross-posting to sage-devel 'cause I think it's an important issue.

This is due to the in-place operator code using refcounts to
determine whether it's safe to mutate. The simple workaround is to not use
numpy arrays of SAGE objects. Another question is why one would do so
(i.e. what is lacking in the SAGE linear algebra types?). I think the
benefits of in-place operators in terms of performance are too great to
abandon, but I am at a loss to find a clean solution so that external
extension classes don't mess them up... other than perhaps a global
option that can be enabled/disabled.
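[Editor's note: the heuristic can be sketched in plain Python. This is only an illustration of the refcount trick under CPython, not Sage's actual code, which lives at the C level in coerce.pxi; the function names and the list-based example are invented for this sketch.]

```python
import sys

def _callpath_refcount(obj):
    # Refcount as seen one call deep; this includes whatever references
    # the call machinery itself holds (which varies across CPython versions).
    return sys.getrefcount(obj)

# Baseline: an object reachable *only* through the call machinery.
_BASE = _callpath_refcount(object())

def maybe_inplace_concat(a, b):
    """Concatenate b onto a, reusing a's storage when nobody else holds a."""
    # _BASE + 1 allows exactly one extra reference: the caller's variable.
    if sys.getrefcount(a) <= _BASE + 1:
        a.extend(b)      # no one else can observe this mutation
        return a
    return a + b         # a is shared elsewhere: build a fresh list
```

The numpy failure mode follows directly: if an extension type stores a pointer to `a` without the refcounts the heuristic expects, the check above wrongly concludes the object is unshared and mutates it.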

- Robert

Carl Witty

Oct 30, 2007, 9:29:13 PM
to sage-devel
On Oct 30, 1:48 pm, Robert Bradshaw <rober...@math.washington.edu>
wrote:

> This is due to the in-place operator code using refcounts to
> determine whether it's safe to mutate. The simple workaround is to not use
> numpy arrays of SAGE objects. Another question is why one would do so
> (i.e. what is lacking in the SAGE linear algebra types?). I think the
> benefits of in-place operators in terms of performance are too great to
> abandon, but I am at a loss to find a clean solution so that external
> extension classes don't mess them up... other than perhaps a global
> option that can be enabled/disabled.

I can think of lots of "solutions", although I don't know which of
them might be worth the effort.

1) a global enable/disable
2) hook into __import__ to notice imports of numpy, and turn off the
optimization globally if numpy is ever imported
3) patch numpy to add the incref/decref operations
4) patch numpy to turn off the optimization on entry to dangerous
functions, and turn the optimization back on at exit

Carl

Carl Witty

Oct 30, 2007, 11:21:40 PM
to sage-devel
On Oct 30, 6:29 pm, Carl Witty <cwi...@newtonlabs.com> wrote:
> On Oct 30, 1:48 pm, Robert Bradshaw <rober...@math.washington.edu>
> wrote:
>
> > This is due to the in-place operator code using refcounts to
> > determine whether it's safe to mutate. The simple workaround is to not use
> > numpy arrays of SAGE objects. Another question is why one would do so
> > (i.e. what is lacking in the SAGE linear algebra types?). I think the
> > benefits of in-place operators in terms of performance are too great to
> > abandon, but I am at a loss to find a clean solution so that external
> > extension classes don't mess them up... other than perhaps a global
> > option that can be enabled/disabled.

I've opened a TRAC ticket on this issue here: http://sagetrac.org/sage_trac/ticket/1038

Carl

Robert Bradshaw

Oct 31, 2007, 3:20:18 PM
to sage-...@googlegroups.com

I like 1-2. Options 3 and 4 seem like lots of (error-prone) work on
every numpy release. Numpy isn't imported by default, right?

- Robert

William Stein

Oct 31, 2007, 3:22:40 PM
to sage-...@googlegroups.com
On 10/31/07, Robert Bradshaw <robe...@math.washington.edu> wrote:
> > I can think of lots of "solutions", although I don't know which of
> > them might be worth the effort.
> >
> > 1) a global enable/disable
> > 2) hook into __import__ to notice imports of numpy, and turn off the
> > optimization globally if numpy is ever imported
> > 3) patch numpy to add the incref/decref operations
> > 4) patch numpy to turn off the optimization on entry to dangerous
> > functions, and turn the optimization back on at exit
>
> I like 1-2. Options 3 and 4 seem like lots of (error-prone) work on
> every numpy release. Numpy isn't imported by default, right?

Correct, it isn't (or at least it shouldn't be, since importing it is
slow). However, as soon as somebody uses matplotlib for
anything, or uses numerical matrices, it is very likely to get imported.

William

Robert Bradshaw

Oct 31, 2007, 3:58:43 PM
to sage-...@googlegroups.com

So matplotlib depends on Numpy?

> or uses numerical matrices,

I thought we used GSL as a back end, but does it often fall back to
Numpy then?

> it is very likely to get imported.

It doesn't look like arrays over PyObjects have their own module, or
else we could hook into that (someone who knows numpy better could maybe
verify?). We could also patch numpy to turn off in-place operations
whenever an array over generic objects is created. This at least
should be pretty rare. Or we could consider options 3/4 (again, it
would be nice if someone familiar with the numpy source could comment
on this).

- Robert

William Stein

Oct 31, 2007, 4:30:32 PM
to sage-...@googlegroups.com
On 10/31/07, Robert Bradshaw <robe...@math.washington.edu> wrote:
>
> So matplotlib depends on Numpy?

Yes.

> > or uses numerical matrices,
>
> I thought we used GSL as a back end, but does it often fall back to
> Numpy then?

We use both numpy and gsl. Numpy is better for certain things.

> > it is very likely to get imported.
>
> It doesn't look like arrays over PyObjects have their own module, or
> else we could hook into that (someone who knows numpy better could maybe
> verify?). We could also patch numpy to turn off in-place operations
> whenever an array over generic objects is created. This at least
> should be pretty rare.

That sounds like a very good idea.

> Or we could consider options 3/4 (again, it
> would be nice if someone familiar with the numpy source could comment
> on this).

Maybe you should write directly to Travis Oliphant at BYU who
wrote numpy:

http://www.ee.byu.edu/faculty/oliphant/

-- William

Dan Christensen

Nov 2, 2007, 1:42:56 PM
to sage-...@googlegroups.com, sage-s...@googlegroups.com
Robert Bradshaw <robe...@math.washington.edu> writes:

> This is due to the in-place operator code using refcounts to
> determine whether it's safe to mutate. The simple workaround is to not use
> numpy arrays of SAGE objects. Another question is why one would do so
> (i.e. what is lacking in the SAGE linear algebra types?). I think the
> benefits of in-place operators in terms of performance are too great to
> abandon, but I am at a loss to find a clean solution so that external
> extension classes don't mess them up... other than perhaps a global
> option that can be enabled/disabled.

numpy arrays are extremely flexible, with broadcasting, view semantics
and in-place operations being among the most important reasons why. For
example, if x is an array, then x[3:5] is a view of part of x, and
I can adjust the entries in just that region with x[3:5] += [1,2].
This even works if x is multi-dimensional, with the [1,2] being
broadcast to the right shape.
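[Editor's note: Dan's example, spelled out with concrete values chosen for illustration.]

```python
import numpy as np

x = np.arange(10)        # [0, 1, ..., 9]
v = x[3:5]               # a view: shares x's storage, no copy made
x[3:5] += [1, 2]         # the list is broadcast onto the slice, in place
# x[3] became 3+1 and x[4] became 4+2; v sees the change too

m = np.zeros((3, 2), dtype=int)
m[1:3] += [1, 2]         # [1, 2] is broadcast across both selected rows
```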

At the same time, sage provides some fundamental types that can be
useful as array elements.

So it is completely natural to use numpy arrays of sage elements, and I
have certainly done so many times.

More generally, if the current implementation of in-place operations in
sage causes problems in numpy, it is quite likely that there will also
be problems with other python modules. So I regard this as a serious
bug in sage.

Dan

William Stein

Nov 2, 2007, 1:48:56 PM
to sage-...@googlegroups.com, sage-s...@googlegroups.com
On 11/2/07, Dan Christensen <j...@uwo.ca> wrote:
> numpy arrays are extremely flexible, with broadcasting, view semantics
> and in-place operations being among the most important reasons why. For
> example, if x is an array, then x[3:5] is a view of part of x, and
> I can adjust the entries in just that region with x[3:5] += [1,2].
> This even works if x is multi-dimensional, with the [1,2] being
> broadcast to the right shape.
>
> At the same time, sage provides some fundamental types that can be
> useful as array elements.
>
> So it is completely natural to use numpy arrays of sage elements, and I
> have certainly done so many times.
>
> More generally, if the current implementation of in-place operations in
> sage causes problems in numpy, it is quite likely that there will also
> be problems with other python modules. So I regard this as a serious
> bug in sage.
>

+1. After mulling this over for a while, I feel exactly the same way
about everything said above.

The in-place optimizations should be something that has to be
explicitly turned on, and they should be off by default. I know they
make certain things faster, but correctness by default is *always* a
much more important consideration with serious mathematical software
such as Sage. Always.

I will very likely disable in-place optimization for sage-2.8.11,
until this issue gets properly discussed and resolved.

-- William

Robert Bradshaw

Nov 2, 2007, 2:25:28 PM
to sage-...@googlegroups.com

:-(, but I have to concede to your logic. The line to change is 148
of coerce.pxi. Setting this value to 0 will turn them completely off.
Other than numpy (and the builtin libraries), do we use any other
extension types? If there is a finite list of things for which it
won't work, it would be possible to disable it just for those.

- Robert

Carl Witty

Nov 2, 2007, 3:22:30 PM
to sage-devel
On Nov 2, 11:25 am, Robert Bradshaw <rober...@math.washington.edu>
wrote:

> :-(, but I have to concede to your logic. The line to change is 148
> of coerce.pxi. Setting this value to 0 will turn them completely off.
> Other than numpy, (and the builtin libraries), do we use any other
> extension types? If there is a finite list of things for which it
> won't work, it would be possible to disable it just for those.

Another possibility is to figure out where in Sage it's safe to use
and particularly helpful, and explicitly enable it for those sections
of code.

And another possibility, which restores the in-place optimization to
(some) Cython code but not Python code, is to change Cython so that if
it knows the things it's adding/multiplying/etc. are of subtypes of
sage.structure.element.Element, then it bypasses PyNumber_Add and
calls a method on the objects directly. This method could use the
in-place optimization. (Plus, we might gain a tiny bit of speed by
skipping PyNumber_Add anyway.)
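[Editor's note: a toy Python rendition of the idea. Real Sage elements dispatch through the coercion model; the `_add_` name here just mirrors Sage's naming convention, and `fast_add` stands in for what Cython-generated code would do inline.]

```python
class Element:
    """Toy stand-in for sage.structure.element.Element."""
    def __init__(self, value):
        self.value = value

    def _add_(self, other):
        # Direct arithmetic entry point: this is where a refcount-based
        # in-place path could live, since generic Python callers never
        # reach it by accident.
        return Element(self.value + other.value)

    def __add__(self, other):
        return self._add_(other)

def fast_add(a, b):
    # What generated code could do when both operands are statically
    # known to be Elements: skip PyNumber_Add and dispatch directly.
    if isinstance(a, Element) and isinstance(b, Element):
        return a._add_(b)
    return a + b    # generic fallback for everything else
```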

Carl Witty

William Stein

Nov 2, 2007, 3:25:25 PM
to sage-...@googlegroups.com

+1 to both of these ideas.

-- William

mabshoff

Nov 2, 2007, 3:26:45 PM
to sage-devel

For now I have applied the following patch:

# HG changeset patch
# User Robert Bradshaw
# Date 1194031602 25200
# Node ID d235183a07ac596bb10db446be4a0e64a3a66a4c
# Parent 71f280c221ca1677e4452b685d584eba49f822a1
turn off in-place optimizations, see #1038

diff -r 71f280c221ca -r d235183a07ac sage/structure/coerce.pxi
--- a/sage/structure/coerce.pxi Thu Nov 01 21:03:52 2007 -0700
+++ b/sage/structure/coerce.pxi Fri Nov 02 12:26:42 2007 -0700
@@ -145,7 +145,7 @@ cdef inline RingElement _idiv_c(RingElem

cdef enum:
# 3 references: handle, scope container, and arithmatic call stack
- inplace_threshold = 3
+ inplace_threshold = 0


Cheers,

Michael

Robert Bradshaw

Nov 2, 2007, 3:52:21 PM
to sage-...@googlegroups.com
On Nov 2, 2007, at 12:22 PM, Carl Witty wrote:

> On Nov 2, 11:25 am, Robert Bradshaw <rober...@math.washington.edu>
> wrote:
>> :-(, but I have to concede to your logic. The line to change is 148
>> of coerce.pxi. Setting this value to 0 will turn them completely off.
>> Other than numpy, (and the builtin libraries), do we use any other
>> extension types? If there is a finite list of things for which it
>> won't work, it would be possible to disable it just for those.
>
> Another possibility is to figure out where in Sage it's safe to use
> and particularly helpful, and explicitly enable it for those sections
> of code.

We could definitely make a use_inplace_operations context object and
use it in some places (e.g. Coleman integration, and the sum() and
prod() functions), but the fact is that it is useful almost
everywhere. The two most common cases are a loop with some
iterative increment/evaluation (e.g. Newton iteration or counter
variables), and any arithmetic expression with more than one
operation (e.g. evaluating a polynomial, or even constructing a
polynomial, or something like a*b + c). This is all over the place...

> And another possibility, which restores the in-place optimization to
> (some) Cython code but not Python code, is to change Cython so that if
> it knows the things it's adding/multiplying/etc. are of subtypes of
> sage.structure.element.Element, then it bypasses PyNumber_Add and
> calls a method on the objects directly. This method could use the
> in-place optimization. (Plus, we might gain a tiny bit of speed by
> skipping PyNumber_Add anyway.)

My impression is that this would require rewriting a huge portion of
Cython code to explicitly declare types (and, consequently,
disallowing native types like int or float), and probably a lot of
SAGE-specific things in the Cython codebase.

- Robert
