Vector space sum very slow

100 views
Skip to first unread message

John Cremona

unread,
Nov 8, 2012, 4:36:40 AM11/8/12
to SAGE devel
Either I am doing something wrong or a simple operation which should
be fast is in fact very slow, but I do not know why.

If V,W are vector spaces over QQ, subspaces of the same ambient
space, then V.sum(W) should return their sum, with an echelon basis
obtained from the bases of V and W. But this takes a stupidly long
time. For example:

sage: Q100=QQ^100
sage: V=Q100.subspace([Q100.random_element()])
sage: W=Q100.subspace([Q100.random_element()])

# so V,W are both 1-dimensional

sage: V.sum(W)
sage: time V.sum(W)

# takes forever (on 5.3) -- why?

John

Francis Clarke

unread,
Nov 8, 2012, 5:07:30 AM11/8/12
to sage-...@googlegroups.com

I think you want 

sage: V.direct_sum(W)
Vector space of degree 200 and dimension 2 over Rational Field
Basis matrix:
2 x 200 dense matrix over Rational Field

Francis

Timo Kluck

unread,
Nov 8, 2012, 5:10:01 AM11/8/12
to sage-...@googlegroups.com
Op donderdag 8 november 2012 10:36:51 UTC+1 schreef John Cremona het volgende:
It's even slow for

sage: Q100 = QQ^1

To find out where things are slow, you can use %prun, which gives a list of functions and the time spent in each. I haven't had time to look into it yet, but perhaps you'll be able to find out what's going that way.

You could also apply patch #5814 and use %prun from the notebook.

Best, Timo

P Purkayastha

unread,
Nov 8, 2012, 5:38:23 AM11/8/12
to sage-...@googlegroups.com
What Francis said is correct. V.sum(W) sums the elements in W. Over QQ,
the number of elements is not finite. I am surprised if your machine is
still up, because it should have just exhausted all memory! :)

David Loeffler

unread,
Nov 8, 2012, 6:08:38 AM11/8/12
to SAGE devel
I suspect that what John wants is the sum of V and W as subspaces of a
common ambient space (not the direct sum embedded in the direct sum of
two copies of the ambient space, as Francis suggests). This can be
obtained (very quickly) by doing

sage: time V + W
Vector space of degree 100 and dimension 2 over Rational Field
Basis matrix:
2 x 100 dense matrix over Rational Field
Time: CPU 0.02 s, Wall: 0.02 s

It is *really awful* that V.sum(W) attempts to sum all the elements of
W! I don't know what general design decision underlies this but I
don't like it.

David
> --
> You received this message because you are subscribed to the Google Groups
> "sage-devel" group.
> To post to this group, send email to sage-...@googlegroups.com.
> To unsubscribe from this group, send email to
> sage-devel+...@googlegroups.com.
> Visit this group at http://groups.google.com/group/sage-devel?hl=en.
>
>
>

John Cremona

unread,
Nov 8, 2012, 6:31:56 AM11/8/12
to SAGE devel
Thanks to all. The machine is still up, as I was able to stop the
computation with Crtl-C.

Now I see it, of course that's what V.sum(W) says that it does, except
that it in fact makes no sense, since adding up the (infinitely many)
elements of W has nothing to do with V.

I don't quite know how I missed V+W -- thanks, David!

John

On 8 November 2012 10:38, P Purkayastha <ppu...@gmail.com> wrote:

Simon King

unread,
Nov 8, 2012, 7:26:19 AM11/8/12
to sage-...@googlegroups.com
Hi David and John,

On 2012-11-08, David Loeffler <d.a.lo...@warwick.ac.uk> wrote:
> It is *really awful* that V.sum(W) attempts to sum all the elements of
> W! I don't know what general design decision underlies this but I
> don't like it.

That's easy to explain. All additive semigroups should be able to sum up a
list of their elements. Hence, such as summation method is defined by
the category framework.

Apparently it not only accepts a list, but any iterable.
Problem: Vektor spaces may be infinite, but are iterable.

Solution: The sum() method (and prod() similarly) should accept
iterables *of finite length*. Hence if you have a vector space V
and len(V) raises an error (this is what it currently does) or
returns Infinity, then V.sum(V) should raise an error.

Other solution: Rename sum() to sum_of_elements(). I don't like it.

Cheers,
Simon


John Cremona

unread,
Nov 8, 2012, 7:44:33 AM11/8/12
to SAGE devel
I don't think anyone is objecting to being able to add a list of
vectors (or any other objects which can be added). But in my example,
V.sum(W) is taking the (infinite) list of elements of W and adding
them up, which has nothing at all to do with V. Just replace QQ by a
finite field in my example, so that it does make sense to add up all
of the elements of W. Then certainly sum(W) should do this
addition. But why should V.sum() accept as arguemt a list of things
*which are not elemments of V* ??? In fact, why is there a function
V.sum() at all? The only way in which V.sum() refers back to V is
that the sum is initialized to V.zero().

John

Volker Braun

unread,
Nov 8, 2012, 11:02:49 AM11/8/12
to sage-...@googlegroups.com
This occurred to me before, its nice to have iterators/generators but there should be a mechanism to say that the iterator is over a finite or infinite set. Just to avoid doing exhaustive searches over infinite sets. Similar to C++ iterator traits, I guess. Of course if you were to manually implement a generator as a Python class (implementing __next__) you could attach other methods like is_finite(). But then you'd always have to write a lot of boilerplate to return an iterator. Its much easier to use yield, but then you can't return any additional information about finiteness. 

The alternative is to assume that the user knows what he is doing when he passes an iterator / generator object. V.sum(W) can and should still check that W is finite before starting to iterate over W.

Jason Grout

unread,
Nov 8, 2012, 11:16:36 AM11/8/12
to sage-...@googlegroups.com
On 11/8/12 6:26 AM, Simon King wrote:
> Hi David and John,
>
> On 2012-11-08, David Loeffler <d.a.lo...@warwick.ac.uk> wrote:
>> It is *really awful* that V.sum(W) attempts to sum all the elements of
>> W! I don't know what general design decision underlies this but I
>> don't like it.
>
> That's easy to explain. All additive semigroups should be able to sum up a
> list of their elements. Hence, such as summation method is defined by
> the category framework.

Are there any examples where it is more complicated than just sum(args,
self.zero())? If not, I think I side with John in questioning the value
of the function.

If each element of args was changed into an element of V before summing,
then it seems that the function's utility might be argued more strongly.
Something like:

sum((self(i) for i in args), self.zero())

Thanks,

Jason


P Purkayastha

unread,
Nov 8, 2012, 11:29:23 AM11/8/12
to sage-...@googlegroups.com
According to the doc it is supposed to return an element of V.


Definition: V.sum(self, args)
Docstring:
n-ary sum

INPUT:

* "args" -- a list (or iterable) of elements of "self"

Returns the sum of the elements in args, as an element of self.




Also, according to the doc, the input should have been a list of "self"
(whatever it refers to -- that's another gripe I have with writing the
word 'self' in the docs). But again, the explanation below it is
confusing since it seems to indicate that the elements could be changed
to belong to V. Perhaps that's why self.zero() is used?


However, it is not always changed to an element of V, which is not
surprising given the code.

sage: type(QQ.sum([ZZ(1), ZZ(2)]))
<type 'sage.rings.rational.Rational'>
sage: type(QQ.sum([RR(1), RR(2)]))
<type 'sage.rings.real_mpfr.RealNumber'>


Simon King

unread,
Nov 8, 2012, 4:16:52 PM11/8/12
to sage-...@googlegroups.com
Hi Volker,

On 2012-11-08, Volker Braun <vbrau...@gmail.com> wrote:
> But then you'd always have to write a lot of
> boilerplate to return an iterator. Its much easier to use yield, but then
> you can't return any additional information about finiteness.

No? What's wrong with __len__?

Cheers,
Simon

Nils Bruin

unread,
Nov 8, 2012, 5:10:49 PM11/8/12
to sage-devel
On Nov 8, 1:17 pm, Simon King <simon.k...@uni-jena.de> wrote:
> Hi Volker,
>
> On 2012-11-08, Volker Braun <vbraun.n...@gmail.com> wrote:
>
> > But then you'd always have to write a lot of
> > boilerplate to return an iterator. Its much easier to use yield, but then
> > you can't return any additional information about finiteness.
>
> No? What's wrong with __len__?

There are iterators where you do know the result is finite but where
it is hard to predict how many items will be returned, e.g.

( x for x in range(1000) if expensive_predicate(x) ).

It would be silly to run through this iterable twice: once to get the
length and establish it's finite and once to actually sum the entries.
Would you just test for the existence of __len__ ? [On this example it
actually fails because a generator object doesn't have a __len__
attribute.

Do additive semigroups actually need a "sum" method? Is the idea that
it can implement a sum strategy that is appropriate for the semigroup?
If there were an "iterator generating elements of the given semigroup"
type, I could see how having this could be more efficient, but given
that we have no guarantee, we have to dispatch through the generic
coercion framework for __add__ anyway. That means you might as well
just do

target_semigroup(reduce(operator.add, <iterator>, ZZ(0)))

since you don't have control over the intermediate parents anyway. (ZZ
should coerce into any additive semigroup). The above is better
expressed as a generic function than as a method on a parent class.

Simon King

unread,
Nov 8, 2012, 5:56:26 PM11/8/12
to sage-...@googlegroups.com
On 2012-11-08, Nils Bruin <nbr...@sfu.ca> wrote:
> On Nov 8, 1:17 pm, Simon King <simon.k...@uni-jena.de> wrote:
>> Hi Volker,
>>
>> On 2012-11-08, Volker Braun <vbraun.n...@gmail.com> wrote:
>>
>> > But then you'd always have to write a lot of
>> > boilerplate to return an iterator. Its much easier to use yield, but then
>> > you can't return any additional information about finiteness.
>>
>> No? What's wrong with __len__?
>
> There are iterators where you do know the result is finite but where
> it is hard to predict how many items will be returned, e.g.
>
> ( x for x in range(1000) if expensive_predicate(x) ).
>
> It would be silly to run through this iterable twice: once to get the
> length and establish it's finite and once to actually sum the entries.
> Would you just test for the existence of __len__ ? [On this example it
> actually fails because a generator object doesn't have a __len__
> attribute.

My suggestion was that the sum() method (and the prod() method as well)
do things like
def sum(self, L):
try:
n = len(L)
except:
raise ValueError, "the given input is not a finite iterable"
if n not in ZZ:
raise ValueError, "the given input is not a finite iterable"
<do the usual summation and return the result>

For V = QQ^3, an error would be raised on V.sum(V), simply because len(V)
raises an error. Similarly for V = ZZ.

For V = GF(5)^3, V.sum(V) would return the sum of all elements, because
len(V) is finite and V is iterable.

And from the implementation point of view, I guess if P is an iterable
parent (i.e., someone was able to implement __iter__), then usually it
would also be possible to tell the cardinality of P. There are parents
that have a method called cardinality() - why not make __len__ an alias
for the cardinality?

Best regards,
Simon


Robert Bradshaw

unread,
Nov 9, 2012, 12:19:56 AM11/9/12
to sage-...@googlegroups.com
__len__ must return a (machine-sized) int, not elements of ZZ let
alone infinity. (That might suffice for this case, but not in
general.) Also, this would break if you tried to pass in a (finite)
generator, which seems like a perfectly reasonable thing to do.

However, I agree with the sentiment that V.sum(W) grossly violates the
principle of least surprise as well as being of dubious merit over the
builtin sum(...).

- Robert

Simon King

unread,
Nov 9, 2012, 12:58:48 AM11/9/12
to sage-...@googlegroups.com
Hi Robert,

On 2012-11-09, Robert Bradshaw <robe...@math.washington.edu> wrote:
> __len__ must return a (machine-sized) int, not elements of ZZ let
> alone infinity.

Sorry, I didn't know that. Then one should probably go the other way,
implement cardinality() for more parents, and use it in "sum".

Cheers,
Simon


Nils Bruin

unread,
Nov 9, 2012, 1:00:54 AM11/9/12
to sage-devel
On Nov 8, 9:20 pm, Robert Bradshaw <rober...@math.washington.edu>
wrote:

> However, I agree with the sentiment that V.sum(W) grossly violates the
> principle of least surprise as well as being of dubious merit over the
> builtin sum(...).

And note that python's builtin sum is in sage shadowed by

sum=sage.misc.functional.symbolic_sum

which does:

if hasattr(expression, 'sum'):
return expression.sum(*args, **kwds)
elif len(args) <= 1:
return sum(expression, *args)
else:
...

so the builtin sum only gets called in second place.

Nicolas M. Thiery

unread,
Nov 9, 2012, 5:08:43 AM11/9/12
to sage-...@googlegroups.com
On Thu, Nov 08, 2012 at 10:16:36AM -0600, Jason Grout wrote:
> On 11/8/12 6:26 AM, Simon King wrote:
> >Hi David and John,
> >
> >On 2012-11-08, David Loeffler <d.a.lo...@warwick.ac.uk> wrote:
> >>It is *really awful* that V.sum(W) attempts to sum all the elements of
> >>W! I don't know what general design decision underlies this but I
> >>don't like it.
> >
> >That's easy to explain. All additive semigroups should be able to sum up a
> >list of their elements. Hence, such as summation method is defined by
> >the category framework.
>
> Are there any examples where it is more complicated than just
> sum(args, self.zero())? If not, I think I side with John in
> questioning the value of the function.

The main point is that writing V.sum(args) gives more information to
the system than sum(args, self.zero()): this is making a promise that
all elements in args are elements of V. This gives an opportunity for
optimizations: typically one could resolve once for all what the
appropriate method is for adding two elements, and avoid that
resolution at runtime for each addition.

We had such an optimization in MuPAD-Combinat; I don't remember if we
have one yet in Sage, but we definitely should. In the mean time we
want to encourage the use of V.sum so that later on the relevant
pieces of code will benefit from the optimization.

Other points:

- V.sum could be made to work for additive semigroups that don't have a zero
- V.sum could possibly optimize away one addition (which sum does not do)
- Experience tells that people just write sum(args) and forget about
the self.zero(), which leads to broken corner cases that hit hard
down the road.

Granted, it would be nice if there was a minimum of type checking so
that V.sum(...) would bark if the elements are not in V.

Cheers,
Nicolas
--
Nicolas M. Thi�ry "Isil" <nth...@users.sf.net>
http://Nicolas.Thiery.name/

Nicolas M. Thiery

unread,
Nov 9, 2012, 7:40:25 AM11/9/12
to sage-...@googlegroups.com
On Thu, Nov 08, 2012 at 10:56:26PM +0000, Simon King wrote:
> why not make __len__ an alias for the cardinality?

It can't be a plain alias: __len__ is only allowed to return an `int`,
whereas cardinality should return an Integer or +infinity.

Nicolas M. Thiery

unread,
Nov 9, 2012, 7:34:55 AM11/9/12
to sage-...@googlegroups.com
Hi Volker,

On Thu, Nov 08, 2012 at 08:02:49AM -0800, Volker Braun wrote:
> This occurred to me before, its nice to have iterators/generators but
> there should be a mechanism to say that the iterator is over a finite or
> infinite set.
> Just to avoid doing exhaustive searches over infinite sets.
> Similar to C++ iterator traits, I guess. Of course if you were to manually
> implement a generator as a Python class (implementing __next__) you could
> attach other methods like is_finite(). But then you'd always have to write
> a lot of boilerplate to return an iterator. Its much easier to use yield,
> but then you can't return any additional information about finiteness.
> The alternative is to assume that the user knows what he is doing when he
> passes an iterator / generator object. V.sum(W) can and should still check
> that W is finite before starting to iterate over W.

There is already a bit of infrastructure for this; namely an parent
modeling an enumerated set which is infinite should declare itself in
the category InfiniteEnumeratedSets (soon to come: Sets().Infinite()).
Then, methods could do some checks on this; that's already the case
for e.g. ``list``::

sage: ZZ.list()
Traceback (most recent call last):
...
ValueError: since it is infinite, cannot list Integer Ring

Of course a preliminary step would be to make sure that parents are
put in this category when relevant. Here QQ^3 should not only be an
iterable but actually an infinite enumerated set:

sage: (QQ^3) in InfiniteEnumeratedSets() # todo: not implemented
True

More precisely: QQ^3 should be in the category of finite dimensional
modules with basis, and a finite dimensional module with basis over an
infinite field should be in the category of infinite enumerated sets.

Nicolas M. Thiery

unread,
Nov 9, 2012, 7:46:34 AM11/9/12
to sage-...@googlegroups.com
Oops, Robert had already answered :-)

I object to using cardinality in sum. For many parents, cardinality is
implemented by iterating through the elements (for there is nothing
better), and we don't want it do that twice (the point of V.sum is
*speed*).

In any cases, V.sum(...) should quite rarely be called on a parent (it
only makes sense if that parent is a subset of V), so I would not
bother about it.

On the other hand V.sum could bark if it gets objects that are not in
V, and its documentation could certainly use some love.

Nils Bruin

unread,
Nov 9, 2012, 11:44:15 AM11/9/12
to sage-devel
On Nov 9, 2:11 am, "Nicolas M. Thiery" <Nicolas.Thi...@u-psud.fr>
wrote:
> The main point is that writing V.sum(args) gives more information to
> the system than sum(args, self.zero()): this is making a promise that
> all elements in args are elements of V.

In python's call dispatch system, it does not. It says something along
the lines: "Use V" to add up these arguments. I guess the convention
could be that if someone calls this, then they should make sure that
"args" only involves elements from V. But then V.sum(W) would not be
valid in interesting cases either (apart from probably not being a
finite operation).

>  - V.sum could be made to work for additive semigroups that don't have a zero
>  - V.sum could possibly optimize away one addition (which sum does not do)
>  - Experience tells that people just write sum(args) and forget about
>    the self.zero(), which leads to broken corner cases that hit hard
>    down the road.

Yes, this is odd. python's sum really does start with adding zero,
even if no initial value is given.

class A(object):
def __init__(self,n):
self.n=n
def __add__(self,b):
print "called with",self,b
return self.n+b.n
def __radd__(b,self):
print "called with",self,b
return self.n+b.n
def __repr__(self):
return "A(%s)"%self.n

sage: L=[A(1),A(2)]
sage: from __builtin__ import sum
sage: sum(L)
called with 0 A(1)
AttributeError: 'int' object has no attribute 'n'

whereas reduce does the appropriate thing:

sage: reduce(operator.add,L)
called with A(1) A(2)
3
sage: reduce(operator.add,L,int(0))
called with 0 A(1)
AttributeError: 'int' object has no attribute 'n'

Volker Braun

unread,
Nov 9, 2012, 12:41:01 PM11/9/12
to sage-...@googlegroups.com, Nicolas M. Thiery
Thanks, looks good.

Maybe one could factor out the Enumerated (=canonical enumeration) part? Or is every parent with __iter__ supposed to be in EnumeratedSets and, therefore, have a rank() methods etc? The element order is often just arbitrary...

Also, I dislike .list() returning a Python list. We should always return tuples, and if even the Sage framework breaks that rule then its very confusing for new users. An argument could be made that list can be understood as "Python list" and not as the English verb for "series of records". Maybe call it enumerate() with an alias list()?




Nicolas M. Thi�ry "Isil" <nth...@users.sf.net>
http://Nicolas.Thiery.name/

John H Palmieri

unread,
Nov 9, 2012, 12:53:49 PM11/9/12
to sage-...@googlegroups.com, Nicolas M. Thiery


On Friday, November 9, 2012 4:42:42 AM UTC-8, Nicolas M. Thiéry wrote: 
More precisely: QQ^3 should be in the category of finite dimensional
modules with basis, and a finite dimensional module with basis over an
infinite field should be in the category of infinite enumerated sets.


Do you mean over a *countable* field? Otherwise, I don't know what you mean by "enumerated".

--
John
 

Florent Hivert

unread,
Nov 10, 2012, 5:21:54 AM11/10/12
to sage-...@googlegroups.com
On Fri, Nov 09, 2012 at 09:53:49AM -0800, John H Palmieri wrote:
>
>
> On Friday, November 9, 2012 4:42:42 AM UTC-8, Nicolas M. Thi�ry wrote:
> >
> > More precisely: QQ^3 should be in the category of finite dimensional
> > modules with basis, and a finite dimensional module with basis over an
> > infinite field should be in the category of infinite enumerated sets.
> >
> >
> Do you mean over a *countable* field? Otherwise, I don't know what you mean
> by "enumerated".

That's because you don't have Read The F****** Man ;-)

http://www.sagemath.org/doc/reference/sage/categories/enumerated_sets.html

Cheers,

Florent

Florent Hivert

unread,
Nov 10, 2012, 5:28:26 AM11/10/12
to sage-...@googlegroups.com
On Fri, Nov 09, 2012 at 09:41:01AM -0800, Volker Braun wrote:
> Thanks, looks good.
>
> Maybe one could factor out the Enumerated (=canonical enumeration) part? Or
> is every parent with __iter__ supposed to be in EnumeratedSets and,
> therefore, have a rank() methods etc? The element order is often just
> arbitrary...

I don't thik every parent with iter sould be in enumerated. Enumerated means
that there is some cononical enumeration and that as a consequence, list,
__iter__, __getitem__ have to return their result with a consistent
ordering. Nothing is currently implemented, but I had in mind that morphism of
enumerated sets should preserve the canonical enumeration...

> Also, I dislike .list() returning a Python list. We should always return
> tuples, and if even the Sage framework breaks that rule then its very
> confusing for new users.

You are not alone: see #11125

> An argument could be made that list can be understood as "Python list" and
> not as the English verb for "series of records". Maybe call it enumerate()
> with an alias list()?

I like that but I would like for more vote before setting it as a rule.

Cheers,

Florent

Nicolas M. Thiery

unread,
Nov 11, 2012, 4:31:40 AM11/11/12
to sage-...@googlegroups.com
John is right: the sentence I meant is of course:

``A finite dimensional module with basis over an infinite enumerated
field should be in the category of infinite enumerated sets.''

And one should have as well:

``A module with basis over an infinite field should be in the
category of infinite sets.''

Ah, thinking twice about it, there actually is a little issue with
dimension 0 ... Hmm, I have to think how such exceptions should be
handled.

Cheers

Nicolas M. Thiery

unread,
Nov 11, 2012, 4:44:01 AM11/11/12
to sage-...@googlegroups.com
Hi Nils!

On Fri, Nov 09, 2012 at 08:44:15AM -0800, Nils Bruin wrote:
> On Nov 9, 2:11�am, "Nicolas M. Thiery" <Nicolas.Thi...@u-psud.fr>
> wrote:
> > The main point is that writing V.sum(args) gives more information to
> > the system than sum(args, self.zero()): this is making a promise that
> > all elements in args are elements of V.
>
> In python's call dispatch system, it does not. It says something along
> the lines: "Use V" to add up these arguments.

Yes, that's what I mean by ``giving more information'': V has the
required information to do it fast, so use it.

> I guess the convention could be that if someone calls this, then
> they should make sure that "args" only involves elements from V.

This is indeed the intended precondition, as stated in the
documentation of CommutativeAdditiveMonoids.ParentMethods.sum:

- ``args`` -- a list (or iterable) of elements of ``self``

Granted: ``element of`` is a bit vague. The exact intention is that
``self.is_parent_of(x)`` holds for all x in the sum.

> But then V.sum(W) would not be valid in interesting cases either
> (apart from probably not being a finite operation).

If such a feature is desired at some point, then I would want it to be
implemented in a different method.

Nicolas M. Thiery

unread,
Nov 11, 2012, 4:47:10 AM11/11/12
to sage-...@googlegroups.com
On Fri, Nov 09, 2012 at 08:44:15AM -0800, Nils Bruin wrote:
> Yes, this is odd. python's sum really does start with adding zero,
> even if no initial value is given.

This indeed looks like a natural spot for optimization. However it
would change the semantic:

sage: sum([1,2], 0.0)
3.00000000000000

Besides, sum would need to know whether ``start`` is a neutral
element before doing the optimization.
Reply all
Reply to author
Forward
0 new messages