Borrowing ideas from Polys to Matrix

18 views
Skip to first unread message

SherjilOzair

unread,
May 28, 2011, 9:30:20 AM5/28/11
to sympy
Hello everyone,
I've been successful in writing the symbolic cholesky decomposition
for sparse matrices in O(n * c**2) time. This is a reasonable order
for sparse systems, but still the performance is not very good. Using
python bulitins, It factors a 100 * 100 Matrix with sparsity 0.57 in
961 milli-seconds. Using Sympy's numbers, it takes forever or is pain-
stakingly slow for matrices larger than 20 * 20.

I understand why we must integrate groundtypes in matrices to make it
usable. But I don't know how exactly to do it.

I see that the Matrix constructor currently employs sympify, so it
changes regular ints to Sympy's Integer. I had removed this when I
wanted to test for the python builtins in my DOKMatrix implementation.

Here's an idea that we can build on. Add a kwarg argument in the
Matrix constructor called dtype, which could takes values like 'gmpy',
'python', 'sympy', etc. or more detailed, 'int', 'expr', 'poly' etc..
So that, before putting the element in the matrix, we convert it to
the required dtype. eg. val = gmpy.mpz(val)

Is it as simple as this, or am I missing something ?

I would like Mateusz especially to comment on this, and also guide me
and help me learn about the Polys structure.

Regards,
Sherjil Ozair

Mateusz Paprocki

unread,
May 28, 2011, 2:34:49 PM5/28/11
to sy...@googlegroups.com
Hi,

On 28 May 2011 15:30, SherjilOzair <sherji...@gmail.com> wrote:
Hello everyone,
I've been successful in writing the symbolic cholesky decomposition
for sparse matrices in O(n * c**2) time. This is a reasonable order
for sparse systems, but still the performance is not very good. Using
python bulitins, It factors a 100 * 100 Matrix with sparsity 0.57 in
961 milli-seconds. Using Sympy's numbers, it takes forever or is pain-
stakingly slow for matrices larger than 20 * 20.

In [1] you will find a very simple comparison of Integer, int and mpz. This applies to the rational case as well, just the difference is even bigger.
 

I understand why we must integrate groundtypes in matrices to make it
usable. But I don't know how exactly to do it.

I see that the Matrix constructor currently employs sympify, so it
changes regular ints to Sympy's Integer. I had removed this when I
wanted to test for the python builtins in my DOKMatrix implementation.

Here's an idea that we can build on. Add a kwarg argument in the
Matrix constructor called dtype, which could takes values like 'gmpy',
'python', 'sympy', etc. or more detailed, 'int', 'expr', 'poly' etc..
So that, before putting the element in the matrix, we convert it to
the required dtype. eg. val = gmpy.mpz(val)

Is it as simple as this, or am I missing something ?

Following sympy.polys design means that you have to employ static typing (all coefficients in a matrix are of the same type, governed by a domain that understands properties of the type). Suppose we have a matrix M over ZZ, then M[0,0] += 1 is well defined and is fast because it requires only one call to domain.convert() (which will exit almost immediately, depending whether ZZ.dtype is Integer, int, mpz or something else). That was simple, but what about M[0,0] += S(1)/2? 1/2 not in ZZ so += may either fail because there is no way to coerce 1/2 to an integer, but it may also figure out a domain for 1/2 (QQ), upgrade the domain in M and proceed. In polys both scenarios can happen depending whether you use DMP (or any other low-level polynomial representation) or low-level APIs of Poly or high-level APIs of Poly (low-level uses the former and high-level uses the later). The main concern in this case is speed (and type checking but it isn't very strong). Deciding whether 1 is in ZZ is fast, but figuring out a domain for 1/2, unifying domain of 1/2 with ZZ (a sup domain has to be found, which in this case is simple, but may be highly non-trivial in case of composite or algebraic domains) and coercing all elements of M, is slow.

How to figure out a domain for a set of coefficients? Use construct_domain(). It will give you the domain and will coerce all inputs. Refer to Poly.__new__ and all Poly.from_* and Poly._from_* to see how this works. sympy.polys should have all tools you will need, so try not to reinvent things that are already in SymPy. For example speaking about those "detailed types" 'int', 'expr', 'poly': poly -> what domain and variables?, expr -> what simplification algorithm?, etc. Learn to use what the library provides to you. If there is something missing, e.g. you would like construct_domain() to work with nested lists, that can be done, either on your own or just ask it. For now it may be a little tedious to use stuff from sympy.polys in matrices and at some point I will have share, e.g. domains, with other modules.

My suggestion is to start from something simple. You can create a new matrix class that will support the bare minimum of operations to replace Matrix in solve_linear_system(). This new matrix class would support domain construction and type conversions using mechanisms from sympy.polys. Change solve_linear_system() to not use simplify() but rely on the ground types to do the job (solving zero equivalence problem). If this works and is fast, then you can build on top of this.


I would like Mateusz especially to comment on this, and also guide me
and help me learn about the Polys structure.

 

Regards,
Sherjil Ozair

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sy...@googlegroups.com.
To unsubscribe from this group, send email to sympy+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.


Mateusz

SherjilOzair

unread,
May 29, 2011, 6:44:08 AM5/29/11
to sympy
Thanks for the pointers. I'm working on what you said.

Here are a few questions.

1. Poly(123) should not give an error, why not treat it as a constant
polynomial ?

2. I used construct_domain on the list of elements of the Matrix, It
returned DMP type, which does *not* allow coercion. What to do if one
of my algorithms involve a square root ?

3. Even when I tried operating on DMPs using an algorithm which did
not have a square root, an "unsupported operand type(s) for /: 'int'
and 'DMP'" TypeError was returned.

4. Should I write different code for the algorithms for each
groundtype ? For example, when using Sympy's type, using Add(*(...))
adds efficiency, but it doesn't make sense for other types. I can use
sum(...) but that will sacrifice performance slightly.

5. I think coercion is important for matrix algorithms. Other than
Poly, which other lower-level classes allow coercion ?

- Sherjil Ozair

On May 28, 11:34 pm, Mateusz Paprocki <matt...@gmail.com> wrote:
> Hi,
>
> [1]http://mattpap.github.com/masters-thesis/html/src/internals.html#benc...

SherjilOzair

unread,
May 29, 2011, 6:49:46 AM5/29/11
to sympy
Could you explain why normal multiplication/addition operations on
Poly is slower than on Muls and Adds, when Poly is supposed to be
lower level than Mul, Add ?

In [35]: A = randMatrix(5,5)

In [36]: A = A.applyfunc(lambda i: i + x)

In [37]: %timeit B = A.T * A
1000 loops, best of 3: 1.63 ms per loop

In [38]: C = A.applyfunc(poly)

In [39]: C
Out[39]:
⎡Poly(x + 56, x, domain='ZZ') Poly(x + 89, x, domain='ZZ') Poly(x +
23, x, domain='ZZ') Poly(x + 46, x, domain='ZZ') Poly(x + 95, x,
domain='ZZ')⎤


⎢Poly(x + 41, x, domain='ZZ') Poly(x + 3, x, domain='ZZ') Poly(x +
17, x, domain='ZZ') Poly(x + 13, x, domain='ZZ') Poly(x + 85, x,
domain='ZZ')⎥


⎢Poly(x + 10, x, domain='ZZ') Poly(x + 95, x, domain='ZZ') Poly(x +
60, x, domain='ZZ') Poly(x + 42, x, domain='ZZ') Poly(x + 79, x,
domain='ZZ')⎥


⎢Poly(x + 98, x, domain='ZZ') Poly(x + 84, x, domain='ZZ') Poly(x +
54, x, domain='ZZ') Poly(x + 32, x, domain='ZZ') Poly(x + 52, x,
domain='ZZ')⎥


⎣Poly(x + 10, x, domain='ZZ') Poly(x + 87, x, domain='ZZ') Poly(x +
27, x, domain='ZZ') Poly(x + 23, x, domain='ZZ') Poly(x + 52, x,
domain='ZZ')⎦

In [40]: %timeit D = C.T * C
100 loops, best of 3: 5.72 ms per loop

SherjilOzair

unread,
May 29, 2011, 7:12:38 AM5/29/11
to sympy
On a similar note, I haven't seen the RationalFunction type in Sympy
which should essentially be Poly/Poly.
Poly/Poly current returns a Mul, which would not be very good for the
zero-equivalence problem, I think.

Matrix factorization algorithms employ division on elements often. So,
if the elements belong to Poly, then we expect the matrix
factorization to contain RationalFunction (RationalPoly ?).

I think it would be the most natural type for elements of Matrices. It
will use the Poly class as its numerator and denominator, thus taking
advantage of the well-implemented Poly. Zero-equivalence problem would
be to check if the numerator is zero only, etc.

What does the community think about this class, the RationalFunction ?
Do add ideas from your knowledge and experience also.

Tom Bachmann

unread,
May 29, 2011, 7:15:49 AM5/29/11
to sy...@googlegroups.com
> What does the community think about this class, the RationalFunction ?
> Do add ideas from your knowledge and experience also.
>

I think having such a class would be advantageous. My project deals with
(very small) matrices of rational functions, and I currently convert
them to polys and cancel by hand in intermediate steps. I think working
with a dedicated rational function class all the time might be advantageous.

However I'm not sure RationalFunction should store a pair of Polys, but
rather wrap DMFs. OTOH Mateusz surely knows better whether/how to do this.

Tom

SherjilOzair

unread,
May 29, 2011, 3:14:03 PM5/29/11
to sympy


On May 29, 4:15 pm, Tom Bachmann <e_mc...@web.de> wrote:
> > What does the community think about this class, the RationalFunction ?
> > Do add ideas from your knowledge and experience also.
>
> I think having such a class would be advantageous. My project deals with
> (very small) matrices of rational functions, and I currently convert
> them to polys and cancel by hand in intermediate steps. I think working
> with a dedicated rational function class all the time might be advantageous.
>
How small ? Numbers would be helpful. Thank you.

> However I'm not sure RationalFunction should store a pair of Polys, but
> rather wrap DMFs. OTOH Mateusz surely knows better whether/how to do this.
>

What is a DMF, in detail ? It is mentioned in the thesis only once.
"DMF for dense multivariate fractions".
But that is not descriptive enough.

@everyone: Is there more resource available to learn about Polys other
than the thesis ?

> Tom

Mateusz Paprocki

unread,
May 29, 2011, 3:19:51 PM5/29/11
to sy...@googlegroups.com
Hi,

On 29 May 2011 21:14, SherjilOzair <sherji...@gmail.com> wrote:


On May 29, 4:15 pm, Tom Bachmann <e_mc...@web.de> wrote:
> > What does the community think about this class, the RationalFunction ?
> > Do add ideas from your knowledge and experience also.
>
> I think having such a class would be advantageous. My project deals with
> (very small) matrices of rational functions, and I currently convert
> them to polys and cancel by hand in intermediate steps. I think working
> with a dedicated rational function class all the time might be advantageous.
>
How small ? Numbers would be helpful. Thank you.

> However I'm not sure RationalFunction should store a pair of Polys, but
> rather wrap DMFs. OTOH Mateusz surely knows better whether/how to do this.
>

What is a DMF, in detail ? It is mentioned in the thesis only once.
"DMF for dense multivariate fractions".
But that is not descriptive enough.

p/q where p, q are polynomials and p/q is automatically cancelled.


@everyone: Is there more resource available to learn about Polys other
than the thesis ?

Nope. Learn to read the source code. There are also lots of tests from which you can infer behavior of sympy.polys.
 

> Tom


--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sy...@googlegroups.com.
To unsubscribe from this group, send email to sympy+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.


Mateusz

Mateusz Paprocki

unread,
May 29, 2011, 3:24:07 PM5/29/11
to sy...@googlegroups.com
Hi,

It should wrap DMF. There is even an issue for this (#2042).
 

Tom


--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sy...@googlegroups.com.
To unsubscribe from this group, send email to sympy+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.


Mateusz

Mateusz Paprocki

unread,
May 29, 2011, 3:49:12 PM5/29/11
to sy...@googlegroups.com
Hi,

On 29 May 2011 12:44, SherjilOzair <sherji...@gmail.com> wrote:
Thanks for the pointers. I'm working on what you said.

Here are a few questions.

1. Poly(123) should not give an error, why not treat it as a constant
polynomial?

In this setup 123 means 123*x**0 or 123*x**0*y**0*z**0? Without additional information which is implied by calling 123 a polynomial, 123 is number. Consider a different example. Suppose M and N are matrices. What is M + N? Or M.det()? M and N have to have shape defined, otherwise M and N are just meaningless symbols.
 

2. I used construct_domain on the list of elements of the Matrix, It
returned DMP type, which does *not* allow coercion. What to do if one
of my algorithms involve a square root ?

You have to change the ground domain, either to EX (the simples solution) or to an algebraic domain.
 

3. Even when I tried operating on DMPs using an algorithm which did
not have a square root, an "unsupported operand type(s) for /: 'int'
and 'DMP'" TypeError was returned.

That's fine, because you you have operands of different type. I could be convenient to define DMP() / int, but this is really not necessary. Assuming that 'domain' is an instance of PolynomialRing, then you can do DMP() / domain(int) or even better domain.quo(DMP(), domain(int)), to make division ground type agnostic.
 

4. Should I write different code for the algorithms for each
groundtype ? For example, when using Sympy's type, using Add(*(...))
adds efficiency, but it doesn't make sense for other types. I can use
sum(...) but that will sacrifice performance slightly.

If you want to write generic algorithms, you can't assume that you work with any particular ground type. The same as you can't assume that integer multiplication is implemented efficiently (mpz vs. int), the same you can't assume that Add() exists. What exists is + operator. It's SymPy's problem that sum(X) is O(len(X)**2) algorithm, but this is just because we use lists instead of dicts for the internal implementation of Add. This will change in future. Also can you say specifically what you want to "Add", because you may be misusing Add for constructing polynomials?
 

5. I think coercion is important for matrix algorithms. Other than
Poly, which other lower-level classes allow coercion ?

Poly is a high-level class. Most types allow for some sort of coercions, so please specify precisely what kind of coercions for what types do you need. Just remember that sympy.polys doesn't depend directly on built-in coersion rules (see DMP() / int discussion), but implements coercions on top of built-in coercions in domains (see Domain.convert() and all from_ and to_ methods). 

Mateusz

Mateusz Paprocki

unread,
May 29, 2011, 4:37:30 PM5/29/11
to sy...@googlegroups.com
Hi,

On 29 May 2011 12:49, SherjilOzair <sherji...@gmail.com> wrote:
Could you explain why normal multiplication/addition operations on
Poly is slower than on Muls and Adds, when Poly is supposed to be
lower level than Mul, Add ?

In some cases overhead of Poly maybe significant. But I'm quite sure you just didn't turn off cache, because for example:

In [4]: f = x

In [5]: g = Poly(x)

In [6]: %timeit u = f + f
10000 loops, best of 3: 114 us per loop

In [7]: %timeit u = g + g
10000 loops, best of 3: 40.2 us per loop

Your example gives now:

In [8]: A = randMatrix(5,5)

In [9]: A = A.applyfunc(lambda i: i + x)

In [10]: %timeit B = A.T * A
10 loops, best of 3: 20.6 ms per loop

In [11]: C = A.applyfunc(poly)

In [12]: %timeit D = C.T * C
100 loops, best of 3: 11.9 ms per loop

So Poly is 2x faster than Add in this case. If you use %timeit and don't turn off cache then what you time is how fast @cache can retrieve previous results.

Mateusz

Tom Bachmann

unread,
May 29, 2011, 4:53:48 PM5/29/11
to sy...@googlegroups.com

> How small ? Numbers would be helpful. Thank you.
>

Smaller than 5x5. And I'm only multiplying them, not doing any fancy
algorithms. So I suppose I'm not really a use case.

sherji...@gmail.com

unread,
May 29, 2011, 5:10:19 PM5/29/11
to sy...@googlegroups.com
Oh. I understand. And how does one turn off cahe ?

Sent on my BlackBerry® from Vodafone


From: Mateusz Paprocki <mat...@gmail.com>
Date: Sun, 29 May 2011 22:37:30 +0200
Subject: Re: [sympy] Re: Borrowing ideas from Polys to Matrix

Tom Bachmann

unread,
May 29, 2011, 5:11:11 PM5/29/11
to sy...@googlegroups.com
Pass SYMPY_USE_CACHE=no on the environment.

On 29.05.2011 22:10, sherji...@gmail.com wrote:
> Oh. I understand. And how does one turn off cahe ?
>
> Sent on my BlackBerry® from Vodafone
>

> ------------------------------------------------------------------------
> *From: * Mateusz Paprocki <mat...@gmail.com>
> *Sender: * sy...@googlegroups.com
> *Date: *Sun, 29 May 2011 22:37:30 +0200
> *To: *<sy...@googlegroups.com>
> *ReplyTo: * sy...@googlegroups.com
> *Subject: *Re: [sympy] Re: Borrowing ideas from Polys to Matrix


>
> Hi,
>
> On 29 May 2011 12:49, SherjilOzair <sherji...@gmail.com

> <mailto:sherjiloz...@gmail.com>> wrote:
> > Thanks for the pointers. I'm working on what you said.
> >
> > Here are a few questions.
> >
> > 1. Poly(123) should not give an error, why not treat it as a constant
> > polynomial ?
> >
> > 2. I used construct_domain on the list of elements of the Matrix, It
> > returned DMP type, which does *not* allow coercion. What to do if one
> > of my algorithms involve a square root ?
> >
> > 3. Even when I tried operating on DMPs using an algorithm which did
> > not have a square root, an "unsupported operand type(s) for /: 'int'
> > and 'DMP'" TypeError was returned.
> >
> > 4. Should I write different code for the algorithms for each
> > groundtype ? For example, when using Sympy's type, using Add(*(...))
> > adds efficiency, but it doesn't make sense for other types. I can use
> > sum(...) but that will sacrifice performance slightly.
> >
> > 5. I think coercion is important for matrix algorithms. Other than
> > Poly, which other lower-level classes allow coercion ?
> >
> > - Sherjil Ozair
> >
> > On May 28, 11:34 pm, Mateusz Paprocki <matt...@gmail.com

> <mailto:matt...@gmail.com>> wrote:
> >
> >
> >
> >
> >
> >
> >
> > > Hi,
> >
> > > On 28 May 2011 15:30, SherjilOzair <sherjiloz...@gmail.com

> <mailto:sy...@googlegroups.com>.


> > > > To unsubscribe from this group, send email to
> > > > sympy+un...@googlegroups.com

> <mailto:sympy%2Bunsu...@googlegroups.com>.


> > > > For more options, visit this group at
> > > >http://groups.google.com/group/sympy?hl=en.
> >
> > > Mateusz
>
> --
> You received this message because you are subscribed to the Google
> Groups "sympy" group.
> To post to this group, send email to sy...@googlegroups.com

> <mailto:sy...@googlegroups.com>.


> To unsubscribe from this group, send email to
> sympy+un...@googlegroups.com

> <mailto:sympy%2Bunsu...@googlegroups.com>.

Mateusz Paprocki

unread,
May 29, 2011, 5:46:36 PM5/29/11
to sy...@googlegroups.com
Hi,

On 29 May 2011 23:11, Tom Bachmann <e_m...@web.de> wrote:
Pass SYMPY_USE_CACHE=no on the environment.

Or run isympy with -C option, e.g.:

$ bin/isympy -C

In both cases you will see "cache: off" in isympy's banner.

Mateusz

Vinzent Steinberg

unread,
May 30, 2011, 1:12:05 PM5/30/11
to sympy
On 29 Mai, 21:49, Mateusz Paprocki <matt...@gmail.com> wrote:
> If you want to write generic algorithms, you can't assume that you work with
> any particular ground type. The same as you can't assume that integer
> multiplication is implemented efficiently (mpz vs. int), the same you can't
> assume that Add() exists. What exists is + operator. It's SymPy's problem
> that sum(X) is O(len(X)**2) algorithm, but this is just because we use lists
> instead of dicts for the internal implementation of Add. This will change in
> future.

It might make sense to define a generic 'sum' which depends on the
context (the ground type). For Expr instances it should be Add(*...),
for python integer sum(...), for mpmath numbers it would be fsum(...)
etc.

Vinzent

Mateusz Paprocki

unread,
May 30, 2011, 3:52:43 PM5/30/11
to sy...@googlegroups.com
Hi,

That could work:

ZZ.sum([1, 2, 3]) -> sum([1, 2, 3])
RR.sum([1.0, 2.0]) -> mpmath.fsum([1.0, 2.0])
EX.sum([x, y, z]) -> Add(*[x, y, z])

etc.


Vinzent


--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sy...@googlegroups.com.
To unsubscribe from this group, send email to sympy+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.


Mateusz

Vinzent Steinberg

unread,
May 31, 2011, 12:43:23 PM5/31/11
to sympy
On May 30, 9:52 pm, Mateusz Paprocki <matt...@gmail.com> wrote:
> That could work:
>
> ZZ.sum([1, 2, 3]) -> sum([1, 2, 3])
> RR.sum([1.0, 2.0]) -> mpmath.fsum([1.0, 2.0])
> EX.sum([x, y, z]) -> Add(*[x, y, z])
>
> etc.

This is exactly what I have been thinking of.

Vinzent

Ronan Lamy

unread,
May 31, 2011, 1:11:27 PM5/31/11
to sy...@googlegroups.com

But do we really need lots of different sum() functions? Is there a
difference between ZZ.sum([1, 2, 3]) and EX.sum([1, 2, 3])?

SherjilOzair

unread,
Jun 2, 2011, 12:37:49 AM6/2/11
to sympy
This hasn't been implemented yet. Should I go ahead and add it ?
sums for QQ, etc also needed.

See [1] for the types I need.

[1] http://groups.google.com/group/sympy/browse_thread/thread/c793f7030853f3da

On May 31, 9:43 pm, Vinzent Steinberg

Mateusz Paprocki

unread,
Jun 2, 2011, 12:55:16 AM6/2/11
to sy...@googlegroups.com
Hi,

On 2 June 2011 06:37, SherjilOzair <sherji...@gmail.com> wrote:
This hasn't been implemented yet. Should I go ahead and add it ?
sums for QQ, etc also needed.

First it would be good to see where and why this is needed at all. Then we can think about implementation (which is actually trivial).
 

See [1] for the types I need.

[1] http://groups.google.com/group/sympy/browse_thread/thread/c793f7030853f3da

On May 31, 9:43 pm, Vinzent Steinberg
<vinzent.steinb...@googlemail.com> wrote:
> On May 30, 9:52 pm, Mateusz Paprocki <matt...@gmail.com> wrote:
>
> > That could work:
>
> > ZZ.sum([1, 2, 3]) -> sum([1, 2, 3])
> > RR.sum([1.0, 2.0]) -> mpmath.fsum([1.0, 2.0])
> > EX.sum([x, y, z]) -> Add(*[x, y, z])
>
> > etc.
>
> This is exactly what I have been thinking of.
>
> Vinzent

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sy...@googlegroups.com.
To unsubscribe from this group, send email to sympy+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.


Mateusz

SherjilOzair

unread,
Jun 2, 2011, 3:56:02 AM6/2/11
to sympy
Well, if Domain.sum(...) is significantly faster than sum(elements of
Domain), then its pretty reasonable that it should be implemented. sum
is almost as important as +, -, * and /. sum is used in many
algorithms. Matrix algorithms particularly use sum for dot products.

If you see my code for the cholesky and LDL decomposition in [1], I
have used the Add(*(...)) construct freely, as Matrix currently
converts its elements to Sympy objects. So Add works as a good
summation.

But if I want to abstractify this, and make Matrix work for XYZ dtype,
then the sum has to be an abstract one. Python's builtin sum suffices
as an abstract sum, but is slow. If I have the 'domain' variable
storing the domain type in the Matrix object, I can use
domain.sum(...), and the appropriate sum would be called by the
algorithm, Add for Sympy, fsum for mpmath and so on.

[1] https://github.com/sympy/sympy/pull/184/files (lines 534, 536,
573, 575, 599, etc.)


On Jun 2, 9:55 am, Mateusz Paprocki <matt...@gmail.com> wrote:
> Hi,
>
> On 2 June 2011 06:37, SherjilOzair <sherjiloz...@gmail.com> wrote:
>
> > This hasn't been implemented yet. Should I go ahead and add it ?
> > sums for QQ, etc also needed.
>
> First it would be good to see where and why this is needed at all. Then we
> can think about implementation (which is actually trivial).
>
>
>
>
>
>
>
>
>
>
>
> > See [1] for the types I need.
>
> > [1]
> >http://groups.google.com/group/sympy/browse_thread/thread/c793f703085...

Mateusz Paprocki

unread,
Jun 2, 2011, 4:09:12 AM6/2/11
to sy...@googlegroups.com
Hi,

On 2 June 2011 09:56, SherjilOzair <sherji...@gmail.com> wrote:
Well, if Domain.sum(...) is significantly faster than sum(elements of
Domain), then its pretty reasonable that it should be implemented. sum
is almost as important as +, -, * and /. sum is used in many
algorithms. Matrix algorithms particularly use sum for dot products.

If you see my code for the cholesky and LDL decomposition in [1], I
have used the Add(*(...)) construct freely, as Matrix currently
converts its elements to Sympy objects. So Add works as a good
summation.

But if I want to abstractify this, and make Matrix work for XYZ dtype,
then the sum has to be an abstract one. Python's builtin sum suffices
as an abstract sum, but is slow. If I have the 'domain' variable
storing the domain type in the Matrix object, I can use
domain.sum(...), and the appropriate sum would be called by the
algorithm, Add for Sympy, fsum for mpmath and so on.

[1] https://github.com/sympy/sympy/pull/184/files (lines 534, 536,
573, 575, 599, etc.)

This pull request is what is was I was asking for. From this it seems that implementing sum() method in appropriate domains will be beneficial. 

Mateusz

Aaron S. Meurer

unread,
Jun 2, 2011, 7:07:48 PM6/2/11
to sy...@googlegroups.com

Yes. You can sum integers perfectly efficiently just by passing them to the built-in sum() (also remember that ZZ(1) != S(1) in general because of the ground types, but EX(1) == S(1) always). But something like Add(*(x, -x, x, x, −x)) is almost two times faster than sum((x, -x, x, x, -x)):

In [1]: Add(*(x, -x, x, x, -x))
Out[1]: x

In [2]: %timeit Add(*(x, -x, x, x, -x))
1000 loops, best of 3: 170 us per loop

In [3]: sum((x, -x, x, x, -x))
Out[3]: x

In [4]: %timeit sum((x, -x, x, x, -x))
1000 loops, best of 3: 313 us per loop

In [16]: b = map(S, range(100))

In [24]: %timeit Add(*b)
1000 loops, best of 3: 546 us per loop

In [25]: %timeit sum(b)
1000 loops, best of 3: 268 us per loop

(all run without the cache, as suggested above). Note that [16] ensures that sum(b) goes through Add. Otherwise, you get

In [26]: %timeit Add(*range(100))
1000 loops, best of 3: 1.45 ms per loop

In [27]: %timeit sum(range(100))
100000 loops, best of 3: 2.7 us per loop

which just proves what we already know, which is that Python ints are (apparently 10x) faster than SymPy Integers.

Aaron Meurer

Aaron S. Meurer

unread,
Jun 2, 2011, 7:15:18 PM6/2/11
to sy...@googlegroups.com
> In [1] you will find a very simple comparison of Integer, int and mpz. This applies to the rational case as well, just the difference is even bigger.

Did you make those graphs manually, or do you have some program that automates it? It would be nice to have something that can make timing graphs like that with very little work.

Aaron Meurer

Mateusz Paprocki

unread,
Jun 2, 2011, 10:40:23 PM6/2/11
to sy...@googlegroups.com
Hi,

On 3 June 2011 01:15, Aaron S. Meurer <asme...@gmail.com> wrote:
> In [1] you will find a very simple comparison of Integer, int and mpz. This applies to the rational case as well, just the difference is even bigger.

Did you make those graphs manually, or do you have some program that automates it?  It would be nice to have something that can make timing graphs like that with very little work.

It's semi-automated. You can find source code here:

 

Aaron Meurer


--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sy...@googlegroups.com.
To unsubscribe from this group, send email to sympy+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.


Mateusz

Aaron S. Meurer

unread,
Jun 2, 2011, 11:01:10 PM6/2/11
to sy...@googlegroups.com
I see.  It would be nice to have a function, call it timings_graph() or something, that acts similar to the timed() function.  You just pass it some lists of source code that you want to plot (or maybe multiple lists for multiple lines), a list of x-axis values, and other information like axes labels, and it would just create a plot like this.  It would be great way to look at asymptotic behavior and to compare multiple implementations.  We could put it in utilities/timeutils.py.  It would be useful to me to compare risch_integrate() and integrate(), for example. 

Aaron Meurer

Aaron S. Meurer

unread,
Jun 2, 2011, 11:04:14 PM6/2/11
to sy...@googlegroups.com

Ronan Lamy

unread,
Jun 2, 2011, 11:21:50 PM6/2/11
to sy...@googlegroups.com
Le jeudi 02 juin 2011 à 17:07 -0600, Aaron S. Meurer a écrit :
> On May 31, 2011, at 11:11 AM, Ronan Lamy wrote:
>
> > Le mardi 31 mai 2011 à 09:43 -0700, Vinzent Steinberg a écrit :
> >> On May 30, 9:52 pm, Mateusz Paprocki <matt...@gmail.com> wrote:
> >>> That could work:
> >>>
> >>> ZZ.sum([1, 2, 3]) -> sum([1, 2, 3])
> >>> RR.sum([1.0, 2.0]) -> mpmath.fsum([1.0, 2.0])
> >>> EX.sum([x, y, z]) -> Add(*[x, y, z])
> >>>
> >>> etc.
> >>
> >> This is exactly what I have been thinking of.
> >
> > But do we really need lots of different sum() functions? Is there a
> > difference between ZZ.sum([1, 2, 3]) and EX.sum([1, 2, 3])?
>
> Yes. You can sum integers perfectly efficiently just by passing them
> to the built-in sum() (also remember that ZZ(1) != S(1) in general
> because of the ground types, but EX(1) == S(1) always).

ZZ(1) != S(1), but only some times, is rather counter-intuitive. I don't
understand what ZZ is really supposed to mean anyway. If it's the ring
of integers, why can its value change? And why is it instantiable?

> But something
> like Add(*(x, -x, x, x, -x)) is almost two times faster than sum((x,
> -x, x, x, -x)):

That's rather an implementation detail, and using Add(*[...]) is wrong
anyway, if we want to allow extensions of sympy. Besides, it doesn't
explain why EX.sum([n, -n, n, n]) should be different from ZZ.sum([n,
-n, n, n]).

Aaron S. Meurer

unread,
Jun 2, 2011, 11:28:55 PM6/2/11
to sy...@googlegroups.com

On Jun 2, 2011, at 9:21 PM, Ronan Lamy wrote:

> Le jeudi 02 juin 2011 à 17:07 -0600, Aaron S. Meurer a écrit :
>> On May 31, 2011, at 11:11 AM, Ronan Lamy wrote:
>>
>>> Le mardi 31 mai 2011 à 09:43 -0700, Vinzent Steinberg a écrit :
>>>> On May 30, 9:52 pm, Mateusz Paprocki <matt...@gmail.com> wrote:
>>>>> That could work:
>>>>>
>>>>> ZZ.sum([1, 2, 3]) -> sum([1, 2, 3])
>>>>> RR.sum([1.0, 2.0]) -> mpmath.fsum([1.0, 2.0])
>>>>> EX.sum([x, y, z]) -> Add(*[x, y, z])
>>>>>
>>>>> etc.
>>>>
>>>> This is exactly what I have been thinking of.
>>>
>>> But do we really need lots of different sum() functions? Is there a
>>> difference between ZZ.sum([1, 2, 3]) and EX.sum([1, 2, 3])?
>>
>> Yes. You can sum integers perfectly efficiently just by passing them
>> to the built-in sum() (also remember that ZZ(1) != S(1) in general
>> because of the ground types, but EX(1) == S(1) always).
>
> ZZ(1) != S(1), but only some times, is rather counter-intuitive. I don't
> understand what ZZ is really supposed to mean anyway. If it's the ring
> of integers, why can its value change? And why is it instantiable?

ZZ(1) means 1 in the current ground type. If the ground type is Python, it returns int(1). If it's gmpy, it returns mpz(1). Actually, ZZ(1) == S(1) only if you manually choose the sympy ground types. Note that I am using "==" here to mean actually something more like "is". I should really have said "type(ZZ(1)) != type(S(1))". __eq__ itself works as expected regardless of the underlying type.

In general, ZZ stands for the domain of integers, and you can pass it to Poly under the domain option (you can also do stuff like ZZ[x] to create the domain of polynomials in x with integer coefficients). ZZ(1) is just syntactic sugar that makes it easy to coerce elements to that domain.

>
>> But something
>> like Add(*(x, -x, x, x, -x)) is almost two times faster than sum((x,
>> -x, x, x, -x)):
>
> That's rather an implementation detail, and using Add(*[...]) is wrong
> anyway, if we want to allow extensions of sympy. Besides, it doesn't
> explain why EX.sum([n, -n, n, n]) should be different from ZZ.sum([n,
> -n, n, n]).

Do you mean that n is symbolic? If so, then ZZ.sum([n, -n, n, n]) doesn't make sense.

Aaron Meurer

>
>>
>> In [1]: Add(*(x, -x, x, x, -x))
>> Out[1]: x
>>
>> In [2]: %timeit Add(*(x, -x, x, x, -x))
>> 1000 loops, best of 3: 170 us per loop
>>
>> In [3]: sum((x, -x, x, x, -x))
>> Out[3]: x
>>
>> In [4]: %timeit sum((x, -x, x, x, -x))
>> 1000 loops, best of 3: 313 us per loop
>>
>> In [16]: b = map(S, range(100))
>>
>> In [24]: %timeit Add(*b)
>> 1000 loops, best of 3: 546 us per loop
>>
>> In [25]: %timeit sum(b)
>> 1000 loops, best of 3: 268 us per loop
>>
>> (all run without the cache, as suggested above). Note that [16]
>> ensures that sum(b) goes through Add. Otherwise, you get
>>
>> In [26]: %timeit Add(*range(100))
>> 1000 loops, best of 3: 1.45 ms per loop
>>
>> In [27]: %timeit sum(range(100))
>> 100000 loops, best of 3: 2.7 us per loop
>>
>> which just proves what we already know, which is that Python ints are
>> (apparently 10x) faster than SymPy Integers.
>>
>> Aaron Meurer
>>
>
>

Ronan Lamy

unread,
Jun 3, 2011, 12:09:45 AM6/3/11
to sy...@googlegroups.com
OK. So "ZZ" means "integer type". That's what I find counter-intuitive
then.

> In general, ZZ stands for the domain of integers, and you can pass it
> to Poly under the domain option (you can also do stuff like ZZ[x] to
> create the domain of polynomials in x with integer coefficients).
> ZZ(1) is just syntactic sugar that makes it easy to coerce elements to
> that domain.

OK. By domain, you mean type, right? (That 'ZZ[x]' is a rather confusing
bit of syntactic sugar that seems to suggest a mathematical meaning it
doesn't really have and looks weird when you replace ZZ with any of its
possible values)

> >
> >> But something
> >> like Add(*(x, -x, x, x, -x)) is almost two times faster than sum((x,
> >> -x, x, x, -x)):
> >
> > That's rather an implementation detail, and using Add(*[...]) is wrong
> > anyway, if we want to allow extensions of sympy. Besides, it doesn't
> > explain why EX.sum([n, -n, n, n]) should be different from ZZ.sum([n,
> > -n, n, n]).
>
> Do you mean that n is symbolic? If so, then ZZ.sum([n, -n, n, n])
> doesn't make sense.
>

It would make sense if 'ZZ' meant 'the ring of integers'. Actually, even
though that's not the case, it works anyway because ZZ.sum is always
just sum.

Aaron S. Meurer

unread,
Jun 3, 2011, 12:15:35 AM6/3/11
to sy...@googlegroups.com

Yeah, ZZ, QQ, etc. are all just about syntactic sugar that makes things much shorter to type. You can do all of the stuff equivalently in much longer forms. And by the way, the ZZ[x] syntax just exists because that's the mathematical notation (the square brackets). The real confusing thing is that ZZ(x) is not the field of rational functions in x with integer coefficients (for that you need ZZ.frac_field(x)).

>
>>>
>>>> But something
>>>> like Add(*(x, -x, x, x, -x)) is almost two times faster than sum((x,
>>>> -x, x, x, -x)):
>>>
>>> That's rather an implementation detail, and using Add(*[...]) is wrong
>>> anyway, if we want to allow extensions of sympy. Besides, it doesn't
>>> explain why EX.sum([n, -n, n, n]) should be different from ZZ.sum([n,
>>> -n, n, n]).
>>
>> Do you mean that n is symbolic? If so, then ZZ.sum([n, -n, n, n])
>> doesn't make sense.
>>
> It would make sense if 'ZZ' meant 'the ring of integers'. Actually, even
> though that's not the case, it works anyway because ZZ.sum is always
> just sum.

That depends on if type checking happens on this level or not (and if it does work, it's only by accident).

Aaron Meurer

Ronan Lamy

unread,
Jun 3, 2011, 3:41:17 PM6/3/11
to sy...@googlegroups.com
A lot of time is wasted for the sake of a few keystrokes. To understand
anything using "domains", you need to unpick all the levels of
indirection (e.g. ZZ = ZZ_python = PythonIntegerRing which is a wrapper
for PythonIntegerType = int) and understand of a lot of methods and
syntactic shortcuts. It's much easier to understand 'isinstance(a,
int_type)' than 'ZZ.of_type(a)', even if the latter is shorter.

Vinzent Steinberg

unread,
Jun 4, 2011, 7:56:29 AM6/4/11
to sympy
On 3 Jun., 21:41, Ronan Lamy <ronan.l...@gmail.com> wrote:
> > Yeah, ZZ, QQ, etc. are all just about syntactic sugar that makes things
> > much shorter to type. You can do all of the stuff equivalently in much
> > longer forms.  And by the way, the ZZ[x] syntax just exists because
> > that's the mathematical notation (the square brackets).  The real
> > confusing thing is that ZZ(x) is not the field of rational functions in
> > x with integer coefficients (for that you need ZZ.frac_field(x)).
>
> A lot of time is wasted for the sake of a few keystrokes. To understand
> anything using "domains", you need to unpick all the levels of
> indirection (e.g. ZZ = ZZ_python = PythonIntegerRing which is a wrapper
> for PythonIntegerType = int) and understand of a lot of methods and
> syntactic shortcuts. It's much easier to understand 'isinstance(a,
> int_type)' than 'ZZ.of_type(a)', even if the latter is shorter.

Shouldn't 'ZZ.of_type(a)' rather be 'ZZ.contains(a)' or 'a in ZZ', for
the sake of consistency with mathematical notation?

Vinzent

Ronan Lamy

unread,
Jun 4, 2011, 12:24:21 PM6/4/11
to sy...@googlegroups.com

No, it shouldn't, because ZZ isn't a set. That's indeed the problem with
using the name of an abstract mathematical set to designate a concrete
type. "Symbol('n', integer=True) in ZZ" looks like something that should
obviously be True, but actually it doesn't make sense. So 'isinstance(a,
int_type)' isn't just more readable, it also prevents dangerous
misunderstandings.

BTW, ZZ.__contains__ does exist and does something different from
ZZ.of_type.

Vinzent Steinberg

unread,
Jun 5, 2011, 9:13:20 AM6/5/11
to sympy
On 4 Jun., 18:24, Ronan Lamy <ronan.l...@gmail.com> wrote:
> No, it shouldn't, because ZZ isn't a set. That's indeed the problem with
> using the name of an abstract mathematical set to designate a concrete
> type.

If I understand it correctly, ZZ is the implementation of the ring of
integers, so mathematically it is nothing but a set with some special
elements and a few mappings, satisfying some axioms. For convenience,
you refer to the set only, even if you mean the ring. So I think the
implementation is fine.

> "Symbol('n', integer=True) in ZZ" looks like something that should
> obviously be True, but actually it doesn't make sense.

Well, in polys this would be rather ZZ[n], because ZZ means concrete
integers and not symbolic ones.

> So 'isinstance(a,
> int_type)' isn't just more readable, it also prevents dangerous
> misunderstandings.

> BTW, ZZ.__contains__ does exist and does something different from
> ZZ.of_type.

The difference is subtle, ZZ.__contains__ tries to convert the type,
but ZZ.of_type only checks the type. It is like '==' vs. 'is'.

Vinzent

Ronan Lamy

unread,
Jun 5, 2011, 2:38:26 PM6/5/11
to sy...@googlegroups.com
Le dimanche 05 juin 2011 à 06:13 -0700, Vinzent Steinberg a écrit :
> On 4 Jun., 18:24, Ronan Lamy <ronan.l...@gmail.com> wrote:
> > No, it shouldn't, because ZZ isn't a set. That's indeed the problem with
> > using the name of an abstract mathematical set to designate a concrete
> > type.
>
> If I understand it correctly, ZZ is the implementation of the ring of
> integers, so mathematically it is nothing but a set with some special
> elements and a few mappings, satisfying some axioms. For convenience,
> you refer to the set only, even if you mean the ring. So I think the
> implementation is fine.
>
Your understanding seems to contradict Aaron's. ZZ can't be
simultaneously a proxy for a concrete type and a representation of an an
abstract mathematical concept.

> > "Symbol('n', integer=True) in ZZ" looks like something that should
> > obviously be True, but actually it doesn't make sense.
>
> Well, in polys this would be rather ZZ[n], because ZZ means concrete
> integers and not symbolic ones.

But we need symbolic integers if we want to do any symbolic manipulation
on integers. In the main core, there's .is_integer and .is_Integer, they
are different and we need both. Why change the rules in the polynomial
core?

Aaron Meurer

unread,
Jun 5, 2011, 7:06:33 PM6/5/11
to sy...@googlegroups.com
On Sun, Jun 5, 2011 at 12:38 PM, Ronan Lamy <ronan...@gmail.com> wrote:
> Le dimanche 05 juin 2011 à 06:13 -0700, Vinzent Steinberg a écrit :
>> On 4 Jun., 18:24, Ronan Lamy <ronan.l...@gmail.com> wrote:
>> > No, it shouldn't, because ZZ isn't a set. That's indeed the problem with
>> > using the name of an abstract mathematical set to designate a concrete
>> > type.
>>
>> If I understand it correctly, ZZ is the implementation of the ring of
>> integers, so mathematically it is nothing but a set with some special
>> elements and a few mappings, satisfying some axioms. For convenience,
>> you refer to the set only, even if you mean the ring. So I think the
>> implementation is fine.
>>
> Your understanding seems to contradict Aaron's. ZZ can't be
> simultaneously a proxy for a concrete type and a representation of an an
> abstract mathematical concept.
>
>> > "Symbol('n', integer=True) in ZZ" looks like something that should
>> > obviously be True, but actually it doesn't make sense.
>>
>> Well, in polys this would be rather ZZ[n], because ZZ means concrete
>> integers and not symbolic ones.
>
> But we need symbolic integers if we want to do any symbolic manipulation
> on integers. In the main core, there's .is_integer and .is_Integer, they
> are different and we need both. Why change the rules in the polynomial
> core?

Because you can't really work with symbolic integers, or at least not
in the way that you can work with actual ones. It would be awesome if
you could. But what is quo(x**n - 1, x - 1)? Or factor(x**n - 1)?
(these are actually a nice questions because it's not to difficult to
represent the answer using a summation, but you can imagine how
difficult it can be in general)?

That's symbolic powers, but the same thing applies to symbolic
coefficients. For example, what is quo(x**2 + n*x + m, x - 1)? Maybe
you could generate a general answer, but how that's actually
relatively simple. How about determining if x**3 + n*x**2 + m*x + k
is irreducible over the rationals, where n, m, and k are all
rationals. If n, m, and k are given, it can be determined, but for
symbolic n, m, and k an answer would involve all kinds of crazy number
theory functions, if an answer can even be expressed at all.

Aaron Meurer

>
>> > So 'isinstance(a,
>> > int_type)' isn't just more readable, it also prevents dangerous
>> > misunderstandings.
>>
>> > BTW, ZZ.__contains__ does exist and does something different from
>> > ZZ.of_type.
>>
>> The difference is subtle, ZZ.__contains__ tries to convert the type,
>> but ZZ.of_type only checks the type. It is like '==' vs. 'is'.
>>
>> Vinzent
>>
>
>

Ronan Lamy

unread,
Jun 5, 2011, 8:02:35 PM6/5/11
to sy...@googlegroups.com

Yes, I understand why the polynomials are implemented the way they are.
What I'm trying to understand is, given this implementation, why claim
that "domains" have a symbolic meaning they don't have? If domains are
to be extracted from sympy.polys and used elsewhere, this needs to be
clarified.

Aaron Meurer

unread,
Jun 5, 2011, 8:05:31 PM6/5/11
to sy...@googlegroups.com

Anywhere that uses the polys domains (like the matrices) should use
them in the same way that the polys uses them. But I do agree that it
would be useful to have some kind of "set of integers" object, and
that it would be convenient to give it the name ZZ. I wonder what the
best way around this would be.

Aaron Meurer

Reply all
Reply to author
Forward
0 new messages