I put a print statement at the top of Expr.expand and another at the
end where it returns, and based on the output for some test
expressions, the issue is not so much that it is slow but that it is
called a lot of times. I included a simple counter, and it is called
thousands of times. And note that the function is cached, so the print
statements are only called once per identical input.
I then added "print self" to the output, and found a very startling
thing. Almost all of the inputs were already expanded. So I then used
a keyword argument to create a global variable to count the number of
times that expand has been called, and the number of times that the
input is different from the output. I then ran the tests on several
submodules.
The results (number of times input == output / number of times
expand() was called):
solvers: 11487/13341 (86.1%)
integrals: 18952/21791 (87.0%)
physics: 6657/8394 (79.3%)
matrices: 418/500 (83.6%)
core: 1786/2109 (84.7%)
polys: 2619/2822 (92.8%)
And keep in mind that this is with the cache *on*. So this is only
counting globally unique inputs to expand() for each submodule.
So for these six modules, only about 15% of the time is expand doing
anything useful. It would seem that it's not so much expand((x + y +
z)**20) that we need to make fast, but rather just expand(x + y + z).
Some ideas:
- We should be more careful when creating Poly to use expand=False
whenever we know for sure that the input will never need to be
expanded. I got a lot of mileage out of this trick in the risch code.
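As a sketch of what I mean (expand is a documented option of the Poly
constructor), when the input is known to already be in expanded form:

```python
from sympy import symbols, Poly

x, y = symbols('x y')

# The input here is already expanded, so the implicit expand() pass
# inside the Poly constructor would be a pure no-op; skip it.
p1 = Poly(x**2 + 2*x*y + y**2, x, y, expand=False)
p2 = Poly(x**2 + 2*x*y + y**2, x, y)  # default expand=True
assert p1 == p2
```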
- expand currently works by recursively calling _eval_expand_hint on
an expression. So if you have something like 1 + 2*log(x*y), it calls
(1 + 2*log(x*y))._eval_expand_log(), then
Integer(1)._eval_expand_log(), then (2*log(x*y))._eval_expand_log(),
and finally log(x*y)._eval_expand_log(). But it is only this final
form that actually does anything. This is a lot of recursive function
calls, which can be slow in Python. We should think of ways to reduce
this, while still keeping the API open (and backwards compatible if
possible).
What it *really* should do in this case is something like
expr.xreplace({l: l._eval_expand_log() for l in expr.atoms(log)}).
Except according to the API, expr.atoms(log) should really be replaced
with something that gets all objects with _eval_expand_log defined.
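Using the public expand_log wrapper in place of the private hook, the
idea looks like this (with positive symbols so the log expansion is
valid):

```python
from sympy import symbols, log, expand_log

x, y = symbols('x y', positive=True)
expr = 1 + 2*log(x*y)

# Expand only the log atoms, then splice the results back in with a
# single (non-recursive) xreplace pass, instead of calling the
# expand-log hook on every subexpression of the tree.
expanded = expr.xreplace({l: expand_log(l) for l in expr.atoms(log)})
assert expanded == expand_log(expr)
```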
- For expand_mul and expand_multinomial, the real workhorses of
expand(), we should add a fast check for whether the expression is a
polynomial. This would break the API as it currently sits, though. I
guess we need something like the reverse of the idea from
https://code.google.com/p/sympy/issues/detail?id=3338.
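A cheap structural pre-check might look like this (hypothetical helper;
the exact conditions would need tuning before it could replace the real
thing):

```python
from sympy import symbols

def is_already_expanded(expr):
    # Hypothetical fast check: a tree with no Mul containing an Add
    # factor, and no positive integer power of an Add, should be a
    # fixed point of expand_mul/expand_multinomial.
    if expr.is_Mul and any(a.is_Add for a in expr.args):
        return False
    if expr.is_Pow and expr.base.is_Add and expr.exp.is_Integer and expr.exp > 0:
        return False
    return all(is_already_expanded(a) for a in expr.args)

x, y = symbols('x y')
assert is_already_expanded(x**2 + 2*x*y + y**2)
assert not is_already_expanded((x + y)**2)
assert not is_already_expanded(x*(x + y))
```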
- I wonder if it would be possible to do a sort of "structural"
caching. How easy is it to say, "is expression 1 the same as
expression 2, up to a permutation of the Symbols?" (and actually
give the permutation), and most importantly, to determine that
efficiently? How complicated would an expression have to be for it to
be faster to apply a permutation of Symbols against a cached result
instead of just doing things the way we do now? Even if it's slower, I
guess it would be much more efficient memory-wise for the cache.
- How can we use Poly to make expand() faster? This is a question that
I hope Mateusz has an answer to.
By the way, does anyone know of a more efficient way to perform the
above profiling? I know that cProfile can count the number of times that
a function is called, but is there any way to extend it to check input
== output for each call, and count the results? My simple method made
things run very slowly.
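For concreteness, the counting boils down to a wrapper along these
lines (a sketch, not the actual instrumentation I used):

```python
import functools

from sympy import expand, symbols

def count_noops(func):
    # Count total calls, and calls where the function returns its
    # input unchanged (i.e. where it did no useful work).
    stats = {'calls': 0, 'noops': 0}
    @functools.wraps(func)
    def wrapper(expr, *args, **kwargs):
        stats['calls'] += 1
        result = func(expr, *args, **kwargs)
        if result == expr:
            stats['noops'] += 1
        return result
    wrapper.stats = stats
    return wrapper

x, y = symbols('x y')
counted = count_noops(expand)
counted(x + y)        # already expanded: a no-op
counted((x + y)**2)   # does real work
assert counted.stats == {'calls': 2, 'noops': 1}
```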
I'm also curious what profiling methods you use, Mateusz. Anything
beyond cProfile and kernprof (and timeit)?
Aaron Meurer