31 views

Skip to first unread message

Aug 10, 2008, 11:41:21 PM8/10/08

to sy...@googlegroups.com

Hi,

first a very good news --- thanks to Kirill, the latest sympy-hg got

again about 10% faster in the last 3 days, so on the benchmark below,

we are now only about 4x slower than sympycore (if you remember, it

used to be couple hundreds times slower half a year ago).

I was curious how fast it would be if we rewrote SymPy directly in C,

using the current design (Add/Mul/Pow classes). I haven't done any

coding in C for couple years, so I had to relearn basically

everything, but I managed to do a very quick draft in the last two

days, you can see for yourself here:

$ hg clone http://freehg.sympy.org/u/certik/csympy/

$ cd csympy

$ make

$ ./t.py

[...]

(1 + y + z + x)^10

# of terms: 1771

time spent doing e+e2: 0.0344660282135

time doing multinomial_coefficients: 0.0283851623535

total time: 0.0746591091156

The above "total time" is is the result of this benchmark in sympy:

In [1]: e = (x+y+z+1)**10

In [2]: %time f = (e**2).expand()+e.expand()

CPU times: user 0.47 s, sys: 0.00 s, total: 0.47 s

Wall time: 0.47 s

csympy:

In [4]: %time f = (e**2).expand()+e.expand()

CPU times: user 0.13 s, sys: 0.00 s, total: 0.13 s

Wall time: 0.13 s

sympycore:

In [1]: from sympycore import var

In [2]: var("x y z")

Out[2]: (Calculus('x'), Calculus('y'), Calculus('z'))

In [3]: e = (x+y+z+1)**10

In [4]: %time f = (e**2).expand()+e.expand()

CPU times: user 0.12 s, sys: 0.00 s, total: 0.12 s

Wall time: 0.12 s

sympycore with a compiled C extension:

In [4]: %time f = (e**2).expand()+e.expand()

CPU times: user 0.09 s, sys: 0.00 s, total: 0.09 s

Wall time: 0.09 s

ginac (the timing varies from 0.4s to 0.6s):

#include "ginac.h"

using namespace std;

using namespace GiNaC;

int main()

{

symbol x("x"), y("y"), z("z");

ex e = pow(x+y+z+1, 10);

ex e2 = pow(e, 2);

ex f = e2.expand()+e.expand();

//cout << f << endl;

return 0;

}

$ time ./test

real 0m0.052s

user 0m0.048s

sys 0m0.004s

The problem with the above timings is that the timings in ipython are

much (50%) higher compared to a measurement in the python script using

timeit.default_timer(), but I don't have time to dig into this. So you

can see that on this particular benchmark, sympy is less than 4x

slower than sympycore, csympy is somewhere between sympycore and

sympycore+C, while ginac is still a little faster, but quite

negligibly on this particular benchmark.

Technical remarks:

* csympy is written in C using structs and functions pointers that

implements the OO style and the sympy design. Everything is in the

basic.c file, you can find there add, mul, pow, symbol and integer

structs and functions (methods) operating on it. Just like sympy, only

in C. Look here for a simple example how it works:

http://wiki.sympy.org/wiki/Playing_with_SymPy_design_in_C

Important point is that you don't need to use some other class scheme

like in sympycore in order to achieve competitive speeds.

* it is exported to Python using Cython, so you can use it just like sympy

* for the above expansion, it uses the multinomial_coefficients from

sympy, that is written in pure python (the C code uses Cython to call

it). Possibly implementing this function in C too, one could get a

little more speedup

* csympy uses a HashTable from glib in Add and Mul classes. The table

seems to preserve the order, so no sorting is necessary. We may look

at how it is implemented internally and maybe exploit things even more

for our needs

* I haven't done any real profiling of the code, so I suspect it could

be made even faster

* it uses reference counting, but currently it is a little buggy, so

the freeing of objects is disabled to prevent segfaults. You can

easily enable it (just uncomment the free(self) line in the

basic__dealloc__ method), the "t.py" script runs with no noticeable

slowdown with it. The reference counting needs to be fixed.

* currently only C ints are used, which means that for the above

benchmark, the ints wrap around to negative values. I haven't

implemented Python ints in csympy yet.

What to do next:

* sympy will remain pure python by default

* we'll continue speeding things up using our current approach, plus

code some parts in Cython (plus maybe C), those will be optional

modules for people that need better speed and don't mind compiling

stuff.

* I may play with csympy a little more to fix the reference counting

and then figure out how to hook up sympy (python) objects in it. I

would like to get as fast as ginac with this design. Actually faster.

:) Basically what is needed are just __hash__ and __eq__ methods, then

it should work fine. Then one could easily extend the basic core which

is in C using pure Python classes. However, this approach is only a

playground to get the perspective of how fast one can get with this

design. This is because this kind of basic manipulation is not very

useful without assumptions and many other simplifications that we do

in SymPy.

* using our experience with csympy, we'll try to speedup our

Add/Mul/Pow classes and also our Poly class (which should be

optionally coded in Cython).

Anyway, I had this design in C in my head for a long time, so I am

glad I managed to put this into code and I am surprised that I got

almost as fast as ginac in just 2 days. I won't have time to work on

it much in the nearest time frame, but if anyone is interested, let me

know what you think.

Ondrej

Aug 12, 2008, 2:07:20 AM8/12/08

to sy...@googlegroups.com

On Mon, Aug 11, 2008 at 05:41:21AM +0200, Ondrej Certik wrote:

>

> Hi,

>

> first a very good news --- thanks to Kirill, the latest sympy-hg got

> again about 10% faster in the last 3 days,

>

> Hi,

>

> first a very good news --- thanks to Kirill, the latest sympy-hg got

> again about 10% faster in the last 3 days,

Just to clarify on what this "10% faster" means:

e = (x+y+z+1)**32

%time %time

integrate(x**3*sin(x), x) q = e.expand()

cache cache

off on off on

old: 7.55 s 4.02 s 18.8 s 14.08 s

new: 5.74 s 3.77 s 6.07 s 5.73 s

speedup: 31% 6.6% 3.1x 2.5x

old = a94579fb550a

new = 8a31d8eebe32

I think we'll converge to Cython which is the way to go.

--

Всего хорошего, Кирилл.

Aug 12, 2008, 7:06:19 AM8/12/08

to sy...@googlegroups.com

On Tue, Aug 12, 2008 at 8:07 AM, Kirill Smelkov

<ki...@landau.phys.spbu.ru> wrote:

>

> On Mon, Aug 11, 2008 at 05:41:21AM +0200, Ondrej Certik wrote:

>>

>> Hi,

>>

>> first a very good news --- thanks to Kirill, the latest sympy-hg got

>> again about 10% faster in the last 3 days,

>

> Just to clarify on what this "10% faster" means:

>

>

> e = (x+y+z+1)**32

>

> %time %time

> integrate(x**3*sin(x), x) q = e.expand()

>

> cache cache

> off on off on

>

> old: 7.55 s 4.02 s 18.8 s 14.08 s

> new: 5.74 s 3.77 s 6.07 s 5.73 s

>

>

> speedup: 31% 6.6% 3.1x 2.5x

>

> old = a94579fb550a

> new = 8a31d8eebe32

<ki...@landau.phys.spbu.ru> wrote:

>

> On Mon, Aug 11, 2008 at 05:41:21AM +0200, Ondrej Certik wrote:

>>

>> Hi,

>>

>> first a very good news --- thanks to Kirill, the latest sympy-hg got

>> again about 10% faster in the last 3 days,

>

> Just to clarify on what this "10% faster" means:

>

>

> e = (x+y+z+1)**32

>

> %time %time

> integrate(x**3*sin(x), x) q = e.expand()

>

> cache cache

> off on off on

>

> old: 7.55 s 4.02 s 18.8 s 14.08 s

> new: 5.74 s 3.77 s 6.07 s 5.73 s

>

>

> speedup: 31% 6.6% 3.1x 2.5x

>

> old = a94579fb550a

> new = 8a31d8eebe32

Exactly. Thanks for the clarification.

>> Anyway, I had this design in C in my head for a long time, so I am

>> glad I managed to put this into code and I am surprised that I got

>> almost as fast as ginac in just 2 days. I won't have time to work on

>> it much in the nearest time frame, but if anyone is interested, let me

>> know what you think.

>

> I think we'll converge to Cython which is the way to go.

Yes. I tried to fix the reference counting (which I need to do by hand

in C) and it was a night mare. I gave up, this is not a maintainable

solution. So I wrote the very simple sympy core in python here:

http://freehg.sympy.org/u/certik/csympy2/

and I ported the nice dictionary from glib using Cython. The plan now

is to make it calculate the expansion benchmark above correctly (it

almost does) and then use Cython to convert all the classes to

extension classes (basically just C struct) and optimize things to be

as fast as my C solution, that I abandoned.

This is done with the aim of merging with sympy immediatelly once we

get a fast/maintainable solution. (of course the cython version will

only be optional).

The aim is to get as fast as maple and mathematica or ginac. And then

port it to sympy, thus making sympy as fast too.

Ondrej

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu