[ANNOUNCE]: sympyx -- sympy core in cython


Ondrej Certik

Aug 14, 2008, 9:23:19 AM
to sy...@googlegroups.com, sage-...@googlegroups.com
Hi,

we wrote a new sympy core from scratch using Cython. It runs roughly an order
of magnitude faster than the current sympy, yet it uses exactly the same
architecture, so it will be merged soon. We call it sympyx, because
Cython/Pyrex files end with the .pyx suffix.

git repository ("tree -> snapshot" to get the tarball, see also [1]):
http://repo.or.cz/w/sympyx.git

hg repository: http://landau.phys.spbu.ru/~kirr/cgi-bin/hg.cgi/sympyx-hg

Our aim is to provide a Python library for symbolic mathematics that is as fast
as Mathematica or Maple (or Magma), and provides all features that one needs
for symbolics or calculus. We would also like it to run in pure Python as a
(slower) option, because that has many advantages (and we hope Cython will
allow us to do so, see below).

First a little appetizer:

$ git clone git://repo.or.cz/sympyx
$ cd sympyx
$ ipython
In [1]: from sympy import var
I: import sympy_pyx ...
No module named sympy_pyx
W: can't import sympy_pyx -- will be pure python

In [2]: var("x y z")
Out[2]: (x, y, z)

In [3]: e = (x+y+z+1)**10

In [4]: f = e*(e+1)

In [5]: time g = f.expand()
CPU times: user 0.40 s, sys: 0.00 s, total: 0.41 s
Wall time: 0.41 s

In [6]: import os

In [7]: time os.system("./test")
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.34 s

Where [5] is our pure Python sympy core and [7] is this C++ program
calling ginac:

#include "ginac.h"

using namespace std;
using namespace GiNaC;

int main()
{
    symbol x("x"), y("y"), z("z");
    ex e = pow(x+y+z+1, 10);
    //ex f = pow(e, 2).expand()+e.expand();
    ex f = (e*(e+1)).expand();
    //cout << f << endl;
    return 0;
}


So as you can see, our pure python core is almost as fast as C++ ginac for this
particular benchmark. Now let's go to the main dish and compile our cythonized
core:

$ make
gcc -I/usr/include/python2.5 -I/usr/include/python2.5 -g -O0 -fPIC
-c -o sympy_pyx.o sympy_pyx.c
gcc -shared sympy_pyx.o -o sympy_pyx.so
$ ipython
In [1]: from sympy import var
I: import sympy_pyx ... ok

In [2]: var("x y z")
Out[2]: (Symbol(x), Symbol(y), Symbol(z))

In [3]: e = (x+y+z+1)**10

In [4]: f = e*(e+1)

In [5]: time g = f.expand()
CPU times: user 0.13 s, sys: 0.00 s, total: 0.13 s
Wall time: 0.13 s

And we blow ginac away. Ginac is expanding in a suboptimal way though,
so we can help it by uncommenting the line:

//ex f = pow(e, 2).expand()+e.expand();

in the C++ driver (and commenting the original line). Now:

In [2]: time os.system("./test")
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.05 s

So if ginac trivially improves its expansion algorithm, it will be
about 2.5x faster on this particular benchmark.

Our core is just 600 lines and compiles in about 1 or 2 s, while a stripped-down
ginac takes several minutes to compile (am I right?), using heavy C++.

Yes, we know we should measure the time in C++ directly, otherwise we
are also measuring the cost of creating the Linux process, but then it's
again not fair, because timings in ipython are always bigger on my
machine than measuring things directly. Also, when I ran the benchmarks
yesterday on my laptop, I got slightly different results (I would say up
to 10%) than today. So rather run the benchmarks yourself to get a
feeling for it.


As you saw above, benchmarks are almost like statistics to me (à la
lies, damned lies and statistics [2]), but nevertheless you can run the
./benchmarks.py script to get at least some impression of the actual speed.
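If you want to time one of its cases by hand, something like this works (a
minimal sketch using the standard timeit module, not necessarily how
benchmarks.py itself measures things):

import timeit

# time the "e=(x+y+z+1)**10;f=e*(e+1);f.expand()" case by hand
setup = """
from sympy import var
x, y, z = var("x y z")
e = (x+y+z+1)**10
f = e*(e+1)
"""
t = timeit.Timer("f.expand()", setup=setup)
print "f.expand(): %f s" % min(t.repeat(repeat=3, number=1))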

sympyx - cython module:

$ ./benchmarks.py
I: import sympy_pyx ... ok
I: Running SymPy
e=(x+y+z+1)**10;f=e*(e+1);f.expand(): 0.087618
e=(x+y+z+1)**10; f=e**2+e; f.expand(): 0.084371
e=(x+y+z+1)**20; f=e**2+e; f.expand(): 0.836535
e=(x+y+z+1)**30; f=e**2+e; f.expand(): 3.838012
e=(x+y+z+1)**10; e.expand(): 0.005631
e=(x+y+z+1)**50; e.expand(): 0.685875
e=((x**x+y**y+z**z)**10 * (x**y+y**z+z**x)**10); e.expand(): 0.479764
Add(x,<random integer>,y), 2000x: 0.053736
Mul(x,<random integer>,y), 2000x: 0.033046
sum(x**i/i,i=1..400): 1.087146
sum(x**i/i,i=1..400), using Add(terms): 0.012360


sympyx - pure python module:

$ ./benchmarks.py
I: import sympy_pyx ... fail (No module named sympy_pyx)
W: can't import sympy_pyx -- will be pure python
I: Running SymPy
e=(x+y+z+1)**10;f=e*(e+1);f.expand(): 0.377762
e=(x+y+z+1)**10; f=e**2+e; f.expand(): 0.384766
e=(x+y+z+1)**20; f=e**2+e; f.expand(): 3.315983
e=(x+y+z+1)**30; f=e**2+e; f.expand(): 15.782160
e=(x+y+z+1)**10; e.expand(): 0.019719
e=(x+y+z+1)**50; e.expand(): 2.505715
e=((x**x+y**y+z**z)**10 * (x**y+y**z+z**x)**10); e.expand(): 2.189778
Add(x,<random integer>,y), 2000x: 0.220534
Mul(x,<random integer>,y), 2000x: 0.118482
sum(x**i/i,i=1..400): 6.332842
sum(x**i/i,i=1..400), using Add(terms): 0.070217


sympycore - with compiled C module:

$ ./benchmarks.py
I: Running sympycore
e=(x+y+z+1)**10;f=e*(e+1);f.expand(): 0.527226
e=(x+y+z+1)**10; f=e**2+e; f.expand(): 0.060202
e=(x+y+z+1)**20; f=e**2+e; f.expand(): 0.400932
e=(x+y+z+1)**30; f=e**2+e; f.expand(): 1.397105
e=(x+y+z+1)**10; e.expand(): 0.006567
e=(x+y+z+1)**50; e.expand(): 0.654393
e=((x**x+y**y+z**z)**10 * (x**y+y**z+z**x)**10); e.expand(): 0.101733
Add(x,<random integer>,y), 2000x: 0.023955
Mul(x,<random integer>,y), 2000x: 0.036462
sum(x**i/i,i=1..400): 0.007237
sum(x**i/i,i=1..400), using Add(terms): 0.006226

sympycore -- pure python:

$ ./benchmarks.py
I: Running sympycore
e=(x+y+z+1)**10;f=e*(e+1);f.expand(): 3.649253
e=(x+y+z+1)**10; f=e**2+e; f.expand(): 0.105257
e=(x+y+z+1)**20; f=e**2+e; f.expand(): 0.794901
e=(x+y+z+1)**30; f=e**2+e; f.expand(): 2.648821
e=(x+y+z+1)**10; e.expand(): 0.013228
e=(x+y+z+1)**50; e.expand(): 1.268088
e=((x**x+y**y+z**z)**10 * (x**y+y**z+z**x)**10); e.expand(): 0.375300
Add(x,<random integer>,y), 2000x: 0.048301
Mul(x,<random integer>,y), 2000x: 0.067446
sum(x**i/i,i=1..400): 0.014150
sum(x**i/i,i=1..400), using Add(terms): 0.012438
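For reference, the last two lines of each table time two ways of building the
same sum; in current sympy syntax (the exact sympyx/sympycore spelling may
differ slightly) they correspond roughly to:

from sympy import Add, Integer, Symbol

x = Symbol("x")
n = 400

# "sum(x**i/i,i=1..400)": build the sum term by term with +,
# constructing a new Add at every step
s = Integer(0)
for i in range(1, n + 1):
    s = s + x**i / i

# "sum(x**i/i,i=1..400), using Add(terms)": collect all the terms first
# and construct the Add once, which is why it is so much faster
s2 = Add(*[x**i / i for i in range(1, n + 1)])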


Again, sympycore doesn't use the optimal expansion algorithm, but after fixing
that and compiling its C module, it's usually either about as fast or 2x to 5x
faster, depending on the benchmark. Also note that sympycore was developed over
the last 10 or more months, while we developed the above 600-line script in
about 3 evenings, using exactly the same architecture as sympy, thus making it
easy to merge the research code back (as opposed to sympycore, which is very
difficult to merge back due to its different internal design and a couple of
thousand lines of code). However, both Kirill and I learned from our experience
in SymPy and also took some useful code and ideas from sympycore (the
multinomial_coefficients, which btw is now in Sage too), so sympycore was
helpful, not only by providing a competitive environment, but also by providing
some reusable code. Our discussions at EuroSciPy with Pearu also greatly helped
me, so it's not that we can do something better in 3 evenings while other
people need 10 months; it's rather the opposite: other people needed 10 months
of hard work and we just stood on their shoulders.

Looking at the C code generated by Cython, there is still room for
improvement, so we expect to be as fast as ginac in the future. Lists and
dicts are slow, currently going through too many Python C/API calls, so this
needs to be moved to C (or better, customized with Cython interfaces) -- btw,
I think this is the reason why you, Gary, call C/API stuff from your code? I am
sure there is a more maintainable way to achieve the same thing though. The big
difference is that our design makes it extremely easy to extend things and add
new functions and other features to the basic symbolic engine.

Example:

from sympy import *


class sin(Basic):

    def __new__(cls, arg):
        if arg == 0:
            return Integer(0)
        else:
            obj = Basic.__new__(cls, (arg,))
            return obj

    def __repr__(self):
        return "sin(%s)" % self.args[0]

x = Symbol("x")
print ((sin(1)+sin(0)+x)**2).expand()


This will print:

I: import sympy_pyx ... ok
x^2 + sin(1)^2 + 2*x*sin(1)


And you can see that sin(0) gets simplified to 0 and absorbed in Add,
and the rest is correctly expanded (the "I" line is just our debugging
print).

The symbolic engine can call any virtual functions, so it is optimized in C
using Cython, yet it naturally allows extensions in pure Python (or Cython,
whatever you want). For me, this is the Holy Grail.

Now we are going to speed things up more, add a little more functionality, like
functions and some more simplifications, to see that our engine doesn't break
into pieces, and then port it back to sympy. That's the main point -- I don't
know how to merge back sympycore (nor do I know how to merge back ginac, for
that matter), but I know how to merge back sympyx --- just copy the functions
from the pure Python sympyx back to sympy; this alone should provide great
speedups. And then cythonize sympy, just like we cythonized sympyx.

BTW, we are really looking forward to being able to maintain just one codebase
and get both Cython-generated C files and pure Python files out of it. Maybe
providing a pure Python backend to Cython? Or parsing the Cython file to
produce a file that runs in Python? Or changing the Cython syntax to be Python
compatible?
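For now, sympyx simply tries the compiled module at import time and falls back
to the pure Python implementation (that is what the "I: import sympy_pyx ..."
lines above are about); a minimal sketch of that pattern, with hypothetical
module names rather than the actual sympyx source:

# prefer the compiled Cython core, fall back to the pure Python one
try:
    from sympy_pyx import Basic, Symbol, Integer, Add, Mul
    print "I: import sympy_pyx ... ok"
except ImportError, e:
    print "I: import sympy_pyx ... fail (%s)" % e
    print "W: can't import sympy_pyx -- will be pure python"
    from sympy_py import Basic, Symbol, Integer, Add, Mul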

Anyway, our goal is to be a symbolic library for Python that is a viable
alternative to Mathematica. For that we need features and we need speed. In
SymPy we have some features, and the number of contributors shows that its
approach is viable. Sympyx brings speed using the same architecture as sympy.
This should bring us much closer to Mathematica, but not *that* close --- we
still need to speed sympyx up maybe 10x, but as said above, looking into our
generated C code, we think it may be possible.

The important thing is that even if we cannot make it 10x faster with this
architecture, it still brings huge improvements to sympy, thus boosting its
usability. And after we get all the changes in, we may pursue some other paths
to symbolic manipulation to make it even faster.

Opinions, Comments, Suggestions?

Ondrej & Kirill

[1] We are experimenting with git, because it provides much better
features than mercurial, has a regular release process and has a very
broad and active community. What we miss in mercurial: patch rebasing
(this is really a pain in mercurial), showstopper bugs that take months
to get fixed due to a slow release process, qrecord (this was recently
implemented by Kirill in mercurial), colored output of diffs, many
places on the internet to host the repo, and more. Git has all of that,
and we are just tired of fixing mercurial when we can just use git and
spend our time on better things. That said, we are not advocating
switching sympy to git until we are 100% sure we can do the same things
in git on all platforms, just more easily. See also:
http://wiki.sympy.org/wiki/Git_hg_rosetta_stone

[2] http://en.wikipedia.org/wiki/Lies,_damned_lies,_and_statistics

chu-ching huang

Aug 14, 2008, 10:57:23 AM
to sympy
I just followed your example to test. On my laptop, a Toshiba A100 running
Slax Linux:

1. python-sympy
time for f.expand(): 6.77 sec
2. sympy_pyx
time: 0.01 sec

I don't know why the result is different from yours. But it is really
great!



cch

Ondrej Certik

Aug 14, 2008, 11:27:03 AM
to sy...@googlegroups.com
On Thu, Aug 14, 2008 at 4:57 PM, chu-ching huang
<cch...@mail.cgu.edu.tw> wrote:
>
> Just follow your example to test. In my laptop, Toshiba A100, slax
> Linux
>
> 1. python-sympy
> time for f.expand(): 6.77sec
> 2. sympy_pyx
> time : 0.01sec
>
> I don't know why the result is different from yours.

Maybe you used the current sympy-hg or the latest release? Those give
around 6 s on my laptop.

sympyx has a pure Python and a cythonized core -- both are much faster
than the current sympy (the cythonized version is about 4x or more
faster than the pure Python version).

> But it is very
> great!

Thanks. We'll try to merge soon, after more feedback and testing.

Ondrej

Robert Bradshaw

Aug 14, 2008, 3:07:13 PM
to sage-...@googlegroups.com, sy...@googlegroups.com
On Thu, 14 Aug 2008, Ondrej Certik wrote:

>
> Hi,
>
> we wrote new sympy core from scratch using Cython, that runs roughly an order
> of magnitude faster than the current sympy, yet it uses exactly the same
> architecture, thus it will be merged soon. We call it sympyx, because
> Cython/Pyrex files ends with .pyx suffix.
>
> git repository ("tree -> snapshot" to get the tarball, see also [1]):
> http://repo.or.cz/w/sympyx.git
>
> hg repository: http://landau.phys.spbu.ru/~kirr/cgi-bin/hg.cgi/sympyx-hg
>
> Our aim is to provide Python library for symbolic mathematics that is as fast
> at Mathematica or Maple (or Magma), and provides all features that one needs
> for symbolics or calculus. We would also optionally like to run in pure Python
> as a (slower) option, because that has many advantages (and we hope Cython will
> allow us to do so, see below).

Yes, being able to use a single codebase in both a Python and Cython
context is a high priority and will hopefully be in Cython in not too
long (though right now my focus has been on the NumPy stuff). Also, you
mentioned that in some places Cython produces sub-optimal code. Please
report them here

http://trac.cython.org/cython_trac/

Needing to use the Python/C API directly, IMHO, usually indicates a
deficiency in Cython. The results below are encouraging, but I am
relatively sure (and hope) there is still more speed to be gained.

Ondrej Certik

Aug 16, 2008, 2:42:27 PM
to sy...@googlegroups.com, sage-...@googlegroups.com
>> Our aim is to provide Python library for symbolic mathematics that is as fast
>> at Mathematica or Maple (or Magma), and provides all features that one needs
>> for symbolics or calculus. We would also optionally like to run in pure Python
>> as a (slower) option, because that has many advantages (and we hope Cython will
>> allow us to do so, see below).
>
> Yes, being able to use a single codebase in both a Python and Cython
> context is a high priority and will hopefully be in Cython in not too
> long (though right now my focus has been on the NumPy stuff). Also, you

Thanks, I am really looking forward to it.

> mentioned that in some places Cython produces sub-optimal code. Please
> report them here
>
> http://trac.cython.org/cython_trac/
>
> Needing to use the Python/C API directly is, IMHO, usually indicates a
> deficiency in Cython.

We need to analyze this much more deeply, and we'll probably do that after
we merge this with SymPy. I did some benchmarks with my original pure
C code and, to my surprise, we are already as fast! This is really
cool. It's not so much that Cython produces faster C code than I
did, but rather that I optimized some algorithms a little bit, as it is
much easier in Python/Cython than in C.

However, I think we maybe should just use C arrays instead of Python
lists, and maybe a couple more things like that, to move things from
Python to C.

> The results below are encouraging, but I am
> relatively sure (and hope) there is still more speed to be gained.

I am pretty sure too.

Ondrej

Robert Bradshaw

Aug 16, 2008, 5:57:32 PM
to sage-...@googlegroups.com, Ondrej Certik, sy...@googlegroups.com

If you use C arrays then you will be back into the quagmire of doing
your own garbage collecting again. Python lists are actually pretty
fast as long as you don't access them via the generic mechanisms--you
can declare your variables to be of type list and the fast macros will
be used to manipulate them.

Ondrej Certik

Aug 16, 2008, 6:36:31 PM
to sy...@googlegroups.com, sage-...@googlegroups.com

You mean to convert this:

def test(sum):
    cdef i
    a = []
    a.append(5)
    a.append(6)
    a.append(7)
    print a

to this:

def test(sum):
    cdef i
    cdef list a
    a = []
    a.append(5)
    a.append(6)
    a.append(7)
    print a


?

We use the second version, e.g.:

cdef tuple _args

I studied the generated source code for the above and basically the
only difference is that in the first case cython uses
__Pyx_PyObject_Append that does one more check PyList_CheckExact,
while in the second case cython uses PyList_Append() directly. But
looking at Python sources, PyList_CheckExact macro reduces to this:

PyAPI_FUNC(int) PyType_IsSubtype(PyTypeObject *, PyTypeObject *);
#define PyObject_TypeCheck(ob, tp) \
((ob)->ob_type == (tp) || PyType_IsSubtype((ob)->ob_type, (tp)))

but the first condition should evaluate to 1, so the possibly slow
function PyType_IsSubtype should not be called, so this is just one
pointer dereference and one comparison in C, isn't it? So that's
nothing.

So what is possibly slow is PyList_Append(), which calls the app1()
function, which checks the size of the array, possibly makes it larger,
and calls PyList_SET_ITEM

#define PyList_SET_ITEM(op, i, v) (((PyListObject *)(op))->ob_item[i] = (v))

which again is superfast. Ah, I didn't realize everything is actually so
fast internally. Hm, so I don't see any way to make this faster, except
to allocate the list all at once at the beginning (i.e. don't resize it);
otherwise it seems to be just doing the things that I'd do by hand
anyway, just better. Ok, we need to profile it to see how to speed it
up.
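For concreteness, here is a minimal Cython sketch of the two constructions
discussed above (a typed list filled with append() vs. a list allocated once
up front and filled by index); the function names are hypothetical and this is
not sympyx code:

def build_by_append(int n):
    # typed list: cython can call PyList_Append() directly
    cdef list a
    cdef int i
    a = []
    for i in range(n):
        a.append(i)
    return a

def build_preallocated(int n):
    # allocate the list once, then fill it in place (no resizing)
    cdef list a
    cdef int i
    a = [None] * n
    for i in range(n):
        a[i] = i
    return a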

So do you view Cython as a way to write C code with automatic memory
handling and in a better (high level) syntax?

Ondrej

Robert Bradshaw

Aug 17, 2008, 4:36:02 AM
to sympy
Yes, exactly.

> We use the second version, e.g.:
>
> cdef tuple _args
>
> I studied the generated source code for the above and basically the
> only difference is that in the first case cython uses
> __Pyx_PyObject_Append that does one more check PyList_CheckExact,
> while in the second case cython uses PyList_Append() dirrectly. But
> looking at Python sources, PyList_CheckExact macro reduces to this:
>
> PyAPI_FUNC(int) PyType_IsSubtype(PyTypeObject *, PyTypeObject *);
> #define PyObject_TypeCheck(ob, tp) \
> ((ob)->ob_type == (tp) || PyType_IsSubtype((ob)->ob_type, (tp)))

Actually, PyList_CheckExact is

#define PyList_CheckExact(op) ((op)->ob_type == &PyList_Type)

There is also the issue of how much code fits into cache, which makes
the list version better in some cases.

>
> but the first condition should evaluate to 1, so the possibly slow
> function PyType_IsSubtype should not be called, so this is just one
> pointer dereference and one comparison in C, isn't it? So that's
> nothing.
>
> So what is possibly slow is PyList_Append(), that calls app1()
> function, that checks the size of the array, possibly makes it larger
> and calls PyList_SET_ITEM
>
> #define PyList_SET_ITEM(op, i, v) (((PyListObject *)(op))->ob_item[i] = (v))
>
> which again is superfast. Ah, I didn't realize everything is so fast
> actually internally. Hm, so I don't see any ways to get this faster.

Yeah, manually managing a variable sized pointer array isn't going to
be any better...

> Only to allocate the list at once at the beginning (i.e. don't resize
> it), but otherwise it seems to be just doing things that I'd do by
> hand anyway. Just better. Ok, we need to profile it to see how to
> speed it up.

When I implemented list comprehension I did some timings of pre-
allocating the array (assuming no errors and no if statements) vs.
just using PyList_Append, and was surprised at the closeness of the
two in terms of speed (don't remember the exact figures, but it was
<10%, and I think it was more like 3-5% which just wasn't worth the
extra complexity at the time).

I don't think indexing is optimized (yet) if the type is declared.

> So do you view Cython as a way to write C code with automatic memory
> handling and in a better (high level) syntax?

Mostly, yes. There are still, and probably always will be, things that
are better written in C, but for the kind of stuff I am interested in
coding [C|P]ython is a much better level of abstraction to work at. I
think this is true of a lot of code that is written nowadays...

- Robert

Ondrej Certik

Aug 17, 2008, 10:23:41 AM
to sy...@googlegroups.com, sage-...@googlegroups.com

Indeed, I was looking at the macro above. So this is even simpler.

>
> There is also the issue of how much code fits into cache, which makes
> the list version better in some cases.
>
>>
>> but the first condition should evaluate to 1, so the possibly slow
>> function PyType_IsSubtype should not be called, so this is just one
>> pointer dereference and one comparison in C, isn't it? So that's
>> nothing.
>>
>> So what is possibly slow is PyList_Append(), that calls app1()
>> function, that checks the size of the array, possibly makes it larger
>> and calls PyList_SET_ITEM
>>
>> #define PyList_SET_ITEM(op, i, v) (((PyListObject *)(op))->ob_item[i] = (v))
>>
>> which again is superfast. Ah, I didn't realize everything is so fast
>> actually internally. Hm, so I don't see any ways to get this faster.
>
> Yeah, manually managing a variable sized pointer array isn't going to
> be any better...
>
>> Only to allocate the list at once at the beginning (i.e. don't resize
>> it), but otherwise it seems to be just doing things that I'd do by
>> hand anyway. Just better. Ok, we need to profile it to see how to
>> speed it up.
>
> When I implemented list comprehension I did some timings of pre-
> allocating the array (assuming no errors and no if statements) vs.
> just using PyList_Append, and was surprised at the closeness of the
> two in terms of speed (don't remember the exact figures, but it was
> <10%, and I think it was more like 3-5% which just wasn't worth the
> extra complexity at the time).

This is very interesting.

>
> I don't think indexing is optimized (yet) if the type is declared.
>
>> So do you view Cython as a way to write C code with automatic memory
>> handling and in a better (high level) syntax?
>
> Mostly, yes. There are still, and probably always will be, things that
> are better written in C, but for the kind of stuff I am interested in
> coding [C|P]ython is a much better level of abstraction to work at. I
> think this is true of a lot of code that is written nowadays...

On Sun, Aug 17, 2008 at 12:51 AM, Fernando Perez <fpere...@gmail.com> wrote:


>
> On Sat, Aug 16, 2008 at 3:36 PM, Ondrej Certik <ond...@certik.cz> wrote:
>
>> So do you view Cython as a way to write C code with automatic memory
>> handling and in a better (high level) syntax?
>

> I know I do :) And with the new-and-improving numpy multi-dimensional
> array support, I view it as a way to get a civilized, high-level
> language for numerical computing that can be every way as performant
> as C, with the most glaring and insulting deficiency of C fixed: the
> lack of proper array support.
>
> That's why I keep feeding sacrificial animals to my secret voodoo
> Cython shrine to keep the team coding 24 hours a day ;)


That is great. As my experience showed again this time, my C code
version is just plainly worse than the Cython version, which is already
just as fast, and there are still ways to improve it.

BTW, I just discovered that if I do

import gc
gc.disable()

at the beginning of the script doing this:

e = (x+y+z+1)**20
f = e*(e+1)
g = f.expand()

its time goes from 0.46 s to 0.35 s on my laptop. I don't know what this
suggests though, whether it's good news or bad news.
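For completeness, the whole measurement fits in a few lines (a minimal sketch;
the var() setup and the time calls are my additions, not the exact script I
used):

import gc
gc.disable()        # switch off cyclic garbage collection for the run

import time
from sympy import var

x, y, z = var("x y z")

t0 = time.time()
e = (x+y+z+1)**20
f = e*(e+1)
g = f.expand()
print "f.expand() took %f s" % (time.time() - t0)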
BTW, ginac (if I help it manually with the expansion) takes 0.237 s on
this, so it is only 1.5x faster. If I leave ginac to expand it as above,
it takes 23 s!

So that's why I don't want to use ginac: it will require fixing anyway,
so I'd rather fix things in Python and Cython; that will be much better
in the long term.

Ondrej

Ondrej Certik

Aug 17, 2008, 2:16:19 PM
to sage-...@googlegroups.com, sy...@googlegroups.com
>> A few remarks about your benchmark: I believe it's Richard Fateman who
>> introduced this kind of benchmark, in order to be able to time
>> multiplication of two large expressions. That's why it is written e*(e
>> +1) and not e^2+e (since computing a multivariate power is faster e.g.
>> by repeating multiplication). GiNaC and giac do the multiplication
>> without rearrangement, which of course costs more on this particular
>> example but is for generic expressions the right way to do things (and
>> you can benchmark both ways using the two different expressions).
>> I don't mean that benchmarking time to compute powers is not
>> interesting -and it seems you are now not far from ginac and probably
>> not far from giac (I don't know how faster your PC is compared to
>> mine), but benchmarking general multivariate multiplication would be
>> more representative of what will be required for symbolic expansion.
>
> +1 I was about to write the same thing.

I agree. My point though is that I as a user simply don't care why it
is slow. I simply type the expression in, type e.expand() and expect
the library to use the fastest way.

So I also tried the

e = ((x+y+z+1)**10).expand()
f = ( e*(e+1) ).expand()

benchmark and indeed, sympyx is about 8x slower than ginac on this one
(6.5x with gc disabled). So we need to optimize this particular
operation more.

As to gc, I am not saying it should be disabled by default. But if
it's an easy option to speed things up, it's good to know about it.

Bernard, I tried giac and it compiled, that's great (it depends on
latex2html though, which is in Debian non-free and I don't have it
installed, so the build fails; maybe you could consider not depending
on it). If I create the initial giac spkg for you, together with some
preliminary Cython bindings, would you be willing to do the work, or
find some student, to make giac work with Sage easily?

I'd like to have it around as a way to benchmark advanced things like
limits, integration, etc.

Ondrej

Vinzent Steinberg

Aug 17, 2008, 4:15:25 PM
to sympy


On Aug 14, 3:23 pm, "Ondrej Certik" <ond...@certik.cz> wrote:
> [1] We are experimenting with git, because it provides much better
> features than mercurial, has regular release process and has a very
> broad and active community. What we miss in mercurial: patches
> rebasing (this is really a pain in mercurial), showstopper bugs that
> takes months to get fixed due to a slow release process, qrecord (this
> was recently implemented by Kirill in mercurial), colored output of
> diffs, many places on the internet to host the repo and many more. Git
> has all of that and we are just tired of fixing mercurial if we can
> just use git and spend our time on better things. That said, we are
> not advocating to switch sympy to git until we are 100% sure we can do
> the same things in git on all platforms, just easier. See also: http://wiki.sympy.org/wiki/Git_hg_rosetta_stone

Git is probably the fastest and most feature-rich VCS. But its
Windows support is not optimal.
Have you ever tried Bazaar? It's actively maintained and pure Python,
while being sufficiently fast.
I would prefer it over git. Any objections?

http://bazaar-vcs.org/

Vinzent

Ondrej Certik

Aug 17, 2008, 4:49:00 PM
to sy...@googlegroups.com

In my own experience Bazaar is even slower than Mercurial. But what we
are going to do is leave the current Mercurial repo of SymPy as it is,
and I'll just be using git locally on my machine. Then, when we get more
used to git, we'll create a live sync from the mercurial repo, so that
anyone can use it. But we are definitely not going to stop supporting
mercurial anytime soon, exactly because of Windows.

We could, on the other hand, figure out how to provide a Bazaar repo as
well, synced with our main hg repo, so that you can use what you like
best. If you know how to do it, let me know.

Ondrej

Vinzent Steinberg

Aug 17, 2008, 5:24:57 PM
to sympy
When did you test the performance? And which things are too slow for
you?
I'll be glad to use git; I'm working mainly on Linux anyway, and I'm
not afraid of the hundreds of megabytes the Windows port ships as a
dependency.

>
> We could on the other hand figure out how to provide a baazar repo as
> well, synced with our main hg repo, so that you can use what you like
> the best. If you know how to do it, let me know.
>
> Ondrej

For me bzr is fast enough, and it seems very feature-rich and usable
to me.

Vinzent

Ondrej Certik

Aug 17, 2008, 5:42:22 PM
to sy...@googlegroups.com

Now.

> And which things are too slow for
> you?

Each bzr command is slow.

$ time bzr status

real 0m0.381s
user 0m0.320s
sys 0m0.060s

$ time git status
# On branch master
nothing to commit (working directory clean)

real 0m0.041s
user 0m0.000s
sys 0m0.008s

git is just 10x faster. I hate to wait almost 0.4 s just to see the
status if git can tell me the same info immediately.
And with other things, like cloning, branching and stuff, it's just my
empirical experience that git feels fast, while bzr feels terribly
slow.


> I'll be glad to use git, I'm working mainly on Linux anyway, and I'm
> not afraid of the hundreds of megabytes the Windows port ships as
> dependency.

Or you can just continue using mercurial as you do today.

>
>>
>> We could on the other hand figure out how to provide a baazar repo as
>> well, synced with our main hg repo, so that you can use what you like
>> the best. If you know how to do it, let me know.
>>
>> Ondrej
>
> For me bzr is fast enough, and it seems very feature-rich and usable
> to me.

Sure, use what you find best; only you can decide what is best for
you, not me (I only know what is good for me). :) I think it's a good
idea to support all 3 major distributed systems, and it is technically
doable.

Ondrej

Ondrej Certik

Aug 17, 2008, 5:44:28 PM
to sy...@googlegroups.com

For the record, mercurial is somewhere in the middle:

$ time hg status

real 0m0.266s
user 0m0.232s
sys 0m0.036s

(git and hg were tested on the same revision of sympy, bzr on ipython).

Ondrej
