euroscipy2008, sympy, sympycore update

Ondrej Certik

Jul 26, 2008, 6:37:22 PM

to sage-...@googlegroups.com, sy...@googlegroups.com

Hi,

Here is a quick update on what is happening at EuroSciPy 2008.

I came with Mateusz Paprocki (the author of the parallel
integration/heuristic Risch in SymPy, fast polynomials, nontrivial
simplification, term rewriting...)
and Robert Cimrman (the author of sfepy.org), and we were finally able
to meet with Pearu Peterson, the main author of sympycore.

If you have any concerns, wishes, or comments, please share them now;
tomorrow is the last day on which we can discuss things face to face
(this really helps!).

If you want to know more about the sympy/sympycore split, read our wiki:
http://code.google.com/p/sympy/wiki/SymPyCore
where everyone who wanted to comment on this has expressed their opinion.

We went over the core of SymPy with Pearu and tried to figure out why
object creation is so slow (around 30x slower, if the latest benchmarks
are correct), as this is the main slowdown compared to sympycore (maybe
there are others too). Then we discussed how to move forward.
I think the fact that we were finally able to meet with Pearu greatly
helped us understand what we want to do in the future
and how, at least from my side. We both agreed that we want to have a
fast symbolic manipulation package in Python; the question is how to
get there. I noticed a new update on sympycore's webpage:

"
Question: What is the main difference between SympyCore and SymPy projects?

Answer: SympyCore is a research project while SymPy is a software project.
"

I agree with that. In SymPy, the most important thing is that it works
and gets the job done: that all tests pass, that we review all
patches that go in, that we do frequent releases, and so on.
Pearu, on the other hand, really wants to get the core symbolic
manipulation done as fast as possible, and this needs a lot of
experimenting. Only things that are maintainable and agreed upon among
sympy developers can go into sympy. Anyone who disagrees with
something is welcome to present their arguments; I encourage anyone who
disagrees with us/me to speak up (in a constructive way). That's imho
the only way to find a solution that we all agree is a good one.

A nice thing about sympycore is that Pearu (and Fredrik, another sympy developer
and Google Summer of Code student, also an author of mpmath) has done
a great job of making it very fast in pure Python (and even writing an optional C
extension module that gives an additional 2x speedup), and this creates
real pressure on sympy to become fast too, otherwise people will not
use it. This keeps me (and I am sure other sympy devs) in a pretty
motivated mood. :) Which is good; even Linus says that internal
competition is good.

To the outside world (and I think Linus stresses that too in some
interview) I'd like us to look united, though. And I'd like us to look united
with Sage too; I just believe it is more efficient for all of us. This
means being able to easily convert expressions between all the projects
and trying to take advantage of each other's work. For my part, I
greatly improved the Sage/SymPy conversion lately; I am just waiting
for the Sage 3.0.6 release, as I don't want to update sympy twice
per Sage release. Next I think it's a good idea to start
using Sage optionally for stuff that we cannot do (yet). That way, the
barrier is lowered for users to choose the package (be it sage,
maxima, sympy or anything else) that does the job for them.

A problem that I see is that we just don't know what stuff is needed
in the core. I just talked about this with Mateusz, who implemented a
*lot* of nontrivial stuff in sympy, and after all of that he is also
not sure what exactly the core should have. I am not sure either, and I
don't think there is any way to find out other than to
get all the nontrivial stuff done and then improve. The aim is to
write a core that can be used to build stuff on top of it, but my
point is that without actually writing the advanced stuff on top of
it, one cannot be sure what needs to be done in the core.

So I can say what we will do in SymPy. We have our roadmap here:

http://wiki.sympy.org/wiki/Plan_for_SymPy_1.0

We just polished the assumptions, so now we'll try to make Add/Mul/Pow
much faster using the same representation as sympycore, while
continuing to fix and add features to SymPy and writing
general assumptions, because only then will we have a general CAS. The
aim is to get as fast as sympycore. If we manage, with the help of
Pearu, to disentangle the core, that would be nice, so that stuff is more
modular. If not, then not. We are also thinking with Kirill (another
sympy developer) and Pearu about how to rewrite stuff in Cython to make it
faster, but it is highly nontrivial: one can speed up basic stuff like
expansion, and we will do that, but if we also want to speed up
integration and advanced simplification, then it is not clear at all how
to optimize that with Cython.

When Travis posts the videos of the presentations, I'll send a link.

Ondrej

Gael Varoquaux

Jul 26, 2008, 6:43:34 PM

to sy...@googlegroups.com

On Sun, Jul 27, 2008 at 12:37:22AM +0200, Ondrej Certik wrote:
> Next I think it's a good idea to start using Sage optionally for stuff
> that we cannot do (yet).

Does that mean anything as far as the license of sympy? I am trying to
figure out how using Sage could fit with a BSD license.

Gaël

Ondrej Certik

Jul 27, 2008, 1:57:04 AM

to sy...@googlegroups.com

Just like with gmpy (LGPL). If the user has it installed, sympy could
use it. If he doesn't, sympy won't use it. The same with Sage.
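
The "use it only if it is installed" pattern can be sketched roughly as below. This is only an illustration, not the actual sympy code: the helper names load_optional and make_integer are made up for this example, and the sketch assumes that gmpy, when present, provides the mpz integer type.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# gmpy is picked up only when the user has installed it themselves;
# nothing from gmpy is shipped with or required by this code.
gmpy = load_optional("gmpy")


def make_integer(value):
    """Hypothetical helper: use gmpy's mpz if available, else a plain int."""
    if gmpy is not None:
        return gmpy.mpz(value)
    return int(value)


print(make_integer(42))
```

The same structure would apply to Sage: try the import, and fall back to the pure Python code path if it fails.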

Ondrej

Gael Varoquaux

Jul 27, 2008, 2:01:25 AM

to sy...@googlegroups.com

On Sun, Jul 27, 2008 at 07:57:04AM +0200, Ondrej Certik wrote:
> > Does that mean anything as far as the license of sympy? I am trying to
> > figure out how using Sage could fit with a BSD license.

> Just like with gmpy (LGPL). If the user has it installed, sympy could
> use it. If he doesn't, sympy won't use it. The same with Sage.

Hmm, I hate to discuss these things, as I don't want to give the
impression that I am a license fanatic, but I am wondering whether sympy calling
sage would be similar to linking, and thus propagate the GPL to
sympy.

Gaël

Pearu Peterson

Jul 27, 2008, 3:45:00 AM

to sympy

On Jul 27, 6:01 am, Gael Varoquaux <gael.varoqu...@normalesup.org> wrote:

That's an interesting point. If that were true, then the GPL would also
propagate from fftw to scipy, for instance. I hope it's not true.

Pearu

Ondrej Certik

Jul 27, 2008, 3:47:02 AM

to sy...@googlegroups.com, sage-...@googlegroups.com

No, please discuss these things, it is very important. Gmpy is fine,
because it's LGPL. But a GPL library is imho fine too; for example, scipy
also optionally uses umfpack, which is GPL. The crucial point is that it
is optional. Imho. Sympy can use sage even now: it has the _sage_
methods, which import sage and call it. So, does this make sympy GPL? I
don't know, but I think it doesn't. Scipy is also BSD and it contains
code that can call umfpack.

I think what would make sympy GPL would be if we took some GPL code
from Sage and put it into sympy. We would have to ask the particular
sage developer who wrote it whether he's fine with making it BSD. So far,
every time I asked it was ok (it's true it was mostly utility
scripts, but also the Groebner bases implementation from Martin
Albrecht). Fortunately, most sage and sympy devels are license
agnostic in the sense that they don't mind contributing their code to
either a GPL or a BSD project, but some are not, so one needs to choose
a license that allows us to work together. One needs to
define "us", of course, but most of our users and developers (so far) are from
the scipy/numpy/ipython crowd, which uses BSD-like licenses. I know that
Kirill advocates LGPL, right Kirill? :)
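
To make the _sage_ idea concrete, here is a rough sketch of a method that imports sage only when it is called. The Symbol class below is a toy stand-in, not SymPy's real class, and to_sage_or_none is a hypothetical helper added for illustration; the sketch assumes that sage.all provides var, as it does in a normal Sage install.

```python
class Symbol:
    """Toy stand-in for a symbolic variable (not SymPy's actual class)."""

    def __init__(self, name):
        self.name = name

    def _sage_(self):
        # Sage is imported only when a conversion is requested, so this
        # module can be imported and used without Sage being installed.
        import sage.all as sage
        return sage.var(self.name)


def to_sage_or_none(expr):
    """Hypothetical helper: convert to Sage if possible, else return None."""
    try:
        return expr._sage_()
    except ImportError:
        return None


x = Symbol("x")
print(to_sage_or_none(x))
```

Nothing from Sage is copied into the module; the only coupling is the import inside the method.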

I think BSD is the best choice for SymPy now, but as always, anyone
feel free to argue with me, present your arguments, let's discuss.

Ondrej

Pearu Peterson

Jul 27, 2008, 4:30:07 AM

to sympy

On Jul 26, 10:37 pm, "Ondrej Certik" <ond...@certik.cz> wrote:

> A problem that I see is that we just don't know what stuff is needed
> in the core. I just talked about this with Mateusz, who implemented a
> *lot* of nontrivial stuff in sympy, and after all of that he is also
> not sure what exactly the core should have. I am not sure either, and I
> don't think there is any way to find out other than to
> get all the nontrivial stuff done and then improve. The aim is to
> write a core that can be used to build stuff on top of it, but my
> point is that without actually writing the advanced stuff on top of
> it, one cannot be sure what needs to be done in the core.

In general, the core should provide basic data structures for representing
symbolic expressions + basic methods to manipulate these data
structures.
The core should also define a generic interface to these data structures
(to support changing the internals of the core).

I think the pattern for representing symbolic expressions
that we use in sympycore is quite general: a symbolic expression is
represented as a pair: head and data. The head part defines how to
interpret the data. So the core can define various heads and the
corresponding methods to access and manipulate the data parts. Note that this approach
does not specify what concrete mathematical concept the symbolic
expression represents - this will be defined in a library part that uses the core.
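
As a rough illustration of the head/data idea (a minimal sketch, not sympycore's actual classes or head names), the core could hold nothing more than the pair, while a separate "library" function decides what each head means. The names Expr, SYMBOL, NUMBER, ADD, MUL and evaluate are all invented for this example.

```python
class Expr:
    """Core object: a symbolic expression stored as a (head, data) pair."""
    __slots__ = ("head", "data")

    def __init__(self, head, data):
        self.head = head
        self.data = data

    def __eq__(self, other):
        return (isinstance(other, Expr)
                and self.head == other.head and self.data == other.data)

    def __hash__(self):
        return hash((self.head, self.data))

    def __repr__(self):
        return "Expr(%r, %r)" % (self.head, self.data)


# Hypothetical heads; each one says how the data part is to be read.
SYMBOL = "SYMBOL"   # data: the symbol's name
NUMBER = "NUMBER"   # data: a Python number
ADD = "ADD"         # data: tuple of operand expressions
MUL = "MUL"         # data: tuple of operand expressions


def evaluate(expr, env):
    """Library layer: interpret the heads as ordinary arithmetic."""
    if expr.head == NUMBER:
        return expr.data
    if expr.head == SYMBOL:
        return env[expr.data]
    if expr.head == ADD:
        return sum(evaluate(arg, env) for arg in expr.data)
    if expr.head == MUL:
        result = 1
        for arg in expr.data:
            result *= evaluate(arg, env)
        return result
    raise ValueError("unknown head: %r" % (expr.head,))


x = Expr(SYMBOL, "x")
two_x_plus_3 = Expr(ADD, (Expr(MUL, (Expr(NUMBER, 2), x)), Expr(NUMBER, 3)))
print(evaluate(two_x_plus_3, {"x": 4}))
```

Here Expr knows nothing about arithmetic; a different library function could give the same heads another meaning, for example matrices or logical expressions.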

Pearu

Gael Varoquaux

Jul 27, 2008, 11:43:56 AM

to sy...@googlegroups.com

On Sun, Jul 27, 2008 at 09:47:02AM +0200, Ondrej Certik wrote:
> No, please discuss these things, it is very important. Gmpy is fine,
> because it's LGPL. But a GPL library is imho fine too; for example, scipy
> also optionally uses umfpack, which is GPL. The crucial point is that it
> is optional.

I have googled a bit, and I can see that interpretations of the
linking clause of the GPL vary. Some believe that if the GPL code is not
distributed along with the code that uses it, the code that uses it is
not contaminated by the GPL. Others believe that the GPLed code must be
released under a linking exception for this to hold.

> I think BSD is the best choice for SymPy now, but as always, anyone
> feel free to argue with me, present your arguments, let's discuss.

I am very happy about sympy being BSD. Currently I am paid, and pretty
well, by a company (Enthought) to not only work on their own BSD-licensed
code, but also contribute to other BSD-licensed code (ipython, numpy,
scipy). The BSD license is an important aspect of this ecosystem.

Next year, I'll be working in neuroimaging. In this world there is a lot
of code written. Money comes from the government, but it can also come
from companies building scanners (GE, Siemens). The companies will not
fund GPL or LGPL code, because they modify the code for their scanners
and want to keep the modifications secret. However, they have been
funding some very nice BSD projects (VTK, ITK, slicer3D --
http://www.slicer.org/ ). I like this model, and I am very happy that
there is a large amount of high-quality BSD code available.

This is not to say that I dislike GPL or LGPL, I just have the feeling
that BSD fits better in our specialised field of scientific computing
libraries.