SymPy presentations at SageDays (online, June 1-3)


Matthias Köppe

Apr 10, 2022, 9:44:03 PM
to sympy
We are in the early stages of planning an online SageDays event (https://wiki.sagemath.org/days112.358). Wondering if some SymPy developers would be interested in presenting? I think there's a lot of potential for synergy between the projects.

We plan to use this event also for onboarding participants in the Sage GSoC (https://wiki.sagemath.org/GSoC); if there's interest, perhaps there could be some shared activities with SymPy in this direction too.
 

Aaron Meurer

Apr 12, 2022, 8:17:22 AM
to sy...@googlegroups.com
On Sun, Apr 10, 2022 at 7:44 PM Matthias Köppe <matthia...@gmail.com> wrote:
>
> We are in the early stages of planning an online SageDays event (https://wiki.sagemath.org/days112.358). Wondering if some SymPy developers would be interested in presenting? I think there's a lot of potential for synergy between the projects.

What sort of presentation would you like to see from SymPy?

>
> We plan to use this event also for onboarding participants in the Sage GSoC (https://wiki.sagemath.org/GSoC); if there's interest, perhaps there could be some shared activities with SymPy in this direction too.

Yes, possibly. You should circle back with us on this after Google
announces the GSoC projects in May.

Aaron Meurer

>
>
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/32169621-b6e2-4e1a-86d3-ed5b88a0ecfdn%40googlegroups.com.

Matthias Köppe

Apr 12, 2022, 2:26:28 PM
to sympy
Hi Aaron,

On Tuesday, April 12, 2022 at 5:17:22 AM UTC-7 asme...@gmail.com wrote:
On Sun, Apr 10, 2022 at 7:44 PM Matthias Köppe <matthia...@gmail.com> wrote:
> We are in the early stages of planning an online SageDays event (https://wiki.sagemath.org/days112.358). Wondering if some SymPy developers would be interested in presenting? I think there's a lot of potential for synergy between the projects.

What sort of presentation would you like to see from SymPy?

I'd be interested in:
- General overview of recent-ish features of SymPy
- status and plans regarding use of FLINT - the Sage interface to it misses many of the more recent developments in the FLINT 2.x series (https://trac.sagemath.org/ticket/31408), so there's a potential for synergy here
- status and plans regarding SymEngine 
- status and plans regarding solveset for several variables
- status and plans regarding the assumptions facility

> We plan to use this event also for onboarding participants in the Sage GSoC (https://wiki.sagemath.org/GSoC); if there's interest, perhaps there could be some shared activities with SymPy in this direction too.

Yes, possibly. You should circle back with us on this after Google
announces the GSoC projects in May.

Sounds good.

Matthias 


 

Oscar Benjamin

Apr 14, 2022, 12:34:28 PM
to sympy
On Tue, 12 Apr 2022 at 19:26, Matthias Köppe <matthia...@gmail.com> wrote:
>
> Hi Aaron,
>
> On Tuesday, April 12, 2022 at 5:17:22 AM UTC-7 asme...@gmail.com wrote:
>>
>> On Sun, Apr 10, 2022 at 7:44 PM Matthias Köppe <matthia...@gmail.com> wrote:
>> > We are in the early stages of planning an online SageDays event (https://wiki.sagemath.org/days112.358). Wondering if some SymPy developers would be interested in presenting? I think there's a lot of potential for synergy between the projects.
>>
>> What sort of presentation would you like to see from SymPy?
>
> I'd be interested in:

I would be interested in coming along and can present on at least some
of these things.

> - General overview of recent-ish features of SymPy
> - status and plans regarding use of FLINT - the Sage interface to it misses many of the more recent developments in the FLINT 2.x series (https://trac.sagemath.org/ticket/31408), so there's a potential for synergy here

It would definitely be good to work together on this if possible. Does
SAGE use its own bindings for flint? I've been working a little on
python_flint e.g.:
https://github.com/fredrik-johansson/python-flint/pull/20

The python_flint bindings still miss a lot of the newer features from
flint, arb etc. A primary goal though is just to make it more easily
installable.

> - status and plans regarding SymEngine

I don't know about the status of SymEngine. I can't say that I can see
any significant work happening on the SymPy side to integrate
SymEngine any further with SymPy. My personal view is that for faster
symbolics a different approach is needed in general but SymEngine
seems to have the same design flaws as SymPy itself in that respect.

> - status and plans regarding solveset for several variables

There are no immediate plans for a solveset for several variables
beyond nonlinsolve. There was recent discussion about this on the
mailing list here though:
https://groups.google.com/g/sympy/c/v_YLkX4QuRY

> - status and plans regarding the assumptions facility

I think much like solveset etc there needs to be more organisation
among sympy developers to define what the plans for things like this
should be going forwards. These are the kinds of things that GSOC
should really be used for rather than adding peripheral features. We
need to make that clearer to GSOC applicants though.

--
Oscar

Tirthankar Mazumder

Apr 14, 2022, 1:16:38 PM
to sympy
Hi, as someone who is relatively new to the SymPy + SymEngine project, and wants to work on SymEngine as a part of their GSoC, I would appreciate it if you could elaborate a bit on what design flaws you are referring to. I have been going through the SymEngine repository, and the main thing that sticks out to me is the lack of useful documentation. For example, what exactly is ATan2's is_canonical() function supposed to do, and why? From a cursory glance at the code, we know that the code deems something in a non-canonical form if the numerator is equal to the denominator, but why?

Moreover, the design principles page says that the repository uses the visitor and double dispatch design pattern, but how exactly, and to do what?

From my perspective, it looks like we can do a lot to make SymEngine more beginner-friendly. I had a much easier time starting with SymPy than I did with SymEngine.

Matthias Köppe

Apr 14, 2022, 1:36:06 PM
to sympy
On Thursday, April 14, 2022 at 9:34:28 AM UTC-7 Oscar wrote:
On Tue, 12 Apr 2022 at 19:26, Matthias Köppe <matthia...@gmail.com> wrote:
>> On Sun, Apr 10, 2022 at 7:44 PM Matthias Köppe <matthia...@gmail.com> wrote:
>> > We are in the early stages of planning an online SageDays event (https://wiki.sagemath.org/days112.358). Wondering if some SymPy developers would be interested in presenting? I think there's a lot of potential for synergy between the projects.

I would be interested in coming along and can present on at least some
of these things.

Let's figure out the details in the coming weeks.
 
> - General overview of recent-ish features of SymPy
> - status and plans regarding use of FLINT - the Sage interface to it misses many of the more recent developments in the FLINT 2.x series (https://trac.sagemath.org/ticket/31408), so there's a potential for synergy here

It would definitely be good to work together on this if possible. Does
SAGE use its own bindings for flint?


I've been working a little on
python_flint e.g.:
https://github.com/fredrik-johansson/python-flint/pull/20

The python_flint bindings still miss a lot of the newer features from
flint, arb etc. A primary goal though is just to make it more easily
installable.

I saw that. Having a cibuildwheel workflow is an important step. 
This is also something that we're planning for modularized distributions of the Sage library (https://trac.sagemath.org/ticket/29705).
 
> - status and plans regarding SymEngine

I don't know about the status of SymEngine. I can't say that I can see
any significant work happening on the SymPy side to integrate
SymEngine any further with SymPy. My personal view is that for faster
symbolics a different approach is needed in general but SymEngine
seems to have the same design flaws as SymPy itself in that respect.

> - status and plans regarding solveset for several variables

There are no immediate plans for a solveset for several variables
beyond nonlinsolve. There was recent discussion about this on the
mailing list here though:
https://groups.google.com/g/sympy/c/v_YLkX4QuRY

Thanks for the pointer. Noted in https://trac.sagemath.org/ticket/24142
 
> - status and plans regarding the assumptions facility

I think much like solveset etc there needs to be more organisation
among sympy developers to define what the plans for things like this
should be going forwards.

I'll schedule a discussion on these topics for the SageDays event -- hopefully interested SymPy developers could join. https://wiki.sagemath.org/days112.358#Activities_and_Speakers 
Details to be fleshed out. 
I also plan to invite some people from other related communities.

Best,
Matthias
 

Aaron Meurer

Apr 14, 2022, 2:04:53 PM
to sy...@googlegroups.com
On Thu, Apr 14, 2022 at 11:36 AM Matthias Köppe
<matthia...@gmail.com> wrote:
How will the scheduling actually work for Sage Days? The site just
says it will take place "during the 50 hours when it is June 2, 2022
in some timezone in the world". Does that mean that events will take
place at any time during the day?

Aaron Meurer

>
> Best,
> Matthias
>
>

Matthias Köppe

Apr 14, 2022, 2:12:34 PM
to sympy
On Thursday, April 14, 2022 at 11:04:53 AM UTC-7 asme...@gmail.com wrote:
On Thu, Apr 14, 2022 at 11:36 AM Matthias Köppe
<matthia...@gmail.com> wrote:
> On Thursday, April 14, 2022 at 9:34:28 AM UTC-7 Oscar wrote:
>>
>> On Tue, 12 Apr 2022 at 19:26, Matthias Köppe <matthia...@gmail.com> wrote:
>> >> On Sun, Apr 10, 2022 at 7:44 PM Matthias Köppe <matthia...@gmail.com> wrote:
>> >> > We are in the early stages of planning an online SageDays event (https://wiki.sagemath.org/days112.358). Wondering if some SymPy developers would be interested in presenting? I think there's a lot of potential for synergy between the projects.

How will the scheduling actually work for Sage Days? The site just
says it will take place "during the 50 hours when it is June 2, 2022
in some timezone in the world". Does that mean that events will take
place at any time during the day?

Yes, the event is designed like this to emphasize and promote global collaboration. 
Events can be scheduled at any time of the day, whenever is convenient for the speakers.
For example, if there were several talks / discussions involving SymPy, they could be spread out over the day and there would be no concern about duplicating some introductory material because it would reach different parts of the global audience.

Matthias 

Oscar Benjamin

Apr 14, 2022, 2:26:15 PM
to sympy
On Thu, 14 Apr 2022 at 18:16, Tirthankar Mazumder
<greenw...@gmail.com> wrote:
>
> On Thursday, April 14, 2022 at 10:04:28 PM UTC+5:30 Oscar wrote:
>>
>> On Tue, 12 Apr 2022 at 19:26, Matthias Köppe <matthia...@gmail.com> wrote:
>>
>> > - status and plans regarding SymEngine
>>
>> I don't know about the status of SymEngine. I can't say that I can see
>> any significant work happening on the SymPy side to integrate
>> SymEngine any further with SymPy. My personal view is that for faster
>> symbolics a different approach is needed in general but SymEngine
>> seems to have the same design flaws as SymPy itself in that respect.
>
> Hi, as someone who is relatively new to the SymPy + SymEngine project, and wants to work on SymEngine as a part of their GSoC, I would appreciate it if you could elaborate a bit on what design flaws you are referring to.

The basic design of the way that symbolic expressions are represented
in SymPy and SymEngine is through the use of classes to represent
different types of expression e.g. there is a class Pow and an
expression like x**y is represented by an instance of that class
created as Pow(x, y).
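
As a concrete illustration (a small SymPy snippet added here, not part of the original email):

```python
from sympy import symbols, Pow

x, y = symbols('x y')
expr = x**y

# The ** operator builds an instance of the Pow class; the operands
# are stored in the node's .args tuple, forming an expression tree.
assert isinstance(expr, Pow)
assert expr.args == (x, y)
```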

Using classes like this makes it hard to extend the system with new
types of symbolic expression e.g. with SymEngine you can't make a new
kind of symbolic expression without writing C++ code so if you are
using SymEngine from Python then it isn't extensible. With SymPy you
could at least make your own symbolic expression class in Python but
then it still doesn't work so well because there are so many places in
the codebase where particular types of expression are special-cased
meaning that any new symbolic expression type cannot be handled by
functions like solve, integrate etc.

The other problem with using classes is that it means that the basic
data structure that is used to represent expressions is fixed from the
outside. In SymPy and SymEngine that data structure is a tree and all
algorithms recurse through that tree. More efficient data structures
for some given operation can't be used because the implementation of
each symbolic expression type requires that you always use instances
of the class meaning that you always have to have a tree and always
have to recurse in the same way.

As an example it would be trivial in SymPy to make substitutions
involving large expressions and many replacements much faster with a
small redesign. The problem though is backwards compatibility: each
expression class can implement its own _eval_subs method so for
backwards compatibility the subs implementation must recurse in the
same slow way once throughout the whole tree for each replacement that
is to be made. (There are ways to work around this but the design
makes it harder than it should be.)
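
A toy sketch of the cost difference (hypothetical, using plain tuples rather than SymPy classes): applying k single replacements walks the whole tree k times, while a simultaneous substitution walks it once.

```python
# Hypothetical toy model: expressions are tuples ('op', arg1, arg2, ...)
# and leaves are strings. One traversal handles every replacement at once.
def subs_simultaneous(node, mapping):
    if node in mapping:
        return mapping[node]
    if isinstance(node, tuple):
        return (node[0],) + tuple(subs_simultaneous(a, mapping) for a in node[1:])
    return node

expr = ('Add', 'x', ('Mul', 'y', 'x'))
assert subs_simultaneous(expr, {'x': 'z', 'y': 'w'}) == ('Add', 'z', ('Mul', 'w', 'z'))
```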

This is a small example of a more general problem. Using classes means
you have to use the interfaces of the class which means that efficient
algorithms needing something not provided by that public interface are
impossible. Even worse both SymPy and SymEngine allow the
*constructors* of those classes to do nontrivial work without
providing any good way to override that. This means that you can't
even represent a symbolic expression without allowing arbitrary code
execution: it's impossible to optimise higher-level algorithms if you
have so little control over execution from the outside.

It's very hard later to change the public interfaces of the expression
classes because as soon as any downstream code has subclassed your
classes you are bound by backwards compatibility. (Subclassing across
the boundaries of different software projects leads to strong
coupling.)

There are ways that this can be improved in both SymPy and SymEngine
but I also think that for many important operations the basic design
used by these is limiting in a big-O sense: the design constrains what
algorithms can be used. SymEngine is faster than SymPy in a brute
force sense by using C++ rather than Python but it would be possible
to make something both faster and more flexible if a different design
was used at a basic level.

--
Oscar

Aaron Meurer

Apr 14, 2022, 3:06:21 PM
to sy...@googlegroups.com
On Thu, Apr 14, 2022 at 12:12 PM Matthias Köppe
<matthia...@gmail.com> wrote:
>
> On Thursday, April 14, 2022 at 11:04:53 AM UTC-7 asme...@gmail.com wrote:
>>
>> On Thu, Apr 14, 2022 at 11:36 AM Matthias Köppe
>> <matthia...@gmail.com> wrote:
>> > On Thursday, April 14, 2022 at 9:34:28 AM UTC-7 Oscar wrote:
>> >>
>> >> On Tue, 12 Apr 2022 at 19:26, Matthias Köppe <matthia...@gmail.com> wrote:
>> >> >> On Sun, Apr 10, 2022 at 7:44 PM Matthias Köppe <matthia...@gmail.com> wrote:
>> >> >> > We are in the early stages of planning an online SageDays event (https://wiki.sagemath.org/days112.358). Wondering if some SymPy developers would be interested in presenting? I think there's a lot of potential for synergy between the projects.
>>
>> How will the scheduling actually work for Sage Days? The site just
>> says it will take place "during the 50 hours when it is June 2, 2022
>> in some timezone in the world". Does that mean that events will take
>> place at any time during the day?
>
>
> Yes, the event is designed like this to emphasize and promote global collaboration.
> Events can be scheduled at any time of the day, whenever is convenient for the speakers.
> For example, if there were several talks / discussions involving SymPy, they could be spread out over the day and there would be no concern about duplicating some introductory material because it would reach different parts of the global audience.

OK. Well you can put me down as provisionally interested in giving a
talk about SymPy as well.

Aaron Meurer

>
> Matthias
>

Matthias Köppe

Apr 14, 2022, 3:12:55 PM
to sympy
On Thursday, April 14, 2022 at 12:06:21 PM UTC-7 asme...@gmail.com wrote:
>> >> >> On Sun, Apr 10, 2022 at 7:44 PM Matthias Köppe <matthia...@gmail.com> wrote:
>> >> >> > We are in the early stages of planning an online SageDays event (https://wiki.sagemath.org/days112.358). Wondering if some SymPy developers would be interested in presenting? I think there's a lot of potential for synergy between the projects.

OK. Well you can put me down as provisionally interested in giving a
talk about SymPy as well.

Great!

Matthias

Aaron Meurer

Apr 14, 2022, 3:30:43 PM
to sy...@googlegroups.com
On Thu, Apr 14, 2022 at 12:26 PM Oscar Benjamin
<oscar.j....@gmail.com> wrote:
>
> On Thu, 14 Apr 2022 at 18:16, Tirthankar Mazumder
> <greenw...@gmail.com> wrote:
> >
> > On Thursday, April 14, 2022 at 10:04:28 PM UTC+5:30 Oscar wrote:
> >>
> >> On Tue, 12 Apr 2022 at 19:26, Matthias Köppe <matthia...@gmail.com> wrote:
> >>
> >> > - status and plans regarding SymEngine
> >>
> >> I don't know about the status of SymEngine. I can't say that I can see
> >> any significant work happening on the SymPy side to integrate
> >> SymEngine any further with SymPy. My personal view is that for faster
> >> symbolics a different approach is needed in general but SymEngine
> >> seems to have the same design flaws as SymPy itself in that respect.
> >
> > Hi, as someone who is relatively new to the SymPy + SymEngine project, and wants to work on SymEngine as a part of their GSoC, I would appreciate it if you could elaborate a bit on what design flaws you are referring to.

I don't want to discourage you, but if you are interested in working
on SymEngine, you might want to send a message to the SymEngine list
to see if there are any available SymEngine mentors this year. I don't
follow the SymEngine development as closely as I do SymPy's, so I
can't say for sure whether there are or not, but I do know that most
SymPy GSoC mentors would be unable to directly mentor a SymEngine
project.

>
> The basic design of the way that symbolic expressions are represented
> in SymPy and SymEngine is through the use of classes to represent
> different types of expression e.g. there is a class Pow and an
> expression like x**y is represented by an instance of that class
> created as Pow(x, y).

A problem I see with classes is that you don't actually want to
represent something like 2*x**3 + 1 as Add(Mul(2, Pow(x, 3)), 1) like
SymPy does. It's much more efficient to use a more compact
"polynomial" representation, for example, something like (x, {3: 2, 0:
1}). My understanding was that SymEngine does do some of this, though.
For example, in SymEngine it's impossible to represent 2**x*2**y and
2**(x + y) as separate things, because the internal data structure
(something like a dictionary mapping base to exponent) can only
represent the latter; in SymPy it is possible to keep them distinct.
That data structure lets SymEngine be more performant, though, so
there are tradeoffs.
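
The compact representation idea can be sketched in plain Python (illustrative only; this is not SymPy's or SymEngine's actual internal layout):

```python
# 2*x**3 + 1 as (generator, {exponent: coefficient}) instead of a
# nested Add/Mul/Pow tree.
poly = ('x', {3: 2, 0: 1})

def poly_eval(poly, value):
    gen, coeffs = poly
    return sum(c * value**e for e, c in coeffs.items())

assert poly_eval(poly, 2) == 17  # 2*2**3 + 1 == 17
```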

> As an example it would be trivial in SymPy to make substitutions
> involving large expressions and many replacements much faster with a
> small redesign. The problem though is backwards compatibility: each
> expression class can implement its own _eval_subs method so for
> backwards compatibility the subs implementation must recurse in the
> same slow way once throughout the whole tree for each replacement that
> is to be made.

Just to be clear, isn't this already fixed with xreplace, which does
only replacement with no smart substitution and no dispatching to
_eval methods? That isn't to say xreplace itself couldn't be more
performant, which I'm sure it could.

>
> This is a small example of a more general problem. Using classes means
> you have to use the interfaces of the class which means that efficient
> algorithms needing something not provided by that public interface are
> impossible. Even worse both SymPy and SymEngine allow the
> *constructors* of those classes to do nontrivial work without
> providing any good way to override that. This means that you can't
> even represent a symbolic expression without allowing arbitrary code
> execution: it's impossible to optimise higher-level algorithms if you
> have so little control over execution from the outside.

What's your suggested alternative? It seems like you on the one hand
are saying that it's not extensible enough (I guess because any
extension has to be in C++) and at the same time that it's too
extensible, because custom classes can do anything.

I think there is a balance in general between extensibility and
performance. The only way to make something performant is to limit the
domain of what can be expressed so that more efficient data structures
can be used. Often the best of both worlds is to have a restricted
computation domain where only very specific things can be represented,
but in a very efficient way, and have fast ways to convert back and
forth between them and the more general expressions. For instance, a
polynomial data structure can be very fast for polynomial operations,
but it also can't directly represent certain types of important
expressions like symbolic exponents or rational functions.

>
> It's very hard later to change the public interfaces of the expression
> classes because as soon as any downstream code has subclassed your
> classes you are bound by backwards compatibility. (Subclassing across
> the boundaries of different software projects leads to strong
> coupling.)

When it comes to SymEngine, a big constraint is that to be usable as a
swappable core for SymPy, it has to have the same semantics as SymPy.
Even something like the 2**(x + y) thing I mentioned means that it
would break certain algorithms in SymPy that rely on 2**x*2**y being
representable. That's one reason why so far SymEngine has only been
successfully used in very specific SymPy submodules.

Aaron Meurer

>
> There are ways that this can be improved in both SymPy and SymEngine
> but I also think that for many important operations the basic design
> used by these is limiting in a big-O sense: the design constrains what
> algorithms can be used. SymEngine is faster than SymPy in a brute
> force sense by using C++ rather than Python but it would be possible
> to make something both faster and more flexible if a different design
> was used at a basic level.
>
> --
> Oscar
>

Oscar Benjamin

Apr 14, 2022, 5:12:04 PM
to sympy
On Thu, 14 Apr 2022 at 20:30, Aaron Meurer <asme...@gmail.com> wrote:
>
> On Thu, Apr 14, 2022 at 12:26 PM Oscar Benjamin
> <oscar.j....@gmail.com> wrote:
> >
> > As an example it would be trivial in SymPy to make substitutions
> > involving large expressions and many replacements much faster with a
> > small redesign. The problem though is backwards compatibility: each
> > expression class can implement its own _eval_subs method so for
> > backwards compatibility the subs implementation must recurse in the
> > same slow way once throughout the whole tree for each replacement that
> > is to be made. (There are ways to work around this but the design
> > makes it harder than it should be.)
>
> Just to be clear, isn't this already fixed with xreplace, which does
> only replacement with no smart substitution and no dispatching to
> _eval methods? That isn't to say xreplace itself couldn't be more
> performant, which I'm sure it could.

Even xreplace runs the evaluation code which is baked into __new__.
Also it recurses down through all of the args in the tree which can in
many cases be done much more efficiently if the expression is
represented as a DAG rather than a tree.
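
The evaluation-on-construction behaviour is easy to demonstrate (a small SymPy snippet added for illustration):

```python
from sympy import symbols

x, y = symbols('x y')

# xreplace does a purely structural swap, but rebuilding the Add node
# goes through Add.__new__, which evaluates x + x down to 2*x.
result = (x + y).xreplace({y: x})
assert result == 2*x
```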

> > This is a small example of a more general problem. Using classes means
> > you have to use the interfaces of the class which means that efficient
> > algorithms needing something not provided by that public interface are
> > impossible. Even worse both SymPy and SymEngine allow the
> > *constructors* of those classes to do nontrivial work without
> > providing any good way to override that. This means that you can't
> > even represent a symbolic expression without allowing arbitrary code
> > execution: it's impossible to optimise higher-level algorithms if you
> > have so little control over execution from the outside.
>
> What's your suggested alternative? It seems like you on the one hand
> are saying that it's not extensible enough (I guess because any
> extension has to be in C++) and at the same time that it's too
> extensible, because custom classes can do anything.
>
> I think there is a balance in general between extensibility and
> performance. The only way to make something performant is to limit the
> domain of what can be expressed so that more efficient data structures
> can be used.

I think it's possible to get more extensibility and more performance
by taking a different approach. It's not that SymPy can't be extended:
it's just not easy to extend. The Python/SymPy knowledge required to
be able to create new symbolic classes is too great for "ordinary
users" to make use of it.

I think that what is needed is a core that is well-defined in its
scope but designed fundamentally around extensibility. The core
operations that manipulate expressions should be implemented in a
low-level language but it should be possible to define and control
their behaviour precisely from a higher level.

--
Oscar

Aaron Meurer

Apr 14, 2022, 6:22:32 PM
to sy...@googlegroups.com
On Thu, Apr 14, 2022 at 3:12 PM Oscar Benjamin
<oscar.j....@gmail.com> wrote:
>
> On Thu, 14 Apr 2022 at 20:30, Aaron Meurer <asme...@gmail.com> wrote:
> >
> > On Thu, Apr 14, 2022 at 12:26 PM Oscar Benjamin
> > <oscar.j....@gmail.com> wrote:
> > >
> > > As an example it would be trivial in SymPy to make substitutions
> > > involving large expressions and many replacements much faster with a
> > > small redesign. The problem though is backwards compatibility: each
> > > expression class can implement its own _eval_subs method so for
> > > backwards compatibility the subs implementation must recurse in the
> > > same slow way once throughout the whole tree for each replacement that
> > > is to be made. (There are ways to work around this but the design
> > > makes it harder than it should be.)
> >
> > Just to be clear, isn't this already fixed with xreplace, which does
> > only replacement with no smart substitution and no dispatching to
> > _eval methods? That isn't to say xreplace itself couldn't be more
> > performant, which I'm sure it could.
>
> Even xreplace runs the evaluation code which is baked into __new__.

Right, so the "redesign" isn't so much about the substitution but
rather in removing automatic evaluation. You could write a
substitution algorithm right now that completely ignores the
constructors, and it would be very fast. But then it would be easy to
do (x + y).subs(y, x) and get x + x instead of 2*x, which in the current
SymPy codebase can cause problems because everything expects evaluated
expressions.
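
For reference, SymPy can already build such unevaluated nodes explicitly (illustrative snippet):

```python
from sympy import Add, symbols

x = symbols('x')

# Passing evaluate=False skips the automatic flattening/collecting,
# so the node keeps both copies of x as separate args.
unevaluated = Add(x, x, evaluate=False)
assert unevaluated.args == (x, x)
assert Add(x, x) == 2*x  # default construction evaluates
```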

> Also it recurses down through all of the args in the tree which can in
> many cases be done much more efficiently if the expression is
> represented as a DAG rather than a tree.

Not sure that's so hard to do with the current design. I think it
would be possible to add some optimizations to effectively
de-duplicate identical subexpressions. It's already basically possible
to do this; you just have to assume that == is not a slow operation
(which it mostly isn't, although it could be faster).
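
One existing SymPy tool along these lines is cse, which detects repeated subexpressions and factors them out once, a DAG-style view of the tree (snippet added for illustration):

```python
from sympy import cse, sqrt, symbols

x, y = symbols('x y')

# cse finds the shared subexpression x + y and names it once.
replacements, reduced = cse((x + y)**2 + sqrt(x + y))
assert len(replacements) == 1  # one shared subexpression found
```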

Aaron Meurer

Oscar Benjamin

Apr 14, 2022, 7:01:46 PM
to sympy
On Thu, 14 Apr 2022 at 23:22, Aaron Meurer <asme...@gmail.com> wrote:
>
> On Thu, Apr 14, 2022 at 3:12 PM Oscar Benjamin
> <oscar.j....@gmail.com> wrote:
> >
> > On Thu, 14 Apr 2022 at 20:30, Aaron Meurer <asme...@gmail.com> wrote:
> > >
> > > On Thu, Apr 14, 2022 at 12:26 PM Oscar Benjamin
> > > <oscar.j....@gmail.com> wrote:
> > > >
> > > > As an example it would be trivial in SymPy to make substitutions
> > > > involving large expressions and many replacements much faster with a
> > > > small redesign. The problem though is backwards compatibility: each
> > > > expression class can implement its own _eval_subs method so for
> > > > backwards compatibility the subs implementation must recurse in the
> > > > same slow way once throughout the whole tree for each replacement that
> > > > is to be made. (There are ways to work around this but the design
> > > > makes it harder than it should be.)
> > >
> > > Just to be clear, isn't this already fixed with xreplace, which does
> > > only replacement with no smart substitution and no dispatching to
> > > _eval methods? That isn't to say xreplace itself couldn't be more
> > > performant, which I'm sure it could.
> >
> > Even xreplace runs the evaluation code which is baked into __new__.
>
> Right, so the "redesign" isn't so much about the substitution but
> rather in removing automatic evaluation.

Removing automatic evaluation can mean different things. Automatic
evaluation is nice from a user perspective for most tasks (but not
all). Automatic evaluation is a nightmare from an algorithmic
perspective if you don't have any good way to control it. Right now it
isn't possible to distinguish between an internal operation for which
evaluation should be postponed and a user-facing operation for which
it might be desired: everything goes through the same __new__ so even
xreplace can't get around it. The class-based design makes it possible
to design classes that are completely broken if you don't let them
evaluate and also means that the choice to evaluate lies with each
class.
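
SymPy does offer a global evaluation switch (in recent versions it lives in sympy.core.parameters; snippet added for illustration), though it is a blunt instrument compared with the per-operation control being described:

```python
from sympy import Symbol
from sympy.core.parameters import evaluate

x = Symbol('x')

# With the global evaluation flag off, even ordinary operators
# build unevaluated nodes.
with evaluate(False):
    expr = x + x
assert expr.args == (x, x)
```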

In any case what I'm talking about is still not just about evaluation.
It's about representing expressions with data structures rather than
polymorphic classes. It should be possible to define something like a
tan function and then use different kinds of data structures to
represent expressions involving that function. The design should not
be tied to any particular data structure and that means that tan can
not be a class that chooses its own internal representation for any
expression involving the tan function. Of course the lower-level data
structures wouldn't have automatic evaluation though (you wouldn't
expect a dict or a list to "evaluate").
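The idea of separating representation from behaviour can be sketched in plain Python (a toy illustration, not SymPy's actual design): expressions are just nested tuples, and a function like tan is a plain head tag rather than a class, so any algorithm can walk or rebuild the data without triggering class-specific evaluation code.

```python
# Toy sketch: expressions as plain (head, arg, ...) tuples.
# "tan" is just a tag, not a class, so no evaluation code can run
# when an expression is built or rebuilt.

def tan(arg):
    return ('tan', arg)

def add(*args):
    return ('add', *args)

def subs(expr, old, new):
    """Structural replacement: no class dispatch, no evaluation."""
    if expr == old:
        return new
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(subs(a, old, new) for a in expr[1:])
    return expr

x, y = 'x', 'y'
e = add(tan(x), x)
print(subs(e, x, y))  # ('add', ('tan', 'y'), 'y')
```

The point is that subs here is a generic function over the data structure; a different data structure (e.g. a DAG with shared nodes) could be swapped in without touching the definition of tan.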

> > Also it recurses down through all of the args in the tree which can in
> > many cases be done much more efficiently if the expression is
> > represented as a DAG rather than a tree.
>
> Not sure that's so hard to do with the current design. I think it
> would be possible to add some optimizations to effectively
> de-duplicate identical subexpressions.

I want a design where these things mostly just happen automatically.
The current design seems to maximise the amount of procedural code
that needs to be written which simultaneously maximises bugs and also
the amount of work that is needed to optimise anything.

> It's already basically possible
> to do this, you just have to assume that == is not a slow operation
> (which it mostly isn't, although it could be faster).

It's slower than it should be. A different design can easily make it
faster but not if you have classes that are allowed to define their
own __eq__ methods.

--
Oscar

Isuru Fernando

unread,
Apr 15, 2022, 9:53:15 PM4/15/22
to sy...@googlegroups.com
Hi Oscar,

Here are a few things that are different in SymEngine than in SymPy.

> As an example it would be trivial in SymPy to make substitutions
involving large expressions and many replacements much faster with a
small redesign. The problem though is backwards compatibility: each
expression class can implement its own _eval_subs method so for
backwards compatibility the subs implementation must recurse in the
same slow way once throughout the whole tree for each replacement that
is to be made. (There are ways to work around this but the design
makes it harder than it should be.)

In SymEngine, classes do not have such a method and therefore cannot
extend e.g. the subs function.
However, the subs function itself (technically the SubsVisitor class)
can be extended.

> This is a small example of a more general problem. Using classes means
you have to use the interfaces of the class which means that efficient
algorithms needing something not provided by that public interface are
impossible. Even worse both SymPy and SymEngine allow the
*constructors* of those classes to do nontrivial work without
providing any good way to override that. This means that you can't
even represent a symbolic expression without allowing arbitrary code
execution: it's impossible to optimize higher-level algorithms if you

have so little control over execution from the outside.

No, SymEngine does not do anything in the constructor of the C++ class
itself (e.g. the class "Add"). We have separate functions (e.g. the
function "add") that do the complicated work of creating the C++ class.
A user is allowed to create the C++ class directly, and we assume that
the data structure the user passed is internally consistent in Release
mode, while checking user input in Debug mode. This allows the user to
do low-level optimizations when they know that going through the
complicated function is unnecessary.
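The split Isuru describes, between a "dumb" class whose constructor only stores data and a separate smart builder function that does the canonicalisation, can be mimicked in Python. This is a hypothetical sketch of the pattern, not SymEngine's real API.

```python
# Sketch of the SymEngine-style split: the Add constructor only stores
# data and trusts the caller (like SymEngine's Release mode), while the
# separate add() builder does the canonicalisation work (here: merging
# repeated terms).  Names are illustrative, not SymEngine's real API.

class Add:
    __slots__ = ('terms',)  # terms: dict mapping term -> coefficient

    def __init__(self, terms):
        # No work here; a Debug build could validate the input instead.
        self.terms = terms

    def __repr__(self):
        return ' + '.join(f'{c}*{t}' if c != 1 else t
                          for t, c in self.terms.items())

def add(*args):
    """Smart builder: collects duplicate terms, so add('x', 'x')
    canonicalises to 2*x, while Add({...}) stores whatever it is given."""
    terms = {}
    for a in args:
        terms[a] = terms.get(a, 0) + 1
    return Add(terms)

print(add('x', 'x'))   # 2*x
print(Add({'x': 1}))   # x
```

A fast algorithm that already knows its output is canonical can call `Add` directly and skip the builder entirely.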


> There are ways that this can be improved in both SymPy and SymEngine
but I also think that for many important operations the basic design
used by these is limiting in a big-O sense: the design constrains what
algorithms can be used. SymEngine is faster than SymPy in a brute
force sense by using C++ rather than Python but it would be possible
to make something both faster and more flexible if a different design
was used at a basic level.

I disagree about this for SymEngine. For example, SymPy's extensibility
of classes prevents optimizations that SymEngine can do, because
SymEngine doesn't have an equivalent of `_eval_subs` or similar.
Some of the optimizations in SymEngine, as Aaron said, lead to better
big-O complexities, i.e. the ratio between the two modules is not always
a constant as input size increases.

Isuru




Oscar Benjamin

unread,
Apr 16, 2022, 7:18:43 AM4/16/22
to sympy
On Sat, 16 Apr 2022 at 02:53, Isuru Fernando <isu...@gmail.com> wrote:
>
> Hi Oscar,
>
> Here's a few things that are different in SymEngine than SymPy.
>
> > As an example it would be trivial in SymPy to make substitutions
> involving large expressions and many replacements much faster with a
> small redesign. The problem though is backwards compatibility: each
> expression class can implement its own _eval_subs method so for
> backwards compatibility the subs implementation must recurse in the
> same slow way once throughout the whole tree for each replacement that
> is to be made. (There are ways to work around this but the design
> makes it harder than it should be.)
>
> In SymEngine, classes do not have such a method and therefore cannot
> extend for eg: subs function.
> However the subs function itself (SubsVisitor class technically) can be
> extended.

Yes, this does work differently in SymEngine compared to SymPy. If
SymPy could be changed to work the same way then it could also be a
lot faster (without needing to be rewritten in C++). SymEngine has the
same overall design as SymPy but makes different choices for some
aspects of that design so that it is faster but at the same time
incompatible.

Can the SubsVisitor class be extended from Python?

> > This is a small example of a more general problem. Using classes means
> you have to use the interfaces of the class which means that efficient
> algorithms needing something not provided by that public interface are
> impossible. Even worse both SymPy and SymEngine allow the
> *constructors* of those classes to do nontrivial work without
> providing any good way to override that. This means that you can't
> even represent a symbolic expression without allowing arbitrary code
> execution: it's impossible to optimize higher-level algorithms if you
> have so little control over execution from the outside.
>
> No, SymEngine does not do anything in the constructor of the C++ class
> itself. (For eg: class "Add"). We have different functions (For eg: function "add")
> that do complicated functionality to create the C++ class. A user is allowed
> to create the C++ class directly and we assume that the data structure that the
> user passed is internally consistent in Release mode and we check user
> input in Debug mode. This allows the user to do low level optimizations
> when they know that going through the complicated function is unnecessary.

Does this only apply when working in C++? I just tried:

In [120]: from symengine import add, Add, symbols

In [121]: x = symbols('x')

In [122]: Add(x, x, evaluate=False)
Out[122]: 2*x

In [123]: add(x, x, evaluate=False)
Out[123]: 2*x

> > There are ways that this can be improved in both SymPy and SymEngine
> but I also think that for many important operations the basic design
> used by these is limiting in a big-O sense: the design constrains what
> algorithms can be used. SymEngine is faster than SymPy in a brute
> force sense by using C++ rather than Python but it would be possible
> to make something both faster and more flexible if a different design
> was used at a basic level.
>
> I disagree about this in SymEngine. For eg: SymPy's extensibility
> of classes prevents optimizations that SymEngine can do because symengine
> doesn't have an equivalence of `_eval_subs` or similar.
> Some of the optimizations in SymEngine like Aaron said leads to better
> big-O complexities. i.e. the ratio between the two modules is not always a
> constant as input size increases.

That's true and it is also true that in many cases SymPy's big-O can
be improved without radically changing the design. I think though that
it's possible to get a better trade-off of performance and flexibility
using very different approaches. I will make a demonstration of this
at some point.

Going forwards though for better integration of SymPy and SymEngine
what can be done? SymEngine is faster than SymPy for many important
operations because it does things differently. Most of those
differences are incompatible though so it can't be used as a dropin
replacement. Changing SymEngine to make it more compatible would be a
huge amount of work and would also make it a lot slower.

There are different things that integration could mean. One
possibility is just that SymPy could quietly make use of SymEngine
internally in e.g. trigsimp or something without the user really
noticing the difference (except maybe for speed). Another possibility
is that it could be made easier for a user to use SymEngine
expressions in combination with public API in SymPy so that some
functions would accept and return SymEngine expressions where
possible.

--
Oscar

David Bailey

unread,
Apr 17, 2022, 6:47:49 AM4/17/22
to sy...@googlegroups.com
On 16/04/2022 12:18, Oscar Benjamin wrote:
> Yes, this does work differently in SymEngine compared to SymPy. If
> SymPy could be changed to work the same way then it could also be a
> lot faster (without needing to be rewritten in C++). SymEngine has the
> same overall design as SymPy but makes different choices for some
> aspects of that design so that it is faster but at the same time
> incompatible.

I must say, SymPy symbolics seem fast enough to me, but I haven't tried
to generate very large symbolic expressions.

I am curious as to how much of the difference in speed between SymPy and
SymEngine you think is attributable to the suboptimal design of the
Python code, and how much you think is attributable to the choice of
computer language. Clearly C++ is a much lower-level language than
Python, and would presumably be intrinsically much faster, but can
result in many hard-to-fix memory corruption bugs.

David

>

Oscar Benjamin

unread,
Apr 17, 2022, 8:01:01 AM4/17/22
to sympy
It's hard to disentangle these things. Both SymPy and SymEngine will
do a lot of symbolic processing behind the scenes to produce the
output that users expect to see. In general SymPy will do a lot more
processing than SymEngine which means that expressions will not
evaluate in the same way but also means that a comparison of the two
for speed is not easy to interpret as being about e.g. C++ vs Python
or any other particular optimisation.

For example one of the things that is often slow in SymPy when you
have large expressions is assumptions queries. Every time you create
an expression like an Add or a Mul or exp etc there is a lot of
processing that goes on to determine if the expression can simplify
and this often involves checking "assumptions" using the core
assumptions system e.g.:

>>> n = symbols('n', integer=True)
>>> sin(n*pi)
0

This kind of thing often dominates the runtime in SymPy when working
with large expressions. In SymEngine there are no assumptions and so
no assumptions checking is done at all:

>>> from symengine import symbols, sin, pi
>>> n = symbols('n', integer=True)
>>> sin(n*pi)
sin(n*pi)
>>> n.is_integer
False

Having a simpler evaluation scheme would make SymPy faster while still
working in Python. There are many more examples of this where SymPy
just hasn't been clearly designed with performance always in mind.
Contributors are often unaware of the performance implications of the
changes that they make and it's very easy for a seemingly innocent fix
in one place to result in significant slowdowns elsewhere.

The main reason it's so hard to understand the performance
implications of anything is that most of the computation is implicit
(because of the class based design). Something as innocuous as a+b
results in a bunch of computation that you might not expect at a
higher level. In turn that means that e.g. sum(expressions) has
quadratic complexity and is therefore slow for a large list of inputs:

In [1]: from sympy import symbols

In [2]: %time ok = sum(symbols('x:1000'))
CPU times: user 6.72 s, sys: 8 ms, total: 6.72 s
Wall time: 6.73 s

You can see the same quadratic cost with SymEngine but at least for
this example it is something like 50x faster. That difference will be
due in part to the different algorithms/representations but also the
difference between Python and C++. In both cases the cost is O(n^2)
though when in principle it can be O(n). Note that in both cases
Add(*expressions) is O(n) and should be preferred to using sum. Again
though in principle Add(*expressions) could be made effectively O(1)
in a different design.
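For concreteness, the difference between the two construction patterns looks like this (assuming SymPy is installed; the sizes here are small enough to run instantly):

```python
from sympy import Add, symbols

xs = symbols('x:100')

# sum() folds pairwise, rebuilding (and re-canonicalising) an ever
# larger Add at each step: x0, x0 + x1, x0 + x1 + x2, ... -- O(n^2).
slow = sum(xs)

# Add(*args) constructs the n-term sum in a single pass: O(n).
fast = Add(*xs)

assert slow == fast
```

Both produce the same expression; only the amount of intermediate work differs.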

Oscar

David Bailey

unread,
Apr 17, 2022, 10:02:15 AM4/17/22
to sy...@googlegroups.com
Thanks for that interesting response!

Maybe I'm missing something, but if assumptions are so costly, couldn't
every expression contain a flag contains_assumptions to say if it
contains assumptions. Then when a larger expression was created it would
be simple to compute the contains_assumptions for the larger expression.

E.g. in your example, the expression for n would have
contains_assumptions set to 1, and this would propagate through any
expression containing n.

I think this scheme would work because expressions are immutable.

Couldn't such a flag be used to speed things up by locally turning off
the assumption checking?
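A minimal sketch of this flag-propagation idea (hypothetical, not SymPy code): because expressions are immutable, a boolean computed once at construction time from the children can never go stale.

```python
# Hypothetical sketch of the contains_assumptions idea: each node
# caches, at construction time, whether any symbol below it carries
# assumptions.  Immutability means the cached flag stays valid forever.

class Sym:
    def __init__(self, name, **assumptions):
        self.name = name
        self.contains_assumptions = bool(assumptions)

class Op:
    def __init__(self, head, *args):
        self.head = head
        self.args = args
        # O(len(args)) at construction, O(1) to query thereafter.
        self.contains_assumptions = any(a.contains_assumptions
                                        for a in args)

n = Sym('n', integer=True)
x = Sym('x')
expr = Op('mul', n, Op('sin', x))
print(expr.contains_assumptions)            # True
print(Op('sin', x).contains_assumptions)    # False
```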


David

Oscar Benjamin

unread,
Apr 17, 2022, 10:45:12 AM4/17/22
to sympy
> Maybe I'm missing something, but if assumptions are so costly, couldn't
> every expression contain a flag contains_assumptions to say if it
> contains assumptions. Then when a larger expression was created it would
> be simple to compute the contains_assumptions for the larger expression.
>
> E.g. in your example, the expression for n would have
> contains_assumptions set to 1, and this would propagate through any
> expression containing n.
>
> I think this scheme would work because expressions are immutable.
>
> Couldn't such a flag be used to speed things up by locally turning off
> the assumption checking?

The core assumptions system isn't just about assumptions that are
defined on symbols: every expression has "assumptions" e.g.:

In [1]: (1 + sqrt(2)).is_positive
Out[1]: True

You can read more about it here:
https://docs.sympy.org/dev/guides/assumptions.html

Typically what can be slow when manipulating large expressions is
actually expressions that don't involve symbols at all. For example to
answer the query (1 + sqrt(2)).is_positive the expression is
numerically evaluated with evalf which can be very expensive for large
expressions. This can also make operations with some expressions very
slow e.g. RootOf, Integral, etc. There is a tension between some
users/contributors wanting all assumptions queries to give a definite
answer and the fact that the "assumptions system" is invoked
(repeatedly) pretty much every time any new expression object is
created.

--
Oscar

S.Y. Lee

unread,
Apr 18, 2022, 10:29:32 AM4/18/22
to sympy
What should the future of fast expression evaluation look like?
Would it be making the assumption system faster?
Or would it be choosing carefully between 'easy assumptions' and 'hard assumptions' based on time-complexity analysis?
Or would it be not using the assumption system at all, replacing it with faster (but more limited) decision procedures like pattern matching or type theory?

Aaron Meurer

unread,
Apr 18, 2022, 3:44:02 PM4/18/22
to sy...@googlegroups.com
On Thu, Apr 14, 2022 at 5:01 PM Oscar Benjamin
I think you're just restating my restricted computation domain point
in a different way. To take the polys, they are just a special data
structure for representing a polynomial, with wrapper classes for
convenience at the highest level, but it's also purely functional at
the lowest level.

This level based design where the lowest level where the fast
algorithms are implemented operates directly on the data structure and
the top level has a wrapper class around it works well. The existence
of the lower level removes the temptation to do too much magic in the
class (the polys do have some stuff for convenience, like automatic
domain promotion).

>
> > > Also it recurses down through all of the args in the tree which can in
> > > many cases be done much more efficiently if the expression is
> > > represented as a DAG rather than a tree.
> >
> > Not sure that's so hard to do with the current design. I think it
> > would be possible to add some optimizations to effectively
> > de-duplicate identical subexpressions.
>
> I want a design where these things mostly just happen automatically.
> The current design seems to maximise the amount of procedural code
> that needs to be written which simultaneously maximises bugs and also
> the amount of work that is needed to optimise anything.
>
> > It's already basically possible
> > to do this, you just have to assume that == is not a slow operation
> > (which it mostly isn't, although it could be faster).
>
> It's slower than it should be. A different design can easily make it
> faster but not if you have classes that are allowed to define their
> own __eq__ methods.

I agree that almost nothing should be defining its own __eq__ method.
That's a pretty bad antipattern which we mostly avoid in library code,
but I expect there's a good bit of user code that does it because it
doesn't adhere to the same standards.

Aaron Meurer

>
> --
> Oscar
>

Aaron Meurer

unread,
Apr 18, 2022, 6:55:15 PM4/18/22
to sy...@googlegroups.com
I think the only way is to move SymPy away from its current approach
of automatically evaluating things when expressions are constructed.
To take a concrete example from
https://github.com/sympy/sympy/issues/10800, consider this expression:

a = 2*Integral(sin(exp(x)), (x, 1, oo)) + 2*Si(E)

This expression currently causes evalf() to hang (which is of course a
separate issue, but it's useful here just to get an idea of where
evalf() is overused).

Try doing almost anything with this expression and you'll see where
evalf() is being used. You can't even print the expression, because
the printer wants to numerically evaluate every numerical
subexpression so that they can be printed in order. You can't create
the expression a > 0 because > wants to automatically evaluate to True
or False when the arguments are numeric. This expression is numeric
(it is actually equal to pi, as doit() will reveal), but just because
something is known to be a number doesn't mean that we can efficiently
compute that number.

Obviously, the best "fix" here is to make it so that a.evalf()
actually computes quickly. But even if it computed a value, it would
still take some time to do that. There's no reason why the printers
really need to evaluate expressions just to sort them. And there's no
reason why > needs to automatically evaluate. If a user creates a >
expression and wants to know if it can be simplified they can call
simplify() or doit().

This is just one example. This sort of thing happens all the time with
assumptions, and with other sorts of automatic simplifications. We
could try to come up with a design where things can evaluate, but only
in cases where they involve only minimal calculations like you
suggested. For example, assumptions might automatically evaluate
simple deductions like positive -> real, but bypass complex things
that require computation like _eval_is_real. This would be complicated
to implement. Better would be to use a design where calculation is not
expected to be done automatically in the first place. This represents
a pretty big change in the way SymPy works, though.

If you look at the matrix expressions, they are close to the sort of
design we should be aiming for. If you create any expression with
matrix expressions, it is completely unevaluated, and only when you
call doit() does it attempt any simplifications, even really basic
ones (with the exception of shape checking, which already might be too
much).

>>> A = MatrixSymbol("A", n, n)
>>> MatAdd(A, A)
A + A
>>> MatAdd(A, A).doit()
2*A

Note that this is different from calling +, which calls doit() automatically:

>>> A + A
2*A

The important difference is that an algorithm that manipulates matrix
expressions which pulls them apart and rebuilds them with args will
effectively create the class directly using MatAdd, not +, so the
evaluation will not happen there. Those algorithms are where the
performance difference between evaluating and not evaluating will
matter the most.

Ideally a class like Add would not even have a __new__ method, so that
it is not possible for it to evaluate directly in its constructor.
Instead all evaluation would happen in a different method like doit().
Certain evaluations like x + x -> 2*x would still happen when using +
so that the end-user experience is still practical.

This would also have the side effect of making unevaluated expressions
easier to work with. Right now you have to create them with things
like evaluate=False, and they are buggy when you use them. If they
were directly supported as the "default" way that expressions work,
then it would be straightforward to use them, and every function would
handle them correctly.
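For reference, the current evaluate=False mechanism mentioned above looks like this with SymPy installed, and it is precisely these unevaluated objects that other functions do not always handle correctly:

```python
from sympy import Add, symbols

x = symbols('x')

# Evaluation happens in Add's constructor by default ...
assert str(x + x) == '2*x'

# ... but can be suppressed, producing the unevaluated form.
unevaluated = Add(x, x, evaluate=False)
assert str(unevaluated) == 'x + x'

# doit() performs the postponed evaluation on request.
assert str(unevaluated.doit()) == '2*x'
```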

If we don't remove the automatic evaluations then any other approach
will always be an uphill battle to get performance, because for any
automatic simplification you can construct expressions that make it
too slow, and because they happen automatically, it will completely
remove any other performance gains you might have had.

Aaron Meurer

Ondřej Čertík

unread,
Apr 18, 2022, 11:45:43 PM4/18/22
to sympy, syme...@googlegroups.com
Hi,

I am CCing the symengine list also. Let me say a few words about the design and goals of SymEngine. It's a result of many years of experience with SymPy and implementing high performance code in general. The goal is the fastest possible symbolic manipulation. So we do not want to introduce slowdowns just to get a different API, rather we should figure out how to create an API that works for everybody, while keeping the speed.

The design is, as Isuru wrote below, quite close to optimal: SymEngine does not do any unnecessary work. The C++ classes just "pass through" the data, which must be prepared ahead of time. In order to get good speed, it needs to store the data in data structures like a hash table, which restricts what it can represent, in return for good speed. It seems that covers 90% of use cases. If you need to represent things differently than what SymEngine can do (such as `x+x` instead of `2*x`), you can always use SymPy, or we can implement some slower representations, but I think that should not be the default.

One idea for improvement is to represent all SymEngine classes in some kind of a description language, similar to what we use for LFortran here: https://gitlab.com/lfortran/lfortran/-/blob/8761fcee2bbc4cb924bf65f5d576541e58bb8b08/src/libasr/ASR.asdl, and generate all the C++ classes automatically from it. This has many advantages (much less code to maintain, consistent code, easy to add a new symbolic function/object, etc.) and almost no downsides. One can then also experiment with changing how things are represented.

Regarding the internal representation itself --- if anyone knows a faster design, please let me know!

* One idea is not to use reference counted pointers at all, and just represent everything by value. That means if you take a term from an Add, it will get copied. When Add goes out of scope, it deallocates all its terms. Whether this approach is overall faster is unclear, but it might be quite a bit faster if you don't need to reuse subexpressions too much.

* Another orthogonal idea is to represent all symbols like "x" in a separate structure (symbol table) and only reference them from expressions like x^3 + 2x^2 + sin(x). The issue again becomes memory management, but there are probably a few ways forward.

* SymEngine was designed to operate fast on general expressions. One can of course get more speed if you know that you are dealing with a polynomial or other more specific structure, but this is an orthogonal idea. It seems just getting the general case working really fast is worth it.

If anyone is interested in discussing this more, I am happy to have a video meeting where we can discuss more. Just let me know!

Ondrej

Alan Bromborsky

unread,
Apr 19, 2022, 7:11:05 AM4/19/22
to sy...@googlegroups.com
For a speedup, how about parallel processing? Looking at Ryzen
processors, 16 cores and 32 threads are quite affordable these days. Is
parallel Python mature these days?

Ondřej Čertík

unread,
Apr 19, 2022, 9:31:13 AM4/19/22
to sympy
Hi Alan!

Indeed, parallelism is another avenue. SymEngine can be compiled in a "thread-safe" mode, which makes it possible to use it in parallel. It's slightly slower, since the reference-counted pointer becomes atomic (so the idea below about not using reference-counted pointers might help here too).

The question then is if it makes sense to parallelize algorithms like expand(), or even just "add" for two larger expressions. Possibly have both serial and parallel options.

I can imagine especially matrix operations could be sped up greatly in parallel.

Ondrej

Aaron Meurer

unread,
Apr 19, 2022, 5:32:53 PM4/19/22
to sy...@googlegroups.com
On Tue, Apr 19, 2022 at 7:31 AM Ondřej Čertík <ond...@certik.us> wrote:
>
> Hi Alan!
>
> Indeed, parallelism is another avenue. SymEngine can be compiled in a "thread-safe" mode, which enables to then use it in parallel. It's slightly slower, since the reference counted pointer becomes an atomic (so the idea below about not using reference counted pointers might help here too).
>
> The question then is if it makes sense to parallelize algorithms like expand(), or even just "add" for two larger expressions. Possibly have both serial and parallel options.
>
> I can imagine especially matrix operations could be sped up greatly in parallel.

I would suggest focusing on "embarrassing" parallelism such as matrix
operations where there are multiple top-level operations. Matrices are
a great example of this. There are also a lot of functions that are
linear, in the sense that if you pass them an Add with many terms you
can just apply them term-wise. Since symbolic expressions are immutable
there's generally no issue with doing operations out of order. Trying
to parallelize lower level operations like constructing a single large
expression is likely to be more work, and in most cases, it is part of
a larger operation, so it itself can be done in parallel with
something else.

To some degree you can already do this sort of thing with SymPy and
the Python multiprocessing module, although there are some issues with
it due to pickling problems.

Aaron Meurer

Matthias Köppe

unread,
May 19, 2022, 2:39:59 PM5/19/22
to sympy
Hi Aaron, Oscar,
Following up regarding the SageDays event. https://wiki.sagemath.org/days112.358
We are now in the phase of collecting abstracts for talks/activities and preparing a schedule.
If you are still interested, could you use https://whenisgood.net/sagedays112358 to indicate details of your availability? I hope to have a schedule ready later this week.

Matthias

Matthias Köppe

unread,
May 19, 2022, 2:41:28 PM5/19/22
to sympy
Hi Isuru,
Would you be interested in contributing a presentation about SymEngine at our SageDays?

Matthias

Oscar Benjamin

unread,
May 28, 2022, 2:00:49 PM5/28/22
to sympy
I can't remember if I filled this out. Is the schedule fixed yet?

Oscar

Matthias Koeppe

unread,
May 28, 2022, 2:36:17 PM5/28/22
to sy...@googlegroups.com
Hi Oscar,
We have a preliminary schedule at
https://researchseminars.org/seminar/SageDays112358, but contributions
are still welcome!
There is a symbolics session on Thursday June 2 starting at 18:00 UTC
(Aaron will speak at 20:00 UTC).
I'll be happy to schedule a presentation at 22:00 UTC if that works
for you -- or let me know about your availability and I can find
another slot.

Matthias



--
Dr. Matthias Koeppe . . . . . . . . http://www.math.ucdavis.edu/~mkoeppe
Professor of Mathematics

Oscar Benjamin

unread,
May 29, 2022, 7:25:12 AM5/29/22
to sympy
I guess we don't need two talks about SymPy if it's just for updating
on development status and plans.

I just realised how close this is: it's Thursday this week. I'm very
busy between now and then so I'm not sure I have time to prepare much.

Aaron have you been working on a talk?

Oscar

Aaron Meurer

unread,
May 29, 2022, 2:30:47 PM5/29/22
to sy...@googlegroups.com
I haven't started yet. I was planning on putting together slides this week. If you want to merge your talk with mine we can do that. 

Aaron Meurer 
