Hi,
I'm posting here because of one of GSoC 2013 ideas, "Efficient Code Generation".
So, I've developed a kind of CSE that is faster than SymPy's cse, though less general and with some quirks.
Be aware that both SymPyBotics and SymCode are badly documented and probably reimplement some functionality which could be taken from SymPy/PyDy (and I probably implement it in worse ways :)
(It uses a Subexprs class, which can also be used on its own to store intermediate variables in recursive computations.)
I've profiled SymPy's cse and noticed that most of the time is spent in the "count" and "subs" functions, so I tried a different approach.
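For anyone who wants to reproduce this kind of measurement, here is a minimal sketch (my own illustration, not the original profiling run) that runs sympy.cse under cProfile on a moderately sized expression and prints where the cumulative time goes:

```python
# Profile sympy.cse and report the top entries by cumulative time.
# The expression below is an arbitrary example, chosen only to give
# cse some repeated subexpressions to work on.
import cProfile
import io
import pstats

import sympy

x, y, z = sympy.symbols('x y z')
expr = sum((x*y + sympy.cos(x*y) + z)**i for i in range(1, 6))

profiler = cProfile.Profile()
profiler.enable()
sympy.cse(expr)
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats(10)
report = stream.getvalue()
print(report)
```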
First, the fast_cse function traverses the expression tree bottom-up and recreates each unique operation with its non-atomic arguments substituted by temporary symbols (this is the "collect" phase).
Each unique operation is stored in a "unique_op : tmp_symbol" dictionary.
For example,
a + b*c + cos(b*c)
is transformed into
t2
while the dictionary holds
a + t0 + t1 : t2
cos(t0) : t1
b * c : t0
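The collect phase can be sketched like this. This is a toy reconstruction of the idea (mine, not the actual SymCode source), using plain tuple-trees ('op', arg, ...) in place of SymPy objects:

```python
table = {}     # unique_op -> tmp_symbol
counter = [0]  # mutable counter for naming temporaries

def collect(expr):
    """Bottom-up pass: replace every unique non-atomic operation
    with a temporary symbol, recording it in `table`."""
    if not isinstance(expr, tuple):      # atoms pass through unchanged
        return expr
    # substitute temporaries into the arguments first (bottom-up)
    op = (expr[0],) + tuple(collect(a) for a in expr[1:])
    # (a real implementation would canonically sort Add/Mul arguments
    # here to exploit the assumed commutativity)
    if op not in table:                  # one symbol per unique op
        table[op] = 't%d' % counter[0]
        counter[0] += 1
    return table[op]

# a + b*c + cos(b*c)
top = collect(('Add', 'a', ('Mul', 'b', 'c'), ('cos', ('Mul', 'b', 'c'))))
print(top)    # → t2
# table now holds three entries: t0 = b*c, t1 = cos(t0), t2 = a + t0 + t1
```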
Additionally, matching of multi-argument Mul and Add operations is performed, similarly to SymPy's cse, although argument commutativity is always assumed.
Then, in the "get" phase, the expression tree is recreated from the dictionary, while only the temporary symbols used more than once are kept as replacements.
The example output will be
( [(t0, b*c)], a + t0 + cos(t0) )
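The get phase can be sketched in the same toy tuple-tree style (again my own reconstruction, not the actual code), starting from the dictionary produced above for a + b*c + cos(b*c):

```python
from collections import Counter

# "unique_op -> tmp_symbol" table from the collect phase
table = {('Mul', 'b', 'c'): 't0',
         ('cos', 't0'): 't1',
         ('Add', 'a', 't0', 't1'): 't2'}
top = 't2'

inverse = {sym: op for op, sym in table.items()}   # tmp_symbol -> op

# count how often each temporary is referenced
# (toy caveat: atoms named like temporaries would collide here)
uses = Counter([top])
for op in inverse.values():
    uses.update(a for a in op[1:] if a in inverse)

replacements = []

def rebuild(sym):
    """Rebuild the tree: inline temporaries used once, keep the
    shared ones as (symbol, expression) replacement pairs."""
    op = inverse[sym]
    expr = (op[0],) + tuple(rebuild(a) if a in inverse else a
                            for a in op[1:])
    if uses[sym] > 1:                    # shared: keep as replacement
        if all(s != sym for s, _ in replacements):
            replacements.append((sym, expr))
        return sym
    return expr                          # used once: inline it

result = rebuild(top)
print((replacements, result))
# → ([('t0', ('Mul', 'b', 'c'))], ('Add', 'a', 't0', ('cos', 't0')))
```

which mirrors the ( [(t0, b*c)], a + t0 + cos(t0) ) output above: only t0 survives as a replacement because t1 and t2 are used once and get inlined.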
This is much faster than SymPy's cse, although the output is generally different. Also, it still doesn't work with "iterable" arguments and, as said, non-commutativity is not respected.
I would love to have time to work on this, or to work directly on optimizing SymPy's cse, but I'm currently overloaded with work.
Nevertheless, I'm showing you this since some of the ideas may be useful.
Best regards,
Cristóvão Sousa