Hi,
I'm posting here because of one of GSoC 2013 ideas, "Efficient Code Generation".
So, I've developed a kind of CSE that is faster than SymPy's cse, though less general and with some quirks.
Be aware that both SymPyBotics and SymCode are badly documented and probably reimplement some functionality which could be taken from SymPy/PyDy (and I probably implement it in worse ways :)
(It uses a Subexprs class, which can also be used on its own to store intermediate variables in recursive computations.)
I've profiled SymPy's cse and noticed that most of the time is spent in the "count" and "subs" functions, so I tried a different approach.
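For anyone who wants to reproduce this kind of measurement, here is a minimal sketch (my own illustration, not the original profiling run) that runs sympy.cse under cProfile on a moderately sized expression and prints where the cumulative time goes:

```python
# Profile sympy.cse and report the top entries by cumulative time.
# The expression below is an arbitrary example, chosen only to give
# cse some repeated subexpressions to work on.
import cProfile
import io
import pstats

import sympy

x, y, z = sympy.symbols('x y z')
expr = sum((x*y + sympy.cos(x*y) + z)**i for i in range(1, 6))

profiler = cProfile.Profile()
profiler.enable()
sympy.cse(expr)
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats(10)
report = stream.getvalue()
print(report)
```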
First, the fast_cse function traverses the expression tree bottom-up and recreates each unique operation with its non-atomic arguments substituted by temporary symbols (this is the "collect" phase).
Each unique operation is stored in a "unique_op : tmp_symbol" dictionary.
For example,
a + b*c + cos(b*c)
is transformed into
t2
while the dictionary holds
a + t0 + t1 : t2
cos(t0) : t1
b * c : t0
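The collect phase can be sketched like this. This is a toy reconstruction of the idea (mine, not the actual SymCode source), using plain tuple-trees ('op', arg, ...) in place of SymPy objects:

```python
table = {}     # unique_op -> tmp_symbol
counter = [0]  # mutable counter for naming temporaries

def collect(expr):
    """Bottom-up pass: replace every unique non-atomic operation
    with a temporary symbol, recording it in `table`."""
    if not isinstance(expr, tuple):      # atoms pass through unchanged
        return expr
    # substitute temporaries into the arguments first (bottom-up)
    op = (expr[0],) + tuple(collect(a) for a in expr[1:])
    # (a real implementation would canonically sort Add/Mul arguments
    # here to exploit the assumed commutativity)
    if op not in table:                  # one symbol per unique op
        table[op] = 't%d' % counter[0]
        counter[0] += 1
    return table[op]

# a + b*c + cos(b*c)
top = collect(('Add', 'a', ('Mul', 'b', 'c'), ('cos', ('Mul', 'b', 'c'))))
print(top)    # → t2
# table now holds three entries: t0 = b*c, t1 = cos(t0), t2 = a + t0 + t1
```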
Additionally, matching of multi-argument Mul and Add operations is performed, similarly to SymPy's cse, although argument commutativity is always assumed.
Then, in the "get" phase, the expression tree is recreated from the dictionary, while only the temporary symbols used more than once are kept as replacements.
The example output will be
( [(t0, b*c)], a + t0 + cos(t0) )
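The get phase can be sketched in the same toy tuple-tree style (again my own reconstruction, not the actual code), starting from the dictionary produced above for a + b*c + cos(b*c):

```python
from collections import Counter

# "unique_op -> tmp_symbol" table from the collect phase
table = {('Mul', 'b', 'c'): 't0',
         ('cos', 't0'): 't1',
         ('Add', 'a', 't0', 't1'): 't2'}
top = 't2'

inverse = {sym: op for op, sym in table.items()}   # tmp_symbol -> op

# count how often each temporary is referenced
# (toy caveat: atoms named like temporaries would collide here)
uses = Counter([top])
for op in inverse.values():
    uses.update(a for a in op[1:] if a in inverse)

replacements = []

def rebuild(sym):
    """Rebuild the tree: inline temporaries used once, keep the
    shared ones as (symbol, expression) replacement pairs."""
    op = inverse[sym]
    expr = (op[0],) + tuple(rebuild(a) if a in inverse else a
                            for a in op[1:])
    if uses[sym] > 1:                    # shared: keep as replacement
        if all(s != sym for s, _ in replacements):
            replacements.append((sym, expr))
        return sym
    return expr                          # used once: inline it

result = rebuild(top)
print((replacements, result))
# → ([('t0', ('Mul', 'b', 'c'))], ('Add', 'a', 't0', ('cos', 't0')))
```

which mirrors the ( [(t0, b*c)], a + t0 + cos(t0) ) output above: only t0 survives as a replacement because t1 and t2 are used once and get inlined.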
This is much faster than SymPy's cse, although the output is generally different. Also, it still doesn't work with "iterable" arguments and, as said, non-commutativity is not respected.
I would love to have time to work on this, or to work directly on optimizing SymPy's cse, but I'm currently overloaded with work.
Nevertheless, I'm showing you this since some of the ideas may be useful.
Best regards,
Cristóvão Sousa