Rewrite of expand (focus on hints)

Sergiu Ivanov

unread,

Jul 15, 2012, 5:17:29 AM7/15/12

to sy...@googlegroups.com

Hello,

In #1395 we have started the discussion about how expand applies hints
and, mainly, about the fact that, eventually, the hints are actually
applied a number of times.

I tried to find a thread on this topic, failed, and decided it would
be good to have one.

Chris said [0]:

I tried a total rewrite of expand with hints which is very different
from what is being done here. So as far as how to sort the keywords
I don't have advice past the suggestion that you do the more
dependent ones last. Does the docstring have something to say about
that -- something about doing power_base before power_exp?

Chris, I couldn't find any mention about the order of these hints in
the docstring.

Does anyone have anything to say about why hints are actually applied
multiple times? I know, the best way is to read the code, but, if
someone has at least a bit of that information in their mind... :-)

Sergiu

[0] https://github.com/sympy/sympy/pull/1395#discussion_r1164777

Chris Smith

unread,

Jul 16, 2012, 12:16:00 AM7/16/12

to sy...@googlegroups.com

> I tried to find a thread on this topic, failed, and decided it would
> be good to have one.

see issue 1725 and my branch mentioned there for what I tried, but
there isn't discussion about the ordering as I recall

> Chris, I couldn't find any mention about the order of these hints in
> the docstring.

See the paragraph just before the examples in function.py ("Note:
because hints are applied in arbitrary order, some hints may...")

/c

Chris Smith

unread,

Jul 16, 2012, 12:21:15 AM7/16/12

to sy...@googlegroups.com

> Does anyone have anything to say about why hints are actually applied
> multiple times? I know, the best way is to read the code, but, if
> someone has at least a bit of that information in their mind... :-)

It seemed to me that things were excessively repetitive, but I think
part of the reason is that it's hard to anticipate how one expansion
may make another need to be reapplied. So as I recall, if hint foo is
being applied, along with 5 other hints, foo is applied and then all 5
others are applied; then the next of the 5 hints is applied and the
remaining 4 that haven't been applied (or is it the other 5?) are
applied again, etc... until all hints have been applied.

Joachim Durchholz

unread,

Jul 16, 2012, 1:29:35 AM7/16/12

to sy...@googlegroups.com

Am 16.07.2012 06:21, schrieb Chris Smith:
> It seemed to me that things were excessively repetitive, but I think
> part of the reason is that it's hard to anticipate how one expansion
> may make another need to be reapplied. So as I recall, if hint foo is
> being applied, along with 5 other hints, foo is applied and then all 5
> others are applied; then the next of the 5 hints is applied and the
> remaining 4 that haven't been applied (or is it the other 5?) are
> applied again, etc... until all hints have been applied.

Hurk. That's an antipattern if I ever saw one.
For example, how can you be sure that if A leads to a reapplication of
B, that B doesn't lead to a reapplication of A?
Nevermind the quadratic blowup.

A hint should know about which other hints may influence it. Something
like "this hint must be run after that hint".
And if the runs-after relationship turns out to be cyclic, I think the
two hints should have their implementations merged. Sometimes it's
enough to merge just the tree walker or whatever is tangling the two,
and have the remaining code still separate.

Just sayin'.

Sergiu Ivanov

unread,

Jul 16, 2012, 5:29:21 AM7/16/12

to sy...@googlegroups.com

On Mon, Jul 16, 2012 at 7:16 AM, Chris Smith <smi...@gmail.com> wrote:
>> I tried to find a thread on this topic, failed, and decided it would
>> be good to have one.
>
> see issue 1725 and my branch mentioned there for what I tried, but
> there isn't discussion about the ordering as I recall

Thank you, I'll take a look at your branch whenever I have some time.

I've seen the issue 1725 some time ago, but have forgotten about it
lately, thanks for pointing out, there seem to be a number of
interesting ideas.

>> Chris, I couldn't find any mention about the order of these hints in
>> the docstring.
>
>
> See the paragraph just before the examples in function.py ("Note:
> because hints are applied in arbitrary order, some hints may...")

Well, yes. I was too focused on thinking about problems to remark
that your question was not so much about problems as it was about the
possibility of getting different results.

On Mon, Jul 16, 2012 at 8:29 AM, Joachim Durchholz <j...@durchholz.org> wrote:
> Am 16.07.2012 06:21, schrieb Chris Smith:
>
>> It seemed to me that things were excessively repetitive, but I think
>> part of the reason is that it's hard to anticipate how one expansion
>> may make another need to be reapplied. So as I recall, if hint foo is
>> being applied, along with 5 other hints, foo is applied and then all 5
>> others are applied; then the next of the 5 hints is applied and the
>> remaining 4 that haven't been applied (or is it the other 5?) are
>> applied again, etc... until all hints have been applied.
>
>
> Hurk. That's an antipattern if I ever saw one.
> For example, how can you be sure that if A leads to a reapplication of B,
> that B doesn't lead to a reapplication of A?
> Nevermind the quadratic blowup.

That's what I thought as well.

> A hint should know about which other hints may influence it. Something like
> "this hint must be run after that hint".
> And if the runs-after relationship turns out to be cyclic, I think the two
> hints should have their implementations merged. Sometimes it's enough to
> merge just the tree walker or whatever is tangling the two, and have the
> remaining code still separate.

I have also just caught a glimpse of an idea of a hint dependency
graph. However, I wonder how feasible it would be to create one which
would be sufficiently general to always work. I haven't thought too
much on that matter, but I'm afraid that the cost of eliminating loops
may sometimes be too dramatic, in the sense that it would disable
expand to do something which it currently can do.

On the other hand, I'm not sure whether merging two expansions is
always the proper way out.

The disclaimer here is that I know rather little about the general
principles along which the expansion routines work, so I'm just
conjecturing.

Sergiu

Chris Smith

unread,

Jul 16, 2012, 8:06:09 AM7/16/12

to sy...@googlegroups.com

> I have also just caught a glimpse of an idea of a hint dependency
> graph. However, I wonder how feasible it would be to create one which
> would be sufficiently general to always work. I haven't thought too
> much on that matter, but I'm afraid that the cost of eliminating loops
> may sometimes be too dramatic, in the sense that it would disable
> expand to do something which it currently can do.

Another way to think about this problem is to consider certain
expressions to be "attractors" for hints. e.g. a Mul is an attractor
for the mul hint. So the expression tree could be scanned with `has`
(e.g. expr.has(Mul)) to see if a certain expression should be applied.
So rather than blindly trying to apply all hints, only the "attracted
hints" would be applied at a given step. But maybe this violates the
KISS principle, I don't know.

Aaron Meurer

unread,

Jul 16, 2012, 9:23:52 PM7/16/12

to sy...@googlegroups.com

I've been looking into the test_expand() memory leak (e.g., with
PYTHONHASHSEED=2538990509). Apparently, the problem is not with cse,
but with expand. If you put a print statement in expr.py like

diff --git a/sympy/core/expr.py b/sympy/core/expr.py
index 2a2ae16..38bffcf 100644
--- a/sympy/core/expr.py
+++ b/sympy/core/expr.py
@@ -2645,6 +2645,7 @@ def expand(self, deep=True, modulus=None,
power_base=True, power_exp=True, \
return n.expand(deep=deep, modulus=modulus, **hints)/d
for hint, use_hint in hints.iteritems():
if use_hint:
+ print 'expanding %s' % hint
func = getattr(expr, '_eval_expand_'+hint, None)
if func is not None:
expr = func(deep=deep, **hints)

and run the tests with PYTHONHASHSEED=2538990509 (64-bit), you'll get

...
expanding mul
expanding log
expanding basic
expanding power_exp
expanding power_base
expanding multinomial
expanding mul
expanding log
expanding basic
expanding power_exp
expanding power_base
expanding multinomial
expanding mul
expanding log
expanding basic
expanding power_exp
expanding power_base
expanding multinomial
expanding mul
expanding log
expanding basic
expanding power_exp
expanding power_base
expanding multinomial
expanding mul
expanding log
expanding basic
expanding power_exp
expanding power_base
...

and so on, infinitely. Clearly somewhere is not handling the base
case of recursion correctly, but solving this more generally should
take care of the problem as well.

Aaron Meurer

Aaron Meurer

unread,

Jul 16, 2012, 9:34:56 PM7/16/12

to sy...@googlegroups.com

I think I figured out the cause of this. Some expand methods are
calling expand to do deep=True recursion. For example,
Pow._eval_expand_power_exp(). This results in the hints being called
recursively indefinitely.

I think each hint should only call itself recursively. I'll try to
make this change and see if it doesn't break anything.

Aaron Meurer

Aaron Meurer

unread,

Jul 16, 2012, 10:03:39 PM7/16/12

to sy...@googlegroups.com

Chris, can you explain what's going on in Add._eval_expand_mul? This
is breaking the pattern of all the other do-nothing _eval_expand
functions, and I don't understand what it's supposed to be doing. Is
this a hack, or do you think all such noop functions should be written
this way?

The relevant commit is 6b892e98.

Aaron Meurer

Chris Smith

unread,

Jul 16, 2012, 10:15:51 PM7/16/12

to sy...@googlegroups.com

On Mon, Jul 16, 2012 at 9:03 PM, Aaron Meurer <asme...@gmail.com> wrote:
> Chris, can you explain what's going on in Add._eval_expand_mul? This
> is breaking the pattern of all the other do-nothing _eval_expand
> functions, and I don't understand what it's supposed to be doing. Is
> this a hack, or do you think all such noop functions should be written
> this way?
>
> The relevant commit is 6b892e98.
>

Perhaps I can check on this in the next day or two.

/c

Aaron Meurer

unread,

Jul 16, 2012, 11:26:24 PM7/16/12

to sy...@googlegroups.com

I've gone ahead and removed it in my forthcoming branch for now. With
the changes I've made so far, the issue from 3022,
S("-I*exp(-3*I*pi/4)/(4*pi**(3/2)*sqrt(t))").expand(complex=True),
doesn't give a recursion error.

Aaron Meurer

> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To post to this group, send email to sy...@googlegroups.com.
> To unsubscribe from this group, send email to sympy+un...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
>

Aaron Meurer

unread,

Jul 17, 2012, 6:14:07 PM7/17/12

to sy...@googlegroups.com

Regarding "dependencies" of hints, I don't think it's possible.
Consider for example

cos(x*(y + 1))*(z + 1)

expanded with the trig and mul hints. On the one hand, you need to
apply the mul hint first to expand the x*(y + 1) to x*y + x.
Otherwise, the trig hint will not expand the cosine of a sum. On the
other hand, if you apply mul before trig, then when you expand, you'll
get (-sin(x)*sin(x*y) + cos(x)*cos(x*y))*z + -sin(x)*sin(x*y) +
cos(x)*cos(x*y). This can be carried out to arbitrary depth, and
using almost any of the hints.

In master, expand(cos(x*(y + 1))*(z + 1), trig=True) gives the fully
expanded form

-z*sin(x)*sin(x*y) + z*cos(x)*cos(x*y) - sin(x)*sin(x*y) + cos(x)*cos(x*y)

In my branch, it will give

z*(-sin(x)*sin(x*y) + cos(x)*cos(x*y)) - sin(x)*sin(x*y) + cos(x)*cos(x*y)

(this depends on the order of the hints, of course, which currently
depends on hash randomization, but that's the idea).

So that's the price to pay. In my branch, hints are applied exactly
once, in whatever order they are applied in (I'm not planning on
messing with that for now). In master, they are recursively applied
several times. The result is that in my branch, things should be
faster, but you may have to call expand() multiple times to get a
fully expanded expression if it happens to require multiple passes to
do so.

On my computer, in my branch, test_expand takes 0.12 seconds, in
master, it takes 0.79 seconds, mostly due to the infamous slow
test_3022 (I tested on Python 2.5 to keep hash randomization out of
the picture).

Aaron Meurer

Chris Smith

unread,

Jul 17, 2012, 11:13:50 PM7/17/12

to sy...@googlegroups.com

> So that's the price to pay. In my branch, hints are applied exactly
> once, in whatever order they are applied in (I'm not planning on
> messing with that for now). In master, they are recursively applied
> several times. The result is that in my branch, things should be
> faster, but you may have to call expand() multiple times to get a
> fully expanded expression if it happens to require multiple passes to
> do so.

Initially, I don't like this (and that was the intent of modifying
Add._eval_expand_mul, I would guess). If I use the hint expand_mul, I
would like to know that there is no Mul containing an Add factor left
when the expansion is done. On the other hand, if we had well defined,
one level of expansion methods to handle a single hint (as I imagine
you are proposing) then those could be used with more discretion and
confidence of not being excessively recursive -- the recursion would
be handled by the method that uses the focused expander. For example,
say expand_mul only did a single level of expansion on a Mul. This,
then might be how it would be used in expand when given a mul=True
hint:

def expand(mul=False):
if mul:
e = expr
while 1:
expr = e.xreplace(Transform(lambda w:expand_mul(w), lambda
w:w.is_Mul and any(f.is_Add for f in w.args)))
if expr == e:
break
return expr

Reply all

Reply to author

Forward