meijerint.py: Sorting in _mul_as_two_parts, _rewrite2

96 views
Skip to first unread message

Joachim Durchholz

unread,
Mar 23, 2015, 2:21:04 PM3/23/15
to sy...@googlegroups.com
Hi all,

this probably mostly goes to Tom Bachmann and Chris Smith, but maybe
others know a bit about Meijer integration.

I'm working on something that changes the internal ordering of terms a
bit, and Meijer seems to rely on the current ordering for some
vaguely-explained benefits; I need to find out whether it's relevant or not.

Specifically, during
sympy/integrals/tests/test_integrals.py:test_issue_8368 ,
I see a call to _rewrite2(f, x) with
f = Mul: exp(_x)*exp(-_x*s)/2 and
x = Dummy: _x
(it seems only f is relevant here).

The docstring for _rewrite2 talks about multiset_partitions returning
stuff in a canonical (but not always optimal) order. Frustratingly, it
does not really explain what the problem is, nor what will fail if the
order is changed.

It might be ultimately irrelevant though, as further processing filters
the result through ordered() in _rewrite2. If that's the case, then
maybe the docstring of _rewrite2 is wrong about test cases breaking, so
I'm unsure whether my current understanding of what's going on is
missing something important.

Any tips, pointers and explanations appreciated.

Regards,
Jo

Joachim Durchholz

unread,
Mar 23, 2015, 5:35:40 PM3/23/15
to sy...@googlegroups.com
(Following up by myself)

Background: The test does integrate(exp(-s*x)*sinh(x), (x, 0, oo)).

Old sort order gives me

Piecewise(
(
-1/(s + 1)/2 - 1/(-s + 1)/2, #####1
And(
Ne(1/s, 1), #####2
Abs(periodic_argument(s, oo)) < pi/2,
Abs(periodic_argument(s, oo)) <= pi/2,
cos(Abs(periodic_argument(s, oo)))*Abs(s) - 1 > 0
)
),
(
Integral(exp(-s*x)*sinh(x), (x, 0, oo)),
True
)
)

New sort order comes back with
Piecewise(
(
-1/(2*(s + 1)) - 1/(2*(-s + 1)), #####1
And(
Abs(periodic_argument(s, oo)) < pi/2,
Abs(periodic_argument(s, oo)) <= pi/2,
Ne(s, 1), ####2
cos(Abs(periodic_argument(s, oo)))*Abs(s) - 1 > 0
)
),
(
Integral(exp(-s*x)*sinh(x), (x, 0, oo)),
True
)
)

Points of interest:

#####1 Different formulae, but same meaning. Haven't checked yet whether
that's just due to different term ordering or to different choices made
during Meijer.

#####2 Now this is asserting Eq(1/s,1) for old and Eq(s,1) for new.
These are semantically different for the case s=0 (i.e. integral of just
sinh(x)): Old sort order gives a fail, new sort order gives a definite
result.
Now I'm wondering how this is accidentally improving Meijer heuristics.
(Or is the difference mathematically irrelevant?) And I'm wondering
where a changed sort order might accidentally worsen the heuristics.

Aaron Meurer

unread,
Mar 23, 2015, 5:40:21 PM3/23/15
to sy...@googlegroups.com
If s = 0 in either case Ne(s, 1) and Ne(1/s, 1) will be False, so I
think in this case they are completely the same.

The version with s is obviously simpler, but we shouldn't rely on the
algorithm to give simple results, unless there are no other changes.

I would try to find some other example that this gives different
results for, or, preferably, get ahold of Tom Bachmann, and see if he
can shed any insight.

Aaron Meurer

>
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/55108726.2070803%40durchholz.org.
>
> For more options, visit https://groups.google.com/d/optout.

Joachim Durchholz

unread,
Mar 24, 2015, 5:36:35 AM3/24/15
to sy...@googlegroups.com
Am 23.03.2015 um 22:39 schrieb Aaron Meurer:
> On Mon, Mar 23, 2015 at 4:35 PM, Joachim Durchholz <j...@durchholz.org> wrote:
>> #####2 Now this is asserting Eq(1/s,1) for old and Eq(s,1) for new. These
>> are semantically different for the case s=0 (i.e. integral of just sinh(x)):
>> Old sort order gives a fail, new sort order gives a definite result.
>> Now I'm wondering how this is accidentally improving Meijer heuristics. (Or
>> is the difference mathematically irrelevant?) And I'm wondering where a
>> changed sort order might accidentally worsen the heuristics.
>
> If s = 0 in either case Ne(s, 1) and Ne(1/s, 1) will be False,

Ah. I'd thought that Ne(1/s, 1) would be None (because 1/0 undefined).

> so I
> think in this case they are completely the same.

Then they are.

> The version with s is obviously simpler, but we shouldn't rely on the
> algorithm to give simple results, unless there are no other changes.

Is Meijer heuristic? In that case, I think variation due to changed sort
order would be expected.

Oh. I find the ordered() call in _rewrite2 isn't dependent on Basic
ordering at all,

> I would try to find some other example that this gives different
> results for,

I won't find anything by myself (not enough integral knowledge on this
side of the keyboard), but I'll take a look at the other unit test
failure related to integrals.

Or, alternatively, we could start torture testing SymPy by making Basic
use a random (but consistent-per-run) sort order, e.g. XOR
PYTHONHASHROOT into the class names before comparing them.
I'm obviously working on making SymPy's unit tests robust against this
kind of change, but I'm sceptical that this will be doable for heuristic
algorithms.

> or, preferably, get ahold of Tom Bachmann, and see if he
> can shed any insight.

I'll try to contact him.

Joachim Durchholz

unread,
Mar 24, 2015, 5:02:24 PM3/24/15
to sy...@googlegroups.com
Am 23.03.2015 um 22:39 schrieb Aaron Meurer:
> I would try to find some other example that this gives different
> results for, or, preferably, get ahold of Tom Bachmann, and see if he
> can shed any insight.

I can't find Tom via github, his account seems to be gone.
I tried contacting him via the mail address in the AUTHORS file, but I
didn't get an answer yet.

Aaron Meurer

unread,
Mar 24, 2015, 5:30:21 PM3/24/15
to sy...@googlegroups.com
On Tue, Mar 24, 2015 at 4:36 AM, Joachim Durchholz <j...@durchholz.org> wrote:
> Am 23.03.2015 um 22:39 schrieb Aaron Meurer:
>>
>> On Mon, Mar 23, 2015 at 4:35 PM, Joachim Durchholz <j...@durchholz.org>
>> wrote:
>>>
>>> #####2 Now this is asserting Eq(1/s,1) for old and Eq(s,1) for new. These
>>> are semantically different for the case s=0 (i.e. integral of just
>>> sinh(x)):
>>> Old sort order gives a fail, new sort order gives a definite result.
>>> Now I'm wondering how this is accidentally improving Meijer heuristics.
>>> (Or
>>> is the difference mathematically irrelevant?) And I'm wondering where a
>>> changed sort order might accidentally worsen the heuristics.
>>
>>
>> If s = 0 in either case Ne(s, 1) and Ne(1/s, 1) will be False,
>
>
> Ah. I'd thought that Ne(1/s, 1) would be None (because 1/0 undefined).
>
>> so I
>>
>> think in this case they are completely the same.
>
>
> Then they are.
>
>> The version with s is obviously simpler, but we shouldn't rely on the
>> algorithm to give simple results, unless there are no other changes.
>
>
> Is Meijer heuristic? In that case, I think variation due to changed sort
> order would be expected.

Yes, it uses a lot of pattern matching, which is heuristic in nature.

Aaron Meurer

>
> Oh. I find the ordered() call in _rewrite2 isn't dependent on Basic ordering
> at all,
>
>> I would try to find some other example that this gives different
>> results for,
>
>
> I won't find anything by myself (not enough integral knowledge on this side
> of the keyboard), but I'll take a look at the other unit test failure
> related to integrals.
>
> Or, alternatively, we could start torture testing SymPy by making Basic use
> a random (but consistent-per-run) sort order, e.g. XOR PYTHONHASHROOT into
> the class names before comparing them.
> I'm obviously working on making SymPy's unit tests robust against this kind
> of change, but I'm sceptical that this will be doable for heuristic
> algorithms.
>
>> or, preferably, get ahold of Tom Bachmann, and see if he
>>
>> can shed any insight.
>
>
> I'll try to contact him.
>
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/55113020.60804%40durchholz.org.

Joachim Durchholz

unread,
Mar 27, 2015, 6:26:43 PM3/27/15
to sy...@googlegroups.com
Am 24.03.2015 um 22:29 schrieb Aaron Meurer:
> On Tue, Mar 24, 2015 at 4:36 AM, Joachim Durchholz <j...@durchholz.org> wrote:
>> Is Meijer heuristic? In that case, I think variation due to changed sort
>> order would be expected.
>
> Yes, it uses a lot of pattern matching, which is heuristic in nature.

I have tracked down the difference to a flatten() call inside the Mul
constructor. After that, Meijer is getting its heuristics in a different
order.
(I have taken a cursory look at the other test failures, and ~75% seem
to be integration-heavy math such as quantum physics.)

Problem is that if we change Mul to not sort its arguments via flatten,
this is going to ripple all over SymPy. That's not a labyrinth I wish to
enter. (1)

The other approach would be to accept that some unit tests depend on
sort order in a superficial way, and to change the unit tests. (2)

Is there a third, better approach?

Aaron Meurer

unread,
Mar 27, 2015, 7:52:18 PM3/27/15
to sy...@googlegroups.com
Is the result deterministic? And is it still mathematically correct?

Aaron Meurer

>
> Is there a third, better approach?
>
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/5515D91E.7040509%40durchholz.org.

Joachim Durchholz

unread,
Mar 28, 2015, 3:18:16 AM3/28/15
to sy...@googlegroups.com
Am 28.03.2015 um 00:51 schrieb Aaron Meurer:
> On Fri, Mar 27, 2015 at 5:26 PM, Joachim Durchholz <j...@durchholz.org> wrote:
>> The other approach would be to accept that some unit tests depend on sort
>> order in a superficial way, and to change the unit tests. (2)
>
> Is the result deterministic? And is it still mathematically correct?

Deterministic: yes.

Mathematically correct: Sort of, one Piecewise condition goes from
Ne(1/s, 1) to Ne(s, 1).
For SymPy, it is equivalent; for the human reader, it might be
considered mathematically "almost equivalent" due to the singularity at s=0.
With a mathematically incorrect result I'd assume a bug in the
integration algorithm anyway.

>> Is there a third, better approach?

I indeed came up with third approach tonight: keep the existing sort
order and remove the comment that it is outdated.
Not better, but correct enough to have a place on the list of options.

Joachim Durchholz

unread,
Mar 28, 2015, 9:02:35 AM3/28/15
to sy...@googlegroups.com
Am 28.03.2015 um 00:51 schrieb Aaron Meurer:
> On Fri, Mar 27, 2015 at 5:26 PM, Joachim Durchholz <j...@durchholz.org> wrote:
>> The other approach would be to accept that some unit tests depend on sort
>> order in a superficial way, and to change the unit tests. (2)

The not-so-nice aspect about this one is that the PR must contain both
changes: those to the sorting order, and those to all the tests that
actually depend on that sorting order (a dozen or two of them).

It's not going to be a monster PR, fortunately.

Rathmann

unread,
Apr 13, 2015, 1:13:12 AM4/13/15
to sy...@googlegroups.com
Just one data point that may be relevant -- blame shows the comment as dating from 2012.

The code for multiset partitions has changed since then.

Before pull request 1658, by Chris Smith, multiset_partitions gave bad results.   1658 is from about the same time as the comment.

Later, (pr 1848) I replaced Chris' implementation with a faster algorithm. The current ordering is deterministic, but I would be surprised if it were the same as the old algorithm. 

I don't claim any knowledge of Meijer integration, but I don't remember breaking any tests there.

Joachim Durchholz

unread,
Apr 13, 2015, 4:36:57 AM4/13/15
to sy...@googlegroups.com
Am 13.04.2015 um 07:13 schrieb Rathmann:
> Just one data point that may be relevant -- blame shows the comment as
> dating from 2012.
>
> The code for multiset partitions has changed since then.

Ah thanks, I didn't notice that. Thanks for the pointer.

I'd still like to know what motivated that comment, in particular
whether the "tests break" part of it was "tests break because the
integration gives wrong results" or "tests break because the result is
different, though still correct".
This is going to be cleared up when Tom comes around (he promised to in
private mail, he was just too busy with other stuff to deal with the
issue right away).

> Before pull request 1658, by Chris Smith, multiset_partitions gave bad
> results. 1658 is from about the same time as the comment.
>
> Later, (pr 1848) I replaced Chris' implementation with a faster algorithm.
> The current ordering is deterministic, but I would be surprised if it were
> the same as the old algorithm.

Meijer uses quite small sets, two or at most three elements IIRC.
Do you think your changes should have affected the result order given so
small sets?

> I don't claim any knowledge of Meijer integration, but I don't remember
> breaking any tests there.

Regards,
Jo

Tom Bachmann

unread,
Apr 19, 2015, 9:31:37 AM4/19/15
to sy...@googlegroups.com
Hi,

I am sorry I am being so unresponsive. To be totally honest I do not think I will be able to keep up with you guys. The code is changing (I'm sure for the better) and my memory is already failing me with regards to internal details. Pursuing a PhD in mathematics I simply do not have the time to keep up to date with sympy.

That being said, let me try my best. As Aaron has explained, the G-function "algorithm" for integration is a heuristic pattern matching algorithm that employs tables of mellin transforms (in the convenient form of G-functions). The basic idea is that in order to compute an integral

integrate(f, x, 0, oo)

we try to write

(*)   f = x^a * g_1 * g_2

and then try to recognise g_1 and g_2 as special cases of meijer g functions. Once this is done, there is a convolution theorem which expresses the definite integral as another g-function, under certain conditions. So the "algorithm" really has about four parts:

(1) split the function f in the form (*)
(2) recognise (combinations of) elementary functions as G-functions
(3) check the conditions of the convolution theorem
(4) expand the g-functions into elementary functions again

(There are some additional difficulties, such as the fact that the G-functions really must be considered as continued analytically to the Riemann surface of the logarithm in general, but let's try and not get bogged down.) All of thees steps are heuristic! Moreover, steps (2), (3) and (4) involve highly complicated tables of mathematical expressions, making the whole thing a testing nightmare.

The testing philosophy I went for with meijerint was to look at a lot of common "book" integrals and make sure the algorithm can do them. What this means may not be entirely clear. But certainly we want to have a definite integral returned in elementary terms, and non-vacuous conditions under which the result is true. But there is definitely no *correct* answer.


So: If you change something about the internals of the algorithm, expect a lot of tests to fail. The thing to do is to ensure the new results are mathematically equivalent and at least as "nice" (simple) as before. This is unfortunately very tedious.

All the best,
Tom

Joachim Durchholz

unread,
Apr 25, 2015, 6:12:51 PM4/25/15
to sy...@googlegroups.com
Am 19.04.2015 um 15:31 schrieb Tom Bachmann:
> The testing philosophy I went for with meijerint was to look at a lot of
> common "book" integrals and make sure the algorithm can do them. What this
> means may not be entirely clear. But certainly we want to have a definite
> integral returned in elementary terms, and non-vacuous conditions under
> which the result is true. But there is definitely no *correct* answer.
>
>
> So: If you change something about the internals of the algorithm, expect a
> lot of tests to fail. The thing to do is to ensure the new results are
> mathematically equivalent

So I guess if I find changes that make tests fail, I'll have to verify
that old and new result are mathematically equivalent.

> and at least as "nice" (simple) as before. This
> is unfortunately very tedious.

This is going to be a real problem. Even if I see some specific
integrals become worse, this might be compensated by other integrals
becoming better. And vice versa, of course.
I have no way to know whether I'm improving or damaging SymPy on that front.

One last set of question here:
What was your overall strategy when choosing heuristics?
Was that strategy in some way derived from the SymPy's term sort order?
(If no, improvement and loss will tend to cancel each other out and I
can stop worrying about this.)

Tom Bachmann

unread,
Apr 28, 2015, 9:27:33 AM4/28/15
to sy...@googlegroups.com
> One last set of question here:
> What was your overall strategy when choosing heuristics?
> Was that strategy in some way derived from the SymPy's term sort order?
> (If no, improvement and loss will tend to cancel each other out and I
> can stop worrying about this.)
>

Which heuristics are you referring to? In any case, sympy term order has
not been taken into account (by me, at least).

Tom

Joachim Durchholz

unread,
Apr 28, 2015, 9:56:31 AM4/28/15
to sy...@googlegroups.com
Am 28.04.2015 um 15:27 schrieb Tom Bachmann:
>> One last set of question here:
>> What was your overall strategy when choosing heuristics?
>> Was that strategy in some way derived from the SymPy's term sort order?
>> (If no, improvement and loss will tend to cancel each other out and I
>> can stop worrying about this.)
>>
>
> Which heuristics are you referring to?

Those in Meijer. I was assuming Meijer is inherently heuristic.

> In any case, sympy term order has
> not been taken into account (by me, at least).

Okay then.
I was worried that any bugs that I might see would be attributed to the
sort order. With your answer, I now know that any problems arising from
a changed sort order will have to be fixed in the integral code; that's
reassuring and I can proceed.

Thanks for the answers!
Jo
Reply all
Reply to author
Forward
0 new messages