Pattern matching and mathematical-awareness

72 views
Skip to first unread message

F. B.

unread,
Sep 1, 2014, 7:28:45 AM9/1/14
to sy...@googlegroups.com
SymPy's default pattern matcher is not a structural one, i.e. it tries to match according to some mathematical rules. A structural pattern matcher would instead just match the structure of the expression-tree.

While it is necessary to have some mathematical awareness such as, give the wild w:
  • associativity awareness (i.e. f(a, b, c) = f(a, f(b, c)) and similar):
    (a+b+c).match(a+w) ==> {w: b+c}
  • commutativity awareness:
    (a+b).match(w+a) ==> {w: b}
  • one-identity awareness (f is one-identical if f(x) = x for any x):
    x.match(Add(w, evaluate=False)) ==> {w: x}

These rules are also available in Wolfram Mathematica.

There are some results which come out as an attempt to apply some kind of equation solving:

In [20]: (x*y).match(x/w)
Out[20]:
  1
w: ─⎬
  y

That is, Mul(x, y) is matched to become Mul(x, Pow(Mul(1, Pow(y, -1)), -1)) which is equivalent mathematically, but not structurally.

I would expect this result by calling:

In [29]: solve(Eq(x*y, x/w), w)
Out[29]:
1
⎢─⎥
y

I was wondering, what are the pattern matching rules for this awareness? This goes beyond the three rules I pointed out hereabove. Technically, mathematical-aware patterns may yield infinite results as long as they are combined with equation solvers.

Wolfram Mathematica seems not to have any solver-awareness, that is, the previous pattern does not match:

In[12]:= (x y) /. x/w_ -> {w}
Out[12]= x y

It would be useful to keep SymPy's pattern matching rules as close as possible to those of Mathematica. I suggest to remove redundant results relying on solvers, or maybe add a parameter to the matcher to turn it off. What do you think?

Richard Fateman

unread,
Sep 1, 2014, 10:23:03 AM9/1/14
to sy...@googlegroups.com
You could read about unification,   unification with identity, associativity, etc.
You could add "solving" as a method of matching but probably you don't want
to use this routinely.  Maxima's matcher does a little of this and most people
find it surprising.

I think it would be unfortunate if you design another matcher without
reading what else has gone on.  Knowing Mathematica's matcher is a start, but
it is not everything.

Doing what Mathematica does, no more and no less would have the advantage of taking advantage
of the 40+ years of experience in designing pattern matchers, some of which are condensed into it.
There also appear to be some open-source implementations of much of it (MockMMA, and
also Mathics)

On the other hand  (a) Some of it is still controversial (b) Especially for a noivce It is hard to use efficiently (c) it is
intimately tied to the evaluation semantics of the rest of Mathematica, and you probably don't
want to eat the whole thing.

Joachim Durchholz

unread,
Sep 1, 2014, 12:54:36 PM9/1/14
to sy...@googlegroups.com
Am 01.09.2014 um 13:28 schrieb F. B.:
> SymPy's default pattern matcher is not a structural one, i.e. it tries to
> match according to some mathematical rules. A structural pattern matcher
> would instead just match the structure of the expression-tree.
>
> While it is necessary to have some mathematical awareness such as, give the
> wild *w*:
>
> - associativity awareness (i.e. f(a, b, c) = f(a, f(b, c)) and similar):
> (a+b+c).match(a+w) ==> {w: b+c}
> - commutativity awareness:
> (a+b).match(w+a) ==> {w: b}
> - one-identity awareness (f is one-identical if f(x) = x for any x):
> x.match(Add(w, evaluate=False)) ==> {w: x}

You can inject math awareness into a structural pattern matcher by
letting the structure declare its mathematical properties.

E.g. a function could have a one_identity_aware attribute, and the
structural pattern matcher could have a primitive that inspects such
attributes.

(I hope that was understandable.)

Aaron Meurer

unread,
Sep 1, 2014, 2:18:58 PM9/1/14
to sy...@googlegroups.com
At the very least the mathematical pattern matcher should be separate
(ideally a layer on top of) the structural one. Otherwise you can't do
structural only matching, at least without bending over backwards.

Whenever I've used pattern matching in SymPy, I just need it to be as
smart as it can. If it can't match something that should be matched,
that is fine: it won't lead to wrong results, just an algorithm might
not get applied when it could (so e.g., you might get
NotImplementedError instead of an answer).

Much of the mathematical matching in SymPy could be done away with if
we require that people using pattern matching understand how
expressions work, and avoid using certain kinds of patterns. For
example, matching the pattern u - v is bad (same with u/v), because
structurally that's Add(u, Mul(-1, v)). So it will not match x + y as
{u: x, v: -y} because the expression doesn't unify with the -1 in the
pattern. But if your pattern is instead u + v, and you want to match x
- y, that will work (and if you really wanted u - v you can just
negate what you get for v). This particular issue likely comes up more
often for division than subtraction, as you want to match a numerator
and denominator.

However, there are other cases that are not so clear to me, such as
matching x**4 against the pattern w**2 as {w: x**2}. The only thing I
can think of is matching w**n and checking if n is an integer multiple
of 2, but it's really nice when you can get the pattern matcher to do
this sort of thing for you.

OTOH, having that specific behavior automatically can be a bad thing,
because that assumes that everyone always wants it, whereas it likely
only applies to some situations, but not to others.

Finally, I'll point out that subs/xreplace/replace have these exact
same problems. If you're looking at cleaning up pattern matching I
would look at cleaning up the replacement functions as well. The two
are closely related. Essentially expr.subs(old, new) implicitly
pattern matches old in the subexpressions of expr.

Aaron Meurer
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/10ad1141-ee19-44ad-bd6e-42c982af3da2%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Richard Fateman

unread,
Sep 1, 2014, 11:58:51 PM9/1/14
to sy...@googlegroups.com
I think that to understand why this is unlikely, you could read
about inherited and synthesized attributes (usually in relation
to intermediate expression trees in the theory of compiling.)

RJF

Joachim Durchholz

unread,
Sep 2, 2014, 1:21:25 AM9/2/14
to sy...@googlegroups.com
Am 02.09.2014 um 05:58 schrieb Richard Fateman:
> you could read
> about inherited and synthesized attributes (usually in relation
> to intermediate expression trees in the theory of compiling.)

Heh. I don't need to read about these, that's established knowledge :-)

> I think that to understand why this is unlikely,

Not sure what you mean with "this".
Also, not sure whether you meant to say "unlikely to happen", "unlikely
to be doable", or "unlikely to exist".

So... can you clarify?

F. B.

unread,
Sep 2, 2014, 8:54:21 AM9/2/14
to sy...@googlegroups.com
OK, by the way I see it an alternative structure for SymPy's pattern matcher could be to define .match( ) in Basic only and cleverly use method inheritance. That is:
  • express associativity through a static field (e.g. is_Associative)
  • same if the type is an identity when given only one parameter (e.g. is_one_param_identity)
  • I would express commutativity as a method returning a generator of permutations
  • to match inverses use again a static field in the class (e.g. inverse_matcher)
  • Pow requires a special case for its exponent-matching.

These 5 properties should be the extensions over a structural matcher introduced by SymPy's matcher.

In examples:

class Basic(object):
   
...
    is_Associative
= False  # Basic(1, 2) != Basic(1, Basic(2))
    is_one_param_identity
= False  # Basic(1) != 1
   
def get_permutation_iter(self):
       
def f():
           
yield self.args
       
return f
    inverse_matcher
= None

class Mul(Expr):
   
...
    is_Associative
= True
    is_one_param_identity
= True  # Mul(x) == x
   
def get_permutation_iter(self):
       
def f():
           
... iter through all possible permutations such that
           
... the non-commutative parts do not commute
       
return f
    inverse_match
= "x -> Pow(x, -1)"

In this way, the information about how a specific type behaves under pattern matching can be easily customized. The generator of allowed permutations would thus be useful in cases where the commutation properties are complicated, for example when they are specified by a generic element of the symmetric group for the object instance.

Anyways, I would consider this task as low priority. Better first concentrate on import the RUBI ruleset into SymPy.

Joachim Durchholz

unread,
Sep 2, 2014, 2:40:02 PM9/2/14
to sy...@googlegroups.com
Am 02.09.2014 um 14:54 schrieb F. B.:
> OK, by the way I see it an alternative structure for SymPy's pattern
> matcher could be to define .match( ) in Basic only and cleverly use method
> inheritance.

Actually that's normal use of inheritance :-)

> That is:
>
> - express associativity through a static field (e.g. is_Associative)
> - same if the type is an identity when given only one parameter (e.g.
> is_one_param_identity)
> - I would express commutativity as a method returning a generator of
> permutations

Generator is a good approach, but it should take some parameter about
which permutations to generate first, to avoid combinatoric explosion.

> Anyways, I would consider this task as low priority. Better first
> concentrate on import the RUBI ruleset into SymPy.

If that means making a rule engine for SymPy that makes it easy to
transform RUBI rules for it, then I'm all for it.
Though the details discussed above seem to be part of how to best set up
a rule engine; isn't that spot on for RUBI, too?

Joachim Durchholz

unread,
Sep 2, 2014, 3:23:31 PM9/2/14
to sy...@googlegroups.com
The makers of RUBI insist that no two rules of a rule set can ever apply
to the same subexpression.
That's draconic, and verifying that would be, erm, "interesting".

I'm not sure whether that's worth it, but they do have a point if they
say it's the only way to be sure that no rule is applied in an
unexpected way, which is what I get is the main point behind most of the
problems you mentioned.
Also, it would force rule writers to investigate into which rule's turf
they are breaking, and reconsider whether their addition is actually an
improvement over what's already there.

I can't say what's the best approach, I'm just collecting potentially
relevant arguments here and hope somebody has enough breadth of vision
to properly weigh them all.

Richard Fateman

unread,
Sep 2, 2014, 8:07:03 PM9/2/14
to sy...@googlegroups.com
Sure.  Unlikely to be easy to do by simply hacking on trees. Here's a classic pattern:

a*x^2+b*x+c.        a,b,c are pattern variables.  x, in this context,  is a symbolic constant.
You might like to also impose  a,b,c free of x, and a is non-zero,  to complete the pattern "quadratic in x"

match this against the expression

(45*x+10)^2/5 +q*x+pi

Yes this can be done, but can you do it by placing attributes on + and *?
I suppose there is also the question of whether you want this to match or not.  

Richard Fateman

unread,
Sep 2, 2014, 8:14:41 PM9/2/14
to sy...@googlegroups.com
Having tried to use Rubi  (not the latest version) I can attest to the problem that
the ruleset as I used it did lots of problems but failed on some of them in
somewhat mysterious ways.  (I think I reported these to Albert Rich).
Here's my suspicion, and it is sort of different from your concern about rule independence.
That is, the rules need to terminate -- either each rule must show progress towards some goal,
or some collection of them converge (hill-climbing as necessary) towards a termination.

Some of these conditions are delicately dependent on particular simplifications.

I was running Rubi using the Mathematica-syntax rules, using MockMMA.  The
pattern syntax and semantics are essentially the same as Mathematica, but the
forms for (my version of) FullSimplify are different.  I'm not sure how much more
of Mathematica beyond what I wrote might actually be necessary to run all of
the Rubi test suite.  I think I kind of gave up after repeated "infinite loops"  about
180 examples in to the tests.

RJF

Joachim Durchholz

unread,
Sep 3, 2014, 2:17:35 AM9/3/14
to sy...@googlegroups.com
Am 03.09.2014 um 02:07 schrieb Richard Fateman:
> Sure. Unlikely to be easy to do by simply hacking on trees. Here's a
> classic pattern:
>
> a*x^2+b*x+c. a,b,c are pattern variables. x, in this context, is a
> symbolic constant.
> You might like to also impose a,b,c free of x, and a is non-zero, to
> complete the pattern "quadratic in x"
>
> match this against the expression
>
> (45*x+10)^2/5 +q*x+pi
>
> Yes this can be done, but can you do it by placing attributes on + and *?

Easily, actually.
Just store the entire pattern, including preconditions, in the attribute
of the top-level node (+ in this case).
Actually Python makes this relatively easy since reifying functions is
fussless and reifying expressions as lambdas is lightweight (can't beat
Haskell or Lisp in that area, of course).

This would probably also make it easier to check for pattern overlap.

> I suppose there is also the question of whether you want this to match or
> not.

That's a strategy question: Which of multiple applicable rules do we
want to apply?
RUBI requires that at most one rule be applicable at a time (i.e. no
pattern overlap), so strategy is encoded in the preconditions of the
rules and it's a non-issue.

F. B.

unread,
Sep 3, 2014, 11:14:17 AM9/3/14
to sy...@googlegroups.com
I fear that SymPy's matching of multiplicative inverses may be a problem in RUBI, as the matcher may match too much.

By the way, assumptions should get to work on the patterns, currently they seem not to work:

In [2]: x = symbols('x', real=False)

In [3]: w = Wild('w', real=True)

In [4]: x.match(w)
Out[4]: {w_: x}

This should not be matched.

Richard Fateman

unread,
Sep 4, 2014, 12:08:57 PM9/4/14
to sy...@googlegroups.com


On Tuesday, September 2, 2014 11:17:35 PM UTC-7, Joachim Durchholz wrote:
Am 03.09.2014 um 02:07 schrieb Richard Fateman:
> Sure.  Unlikely to be easy to do by simply hacking on trees. Here's a
> classic pattern:
>
> a*x^2+b*x+c.        a,b,c are pattern variables.  x, in this context,  is a
> symbolic constant.
> You might like to also impose  a,b,c free of x, and a is non-zero,  to
> complete the pattern "quadratic in x"
>
> match this against the expression
>
> (45*x+10)^2/5 +q*x+pi
>
> Yes this can be done, but can you do it by placing attributes on + and *?

Easily, actually.
Just store the entire pattern, including preconditions, in the attribute
of the top-level node (+ in this case).
This sounds like "assume that we have a solution"  .  Theorem.  "We have a solution".
If you store the entire pattern, then you are hypothesizing that you have a program
that matches quadratics.
 
Actually Python makes this relatively easy since reifying functions is
fussless and reifying expressions as lambdas is lightweight (can't beat
Haskell or Lisp in that area, of course).

This would probably also make it easier to check for pattern overlap.
I don't understand why you say this. 

> I suppose there is also the question of whether you want this to match or
> not.

That's a strategy question: Which of multiple applicable rules do we
want to apply?

If you are writing a program that does something with quadratics, e.g.
integrating   1/quadratic_in_x,   then how do you want to match quadratics?
 
RUBI requires that at most one rule be applicable at a time (i.e. no
pattern overlap), so strategy is encoded in the preconditions of the
rules and it's a non-issue.

You seem to be asserting that this requirement is trivially satisfied. 
Are you sure?   The other requirement that I mentioned, related to
termination, is distinctly not trivially satisfied unless you can prove
things about simplification.
 
Reply all
Reply to author
Forward
0 new messages