Discussion for introducing the hooks to core classes

69 views
Skip to first unread message

mcpl snu

unread,
May 15, 2020, 2:36:31 AM5/15/20
to sympy
Discussion for introducing the hooks to core classes

This thread is related to issue #5040 and PR #18769.


1. What is the problem?

Currently, SymPy's Add and Mul are very complicated. There exist various type checkings in their flatten method to make them able to handle various objects. However, hard-coding like this can be troublesome when new classes are added. Not only these methods need to be added with more type checkings, it will make the code more and more jumbled.

To relieve this, 'postprocessor' is introduced to Add, Mul, Pow. When the instance of these classes is constructed, this 'preprocessor' hook is applied to it, transforming the result. (Currently, only matrix uses this hook.)
Although this can be used to reduce the complexity in the classes, still it has many problems. First, it is not applied when 'evaluate=False' option is passed. Second, it cannot take any keyword argument option. Third, the instance always needs to go through flatten method first, and only after then it is transformed. This is a waste if completely different flatten algorithm is to be applied by postprocessor. Finally, because of the same reason with third problem, the arguments must be made sure to pass flatten methods without raising any error, adding unnecessary restriction to class behaviors.


2. How can we solve it?

I believe we can solve it by introducing a hook, which is applied before any job is done, and also able to take keyword arguments.


3. What is my proposal?

My proposal in #18769 is to introduce a hook which is, unlike postprocessor, applied to the arguments before anything is done. It has these features:

    1) Keyword arguments can be passed, allowing more flexibility.
    2) Averts the arguments to be unnecessarily processed by flatten method.
    3) By checking `_op_priority` of the arguments, determines which hook will be used.

The third feature is important. In current SymPy, `_op_priority` determines which operator methods will be used when operator (+, *, etc) is applied between two objects. This means, if A's _op_priority is 10 and B's is 11, then B.__radd__(A), instead of A.__add__(b), is called when A+B is executed. So I thought; "If `_op_priority` determines the behavior of 'A+B', surely it could determine `Add(A, B)` as well!"

This is especially useful when argument of highest priority defines special classes, such as MatAdd and MatMul for matrix classes. Suppose we have three arguments A,B and C, with their priority A<B<C. When these three are added, `C_Add(C, Add(A, B) )` is returned.
Resolving this is simple. Just make a hook function for C:

>>> C_type_args = [i for i in args if isinstance(i, C)]
>>> other_args = [i for i in args if not isinstance(i, C.func)]
>>> other_args_added = Add(*other_args)
>>> return C_Add(C, other_args_added)

See? No type checking is needed.
With this, let's say someone want to add a new class D. `Add(A, B, C, D)` will return `C_Add(C, D_Add(A, B, D) )`. Doing this is simple indeed. Set D's `_op_priority` so that A<B<D<C, and make a hook function for D:

>>> D_type_args = [i for i in args if isinstance(i, D)]
>>> other_args = [i for i in args if not isinstance(i, D.func)]
>>> other_args_added = Add(*other_args)
>>> return D_Add(D, other_args_added)

Then, running `Add(A, B, C, D)` will call C's hook. Inside it, `Add(*other_args)` will call D's hook because it has the highest priority among A,B and D, assigning D_add(A, B, D) to `other_args_added`. Add.flatten is not called here.
This all works with full expandability, with minimum complexity and bypassing unnecessary steps.


4. Please give your opinion

My proposal will be one of the many ways to resolve current issue. I'd be grateful if you suggest any improvement to my work, or come up with better idea.

Aaron Meurer

unread,
May 15, 2020, 5:34:49 PM5/15/20
to sympy
My biggest concern with this design is the prioritization system.

How does it handle subclasses? Do subclasses always need to increase
their op_priority to give themselves higher priority when they
override the preprocessor?

Another question is if we want to support a multipledispatch like
model where you can override the behavior independent of defining it
on a class. We should also look at how multipledispatch system handle
priority. I think it's harder here because we have an arbitrary number
of arguments, but perhaps we can use the same principles.

Aaron Meurer
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/fee15557-d98b-4512-9d26-92203c207cc3%40googlegroups.com.

Aaron Meurer

unread,
May 15, 2020, 5:41:48 PM5/15/20
to sympy
I posted this on the PR:

For dealing with combinations of classes maybe it would be better to
use multipledispatch to define pairwise combinations, which
automatically get combined together.

Although to be honest, most classes that would want to extend Add or
Mul do not interact with each other, only with normal Expr classes.
Order, MatrixExpr, Vector, and AccumBounds do not know about each
other, and adding or multiplying one by the other is, for the most
part, mathematically meaningless. In fact, we might even ignore the
op_priority issue for now and make it an error if multiple objects
have conflicting processors (subclasses that override processors
should take priority). Would that cause issues with what is currently
here?

This could change though if we want to take this a step further for
Expr classes that are commonly used but have custom add/mul
combination logic, such as infinity (but that has other issues such as
performance considerations).

If we can punt on the prioritization issue for now, that might be a good idea.


Aaron Meurer

mcpl snu

unread,
May 16, 2020, 12:03:48 AM5/16/20
to sympy
Aaron Meurer,

Thanks for your opinion. I will try with that.

mcpl snu

unread,
May 18, 2020, 7:00:19 AM5/18/20
to sympy
I think I resolved this. Please refer to PR page.

mcpl snu

unread,
May 29, 2020, 8:38:54 AM5/29/20
to sympy
I found a typo in my original comment. in line 6, 'preprocessor' shoud be changed to 'postprocessor'.
Also, in line 7 I wrote that postprocessor is not applied when evaluate=False is passed, which is wrong. I will write an updated description.
Reply all
Reply to author
Forward
0 new messages