Python magic to trigger post-constructors for Add, Mul, Pow?

Francesco Bonazzi

Mar 21, 2017, 7:06:44 AM
to sympy
I am trying to fix the unit system module in SymPy, and I ran into a problem:

>>> meter + second
meter + second
>>> length + time
length + time

Such expressions create an Add object. Technically, they should raise an exception, given that the summation involves incompatible dimensions.

Obviously, Add does not know about all of this, and just creates an Add object.

The point is, what about adding some Python magic (or similar tricks) to make it possible, when some submodules are loaded, to extend the constructors for Add, Mul, Pow, and possibly also for SymPy functions?

We also have a related problem concerning matrix symbols, as operations on them need to be consistent with their dimensions. The problem is currently solved by ad-hoc MatMul, MatAdd classes. This causes them to lack a lot of features/operations from Add, Mul. Subclassing also appears to cause problems. Other instances are tensor indices in implicit summation mode.

One problem that might arise is the inability to add methods specific to those kinds of operations (such as the matrix-symbol-related methods implemented in MatAdd, MatMul). This could be avoided by using non-class functions, though.

What do you think?

Aaron Meurer

Mar 21, 2017, 1:30:46 PM
to sy...@googlegroups.com
I like the idea, although we'd have to make sure to keep things fast,
especially for common objects.

Maybe each class could have a registry, and at the top of each class,
it could have a function that checks each argument against the
registry. So for instance, for Add:

class Add:
    preprocess_functions = {}

    @classmethod
    def flatten(cls, seq):
        for i in seq:
            # really should check for subclass
            if type(i) in cls.preprocess_functions:
                return cls(*cls.preprocess_functions[type(i)](seq))

Then the units would register Add.preprocess_functions[Unit] =
<function that checks unit consistency>.

There are a lot of details to work out here, but is that the basic
idea you are suggesting?
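
A minimal sketch of the registry idea, for illustration only. The `Expr`, `Add`, and `Unit` classes below are toy stand-ins rather than SymPy's real classes, and `check_unit_consistency` is a hypothetical name. Matching uses `isinstance()`, so subclasses of a registered type are covered, and each matching preprocessor runs once per construction.

```python
class Expr:
    """Toy expression node; a stand-in, not SymPy's Expr."""

    def __init__(self, *args):
        self.args = args


class Add(Expr):
    # Hypothetical registry: argument type -> function(args) -> args.
    preprocess_functions = {}

    def __init__(self, *args):
        # Collect each matching preprocessor once, in registration order.
        matched = []
        for registered_type, func in self.preprocess_functions.items():
            if any(isinstance(a, registered_type) for a in args):
                matched.append(func)
        for func in matched:
            args = func(args)
        super().__init__(*args)


class Unit(Expr):
    def __init__(self, name, dimension):
        super().__init__()
        self.name = name
        self.dimension = dimension


def check_unit_consistency(args):
    """Reject sums whose Unit arguments carry different dimensions."""
    dimensions = {a.dimension for a in args if isinstance(a, Unit)}
    if len(dimensions) > 1:
        raise ValueError("incompatible dimensions: %s" % sorted(dimensions))
    return args


# The units module, not the core, would perform this registration.
Add.preprocess_functions[Unit] = check_unit_consistency

meter = Unit("meter", "length")
foot = Unit("foot", "length")
second = Unit("second", "time")

total = Add(meter, foot)          # same dimension: accepted
try:
    Add(meter, second)            # length + time: rejected
    rejected = False
except ValueError:
    rejected = True
```

Expressions that contain no registered type only pay for the `isinstance()` scan, which is the cost that would need measuring on common objects.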

Aaron Meurer
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at https://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/0b0a8187-6b23-47eb-a4e5-bbb4609e5375%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Francesco Bonazzi

Mar 22, 2017, 9:36:17 AM
to sympy


On Tuesday, 21 March 2017 18:30:46 UTC+1, Aaron Meurer wrote:
I like the idea, although we'd have to make sure to keep things fast,
especially for common objects.

Yes, that's the main problem. A simple idea is to load these preprocessors only if the submodules requiring them are loaded. That way, anyone who needs fast code can simply not load those submodules.


Maybe each class could have a registry, and at the top of each class,
it could have a function that checks each argument against the
registry. So for instance, for Add:

class Add:
    preprocess_functions = {}

    @classmethod
    def flatten(cls, seq):
        for i in seq:
            # really should check for subclass
            if type(i) in cls.preprocess_functions:
                return cls(*cls.preprocess_functions[type(i)](seq))


 This would apply the preprocess functions many times if the types are repeated.

Is there a way to trigger the preprocessing functions starting from the operator?
That is, if you have something like

3+y+A+x

and A requires the trigger, one could add a dynamic collection. That is, given the evaluation order:

(((3+y)+A)+x)

The second + operator could detect the presence of required triggers in A, collect them and then apply them when Add(3, y, A, x) is called.

Then the units would register Add.preprocess_functions[Unit] =
<function that checks unit consistency>.

I would use a set rather than a dict.
 

There are a lot of details to work out here, but is that the basic
idea you are suggesting?

Yes, approximately. Maybe it can be further improved by thinking a bit more.

We could experiment with the new unitsystem module as soon as I finish its refactoring.

Aaron Meurer

Mar 22, 2017, 1:48:15 PM
to sy...@googlegroups.com
On Wed, Mar 22, 2017 at 9:36 AM, Francesco Bonazzi
<franz....@gmail.com> wrote:
>
>
> On Tuesday, 21 March 2017 18:30:46 UTC+1, Aaron Meurer wrote:
>>
>> I like the idea, although we'd have to make sure to keep things fast,
>> especially for common objects.
>
>
> Yes, that's the main problem. A simple idea is to load these preprocessors
> only if the submodules requiring them are loaded. That way, anyone who needs
> fast code can simply not load those submodules.

The independent submodules should definitely register their own
preprocessors in their own code (it would be bad to have units related
code in the core anyway).

But the way I suggested below, the units preprocessor is never called
unless a Unit object is found. So it doesn't affect the performance
for expressions without Units.

>
>>
>> Maybe each class could have a registry, and at the top of each class,
>> it could have a function that checks each argument against the
>> registry. So for instance, for Add:
>>
>> class Add:
>>     preprocess_functions = {}
>>
>>     @classmethod
>>     def flatten(cls, seq):
>>         for i in seq:
>>             # really should check for subclass
>>             if type(i) in cls.preprocess_functions:
>>                 return cls(*cls.preprocess_functions[type(i)](seq))
>>
>
> This would apply the preprocess functions many times if the types are
> repeated.

It should probably collect all matching types, and call
cls.preprocess_functions(matching_terms, rest)

Although keep in mind that for objects created with the + operator,
there will always be two arguments to Add. You only get more arguments
when Add is used (e.g., when a routine rebuilds an object).

>
> Is there a way to trigger the preprocessing functions starting from the
> operator?
> That is, if you have something like
>
> 3+y+A+x
>
> and A requires the trigger, one could add a dynamic collection. That is,
> given the evaluation order:
>
> (((3+y)+A)+x)
>
> The second + operator could detect the presence of required triggers in A,
> collect them and then apply them when Add(3, y, A, x) is called.

The Add(3 + y, A) call has no way of knowing that there is another +
that will be applied to it. However, perhaps some efficiency could be
gained by passing the new argument x separately.

>
>> Then the units would register Add.preprocess_functions[Unit] =
>> <function that checks unit consistency>.
>
>
> I would use a set rather than a dict.

How would it work as a set? How do you know which function to call?

>
>>
>>
>> There are a lot of details to work out here, but is that the basic
>> idea you are suggesting?
>
>
> Yes, approximately. Maybe it can be further improved by thinking a bit more.
>
> We could experiment with the new unitsystem module as soon as I finish its
> refactoring.

Francesco Bonazzi

Mar 22, 2017, 7:58:57 PM
to sympy


On Wednesday, 22 March 2017 18:48:15 UTC+1, Aaron Meurer wrote:

How would it work as a set? How do you know which function to call?


I was thinking about something like this:


class Add:
    _check_postprocess = False

    @classmethod
    def flatten(cls, seq):
        postprocess = set()
        if cls._check_postprocess:
            for i in seq:
                if hasattr(i, "_postprocess_function_Add"):
                    postprocess.add(i._postprocess_function_Add)

        [ ... body of flatten ... ]

        [ call `expr` the returning expression ]

        for i in postprocess:
            expr = i(expr)
        return expr
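
A runnable toy version of this attribute-based approach might look like the following. It is only a sketch: `Add` and `Quantity` are simplified stand-ins rather than SymPy classes, `check_dimensions` is a hypothetical hook, and the real flattening logic is replaced by simply storing the arguments.

```python
def check_dimensions(expr):
    """Hypothetical Add-level hook: all Quantity args must share a dimension."""
    dimensions = {a.dimension for a in expr.args if isinstance(a, Quantity)}
    if len(dimensions) > 1:
        raise ValueError("incompatible dimensions: %s" % sorted(dimensions))
    return expr


class Add:
    # A submodule that needs hooks flips this to True when it is imported.
    _check_postprocess = False

    def __new__(cls, *args):
        postprocess = set()
        if cls._check_postprocess:
            for arg in args:
                if hasattr(arg, "_postprocess_function_Add"):
                    postprocess.add(arg._postprocess_function_Add)

        # Stand-in for the real flattening logic.
        expr = object.__new__(cls)
        expr.args = args

        for hook in postprocess:
            expr = hook(expr)
        return expr


class Quantity:
    # staticmethod keeps one shared function object, so the set
    # deduplicates it even when several Quantity args are present.
    _postprocess_function_Add = staticmethod(check_dimensions)

    def __init__(self, name, dimension):
        self.name = name
        self.dimension = dimension

    def __add__(self, other):
        return Add(self, other)


meter = Quantity("meter", "length")
second = Quantity("second", "time")

unchecked = meter + second        # hooks disabled: the bad sum is built
Add._check_postprocess = True     # what importing the units module would do
try:
    meter + second
    raised = False
except ValueError:
    raised = True
```

With the flag left at `False`, construction skips the attribute scan entirely, so code that never loads the submodule pays only for one class-attribute lookup.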

 

Aaron Meurer

Mar 23, 2017, 12:47:46 AM
to sy...@googlegroups.com
Ah, that would take care of subclass checks too. If someone wanted to
add their own postprocessor to an existing class they'd need to
subclass it, but that's fine (it is adding behavior to a class, so
subclassing makes sense).

We should think about ordering. set() ordering is (effectively)
random, which is not so great. We could sort them, but people might
want to be able to make sure their processor fires before some other
one. Maybe there should be some kind of priority key that classes
could add to specify this.
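
One way the priority idea could look, as a sketch only. The `postprocessor` decorator, the `priority` attribute, and the two hook names are all hypothetical; here lower priorities fire first and ties are broken by function name so that the order is deterministic.

```python
def postprocessor(priority):
    """Attach a (hypothetical) priority key to a postprocessing function."""
    def decorate(func):
        func.priority = priority
        return func
    return decorate


def run_postprocessors(expr, hooks):
    """Apply hooks in a deterministic order: by priority, then by name."""
    ordered = sorted(
        set(hooks),
        key=lambda h: (getattr(h, "priority", 0), h.__qualname__),
    )
    for hook in ordered:
        expr = hook(expr)
    return expr


@postprocessor(priority=10)
def simplify_units(trace):
    return trace + ["simplify_units"]


@postprocessor(priority=0)
def check_units(trace):
    return trace + ["check_units"]


# A list stands in for the expression so the firing order is visible.
trace = run_postprocessors([], {simplify_units, check_units})
```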

Aaron Meurer

Francesco Bonazzi

Mar 23, 2017, 11:59:06 AM
to sympy
There's another complication. Consider the expression:

3*A + 2*B

Suppose A and B require a flatten-postprocessor at the Add level. The Add object will not detect them, because its parameters are just two Mul objects.

Aaron Meurer

Mar 23, 2017, 1:46:03 PM
to sy...@googlegroups.com
An expression like x*A + B would be problematic, but 2*A + B should
actually be fine, because the flatten algorithm splits off numeric
coefficients (so that they can be combined).
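
A toy model of that distinction, not SymPy's actual flatten code. All classes and helper names here are hypothetical; the point is only that once a numeric coefficient is peeled off, the remaining term is the hooked object itself, whereas a symbolic factor leaves it buried inside a Mul.

```python
from numbers import Number


class Symbol:
    def __init__(self, name):
        self.name = name


class NeedsHook(Symbol):
    """An object that asks for an Add-level postprocessor."""
    _postprocess_function_Add = staticmethod(lambda expr: expr)


class Mul:
    def __init__(self, *args):
        self.args = args


def split_numeric_coefficient(term):
    """Return (coeff, rest) for number * single_factor, else (1, term)."""
    if (isinstance(term, Mul) and len(term.args) == 2
            and isinstance(term.args[0], Number)):
        return term.args[0], term.args[1]
    return 1, term


def hook_visible_to_add(term):
    """True if Add would find the hook after splitting off the coefficient."""
    _, rest = split_numeric_coefficient(term)
    return hasattr(rest, "_postprocess_function_Add")


A = NeedsHook("A")
x = Symbol("x")

numeric_case = hook_visible_to_add(Mul(2, A))    # 2*A: rest is A itself
symbolic_case = hook_visible_to_add(Mul(x, A))   # x*A: rest is still a Mul
```

Handling the symbolic case would presumably require Mul to forward the hook from its own arguments, which is beyond this sketch.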

Aaron Meurer

Jason Moore

Mar 25, 2017, 7:48:13 PM
to sy...@googlegroups.com
Would multiple dispatch be best for implementing Add/Mul/Etc for these different types?
On Thu, Mar 23, 2017 at 10:45 AM, Aaron Meurer <asme...@gmail.com> wrote:
An expression like x*A + B would be problematic, but 2*A + B should
actually be fine, because the flatten algorithm splits off numeric
coefficients (so that they can be combined).

Aaron Meurer

On Thu, Mar 23, 2017 at 11:59 AM, Francesco Bonazzi
<franz....@gmail.com> wrote:
> There's another complication. Consider the expression:
>
> 3*A + 2*B
>
> Suppose A and B require a flatten-postprocessor at the Add level. The Add
> object will not detect them, because its parameters are just two Mul
> objects.

Francesco Bonazzi

Mar 26, 2017, 9:21:27 AM
to sympy


On Sunday, 26 March 2017 00:48:13 UTC+1, Jason Moore wrote:
Would multiple dispatch be best for implementing Add/Mul/Etc for these different types?

I think this alternative would work better than multiple dispatching. It's also much simpler to implement.

Jason Moore

Mar 26, 2017, 1:44:38 PM
to sy...@googlegroups.com
I haven't thought about the details, but I just happened to click on Matthew Rocklin's 2014 SciPy talk, where he introduces the multiple dispatch ideas for SymPy, and it seemed relevant to this issue.


Francesco Bonazzi

Apr 7, 2017, 6:11:23 PM
to sympy

Francesco Bonazzi

Apr 18, 2017, 4:28:21 PM
to sympy
I'm gonna merge this PR soon:

https://github.com/sympy/sympy/pull/12508

If anyone opposes merging the PR in its current state, please say so before it's too late.

Ronan Lamy

Apr 19, 2017, 11:05:08 AM
to sy...@googlegroups.com
On 18/04/17 at 21:28, Francesco Bonazzi wrote:
> I'm gonna merge this PR soon:
>
> https://github.com/sympy/sympy/pull/12508

Wow, that's horrifying! Good luck maintaining it!

Francesco Bonazzi

Apr 19, 2017, 2:00:04 PM
to sympy


On Wednesday, 19 April 2017 17:05:08 UTC+2, Ronan Lamy wrote:
On 18/04/17 at 21:28, Francesco Bonazzi wrote:
> I'm gonna merge this PR soon:
>
> https://github.com/sympy/sympy/pull/12508

Wow, that's horrifying! Good luck maintaining it!


Why is it horrifying?

Ronan Lamy

Apr 19, 2017, 3:46:08 PM
to sy...@googlegroups.com
On 19/04/17 at 19:00, Francesco Bonazzi wrote:
Well, using __new__ in the first place is a big WTF (and yes, I know
that sympy is pretty much stuck with it forever), but layering a hook to
a whole new programming paradigm inside it is evil genius. Bonus horror
points for introspecting __slots__ and doing such fundamental changes to
solve a minor issue.

And BTW, has anybody checked the impact on performance? IIUC, that adds
a non-trivial amount of work to every Add or Mul instantiation.

That being said, don't mind me. As long as I don't have to debug it,
that code won't bother me.

Aaron Meurer

Apr 19, 2017, 3:51:08 PM
to sy...@googlegroups.com
Performance is a valid concern. We should definitely check that.

The __slots__ part I don't know. That could probably be done differently.

Regarding the implementation itself, how would you do it? The ability
to make Mul or Add do custom stuff for custom objects is a very
commonly requested feature. Would you keep it unimplemented, or do it
some other way?

Aaron Meurer

Francesco Bonazzi

Apr 20, 2017, 6:12:12 AM
to sympy


On Wednesday, 19 April 2017 21:51:08 UTC+2, Aaron Meurer wrote:
Performance is a valid concern. We should definitely check that.

The __slots__ part I don't know. That could probably be done differently.


I didn't like that part either. Does anyone have a better idea?

Regarding the implementation itself, how would you do it? The ability
to make Mul or Add do custom stuff for custom objects is a very
commonly requested feature. Would you keep it unimplemented, or do it
some other way?

Yes, we really need this feature.

Ideally we want to extend such a mechanism to functions as well.


Aaron Meurer

Apr 20, 2017, 3:23:54 PM
to sy...@googlegroups.com


On Thu, Apr 20, 2017 at 6:12 AM, Francesco Bonazzi <franz....@gmail.com> wrote:
>
>
> On Wednesday, 19 April 2017 21:51:08 UTC+2, Aaron Meurer wrote:
>>
>> Performance is a valid concern. We should definitely check that.
>>
>> The __slots__ part I don't know. That could probably be done differently.
>>
>
> I didn't like that part either. Does anyone have a better idea?
>
>> Regarding the implementation itself, how would you do it? The ability
>> to make Mul or Add do custom stuff for custom objects is a very
>> commonly requested feature. Would you keep it unimplemented, or do it
>> some other way?
>
>
> Yes, we really need this feature.
>
> Ideally we want to extend such a mechanism to functions as well.

For functions, dispatch works nicely. In fact, most functions are single-argument, so you only need single dispatch, which is basically SymPy's _eval_thing mechanism. We could extend Function so that it dispatches before eval() is called on the subclass (say, sin(x) could call x._eval_function(sin)). At least, that should work when nargs = 1. For nargs > 1, you need something more complicated.
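
A sketch of what such a single-dispatch hook could look like for one-argument functions. Everything here is a toy: `Function`, `sin`, `MatrixSymbol`, and `MatrixFunction` are simplified stand-ins rather than SymPy's classes, and `_eval_function` is the hypothetical hook name used above. The argument is asked first, then the function's own `eval()`, and only then is an unevaluated object built.

```python
class Function:
    """Toy one-argument function base class with an argument-side hook."""

    def __new__(cls, arg):
        # 1. Let the argument decide, if it defines the hook.
        hook = getattr(arg, "_eval_function", None)
        if hook is not None:
            result = hook(cls)
            if result is not None:
                return result
        # 2. Fall back to the function's own evaluation rules.
        result = cls.eval(arg)
        if result is not None:
            return result
        # 3. Otherwise stay unevaluated.
        obj = object.__new__(cls)
        obj.args = (arg,)
        return obj

    @classmethod
    def eval(cls, arg):
        return None


class sin(Function):
    @classmethod
    def eval(cls, arg):
        if arg == 0:
            return 0
        return None


class MatrixFunction:
    """Result of applying a function to a matrix; it keeps the shape."""

    def __init__(self, func, matrix):
        self.func = func
        self.matrix = matrix
        self.shape = matrix.shape


class MatrixSymbol:
    def __init__(self, name, rows, cols):
        self.name = name
        self.shape = (rows, cols)

    def _eval_function(self, func):
        # The matrix takes over and returns a matrix-aware object.
        return MatrixFunction(func, self)


M = MatrixSymbol("M", 3, 3)
matrix_result = sin(M)      # handled by MatrixSymbol._eval_function
scalar_result = sin(0)      # handled by sin.eval
unevaluated = sin(1.5)      # neither hook applies
```

Because the returned object carries a `shape`, later shape checks in a sum would have something to inspect.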

The problem with Add and Mul is that they not only take multiple arguments, but an arbitrary number, so a true multiple dispatch system becomes unwieldy. 

Francesco's solution maybe isn't the most general, but it seems simple, and capable of handling the use-cases I know of. 

Aaron Meurer

Francesco Bonazzi

Apr 20, 2017, 4:44:25 PM
to sympy


On Thursday, 20 April 2017 21:23:54 UTC+2, Aaron Meurer wrote:


On Thu, Apr 20, 2017 at 6:12 AM, Francesco Bonazzi <franz....@gmail.com> wrote:
>
>
> On Wednesday, 19 April 2017 21:51:08 UTC+2, Aaron Meurer wrote:
>>
>> Performance is a valid concern. We should definitely check that.
>>
>> The __slots__ part I don't know. That could probably be done differently.
>>
>
> I didn't like that part either. Does anyone have a better idea?
>
>> Regarding the implementation itself, how would you do it? The ability
>> to make Mul or Add do custom stuff for custom objects is a very
>> commonly requested feature. Would you keep it unimplemented, or do it
>> some other way?
>
>
> Yes, we really need this feature.
>
> Ideally we want to extend such a mechanism to functions as well.

For functions, dispatch works nicely. In fact, most functions are single-argument, so you only need single dispatch, which is basically SymPy's _eval_thing mechanism. We could extend Function so that it dispatches before eval() is called on the subclass (say, sin(x) could call x._eval_function(sin)). At least, that should work when nargs = 1. For nargs > 1, you need something more complicated.


Suppose we have the exponential of a MatrixSymbol. How do we convey the information that the result of the exponential function is a matrix?

Suppose you have the expression:

exp(MatrixSymbol("M", 3, 3)) + MatrixSymbol("A", 2, 2)

It would be desirable to raise a ShapeError.

Francesco's solution maybe isn't the most general, but it seems simple, and capable of handling the use-cases I know of.

I had a thought about using the arguments instead of a new __slots__. That is, add an extra argument to the args that is hidden when printing, but if Mul or Add meet it, they should post-process the other args and pass it further. But this could also have its cons.
 

Aaron Meurer

Apr 24, 2017, 7:35:33 PM
to sy...@googlegroups.com
On Thu, Apr 20, 2017 at 4:44 PM, Francesco Bonazzi <franz....@gmail.com> wrote:


On Thursday, 20 April 2017 21:23:54 UTC+2, Aaron Meurer wrote:


On Thu, Apr 20, 2017 at 6:12 AM, Francesco Bonazzi <franz....@gmail.com> wrote:
>
>
> On Wednesday, 19 April 2017 21:51:08 UTC+2, Aaron Meurer wrote:
>>
>> Performance is a valid concern. We should definitely check that.
>>
>> The __slots__ part I don't know. That could probably be done differently.
>>
>
> I didn't like that part either. Does anyone have a better idea?
>
>> Regarding the implementation itself, how would you do it? The ability
>> to make Mul or Add do custom stuff for custom objects is a very
>> commonly requested feature. Would you keep it unimplemented, or do it
>> some other way?
>
>
> Yes, we really need this feature.
>
> Ideally we want to extend such a mechanism to functions as well.

For functions, dispatch works nicely. In fact, most functions are single-argument, so you only need single dispatch, which is basically SymPy's _eval_thing mechanism. We could extend Function so that it dispatches before eval() is called on the subclass (say, sin(x) could call x._eval_function(sin)). At least, that should work when nargs = 1. For nargs > 1, you need something more complicated.


Suppose we have the exponential of a MatrixSymbol. How do we convey the information that the result of the exponential function is a matrix?

Suppose you have the expression:

exp(MatrixSymbol("M", 3, 3)) + MatrixSymbol("A", 2, 2)

It would be desirable to raise a ShapeError.

The exponential is a bad example because ideally exp(x) would be equivalent to Pow(E, x). But for other functions, like sin(MatrixSymbol("M", 3, 3)), I would set it up in the Function superclass so that this calls MatrixSymbol("M", 3, 3)._eval_function(sin), which could return a result (and if it does not, sin.eval(MatrixSymbol("M", 3, 3)) is called as usual).


Francesco's solution maybe isn't the most general, but it seems simple, and capable of handling the use-cases I know of.

I had a thought about using the arguments instead of a new __slots__. That is, add an extra argument to the args that is hidden when printing, but if Mul or Add meet it, they should post-process the other args and pass it further. But this could also have its cons.

That would break a lot of things that do manual processing of Add or Mul args.

Aaron Meurer
 