Coding guidelines

24 views
Skip to first unread message

Rybalka Denys

unread,
Sep 8, 2019, 6:49:45 AM9/8/19
to sy...@googlegroups.com
There are several ways to implement properties of sympy objects. The simplest one is:
```
def __new__(*args):
    obj = Basic(*args)
    obj.prop = 'some_property'
```
A slightly more complicated one is:
```
def __new__(*args):
    obj = Basic(*args)
    obj._prop = 'some_property'

@property
def prop(self):
    return self._prop
```

The first method is simpler and shorter, but the second one explicitly forbids assignments `obj.prop = ...`. I have tried looking into various sympy modules and the practices are different. Which implementation is preferable? And on a more general topic, is there a "coding guidelines" webpage, where issues like this should be resolved once and for all? Maybe it makes sense to create one? I believe it will improve both the code and user experience.

Best Regards,
Denys Rybalka

Oscar Benjamin

unread,
Sep 8, 2019, 8:56:32 AM9/8/19
to sympy
Preventing assignment isn't necessary for the internal workings of
SymPy. The only advantage it has is potentially avoiding confusion for
users. I don't particularly see users getting confused about this
though (although maybe that's because of the widespread use of
properties).

What I think is important is that any properties/attributes are
derivable from `.args` and not stored separately. Also I think it's
good to avoid using `.args` directly where possible so something like:

class MyThing(Basic):
def __new__(cls, one, another):
obj = Basic.__new__(cls, one, another)

@property
def one(self):
return self.args[0]

@property
def another(self):
return self.args[1]

def get_something(self):
return f(self.one) # Using one rather than .args[0]

Then apart from the properties none of the other methods should
directly access args. This makes it possible to easily subclass
MyThing and have a different args layout in the subclass. This is
common when creating a singleton subclass:

class MyThingZero(MyThin, Singleton):
def __new__(cls):
return Basic.__new__(cls)

@property
def one(self):
return 0

@property
def another(self):
return oo

Here MyThingZero doesn't need to have .args because its a singleton so
they couldn't mean anything. MyThingZero can inherit all of the
methods of MyThing and only needs to redefine the accessor properties
because nothing else touches .args directly.

Sometimes it is desirable to cache some property of the object as an
attribute in __new__ but this should only done if it is something
derivable from .args and actually I think it's better to do the
caching in a property.

class Bad(Basic):
def __new__(self, one, another):
obj = Basic.__new__(cls, another)
obj._one = one # <--------------- don't do this

This is bad because the `_one` attribute is stored outside of .args.
Once you've done this you've interrupted the machinery of Basic and
you'll end up being tempted to override __eq__, __hash__, .args,
.func, etc and there's no way you'll get all the details right so
there will be subtle bugs and things you haven't thought of like
pickling will break. You can try to fix these things one at a time but
by the time you're finished you will have reimplemented a good chunk
of Basic.

It might be desirable to cache an object as an attribute to avoid
recomputing it. As long as it is derived from .args that is okay:

class Okay(Basic)
def __new__(self, one, another):
obj = Basic.__new__(cls, one, another)
# This is derivable from args but expensive so we cache it
obj._attribute = expensive_function(one)

@property
def attribute(self):
return self._attribute

Better though is to put that caching inside the property so if we
don't need it then expensive_function is never called:

class Better(Basic):
def __new__(self, one, another):
obj = Basic.__new__(cls, one, another)

@property
def attribute(self):
val = getattr(self, '_attribute', None)
if val is not None:
return val
else:
val = self._attribute = expensive_function(self.one)
return val

I'm not sure if there is already a decorator that does this kind of
caching (it should be added if there isn't).

The examples of bad things I show above are all ones that I've seen in
SymPy's own codebase so please judge for yourself what is reasonable
rather than looking for precedents in the existing code. Currently the
codebase is inconsistent and there are a lot of Basic subclasses doing
dodgy things with attributes and args. I would like to clean that up
though.

Oscar
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAOsx_LqFFGo9wifF30QC8sT%3DYutaEcj6aYPNU7Tt154ZbPnD2w%40mail.gmail.com.

Rybalka Denys

unread,
Sep 8, 2019, 9:56:33 AM9/8/19
to sympy
Great, that is exactly what I've been trying to find!

By the way, there is caching property decorator called `@cached_property`,
that does exactly what you suggested. It is, however, not in the standard
library (https://pypi.org/project/cached-property/). Maybe it would be
beneficial to add it to the standard sympy package? Unfortunately¸
I have no idea how these things are done.

Do you think it is worthwhile to create a page with the "coding guidelines",
so that other authors can use this information? If yes, where?

Denys

Oscar Benjamin

unread,
Sep 8, 2019, 1:50:06 PM9/8/19
to sympy
I should clarify that what I said is opinionated (my opinion in
particular!) so it's possible others disagree. Certainly there is a
lot of code in SymPy that doesn't follow the advice above.

If we want a cached_property decorator then we can add this somewhere
in sympy/utilities or perhaps sympy/core/cache:

from functools import wraps

def cached_property(propfunc):

attrname = '_' + propfunc.__name__
sentinel = object()

@wraps(propfunc)
def accessor(self):
val = getattr(self, attrname, sentinel)
if val is sentinel:
val = propfunc(self)
setattr(self, attrname, val)
return val

return property(accessor)

class A:
@cached_property
def g(self):
print('foo')
return 'bar'

From the SymPy codebase though most instances of storing attributes
outside of args not for this kind of caching. I think it's usually the
author deliberately trying to keep the attribute out of args. That
could be because the arg is unsympifiable or it could be because there
is too much code that relies on a particular args layout so the author
wants to emulate that. The latter reason is why it is best to have a
layer above args that defines properties (one and another in the
examples above) so other code doesn't need to access args directly.

Oscar
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/a78f21b0-63c6-4b78-8b4d-e21a3e95968d%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages