subclassing from Symbol, only for the sake of having a name string


krastano...@gmail.com

Jul 21, 2012, 3:30:37 PM
to sy...@googlegroups.com
At least once it was stated that the preferred way to get a named
object is to subclass from Symbol and basically use Symbol as a string
class.

This is so because many recursive algorithms in SymPy assume that
every element of args subclasses Basic. This is a reasonable
requirement for all objects except the leaves of the tree that have
names. Symbol hacks around this by not storing its name string in its
args, and as a result it fails obj.func(*obj.args) == obj.
So now everybody who needs a name string resorts to encapsulating it
in a Symbol.
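
To make the invariant concrete, here is a minimal sketch using toy classes. `Basic`, `Add`, `Sym` and `rebuilds` are hypothetical stand-ins written for this illustration, not SymPy's own classes. It only shows why a leaf that keeps its name outside args cannot be rebuilt from `func` and `args`, while an interior node can.

```python
class Basic:
    """Toy stand-in for a tree node (an assumption, not sympy.Basic)."""

    def __init__(self, *args):
        self.args = args

    @property
    def func(self):
        return type(self)

    def _key(self):
        # Data that defines the identity of the node.
        return self.args

    def __eq__(self, other):
        return type(self) is type(other) and self._key() == other._key()

    def __hash__(self):
        return hash((type(self).__name__, self._key()))


class Add(Basic):
    """Interior node: everything it needs lives in args."""


class Sym(Basic):
    """Leaf that keeps its name outside args, as described for Symbol."""

    def __init__(self, name):
        super().__init__()  # args stays empty
        self.name = name

    def _key(self):
        return (self.name,)


def rebuilds(obj):
    """Check the invariant obj.func(*obj.args) == obj."""
    try:
        return obj.func(*obj.args) == obj
    except TypeError:
        # The constructor needs data that args does not carry.
        return False


x, y = Sym("x"), Sym("y")
print(rebuilds(Add(x, y)))  # interior node: args are enough to rebuild it
print(rebuilds(x))          # leaf: the name is missing from args
```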

The root of the problem is that there is no notion of an atom or a
leaf **consistently** used in sympy. There is the class Atom and
the property is_Atom, but very few of the algorithms in sympy
actually check them.

I have raised this issue a few times already, but each time the
discussion stalled.

Can we settle on a definition for an atom that also works for
named objects (i.e. objects that need to have strings in their args)?

What about defining a function atom() in utilities that does something like:

    def atom(obj):
        try:
            return bool(obj.args) and obj.is_Atom
            # captures Atom instances and
            # shortcircuits on empty args
        except AttributeError:
            return False # strings work

Then each time an issue arises, we find the offending recursive
algorithm and reimplement it with atom(). Step by step all of sympy
will have consistent recursion routines.

krastano...@gmail.com

Jul 21, 2012, 5:03:20 PM
to sy...@googlegroups.com
Or the other obvious solution is to make it official that objects that
want a name subclass Symbol (or contain an instance of Symbol in their
args).

It is a much easier solution (though rather hackish).

I am just asking the community what the "canonical" way to do it
should be.

Aaron Meurer

Jul 21, 2012, 9:54:41 PM
to sy...@googlegroups.com
I don't know if it is the best way, but in some sense, this does make
sense. Symbol is the SymPy version of str, just as Integer is the
SymPy version of int and Tuple is the SymPy version of tuple. If you
want to have an integer or tuple contained in your object, you just
use Integer or Tuple. So I think logically using Symbol for str makes
sense.

One possible issue is that Symbol has assumptions tied into it.
Hopefully the new assumptions will make this less of an issue. We
could also expand the class structure so that there is a base class to
Symbol that is only a string encapsulation, and subclasses for Symbol,
and other things like BooleanSymbol.

By the way, using Symbol instead of just a str will give you many
useful things for free, like smart printing, easy replacement with
.subs, the ability to drop in a Dummy if you need a dummy name, etc.

Aaron Meurer

Aaron Meurer

Jul 21, 2012, 9:58:13 PM
to sy...@googlegroups.com
Why False? Wouldn't atom(obj) == False mean that obj can be recursed
into, which would not hold for non-Basic objects?

Aaron Meurer

>
> Then each time an issue arises, we find the offending recursive
> algorithm and reimplement it with atom(). Step by step all of sympy
> will have consistent recursion routines.
>

Ronan Lamy

Jul 21, 2012, 11:14:12 PM
to sy...@googlegroups.com
On Saturday, July 21, 2012 at 19:54 -0600, Aaron Meurer wrote:
> I don't know if it is the best way, but in some sense, this does make
> sense. Symbol is the SymPy version of str, just as Integer is the
> SymPy version of int and Tuple is the SymPy version of tuple. If you
> want to have an integer or tuple contained in your object, you just
> use Integer or Tuple. So I think logically using Symbol for str makes
> sense.

No, Symbol is definitely not the sympy version of str. It has none of
the methods expected of a string and does a whole lot of things
completely unrelated to strings. If Symbols didn't have names, very
little would change, outside of printing.

The fact that Symbols (and functions, and classes, etc.) have a name
doesn't make them strings. On the contrary, it shows that they are
nothing like strings, because strings are text but don't have an
additional textual label (aka a name) attached to them.

> One possible issue is that Symbol has assumptions tied into it.
> Hopefully the new assumptions will make this less of an issue. We

Assumptions are part of the identity of a Symbol, so the new assumptions
won't change that.

> could also expand the class structure so that there is a base class to
> Symbol that is only a string encapsulation, and subclasses for Symbol,
> and other things like BooleanSymbol.

Having a base class for named objects would be good, but it can only
work for atoms, as non-atoms can't have strings in their args.

krastano...@gmail.com

Jul 22, 2012, 5:36:24 AM
to sy...@googlegroups.com
>> could also expand the class structure so that there is a base class to
>> Symbol that is only a string encapsulation, and subclasses for Symbol,
>> and other things like BooleanSymbol.
>
> Having a base class for named objects would be good, but it can only
> work for atoms, as non-atoms can't have strings in their args.

It seems that you are wrong about this. The idea of Symbol is exactly
that it provides a hack for having a name and not storing it in args.
See MatrixSymbol for instance. There is no algorithm in sympy that
should not consider it an atom, it has non-empty args, it has a name
that is not in its args.

So basically my question is: what do you prefer? Hacks like what is
done to the name string of Symbol, or permitting non-Basic objects in
the args of the leaves of the sympy expression tree? Basically, the
presence of non-Basic objects in the args would be what defines the end
of the recursion.

@Aaron
> Why False? Wouldn't atom(obj) == False mean that obj can be recursed
> into, which would not hold for non-Basic objects?
Yes, you are right, it should be True instead of False.
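
With that correction applied, the helper and a traversal built on it might look like the sketch below. Two things here are assumptions rather than what was posted: the test is written as "empty args or is_Atom", which is one reading of the comments in the original snippet (the posted expression was `bool(obj.args) and obj.is_Atom`), and `Node`, `NamedLeaf` and `leaves` are toy names invented for the illustration, not SymPy classes or functions.

```python
def atom(obj):
    """Return True when a traversal should stop at obj instead of recursing."""
    try:
        args = obj.args
    except AttributeError:
        # Non-tree objects such as str or int cannot be recursed into.
        return True
    # Stop on empty args, or when the object flags itself as an atom.
    return not args or bool(getattr(obj, "is_Atom", False))


def leaves(obj):
    """Yield the leaves of a tree, using atom() as the only stopping rule."""
    if atom(obj):
        yield obj
    else:
        for arg in obj.args:
            yield from leaves(arg)


class Node:
    """Toy interior node (a stand-in, not sympy.Basic)."""
    is_Atom = False

    def __init__(self, *args):
        self.args = args


class NamedLeaf(Node):
    """Toy named object: it has args, including a str, but is still an atom."""
    is_Atom = True


m = NamedLeaf("m", 3, 4)
expr = Node(m, Node("x", 2))
found = list(leaves(expr))
print(found[0] is m, found[1:])
```

The point of the sketch is that the stopping rule lives in one place, so a traversal never has to know whether a leaf is an Atom, an object with empty args, or a plain string.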

Ronan Lamy

Jul 22, 2012, 12:35:25 PM
to sy...@googlegroups.com
On Sunday, July 22, 2012 at 11:36 +0200, krastano...@gmail.com wrote:
> >> could also expand the class structure so that there is a base class to
> >> Symbol that is only a string encapsulation, and subclasses for Symbol,
> >> and other things like BooleanSymbol.
> >
> > Having a base class for named objects would be good, but it can only
> > work for atoms, as non-atoms can't have strings in their args.
>
> It seems that you are wrong about this. The idea of Symbol is exactly
> that it provides a hack for having a name and not storing it in args.
> See MatrixSymbol for instance. There is no algorithm in sympy that
> should not consider it an atom, it has non-empty args, it has a name
> that is not in its args.

Symbol is perfectly consistent with sympy's existing object model. Since
it's an Atom, it doesn't have args and is free to use whatever
attributes it needs.

OTOH, what MatrixSymbol does is wrong and unsupported. Being an Atom and
having args is inconsistent and causes problems for algorithms that
don't expect it, e.g.:

>>> MatrixSymbol('m', 3, 4).xreplace({4:5}).args
('m', 3, 4)
>>> MatrixSymbol('m', 3, 4).subs({4:5}).args
('m', 3, 5)

Also, putting a string in .args causes it to fail test_args.py.

>
> So basically my question is: what do you prefer? Hacks like what is
> done to the name string of Symbol, or permitting non-Basic objects in
> the args of the leaves of the sympy expression tree? Basically, the
> presence of non-Basic objects in the args would be what defines the end
> of the recursion.

By definition, a leaf can't have args, so I don't understand what you
mean. If you're saying that leaves should be either Atoms or non-Basic
objects, that's de facto the way it works now, though making it the rule
would contradict issue 2070 and test_args.py.

krastano...@gmail.com

Jul 22, 2012, 3:52:27 PM
to sy...@googlegroups.com
> Symbol is perfectly consistent with sympy's existing object model. Since
> it's an Atom, it doesn't have args and is free to use whatever
> attributes it needs.
>
> OTOH, what MatrixSymbol does is wrong and unsupported. Being an Atom and
> having args is inconsistent and causes problems for algorithms that
> don't expect it, eg.:

If it is still unclear, I am not stating that everything is ok as it
is. I am asking how to solve issues raised by objects like
MatrixSymbol. Just stating that it is not working with current
algorithms is not of much help.

>> So basically my question is: what do you prefer? Hacks like what is
>> done to the name string of Symbol, or permitting non-Basic objects in
>> the args of the leaves of the sympy expression tree? Basically, the
>> presence of non-Basic objects in the args would be what defines the end
>> of the recursion.
>
> By definition, a leaf can't have args, so I don't understand what you
> mean. If you're saying that leaves should be either Atoms or non-Basic
> objects, that's de facto the way it works now, though making it the rule
> would contradict issue 2070 and test_args.py.

Well, this definition does not work well **in practice** at the
moment. This is why I am asking how to fix it. If neither using Symbol
for strings (hackish) nor redefining what an atom is (takes a long time,
requires fixing many tree traversal routines) is a viable route
forward, there must be another way.

I do not think that just stating "MatrixSymbols (or something similar)
has unexpected args" is of any help. No one has proposed a clear
solution that will work with "named object that must have args"
different from the two that have been proposed above.

So the way forward is to choose between (I may be missing alternative options):

- say that sympy will never support "named objects that must have args"
- use Symbol when we need name strings
- redefine what an atom is and correct each failing algorithm one by one

The first one is just giving up on implementing something that is
obviously useful.
The second one is hackish.
The third one is hard. However, it would allow refactoring all tree
traversal algorithms in a better-abstracted way. Yes, it does
contradict issue 2070, but the only reason that this issue exists is
that the current methods for tree traversal are limited (I may be
wrong about this, however I have asked the question many times and I
have never received a different answer).

krastano...@gmail.com

Jul 22, 2012, 3:57:44 PM
to sy...@googlegroups.com
> So the way forward is to choose between (I may be missing alternative options):
>
> - say that sympy will never support "named objects that must have args"
> - use Symbol when we need name strings
> - redefine what an atom is and correct each failing algorithm one by one
>
> The first one is just giving up on implementing something that is
> obviously useful.

Actually the first option is not bad if we give up on
obj.func(*obj.args) == obj. This is already done for Symbol. We just
need to admit that some objects (the leaves) will contain information
that will not be in args.

So, maybe before proceeding with arguing about the technical details
(I am sure that I am missing many of them) we can try to list any
other options. For the moment I see only the three mentioned above.
Any other ideas?

Ronan Lamy

Jul 22, 2012, 5:59:28 PM
to sy...@googlegroups.com
On Sunday, July 22, 2012 at 21:57 +0200, krastano...@gmail.com wrote:
> > So the way forward is to choose between (I may be missing alternative options):
> >
> > - say that sympy will never support "named objects that must have args"
> > - use Symbol when we need name strings
> > - redefine what an atom is and correct each failing algorithm one by one
> >
> > The first one is just giving up on implementing something that is
> > obviously useful.
>
> Actually the first option is not bad if we give up on
> obj.func(*obj.args) == obj. This is already done for Symbol. We just
> need to admit that some objects (the leaves) will contain information
> that will not be in args.

Well, obj.func(*obj.args) == obj never made sense for Atoms, so we would
be clarifying what the real situation is rather than giving up any
useful invariant. I'm not sure what should be done about named objects
with symbolic parts (like MatrixSymbol), though.

> So, maybe before proceeding with arguing about the technical details
> (I am sure that I am missing many of them) we can try to list any
> other options. For the moment I see only the three mentioned above.
> Any other ideas?
>
The second option is clearly hackish, because a Symbol has a name but
*is* not a name. The correct™ variant of this idea would be to create
sympy objects that actually *are* names and use them.
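
A rough sketch of what such name objects could look like, using toy classes only. `Basic`, `Leaf`, `Name`, `Int`, `NamedMatrix` and `replace` are all hypothetical names made up for this illustration, and none of them is SymPy API. The name becomes an ordinary arg that is itself a tree object, so the named object can be rebuilt from its args and a generic substitution can reach its symbolic parts. The leaves themselves still keep their payload outside args.

```python
class Basic:
    """Toy tree node (an assumption, not sympy.Basic)."""

    def __init__(self, *args):
        self.args = args

    @property
    def func(self):
        return type(self)

    def __eq__(self, other):
        return type(self) is type(other) and self.args == other.args

    def __hash__(self):
        return hash((type(self).__name__, self.args))


class Leaf(Basic):
    """Atom wrapping one Python value; its args stay empty."""

    def __init__(self, value):
        super().__init__()
        self.value = value

    def __eq__(self, other):
        return type(self) is type(other) and self.value == other.value

    def __hash__(self):
        return hash((type(self).__name__, self.value))


class Name(Leaf):
    """Hypothetical object that *is* a name rather than having one."""


class Int(Leaf):
    """Hypothetical integer leaf."""


class NamedMatrix(Basic):
    """Named object with symbolic parts: args are (name, rows, cols)."""


def replace(obj, old, new):
    """Generic substitution that rebuilds nodes through func and args."""
    if obj == old:
        return new
    if not obj.args:
        return obj
    return obj.func(*(replace(arg, old, new) for arg in obj.args))


m = NamedMatrix(Name("m"), Int(3), Int(4))
print(m.func(*m.args) == m)  # the named object rebuilds from its args
resized = replace(m, Int(4), Int(5))
print(resized == NamedMatrix(Name("m"), Int(3), Int(5)))
```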

krastano...@gmail.com

Jul 22, 2012, 7:19:35 PM
to sy...@googlegroups.com
I am trying to summarize:

- say that sympy will never support "named objects that must have args"
OK only if we give up on obj.func(*obj.args)==obj. We do not want to do
that because it is a useful invariant. An example where giving it up
may be harmful is MatrixSymbol(name, dim1, dim2), where we may want to
subs/replace dim1. I am considering this option dead.

- use Symbol when we need name strings
Yes, if it is not Symbol but a new SymbolName class. This adds
complexity (Why should one use SymbolName when actually recursing over
it seems like nonsense? Why not just str? The only answer to these
questions that I can think of is "otherwise there is too much all over
sympy that must be corrected". Ronan, I would appreciate your point of
view on these questions.)

- redefine what an atom is and correct each failing algorithm one by one
That is, actually using strings instead of Symbols or SymbolNames. The
**only** drawback is that it requires fixing algorithms all over sympy.
However, these fixes can be done step by step without breaking
anything. Is there another drawback? It would permit
obj.func(*obj.args)==obj even for Symbols, as they will be able to keep
their name strings in their args.