SymPy with generic Symbol proxies; Python expression algebra


Nathan Rice

Jan 31, 2012, 6:38:22 PM
to sympy
I am the author of elementwise (http://packages.python.org/
elementwise/) and constraints (http://packages.python.org/
constraintslib/). These tools are similar to SymPy in their use of
symbolic proxy objects to represent values that may be provided later
(and to some degree in their target audience).

I'm getting in touch because:

1.) I'd like to get people on board with the idea of a standard symbol
proxy. The general case is basically that doing anything with a symbol
returns another symbol representing the resulting expression. While
you do this to some degree now, the way you've gone about it
only allows you to use functions that return a wrapper object (Add,
Pow, etc). Anyone who wants to interact with SymPy has to take and
return these objects. A much better scenario is to have symbolic
expressions that can be realized down to normal Python code (and which
can be generated in a general manner from ASTs). This gives you
full interoperability with regular Python code.
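To make the idea concrete, here is a minimal sketch (not code from elementwise, constraints, or SymPy; all names are made up) of a symbol proxy where any operation returns a new expression node recording the expression "as written", which can later be realized down to ordinary Python values:

```python
import operator

class Expr:
    """An expression node: an operation applied to child expressions."""
    def __init__(self, op, *args):
        self.op, self.args = op, args

    def __add__(self, other):
        return Expr(operator.add, self, other)

    def __mul__(self, other):
        return Expr(operator.mul, self, other)

    def evaluate(self, env):
        """Realize the expression down to normal Python values."""
        args = [a.evaluate(env) if isinstance(a, Expr) else a
                for a in self.args]
        return self.op(*args)

class Sym(Expr):
    """A leaf: a named value to be provided later."""
    def __init__(self, name):
        self.name = name

    def evaluate(self, env):
        return env[self.name]

x = Sym('x')
expr = x + x   # stays (x + x); nothing auto-reduces it to 2*x
```

Because nothing is reduced at build time, the same expression works for any type that supports the recorded operations (ints, floats, even strings).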

2.) I noticed that a type system was on your roadmap. Given that a
solution can be achieved for #1, I feel this has broad applicability
to the community. This also ties into your assumption architecture,
which I understand you are currently overhauling. Ideally, your types
and assumptions should pretty much be one and the same.

I would be happy to work with you to achieve these goals. I think
SymPy is a cool project and nothing would make me happier than being
able to help bridge the gap between performing operations on symbolic
constructs and arbitrary code. I realize that the changes I've
proposed are non-trivial; I hope you see their value the way I do :)

Nathan Rice



Chris Smith

Feb 1, 2012, 2:15:01 AM
to sy...@googlegroups.com
This is exciting news, I think. Those who know better how these
things should work will no doubt be responding. Thanks for contacting
SymPy!

Aaron Meurer

Feb 1, 2012, 3:18:03 PM
to sy...@googlegroups.com
Hi.

On Tue, Jan 31, 2012 at 4:38 PM, Nathan Rice
<nathan.ale...@gmail.com> wrote:
> I am the author of elementwise (http://packages.python.org/
> elementwise/) and constraints (http://packages.python.org/
> constraintslib/).  These tools are similar to SymPy in their use of
> symbolic proxy objects to represent values that may be provided later
> (and to some degree in their target audience).

This is cool. Where can I get the code for these projects? I didn't
see any download links (I know I can pip install, but I want to play
with the code too).

>
> I'm getting in touch because:
>
> 1.) I'd like to get people on board with the idea of a standard symbol
> proxy.  The general case is basically doing anything with a symbol
> returns another symbol representing the resulting expression.  While
> you do this to some degree now, the way in which you've gone about it
> only allows you to use functions that return a wrapper object (Add,
> Pow, etc).  Anyone that wants to interact SymPy has to take and return
> these objects.  A much better scenario is to have and symbolic
> expressions that can be realized down to normal python code (and which
> can be generated in a general manner from ASTs).    This gives you
> full interoperability with regular Python code.

So I'm trying to understand exactly what you are suggesting here. You
want to do the sort of thing that SymPy does where you can build up
expressions symbolically, but abstract out the mathematical properties?
For example, x + x would not necessarily auto-reduce to 2*x (because
"x" could be anything that supports __add__, like str). We have an
idea to make it easier to produce unevaluated expressions that I think
is similar to what you are suggesting (see
http://code.google.com/p/sympy/issues/detail?id=2738).

>
> 2.) I noticed that a type system was on your roadmap.  Given a
> solution can be achieved for #1, I feel this has broad applicability
> to the community.  This also ties into your assumption architecture,
> which I understand you are currently overhauling.  Ideally, your types
> and assumptions should pretty much be one and the same.

Definitely. We are still in the process of removing the old system,
and to some degree, evolving the new one.

By the way, here's an example of the sort of thing to think about.
Right now, our relationals (a < b, a <= b, etc.) work like this. Expr
< Expr creates Lt(Expr, Expr) (Expr is the base class of SymPy
mathematical objects). If the relational can be evaluated to True or
False, that is returned. Otherwise, it is done mathematically. For
example:

In [55]: S(1) < S(2)
Out[55]: True

In [56]: x < x
Out[56]: False

In [57]: x < y
Out[57]: x < y

(S() is a shortcut to sympify(), which converts the arguments into
SymPy objects: in this case, Integer objects).

The problem with this is in the auto-evaluation. The relationals are
serving a double purpose. On the one hand, they represent boolean
objects (i.e., x < y means asking if x is less than y). On the other
hand, they represent symbolic (mathematical) objects. (x < y as a
mathematical statement).

The difference may seem subtle, but the point is that the former
should try to evaluate to boolean types, and the latter should not.
We have an open issue to split the two
(http://code.google.com/p/sympy/issues/detail?id=1887).

The sort of thing I'd like to discuss is how to combine the two. My
idea is to make __lt__ default to the symbolic type. But any call to
bool() would coerce it to the boolean type. This would also happen
automatically if they are used in the assumptions system (like ask(x <
y)). On the other hand, even if x = Symbol('x', positive=True) (or
assume(Q.positive(x)) or whatever the system looks like), you would
want x**2 > 0 to remain unevaluated, so you can pass it to solve
(solve(x**2 > 0), which would give something like x > 0).

What is the best way to make the object unevaluated by default, but to
evaluate intelligently in the right contexts, so that the user rarely
has to think about such things? The same question could apply to any
kind of symbolic object.
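One possible shape for this, sketched very loosely (these classes are hypothetical stand-ins, not SymPy's actual Lt or ask): __lt__ builds a symbolic relational, a three-valued decide() returns True/False/None, and bool() coerces, refusing to guess when the answer is unknown:

```python
class Sym:
    """A symbolic variable (stand-in for SymPy's Symbol)."""
    def __init__(self, name):
        self.name = name
    def __lt__(self, other):
        return Lt(self, other)

class Lt:
    """Symbolic 'lhs < rhs'; stays unevaluated until coerced."""
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

    def decide(self):
        # Three-valued logic: True, False, or None for "unknown".
        if isinstance(self.lhs, (int, float)) and \
           isinstance(self.rhs, (int, float)):
            return self.lhs < self.rhs
        if self.lhs is self.rhs:
            return False       # x < x is always false
        return None

    def __bool__(self):
        known = self.decide()
        if known is None:      # refuse to guess, rather than answer arbitrarily
            raise ValueError("cannot decide truth value; use decide()/ask()")
        return known

x, y = Sym('x'), Sym('y')
```

With this, "if x < y:" raises instead of silently picking an arbitrary truth value, while decided cases like Lt(1, 2) still coerce normally.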

>
> I would be happy to work with you to achieve these goals.  I think
> SymPy is a cool project and nothing would make me happier than being
> able to help bridge the gap between performing operations on symbolic
> constructs and arbitrary code.  I realize that the changes I've
> proposed are non-trivial; I hope you see their value the way I do :)
>
> Nathan Rice
>

Awesome. New contributions are very welcome. Feel free to jump
right in, or if you want, we can discuss things here until we start to
get an idea of what should be implemented.

Aaron Meurer

Nathan Rice

Feb 1, 2012, 5:44:35 PM
to sy...@googlegroups.com
> On Tue, Jan 31, 2012 at 4:38 PM, Nathan Rice
> <nathan.ale...@gmail.com> wrote:
>> I am the author of elementwise (http://packages.python.org/
>> elementwise/) and constraints (http://packages.python.org/
>> constraintslib/).  These tools are similar to SymPy in their use of
>> symbolic proxy objects to represent values that may be provided later
>> (and to some degree in their target audience).
>
> This is cool.  Where can I get the code for these projects? I didn't
> see any download links (I know I can pip install, but I want to play
> with the code too).

My bad:

https://github.com/nathan-rice/Constraints
https://github.com/nathan-rice/Elementwise

>>
>> I'm getting in touch because:
>>
>> 1.) I'd like to get people on board with the idea of a standard symbol
>> proxy.  The general case is basically doing anything with a symbol
>> returns another symbol representing the resulting expression.  While
>> you do this to some degree now, the way in which you've gone about it
>> only allows you to use functions that return a wrapper object (Add,
>> Pow, etc).  Anyone that wants to interact SymPy has to take and return
>> these objects.  A much better scenario is to have and symbolic
>> expressions that can be realized down to normal python code (and which
>> can be generated in a general manner from ASTs).    This gives you
>> full interoperability with regular Python code.
>
> So I'm trying to understand exactly what you are suggesting here?  You
> want to do the sort of thing that SymPy does where you can build up
> expression symbolically, but abstract out the mathematical properties?
>  For example, x + x would not necessarily auto-reduce to 2*x (because
> "x" could be anything that supports __add__, like str).  We have an
> idea to make it easier to produce unevaluated expressions that I think
> is similar to what you are suggesting (see
> http://code.google.com/p/sympy/issues/detail?id=2738).

Basically, instead of having the specialized properties at the proxy
level, just use the proxy to build something resembling a generic AST.
If done properly, it should be a fairly trivial matter to convert a
"true" python AST into a symbolic expression AST. That lets you do
fun things with normal python functions :) Granted, to really do fun
stuff, you will need to provide some annotations for any python
functions you want to use (it is possible to infer these automatically
in a great many cases but I don't think it is time to go down that
rabbit hole yet).
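The "convert a true Python AST into a symbolic expression AST" step can be sketched with the standard ast module (the tuple encoding below is illustrative, not anything SymPy or my libraries actually use):

```python
import ast

def to_expr(node):
    """Rebuild a Python AST node as nested (operation, args) tuples."""
    if isinstance(node, ast.Expression):
        return to_expr(node.body)
    if isinstance(node, ast.BinOp):
        op = type(node.op).__name__        # 'Add', 'Mult', 'Pow', ...
        return (op, to_expr(node.left), to_expr(node.right))
    if isinstance(node, ast.Name):
        return ('Symbol', node.id)
    if isinstance(node, ast.Constant):
        return ('Number', node.value)
    raise NotImplementedError(ast.dump(node))

def parse_expr(source):
    """Parse Python source for an expression into a generic expression AST."""
    return to_expr(ast.parse(source, mode='eval'))
```

A calculus-specific layer can then interpret or rewrite these generic trees however it likes, without the proxy layer knowing anything about algebra.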

Also, building a generic AST means if you want to implement some
calculus down the line where distributivity or commutativity hold in
very different circumstances, you touch a lot less code.

My simple rule has been that anything done with a symbol returns a
symbol. I think that works well for the general symbol proxy case.

I think in the more specific case, it depends on the context. If you
know that a particular expression exists in the context of some
specific calculus, and under the accepted axioms you can make a
simplification that will __always__ be valid, you should go ahead and
make it. You can fix the calculus based on what functions are used on
the expression, and any symbols would ideally have constraints, so
that requirement seems reasonable to me. To reference your ask
example, if you assume that calling ask imposes an elementary algebra
context, that would give you a green light to reduce and transform,
without introducing inconsistencies in the behavior of the symbol
proxy itself.

So, to answer the question, I would build a general expression that is
"as written", then transform it as needed once the specific context
can be clearly identified.


I will go ahead and try to generalize the proxy base I have now to
support multiple variables and generate a full AST. Once I've got
that I'll create a SymPy-experimental fork and start nibbling around
the edges of a rather large refactor as time permits :)


Nathan

Aaron Meurer

Feb 1, 2012, 6:31:20 PM
to sy...@googlegroups.com
On Wed, Feb 1, 2012 at 3:44 PM, Nathan Rice

<nathan.ale...@gmail.com> wrote:
>> On Tue, Jan 31, 2012 at 4:38 PM, Nathan Rice
>> <nathan.ale...@gmail.com> wrote:
>>> I am the author of elementwise (http://packages.python.org/
>>> elementwise/) and constraints (http://packages.python.org/
>>> constraintslib/).  These tools are similar to SymPy in their use of
>>> symbolic proxy objects to represent values that may be provided later
>>> (and to some degree in their target audience).
>>
>> This is cool.  Where can I get the code for these projects? I didn't
>> see any download links (I know I can pip install, but I want to play
>> with the code too).
>
> My bad:
>
> https://github.com/nathan-rice/Constraints
> https://github.com/nathan-rice/Elementwise

Cool. So you are basically letting Python do the work by putting
everything in lambdas (or is it nested lambdas?). SymPy builds things
up in its own proxy objects, which has advantages (e.g., we can
easily print the object as it was built). I'm sure there are
advantages to just using Python closures, though I can't think of them
now (perhaps more space efficient, and simpler code?).

In my experience, you should be even more cautious than that, because
the ability to print symbolic expressions in SymPy means that someone
might want to write something like x + x = 2*x and have it printed
that way. That's why we allow the creation of unevaluated objects even
when we can evaluate them (like Eq(Integral(x, x), integrate(x, x))).
Perhaps this concept can be abstracted into the "calculus" concept.

There are also issues like the cost of the simplification (for
example, we don't auto-simplify sqrt(number) into factors beyond a
certain size of the prime factors, as integer factorization is too
expensive), and whether or not the "simplification" actually makes
things simpler or not (it's very subjective, and depends on what
measure you use).
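The sqrt canonicalization with a cost cutoff can be sketched roughly like this (a simplified illustration of the idea, not SymPy's actual algorithm or limits):

```python
def split_sqrt(n, limit=1000):
    """Return (outside, inside) with sqrt(n) == outside * sqrt(inside).

    Pulls square factors out by trial division, but stops trying once
    candidate factors exceed `limit`, since full integer factorization
    can be too expensive.
    """
    outside, inside, p = 1, n, 2
    while p * p <= inside:
        if p > limit:           # factorization judged too expensive: give up
            break
        while inside % (p * p) == 0:
            outside *= p
            inside //= p * p
        p += 1
    return outside, inside
```

So sqrt(8) canonicalizes to 2*sqrt(2), while a number whose square factors are all beyond the limit is simply left under the radical.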

But the point is that automatic simplification cannot be overridden,
at least in the same context, so it should only be done if you are
certain that no one would ever not want it done (in that context).

> You can fix the calculus based on what functions are used on
> the expression, and any symbols would ideally have constraints, so
> that requirement seems reasonable to me.  To reference your ask
> example, if you assume that calling ask imposes an elementary algebra
> context, that would give you a green light to reduce and transform,
> without introducing inconsistencies in the behavior of the symbol
> proxy itself.
>
> So, to answer the question, I would build a general expression that is
> "as written", then transform it as needed once the specific context
> can be clearly identified.

I remembered what the tricky part of this was. We use a three-valued
logic in our assumptions system. If something can be deduced to be
always true or always false, we use True and False. Otherwise, we use
None, which means that it can be either true or false with the
information given (or possibly just one and our algorithms aren't
smart enough to deduce it).

So if we have x < y, where x and y are just Symbol objects, ask(x < y)
would return None, as we don't know if x is less than y or not with
the information given. The problem is that bool() only returns True
or False. So if someone does "if x < y:", this is a problem. My idea
is to make __nonzero__ raise ValueError if it would be None, which
would prevent users from writing such statements (right now,
Le.__nonzero__ returns True or False arbitrarily, which leads to
hidden bugs if someone writes such an if statement).

The question is, will this restrict our ability to coerce it into a
Boolean object when necessary? It seems like it might, though I can't
tell whether it would without actually implementing it. Perhaps it
would be cleaner to make Lt.__nonzero__ *always* raise ValueError,
even if it's something like Lt(2, 3), and require code that wants to
ask about it to wrap it in a call to ask().

Sorry if this is veering off topic, but your discussion reminded me
of this particular issue.

>
>
> I will go ahead and try to generalize the proxy base I have now to
> support multiple variables and generate a full AST.  Once I've got
> that I'll create a SymPy-experimental fork and start nibbling around
> the edges of a rather large refactor as time permits :)

Great. Keep us updated on your progress.

Aaron Meurer

>
>
> Nathan
>

Joachim Durchholz

Feb 1, 2012, 7:05:54 PM
to sy...@googlegroups.com
Am 02.02.2012 00:31, schrieb Aaron Meurer:
> Cool. So you are basically letting Python do the work by putting
> everything in lambdas (or is it nested lambdas?). SymPy builds things
> up in it's own proxy objects, which has advantages (e.g., we can
> easily print the object as it was built).

I may be displaying my Pythonic ignorance here, but wouldn't it
be possible to change the print function for these lambdas?
Alternatively, the proxies could be made into Callables by attaching the
proper function stuff.

As far as I understand Python, both variants should work equally well
from the interpreter's point of view, so we'd be free to choose either
route.

I have to admit I don't have a good idea about the gap between Nathan's
proposal and current SymPy code, so YMMV.
Just my 2 cents :-)

Nathan Rice

Feb 1, 2012, 9:32:58 PM
to sy...@googlegroups.com
> Cool.  So you are basically letting Python do the work by putting
> everything in lambdas (or is it nested lambdas?).  SymPy builds things
> up in it's own proxy objects, which has advantages (e.g., we can
> easily print the object as it was built). I'm sure there are
> advantages to just using Python closures, though I can't think of them
> now (perhaps more space efficient, and simpler code?).

Depends on whether you're talking about elementwise or constraints.
Constraints basically converts everything to a one-argument callable,
so that you can evaluate each expression in the context of an input
value. Elementwise uses lambdas to put off dereferencing variables
until the last possible moment, and to get around the fact that
standard generators are one-shot items.

Also, note that @chainable does some magic :) Every operation returns
a new proxy, so I can store whatever I need. Currently I only store
the "state" (via closure) and a parent reference, but rather than
building state, I could just store (operation, [arguments]). There
are a few downsides to that from my perspective:

1.) All arguments are evaluated immediately
2.) You lose the nice ability to "swap" variables in the closure.
3.) You are going to have to generate the "compiled" python expression
eventually anyhow

The main advantage of the (function, [arguments]) tuple is that it
lets you defer "compiling" the expression into python, so if you want
to go back and replace all __add__ calls with __wackyadd__ later on,
you can do that easily. You could achieve the same behavior with the
benefits of closures by having a "func" variable in the closure that
points to the unbound type.__method__, then using that in the returned
function.  The (function, [arguments]) version is also easier to
introspect; however, there is nothing to stop you from decorating the
proxy object with the same info.
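The (function, [arguments]) representation might look something like this (a hedged sketch: the tuple encoding, OPS table, and '__wackyadd__' are all invented for illustration):

```python
import operator

# Operation table; '__wackyadd__' is a made-up replacement op for the demo.
OPS = {
    '__add__': operator.add,
    '__mul__': operator.mul,
    '__wackyadd__': operator.sub,
}

def replace_op(expr, old, new):
    """Rewrite every occurrence of operation `old` to `new` in the tree."""
    if not isinstance(expr, tuple):
        return expr
    op = new if expr[0] == old else expr[0]
    return (op,) + tuple(replace_op(a, old, new) for a in expr[1:])

def compile_expr(expr, env):
    """'Compile' the tree down to a plain Python value for bindings env."""
    if not isinstance(expr, tuple):
        return env.get(expr, expr)   # symbol name -> value; literals pass through
    args = [compile_expr(a, env) for a in expr[1:]]
    return OPS[expr[0]](*args)

expr = ('__add__', 'x', ('__mul__', 2, 'x'))   # x + 2*x, stored "as written"
```

Because operations are stored by name until compile time, swapping every '__add__' for '__wackyadd__' is a simple tree rewrite rather than a change to the proxy machinery.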

> In my experience, you should be even more cautious than that, because
> the ability to print symbolic expressions in SymPy means that someone
> might want to write something like x + x = 2*x and have it printed
> that way.  That's why we allow to create unevaluated objects even when
> we can evaluate them (like Eq(Integral(x, x), integrate(x, x))).
> Perhaps this concept can be abstracted into the "calculus" concept.

Could you provide some examples of when you would want to
automatically simplify? Off the top of my head, canonical forms are
the only thing that comes to mind, and that could be deferred until
needed.

> But the point is that automatic simplification cannot be overridden,
> at least in the same context, so it should only be done if you are
> certain that no one would ever not want it done (in that context).

I think "no one would ever not want it done" is a pretty good criterion.

The behavior of bool is something that I'm actually not entirely happy
with. I've been toying around with a PyPy plugin that would deal with
the situations that break proxies, like bool, and, etc.

My personal feeling is that bool() on a symbolic expression should
always return True (or at least True whenever the expression is
non-empty), because "if thing: ..." as a way to check that the
variable you're holding on to has been bound or initialized is
idiomatic Python.


Nathan

Aaron Meurer

Feb 2, 2012, 12:51:47 PM
to sy...@googlegroups.com
On Wed, Feb 1, 2012 at 7:32 PM, Nathan Rice
<nathan.ale...@gmail.com> wrote:
>> Cool.  So you are basically letting Python do the work by putting
>> everything in lambdas (or is it nested lambdas?).  SymPy builds things
>> up in it's own proxy objects, which has advantages (e.g., we can
>> easily print the object as it was built). I'm sure there are
>> advantages to just using Python closures, though I can't think of them
>> now (perhaps more space efficient, and simpler code?).
>
> Depends on whether you're talking about elementwise or constraints.
> Constraints basically converts everything to a one argument callable,
> so that you can evaluate each expression in the context of an input
> value.  Elementwise uses lambdas to put off dereferencing variables
> until the last possible moment, and to get around the fact that
> standard generators are one shot items.

I saw that they were both using lambda, though I didn't try to follow
all the decorator and metaclass logic to see what exactly happens with
each.

>
> Also, note that @chainable does some magic :)  Every operation returns
> a new proxy, so I can store whatever I need.  Currently I only store
> the "state" (via closure) and a parent reference, but rather than
> building state, I could just store (operation, [arguments]).  There
> are a few downsides to that from my perspective:
>
> 1.) All arguments are evaluated immediately
> 2.) You lose the nice ability to "swap" variables in the closure.
> 3.) You are going to have to generate the "compiled" python expression
> eventually anyhow
>
> The main advantage of the (function, [arguments]) tuple is that it
> lets you defer "compiling" the expression into python, so if you want
> to go back and replace all __add__ calls with __wackyadd__ later on,
> you can do that easily.  You could achieve the same behavior with the
> benefits of closures by having a "func" variable in the closure that
> points to the unbound type.__method__, then using that in the returned
> function.  The (function, [arguments]) version is also easier to
> introspect, however there is nothing to stop you from decorating the
> proxy object with the same info.

Another advantage of doing it the other way: you can change the
precedence order of operators. This would be kind of magical (read:
hackish) if done at this level, but it should be doable.

And I think one of the reasons people use SymPy is that you can get
simplification. So if your operation boils down to x + 2*x - y, SymPy
will simplify this to the more efficient 3*x - y. This is a trivial
example, but you can imagine the highly powerful symbolic
simplification possibilities that exist for this sort of thing.
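The like-term collection behind that example is easy to sketch (this is a toy illustration, not SymPy's Add logic): track the coefficient of each symbol in a flat sum and drop terms that cancel.

```python
from collections import Counter

def collect_terms(terms):
    """Collect like terms in a sum given as (coefficient, symbol) pairs.

    x + 2*x - y  ->  {'x': 3, 'y': -1}, i.e. 3*x - y.
    """
    coeffs = Counter()
    for coeff, sym in terms:
        coeffs[sym] += coeff
    # Drop terms whose coefficients cancelled to zero.
    return {s: c for s, c in coeffs.items() if c != 0}

# x + 2*x - y
simplified = collect_terms([(1, 'x'), (2, 'x'), (-1, 'y')])
```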

>
>> In my experience, you should be even more cautious than that, because
>> the ability to print symbolic expressions in SymPy means that someone
>> might want to write something like x + x = 2*x and have it printed
>> that way.  That's why we allow to create unevaluated objects even when
>> we can evaluate them (like Eq(Integral(x, x), integrate(x, x))).
>> Perhaps this concept can be abstracted into the "calculus" concept.
>
> Could you provide some examples of when you would want to
> automatically simplify?  Off the top of my head, canonical forms is
> the only one that came to mind, and that could be deferred until
> needed.

Well, we automatically simplify x + x to 2*x. If we didn't do that,
SymPy would be useless as a CAS. We also combine sums and products of
rational numbers, and canonicalize rational powers of rational numbers
(up to a limit in the prime factors).

>
>> But the point is that automatic simplification cannot be overridden,
>> at least in the same context, so it should only be done if you are
>> certain that no one would ever not want it done (in that context).
>
> I think "no one would ever not want it done" is a pretty good criteria.

Right. You have to consider context too. Some people do want to write
x + x without it turning into 2*x. In SymPy, they can do this with
Add(x, x, evaluate=False). This might be considered a different
"context". Certainly expressions created this way won't work in some
algorithms without being evaluated first.

But calling the objects with evaluate=False is messy and hard to
write. My hope is that this discussion will lead to something easier
to use, and possibly more powerful.

That's another option.

And I agree about bool, or at least booleans. One thing that came up
recently was that someone wanted to be able to write x < y < z and
create a symbolic object out of it. This is impossible, because this
evaluates to x < y and y < z, and there is no way to override the
logic operators or their short circuits.
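A small demonstration of why x < y < z is unrecoverable (the classes here are throwaway stand-ins): Python rewrites it as (x < y) and (y < z), so bool() is forced on the first relational and only the second one survives as an object.

```python
calls = []

class Rel:
    """A symbolic relational recording its operands."""
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs
    def __bool__(self):
        calls.append('bool')   # the chained comparison coerces us here
        return True

class Sym:
    def __init__(self, name):
        self.name = name
    def __lt__(self, other):
        return Rel(self, other)

x, y, z = Sym('x'), Sym('y'), Sym('z')
result = x < y < z   # really: (x < y) and (y < z)
```

The (x < y) relational is collapsed to a bool and discarded; `result` is only the Rel for y < z, and no override can prevent this because `and` and its short circuit are not overloadable.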

Using PyPy to get things working that are "broken" in Python is an
interesting idea. I noticed that PyPy doesn't have CPython's stupid
restriction that __contains__ return a bool, so we could actually
implement an In() object there.

Unfortunately, right now, SymPy is unbearably slow in PyPy, so this
would definitely be a roadmap thing.

Aaron Meurer

>
>
> Nathan
>

Ronan Lamy

Feb 2, 2012, 2:13:40 PM
to sy...@googlegroups.com
Le jeudi 02 février 2012 à 10:51 -0700, Aaron Meurer a écrit :

> Unfortunately, right now, SymPy is unbearably slow in PyPy, so this
> would definitely be a roadmap thing.

No, it isn't; see for instance
http://speed.pypy.org/timeline/#/?exe=3,6,1,5&base=2+472&ben=sympy_expand&env=1&revs=200&equid=off
It's only testing sympy that is unbearably slow on PyPy; actually
using it should give about the same speed as on CPython.

Nathan Rice

Feb 2, 2012, 5:21:27 PM
to sy...@googlegroups.com
>> Also, note that @chainable does some magic :)  Every operation returns
>> a new proxy, so I can store whatever I need.  Currently I only store
>> the "state" (via closure) and a parent reference, but rather than
>> building state, I could just store (operation, [arguments]).  There
>> are a few downsides to that from my perspective:
>>
>> 1.) All arguments are evaluated immediately
>> 2.) You lose the nice ability to "swap" variables in the closure.
>> 3.) You are going to have to generate the "compiled" python expression
>> eventually anyhow
>>
>> The main advantage of the (function, [arguments]) tuple is that it
>> lets you defer "compiling" the expression into python, so if you want
>> to go back and replace all __add__ calls with __wackyadd__ later on,
>> you can do that easily.  You could achieve the same behavior with the
>> benefits of closures by having a "func" variable in the closure that
>> points to the unbound type.__method__, then using that in the returned
>> function.  The (function, [arguments]) version is also easier to
>> introspect, however there is nothing to stop you from decorating the
>> proxy object with the same info.
>
> Another advantage of doing it the other way: you can change the
> precedence order of operators.  This would be kind of magical (read:
> hackish) if done at this level, but it should be doable.

I don't quite understand what you mean. When you say "change the
precedence order of operators", are you referring to rewriting the
expression after it has been built?


>>> But the point is that automatic simplification cannot be overridden,
>>> at least in the same context, so it should only be done if you are
>>> certain that no one would ever not want it done (in that context).
>>
>> I think "no one would ever not want it done" is a pretty good criteria.
>
> Right. You have to consider context too.  Some people do want to write
> x + x without it turning into 2*x.  In SymPy, they can do this with
> Add(x, x, evaluate=False).  This might be considered a different
> "context".  Certainly expressions created this way won't work in some
> algorithms without being evaluated first.
>
> But calling the objects with evaluate=False is messy and hard to
> write.  My hope is that this discussion will lead to something easier
> to use, and possibly more powerful.

If you start from a very general symbolic expression that is pretty
much "as written", I think automatically performing whatever
evaluation is required for the function not to raise an exception
would be reasonable. You could decorate the expressions with your
evaluated isomorphism. Of course, then you'd have to track mutations
to the expression and throw away any isomorphisms that were on
ancestors of mutated nodes. If someone explicitly simplifies the
expression you could just swap nodes with any already calculated
canonical isomorphisms.


> And I agree about bool, or at least booleans.  One thing that came up
> recently was that someone wanted to be able to write x < y < z and
> create a symbolic object out of it.  This is impossible, because this
> evaluates to x < y and y < z, and there is no way to override the
> logic operators or their short circuits.
>
> Using PyPy to get things working that are "broken" in Python is an
> interesting idea.  I noticed that PyPy doesn't have the stupid
> restriction that __contains__ return a bool that CPython does, so we
> could actually implement an In() object there.

PyPy is an awesome sandbox :) The python/rpython separation and
application/interpreter level objects take some getting used to, but
things by and large are VERY accessible. In particular, object spaces
expose most stuff you'd want to change. The behavior of and/or/not/is
are exceptions (those are buried in the interpreter). Getting even
trivial stuff changed in the CPython interpreter is a rough process,
but a working implementation and concrete use case does a lot to
advance your position.


Nathan

Aaron Meurer

Feb 3, 2012, 6:18:28 PM
to sy...@googlegroups.com

I mean things like making ^ act like **. This actually probably isn't
possible, at least in general, because you can't know how the user
parenthesized the expression. But it's just an idea that I had.

>
>
>>>> But the point is that automatic simplification cannot be overridden,
>>>> at least in the same context, so it should only be done if you are
>>>> certain that no one would ever not want it done (in that context).
>>>
>>> I think "no one would ever not want it done" is a pretty good criteria.
>>
>> Right. You have to consider context too.  Some people do want to write
>> x + x without it turning into 2*x.  In SymPy, they can do this with
>> Add(x, x, evaluate=False).  This might be considered a different
>> "context".  Certainly expressions created this way won't work in some
>> algorithms without being evaluated first.
>>
>> But calling the objects with evaluate=False is messy and hard to
>> write.  My hope is that this discussion will lead to something easier
>> to use, and possibly more powerful.
>
> If you start from a very general symbolic expression that is pretty
> much "as written", I think automatically performing whatever
> evaluation is required for the function not to raise an exception
> would be reasonable.  You could decorate the expressions with the your
> evaluated isomorphism.  Of course, then you'd have to track mutations
> to the expression and throw away any isomorphisms that were on
> ancestors of mutated nodes.  If someone explicitly simplifies the
> expression you could just swap nodes with any already calculated
> canonical isomorphisms.

I think this is why we need a type system, so we can tell how far to
evaluate things. And in some cases, we can get away with not
evaluating things. For example, if you have factorint(23 + 14), it
has to compute the sum to get the answer (suppose that 23 + 14 is given
in an unevaluated format somehow). But factorint(23*14) does not have
to evaluate the product to get the answer, and in fact, can compute
the answer more efficiently by *not* evaluating the product.
Similarly, we can compute log(log(10**10**10)) as 10*log(10) + log(log(10))
*only* if we prevent 10**10**10 from evaluating into a single number
(because this is too big to be represented by any machine in our
universe). These are simple examples with numbers, but you can
imagine that similar things can happen with symbolic expressions.
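The factorint example can be sketched directly (a naive illustration, not SymPy's factorint): factoring an unevaluated product only requires factoring each operand and merging the exponents, never computing the product itself.

```python
from collections import Counter

def factorint(n):
    """Naive trial-division integer factorization: {prime: exponent}."""
    factors, p = Counter(), 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        factors[n] += 1
    return factors

def factor_product(operands):
    """Factor a product without ever evaluating the product."""
    total = Counter()
    for n in operands:
        total += factorint(n)   # merge exponents operand by operand
    return total
```

For 23*14 the merged result equals factoring the evaluated product, but each operand is far smaller than the product, so the unevaluated route is cheaper.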

The reverse can happen too: sometimes you want things that would not
otherwise evaluate automatically to evaluate (or simplify or however
you want to think of it), so that a computation can take place (the
assumptions system is a common place where this happens).

>
>
>> And I agree about bool, or at least booleans.  One thing that came up
>> recently was that someone wanted to be able to write x < y < z and
>> create a symbolic object out of it.  This is impossible, because this
>> evaluates to x < y and y < z, and there is no way to override the
>> logic operators or their short circuits.
>>
>> Using PyPy to get things working that are "broken" in Python is an
>> interesting idea.  I noticed that PyPy doesn't have the stupid
>> restriction that __contains__ return a bool that CPython does, so we
>> could actually implement an In() object there.
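
A minimal sketch (hypothetical Sym/Rel classes, not SymPy's) of why the
chaining is unrecoverable: Python expands x < y < z into
(x < y) and (y < z), and since `and` cannot be overridden, a truthy
(x < y) result is simply discarded:

```python
class Rel:
    """Records a single relational expression symbolically."""
    def __init__(self, lhs, op, rhs):
        self.lhs, self.op, self.rhs = lhs, op, rhs

class Sym:
    def __init__(self, name):
        self.name = name
    def __lt__(self, other):
        return Rel(self, '<', other)

x, y, z = Sym('x'), Sym('y'), Sym('z')
result = x < y < z
# Rel(x < y) was built, found truthy, and thrown away by `and`;
# only the second comparison survives:
print(result.lhs.name, result.op, result.rhs.name)
```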
>
> PyPy is an awesome sandbox :)  The python/rpython separation and
> application/interpreter level objects take some getting used to, but
> things by and large are VERY accessible.  In particular, object spaces
> expose most stuff you'd want to change.  The behavior of and/or/not/is
> are exceptions (those are buried in the interpreter).  Getting even
> trivial stuff changed in the CPython interpreter is a rough process,
> but a working implementation and concrete use case does a lot to
> advance your position.
>
>
> Nathan
>


Andy Ray Terrel

unread,
Feb 12, 2012, 7:31:34 AM2/12/12
to sy...@googlegroups.com
FWIW, Andreas Kloeckner's Pymbolic code is closer to what Nathan is
suggesting. He builds trees and then operates on them with visitors.
It's a nice system but not nearly feature-complete. SymPy relies
quite a bit on "construction time" logic to reduce things to canonical
form. Pymbolic relies on visitors to rewrite the graph. This makes
Pymbolic faster for building simple exprs but leaves error checking
until the end of the computation. It also has the nice effect of
being very extensible.

While I agree it would be nice to have a more general AST that can go
between these projects, I don't know that it is feasible to switch
SymPy. It would be much more work.

-- Andy

Nathan Rice

unread,
Feb 13, 2012, 9:33:48 AM2/13/12
to sy...@googlegroups.com
On Sun, Feb 12, 2012 at 7:31 AM, Andy Ray Terrel <andy....@gmail.com> wrote:
> FWIW, Andreas Kloeckner's Pymbolic code is closer to what Nathan is
> suggesting.  He builds trees and then operates on them with visitors.
> Its a nice system but not nearly feature complete.  SymPy relies on
> quite a bit on "construction time" logic to reduce things to canonical
> form.  Pymbolic relies on visitors to rewrite the graph.  This makes
> Pymbolic faster for building simple exprs but leaves error checking
> until the end of the computation.  It also has the nice affect of
> being very extensible.
>
> While I agree it would be nice to have a more general AST that can go
> between these projects, I don't know that it is feasible to switch
> SymPy.  It would be much more work.

I was thinking about the easiest way to integrate an AST/visitor style
into SymPy. Rather than move everything to use the AST style
construct, I think it would be easier to have a function that converts
from one form to another. Since I construct a graph that encodes the
order in which the operations occurred, you could just walk from the
starting points, converting in order.
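
A sketch of such a conversion function, using hypothetical
(op, left, right) tuples as the source graph and emitting a compilable
Python AST (the tuple encoding is illustrative, not the actual format
symbolic uses):

```python
import ast

OPS = {'+': ast.Add, '*': ast.Mult, '-': ast.Sub}

def to_ast(node):
    """Recursively convert a tuple-encoded expression into Python AST."""
    if isinstance(node, tuple):
        op, left, right = node
        return ast.BinOp(to_ast(left), OPS[op](), to_ast(right))
    if isinstance(node, str):              # a symbol name
        return ast.Name(node, ast.Load())
    return ast.Constant(node)              # a literal

tree = ('*', ('+', 'x', 2), 4)             # (x + 2) * 4
expr = ast.fix_missing_locations(ast.Expression(to_ast(tree)))
value = eval(compile(expr, '<symbolic>', 'eval'), {'x': 1})
print(value)
```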

The main difficulty with the AST is that things which are expressions
in Python are statements in the AST, and behave differently.  The
biggest offenders here are the in-place operations.  I also have to think
of a nice interface to encode multiple symbol parents to a given state;
I'm not happy with anything I've tried so far.  If you guys have
any suggestions as far as how you would like to access all the symbols
in an expression I would love to hear your thoughts.

Nathan

Aaron Meurer

unread,
Feb 13, 2012, 5:54:28 PM2/13/12
to sy...@googlegroups.com

At present our core is immutable, so we haven't really dealt with
those problems very much. But we would like to experiment with a
mutable core at some point, so solving these problems would be useful.

So you want to allow nested in-place operators? In other words,
something like x*(x += 1) would be allowed (e.g., if x starts as 2,
this would give 3). This presents some interesting possibilities, but
also some difficulties. One thing that might be a limitation is that
Python doesn't make the distinction between += and =+ like languages
that allow this do (like C).

Consider this. If you have something like x*(y*z), this is computed
as x.__mul__(y.__mul__(z)) (ignore the existence of __rmul__ for now).
This can be computed in that order, i.e., first compute x, then
recursively compute y.__mul__(z), and pass that to x.__mul__(). In
AST, this means that you convert it to preorder traversal.
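
The call order is easy to observe with a small tracing class (a toy,
not SymPy's Mul):

```python
calls = []

class T:
    def __init__(self, name):
        self.name = name
    def __mul__(self, other):
        # Record which pair was multiplied, in the order Python calls us.
        calls.append((self.name, other.name))
        return T('(%s*%s)' % (self.name, other.name))

x, y, z = T('x'), T('y'), T('z')
x * (y * z)
# The inner product completes first, then its result is passed to
# x.__mul__:
print(calls)
```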

But preorder traversal is equivalent to using a stack with infix.
What happens then when using a stack doesn't work? What if you have
x*((x += 1)*y)? If you first compute x, then try to recursively
compute (x += 1)*y, this will be wrong, because x += 1 will change the
value of the x you already computed.

How do languages like C that support these things implement them (or
more specifically, how do the compilers implement them)? I suppose you
have to do some kind of ordered topological sort on the tree, or
something like that, but I'm just guessing.

Aaron Meurer

Nathan Rice

unread,
Feb 13, 2012, 7:22:40 PM2/13/12
to sy...@googlegroups.com
>> Nathan
>>
>
> At present our core is immutable, so we haven't really dealt with
> those problems very much.  But we would like to experiment with a
> mutable core at some point, so solving these problems would be useful.

I like to solve problems completely, so I will find a solution :)

> So you want to allow nested in-place operators?  In other words,
> something like x*(x += 1) would be allowed (e.g., if x starts as 2,
> this would give 3).  This presents some interesting possibilities, but
> also some difficulties.  One thing that might be a limitation is that
> Python doesn't make the distinction between += and =+ like languages
> that allow this do (like C).

Because in-place operators are statements in Python (even though the
operation is computable as an expression), you have to generate an AST
that looks something like this:

Module([
    AugAssign(
        Name('x', Store()),
        Add(),
        Num(1)
    ),
    Expr(
        BinOp(
            Name('x', Load()),
            Mult(),
            Name('x', Load())
        )
    )
])

i.e., two statements in a module.
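
The same structure falls out of ast.parse directly, which makes the
statement/expression split easy to verify:

```python
import ast

mod = ast.parse("x += 1\nx * x")
assert isinstance(mod.body[0], ast.AugAssign)     # a statement node
assert isinstance(mod.body[1], ast.Expr)          # expression statement
assert isinstance(mod.body[1].value, ast.BinOp)   # the x * x product
print(type(mod.body[0]).__name__)
```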

So I will have to break "state" into chain segments.  Whenever a
statement-style operation (such as __setattr__) is encountered, the
current parent will be set to None and a previous chain will be
created.

I can probably handle the multiple variable case of a symbol in the
same way, if a symbol is input to an operation on a symbol, take the
input symbol and add it to the "previous statements" list.

Oh python, why so inconsistent?


Nathan

Nathan Rice

unread,
Feb 26, 2012, 10:47:08 AM2/26/12
to sy...@googlegroups.com
I have this at a pretty workable state. It is available at
https://github.com/nathan-rice/symbolic.

Python's AST is not terribly fun to use, though I've done most of the
hard work. There were a lot of corner cases and little annoyances. I
still think the code transformation approach has promise, but I
definitely had to force myself to work on it at several points.

Has anyone here played with term rewriting software? I'm interested
in hacking up a little term rewriting system for proofs and simple
general computation. Suggestions for a good model to start from would
be appreciated.


Nathan

Joachim Durchholz

unread,
Feb 26, 2012, 12:06:01 PM2/26/12
to sy...@googlegroups.com
On 13.02.2012 23:54, Aaron Meurer wrote:
> In other words,
> something like x*(x += 1) would be allowed (e.g., if x starts as 2,
> this would give 3).

You do not want to have or allow side effects in user-specified
expressions. You get a whole lot of restrictions on which AST
transformations are safe and which aren't. I'm not sure how many of the
integral solvers would still be applicable if one of, say, the integration
bounds contained an increment operator.
Also, users will be unclear about when exactly the increment happens.
Heck, even programmers tend to avoid this kind of code, because it
doesn't clearly spell out what it's doing.

Heck, I'd even make the mutating operations on all symbols throw an
exception. Math is about immutable stuff, you are not supposed to
redefine the meaning of a symbol in mid-flight.
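
For what it's worth, making mutation throw is straightforward to sketch
(a hypothetical FrozenSymbol class, not SymPy behaviour):

```python
class FrozenSymbol:
    def __init__(self, name):
        self.name = name
    def __iadd__(self, other):
        # Without this, x += 1 would silently rebind the name x instead.
        raise TypeError("symbols are immutable; build a new expression")

x = FrozenSymbol('x')
try:
    x += 1
except TypeError as e:
    error = str(e)
print(error)
```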

Sergiu Ivanov

unread,
Feb 26, 2012, 2:05:16 PM2/26/12
to sy...@googlegroups.com
On Sun, Feb 26, 2012 at 7:06 PM, Joachim Durchholz <j...@durchholz.org> wrote:
> On 13.02.2012 23:54, Aaron Meurer wrote:
>>
>> In other words,
>> something like x*(x += 1) would be allowed (e.g., if x starts as 2,
>> this would give 3).
>
> Heck, I'd even make the mutating operations on all symbols throw an
> exception. Math is about immutable stuff, you are not supposed to redefine
> the meaning of a symbol in mid-flight.

As a functional programming beginner and fan, I can't but agree. I
was actually quite happy when I heard for the first time that SymPy
had immutable core, because that allows for much clearer and better
structured code (at least I see it so).

I'd like to ask the question which I believe to be very important:
what would be the benefit of having a mutable core? Is it going to
provide some essential speed-up? I am quite sure that there should be
a very serious reason to abandon the current immutability, because
this decision will immediately make the code more difficult to
maintain.

Sergiu

Joachim Durchholz

unread,
Feb 26, 2012, 4:58:22 PM2/26/12
to sy...@googlegroups.com
On 26.02.2012 20:05, Sergiu Ivanov wrote:
> As a functional programming beginner and fan, I can't but agree. I
> was actually quite happy when I heard for the first time that SymPy
> had immutable core, because that allows for much clearer and better
> structured code (at least I see it so).

Hm... there's the question of what an "immutable core" actually is. Is it
the mathematical objects that the user sees, or is it an internal API
for programmers?

In the former case, I'd avoid mutability. It's just a conceptual hassle,
it makes semantics-preserving transformations harder (as soon as
references are involved, prohibitively hard for a project of SymPy's
scale - you need alias analysis, and compiler teams spend person-decades
on that kind of stuff).

In the latter case, there's nothing wrong with it. Sure, coding just
with immutable data is fun and far less error-prone, but Python is not
built for that coding style; I suspect such code would be very clean but
not very efficient.

> I'd like to ask the question which I believe to be very important:
> what would be the benefit of having a mutable core?

That's an interesting question, yes.
And the first one that should be asked whenever important changes are
undertaken, and the first one to look for an answer for.

Just my 2c.

krastano...@gmail.com

unread,
Feb 26, 2012, 9:35:29 PM2/26/12
to sy...@googlegroups.com
I would like to point out that the immutability of the core need not
change if we follow the propositions in this discussion.

In my opinion the important point here is the implementation of
abstract syntax trees for sympy. The minor (from mathematical point of
view) details about whether it should be immutable tree or whether
side effects and assignments should be allowed in the tree (hopefully
not!) are secondary.

The use of ASTs seems like the right way to fix
http://code.google.com/p/sympy/issues/detail?id=1941

We can have an identity crawler that does not change the tree (it just
returns a copy); the current crawlers (just what Add and Mul do, but
much better abstracted); simplification and canonicalization crawlers;
special crawler options specific to an object (a Units subclass politely
asking the crawler to be placed last). Right now, every time we need
some canonicalization we subclass Expr. That is not a scalable solution
at all.

Examples:
- Special treatment of Units in Mul.
- The subclassed Quantum Expr in the quantum module.
- The need for MatExpr, MatAdd, MatMul... It is a good solution for the
  current state of the code, but it is ugly and it does not scale (what
  happens when I want both matrices and kets in my expression?)

The existence of an AST would make the implementation of abstract
multilinear algebra or group theory much easier, as there won't be any
need for subclassing Expr; just writing a crawler for the tree should
be sufficient. It won't be less work, but the incompatibility between
the subclasses of Expr will be avoided.
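
The identity crawler is simple to sketch on a tuple-encoded tree (the
encoding is hypothetical, not SymPy's actual args structure):

```python
def identity_crawl(node):
    """Walk the tree and rebuild it unchanged, returning a fresh copy.
    Real crawlers would rewrite nodes (simplify, canonicalize) here."""
    if isinstance(node, tuple):                  # interior: (head, *args)
        head, *args = node
        return (head, *(identity_crawl(a) for a in args))
    return node                                  # leaf: symbol or number

tree = ('Mul', ('Add', 'x', 1), 2)
copy = identity_crawl(tree)
print(copy == tree, copy is not tree)
```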

Matthew Rocklin

unread,
Feb 27, 2012, 12:47:11 PM2/27/12
to sy...@googlegroups.com
The Theano project might be relevant in this discussion.

Theano is a python project that represents array computation as an abstract syntax tree of symbolic objects. It has a series of "optimizers" that walk over and change the graph on request. It's an interesting model and serves their purposes well.
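
A toy version of that optimizer-walk model — one pass that folds
constant subexpressions in a tuple-encoded graph (a sketch, not
Theano's API):

```python
import operator

OPS = {'Add': operator.add, 'Mul': operator.mul}

def fold_constants(node):
    """Rewrite the graph bottom-up, collapsing all-constant nodes."""
    if not isinstance(node, tuple):
        return node
    head, *args = node
    args = [fold_constants(a) for a in args]
    if all(isinstance(a, (int, float)) for a in args):
        return OPS[head](*args)      # every child is a constant: fold
    return (head, *args)             # otherwise keep the rewritten node

folded = fold_constants(('Mul', 'x', ('Add', 2, 3)))
print(folded)
```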

Sergiu Ivanov

unread,
Feb 27, 2012, 2:08:58 PM2/27/12
to sy...@googlegroups.com
On Sun, Feb 26, 2012 at 11:58 PM, Joachim Durchholz <j...@durchholz.org> wrote:
> On 26.02.2012 20:05, Sergiu Ivanov wrote:
>
>> As a functional programming beginner and fan, I can't but agree.  I
>> was actually quite happy when I heard for the first time that SymPy
>> had immutable core, because that allows for much clearer and better
>> structured code (at least I see it so).
>
>
> Hm... there's the question what an "immutable core" actually is. Is it the
> mathematical objects that the user sees, or is it an internal API for
> programmers?

I thought it referred to the fact that SymPy is immutable "in the
large"; i.e., side-effects are allowed on a small scale (some simple
flags, counters, etc.), but separate entities stay away from
modifications. At least, this is my (very basic) understanding of how
SymPy is organised.

> In the former case, I'd avoid mutability. It's just a conceptual hassle, it
> makes semantics-preserving transformations harder (as soon as references are
> involved, prohibitively hard for a project of SymPy's scale - you need alias
> analysis, and compiler teams spend person-decades on that kind of stuff).

Yes, that's precisely what I meant, but expressed with better
terminology and more detail :-)

> In the latter case, there's nothing wrong with it. Sure, coding just with
> immutable data is fun and far less error-prone, but Python is not built for
> that coding style; I suspect such code would be very clean but not very
> efficient.

Yes, I get it; I'd say what I meant when I joined the discussion was
something more similar to the former case of the two you describe.

Sergiu

Sergiu Ivanov

unread,
Feb 27, 2012, 2:15:36 PM2/27/12
to sy...@googlegroups.com
On Mon, Feb 27, 2012 at 4:35 AM, krastano...@gmail.com
<krastano...@gmail.com> wrote:
> I would like to point out that the immutability of the core need not
> change if we follow the propositions in this discussion.
>
> In my opinion the important point here is the implementation of
> abstract syntax trees for sympy. The minor (from mathematical point of
> view) details about whether it should be immutable tree or whether
> side effects and assignments should be allowed in the tree (hopefully
> not!) are secondary.

I see. And indeed, I think it would be great to do without side
effects in the trees, because, as Joachim said, this is very likely to
induce plenty of hardship.

> The use of ASTs seems like the right way to fix
> http://code.google.com/p/sympy/issues/detail?id=1941

Yes, can't but agree. ASTs seem to be the most generalised way of
achieving the desired functionality and it looks like not much effort
will be required to implement special cases.

> We can have identity crawler that does not change the tree (just
> returns a copy). The current crawlers (just what Add and Mul do, but
> much better abstracted). Simplification and canonization crawlers.
> Special crawler options specific to an object (Units subclass politely
> asking the crawler to be placed last). Now every time that we need
> some canonization we subclass Expr. It is not a scalable solution at
> all.

Crawlers seem to possess tremendous power; it would be great to have
them, obviously.

> Example:
> Special treatment of Units in Mul
> The subclassed Quantum Expr in the quantum module.
> The need for MatExpr, MatAdd, MatMul... It is a good solution for the
> current state of the code but it is ugly and it does not scale (what
> happens when I want both matrices and kets in my expression?)
>
> The existence of AST would make the implementation of abstract
> multilinear algebra or group theory much easier as there won't be any
> need for subclassing Expr. Just writing a crawler for the tree should
> be sufficient. It won't be less work, but the incompatibility between
> the subclasses of Expr will be evaded.

Indeed, crawlers seem to provide a relatively cheap solution to a wide
range of problems which are already arising in SymPy.

And yes, now I get the relationship between the main topic of the
thread and side-effects :-)

Sergiu
