What happens, at a high level, when you factor this expression?

Jeff Pickhardt

unread,
Jun 24, 2011, 4:42:25 AM6/24/11
to sy...@googlegroups.com
Hey guys,

I'm reading through the SymPy code and, well, it's somewhat overwhelming if you're new to the project because there's so much going on.  (That's a good thing too - it means it's robust!)

Can someone help explain how this works?

>>> from sympy import *
>>> x = Symbol("x")
>>> my_expression = sin(x)**2 + 2*sin(x) + 1
>>> my_expression.factor()
(1 + sin(x))**2
>>>

For instance, what data structures are created when I create my_expression, what happens when I factor it, etc.  A high-level walk-through would help.  I see there's stuff going on in polytools.py, and I think _symbolic_factor gets called.  It's just confusing to keep everything in my head when I don't yet have a high-level understanding of how SymPy expressions and whatnot actually work.

Also, the new quantum mechanics stuff looks really cool.  Wish I had had that a few years ago!

Thanks,
Jeff


Chris Smith

unread,
Jun 24, 2011, 5:11:30 AM6/24/11
to sy...@googlegroups.com
On Fri, Jun 24, 2011 at 12:42 AM, Jeff Pickhardt <pick...@gmail.com> wrote:
Hey guys,

I'm reading through the SymPy code and, well, it's somewhat overwhelming if you're new to the project because there's so much going on.  (That's a good thing too - it means it's robust!)

Can someone help explain how this works?

>>> from sympy import *

Imports many of SymPy's functions and classes from its various modules into the current namespace.


>>> x = Symbol("x")

Creates a SymPy Symbol named 'x' and assigns it to a Python variable named 'x'.

>>> my_expression = sin(x)**2 + 2*sin(x) + 1

A SymPy Expr is created: an Add whose three terms are a Pow, a Mul, and a One. It is assigned to a Python variable named my_expression.
 

>>> my_expression.factor()

This calls a method that factors the object associated with my_expression (the Expr created on the rhs of the equals sign). This is where it can get confusing, because what happens "under the hood" might involve converting the Expr to a Poly to do the work and then converting it back to an Expr. (integrate is an interesting routine in this respect because it does what I -- a layman here -- consider some pretty esoteric stuff to come up with the answer.)
 

(1 + sin(x))**2

This is the final Expr, a Pow whose base is an Add and whose exponent is an Integer.

>>>

For instance, what data structures are created when I create my_expression, what happens when I factor it, etc.  A high-level walk-through would help.  I see there's stuff going on in polytools.py, and I think _symbolic_factor gets called.  It's just confusing to keep everything in my head when I don't yet have a high-level understanding of how SymPy expressions and whatnot actually work.


You can get a feel for what you are working with by using args and srepr:

    >>> 1+2*sin(x)+sin(x)**2
    sin(x)**2 + 2*sin(x) + 1
    >>> eq=_
    >>> eq.args
    (1, 2*sin(x), sin(x)**2)
    >>> srepr(eq)
    "Add(Pow(Function('sin')(Symbol('x')), Integer(2)), Mul(Integer(2), Function('si
    n')(Symbol('x'))), Integer(1))"

Using the preorder_traversal, you can get the same information a little at a time:

    >>> from sympy.utilities.iterables import *
    >>> p = preorder_traversal(eq)
    >>> for i in p:
    ...  print i, type(i)
    ...
    sin(x)**2 + 2*sin(x) + 1 <class 'sympy.core.add.Add'>
    1 <class 'sympy.core.numbers.One'>
    2*sin(x) <class 'sympy.core.mul.Mul'>
    2 <class 'sympy.core.numbers.Integer'>
    sin(x) sin
    x <class 'sympy.core.symbol.Symbol'>
    sin(x)**2 <class 'sympy.core.power.Pow'>
    sin(x) sin
    x <class 'sympy.core.symbol.Symbol'>
    2 <class 'sympy.core.numbers.Integer'>
 
Hope that helps. Feel free to ask any other questions.

Chris

Aaron Meurer

unread,
Jun 24, 2011, 5:33:15 AM6/24/11
to sy...@googlegroups.com
Well, you picked a particularly tricky example because it uses both
the main core and the polys.

So let's start with my_expression (the core part). You already know
that x is defined to be a Symbol object (because you did this
yourself). You should also know that sin and cos are Function
objects, which you can think of as container objects that hold
arguments and which also have various mathematical relations defined
on them.

As you may know, Python lets you override the behavior of the
built-in operators *, +, /, -, etc. on your own objects. So all
SymPy objects have __mul__, __add__, etc. methods defined on them. When you
call x + y, this reduces to x.__add__(y). In SymPy, a.__add__(b) is
converted to Add(a, b). The same is true for __mul__ and Mul. So in
sin(x)**2 + 2*sin(x) + 1, sin(x) creates a sin object (when it is
called). This reduces to

sin(x).__pow__(2) + sin(x).__rmul__(2) + 1
Pow(sin(x), 2).__add__(Mul(2, sin(x))).__add__(1)
Add(Pow(sin(x), 2), Mul(2, sin(x)), 1)

which is what you will get if you call srepr(my_expression). Note
that 2*sin(x) actually calls __rmul__. That's because 2 (type int)
doesn't know how to multiply sin(x) (type sin), so sin(x)'s __rmul__
method is called. This brings up an important point. All Python
types are converted in this process to SymPy types, through the
sympify() function. So, for example, sin(x).__pow__(2) reduces to
sin(x).__pow__(sympify(2)), which results in
sin(x).__pow__(Integer(2)).
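
You can watch that conversion happen in an interactive session (just a sketch; the exact printing can vary between SymPy versions):

>>> from sympy import Symbol, Integer, sympify
>>> x = Symbol("x")
>>> type(sympify(2))
<class 'sympy.core.numbers.Integer'>
>>> (x + 2).args          # the plain int 2 was sympified on the way in
(2, x)
>>> (x + 2).args[0] == Integer(2)
True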

Now, Mul, Pow, and Add's __new__ methods contain logic to do automatic
simplifications like x + 2*x => 3*x or x*x => x**2. In this case, no
such simplifications needed to be applied.
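
A quick way to see that logic fire (or not) is a short session like this (output from a current SymPy):

>>> from sympy import Symbol, sin
>>> x = Symbol("x")
>>> x + 2*x                     # Add.flatten collects like terms automatically
3*x
>>> x*x                         # Mul.flatten combines repeated factors into a Pow
x**2
>>> sin(x)**2 + 2*sin(x) + 1    # nothing combines here, so the Add keeps three terms
sin(x)**2 + 2*sin(x) + 1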

All the args for any SymPy expression are stored in expr.args. So you
would have

>>> my_expression.args
(1, sin(x)**2, 2*sin(x))
>>> my_expression.args[1].args
(sin(x), 2)

If you want to see the code for all of this, you can look at the files
in sympy/core. The __op__ stuff is mostly in basic.py and expr.py.
The flattening routines are in add.py, pow.py, and mul.py. The code
for functions that sin is built off of is in function.py.

Now, to the factoring part. my_expression.factor() is a shortcut to
factor(my_expression). Because factoring is a polynomial algorithm,
the expression has to be converted to a Poly first. Poly is able to
represent polynomials with arbitrary symbolic "generators". In this
case, it determines that it should use sin(x) as a generator, so it
creates Poly(sin(x)**2 + 2*sin(x) + 1, sin(x)), which you can think of
as a wrapper around the polynomial y**2 + 2*y + 1, where y is set to
be sin(x) (for the purposes of Poly, it does not matter what the
generators are, other than that coefficients cannot contain symbols
from them, so you can think of it in this way).
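
You can see the generator detection directly (a sketch against a recent SymPy; Poly's printing has changed a bit over the years):

>>> from sympy import Poly, Symbol, sin
>>> x = Symbol("x")
>>> p = Poly(sin(x)**2 + 2*sin(x) + 1, sin(x))
>>> p.gens
(sin(x),)
>>> p.all_coeffs()    # the coefficients of the wrapped polynomial y**2 + 2*y + 1
[1, 2, 1]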

Then, it calls the factorization algorithm on Poly. If you are
interested in how this works, I suggest you read the code and the
papers referenced there. In this case, it is able to factor the
polynomial using the squarefree factorization algorithm, which is
actually not too difficult to understand. In the general case, it
uses a complicated multivariate factorization algorithm that factors
any multivariate polynomial into irreducibles.

Anyway, Poly.factor_list returns something like [(Poly(sin(x) + 1),
2)]. factor() converts this into a normal SymPy expression (also
called a Basic expression or Expr expression) by passing it to Mul and
Pow (something like Mul(*[Pow(b, e) for b, e in expr]) would do it, I
think).
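
At the expression level you can peek at that intermediate step with factor_list (a sketch; in current SymPy it returns a (coefficient, factor-list) pair, and the printing order of the factors may differ from the output above):

>>> from sympy import factor, factor_list, Symbol, sin
>>> x = Symbol("x")
>>> factor_list(sin(x)**2 + 2*sin(x) + 1)
(1, [(sin(x) + 1, 2)])
>>> factor(sin(x)**2 + 2*sin(x) + 1)    # recombined into an ordinary Expr
(sin(x) + 1)**2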

All the Poly stuff lives in sympy/polys. If you are interested, I can
explain a little how they work (internal representation, etc.). The
code for the Poly class lives in polytools.py, though the actual
factorization algorithms live in sqfreetools.py and factortools.py.

And one thing that I didn't mention (and maybe you didn't even think
of) is the printing. You do not use pretty printing, so the printing
is rather simple (just recursively print the objects using the proper
operators). If you are interested, you can look at the code in
sympy/printing.

Let me know if this made sense, and if there are bits that you still
would like to know about. Also, remember that SymPy is written in a
fairly modular way, so it's completely unnecessary to know how a
module works unless you want to work on that module specifically
(e.g., you don't need to know how the core works to write some
simplification algorithm, like in simplify.py).

Aaron Meurer


Jeff Pickhardt

unread,
Jun 24, 2011, 6:33:59 AM6/24/11
to sy...@googlegroups.com
Thanks guys, this is great stuff.  You might want to memorialize this on a wiki or something for newcomers to SymPy.

So a little bit about my background with Python: I'm a pretty experienced Python programmer, I understand operator overloading, and I see your @_sympifyit decorators all over the place (I'm guessing they're there to convert 2 to Integer(2), as Aaron explained).  In fact, I started writing my own Python symbolic math library years ago.

One follow-up question is:

Is there an advantage to having your own classes for Pow, Add, Mul, and other operators? Why didn't you just absorb those into classes for expressions, functions, etc., and have expression + expression return an expression?

Here's an example: x**2, instead of being Pow(x, 2), would be something like a
MonoTerm with self.units = {x: 1}, and calling .__pow__(2) on it would produce a MonoTerm with self.units = {x: 2}.

For example, digging up my old code, when I was making a symbolic math library, I started doing it this way:

import copy
import re

# Note: 'patterns' below is a module of regex fragments from my original
# project (patterns.number, patterns.unit); it isn't reproduced here.

class MonoTerm:
    """A MonoTerm is an instance of terms multiplied together, but without any sums or differences.  As an example, 2 could be a MonoTerm, as could 3 z, but 2 x + 3 y must be a PolyTerm.  MonoTerms can only be operated on by MonoTerms with the same dimensional units."""
    def __init__(self, string=''):
        # instance variables
        self.number=0
        self.units={}
        # initialize to the string, with the become method
        self.become(string)

    def become(self, string=''):
        """Handles the parsing of the MonoTerm into number and units."""
        if string=='':
            # nothing should happen; this is the "empty term"
            return
        m = re.search(r'^\s*(' + patterns.number + ')', string)
        if m:
            self.number = float(m.group(1))
        else:
            self.number = 1.0 # 'kg' == '1.0 kg'
        unitSearch = re.findall(r'(' + patterns.unit + r')\s*(?:(?:\^|\*\*)\s*(' + patterns.number + r'))?', string)
        for unit, exponent in unitSearch:
            if unit == '':
                continue
            if exponent == '':
                self.units[unit]=1.0
            else:
                self.units[unit]=float(exponent)
   
    def __add__(self, other):
        """Adds two MonoTerms together, provided they have the same units."""
        if self.units != other.units:
            # can only add MonoTerms that have the same units
            raise Exception('Cannot add two MonoTerms with incompatible units.  Use PolyTerms instead.')
        try:
            newTerm = MonoTerm()
            newTerm.number = self.number + other.number
            newTerm.units = self.units
            newTerm.prune()
            return newTerm
        except:
            return other+self
(etc)

class PolyTerm:
    """A PolyTerm is a term made up of the sum of two or more MonoTerms."""
    def __init__(self, string=''):
        # instance variables
        self.terms={}
        self.become(string)
   
    def become(self, string=''):
        # assumes the string is well-formed various terms split by +
        # Actually, for now, assume the string is really a MonoTerm... this will be updated later.
        newTerm = MonoTerm(string)
        self.terms[newTerm.getLabel()]=newTerm
   
    def __add__(self, other):
        """Adds two PolyTerms together."""
        newTerm = copy.deepcopy(self)
        for term in other.terms.values():
            if term.getLabel() in newTerm.terms:
                newTerm.terms[term.getLabel()] = newTerm.terms[term.getLabel()] + term
            else:
                newTerm.terms[term.getLabel()] = copy.deepcopy(term)
        newTerm.prune()
        return newTerm
(etc)

term1 = PolyTerm('1 x y')
term2 = PolyTerm('1 y z')
term3 = PolyTerm('2 x y^2 z')
term4 = PolyTerm('1 y^2')

print ((term1+term2)**2 - term3) / term4 # ((xy+yz)^2 - 2xy^2z)/y^2 == x^2 + z^2

Aaron Meurer

unread,
Jun 24, 2011, 6:55:57 AM6/24/11
to sy...@googlegroups.com
On Fri, Jun 24, 2011 at 12:33 AM, Jeff Pickhardt <pick...@gmail.com> wrote:
> Thanks guys, this is great stuff.  You might want to memorialize this on a
> wiki or something for newcomers to SymPy.

That's a good idea.

>
> So a little bit about my background with Python: I'm a pretty experienced
> Python programming, I understand operator overloading, I see your
> @_sympifyit decorators all over the place (I'm guessing they're to convert 2
> to Integer(2), as Aaron explained).  In fact, I started writing my own
> Python symbolics math library years ago.

OK. I just wanted to be sure to mention it, since to people who don't
know about operator overloading, it can be the most mysterious part of
the whole thing (how does it "know" to convert x + x into 2*x?)

>
> One follow-up question is:
>
> Is there an advantage to having your own classes for Pow, Add, Mul, and
> other operators? Why didn't you just absorb those in classes for
> expressions, functions, etc, and have an expression + expression return an
> expression?

It makes for much better object-oriented programming. Each of Add,
Mul, and Pow has methods that specifically handle the behavior of
that type.

Let's use differentiation as an example. diff(expr, x) is implemented
as expr._eval_derivative(x) (technically, this is not 100% true
because there's also fdiff, but let's suppose that it's just done this
way for simplicity). If you do it this way, it's a very simple
recursive algorithm. You define your base cases, and define

Add

    def _eval_derivative(self, x):
        return Add(*[i._eval_derivative(x) for i in self.args])

Mul

    def _eval_derivative(self, x):
        # product rule: differentiate one factor at a time and sum the results
        return Add(*[Mul(*(self.args[:i] +
                           (self.args[i]._eval_derivative(x),) +
                           self.args[i + 1:]))
                     for i in range(len(self.args))])

Pow

    def _eval_derivative(self, x):
        # I hope I have the rule correct here
        return (self.exp*self.base._eval_derivative(x)*self.base**(self.exp - 1)
                + log(self.base)*self.exp._eval_derivative(x)*self.base**self.exp)

(note, I didn't copy this from the actual SymPy code, so it might be a
little different, but it will be similar).
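
Either way, the recursion bottoms out at Symbols and numbers, so differentiating the original example just walks the tree (a quick check in a current session; printing order may vary):

>>> from sympy import diff, sin, Symbol
>>> x = Symbol("x")
>>> diff(sin(x)**2 + 2*sin(x) + 1, x)
2*sin(x)*cos(x) + 2*cos(x)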

This way, the code is all very simple, because each class only needs
to know how to deal with itself.

This also makes it much more modular. If you want to extend SymPy,
you just create your own class. You can make that class work with
diff() just by defining ._eval_derivative() on that class. With your
method, if someone wants to extend the code, they have to extend the
Expression class (which will be huge, btw).

Your method goes half way, because it has a separate "Add" class,
PolyTerm, but you are still keeping things like x and x**2 and x*y as
the same class, when the first should be a Symbol, the second a Pow,
and the third a Mul.

By the way, if you are interested in other ways to implement symbolics
in Python, you might look at the sympycore project, which is a
research project that tries to implement symbolics in the most
efficient way possible (and is often faster than SymPy as a result).
See http://code.google.com/p/sympycore/.

Aaron Meurer

Jeff Pickhardt

unread,
Jun 25, 2011, 3:56:43 AM6/25/11
to sy...@googlegroups.com
Yeah, I see the advantage in that.  That's really cool.

Here's another one.  I see the Derivative class.  But cos already has fdiff and an _eval_derivative from class inheritance.  (Not sure what the difference between the two methods is, but anyway)  So why is a Derivative class needed?  Couldn't you just make it a function, something like:

@sympify_it
def derivative(f, with_respect_to):
  return f.fdiff(with_respect_to)

The Derivative class has extra methods like _eval_nseries, but wouldn't those just get subsumed with whatever a derivative function returns?  For example, if you call fdiff on cos, it returns -sin, which should have a _eval_nseries method anyhow.  What's the point of having a separate Derivative class?

Mateusz Paprocki

unread,
Jun 25, 2011, 4:07:37 AM6/25/11
to sy...@googlegroups.com
Hi,

On 25 June 2011 05:56, Jeff Pickhardt <pick...@gmail.com> wrote:
Yeah, I see the advantage in that.  That's really cool.

Here's another one.  I see the Derivative class.  But cos already has fdiff and an _eval_derivative from class inheritance.  (Not sure what the difference between the two methods is, but anyway)  So why is a Derivative class needed?  Couldn't you just make it a function, something like:

@sympify_it
def derivative(f, with_respect_to):
  return f.fdiff(with_respect_to)

The Derivative class has extra methods like _eval_nseries, but wouldn't those just get subsumed with whatever a derivative function returns?  For example, if you call fdiff on cos, it returns -sin, which should have a _eval_nseries method anyhow.  What's the point of having a separate Derivative class?

There are at least two reasons for this. One is that sometimes it's useful to have an unevaluated entity, e.g.:

In [1]: diff(sin(x), x)
Out[1]: cos(x)

In [2]: Derivative(sin(x), x)
Out[2]: 
d         
──(sin(x))
dx        

(for presentation purposes, for example). You can use the .doit() method to perform the actual differentiation:

In [3]: _.doit()
Out[3]: cos(x)

The same applies to the integrate()/Integral and limit()/Limit pairs (and maybe others). The other reason is the possibility of representing derivatives of purely symbolic entities, e.g.:

In [4]: diff(f(x), x)
Out[4]: 
d       
──(f(x))
dx      

In [5]: type(_)
Out[5]: <class 'sympy.core.function.Derivative'>

We don't know what the derivative of f is, so we have to somehow represent it. You will encounter symbolic representations of this kind very often in SymPy. Nice examples, though of a slightly different kind, are roots(f)[i] vs. RootOf(f, i), or sum(roots(f, multiple=True)) vs. RootSum(f).
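
Integral follows the same pattern, for example (a small sketch using the plain str printer rather than the pretty printer above):

>>> from sympy import Integral, Symbol, sin
>>> x = Symbol("x")
>>> expr = Integral(sin(x), x)   # stays unevaluated until you ask for the result
>>> expr
Integral(sin(x), x)
>>> expr.doit()
-cos(x)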

Mateusz

Aaron Meurer

unread,
Jun 25, 2011, 4:16:39 AM6/25/11
to sy...@googlegroups.com
fdiff makes it easier to define the derivative for functions. If
f'(x) = g(x), then f.fdiff will return g. Then
Function._eval_derivative will automatically use that to apply the
chain rule when differentiating an expression containing f. This
avoids code duplication of the chain rule across every single
function.
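
As a rough illustration of what that buys you (the class name myf is made up for this sketch, and its "derivative" is chosen arbitrarily):

>>> from sympy import Function, Symbol, cos, diff
>>> x = Symbol("x")
>>> class myf(Function):
...     def fdiff(self, argindex=1):
...         # pretend, for the sketch, that the derivative of myf is cos
...         return cos(self.args[0])
...
>>> diff(myf(x**2), x)    # the chain rule comes for free from Function
2*x*cos(x**2)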

Aaron Meurer

Jeff Pickhardt

unread,
Jun 25, 2011, 10:45:15 PM6/25/11
to sy...@googlegroups.com
OK, that makes sense. So let's go back to this idea of having Add, Mul, and Pow classes.

How do Add and Mul figure out how to intelligently combine expressions, such as Add(x, x) -> Mul(2, x)?  For example, here's a sample REPL session:

>>> from sympy import *
>>> x = Symbol("x")
>>> Add(x, sin(x), x, sin(x), 3)
3 + 2*x + 2*sin(x)
>>> Add(x, sin(x), x, sin(x), 3).is_Add
True
>>> Add(x, sin(x), x, sin(x), 3).is_Mul
False
>>> Add(x, sin(x), x, sin(x), 3).args
(3, 2*x, 2*sin(x))
>>> Add(x, sin(x), x, sin(x), 3).args[1]
2*x
>>> Add(x, sin(x), x, sin(x), 3).args[1].is_Mul
True

Somehow Add must be figuring out that sin(x) and sin(x) are the same.  Maybe that's happening in flatten?  It's kind of magical.

Aaron Meurer

unread,
Jun 26, 2011, 5:16:18 AM6/26/11
to sy...@googlegroups.com
The trick is to use dictionaries, whose keys are the terms and whose
values are the coefficients. This is why it's essential that Basic objects
be immutable. Consider something like

Add(x, 2*x, y, -y)

The second and fourth arguments are Muls. Mul always puts any
Numerical coefficient in args[0] (note I use Numerical with an
uppercase N, meaning an actual number like 2 or 3/4 or 2.1; something
that would combine with another Number to create a new Number through
addition or multiplication). To simplify the below logic, suppose
that coeff_term(expr) returns (expr.args[0], Mul(*expr.args[1:])) if
expr is a Mul and expr.args[0] is a Number, and (1, expr) otherwise.
In other words, the first item is the Numerical coefficient and the
second argument is the rest. The logic in Add.flatten is something
like

termdict = {}
for arg in args:
    coeff, term = coeff_term(arg)
    if term in termdict:
        termdict[term] += coeff
    else:
        # add term to termdict
        termdict[term] = coeff

Then follow through what we get for each term in our example

x:
    coeff == 1
    term == x
    termdict == {x:1}
2*x:
    coeff == 2
    term == x
    termdict == {x:3}
y:
    coeff == 1
    term == y
    termdict == {x:3, y:1}
-y:
    coeff == -1
    term == y
    termdict == {x:3, y:0}

Then, at the end, we (efficiently) multiply each term in the
dictionary by its coefficient, and those products are our Add.args. Note that only in this last
step is the 0*y actually reduced to nothing (again, speed is a concern
here, so this is all done efficiently; see Add.flatten for how it's
actually done).
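
Here is a toy version of that dictionary pass, using the public as_coeff_Mul method in place of the hypothetical coeff_term above (just a sketch of the idea; the real Add.flatten handles many more cases and is heavily optimized):

>>> from collections import defaultdict
>>> from sympy import Add, Symbol
>>> x, y = Symbol("x"), Symbol("y")
>>> def toy_add_flatten(*args):
...     termdict = defaultdict(int)
...     for arg in args:
...         coeff, term = arg.as_coeff_Mul()   # (Number coefficient, remaining part)
...         termdict[term] += coeff
...     # drop zero coefficients and rebuild the surviving terms
...     return Add(*[coeff*term for term, coeff in termdict.items() if coeff != 0])
...
>>> toy_add_flatten(x, 2*x, y, -y)
3*x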

For Mul, it's a little more complicated, but actually not that much.
The problem here is that we combine not only x*x => x**2, but also
x**y*x**y => x**(2*y). But it's actually not too difficult. Instead
of a dictionary of term:coeff items, we have a dictionary of
base:expdict items, where expdict is itself a dictionary of term:coeff
items that works just like Add. So for example, Mul(x**2, y,
x**(2*y)) becomes
{x:{1:2, y:2}, y:{1:1}}. Note that each item in the inner dictionary
represents a separate base. This leads to the rule for automatically
combining exponents in a Mul: only combine them if they have the same
term from as_coeff_terms (a Mul-specific analogue of the coeff_term
rule I described above). Note that we used to just combine
all exponents automatically unequivocally, but this was a bad behavior
because it was too restrictive (it was impossible to represent
exp(x)*exp(y) as a Mul, for example).
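
You can see that rule in action (a sketch; current SymPy behaves this way, as described above):

>>> from sympy import Symbol, exp
>>> x, y = Symbol("x"), Symbol("y")
>>> x**y * x**y          # same base, same non-numeric part of the exponent: combined
x**(2*y)
>>> exp(x)*exp(y)        # kept as a Mul rather than forced into exp(x + y)
exp(x)*exp(y)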

Note that although we convert the Add dict into Muls and the Mul dict
into Pows, this is done for historical/compatibility/hashing reasons. I
think it should be possible to build a core that just uses the dict
representation without .args. The main problem is how to keep things
hashable so that you can put them as dict keys.

By the way, Mul.flatten is way more complicated than Add.flatten, in
part because it actually does more than this (for example, it
automatically reduces things like sqrt(6)*sqrt(2) to 2*sqrt(3)).
Also, both methods are currently cluttered with special cases for
Order, nan, and the infinities, and the code is not always the easiest
to read because it's optimized for efficiency (which I want to stress
is *very* important at this level). Actually, Add.flatten is fairly
readable, and I would definitely start there if you want to understand
this stuff. Mul.flatten might require some print statements or a run
through with a debugger to fully understand everything that is going
on.

Aaron Meurer
