Symbolic trigonometric optimizations

86 views
Skip to first unread message

Jari-Pekka Ikonen

unread,
Feb 16, 2016, 2:04:39 PM2/16/16
to sympy
Some trigonometric expressions can easily and quickly be simplified without ever calculating some of the functions in it just by using the known properties of the functions in the expression.

For example:

sin(2*pi)

is 0. So:

sin(pi*factorial(847937526693730893984732849857349))

is also 0. This does not require calculation of the factorial, but the property knowledge, that the result of the factorial is an even integer.

There are several other examples of expressions, like:

tan(pi*(1342^(87236487262873^(8732498237693269832+3)+5)-1))

is also 0. Several others:

cos(pi*(factorial(8629264243264^64862423642638763847)+1/2))

is 0.

Could this be implemented in sympy? So many other examples would be much faster to calculate.

Aaron Meurer

unread,
Feb 16, 2016, 2:39:53 PM2/16/16
to sy...@googlegroups.com
The problem here isn't so much to do with trig identities as the big
numbers. The issue is that things like
factorial(847937526693730893984732849857349) and
1342**(87236487262873**(8732498237693269832+3)+5)-1) evaluate
automatically, which causes Python to try to compute them until the
memory fills up and it crashes.

The way I would fix this is to create a special BigInteger object,
which would wrap large integers and avoid explicit computation. For
example, BigInteger(10)**BigInteger(10)**BigInteger(100) would remain
unevaluated. It would then use some algorithms and the assumptions
system to compute facts. So something like
factorial(BigInteger(847937526693730893984732849857349)).is_integer
would be True, which would be enough for
sin(pi*factorial(BigInteger(847937526693730893984732849857349))) to
simplify. There are some cool things that you could do with this,
like (2**BigInteger(74207281) - 1).is_prime. For anyone interested,
there's probably enough cool stuff that could be done here for a GSoC
project.

(By the way, I had an issue open in the issue tracker a while ago
about this, but I can't find it)

Aaron Meurer
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at https://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/218ff008-9f3a-479e-8b4e-bca49d646294%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Aaron Meurer

unread,
Feb 16, 2016, 3:57:39 PM2/16/16
to sy...@googlegroups.com
I found the issue https://github.com/sympy/sympy/issues/6835.

Aaron Meurer

Ondřej Čertík

unread,
Feb 16, 2016, 5:15:50 PM2/16/16
to sympy
On Tue, Feb 16, 2016 at 12:39 PM, Aaron Meurer <asme...@gmail.com> wrote:
> The problem here isn't so much to do with trig identities as the big
> numbers. The issue is that things like
> factorial(847937526693730893984732849857349) and
> 1342**(87236487262873**(8732498237693269832+3)+5)-1) evaluate
> automatically, which causes Python to try to compute them until the
> memory fills up and it crashes.
>
> The way I would fix this is to create a special BigInteger object,
> which would wrap large integers and avoid explicit computation. For
> example, BigInteger(10)**BigInteger(10)**BigInteger(100) would remain
> unevaluated. It would then use some algorithms and the assumptions
> system to compute facts. So something like
> factorial(BigInteger(847937526693730893984732849857349)).is_integer
> would be True, which would be enough for
> sin(pi*factorial(BigInteger(847937526693730893984732849857349))) to
> simplify. There are some cool things that you could do with this,
> like (2**BigInteger(74207281) - 1).is_prime. For anyone interested,
> there's probably enough cool stuff that could be done here for a GSoC
> project.
>
> (By the way, I had an issue open in the issue tracker a while ago
> about this, but I can't find it)

Another point is that SymPy should not do automatic simplification of
these large things, because the code to do that is not completely
trivial and always fast. However, there should be a function in sympy
that can do these simplifications when called explicitly by the user.

Ondrej
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAKgW%3D6JTFaC9O08ymvq0tLb1p-1Lz2iuWFduCAL-ZSPyby8k0w%40mail.gmail.com.

Aaron Meurer

unread,
Feb 16, 2016, 5:59:56 PM2/16/16
to sy...@googlegroups.com
That's true, but I still think it's a good idea to have this. Take for
instance the googolplex, 10**10**100. You can represent this already
using Pow(10, Pow(10, 100, evaluate=False), evaluate=False). But there
are two issues with this. First is that evaluate=False is very
fragile. Any function that tries to rebuild the expression will cause
it to evaluate. And even if you fixed that issue some other function
might try to take the expression apart and use it to build some other
expressions. That's why I think you need a BigNumber wrapper for the
numbers, rather than some kind of hold property on the operators.
With BigNumber(10)**BigNumber(10)**BigNumber(100), no matter what
function you pass this to it won't accidentally try evaluating it,
because it will basically look the same as n**n**m for n, m =
symbols('n m', integer=True).

Second, you build these large numbers out of smaller numbers. Python
can represent a googol just fine

In [25]: 10**100
Out[25]: 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

it's the googolplex that you don't want to compute. But you don't want
to represent it as 10**
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000,
you want to represent it as 10**10**100. So even if Pow is smart
enough to not evaluate Pow(10,
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000),
it would still evaluate Pow(10, 100), so Pow(10, Pow(10, 100)) would
reasonably evaluate to Pow(10,
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000),
when you really want Pow(10, Pow(10, 100)). The same thing happens for
factorial.

However, I definitely am +1 for functions like factorial automatically
not trying to evaluate if the argument is above some reasonably large
value. My concern would just be with picking the value. Some people
might actually want to sit and wait to compute some large factorial.

Aaron Meurer
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CADDwiVAA1PsNQX9W%3DO%3DsPXgdq%3DFJ0cSEeqhsWATGMYoiFmfcgg%40mail.gmail.com.

Francesco Bonazzi

unread,
Feb 17, 2016, 1:38:37 PM2/17/16
to sympy


On Tuesday, 16 February 2016 23:59:56 UTC+1, Aaron Meurer wrote:
Take for instance the googolplex, 10**10**100. You can represent this already
using Pow(10, Pow(10, 100, evaluate=False), evaluate=False). But there
are two issues with this. First is that evaluate=False is very
fragile. Any function that tries to rebuild the expression will cause
it to evaluate.

What about introducing a persistent container for unevaluated objects? It would define .func( ) so as to never evaluate its contained expression.

Aaron Meurer

unread,
Feb 17, 2016, 1:45:57 PM2/17/16
to sy...@googlegroups.com
But func isn't the only way an evaluation might happen. An algorithm
might try to pull apart the expression and use the terms in different
ways (for example, an algorithm that does the transformation a**x*a**y
-> a**(x + y) extracts the parts from the two powers and creates a new
one). The only way to prevent this in general is to hide the fact that
it's a Pow, but if you do that, you lose the ability to perform some
computations.

By wrapping the numbers, you essentially make n**n**m, where n and m
are symbolic, except n and m magically have the correct assumptions
for n = 10 and m = 100.

Aaron Meurer

>
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at https://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/a3ef333b-5b0a-4d0b-96fb-597b8ae14fc0%40googlegroups.com.

Francesco Bonazzi

unread,
Feb 17, 2016, 2:09:57 PM2/17/16
to sympy


On Wednesday, 17 February 2016 19:45:57 UTC+1, Aaron Meurer wrote:

But func isn't the only way an evaluation might happen. An algorithm
might try to pull apart the expression and use the terms in different
ways (for example, an algorithm that does the transformation a**x*a**y
-> a**(x + y) extracts the parts from the two powers and creates a new
one). The only way to prevent this in general is to hide the fact that
it's a Pow, but if you do that, you lose the ability to perform some
computations.

Yes, you may seal the expression tree inside a held object.

In the case of Pow, the unevaluated expression may have the arguments that you would generally pass to Pow, and maybe even inherit Pow or have a __getattr__ that calls Pow back.

It should be thought of a bit, but I believe this could be feasible.

Francesco Bonazzi

unread,
Feb 17, 2016, 3:25:18 PM2/17/16
to sympy
As an example:

class Unevaluated(Expr):
       
def __new__(cls, wrapped_type, *args, **kwargs):
                obj
= Expr.__new__(cls, *args, **kwargs)
                obj
._wrapped_type = wrapped_type
                kwargs
["evaluate"] = False
                obj
._wrapped_object = wrapped_type(*args, **kwargs)
               
return obj

       
def func(self, *args, **kwargs):
               
return Unevaluated(self._wrapped_type, *args, **kwargs)

       
def __getattr__(self, name):
               
return getattr(self._wrapped_type, name)

       
def __str__(self):
               
return self._wrapped_object.__str__()

       
def __repr__(self):
               
return self._wrapped_object.__repr__()

       
def evaluate(self):
               
return self._wrapped_object.func(*self._wrapped_object.args)


So you would call Unevaluated(Pow, 2, 3) to represent 2**3 in the unevaluated form.

Aaron Meurer

unread,
Feb 17, 2016, 4:33:18 PM2/17/16
to sy...@googlegroups.com
I think this object would be useful, but not necessarily for the sort
of things suggested in this thread.

This would be useful for most of the common use-cases today of people
using evaluate=False, which usually amounts to making an expression
look a specific way, or keeping some numbers from combining. So for
instance if you are using SymPy to generate some step by step algebra
instructions and you want to be able to write something like x**2 +
2*x + 3 - 1 + 1 (say, for some completing the square instructions),
then this is useful. Given how prevalent questions about this sort of
thing are here and on StackOverflow, I think we definitely need
something like this, especially since evaluate=False has so many
issues.

However, for the idea being discussed here, the point is not just
about keeping numbers unevaluated but also computing with them. A much
more trivial way to keep numbers "unevaluated" is to use symbols with
numeric names, like Symbol('10')**Symbol('10')**Symbol('100'). That
also won't evaluate.

Say you want to use SymPy to try to figure out, for instance if
10**10**100 > E**E**E**E**E. A good start would be to take logs of
both sides, since log lets you pull out exponents, and it is
monotonic. You need some way of representing those numbers that works
with SymPy functions (like log), but absolutely never tries to
compute. If some function accidentally calls evalf on the wrong
subexpression it will raise an exception (best case) or hang the
computer (worst case).

Just using Pow(10, Pow(10, 100, evaluate=False), evaluate=False) is
bad because evaluate=False expressions are so fragile. Your
Unevaluated won't ever evaluate, but also SymPy functions won't know
what to do with it (e.g., log(Unevaluated(Pow, 2, 3)) won't simplify
to 3*log(2), even if you pass it through simplification functions like
expand_log(), because they won't recognize the Unevaluated as a Pow.
You would need to modify them. Conversely, if we used some
metaprogramming magic to make Unevaluated look like a Pow, then
existing functions might do some manipulations to it which would
effectively evaluate it, or at the very least, try to numerically
evaluate it, which is a no-no for our big number examples. Finally,
log(Symbol('10')**Symbol('10')**Symbol('100')) is no good because the
symbols don't have the right assumptions on them. Symbol('10',
even=True, positive=True) would get you closer.

My idea is to have a BigNumber wrapper, which pulls assumptions from
its arguments automatically. BigNumber(10).is_positive would be True.
But importantly, BigNumber(10).is_number would be False, and
BigNumber(10).evalf() would be a noop. There could also be a
big_number_simp() which would cautiously go through an expression and
remove BigNumber wrappers when they are no longer needed. For
instance, if you simplify
log(BigNumber(10)**BigNumber(10)**BigNumber(100)) to
BigNumber(10)**BigNumber(100)*log(BigNumber(10)) all the BigNumbers
can be removed (and 10**100 evaluated, since it isn't that big).

So, in short, I think both ideas are useful, but for different applications.

Aaron Meurer
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at https://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/00f00329-a754-40dd-a4d3-d8e3be3a156a%40googlegroups.com.

Francesco Bonazzi

unread,
Feb 17, 2016, 5:01:50 PM2/17/16
to sympy


On Wednesday, 17 February 2016 22:33:18 UTC+1, Aaron Meurer wrote:

My idea is to have a BigNumber wrapper, which pulls assumptions from
its arguments automatically. BigNumber(10).is_positive would be True.
But importantly, BigNumber(10).is_number would be False, and
BigNumber(10).evalf() would be a noop. There could also be a
big_number_simp() which would cautiously go through an expression and
remove BigNumber wrappers when they are no longer needed.

I see. May I suggest to change its name, "Big" seems to imply the number being large, while BigNumber(10) is quite small. What about SymbolicNumber or NumberSymbol?

Aaron Meurer

unread,
Feb 17, 2016, 6:32:08 PM2/17/16
to sy...@googlegroups.com
I agree. Either of your suggestions sounds better. The number itself isn't large, but it will likely be used in functions or combined in a way that would create a large number. 

Aaron Meurer 

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at https://groups.google.com/group/sympy.
Reply all
Reply to author
Forward
0 new messages