cancellation in sqrt(a * x**2) / (b * x), with x being a NUMBER

10 views
Skip to first unread message

klmn

unread,
Jul 27, 2010, 8:20:44 AM7/27/10
to sympy
Dear all,

I am trying to calculate Gaunt coefficients (integrals of 3 spherical
harmonics) for subsequent storing and have problems with simplifying
the results.

1) Consider a simple example. A relatively short ratio does simplify
(10 is cancelled):
>>> sympy.Rational(1,30)*sympy.sqrt(1000000100)
10000001**(1/2)/3

whereas if the number under sqrt is longer than some critical length,
it doesn't simplify (10 is not cancelled):
>>> sympy.Rational(1,30)*sympy.sqrt(10000000100)
10000000100**(1/2)/30

Question: how can I tweak this magic critical length?

2) here is a real example:
I got this result from sympy:
7503204762339254285241591792888384061440000000000000**(1/2)/
614846149298305892352000000
whereas I know it should be
91*36890**(1/2)/124062
(numerically these are identical)

…I would expect at least the zeroes got killed.

Could you please direct me towards the right simplification path?
Googling around didn’t help much.

Thanks in advance!

Ondrej Certik

unread,
Jul 27, 2010, 12:40:17 PM7/27/10
to sy...@googlegroups.com
Hi Kimn,

Right. This is implemented in sympy/core/numbers.py, line 1013:

# The following is an algorithm where we collect perfect roots
# from the factors of base
if b > 4294967296:
# Prevent from factorizing too big integers
return None

and the follow up code. You can trivially remove those lines and it
should start working for your case.

However, I think there is a reason for this. Would you help us
investigate what Mathematica does and if they factor everything
always, or only sometimes (and if so, how you, as a user, can get the
simplified answer when needed)?

We should probably revisit the above code.

Many thanks,
Ondrej

Aaron S. Meurer

unread,
Jul 27, 2010, 5:18:14 PM7/27/10
to sy...@googlegroups.com

The reason is that factorint() takes exponential time in the worst case on the size of the input. That means that if you happen to enter sqrt(somewhat_large_semiprime), where somewhat_large_semiprime is a semiprime perhaps no more than 20 digits long (a semiprime is the product of two primes of roughly the same length), it could hang for a very long time. This is no fault of SymPy's. No one knows how to efficiently factor large integers. See http://en.wikipedia.org/wiki/Integer_factorization. In fact, the fact that this problem is so hard to solve lies at the core of the security of the popular RSA encryption algorithm.

Now I do not know if the problem can be optimized for what sqrt does, namely, given a specific root, compute all factors of the integer that have multiplicity a multiple of that root (e.g., sqrt(2**2 * 3**3 * 5**4) => 2 * 5**2 * sqrt(3**3)). Does anyone know the answer to this?

Either way, I think we should give an option to override the limit in sqrt and Pow, so if someone wants it to reduce, he can do it, but he should know that it could cause things to not terminate in a long time (or effectively never), if he gives it a bad input. Alternately, we should have a specialized simplify function that handles it, regardless of any limit. It would be like powsimp(), except I would want to put it in a separate function because of the time problems. Maybe ipowsimp().

Aaron Meurer

klmn

unread,
Jul 28, 2010, 8:50:24 AM7/28/10
to sympy
Aaron, your explanations sound reasonable BUT
following the Ondrej's suggestion I've tried the case in Mathematica
and !!!immediately!!! got these results:
______________________________
>>> Simplify (7503204762339254285241591792888384061440000000000000^(1/2)/614846149298305892352000000)
: 91(595/62)^(1/2)/2001

>>> Simplify (1000000000000000000000000000000000000000000000000000000000000000000000000000000000100^(1/2)/30)
:
10000000000000000000000000000000000000000000000000000000000000000000000000000000001^(1/2)/
3
______________________________

After I have removed the limit cited by Ondrej, I also get an
immediate result for the 1st example:
>>>sympy.sqrt(7503204762339254285241591792888384061440000000000000)/614846149298305892352000000

91*36890**(1/2)/124062

but not for the 2nd one where the task in sympy will run forever.

I have to admit that Mathematica behaves in the way you expect it and
Sympy, in contrast, either silently skips the simplification or hangs.

klmn

unread,
Jul 28, 2010, 8:58:07 AM7/28/10
to sympy
Ondrej and all,

Here comes another related question.

My big numbers under sqrt() in the numerator are calculated as
products of several factorials. So these numbers ARE ALREADY factored.
I calculate them symbolically. Is it possible to retain this
factorization for subsequent simplification?

Regards,
Konstantin

Ondrej Certik

unread,
Jul 28, 2010, 1:15:13 PM7/28/10
to sy...@googlegroups.com

Yes, just apply the sqrt to each of them separately.

Let me know if it works.

Ondrej

Ondrej Certik

unread,
Jul 28, 2010, 1:16:25 PM7/28/10
to sy...@googlegroups.com

Wait a minute. What happens if you don't call simplify in mathematica?
Does mathematica automatically simplifies such things?

Ondrej

Aaron S. Meurer

unread,
Jul 28, 2010, 1:56:23 PM7/28/10
to sy...@googlegroups.com

On Jul 28, 2010, at 11:16 AM, Ondrej Certik wrote:

> On Wed, Jul 28, 2010 at 5:50 AM, klmn <kklem...@cells.es> wrote:
>> Aaron, your explanations sound reasonable BUT
>> following the Ondrej's suggestion I've tried the case in Mathematica
>> and !!!immediately!!! got these results:

Well, factoring isn't *always* slow, just sometimes. So factorint(7503204762339254285241591792888384061440000000000000) returns instantly in sympy too. It's only in cleverly designed numbers that it shows up.

Also, I see the limit option to factorint. I think we should use that instead of a raw numerical limit so that we always get the small factors, and only don't try finding the big ones.

Aaron Meurer

>> ______________________________
>>>>> Simplify (7503204762339254285241591792888384061440000000000000^(1/2)/614846149298305892352000000)
>> : 91(595/62)^(1/2)/2001
>>
>>>>> Simplify (1000000000000000000000000000000000000000000000000000000000000000000000000000000000100^(1/2)/30)
>> :
>> 10000000000000000000000000000000000000000000000000000000000000000000000000000000001^(1/2)/
>> 3
>> ______________________________
>>
>> After I have removed the limit cited by Ondrej, I also get an
>> immediate result for the 1st example:
>>>>> sympy.sqrt(7503204762339254285241591792888384061440000000000000)/614846149298305892352000000
>>
>> 91*36890**(1/2)/124062
>>
>> but not for the 2nd one where the task in sympy will run forever.
>>
>> I have to admit that Mathematica behaves in the way you expect it and
>> Sympy, in contrast, either silently skips the simplification or hangs.
>
> Wait a minute. What happens if you don't call simplify in mathematica?
> Does mathematica automatically simplifies such things?
>
> Ondrej


Yes. Unfortunately, one of the downsides to using Python is that we cannot prevent Python from automatically combining numbers when you multiply them. So if you do

sqrt(34473824732948*3784372438924738294)

by the time sqrt sees it, it just sees 130461792183690387558425339110712. So Ondrej's way probably the only workaround.

Aaron Meurer

klmn

unread,
Jul 28, 2010, 1:57:45 PM7/28/10
to sympy
> Yes, just apply the sqrt to each of them separately.
Sorry, I don't understand this. Let me reformulate the question.
I have a big number which was calculated by several multiplications. I
want to take sqrt and divide by another number (factorization of this
one is unknown). Then I want to simplify. The question was whether the
fact that the number under sqrt was factored could be helpful somehow.
I don know how sympy is implemented but I assume that the answer is
negative because that number is simply compared with 4294967296 (in
sympy/core/numbers.py) without analyzing its history.

klmn

unread,
Jul 28, 2010, 1:59:17 PM7/28/10
to sympy
> Wait a minute. What happens if you don't call simplify in mathematica?
Then I get a decimal number instead of symbolic fraction and sqrt.

Ondrej Certik

unread,
Jul 28, 2010, 1:59:58 PM7/28/10
to sy...@googlegroups.com

Aaron has answered this in the other email.

My suggestion is to apply "sqrt" on each part, before you assemble the
big number. You have to do it yourself.

Ondrej

Ondrej Certik

unread,
Jul 28, 2010, 2:01:14 PM7/28/10
to sy...@googlegroups.com
On Wed, Jul 28, 2010 at 10:59 AM, klmn <kklem...@cells.es> wrote:
>> Wait a minute. What happens if you don't call simplify in mathematica?
> Then I get a decimal number instead of symbolic fraction and sqrt.

So if you do this:

7503204762339254285241591792888384061440000000000000^(1/2)/614846149298305892352000000

it doesn't get simplified?

Ondrej

klmn

unread,
Jul 30, 2010, 4:47:44 AM7/30/10
to sympy
The Aaron's suggestion:...
>Also, I see the limit option to factorint. I think we should use that instead of a raw numerical
>limit so that we always get the small factors, and only don't try finding the big ones.
... is great. This would solve the 1st example in my 1st message
(above).

> So if you do this:
> 7503204762339254285241591792888384061440000000000000^(1/2)/6148461492983058­92352000000
> it doesn't get simplified?

Oops, yes, it does. I've never used Mathematica before and in my first
trials I put ^(0.5) and then it returned a number. When I put ^(1/2),
it simplifies symbolically.

I have played with yet bigger Gaunt coefficients and in all trials the
simplification went very fast. When I do separate sqrt to the 9
factors there or just one sqrt to the product, the execution speeds
are the same (within the statistical variation).

So the only critical change to sympy was to comment 'b > 4294967296'
out. Thanks for the good advice!

Ondrej Certik

unread,
Jul 30, 2010, 2:03:10 PM7/30/10
to sy...@googlegroups.com
On Fri, Jul 30, 2010 at 1:47 AM, klmn <kklem...@cells.es> wrote:
> The Aaron's suggestion:...
>>Also, I see the limit option to factorint.  I think we should use that instead of a raw numerical
>>limit so that we always get the small factors, and only don't try finding the big ones.
> ... is great. This would solve the 1st example in my 1st message
> (above).
>
>> So if you do this:
>> 7503204762339254285241591792888384061440000000000000^(1/2)/6148461492983058­92352000000
>> it doesn't get simplified?
>
> Oops, yes, it does. I've never used Mathematica before and in my first
> trials I put ^(0.5) and then it returned a number. When I put ^(1/2),
> it simplifies symbolically.

Ok, so Mathematica always simplifies such things?

If so, maybe we can also increase the constant in power.py.

>
> I have played with yet bigger Gaunt coefficients and in all trials the
> simplification went very fast. When I do separate sqrt to the 9
> factors there or just one sqrt to the product, the execution speeds
> are the same (within the statistical variation).
>
> So the only critical change to sympy was to comment 'b > 4294967296'
> out. Thanks for the good advice!

Yes, that's a workaround, but not a solution, because as Aaron has
explained, in principle, this can hang.

Ondrej

Aaron S. Meurer

unread,
Jul 30, 2010, 5:13:09 PM7/30/10
to sy...@googlegroups.com

The solution is just as I said, to use factorint() with the limit kwarg as per issue 2003. Even before that issue is fixed, it would only be one or two lines of code to manually call factorint() on your number and use that to compute the square root.

I wonder what mathematica does when you pass it a large semiprime. So what does it do if you take the sqrt() of the bottom two numbers?

In [1]: nextprime(100000000000000000000000)
Out[1]: 100000000000000000000117

In [2]: nextprime(100000000000000000000117)
Out[2]: 100000000000000000000157

In [3]: 100000000000000000000117**2*100000000000000000000157**2
Out[3]: 100000000000000000000548000000000000000001118140000000000000001006621200000000000000337420161

In [4]: 100000000000000000000117**2*100000000000000000000157
Out[4]: 1000000000000000000003910000000000000000005042700000000000000002149173

I'm guessing it will simplify [3], because it is a perfect square, but it might not [4], unless their limit is larger than that, or if there really is an efficient algorithm for computing the square part of an integer. (btw, give it the numbers from Out, not In, so it doesn't have an advantage of seeing a factorization).

Aaron Meurer

Aaron S. Meurer

unread,
Jul 30, 2010, 5:16:16 PM7/30/10
to sy...@googlegroups.com

I don't own Mathematica, but here's what Maple 12 does:

> sqrt(100000000000000000000548000000000000000001118140000000000000001006621200000000000000337420161);
10000000000000000000027400000000000000000018369
> sqrt(1000000000000000000003910000000000000000005042700000000000000002149173);
(1/2)
1000000000000000000003910000000000000000005042700000000000000002149173

So it evaluates the first one, but not the second, just as I predicted.

Aaron Meurer

Ondrej Certik

unread,
Jul 30, 2010, 5:17:35 PM7/30/10
to sy...@googlegroups.com

I did that myself:


In[1]:= Sqrt[1000000000000000000003910000000000000000005042700000000000000002149173]

Out[1]= Sqrt[1000000000000000000003910000000000000000005042700000000000000002\

> 149173]

In[2]:= Sqrt[20]

Out[2]= 2 Sqrt[5]


Apparently, unless I am doing something stupid (quite probably here),
Mathematica doesn't factor it either, if it's too big.

Ondrej

Ondrej Certik

unread,
Jul 30, 2010, 5:20:16 PM7/30/10
to sy...@googlegroups.com

And the other one:

In[3]:= Sqrt[100000000000000000000548000000000000000001118140000000000000001006621200000000000000337420161]

Out[3]= 10000000000000000000027400000000000000000018369


So just like Maple.

However, to come to the original problem --- Mathematica just works
for Kimn, but not sympy. So the solution is to increase the number I
guess? I wonder what the limit is in Mathematica. Is there some way to
find it out?

I guess since our factorization algorithm is slower, we might need
some lower number. But maybe our current number is too low.

Ondrej

Aaron S. Meurer

unread,
Jul 30, 2010, 5:44:39 PM7/30/10
to sy...@googlegroups.com

Maple's docs for sqrt: http://www.maplesoft.com/support/help/Maple/view.aspx?path=sqrt&term=sqrt

It says "The simplificationsqrt(999988999906999847) = sqrt(1000003^2*999983) ==> 1000003*999983^(1/2) is not attempted because it is too expensive in general to try to find the factors 1000003 and 999983."

Again, I want to stress that their limit is on the factors, not the number itself, but it looks like it is actually quite small.

In sympy, I tried

a = 1000
b = nextprime(a)*nextprime(nextprime(a))
%timeit factorint(b)

and increased a by 10x until I got to a = 100000000000 (1e11), which was the first one that took more than a second. The times fluctuated, though. It's roughly exponential, but one factorization would be a little faster than the previous, then the next would be 100x slower, and so on. Maybe someone who know more about how to generate worst case examples for Pollard-Rho can do better, but it seems to me that the factors limit should be around 10000000000 to 100000000000 (1e10 to 1e11). But Maple stops at at least 1e6 based on their docs, so probably it needs to be timed more rigorously first.

After we fix issue 2003 and people want to sqrt to work as hard as it can, no matter the factors are, they should pass it to factorint(), or maybe there could be a keyword argument to sqrt that overrides the limit.

Aaron Meurer

smichr

unread,
Sep 25, 2010, 6:41:07 AM9/25/10
to sympy
On Jul 30, 1:47 pm, klmn <kklement...@cells.es> wrote:
> The Aaron's suggestion:...>Also, I see thelimitoption to factorint.  I think we should use that instead of a raw numerical
> >limitso that we always get the small factors, and only don't try finding the big ones.
>
> ... is great. This would solve the 1st example in my 1st message
> (above).
>
> > So if you do this:
> > 7503204762339254285241591792888384061440000000000000^(1/2)/6148461492983058­92352000000
> > it doesn't get simplified?
>
> Oops, yes, it does. I've never used Mathematica before and in my first
> trials I put ^(0.5) and then it returned a number. When I put ^(1/2),
> it simplifies symbolically.
>
> I have played with yet bigger Gaunt coefficients and in all trials the
> simplification went very fast. When I do separate sqrt to the 9
> factors there or just one sqrt to the product, the execution speeds
> are the same (within the statistical variation).
>
> So the only critical change to sympy was to comment 'b > 4294967296'
> out. Thanks for the good advice!

As has been pointed out, factorint uses a limit. What is being used in
numbers.py is a trialdivision method ".factors" rather than
factorint() so when you take the limit way by commenting out the line
you indicate you give it permission to hang as it keeps trying larger
and larger trail factors. It would be better in numbers to pass off b
to factorint. When making these changes,

if 0 and b > 4294967296:
# Prevent from factorizing too big integers
return None
from sympy import factorint
dict = factorint(b) #b.factors()

The following is nearly instantaneous:

>>> sqrt(1000000000000000000000000000000000000000000000000000000000000000000000000000000000100)
10*10000000000000000000000000000000000000000000000000000000000000000000000000000000001**(1/2)
>>>
Reply all
Reply to author
Forward
0 new messages