Issue 3809 in sympy: isprime can be faster

9 views
Skip to first unread message

sy...@googlecode.com

unread,
May 7, 2013, 6:55:36 PM5/7/13
to sympy-...@googlegroups.com
Status: Valid
Owner: ----
Labels: Type-Defect Priority-Medium Combinatorics

New issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

See http://www.johndcook.com/blog/2013/01/17/narcissus-prime-in-python/.
The code isprime(int("1808010808"*1560 + "1")) should return within a
couple of minutes.

--
You received this message because this project is configured to send all
issue notifications to this address.
You may adjust your notification preferences at:
https://code.google.com/hosting/settings

sy...@googlecode.com

unread,
May 9, 2013, 6:11:20 AM5/9/13
to sympy-...@googlegroups.com

Comment #1 on issue 3809 by idlike2d...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Can implementing AKS primality test will be of any help in this issue??

sy...@googlecode.com

unread,
May 10, 2013, 8:23:47 AM5/10/13
to sympy-...@googlegroups.com

Comment #2 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Well, we are using the Rabin-Miller Strong Pseudoprime Test
(http://mathworld.wolfram.com/Rabin-MillerStrongPseudoprimeTest.html) right
now which is probabilistic in nature.

The AKS primality testing algorithm is a deterministic algorithm. If we
want to go with a deterministic algorithm anyway, we should in fact use
Lenstra and Pomerance (http://www.math.dartmouth.edu/~carlp/aks041411.pdf)
which is an improvement over the AKS algorithm.

Mathematica seems to use another probabilistic algorithm as well though
(http://mathworld.wolfram.com/LucasPseudoprime.html) so perhaps we should
implement that as well.

In fact, I believe that we should come up with a mix to use these
algorithms for appropriate cases. There should be a cross-over point where
the deterministic version works better compared to the probabilistic one.
So we can use that to use either a one or the other version depending on
the crossover point.

Perhaps someone with more knowledge in number theory can give more insights?

> The code isprime(int("1808010808"*1560 + "1")) should return within a
> couple of minutes.

It seems it does on PyPy (<18 minutes according to one of the comments).

sy...@googlecode.com

unread,
May 10, 2013, 8:29:08 AM5/10/13
to sympy-...@googlegroups.com

Comment #3 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Scratch the last line.

The user ran "Iterative method for Fermat's Test" which ran in <18 minutes
on PyPy so perhaps we should implement a deterministic test after all.

sy...@googlecode.com

unread,
May 10, 2013, 12:16:31 PM5/10/13
to sympy-...@googlegroups.com

Comment #4 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

It's also possible that we are doing something inefficiently in Miiller
Rabin.

sy...@googlecode.com

unread,
May 11, 2013, 2:34:47 AM5/11/13
to sympy-...@googlegroups.com

Comment #5 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Certainly. In any case, I'll try to look for these inefficiencies today and
see what I find.

sy...@googlecode.com

unread,
May 11, 2013, 5:15:03 AM5/11/13
to sympy-...@googlegroups.com

Comment #6 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

So I looked at the code. The only inefficiency I found was that 2
parameters were being calculated many times in the _test function (line 62
and 63) when they should be calculated only once for each n. But this
doesn't seem to be the major reason of slowness. From the looks of it, the
slowness occurs because of the loop on lines 69-72.

On the link mentioned in the bug report, there was a mention of using
multiple threads for different bases. Should we do that? I haven't seen
threading used elsewhere in SymPy but maybe it is. Also, someone should
probably check whether the test for the Narcissus prime really does take
that long as mentioned on the post.

sy...@googlecode.com

unread,
May 11, 2013, 1:01:19 PM5/11/13
to sympy-...@googlegroups.com

Comment #7 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

I ran it on my computer and it took several hours to finish.

sy...@googlecode.com

unread,
May 11, 2013, 1:02:19 PM5/11/13
to sympy-...@googlegroups.com

Comment #8 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

If you have time, maybe run it with line_profiler.

sy...@googlecode.com

unread,
May 11, 2013, 1:04:29 PM5/11/13
to sympy-...@googlegroups.com

Comment #9 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

I wonder if pow(b, 2, n) would be faster.

sy...@googlecode.com

unread,
May 11, 2013, 1:07:09 PM5/11/13
to sympy-...@googlegroups.com

Comment #10 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

We should add some kind of verbose flag to isprime that gives the status,
similar to factorint.

sy...@googlecode.com

unread,
May 11, 2013, 2:15:38 PM5/11/13
to sympy-...@googlegroups.com

Comment #11 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

I'm travelling right now and am away from my workstation. I'll see what can
be done once I'm free.

sy...@googlecode.com

unread,
May 12, 2013, 12:56:01 AM5/12/13
to sympy-...@googlegroups.com

Comment #12 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

My internet is messed up right now and I don't have the python-dev package
so I cannot build line profiler right now. But using %time on ipython, I
don't think that pow(b, 2, n) makes any difference. I tested for the prime
2**1279 - 1 and got exactly the same results for using pow(b, 2, n) against
using (b**2)%n.

sy...@googlecode.com

unread,
May 14, 2013, 1:02:17 PM5/14/13
to sympy-...@googlegroups.com

Comment #13 on issue 3809 by smi...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

the mr routine raises the bases to a large power mod n. The larger the
power the longer this takes. Since the number in question here has about
16000 digits we won't expect a return in a matter of minutes unless that
power(base, t, n) can be faster:

>>> from time import time
>>> p=100
>>> while 1:
... n=(10**p)
... t=time();j=pow(2,n,n);print time()-t,p
... p*=2
...
0.00300002098083 100
0.0210001468658 200
0.156000137329 400
0.84299993515 800
5.82999992371 1600
65.0769999027 3200

sy...@googlecode.com

unread,
May 14, 2013, 1:17:11 PM5/14/13
to sympy-...@googlegroups.com

Comment #14 on issue 3809 by smi...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

btw, the difference between n**2 % n and pow(n,2,n) can be pretty dramatic:

>>> p=1
>>> while 1:
... n=int((10**p))
... t=time();j=n**2 % n;print time()-t,p
... p*=2
...
0.0 1
0.0 2
0.0 4
0.0 8
0.0 16
0.0 32
0.0 64
0.0 128
0.0 256
0.0 512
0.0 1024
0.0019998550415 2048
0.0360000133514 4096
0.0379998683929 8192
0.104000091553 16384
0.398000001907 32768
1.39400005341 65536
4.4509999752 131072
17.4820001125 262144

>>> p=1
>>> while 1:
... n=int((10**p))
... t=time();j=pow(n,2,n);print time()-t,p
... p*=2
...
0.0 1
0.0 2
0.0 4
0.0 8
0.0 16
0.0 32
0.0 64
0.0 128
0.0 256
0.0 512
0.0 1024
0.0 2048
0.0 4096
0.0 8192
0.0 16384
0.000999927520752 32768
0.000999927520752 65536
0.000999927520752 131072
0.0019998550415 262144

That routine should probably be changed to use pow.

sy...@googlecode.com

unread,
May 14, 2013, 2:37:11 PM5/14/13
to sympy-...@googlegroups.com

Comment #15 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

It's not mod n. n**2 % n is always 0.

sy...@googlecode.com

unread,
May 15, 2013, 10:14:31 AM5/15/13
to sympy-...@googlegroups.com

Comment #16 on issue 3809 by smi...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

It's mod n but it's not n**2 it's b**2:

b = (b**2) % n --should be--> b = pow(b, 2, n)

...but it's not going to help much when computing b = pow(base, t, n)
involves such a large t.

sy...@googlecode.com

unread,
May 15, 2013, 1:14:23 PM5/15/13
to sympy-...@googlegroups.com

Comment #17 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

I think modular exponentiation helps the most with large powers.

sy...@googlecode.com

unread,
May 16, 2013, 11:53:20 AM5/16/13
to sympy-...@googlegroups.com

Comment #18 on issue 3809 by smi...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

See the tests above...a 262k digit base being squared either takes
about .002 seconds with pow or 10,000x longer with **

sy...@googlecode.com

unread,
May 16, 2013, 1:43:29 PM5/16/13
to sympy-...@googlegroups.com

Comment #19 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Oh of course. It will indeed be much faster with pow because it computes
the mod before squaring.

But is that the source of the slowdown here?

sy...@googlecode.com

unread,
May 16, 2013, 2:05:37 PM5/16/13
to sympy-...@googlegroups.com

Comment #20 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Well, I suppose we can just test out with large primes (Mersenne primes
maybe) for that. As I wrote before, on my workstation, it took the same
time with either way.

sy...@googlecode.com

unread,
May 16, 2013, 2:15:23 PM5/16/13
to sympy-...@googlegroups.com

Comment #21 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Btw, I ran a python script that uses the MR algorithm (found here :
http://en.literateprograms.org/index.php?title=Special:DownloadCode/Miller-Rabin_primality_test_(Python)&oldid=17104).

I checked for 2**1279 - 1 and it took half the time it took on SymPy. So,
definitely improvements can be made.

sy...@googlecode.com

unread,
May 16, 2013, 2:24:57 PM5/16/13
to sympy-...@googlegroups.com

Comment #22 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Do primes represent the worst case scenario in this algorithm. Should we
also test against composites?

sy...@googlecode.com

unread,
May 16, 2013, 2:47:47 PM5/16/13
to sympy-...@googlegroups.com

Comment #23 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Well, AFAIK, for the cases in the _mr_safe function (that is for if n <
10000000000000000), we can just test for certain primes and know _for sure_
whether n is prime or not. So there using primes as witnesses is justified.

Beyond that, we are using a list of 46 different primes to test for n not
composite. The overall probability that n is not prime will then be
(1/4)**46 which is good enough. As to your question, no, I do not think
that we need to have only primes to test for n > 10000000000000000. From
reading the algorithm on the internet, I think we should be able to take
any composite numbers as witness to the primality. Though why only primes
were used, I don't know.

I think it's quite clear
(http://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test#Algorithm_and_running_time)
that we can use composites.

Also, I do not know whether it will make much of a difference to not use
primes. We are doing pow(base, t, n) (line 58) where `base` is the number
in question. Unless pow can work better for composite bases compared to
prime bases, I think that it would be immaterial whether it is prime or
composite.

sy...@googlecode.com

unread,
May 16, 2013, 2:51:09 PM5/16/13
to sympy-...@googlegroups.com

Comment #24 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

And just to be clear, we don't need to test against composites *also*.
Primes themselves are good enough.

We are using any numbers between 2 and n-2 as witnesses to n's primality.
It doesn't matter whether they are prime or composite. I don't know why in
primetest.py, primes are exclusively used.

sy...@googlecode.com

unread,
May 16, 2013, 3:46:49 PM5/16/13
to sympy-...@googlegroups.com

Comment #25 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

You misunderstood my question. I'm not talking about the witnesses. I'm
talking about the inputs to isprime. You keep talking about testing it
against some large Mersenne prime. But shouldn't we also test it against
large composites?

sy...@googlecode.com

unread,
May 16, 2013, 3:48:35 PM5/16/13
to sympy-...@googlegroups.com

Comment #26 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

By the way, the b**2 % 2 thing actually doesn't matter. b is already
reduced mod n from the previous line, so using pow shouldn't make any
difference, especially since the exponent is only 2, so it's not like there
are any intermediate exponents that we can reduce.

sy...@googlecode.com

unread,
May 16, 2013, 3:50:24 PM5/16/13
to sympy-...@googlegroups.com

Comment #27 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Any one care to compare timings for
http://en.literateprograms.org/index.php?title=Special:DownloadCode/Miller-Rabin_primality_test_(Python)&oldid=17104)
to SymPy with the narcissus prime? I'm also curious how PyPy performs. I
suspect it can be faster, though it could depend a lot on how efficient
their big number implementation is.

sy...@googlecode.com

unread,
May 17, 2013, 9:08:05 AM5/17/13
to sympy-...@googlegroups.com

Comment #28 on issue 3809 by smi...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

> the b**2 % 2 thing actually doesn't matter. b is already reduced mod n
> from the previous line

Yeah, but n is huge in the case of the narcissus prime...so it can matter.

sy...@googlecode.com

unread,
May 17, 2013, 9:18:57 AM5/17/13
to sympy-...@googlegroups.com

Comment #29 on issue 3809 by smi...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

The code referred to online that is 2X faster is only using 20 bases (vs
46) to make the determination of primality so I think that explains why it
is faster.

sy...@googlecode.com

unread,
May 17, 2013, 9:35:02 AM5/17/13
to sympy-...@googlegroups.com

Comment #30 on issue 3809 by smi...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

On the basis of comment #13 I think this issue should be closed as invalid.
A 3200 digit number takes 65 seconds to square mod n. There is no way the
isprime test is going to return any faster on the much longer narcissus
prime unless one only tests a single prime.

Nonetheless, some changes to make things a little more efficient are in a
new PR.

sy...@googlecode.com

unread,
May 17, 2013, 9:36:02 AM5/17/13
to sympy-...@googlegroups.com
Updates:
Labels: NeedsReview smichr

Comment #31 on issue 3809 by smi...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

https://github.com/sympy/sympy/pull/2119

sy...@googlecode.com

unread,
May 17, 2013, 9:38:33 AM5/17/13
to sympy-...@googlegroups.com

Comment #32 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

@asmeurer: Sorry, I misunderstood your question. And I have run the test on
the narcissus prime. It's been running for a couple of hours now but I am
only using a netbook so it is bound to be slower.

@smichr: Hmm. That looks like the reason for the faster results. Probably
test on PyPy with it?

I can only see two inefficiencies till now: The one I mentioned earlier and
the other one is using pow(b, 2, n) vs using b**2 % n; The first one isn't
promising at all. We should test the second one.

If all fails, I suppose we should switch to some combination of algorithms
as we talked about before.

sy...@googlecode.com

unread,
May 17, 2013, 10:04:29 AM5/17/13
to sympy-...@googlegroups.com

Comment #33 on issue 3809 by prasoon9...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Oh I see that both of those inefficiencies have been fixed in the PR by
Chris.

sy...@googlecode.com

unread,
May 18, 2013, 6:11:56 AM5/18/13
to sympy-...@googlegroups.com
Updates:
Labels: -NeedsReview -smichr

Comment #34 on issue 3809 by smi...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

that PR has been committed but I'll leave the issue open. Although maybe
there is a different algorithm that might be faster, there is no way that I
can see to make Miller-Rabin faster other than by testing fewer base, so
there is no way (with MR) that the prime given in the OP will return in a
few minutes.

sy...@googlecode.com

unread,
May 22, 2013, 2:38:38 PM5/22/13
to sympy-...@googlegroups.com

Comment #35 on issue 3809 by smi...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

See http://www.trnicely.net/misc/bpsw.html for discussion about a test that
combines Miller-Rabil with Lucas-Lehmer.

sy...@googlecode.com

unread,
May 22, 2013, 2:39:59 PM5/22/13
to sympy-...@googlegroups.com

Comment #36 on issue 3809 by smi...@gmail.com: isprime can be faster
combines a single Miller-Rabin tests with the Lucas-Lehmer test.

sy...@googlecode.com

unread,
Jun 19, 2013, 4:58:36 AM6/19/13
to sympy-...@googlegroups.com

Comment #37 on issue 3809 by dana.jac...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

I've been working on primality tests and proofs for a Perl module, so this
is something I've been looking at a lot recently. Here are my long-winded
thoughts, with the caveat that I'm not a sympy developer.

Comment 1 & 2 mention AKS. AKS is theoretically important, but is still
basically useless in any practical sense. It is insanely slow, even with
Lenstra and Bernstein's improvements to the 'r' selection. Brent 2010
estimates 840 years for a 308 digit number (on a 1GHz machine, but even
with 100x faster hardware that's ridiculous). For the deterministic
variants I think you'll find that time isn't out of line.

Comment 3 mentions the "pypy ran a deterministic test in 18 minutes"
comment. He ran a single Fermat test, which is much less accurate than a
single M-R test. Try 561 through his function. or 1105, 1729, 2465, etc.
So after 18 minutes we've not accomplished very much. This is definitely
not the direction to go.


For small numbers (<2^64), you can use deterministic M-R tests, such as the
ones from https://miller-rabin.appspot.com/. That would be faster than the
existing code and remove the table of exceptions. BPSW would probably be
faster for the upper range. BPSW (standard, strong, extra strong, or the
oddball variants some packages implement) has no counterexamples < 2^64,
and this is typically the same speed or faster than running 4 M-R tests.
With my native (C without GMP) implementations I cut over to BPSW if the
number is over 32 bit in size -- your mileage may vary (this is purely a
benchmark-oriented performance decision at this size).

For large values, I think you really should use a variant of BPSW (see
Nicely's page or
http://en.wikipedia.org/w/index.php?title=Baillie%E2%80%93PSW_primality_test).
Here is a number from Arnault 1993 and another from Arnault 1995 that fool
sympy's current implementation:

8038374574536394912570796143419421081388376882875581458374889175222974273765333652186502336163960045457915042023603208766569966760987284043965408232928738791850869166857328267761771029389697739470167082304286871099974399765441448453411558724506334092790222752962294149842306881685404326457534018329786111298960644845216191652872597534901

2887148238050771212671429597130393991977609459279722700926516024197432303799152733116328983144639225941977803110929349655578418949441740933805615113979999421542416933972905423711002751042080134966731755152859226962916775325475044445856101949404200039904432116776619949629539250452698719329070373564032273701278453899126120309244841494728976885406024976768122077071687938121709811322297802059565867

Running lots of M-R tests to many fixed bases like sympy's current
implementation or libtommath does (and hence Perl 6) is problematic.
Arnault 1995 gives an example (the second above) that will pass MR tests
with all bases up to 306. That paper gives instructions on how to
construct such numbers. Using random bases mostly solves this problem.
However, we're still left with it being terribly inefficient -- running 46
M-R tests is 10-15 times slower than a BPSW test. If you're feeling
paranoid about letting a pseudoprime through, you can add a couple more
random-base M-R tests (only Mathematica and Math::Prime::Util seem to go to
this length that I'm aware of).

Why BPSW? Nicely's page should cover lots of the reasons. Almost all math
packages switched to using it, mostly in the 1990s. Maple, Mathematica
(with an added base 3 M-R test), Pari, SAGE, Maxima, FLINT, MPIR, PRIMO are
some examples that use a BPSW variant for their probable prime test. In 33
years nobody has published a concrete counterexample (though we know they
exist). For some reason the tests act anti-correlated.

Python's gmpy2 includes David Cleaver's C+GMP code for primality functions
including Lucas tests and a strong BPSW test, but is_prime still uses
mpz_probab_prime_p. I spoke to some of the Perl 6 people last month, and I
need to write up some code for NQP as well as start a discussion on the API.

Here are examples and times on my computer. The Perl tests use
Math::Prime::Util + Math::Prime::Util::GMP. This is C+GMP under the hood,
just like the gmpy and gmpy2 modules. Python 2.7. Based on the time
required for one run of mr with the latest git version (that uses pow(base,
t, n)), it looks like this could be done in a reasonable length of time.

# sympy from git, mr(int("1808010808"*1560 + "1"),[2])
12.2s

# sympy from git, isprime(int("1808010808"*1560 + "1"))
9 minutes 16.4s

Hmm, this is a few minutes. Of course I have the latest code so it's
presumably much more efficient than earlier, but I'm concerned about the
comment "[...]there is no way (with MR) that the prime given in the OP will
return in a few minutes." Let's look at some other systems.


# Pari/gp 2.6.0 ispseudoprime(....) uses a BPSW variant
60.12s

# gmpy 1.15's is_prime (mpz_probab_prime_p(n,25), or M-R with 25
fixed-random bases)
# Installed via Fedora
8 minutes 29.3s

# gmpy 2.0's is_prime (mpz_probab_prime_p(n,25), or M-R with 25
fixed-random bases)
# Built from sources along with latest MPFR, MPC, and GMP.
5 minutes 12.8s

# gmpy 2.0 is_strong_prp(int("1808010808"*1560 + "1"),2)
12.03s

# gmpy 2.0 is_selfridge_prp(int("1808010808"*1560 + "1"))
64.95s

# gmpy 2.0 is_strong_selfridge_prp(int("1808010808"*1560 + "1"))
64.85s

I wasn't able to get gmpy2's is_strong_bpsw_prp function to work, but it
would be about 77s based on the two tests it has to do.


My Perl module, with GMP implementations that are a little faster than the
ones in gmpy2:

# Just the SPRP-2 test:
time perl -Mbigint -MMath::Prime::Util=:all -E 'my $n = 0+("1808010808" x
1560 . "1"); say is_strong_pseudoprime($n,2);'
12.06s

# Strong Lucas-Selfridge test:
time perl -Mbigint -MMath::Prime::Util=:all -E 'my $n = 0+("1808010808" x
1560 . "1"); say is_strong_lucas_pseudoprime($n);'
47.17s

# Extra Strong Lucas test -- typically a bit faster:
time perl -Mbigint -MMath::Prime::Util=:all -E 'my $n = 0+("1808010808" x
1560 . "1"); say is_extra_strong_lucas_pseudoprime($n);'
31.84s

# is_prob_prime includes some quick tests for weeding out composites, then
SPRP-2 and strong Lucas-Selfridge:
time perl -Mbigint -MMath::Prime::Util=:all -E 'my $n = 0+("1808010808" x
1560 . "1"); say is_prob_prime($n);'
59.32s

# is_prime is is_prob_prime plus 2-5 extra M-R tests (2 for this size):
time perl -Mbigint -MMath::Prime::Util=:all -E 'my $n = 0+("1808010808" x
1560 . "1"); say is_prime($n);'
82.97s

So it's possible to do a BPSW variant (SPRP-2 + Grantham Extra Strong test)
and have it done in under 44 seconds, assuming modular math including
exponentiation is fast. Add ~12s per extra M-R test if you want.

sy...@googlecode.com

unread,
Jun 19, 2013, 10:57:00 AM6/19/13
to sympy-...@googlegroups.com

Comment #38 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Thanks for the write up. I didn't realize that gmpy2 has this
functionality. We already support gmpy and gmpy2 as optional dependencies
to make the polys faster. We should additionally wrap gmpy2's primality
tester if it is installed.

Where is the source for your Perl module? What is it licensed (though by
the time we translate it to Python, it probably wouldn't matter)? Is it
pure Perl or does it wrap C?

Are you interested in implementing any of these things in SymPy?

sy...@googlecode.com

unread,
Jun 19, 2013, 11:28:55 AM6/19/13
to sympy-...@googlegroups.com

Comment #39 on issue 3809 by dana.jac...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

I can check the gmpy2 functions -- the underlying code should be solid
since it was written for a PRIMO validator that has been used a lot in the
last two years, but it wouldn't hurt to run a few more tests on the code
that's present. The issue I had with is_strong_bpsw_prp looks like some
sort of wrapping issue where the second internal call is getting the wrong
number of arguments.

Perl modules, (note links to the github pages for the development versions):
https://metacpan.org/release/Math-Prime-Util
https://metacpan.org/release/Math-Prime-Util-GMP

License is the Perl artistic license (which is either GPL or Artistic
License), and a LICENSE file is included. The GMP module uses a derivative
of SIMPQS which is GPL v2+, though that is only used for certain factoring
tasks.

There are versions in pure Perl, C, and for many things (including the
primality tests) C+GMP in the Math::Prime::Util::GMP module. The timing
above all went to C+GMP for the real work. The pure Perl code really is
two parts:
1) a lot of wrapping and tasks where performance isn't critical
2) backup for all other functionality, or for big number use when
MPU::GMP isn't installed
The performance of pure Perl for big number operations leaves a lot to be
desired, especially with the default bigint backend, but at least it works.

If you'd like, I can write something to start, for instance:
- use deterministic bases for numbers < 2^64 which should simplify the
code and make it a little faster
- put try / except for HAS_GMPY == 2 and use the tests it has to do a
BPSW test.
and tests.

Implementing the strong Lucas test in pure Python should be possible -- it
isn't that much code. I'll give it a shot and we'll see what the
performance looks like.

sy...@googlecode.com

unread,
Jun 19, 2013, 11:16:45 PM6/19/13
to sympy-...@googlegroups.com

Comment #40 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Yeah, that would be great. I think the number theory module in SymPy could
definitely use some love from someone who knows what they're doing.

sy...@googlecode.com

unread,
Jun 25, 2013, 10:54:28 PM6/25/13
to sympy-...@googlegroups.com

Comment #41 on issue 3809 by dana.jac...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

I have a fork with a complete rewrite. I need to do more testing before
doing a pull request. I'm not sure what the normal review process is -- do
a pull request and go from there, or point to a branch to get initial
comments. This includes things like whether to make some functions private
or public (e.g. modular Lucas sequence).

I've been testing on a machine that doesn't have gmpy or gmpy2, so pow()
and friends are quite slow. I'm sure there is more room for optimization.
It does make performance more visible.

The standard, strong, and extra strong Lucas tests are implemented and I've
verified that the initial pseudoprimes match the OEIS sequence and other
code.

I put some new tests in, including the two Arnault composites. I saw one
of the Arnault composites in the test suite already, but asserted as prime
with no comments. I'm not sure if this was a mistake or someone putting in
an expected defect without commenting it as such. I did not add specific
tests for the new functions.

This has deterministic tests for up to 2^64, and some more trivial factor
checking. No more _pseudos table or testing for _mr_safe. However one of
the tests I need to do is verify that we properly determine all the PSP-2's
from Feitsma's database.

for i in xrange(5000000):
if isprime(i)
sum += i

master: 25.2s
branch: 14.9s

isprime( 10**2000 + 4561 ):
44.3s old isprime using 46 bases
5.3s strong BPSW plus an extra M-R test with random base
4.3s extra strong BPSW plus an extra M-R test with random base
4.1s strong BPSW
3.2s extra strong BPSW

for i in [2,4,6,8,10]:
isprime( 10**2000 + 4561 + i )

master: 5.0s
branch: 0.2s

For the Narcissus prime int("1808010808"*1560 + "1"), times are:
6m 2s mr(n, 2)
19m12s is_strong_lucas_prp(n)
13m 9s is_extra_strong_lucas_prp(n)
again noting that gmpy / gmpy2 is not installed, so pow() is quite slow.
If we ran the 46 bases from the old code, we would expect a total isprime
time of about 4.5 hours. If we do a strong BPSW test it is about 25
minutes. Using the "extra strong" test would be about 18 minutes. Adding
another M-R test would add about 6 minutes.

I'll do more tests including with GMPY2 tonight.

sy...@googlecode.com

unread,
Jun 26, 2013, 12:35:44 AM6/26/13
to sympy-...@googlegroups.com

Comment #42 on issue 3809 by cas...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

I thought I had fixed is_bpsw_prp() and is_strong_bpsw_prp() in gmpy2
2.0.0. :(

I committed a fix to the svn repository (r813). Can you give that a try?

There is a version of the BPSW test included with the MPIR library but not
with GMP. I'll compare its performance and report back. If it is
significantly faster, I'll try to extract the code so it is available for
all builds of gmpy2.

If you have have C code that you would like me to include with gmpy2, I can
do that, too.

casevh

sy...@googlecode.com

unread,
Jun 26, 2013, 2:37:47 AM6/26/13
to sympy-...@googlegroups.com

Comment #43 on issue 3809 by dana.jac...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Case,

Yes, the dev version works (and does the strong test, yay). Trying to
verify with Feistma's database, I found this segfaults:
is_strong_selfridge_prp("2047"). I can obviously wrap in int, but segfault
seems harsh. Regardless:

is_strong_selfridge_prp indicates composite for all 31.9M spsp2s below
2^64 (as expected and desired).

I looked at MPIR 2.6.0 last month (FLINT uses basically identical code),
and it does some really oddball things. I'm personally dubious about it
being actually BPSW. It does a mix of PSP and SPSP. The Lucas test it
uses has more pseudoprimes than the Lucas-Selfridge tests. For example
number of pseudoprimes under 1e9: 943 extra strong Lucas, 1415 strong
Lucas-Selfridge, 5485 Lucas-Selfridge, 8533 MPIR n_is_pseudoprime_lucas.
This doesn't preclude it from being a good test -- what is important is
that it be disjoint from the other test -- but it doesn't seem promising
given that there doesn't seem to be any reason to do it that way.

All that said, I just ran the Fibonacci/Lucas test against Feitsma's
database, and it has no counterexamples with standard or strong base 2
pseudoprimes under 2^64. Since MPIR 2.6.0 only uses it below 2^64 this is
good enough. For use above 2^64, with 6x the number of pseudoprimes as the
strong Lucas-Selfridge test, there should be a compelling reason to use the
non-standard method, which I just don't see.

David Cleaver's code could be optimized some, but it looks ok, it's easy to
read, and I believe it is used by factordb's Primo validator so gets a lot
of use. I'm not sure it really needs to be replaced unless you need.

You can look at my code here:
https://github.com/danaj/Math-Prime-Util-GMP/blob/master/gmp_main.c which
has all three Lucas tests.

sy...@googlecode.com

unread,
Jun 26, 2013, 9:12:51 AM6/26/13
to sympy-...@googlegroups.com

Comment #44 on issue 3809 by cas...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Dana,

I took a quick look before I had to leave for work and I think I spotted
the cause of the segmentation fault. I'll try to fix it this evening.

Case

sy...@googlecode.com

unread,
Jun 26, 2013, 11:19:46 AM6/26/13
to sympy-...@googlegroups.com

Comment #45 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Do a pull request as soon as you have code. That's the easiest way to even
see what you have done, not to mention comment on it.

sy...@googlecode.com

unread,
Jun 26, 2013, 11:21:36 AM6/26/13
to sympy-...@googlegroups.com

Comment #46 on issue 3809 by asme...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Apparently I can make a pull request for you.
https://github.com/sympy/sympy/pull/2209.

sy...@googlecode.com

unread,
Jun 27, 2013, 2:04:11 AM6/27/13
to sympy-...@googlegroups.com

Comment #47 on issue 3809 by cas...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Dana,

I've committed another set of fixes. Can you give r814 a try?

Case

sy...@googlecode.com

unread,
Jun 27, 2013, 3:41:55 AM6/27/13
to sympy-...@googlegroups.com

Comment #48 on issue 3809 by dana.jac...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Case, I tried r814 and it seems to work fine for me, both on the command
line and with sympy calling it. Using quotes now raises a TypeError, as do
most of the oddball error cases.


I was going through the prp calls, and am getting what I think are
incorrect results from is_extra_strong_lucas_prp. There are a lot more
pseudoprimes being output than there should be. It looks like the code and
comments have misread Grantham 2000 (the "Lucas pseudoprime" page on
Wikipedia is also good).

Grantham/MPU/Nicely: ( U_s=0 AND V_s = +/- 2 ) OR (V_{s*2^t} = 0 for t
in 0..r-1)
gmpy_mpz_prp.c: U_s=0 OR V_s = +/-2 OR V_{s*2^t} = 0 for t
in 0..r-1

It's in mpz_prp.c, so I'll drop David a note. This should have no impact
for GMPY2 outside of this one call.

sy...@googlecode.com

unread,
Jul 8, 2013, 2:35:42 AM7/8/13
to sympy-...@googlegroups.com

Comment #50 on issue 3809 by cas...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809

Dana,

I've updated is_extra_strong_lucas(). Can you test r816?

Thanks, Case
Reply all
Reply to author
Forward
0 new messages