Time complexity analysis:
The algorithm offers a trade-off between its running time and the probability that it finds a factor. A prime divisor can be found with probability around 0.5 in $O(\sqrt{d})$ iterations, where $d$ is the smallest prime divisor of $n$.
Pollard's rho algorithm is an algorithm for integer factorization. It was invented by John Pollard in 1975.[1] It uses only a small amount of space, and its expected running time is proportional to the square root of the smallest prime factor of the composite number being factorized.
In 1980, Richard Brent published a faster variant of the rho algorithm. He used the same core ideas as Pollard but a different method of cycle detection, replacing Floyd's cycle-finding algorithm with the related Brent's cycle-finding method.[3]
I'm aware that the Pollard-Strassen algorithm can be used to find all prime factors of $n$ not exceeding $B$ in $O\big(n^\epsilon B^{1/2}\big)$ time. This is really useful because I need to find all factors less than $n^{1/3}$ to determine if $n$ is squarefree, which could therefore theoretically be done in $O\big(n^{1/6+\epsilon}\big)$.
However I can't find anything more than a brief overview of the algorithm itself, let alone a worked example. Could anyone provide a detailed explanation or example, or a reference to where I can find one? Additionally, I'm interested in other algorithms, if they exist, which provide all factors not exceeding $B=\big\lceil n^{1/3} \big\rceil$ quickly.
The second for-loop takes $O(n^{1/4}\log n)$, and so the hard work of the algorithm is to compute the $f_i$ faster than the $O(n^{1/2})$ demonstrated in the first for-loop above. This is done using subproduct trees and multipoint evaluation. Here is a slide presentation that describes the details pretty well. Also this paper (search for "Main algorithmic ideas") has a good high-level overview of the algorithm. Finally, subproduct trees are given a good treatment in this presentation, Asymptotically fast algorithms for modern computer algebra.
Notice, if the number that you want to factorize is actually a prime number, most of the algorithms will run very slowly. This is especially true for Fermat's, Pollard's p-1 and Pollard's rho factorization algorithms. Therefore, it makes the most sense to perform a probabilistic (or a fast deterministic) primality test before trying to factorize the number.
This is an optimization of the trial division. Once we know that the number is not divisible by 2, we don't need to check the other even numbers. This leaves us with only $50\%$ of the numbers to check. After factoring out 2 and getting an odd number, we can simply start with 3 and count only other odd numbers.
Extending the wheel factorization method indefinitely, we will only be left with prime numbers to check. A good way of doing this is to precompute all prime numbers up to $\sqrt{n}$ with the Sieve of Eratosthenes and test them individually.
Fermat's factorization method tries to exploit this fact by guessing the first square $a^2$, and checking if the remaining part, $b^2 = a^2 - n$, is also a square number. If it is, then we have found the factors $a - b$ and $a + b$ of $n$.
However, there are still a large number of possible optimizations of this approach. By looking at the squares $a^2$ modulo a fixed small number, it can be observed that certain values of $a$ need not be considered, since they cannot produce a square number $a^2 - n$.
This algorithm finds a cycle by using two pointers moving over the sequence at different speeds. During each iteration, the first pointer advances by one element, while the second pointer advances by two elements. Using this idea it is easy to see that if there is a cycle, at some point the second pointer will come around and meet the first one. If the cycle length is $\lambda$ and $\mu$ is the first index at which the cycle starts, then the algorithm runs in $O(\lambda + \mu)$ time.
The implementation uses a function mult that multiplies two integers $\le 10^{18}$ without overflow by using GCC's 128-bit integer type __int128. If GCC is not available, you can use a similar idea to binary exponentiation.
Brent implements a method similar to Floyd's, using two pointers. The difference is that instead of advancing the pointers by one and two places respectively, they are advanced by powers of two. As soon as $2^i$ is greater than both $\lambda$ and $\mu$, we will find the cycle.
I'm interested in any tips on improvements to the basic Pollard's rho method to get it to handle larger composites of only a few prime factors. Of course if you find any stupidities in the code above, I'm interested in those as well.
For full disclosure: This is a homework assignment, so general tips and pointers are better than fully coded solutions. With this very simple approach I already get a passing grade on the assignment, but would of course like to improve.
You are using the original version of the rho algorithm due to Pollard. Brent's variant makes two improvements: Floyd's tortoise-and-hare cycle-finding algorithm is replaced by a cycle-finding algorithm developed by Brent, and the gcd calculation is delayed so it is performed only once every hundred or so times through the loop instead of every time. But those changes only get a small improvement, maybe 25% or so, and won't allow you to factor the large numbers you are talking about. Thus, you will need a better algorithm: SQUFOF might work for semi-primes the size that you mention, or you could implement quadratic sieve or the elliptic curve method. I have discussion and implementation of all those algorithms at my blog.
I decided to implement my own rather involved C++ solution to your task from scratch. Although you asked not to write code, I did it anyway, purely as a proof of concept, to check real speed and timings.
I did most of the computation at compile time. This is the main feature of my code. The input number N is provided at compile time as a long macro constant. This lets the compiler do half of the optimization work up front, such as inlining constants and optimizing division and other arithmetic. As a drawback, you have to re-compile the program every time you change the number that you want to factor.
One more very important speed improvement is that I used Montgomery reduction to speed up the modular reduction. Montgomery reduction alone speeds up all computations 2-2.5x. Besides Montgomery you can also use Barrett reduction. Both replace a single expensive division with several multiplications and additions, which makes the reduction very fast.
One fairly advanced optimization I did was to implement my own uint128 and uint256 types with all the sub-optimizations needed for my task. This optimization is only included in the code in Part 2 of my answer; see that part after the first code block.
As a result of the above steps, the total speed of Pollard's rho is increased 50-100x, especially thanks to computing the GCD only once every thousand steps. This speed is enough even to factor the biggest number in your question.
Besides the algorithms described above, I also used the following supporting algorithms: the extended Euclidean algorithm (for computing the coefficients of the modular inverse), modular multiplicative inverse, the Euclidean algorithm (for computing GCDs), modular binary exponentiation, trial division (for checking primality of small numbers), the Fermat primality test, and, as already mentioned, Montgomery reduction and Pollard's rho itself.
As you can see above, your smaller number takes just 0.1 seconds to factor, while your bigger number (which you failed to factor at all) takes a quite affordable 2260 seconds (a bit more than half an hour). You can also see that I created a number with a 48-bit smallest factor myself, and another number with a 56-bit smallest factor.
The general rule is that if the smallest factor is K bits, then Pollard's rho takes about 2^(K/2) iterations to find it, unlike, for example, trial division, which needs on the order of 2^K operations for a K-bit factor. For instance, a 40-bit factor costs roughly 2^20 ≈ 10^6 rho iterations versus about 2^40 ≈ 10^12 trial divisions.
At the very start of the file in my code there is a bunch of #define NUM lines, each defining a compile-time constant containing a number. You can comment out any line, change the value of a number, or add a new line with a new number. Then re-compile the program and run it to see the results.
As you can see in the Console Output at the very end of my post, your biggest number takes just 420 seconds to factor. That is 5.4x faster than the 2260 seconds of the first part of this answer.
CODE GOES HERE. Because of StackOverflow's limit of 30,000 characters per post, I can't inline the second code listing here: it alone is 26 KB in size. Hence I'm providing this code as a Gist link below and also through a Try it online! link (to run online on the GodBolt server):
So the order has some smooth factors and some not-so-smooth factors, so I tried to apply Pollard's rho to the semi-large primes, but that's still $O(\sqrt{2^{80}})$ complexity, which I think will take too long.
What's a good $B$ for smoothness? Now this is implementation-dependent, so that depends on what you're working with and how long you have to wait. In practice, I found that if you factorize $p-1$ up to about $2^{20}$, and you have a few primes of that size, and the order of your base point is much smaller than $p-1$, that cuts $Q/r$ down to a reasonable size, but of course your experience may differ.
Pollard's Rho Algorithm is a very interesting and quite accessible algorithm for factoring numbers. It is not the fastest algorithm by far but in practice it outperforms trial division by many orders of magnitude. It is based on very simple ideas that can be used in other contexts as well.
Very simple: we have precisely two factors to find in the entire space of $N$ numbers. Therefore, the probability of guessing a factor at random is $2/N$. If $N \approx 10^{10}$ (a 10-digit number), the chances are very much not in our favor: about $2 \times 10^{-10}$. Now the odds for winning Lotto are much better than this.