My proposed algorithm will give the GCD (which only needs modulus).
This GCD is very easy to factor, since it is much smaller than the
input number.
Also note that my algorithm is separable. What I mean is, let's
form some numbers called t, u, v, w, and x.
t = product of all primes from 1st to ith
u = product of all primes from (i+1)th to jth
v = product of all primes from (j+1)th to kth
w = product of all primes from (k+1)th to lth
x = product of all primes from (l+1)th to mth
if our huge input number is called q,
then if gcd(q,t) = 1
and gcd(q,u) = 1
and gcd(q,v) = 1
and gcd(q,w) = 1
and gcd(q,x) = 1
then gcd(q, t*u*v*w*x) = 1;
further, q has no prime factors up to the mth prime
(so if q is less than the square of that prime, q must itself be prime).
If, on the other hand, any of the gcd's exceeds 1,
then it is a product of prime factors of q.
My algorithm will not reveal a lot about input numbers smaller
than m, other than primality. But with big inputs, each gcd
will be revealing.
I calculate that 4000 cd-roms could contain enough factors
to completely factor a 100 bit number. Further, with 4000
cpu's (each with its own cd reader) that 100 bit number could
be factored in the time it takes to read a single cd-rom (which
is obviously the rate-limiting step in this problem).
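Here is a toy illustration of that separability with word-sized
numbers (just a sketch; the real scheme needs multiprecision
products, and my choice of blocks here is arbitrary):

#include <stdio.h>

/* Plain Euclid, enough for the demonstration. */
static unsigned long gcd(unsigned long a, unsigned long b)
{
    while (b != 0) {
        unsigned long r = a % b;
        a = b;
        b = r;
    }
    return a;
}

int main(void)
{
    /* Two blocks of primes: t = 2*3*5, u = 7*11*13. */
    unsigned long t = 2UL * 3 * 5;    /*   30 */
    unsigned long u = 7UL * 11 * 13;  /* 1001 */
    unsigned long q = 7UL * 101;      /*  707, the input number */

    /* Each block can be tested independently; any gcd > 1
       exposes the prime factors of q lying in that block. */
    printf("gcd(q, t) = %lu\n", gcd(q, t));  /* 1: no factor in {2,3,5} */
    printf("gcd(q, u) = %lu\n", gcd(q, u));  /* 7: a factor of q */
    return 0;
}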
--
The opinions expressed in this message are my own personal views
and do not reflect the official views of Microsoft Corporation.
One possibly useful output from the technique described is the
method used for calculation of the modulus. Although it is not
a revolutionary or new notion, I have found that all of the extended
precision integer classes that I have examined (though my survey is
certainly not comprehensive) invariably do the actual division to
calculate the modulus. This makes GCD algorithms run orders of
magnitude slower than they need to. If they were to implement a simple
ripple modulus like the one found in my code, the mod operation would
be much faster for these packages.
Note that this is a VERY EXPENSIVE way of factoring such a number. A
100-bit number is about 30 digits long. If it's not prime, there must
be a factor of at most 15 digits. Such factors are easily found with
even a straightforward implementation of the elliptic curve method
(without the 2nd step, that is) in less than an hour on a workstation.
Redo your calculation for 100-digit (= about 330-bit) numbers. Then
we're getting into the range that's interesting nowadays. Quite a nice
illustration of why smart algorithms really pay off ...
Nils Bruin
> See the previous thread,
> "How fast does the product of all primes up to N grow in size"
> for reference.
>
> My proposed algorithm will give the GCD (which only needs modulus).
> This GCD is very easy to factor, since it is much smaller than the
> input number.
>
Not so fast. In order to find the GCD you have to DIVIDE the product by
your number. How long does this division take? Time proportional to the
size of your number!
> Also note that my algorithm is separable. What I mean is, let's
> form some numbers called t, u, v, w, and x.
> t = product of all primes from 1st to ith
> u = product of all primes from (i+1)th to jth
> v = product of all primes from (j+1)th to kth
> w = product of all primes from (k+1)th to lth
> x = product of all primes from (l+1)th to mth
>
> if our huge input number is called q,
> then if gcd(q,t) = 1
> and gcd(q,u) = 1
> and gcd(q,v) = 1
> and gcd(q,w) = 1
> and gcd(q,x) = 1
> then gcd(q, t*u*v*w*x) = 1;
> further, q has no prime factors up to the mth prime
> (so if q is less than the square of that prime, q must itself be prime).
> If, on the other hand, any of the gcd's exceeds 1,
> then it is a product of prime factors of q.
>
A marginal improvement. Now if it does not factor easily then it takes
multiple divisions. I am still not impressed by this as an idea. I
mean, this approach might be useful to deal with the first hundred or
so primes. However it scales VERY badly.
> My algorithm will not reveal a lot about input numbers smaller
> than m, other than primality. But with big inputs, each gcd
> will be revealing.
>
And will take correspondingly longer to calculate.
> I calculate that 4000 cd-roms could contain enough factors
> to completely factor a 100 bit number. Further, with 4000
> cpu's (each with its own cd reader) that 100 bit number could
> be factored in the time it takes to read a single cd-rom (which
> is obviously the rate-limiting step in this problem).
Actually not. Before you can GET the GCD you have to go through a good
chunk of the cd-roms. (If it has no small factors then you might have
to go through all of them.) What is wrong with this picture?
This is not to mention the "detail" of how inconvenient 4000 cd-roms of
memory is. (That is a lot of memory. How much does, say, UPS use?)
Standard factoring algorithms are FAR faster than this one would be.
And can handle *much* larger numbers with reasonable computing power.
Ben Tilly
(If Microsoft uses algorithms like this, that would explain some
things...)
>See the previous thread,
>"How fast does the product of all primes up to N grow in size"
>for reference.
I replied to this earlier, but the post did not seem to make it out to
the net.
The product of all primes less than N is asymptotically
exp((1 + o(1)) N)
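A quick numerical check of that growth rate, with a throwaway
sieve (a sketch, nothing more; link with -lm):

#include <stdio.h>
#include <math.h>

#define N 1000000

/* Sum ln p over primes p <= N.  Chebyshev/PNT says this sum is
   (1 + o(1)) * N, i.e. the product of the primes up to N is
   exp((1 + o(1)) N). */
int main(void)
{
    static char composite[N + 1];
    double theta = 0.0;
    long i, j;

    for (i = 2; i <= N; i++) {
        if (!composite[i]) {
            theta += log((double) i);
            for (j = i + i; j <= N; j += i)
                composite[j] = 1;
        }
    }
    printf("N = %d, sum of ln p = %.0f, ratio = %f\n",
           N, theta, theta / N);  /* ratio tends to 1 */
    return 0;
}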
>My proposed algorithm will give the GCD (which only needs modulus).
>This GCD is very easy to factor, since it is much smaller than the
>input number.
>I calculate that 4000 cd-roms could contain enough factors
>to completely factor a 100 bit number. Further, with 4000
Not even close. With N = 10^100 the product of the primes needed
is about e^(10^50). If one could store each bit (somehow)
in a space the size of the Planck length, there would not be
enough space in the entire universe to hold the number.
You can lead a horse's ass to knowledge, but you can't make him think.
The largest 100 bit number I know of is approximately 1.26765e30.
The square root of that number is about 1.1259e15. After all,
if any number has one or more factors, at least one is <= sqrt(x).
One CD can hold about 8 billion bits.
1.1259e15/8e9 = 1.4e5
140,000 or so is a lot more than 4000, but it will fit in the known universe.
>You can lead a horse's ass to knowledge, but you can't make him think.
>
Neigh! Whinny! Snort!
actually (2^101-1) is approximately 2.5353e30,
the square root of which is about 1.5924e15
I was previously thinking of 2^100, which is the SMALLEST 100 bit
number, not the largest.
>The square root of that number is about 1.1259e15. After all,
>if any number has one or more factors, at least one is <= sqrt(x).
>One CD can hold about 8 billion bits.
>1.1259e15/8e9 = 1.4e5
>
>140,000 or so is a lot more than 4000, but it will fit in the known universe.
Ok, 200K CD's. But we may be able to compress the data, on the other hand.
Even this large number will fit into a single Galaxy (a Ford Galaxy 500).
>
>>You can lead a horse's ass to knowledge, but you can't make him think.
>Neigh! Whinny! Snort!
This time I plead Donkey...
EeeAwwh! EeeAwwh! EeeAwwh!
> a-c...@microsoft.com (Dann Corbit) wrote:
>
> >See the previous thread,
> >"How fast does the product of all primes up to N grow in size"
> >for reference.
>
> I replied to this earlier, but the post did not seem to make it out to
> the net.
>
> The product of all primes less than N is asymptotically
>
> exp((1 + o(1)) N)
>
> >My proposed algorithm will give the GCD (which only needs modulus).
> >This GCD is very easy to factor, since it is much smaller than the
> >input number.
>
>
> >I calculate that 4000 cd-roms could contain enough factors
> >to completely factor a 100 bit number. Further, with 4000
>
> Not even close. With N = 10^100 the product of the primes needed
> is about e^(10^50). If one could store each bit (somehow)
> in a space the size of the Planck length, there would not be
> enough space in the entire universe to hold the number.
>
Sorry, but Dann is in the right general ballpark here...
100 *bits* is 30 digits. To factor it I just need the primes up to
10^15. Therefore I am dealing with a number of size e^(10^15), which
means that I need on the order of 10^15 digits. (Actually it is a
bit smaller than that.) With 4000 cd-roms that brings it down to
something on the order of 10^11 digits per cd-rom. I am not sure
whether this is doable on a good cd-rom, but it is not
too outrageous.
Of course, for the reason that you are talking about, it is impossible
for us to solve interesting problems using this method...
>
> You can lead a horse's ass to knowledge, but you can't make him think.
>
Ben Tilly
long gcd(long a, long b)
{
    long tmp;
    int  shiftcount = 0;

    /*******************************************************************/
    /* This zero testing may seem odd, since zero is not likely.       */
    /* However, knowing that neither a nor b is zero will speed up     */
    /* later operations greatly by eliminating the tests for zero.     */
    /*******************************************************************/
    if (a == 0L)
    {
        tmp = b;
    }
    else if (b == 0L)
    {
        tmp = a;
    }
    else /* Neither a nor b is zero. */
    {
        /* Work with absolute values. */
        a = a > 0 ? a : -a;
        b = b > 0 ? b : -b;
        /**************************************************************/
        /* While all this fuss about numbers divisible by 2 may seem  */
        /* like quite a bother, half of the integers in the universe  */
        /* are evenly divisible by 2.  Hence, on a random sample of   */
        /* input values, great benefit will be realized.  The odds    */
        /* that at least one of a, b is even is 1 - (1/2)*(1/2) = .75,*/
        /* since the probability that both are odd is .25.            */
        /**************************************************************/
        /* If the last bit is 0, the number is evenly divisible by 2. */
        /* While a and b are both even, gcd(a,b) = 2*gcd(a/2,b/2).    */
        /**************************************************************/
        while (!(a & 1L) && !(b & 1L))
        {
            a >>= 1;
            b >>= 1;
            shiftcount++;
        }
        /**************************************************************/
        /* If a is divisible by 2 and b is not, then                  */
        /* gcd(a,b) = gcd(a/2,b).                                     */
        /**************************************************************/
        while (!(a & 1L))
        {
            a >>= 1;
        }
        /*******************************************************/
        /* If b is divisible by 2 and a is not, then           */
        /* gcd(a,b) = gcd(a,b/2).                              */
        /*******************************************************/
        while (!(b & 1L))
        {
            b >>= 1;
        }
        /**********************************************************************/
        /* Make sure the numbers are in the proper order (swap if necessary). */
        /**********************************************************************/
        if (b > a)
        {
            tmp = a;
            a = b;
            b = tmp;
        }
        /****************************************/
        /* Euclid's famous gcd algorithm:       */
        /* iterate until the remainder is <= 1. */
        /****************************************/
        while (b > 1)
        {
            tmp = b;
            b = a % b;
            a = tmp;
        }
        if (b == 0)
            tmp = a;
        else
            tmp = b;
        /***********************************************************************/
        /* If we divided BOTH numbers by factors of 2, we must compensate now. */
        /***********************************************************************/
        if (shiftcount > 0 && tmp > 0L)
            tmp <<= shiftcount;
    }
    return (tmp);
}
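A quick driver for the routine above (assuming the signature
long gcd(long, long), which is my own reconstruction):

#include <stdio.h>

long gcd(long a, long b);  /* the routine above */

int main(void)
{
    printf("gcd(48, 36)  = %ld\n", gcd(48L, 36L));   /* 12 */
    printf("gcd(17, 255) = %ld\n", gcd(17L, 255L));  /* 17 */
    printf("gcd(0, 42)   = %ld\n", gcd(0L, 42L));    /* 42 */
    return 0;
}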
>Standard factoring algorithms are FAR faster than this one would be.
>And can handle *much* larger numbers with reasonable computing power.
Absolutely agreed. There may be some smart folks out there that can
see some useful purpose for the math involved however, or it may be
useful for factoring in certain bands.
>Ben Tilly
>
>(If Microsoft uses algorithms like this, that would explain some
>things...)
It certainly would not explain how they became the most powerful and
influential software company in the world.
--
The opinions expressed in this message are my own personal views
and do not reflect the official views of Microsoft Corporation.
In fact, I'm just a subcontractor, not an employee, so pull in your claws.
Hmmm... how does your computer implement % :-)
Yes, but this is after the first division. That first division
is going to take time proportional to the size (in bits) of your number.
: >Standard factoring algorithms are FAR faster than this one would be.
: >And can handle *much* larger numbers with reasonable computing power.
: Absolutely agreed. There may be some smart folks out there that can
: see some useful purpose for the math involved however, or it may be
: useful for factoring in certain bands.
Until it becomes faster than trial division, it is unlikely
to be useful as anything other than the answer to exercise 4.5.4.3.
: >Ben Tilly
: >
: >(If Microsoft uses algorithms like this, that would explain some
: >things...)
: It certainly would not explain how they became the most powerful and
: influential software company in the world.
Announcing something that is at best 25% slower, and measurably
less useful, than the slowest algorithm in use as an advance seems to
be precisely what makes Microsoft both extremely profitable and hideously
annoying. Though I thought that claiming that long division was orders
of magnitude faster than long division was a classic worthy of the master,
and much more Microsoftian.
: --
: The opinions expressed in this message are my own personal views
: and do not reflect the official views of Microsoft Corporation.
: In fact, I'm just a subcontractor, not an employee, so pull in your claws.
--
David M Einstein | Lord, grant me the companionship of those who seek | the truth, and protection from those who have found
Ben, you ought to know better. But Dann's response is off-track too (he
just replaces division by taking the mod). See Knuth, Vol. 2, I think.
Off the top of my head (and absolutely untested):
unsigned gcd(unsigned i, unsigned j)
{
    unsigned tmp, i2 = 0, j2 = 0;
    if(i == 0) return j;
    if(j == 0) return i;
    if(i == 1 || j == 1) return 1;
    while(!(i & 1)) { i >>= 1; i2++;}
    while(!(j & 1)) { j >>= 1; j2++;}
    if(i > j) { tmp = i; i = j; j = tmp;}
    while(i != 0) {
        j -= i;
        if(j != 0) while(!(j & 1)) j >>= 1;
        if(i > j) { tmp = i; i = j; j = tmp;}
    }
    return j << (i2 < j2 ? i2 : j2);
}
See? No division, except by 2 (which is much faster than true division
on a binary machine, because it is a shift). Knuth discusses
the merits of both Euclid's and this (binary) algorithm at length. Both
are similar in time complexity and worst case. Euclid's worst case is when
the two numbers are successive Fibonacci numbers; off-hand I do not know
the worst case for the binary algorithm, but it is similar.
{ From a purely mathematical viewpoint, there is of course no difference
between a division by 2 and a division by any other number. From a
computer scientist's viewpoint there is a huge difference, unless he
is a Soviet computer scientist working with the, now defunct, base-3
computer (Setun). }
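To see that Fibonacci worst case concretely, a throwaway count of
the division steps (again untested beyond a quick look):

#include <stdio.h>

/* Count the division steps Euclid's algorithm takes; consecutive
   Fibonacci numbers force roughly one step per index, the
   classical worst case (Lame's theorem). */
static int euclid_steps(unsigned long a, unsigned long b)
{
    int steps = 0;
    while (b != 0) {
        unsigned long r = a % b;
        a = b;
        b = r;
        steps++;
    }
    return steps;
}

int main(void)
{
    unsigned long f1 = 1, f2 = 1, next;
    int n;

    for (n = 2; n <= 20; n++) {  /* f2 becomes F(21), f1 becomes F(20) */
        next = f1 + f2;
        f1 = f2;
        f2 = next;
    }
    printf("euclid_steps(%lu, %lu) = %d\n",
           f2, f1, euclid_steps(f2, f1));
    return 0;
}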
Still, the question and the response are not in any way related to the
question whether the method is feasible or not. Doing a MOD (or GCD)
of a large product of primes and another (smaller) number has a time
complexity related to the size of the largest number (as Ben states, and Dann
implicitly states when he talks about the time needed to read the numbers).
> >Standard factoring algorithms are FAR faster than this one would be.
> >And can handle *much* larger numbers with reasonable computing power.
This one is absolutely true. Factoring a 60 digit number (decimal
digits, that is) will take only a few minutes on a reasonable
workstation. Also, a Cray will factor a 20 digit number using trial
division within one hour. (And you may precompute the primes or not;
it will not make very much difference.) Standard methods will do it
in minutes, or even seconds.
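For scale, trial division itself is only a few lines (a word-sized
sketch; the timings above refer to multiprecision versions):

#include <stdio.h>

/* Factor n by trial division: strip the 2s, then try odd divisors
   up to sqrt(n).  Fine for word-sized n; for 20-digit numbers you
   need multiprecision arithmetic and a lot of patience. */
static void trial_factor(unsigned long n)
{
    unsigned long d;

    while (n % 2 == 0) {
        printf("2 ");
        n /= 2;
    }
    for (d = 3; d * d <= n; d += 2) {
        while (n % d == 0) {
            printf("%lu ", d);
            n /= d;
        }
    }
    if (n > 1)
        printf("%lu", n);
    printf("\n");
}

int main(void)
{
    trial_factor(1234567890UL);  /* prints: 2 3 3 5 3607 3803 */
    return 0;
}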
>
> Absolutely agreed. There may be some smart folks out there that can
> see some useful purpose for the math involved however, or it may be
> useful for factoring in certain bands.
I do not think it is useful at all. Not even in primality testing
(which is all Dann's algorithm really does). And for Dann, the program
I have to test primality is by Cohen and Lenstra; there is a short
description of it in a back issue of Mathematics of Computation. I
think the 1981-1983 period.
...
> >(If Microsoft uses algorithms like this, that would explain some
> >things...)
I think, Ben, that this is uncalled for. Posting from microsoft in no
way should involve microsoft in the postings (as my posting this from
cwi should in no way involve cwi in the contents).
> It certainly would not explain how they became the most powerful and
> influential software company in the world.
On the other hand, Dann, being the most powerful and influential does
not make it necessarily the best.
Can we dispose of affiliations?
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924098
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Repeated subtraction :-)
roy
If the number were expressed in hex rather than decimal base, I would not
even need the multiplication. This is faster than division.
>
> Yes, but this is after the first division. That first division
>is going to take time proportional to the size (in bits) of your number.
>
>: >Standard factoring algorithms are FAR faster than this one would be.
>: >And can handle *much* larger numbers with reasonable computing power.
>
>: Absolutely agreed. There may be some smart folks out there that can
>: see some useful purpose for the math involved however, or it may be
>: useful for factoring in certain bands.
>
> Until it becomes faster than trial division, it is unlikely
>to be useful as anything other than the answer to exercise 4.5.4.3.
Agreed.
>: >Ben Tilly
>: >
>: >(If Microsoft uses algorithms like this, that would explain some
>: >things...)
>: It certainly would not explain how they became the most powerful and
>: influential software company in the world.
>
> Announcing something that is at best 25% slower, and measurably
>less useful, than the slowest algorithm in use as an advance seems to
>be precisely what makes Microsoft both extremely profitable and hideously
>annoying. Though I thought that claiming that long division was orders
>of magnitude faster than long division was a classic worthy of the master,
>and much more Microsoftian.
I never claimed it was an advance, nor did I say that it was faster than
any existing algorithm. I did want to explore the speed of the algorithm,
because when I originally posted, I had no idea how it would perform when
scaled up. Several mathematically inclined individuals did send me
friendly, courteous email messages to explain the behavior in terms that
I could understand. My last line was saying that IF MS used algorithms
that were inefficient, they would not be successful. You apparently missed
my meaning.
>: --
>: The opinions expressed in this message are my own personal views
>: and do not reflect the official views of Microsoft Corporation.
>: In fact, I'm just a subcontractor, not an employee, so pull in your claws.
>
>--
>David M Einstein | Lord, grant me the companionship of those who seek | the truth, and protection from those who have found
--
DTW> Ben, you ought to know better. But Dann's response is off-track too
DTW> (he just replaces division by taking the mod). See Knuth, Vol. 2, I think.
When that book was written, division and mod operations were
slow. The binary method (when I implemented it) was quite a bit
slower than the "modern" Euclid method. A factor of 3.8 slower,
I think it was. It was written in an HLL, and an assembler version
could make the binary method better than my implementation, but on
modern computers I think the binary method will still be slower.
DTW> See? No division, except by 2 (which is much faster than true division
DTW> on a binary machine, because it is a shift). Knuth discusses
DTW> the merits of both Euclid's and this (binary) algorithm at length. Both
DTW> are similar in time complexity and worst case. Euclid's worst case
DTW> is when the two numbers are successive Fibonacci numbers; off-hand I do
DTW> not know the worst case for the binary algorithm, but it is similar.
I think the analysis is the same, but I'm not sure.
Jud McCranie jud.mc...@swsbbs.com
... Happiness is a composite number with a small factor.
* Silver Xpress V4.3 SW20178
I apologize, I was extremely rude.
: --
: The opinions expressed in this message are my own personal views
: and do not reflect the official views of Microsoft Corporation.
: In fact, I'm just a subcontractor, not an employee, so pull in your claws.
--
David M Einstein | Lord, grant me the companionship of those who seek | the truth, and protection from those who have found
> In article <4gl686$2...@news.microsoft.com> a-c...@microsoft.com (Dann Corbit) writes:
> > In article <4gdgpe$k...@dartvax.dartmouth.edu>, Benjamin...@dartmouth.edu says...
> > >Not so fast. In order to find the GCD you have to DIVIDE the product by
> > >your number. How long does this division take? Time proportional to the
> > >size of your number!
>
> Ben, you ought to know better. But Dann's response is off-track too (he
> just replaces division by taking the mod). See Knuth, Vol. 2, I think.
> Off the top of my head (and absolutely untested):
> unsigned gcd(unsigned i, unsigned j)
> {
>     unsigned tmp, i2 = 0, j2 = 0;
>     if(i == 0) return j;
>     if(j == 0) return i;
>     if(i == 1 || j == 1) return 1;
>     while(!(i & 1)) { i >>= 1; i2++;}
>     while(!(j & 1)) { j >>= 1; j2++;}
>     if(i > j) { tmp = i; i = j; j = tmp;}
>     while(i != 0) {
>         j -= i;
>         if(j != 0) while(!(j & 1)) j >>= 1;
>         if(i > j) { tmp = i; i = j; j = tmp;}
>     }
>     return j << (i2 < j2 ? i2 : j2);
> }
>
> See? No division, except by 2 (which is much faster than true division
> on a binary machine, because it is a shift). Knuth discusses
> the merits of both Euclid's and this (binary) algorithm at length. Both
> are similar in time complexity and worst case. Euclid's worst case is when
> the two numbers are successive Fibonacci numbers; off-hand I do not know
> the worst case for the binary algorithm, but it is similar.
>
> { From a purely mathematical viewpoint, there is of course no difference
> between a division by 2 and a division by any other number. From a
> computer scientist's viewpoint there is a huge difference, unless he
> is a Soviet computer scientist working with the, now defunct, base-3
> computer (Setun). }
>
I am a mathematician, so to me there is no difference...
(In fact later, possibly even in the same post though I think not, I
explained why it is that you cannot do more than a constant factor
better than division gives you.)
> Still, the question and the response are not in any way related to the
> question whether the method is feasible or not. Doing a MOD (or GCD)
> of a large product of primes and another (smaller) number has a time
> complexity related to the size of the largest number (as Ben states, and Dann
> implicitly states when he talks about the time needed to read the numbers).
[..]
> > >(If Microsoft uses algorithms like this, that would explain some
> > >things...)
>
> I think, Ben, that this is uncalled for. Posting from microsoft in no
> way should involve microsoft in the postings (as my posting this from
> cwi should in no way involve cwi in the contents).
>
I agree that it is a cheap shot. However, I was getting irritated:
when he first asked about this method, I sent him an e-mail explanation
of why it did not work, and then when he kept on posting for a bit
about how good he thought this method was, I just could not
resist.
I should not have posted that.
> > It certainly would not explain how they became the most powerful and
> > influential software company in the world.
>
> On the other hand, Dann, being the most powerful and influential does
> not make it necessarily the best.
>
> Can we dispose of affiliations?
Ben Tilly
A 100 bit number is less than 2^100, not 10^100. Thus, the product of
primes needed is exp(2^50) which has about 1.62433e15 bits. A CD-ROM
holds 650 MB which is about 5452595200 bits. This means that about
300000 CD-ROMs could hold the appropriate LCM. Thus, it is physically
possible to hold this number. Feasibility is a different matter.
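Spelled out, for anyone who wants to check the arithmetic (a
throwaway calculation; link with -lm):

#include <stdio.h>
#include <math.h>

/* exp(2^50) has 2^50 / ln 2 bits; divide by the bits on one
   650 MB CD-ROM to estimate how many discs the product needs. */
int main(void)
{
    double bits_needed = pow(2.0, 50.0) / log(2.0);
    double bits_per_cd = 650.0 * 1024 * 1024 * 8;

    printf("bits needed: %.5e\n", bits_needed);               /* ~1.62e15 */
    printf("bits per cd: %.5e\n", bits_per_cd);               /* ~5.45e9  */
    printf("cd-roms:     %.0f\n", bits_needed / bits_per_cd); /* ~300000  */
    return 0;
}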
Rob Johnson
Apple Computer, Inc.
rjoh...@apple.com
Product of primes consists of words 1 to n.
Most significant word is word 1, least is n.
Here is our big product of primes number:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+
|1| | | | | | | | | | | | | | | | | | ...|n|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+
Input to test is one word. Here is our test word:
+-+
|w|
+-+
1. Take the mod of the first two words of our product of primes
   with our test word.
2. Take the remainder as the most significant word and concatenate
   it with the third word from our big prime product.
3. Take the mod of the newly formed number with our test word.
4. Take the remainder as the most significant word and concatenate
   it with the next word from our big prime product.
5. Go to 3 until we reach word n.
In the sample I provided, the modulus of a 28000+ digit number with
a 32 bit integer is found in 3145 64-bit integer modulus operations.
If the number were expressed in hexadecimal format, then the 3145
integer multiplications would not be necessary.
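A minimal C sketch of this ripple modulus (assuming 32-bit words
and a 64-bit intermediate, as in the sample timings above):

#include <stdio.h>

/* Ripple modulus: reduce a multiword number (most significant
   word first, as in the diagram above) modulo one test word w,
   exactly as in steps 1-5: carry the remainder into the next
   word and do one machine mod per word.  Assumes a 32-bit
   unsigned int and a 64-bit unsigned long long. */
static unsigned int ripple_mod(const unsigned int *words, int n,
                               unsigned int w)
{
    unsigned long long rem = 0;
    int i;

    for (i = 0; i < n; i++) {
        /* Concatenate the running remainder with the next word,
           then reduce by the test word. */
        rem = ((rem << 32) | words[i]) % w;
    }
    return (unsigned int) rem;
}

int main(void)
{
    /* These two words represent 2^32, and 2^32 mod 7 = 4. */
    unsigned int big[] = { 1U, 0U };

    printf("mod = %u\n", ripple_mod(big, 2, 7U));  /* prints 4 */
    return 0;
}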
By the way, I once had a cat named "PI".