(!?) need fast algorithm for computing factorials

Adam Russell

unread,

Nov 7, 2003, 1:49:39 PM11/7/03

to

I am looking for the fastest algorithm for computing factorials (ex:
4!=4*3*2*1). This is for a programming contest I am thinking of entering.
The program to be used is Labview and it needs to compute up to 10000!. I
realize that these numbers will be too large for simple representation, and
I think I can handle that. But I believe that simply multiplying each and
every number probably is not the most efficient algorithm so would like
advice on any numeric methods that might be quicker.

Richard Heathfield

unread,

Nov 7, 2003, 2:18:38 PM11/7/03

to

Adam Russell wrote:

> I am looking for the fastest algorithm for computing factorials (ex:
> 4!=4*3*2*1). This is for a programming contest I am thinking of entering.
> The program to be used is Labview and it needs to compute up to 10000!.

http://www.rjgh.co.uk/prg/c/misc/f10000.c can do this quite quickly. Just
supply it with the appropriate configuration file which you can find at
http://www.rjgh.co.uk/prg/c/mist/f10000.txt - like this:

Linux:
./f10000 < f10000.txt
Windows:
f10000 < f10000.txt

On my rather slow system, it completes in about 0.02 CPU seconds.

> I
> realize that these numbers will be too large for simple representation,
> and
> I think I can handle that. But I believe that simply multiplying each and
> every number probably is not the most efficient algorithm so would like
> advice on any numeric methods that might be quicker.

Consider gathering multiples of 5 and multiples of 2. Make as many pairs of
these as you can, and don't bother multiplying them. Just add that many 0s
to the output at the end.

The number 10000 has almost 2500 zeros at the end (it's a 35660 digit number
when represented in decimal). That's a lot of multiplying you can avoid
doing.

If you're allowed to do this in hex, even greater savings can be made, since
you can avoid multiplying by an /awful/ lot of numbers.

--
Richard Heathfield : bin...@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Ben Pfaff

unread,

Nov 7, 2003, 2:19:37 PM11/7/03

to

"Adam Russell" <REMOVE_THIS...@sbcglobal.net> writes:

How much precision do you need? n! is approximately sqrt(2 * pi
* n) * (n / e)**n, where e is the base of the natural logarithm,
according to Knuth Vol. 1.
--
"The only problem with Linux for Dummies is that the advice it
contains will result only in embarrassment and inconvenience if
followed, not actual death."
--henke at insync dot net

osmium

unread,

Nov 7, 2003, 4:42:58 PM11/7/03

to

Ben Pfaff writes:

> How much precision do you need? n! is approximately sqrt(2 * pi
> * n) * (n / e)**n, where e is the base of the natural logarithm,
> according to Knuth Vol. 1.

It seems to me that Stirling's equation is just going to crank out noise
long before he gets to his target. Or were you going to compute e and pi to
10,000 digits?

Ben Pfaff

unread,

Nov 7, 2003, 3:17:14 PM11/7/03

to

"osmium" <r124c...@comcast.net> writes:

Hence, the question about how much precision he needs.
--
"The BOFH persists despite the resistance and criticism from those
other people, and takes the road less traveled by. The BOFH does
not follow the crowd merely to gain the crowd's approval. Even if
the crowd is made of BOFHs." --Greg Andrews

Richard Heathfield

unread,

Nov 7, 2003, 3:20:46 PM11/7/03

to

osmium wrote:

rjh@tux:~/dev/maffs> cat stirling.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char **argv)
{
if(argc > 1)
{
double e = 2.71828182845904523536;
double pi = 3.14159265358979323846;
double n = strtod(argv[1], NULL);
double nf = sqrt(2 * pi * n) * pow(n / e, n);
printf("%f\n", nf);
}
return 0;
}

rjh@tux:~/dev/maffs> gcc -W -Wall -ansi -pedantic -O2 -o stirling stirling.c
-lm
rjh@tux:~/dev/maffs> ./stirling 10000
inf

Peter Ammon

unread,

Nov 7, 2003, 3:25:08 PM11/7/03

to

Adam Russell wrote:

It may seem obvious, but the fastest algorithm will be a precomputed
lookup table. A quick sum of log_2 n! from n = 1 to 10000 using
Stirling's approximation and rounding up each term up to the nearest
multiple of 8 (so you don't have to use fractional bytes in your
representations), reveals that you'll need ~556163952 bits, which is
around 70 MB for a lookup table, if you use an efficient binary
representation of the results. That would easily fit in the memory of a
modern machine.

-Peter

Sheldon Simms

unread,

Nov 7, 2003, 3:40:53 PM11/7/03

to

If it is allowed for the contest, use the GMP library. It is
faster than anything you could come up with quickly.

Sheldon Simms

unread,

Nov 7, 2003, 3:39:43 PM11/7/03

to

On Fri, 07 Nov 2003 19:18:38 +0000, Richard Heathfield wrote:

> Adam Russell wrote:
>
>> I am looking for the fastest algorithm for computing factorials (ex:
>> 4!=4*3*2*1). This is for a programming contest I am thinking of entering.
>> The program to be used is Labview and it needs to compute up to 10000!.
>
> http://www.rjgh.co.uk/prg/c/misc/f10000.c can do this quite quickly. Just
> supply it with the appropriate configuration file which you can find at
> http://www.rjgh.co.uk/prg/c/mist/f10000.txt - like this:
>
> Linux:
> ./f10000 < f10000.txt
> Windows:
> f10000 < f10000.txt
>
> On my rather slow system, it completes in about 0.02 CPU seconds.

This doesn't meet the OP's requirements. He wants to compute
factorials _up to_ 10000!.

If you would just supply the other 9999 configuration files please...

Richard Heathfield

unread,

Nov 7, 2003, 4:01:25 PM11/7/03

to

Sheldon Simms wrote:

No need. Just divide that number by 10000, and then by 9999, and then by
9998...and so on! :-)

(BTW You probably noticed that I did actually give the OP a genuine hint
later in my article.)

Adam Russell

unread,

Nov 7, 2003, 5:15:57 PM11/7/03

to

"Ben Pfaff" <b...@cs.stanford.edu> wrote in message
news:878ymsm...@pfaff.stanford.edu...

> "Adam Russell" <REMOVE_THIS...@sbcglobal.net> writes:
>
> > I am looking for the fastest algorithm for computing factorials (ex:
> > 4!=4*3*2*1). This is for a programming contest I am thinking of
entering.
> > The program to be used is Labview and it needs to compute up to 10000!.
I
> > realize that these numbers will be too large for simple representation,
and
> > I think I can handle that. But I believe that simply multiplying each
and
> > every number probably is not the most efficient algorithm so would like
> > advice on any numeric methods that might be quicker.
>
> How much precision do you need? n! is approximately sqrt(2 * pi
> * n) * (n / e)**n, where e is the base of the natural logarithm,
> according to Knuth Vol. 1.

Only doing integers and the answer has to be exact. Output will be in
string form.

Adam Russell

unread,

Nov 7, 2003, 5:20:54 PM11/7/03

to

"Sheldon Simms" <sheldo...@yahoo.com> wrote in message
news:pan.2003.11.07....@yahoo.com...

Thanks but I probably should have mentioned that precomputed lookup tables
and external code is not allowed. It has to be entirely my code and
entirely in Labview. I am allowed to borrow from others algorithms tho.

Mel Wilson

unread,

Nov 7, 2003, 4:35:22 PM11/7/03

to

In article <bogpg0$1e6ek7$1...@ID-122512.news.uni-berlin.de>,

If limited precision will do, _Numerical Recipes_
recommends the gamma function, (in some applications, the
log of the gamma function is even easier to use.)
In Python:

from math import log
def gammln (x):
tmp = x + 4.5
tmp = (x - 0.5) * log (tmp) - tmp
ser = 1.0
for c in (76.18009173, -86.50532033, 24.01409822
, -1.231739516, 0.120858003e-2, -0.536382e-5):
ser += c / x
x += 1.0
return tmp + log (2.50662827465 * ser)

if __name__ == '__main__':
import sys
print 'x\tlog(x!) \tx!'
for a in sys.argv[1:]:
f = float(a)
g = gammln (f+1) / log(10.0) # log10(f!)
try:
print '%g:\t%g,\t%g' % (f, g, 10**g)
except StandardError:
print '%g:\t%g' % (f, g)

Regards. Mel.

Christer Ericson

unread,

Nov 7, 2003, 8:35:27 PM11/7/03

to

In article <bogpg0$1e6ek7$1...@ID-122512.news.uni-berlin.de>,
REMOVE_THIS...@sbcglobal.net says...

> I am looking for the fastest algorithm for computing factorials (ex:
> 4!=4*3*2*1). This is for a programming contest I am thinking of entering.
> The program to be used is Labview and it needs to compute up to 10000!. I
> realize that these numbers will be too large for simple representation, and
> I think I can handle that.

I'm not familiar with Labview, but yes, you do need some type
of 'bignum' integer package to work with numbers of that size.

> But I believe that simply multiplying each and every number probably is
> not the most efficient algorithm so would like advice on any numeric
> methods that might be quicker.

You are correct. The naive algorithm of multiplying all numbers
from 1 through N is not the best algorithm for computing N!.

A better algorithm is, um, "Ericson's algorithm", as first
described here:

http://groups.google.com/groups?selm=christer.689772066%40berghem

Updating my previous description to ANSI C, the naive algorithm
might look like so when implemented:

int naivefac(int n) {
int x = 1;
while (n > 1)
x *= n--;
return x;
}

This uses O(n) multiplications, O(n) additions/subtractions, and
O(n) comparisons.

My faster algorithm computes N! as follows:

int fac(int n) {
int x, y;

x = (n & 1) ? n-- : 1;
y = n;
while (n) {
x *= y;
y += (n -= 2);
}
return x;
}

This uses O(n/2) multiplications, O(n) additions/subtractions, and
O(n/2) comparisons.

if we assume unit costs for multiplication, addition/subtraction,
and comparisons, this algorithm is faster by a factor of 1.5
over the naive algorithm. However, multiplication is not linear
in N, but rather of complexity O(N^(lg 3)) (for Karatsuba
multiplication) so reducing the number of multiplications in
half is usually a big win.

Christer Ericson
Sony Computer Entertainment, Santa Monica

Sheldon Simms

unread,

Nov 7, 2003, 8:43:08 PM11/7/03

to

On Fri, 07 Nov 2003 21:01:25 +0000, Richard Heathfield wrote:

> Sheldon Simms wrote:
>
>> On Fri, 07 Nov 2003 19:18:38 +0000, Richard Heathfield wrote:
>>
>>> f10000 < f10000.txt
>>>
>>> On my rather slow system, it completes in about 0.02 CPU seconds.
>>
>> This doesn't meet the OP's requirements. He wants to compute
>> factorials _up to_ 10000!.
>>
>> If you would just supply the other 9999 configuration files please...
>
> No need. Just divide that number by 10000, and then by 9999, and then by
> 9998...and so on! :-)
>
> (BTW You probably noticed that I did actually give the OP a genuine hint
> later in my article.)

Yes I did, but it's no fun to gripe at actual useful information.

CBFalconer

unread,

Nov 8, 2003, 1:00:22 AM11/8/03

to

Along those lines, I did this about a year ago:

/* compute factorials, extended range
on a 32 bit machine this can reach fact(15) without
unusual output formats. With the prime table shown
overflow occurs at 101.

Public domain, by C.B. Falconer.
*/

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

/* 2 and 5 are handled separately
Placing 2 at the end attempts to preserve such factors
for use with the 5 factor and exponential notation
*/
static unsigned char primes[] = {3,7,11,13,17,19,23,29,31,37,
41,43,47,53,57,59,61,67,71,
/* add further primes here -->*/
2,5,0};
static unsigned int primect[sizeof primes]; /* = {0} */

static
unsigned long int fact(unsigned int n, unsigned int *zeroes)
{
unsigned long val;
unsigned int i, j, k;

#define OFLOW ((ULONG_MAX / j) < val)

/* This is a crude mechanism for passing back values */
for (i = 0; i < sizeof primes; i++) primect[i] = 0;

for (i = 1, val = 1UL, *zeroes = 0; i <= n; i++) {
j = i;
/* extract exponent of 10 */
while ((0 == (j % 5)) && (!(val & 1))) {
j /= 5; val /= 2;
(*zeroes)++;
}
/* Now try to avoid any overflows */
k = 0;
while (primes[k] && OFLOW) {
/* remove factors primes[k] */
while (0 == (val % primes[k]) && OFLOW) {
val /= primes[k];
++primect[k];
}
while (0 == (j % primes[k]) && OFLOW) {
j /= primes[k];
++primect[k];
}
k++;
}

/* Did we succeed in the avoidance */
if (OFLOW) {
#if DEBUG
fprintf(stderr, "Overflow at %u, %lue%u * %u\n",
i, val, *zeroes, j);
#endif
val = 0;
break;
}
val *= j;
}
return val;
} /* fact */

/* ---------------- */

int main(int argc, char *argv[])
{
unsigned int x, zeroes;
unsigned long f;

if ((2 == argc) && (1 == sscanf(argv[1], "%u", &x))) {
if (!(f = fact(x, &zeroes))) {
fputs("Overflow\n", stderr);
return EXIT_FAILURE;
}

printf("Factorial(%u) == %lu", x, f);
if (zeroes) printf("e%u", zeroes);
for (x = 0; primes[x]; x++) {
if (primect[x]) {
printf(" * pow(%d,%d)", primes[x], primect[x]);
}
}
putchar('\n');
return 0;
}
fputs("Usage: fact n\n", stderr);
return EXIT_FAILURE;
} /* main */

--
Chuck F (cbfal...@yahoo.com) (cbfal...@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Bill Godfrey

unread,

Nov 8, 2003, 5:29:04 PM11/8/03

to

bin...@eton.powernet.co.uk wrote:
> http://www.rjgh.co.uk/prg/c/misc/f10000.c can do this quite quickly. Just

http://195.66.240.211/cgi-bin/whois.cgi?query=rjgh.co.uk

Woohoo! All the information a stalker needs.

--
The address in the reply to header is correct, but I'll
read it quicker if you drop the word "usenet".

Calum

unread,

Nov 11, 2003, 5:54:17 AM11/11/03

to

This got me thinking. How easy is it to produce the prime factors of n! ?

As I discovered, this is surprisingly fast. The algorithm I wrote is
based upon a prime number sieve. 10000! contains only prime factors
less than or equal to 10000. It turns out that

10000! = 2^9995 * 3^4996 * 5^2499 * 7^1665 * 11^998 * 13^832 * 17^624 *
19^554 * ... * 9973

Looking at the power of 5 indicates this number will have 2499 trailing
zeros (base 10). Calculating x^y takes O(lg y) multiplications (by
repeated squaring), instead of the original O(y). Doing this reduces
the problem from 9998 multiplications to 3338.

But then I discovered an algorithm that reduces the number of
multiplications to 1522. It relies on the fact that the prime powers
are monotonically decreasing. I can't explain it very well, but some
code is at http://visula.org/calum/fact2.cpp

I'd never really considered that factorial could be optimized much, or
indeed that anyone would want to...

Calum Grant

Olathe

unread,

Nov 13, 2003, 12:32:01 PM11/13/03

to

Adam Russell wrote:

For pure speed, a 10000-line text-file lookup would be quite simple in
most programming languages. If they want 56!, you simply return line
56. Exact, already in string form, and fairly quick (even quicker for
multiple queries if you load the file into a string array).

I don't see any reason not to use this method. After all, some
processors use lookup tables to speed math up.

Not sure if Labview can read text files, but if not, there are probably
similar methods you can use.

osmium

unread,

Nov 13, 2003, 2:52:36 PM11/13/03

to

Olathe writes:

I would guess that there are a lot of digits in 10000!. Or even in 9999!
Might that be a problem?

Howard Ding <hading@hading.dnsalias.com>

unread,

Nov 13, 2003, 1:02:06 PM11/13/03

to

"osmium" <r124c...@comcast.net> writes:

> Olathe writes:
>
> I would guess that there are a lot of digits in 10000!. Or even in 9999!
> Might that be a problem?
>

Perhaps 35660 of them (in decimal).

--
Howard Ding
<had...@hading.dnsalias.com>

Anthony Ventimiglia

unread,

Nov 14, 2003, 9:12:45 AM11/14/03

to

Olathe <Ola...@DALnet.irc> writes:

> Adam Russell wrote:
>
> For pure speed, a 10000-line text-file lookup would be quite simple in
> most programming languages. If they want 56!, you simply return line
> 56. Exact, already in string form, and fairly quick (even quicker for
> multiple queries if you load the file into a string array).
>
> I don't see any reason not to use this method. After all, some
> processors use lookup tables to speed math up.
>
> Not sure if Labview can read text files, but if not, there are
> probably similar methods you can use.

That get's to be a pretty big text file, And the lines are pretty
long. Unless you had a fixed record length, which should make it
around 340M. But then when someone wants 10001! you're screwed.

personally I think this time is fine for me:

* (time (factorial 10000))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:
; Evaluation took:
; 0.83 seconds of real time
; 0.42 seconds of user run time
; 0.11 seconds of system run time
; 2,108,749,456 CPU cycles
; [Run times include 0.14 seconds GC run time]
; 238 page faults and
; 71,288,016 bytes consed.

284625968091705451890641321211986889014805140170279923079417999427441134000376444377299078675778477581588406214231752883004233994015351873905242116138271617481982419982759241828925978789812425312059465996259867065601615720360323979263287367170557419759620994797203461536981198970926112775004841988454104755446424421365733030767036288258035489674611170973695786036701910715127305872810411586405612811653853259684258259955846881464304255898366493170592517172042765974074461334000541940524623034368691540594040662278282483715120383221786446271838229238996389928272218797024593876938030946273322925705554596900278752822425443480211275590191694254290289169072190970836905398737474524833728995218023632827412170402680867692104515558405671725553720158521328290342799898184493136106403814893044996215999993596708929801903369984844046654192362584249471631789611920412331082686510713545168455409360330096072103469443779823494307806260694223026818852275920570292308431261884976065607425862794488271559568315334405344254466484168945804257094616736131876052349822863264529215294234798706033442907371586884991789325806914831688542519560061723726363239744207869246429560123062887201226529529640915083013366309827338....

--
(incf *yankees-world-series-losses*)

Olathe

unread,

Nov 15, 2003, 12:45:06 AM11/15/03

to

Anthony Ventimiglia wrote:

> That get's to be a pretty big text file, And the lines are pretty
> long. Unless you had a fixed record length, which should make it
> around 340M. But then when someone wants 10001! you're screwed.

I agree, but the priority wasn't that the method worked for inputs above
10000 (in fact, it specifically said that the highest input would be
10000). The priority was pure efficiency.

Fixed record lengths would be much faster than my solution, though.

Olathe

unread,

Nov 15, 2003, 11:33:51 PM11/15/03

to

Olathe wrote:

Now that I think of it, if you don't have that much space or something
like that, you could slim it down with one of two methods:

1) Only solve every ten (or 100 or whatever) numbers (Solve 10!, 20!,
30!, etc.). That way, you only have to multiply at the most ten numbers
together (for 199, you'd retrieve 190!, then multiply it by 191 through
199). This would approximately reduce the file to one-tenth its size.

2) Have groups. For instance, 1 * 2 * 3 * ... * 10 is the first number
stored, 11 * 12 * 13 * ... * 20 is the second number stored, etc. This
would cut down on the length of the numbers that are stored (making the
fixed record length much smaller; the largest number you'd have to store
is 9991 * ... * 10000). You'd still have to multiply at the most ten
numbers together and do up to 999 additions, though.

You could combine these methods to make it faster. For instance, use
method 1 to store the result of every five-hundredth number (500!,
1000!, etc.) and use method 2 as stated. This would reduce the number
of additions necessary without increasing the storage size too much.