f(x) = t^(x+1/2) e^(-t) sqrt(2 pi)
where:
t = x + c + d/(x+1)
and the maximum |rel. error| for conveniently rounded versions of
(c,d) is:
<1.045E-5 for (c,d) = (0.788208,0.031083)
<1.060E-5 for (c,d) = (0.78821,0.03108)
<1.395E-5 for (c,d) = (0.7882,0.0311)
A little extra accuracy can be gained by adjusting the value 1 in the
denominator, but it is hardly worth the effort if we wish to keep the
formula simple.
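As a quick numerical check, the formula above can be sketched in Python as follows (function and variable names are my own, not from the post); it evaluates f(x) for a given (c,d) pair and scans a grid of non-negative arguments against math.gamma:

```python
import math

def thomsen_factorial(x, c=0.788208, d=0.031083):
    """Approximate x! for x >= 0 via t^(x+1/2) e^(-t) sqrt(2 pi),
    with t = x + c + d/(x+1)."""
    t = x + c + d / (x + 1)
    return t ** (x + 0.5) * math.exp(-t) * math.sqrt(2 * math.pi)

def worst_rel_error(c, d, hi=20.0, steps=2000):
    """Largest |relative error| against math.gamma on a grid over [0, hi]."""
    return max(
        abs(thomsen_factorial(k * hi / steps, c, d)
            / math.gamma(k * hi / steps + 1) - 1)
        for k in range(steps + 1)
    )
```

On such a grid the rounded (c,d) pairs above reproduce error levels on the order of 1E-5, consistent with the stated bounds.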
Derivation:
The Stirling and Burnside approximations (regarding the latter see
contributions to this forum by David W. Cantrell) can be written on the
common form:
f(x) = (x+a)^(x+1/2) e^(-x-a) sqrt(2 pi)
where a=0 for Stirling's and a=1/2 for Burnside's formula.
The optimal value of a depends on x approximately as follows:
a = c + d/(x+1)
where (c,d) are constants.
Regards,
Knud Thomsen
(Aarhus, Denmark)
An alternative with simple rational constants:
(c,d) = (67/85, 5/161)
giving |rel. error| < 1.265E-5.
Regards,
Knud Thomsen
Don't let lack of response to your "Statistical approximations, calculator
style (n)" threads make you think that nobody is reading them. I certainly
have been (despite the unfortunate fact that I know only very little about
statistics).
sa...@kt-algorithms.com wrote:
> A simple approximation to x!, valid for x>=0
> --------------------------------------------
> While there is no lack of simple approximations to the factorial, they
> generally do not perform well for small positive or zero arguments.
True. Not long ago, I found a simple approximation for the gamma function
which gives reasonable accuracy for arguments near zero. Spurred by your
post, I hope, in the next few days, to present that approximation and
several others, perhaps in a series of posts.
Below, I comment on the type of approximation you have considered.
> The following approximation to x! is valid for x>=0 (x not necessarily
> an integer):
>
> f(x) = t^(x+1/2) e^(-t) sqrt(2 pi)
>
> where:
>
> t = x + c + d/(x+1)
>
> and the maximum |rel. error| for conveniently rounded versions of
> (c,d) is:
>
> <1.045E-5 for (c,d) = (0.788208,0.031083)
> <1.060E-5 for (c,d) = (0.78821,0.03108)
> <1.395E-5 for (c,d) = (0.7882,0.0311)
>
> A little extra accuracy can be gained by adjusting the value 1 in the
> denominator, but it is hardly worth the effort if we wish to keep the
> formula simple.
>
> Derivation:
> The Stirling and Burnside approximations (regarding the latter see
> contributions to this forum by David W. Cantrell) can be written on the
> common form:
>
> f(x) = (x+a)^(x+1/2) e^(-x-a) sqrt(2 pi)
>
> where a=0 for Stirling's and a=1/2 for Burnside's formula.
> The optimal value of a depends on x approximately as follows:
>
> a = c + d/(x+1)
>
> where (c,d) are constants.
Your idea of considering a form which generalizes the Stirling and Burnside
approximations is excellent.
Since a is constant in both of those approximations, it might be fruitful
to consider a = c first, before considering a = c + d/(x+1).
Your objective was to minimize worst |rel. error| for x >= 0. That's fine.
However, to my taste, it would be more desirable to have an approximation
which is as good as possible for large arguments, while still providing
modest accuracy for arguments near zero.
Considering just a = c for the moment, if the approximation is to be as
good as possible for large arguments, then it can be shown that
c = (1 + 1/Sqrt(3))/2 = 0.788675...
[Compare that with your value of 0.788208 above.]
Using a = c = (1 + 1/Sqrt(3))/2, the relative error is approximately
1/(72 Sqrt(3) x^2) for large arguments. Your approximation's relative error
is approximately 0.000135/x for large arguments. Thus, although your
approximation is far better for arguments near zero, the approximation with
a = c = (1 + 1/Sqrt(3))/2 will be better for sufficiently large arguments.
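To illustrate, here is a small Python check of that asymptotic claim (a sketch of mine; the names are not from the post). It works in log space via math.lgamma so that large arguments do not overflow:

```python
import math

C_OPT = (1 + 1 / math.sqrt(3)) / 2  # asymptotically optimal constant shift

def rel_error_shifted(x, a):
    """Relative error of sqrt(2 pi) (x+a)^(x+1/2) e^(-x-a) versus x!,
    computed in log space to avoid overflow for large x."""
    log_f = (x + 0.5) * math.log(x + a) - (x + a) + 0.5 * math.log(2 * math.pi)
    return math.expm1(log_f - math.lgamma(x + 1))

# At x = 200 the observed error should be close to the predicted
# asymptotic value 1/(72 sqrt(3) x^2).
x = 200.0
predicted = 1 / (72 * math.sqrt(3) * x ** 2)
observed = rel_error_shifted(x, C_OPT)
```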
Now consider a = c + d/(x+1). If the approximation is to be as good as
possible for large arguments, we find that c = (1 + 1/Sqrt(3))/2 again and
that
d = 1/36 = 0.027777...
[Compare that with your value of 0.031083 above.]
Using a = (1 + 1/Sqrt(3))/2 + 1/(36(x+1)), the relative error is roughly
0.002/x^3 for large arguments. [The constant in that numerator is precisely
(30 Sqrt(3) - 29)/12960, by the way.] Your original approximation is better
for positive arguments which are small (less than roughly 6).
For any simple approximation of x!, as x increases, absolute error will
surely increase without bound. Trying to slow that inevitable increase as
much as possible leads me to consider approximations which are as good as
possible for large arguments, while at the same time hoping that the
resulting approximation will have only moderate error when the argument is
near zero. What we choose to do is a matter of taste and utility. There's
certainly nothing wrong with minimizing worst |rel. error| for x >= 0. It's
just not what I chose to do here. Also, I wanted an approximation which
would be reasonable for use more generally than x >= 0:
Using f(x) = Sqrt(2 pi) u^(x+1/2) e^-u
with u = x + (1 + 1/Sqrt(3))/2 + 1/(36(x+1)) (1)
for all real x, x! is approximately
( f(x) for x >= -1/2
( (2)
( pi csc(pi (x + 1)) / f(-(x + 1)) for x < -1/2
with worst |rel. error| of approx 0.0026, occurring near x = -1/2. Relative
error is roughly 0.002/x^3 when |x| is large.
Knud: With little trouble, you could recalculate c and d so as to minimize
worst |rel. error| for x >= -1/2, rather than x >= 0. Go ahead and do so,
if you wish. Then, using (2), you would have an approximation for x!,
slightly different from my suggestion above, which would minimize worst
|rel. error| for all real x. But here's a different alternative:
The above approximation, using (1) and (2), has a small jump discontinuity
at x = -1/2. If we wish to have an approximation which is well-behaved
there, we may alter d, the coefficient of 1/(x+1), from 1/36 to
d = (log(2) - 1/Sqrt(3))/4 = 0.0289...
so that we then have
u = x + (1 + 1/Sqrt(3))/2 + (log(2) - 1/Sqrt(3))/(4(x+1)) (3)
Using this, (2) is differentiable at x = -1/2. Its worst |rel. error| is
approx 0.0011, occurring at x = -1/2 +/- 0.17 approx. Relative error is
roughly -0.004/x^2 sign(x) when |x| is large. [The constant in that
numerator is precisely 11/288 + (1 - 9 log(2))/(72 Sqrt(3)).]
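The smooth variant is the same sketch with only d changed; the check below (again mine, not from the post) confirms that with (3) the two branches of (2) meet at x = -1/2:

```python
import math

C = (1 + 1 / math.sqrt(3)) / 2
D = (math.log(2) - 1 / math.sqrt(3)) / 4  # ~0.0289, chosen for smoothness at -1/2

def f(x):
    # Definition (3): u = x + C + D/(x+1)
    u = x + C + D / (x + 1)
    return math.sqrt(2 * math.pi) * u ** (x + 0.5) * math.exp(-u)

def approx_factorial(x):
    if x >= -0.5:
        return f(x)
    return math.pi / (math.sin(math.pi * (x + 1)) * f(-(x + 1)))

# With this D, f(-1/2)^2 = pi, so both branches give sqrt(pi) at x = -1/2
# and the jump discontinuity of the d = 1/36 version disappears.
gap = abs(approx_factorial(-0.5 + 1e-7) - approx_factorial(-0.5 - 1e-7))
```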
Furthermore, we may use (2) with either (1) or (3) to approximate z!
throughout the complex plane. The worst |rel. errors| are roughly 0.005 and
0.004, using (1) and (3) resp., occurring when z = -1/2 +/- 0.3 i approx.
By the way, I do not know for sure if any of the approximations mentioned
above are new.
Best regards,
David W. Cantrell
Hello David,
Your detailed response is much appreciated; it more than compensates
for my getting no response for so long!
The reason for my specific design goal, i.e. minimizing the maximum
|rel. error| for x >= 0, is entirely pragmatic: every so often a user
may have to process small as well as large non-negative arguments in
a uniform fashion, so that larger arguments can be given no special
priority.
However, I think your approximations, with better asymptotic behavior
and analytically derived best constants, are very interesting,
intrinsically as well as practically. I wonder if, for instance, your
Burnside-based binomial approximation (sci.math, May 2004) could be
improved using these recent ideas of yours?
By the way, I look forward to seeing your approximations with good
near-zero performance.
Best regards,
Knud Thomsen
That's certainly an understandable design goal, and I believe it was
essentially what Lanczos had in mind too.
In case you were not familiar with his work, I decided to do a Web search,
so that I could supply you with some easy references. Googling for
Lanczos gamma
I didn't expect to find anything which was really new to me. But to my
surprise and delight, I found a recent UBC doctoral thesis: AN ANALYSIS OF
THE LANCZOS GAMMA APPROXIMATION by GLENDON RALPH PUGH
<http://laplace.physics.ubc.ca/People/matt/Doc/ThesesOthers/Phd/pugh.pdf>
Two small comments about section 10.3 of Pugh's thesis appear near the end
of this post.
BTW, since you used "calculator style" in the title of this thread, you
might also be interested in <http://www.rskey.org/gamma.htm>. In
particular, you might look at the section "A Curious Result", which gives
Windschitl's very nice (and, I suspect, original) approximation.
> However, I think your approximations, with better asymptotic behavior
> and analytically derived best constants, are very interesting,
> intrinsically as well as practically. I wonder if, for instance, your
> Burnside-based binomial approximation (sci.math, May 2004) could be
> improved using these recent ideas of yours?
That's not the way I'd suggest improving approximations for the binomial
coefficients. If I really wanted an improved approximation, I'd just
approximate the Gamma functions involved better, using more of the
continued fraction in the convergent expansion which I presented in
<http://mathforum.org/kb/message.jspa?messageID=1617235>.
> By the way, I look forward to seeing your approximations with good
> near-zero performance.
Hmm. I'm not sure that you'd say they have _good_ performance when |z| is
small. In any event, my spare time is very limited right now; don't expect
anything else from me for a while.
Never forget that, if performance is good enough for larger arguments, you
can easily "translate" that good performance down to smaller arguments
using
z! = (z + n)!/((z + 1)(z + 2)(z + 3)...(z + n)).
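That shift-then-divide trick can be sketched in Python as follows (my own illustration, using plain Stirling as the large-argument approximation):

```python
import math

def stirling(x):
    """Plain Stirling approximation to x!; relative error ~ -1/(12x)."""
    return math.sqrt(2 * math.pi) * x ** (x + 0.5) * math.exp(-x)

def shifted_factorial(z, n=20):
    """z! = (z+n)! / ((z+1)(z+2)...(z+n)): evaluate the approximation
    at z+n, where it is accurate, and divide the exact product back out."""
    denom = 1.0
    for k in range(1, n + 1):
        denom *= z + k
    return stirling(z + n) / denom

# At z = 1/2, plain Stirling is off by several percent, while the
# shifted version recovers (1/2)! = sqrt(pi)/2 to within about 0.5%.
```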
Now let's look at something from my previous response:
Supposing that we wish to approximate x! by
Sqrt(2 pi) (x + c)^(x+1/2) e^-(x + c) for some constant c, I had said that
"if the approximation is to be as good as possible for large arguments,
then it can be shown that c = (1 + 1/Sqrt(3))/2 = 0.788675..." But <blush>
that's not quite true. For large arguments, c could just as well be
(1 - 1/Sqrt(3))/2 = 0.211324..., in that, using either value of c, the
relative error is approximately 1/(72 Sqrt(3) x^2). (However, for small
arguments, the approximation using the smaller value of c is substantially
worse than the approximation using the larger value of c.)
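Numerically (again a sketch of mine), the two roots indeed give errors of the same magnitude for large x, though with opposite signs, while the smaller root is markedly worse near zero:

```python
import math

def rel_error_shifted(x, c):
    """Relative error of sqrt(2 pi) (x+c)^(x+1/2) e^(-x-c) versus x!."""
    log_f = (x + 0.5) * math.log(x + c) - (x + c) + 0.5 * math.log(2 * math.pi)
    return math.expm1(log_f - math.lgamma(x + 1))

c_big = (1 + 1 / math.sqrt(3)) / 2    # 0.788675...
c_small = (1 - 1 / math.sqrt(3)) / 2  # 0.211324...

# For large x both errors shrink like 1/(72 sqrt(3) x^2) in magnitude.
big_far = rel_error_shifted(100.0, c_big)
small_far = rel_error_shifted(100.0, c_small)
```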
In Pugh's thesis, I have so far, unfortunately, had a chance to look at
only section 10.3: Variations on Stirling's Formula (pp. 134-8).
Comment 1: He says that "Spouge in [27] derives the more visually pleasing
approximation" and then states, as (10.3), the approximation normally
credited to Burnside (1917), but never mentions Burnside. I wonder if Pugh
had some reason for not crediting Burnside.
Comment 2: Looking at Figure 10.1, at the end of that section, the
ordinates at the far right of the graphs are precisely (1 +/- 1/Sqrt(3))/2,
that is, the values of c mentioned in my previous paragraph. Pugh only
notes that one of them is "about 0.79".
By the way, Knud, notice that the upper graph in that figure appears to be
almost straight. (Also note the "scaled real line"!) And recall that you
had considered an approximation in the form
Sqrt(2 pi) (x + a)^(x+1/2) e^-(x + a) where a = c + d/(x + 1) for some
constants c and d. Well, if that form had given x! precisely, then the
upper graph in Figure 10.1 would have been perfectly straight (due to the
scaling involved). Neat, eh?
Aargh! There's so much more I want to write about, but I must go now.
Best regards to you, Knud, and to Glen Pugh (to whom I'm sending a copy of
this message),
David W. Cantrell
Thanks for your references: the Lanczos approximation (which I have
been admiring for a long time and use routinely), the recurrence
relation, and the 'rskey' site.
Pugh's thesis may help give Lanczos the recognition he deserves; in
Lanczos' time, applied math was regarded as something quite distinct
from (and inferior to) pure math.
Lanczos is a source of much original insight; I still find my copy of
Lanczos' 'Applied Analysis' (1956) a useful complement to the modern
literature on that subject.
Re: Fig. 10.1 in Glen Pugh's thesis: interesting that the Lambert W
function (both of its real branches, no less) pops up in this context,
too!
Regards,
Knud Thomsen