maple is slow, pari gp even slower (on 1 topic)

plouffe

unread,

Jan 17, 2010, 5:12:46 PM1/17/10

to

Hello,

I made this simple test, computing exp(1) or E to 1 million digits
with
Maple 13, Mathematica 7.0 and Pari Gp 2.3.4 on a 2x4 Intel E5405
running
at 2.0 Ghz with 4 gigs of RAM. All installations are 64 bits on a
Ubuntu 9.10.

Maple 13 : ------------------------- 165 seconds
Mathematica 7: ------------------ 1.11 seconds
Pari GP 2.3.4 : -------------------- 936 seconds

What is this!??

This is bad. This means that Mathematica is ~150 times faster than
Maple 13 and ~900 times faster than Pari-GP ??

As far as I know, exp(1) can only be evaluated with the standard
series,
this means that there is a big problem with the handling of large
numbers
with some programs. I did not try with Pifast, probably takes a very
few
seconds with that one.

I wish that someday, some people will take the time to renew
some libraries and/or get a connection to internet to get the
latest news since the invention of the ENIAC...

Simon Plouffe

TimDaly

unread,

Jan 17, 2010, 6:48:44 PM1/17/10

to

I ran a test with Axiom and it takes about 45 seconds on
a slower, single processor box.

Are you the author of the code for Maple, MMA, or Pari?
If this is a simple test and you are aware of ways to
improve the performance, did you post improved code?

It is trivial to complain that there COULD be better code.
It is somewhat more of a challenge to write that code.
So write the improved code and benchmark your version.
I would be curious to see how much better you can do.

There is no such thing as a simple job.

Tim

clicl...@freenet.de

unread,

Jan 17, 2010, 7:52:56 PM1/17/10

to

plouffe schrieb:

The Maple and Pari GP timings are roughly compatible with the
performance of Derive 6.10, whose pseudocode-Lisp interpreter uses fast
machine-language implementations for multi-precision arithmetic, with
n-digit by n-digit multiplication taking n^2 operations (Mathematica and
Maple probably use fast bit-string convolution techniques taking
n*log(n) operations). On Derive, the effort of computing E = exp(1) to n
digits grows roughly as n^2; since roughly n terms of the standard
series for E are needed for n digits, no n^2 operations (such as n-digit
by n-digit multiplications) seem to be involved therefore (Derive must
somehow exploit the fact that each term in the series is related to the
prededing term by a small integer factor). I guess all this means that
Mathematica doesn't use the standard series for the calculation of E =
exp(1).

Martin.

A N Niel

unread,

Jan 17, 2010, 7:56:06 PM1/17/10

to

In article
<31423165-2a11-456b...@e16g2000yqc.googlegroups.com>,
plouffe <simon....@gmail.com> wrote:

> As far as I know, exp(1) can only be evaluated with the standard
> series,

Another would be a pre-computed value.

Waldek Hebisch

unread,

Jan 17, 2010, 8:57:31 PM1/17/10

to

In sci.math.symbolic plouffe <simon....@gmail.com> wrote:
>
> Hello,
>
> I made this simple test, computing exp(1) or E to 1 million digits
> with
> Maple 13, Mathematica 7.0 and Pari Gp 2.3.4 on a 2x4 Intel E5405
> running
> at 2.0 Ghz with 4 gigs of RAM. All installations are 64 bits on a
> Ubuntu 9.10.
>
> Maple 13 : ------------------------- 165 seconds
> Mathematica 7: ------------------ 1.11 seconds
> Pari GP 2.3.4 : -------------------- 936 seconds
>
> What is this!??
>
> This is bad. This means that Mathematica is ~150 times faster than
> Maple 13 and ~900 times faster than Pari-GP ??
>
> As far as I know, exp(1) can only be evaluated with the standard
> series,

What you mean by "the standard series"? AFAIK nobody directly uses
Taylor expansion of exp to compute it, its too slow. Additionally,
exp(1) may be computed much faster than general exp(x). For
example, FriCAS uses the follwing routine:

expInverse k ==
-- computes exp(1/k) via continued fraction
p0:I := 2*k+1; p1:I := 6*k*p0+1
q0:I := 2*k-1; q1:I := 6*k*q0+1
for i in 10*k.. by 4*k while 2 * LENGTH p0 < bits() repeat
(p0,p1) := (p1,i*p1+p0)
(q0,q1) := (q1,i*q1+q0)
dvide([p1,0],[q1,0])

this routine is subotimal for large precisions, the follwing is
a bit better (but more complicated):

mat0(m : I, n : I) : Matrix(Integer) ==
m11 : I := 1; m12 : I := 0;
m21 : I := 0; m22 : I := 1;
i : I := n
while i < m repeat
(m11, m12, m21, m22) := (m21, m22, m11 + i*m21, m12 + i*m22)
i := i + 4
matrix [[m11, m12], [m21, m22]]

mat1(m : I, n : I) : Matrix(Integer) ==
(m - n) <= 200 => mat0(m, n)
k := (m - n) quo 8
mat1(m, n + 4*k)*mat1(n+4*k, n)

expInv1() ==
p1 : I := 2*1 + 1; p0 : I := 1
q1 : I := 2 - 1; q0 : I := 1
k0 : I := 6
ki : I := 200
l := bits()
mm0 := matrix([[p0, q0], [p1, q1]])
while 2*length(mm0(1,1)) < l repeat
mm0 := mat0(k0+ki, k0)*mm0
k0 := k0 + ki
ki := 2*ki
mm0(2,1)::Float/mm0(2,2)::Float

Note: expInv1 should be equivalent to expInverse with apropriate
accuracy. I did not try to optimize treshholds in expInv1 and
mat1 (they are just educated guesses).

On 2.4 GHz Core 2 Quatro (but computation uses only one core)
FriCAS needs:
expInverse expInv1

ECL 133.80 26.78
GCL 42.19 17.29
sbcl 40.65 26.78

Note: FriCAS is build on top of Lisp and uses arithmetic routines
from Lisp, so times depend on Lisp implementation. It may of
some interest to ECL and GCL use the same library (GMP) to perfrom
arithmetic, just differ in parameter passing and memory
management.

> this means that there is a big problem with the handling of large
> numbers
> with some programs.

Comparing times using ECL, GCL and sbcl shows that there are
differences in handling large numbers, but they are more
complicated than saying "program P1 is X times faster than
program P2".

And formula used to compute exp(1) matters at least as much
(and probably more) as speed of large number arithmetic.

--
Waldek Hebisch
heb...@math.uni.wroc.pl

clicl...@freenet.de

unread,

Jan 17, 2010, 10:57:52 PM1/17/10

to

Waldek Hebisch schrieb:

>
> In sci.math.symbolic plouffe <simon....@gmail.com> wrote:
> >
> > As far as I know, exp(1) can only be evaluated with the standard
> > series,
>
> What you mean by "the standard series"? AFAIK nobody directly uses
> Taylor expansion of exp to compute it, its too slow. Additionally,
> exp(1) may be computed much faster than general exp(x).

In fact, the "Notes On Internal Implementation" for Mathematica say:

* E is computed from its series expansion.
* Exponential and trigonometric functions use Taylor series, stable
recursion by argument doubling, and functional relations.

Continued fractions are not mentioned. For example, the Taylor series
for exp(1) may be accelerated by first calculating exp(2^(-m)) and then
squaring the result m times (using fast, i.e. n*log(n), multiplication).
Here, the integer parameter m should be optimized for minimum overall
computational effort.

Martin.

Martin Rubey

unread,

Jan 18, 2010, 1:58:48 AM1/18/10

to

How would you do this

> I made this simple test, computing exp(1) or E to 1 million digits

with a pre-computed value?

Martin

A N Niel

unread,

Jan 18, 2010, 6:47:21 AM1/18/10

to

In article <d9ockrv...@ada0.ifam.uni-hannover.de>, Martin Rubey
<axio...@yahoo.de> wrote:

> A N Niel <ann...@nym.alias.net.invalid> writes:
>
> > In article
> > <31423165-2a11-456b...@e16g2000yqc.googlegroups.com>,
> > plouffe <simon....@gmail.com> wrote:
> >
> >> As far as I know, exp(1) can only be evaluated with the standard
> >> series,
> >
> > Another would be a pre-computed value.
>
> How would you do this

Some CIS I recall does pi this way: There is a pre-computed value
stored, which it uses if the requested number of digits is less than
the number stored, otherwise it computes the larger number of digits.
Then it remembers the larger number of digits for the future.

>
> > I made this simple test, computing exp(1) or E to 1 million digits
>
> with a pre-computed value?

I have no idea how Mathematica does it. Since (on my machine)
Mathematica occupies 1.4 gigabytes, surely it has room for a few
one-megabyte constants in there... Or maybe if it uses the "saving"
method mentioned, the fast time just means that Simon computed
1 million digits or more already at some time in the past.

Raymond Toy

unread,

Jan 18, 2010, 10:50:04 AM1/18/10

to

On 1/18/10 6:47 AM, A N Niel wrote:
> In article <d9ockrv...@ada0.ifam.uni-hannover.de>, Martin Rubey
> <axio...@yahoo.de> wrote:
>
>> A N Niel <ann...@nym.alias.net.invalid> writes:
>>
>>> In article
>>> <31423165-2a11-456b...@e16g2000yqc.googlegroups.com>,
>>> plouffe <simon....@gmail.com> wrote:
>>>
>>>> As far as I know, exp(1) can only be evaluated with the standard
>>>> series,
>>>
>>> Another would be a pre-computed value.
>>
>> How would you do this
>
> Some CIS I recall does pi this way: There is a pre-computed value
> stored, which it uses if the requested number of digits is less than
> the number stored, otherwise it computes the larger number of digits.
> Then it remembers the larger number of digits for the future.

Maxima does this. Or used to do this. There's a rounding problem if
you want pi to fewer digits than the stored number. Recently it was
changed to just recompute pi if the number of digits is less than the
stored value.

Maxima uses the same idea for e as well.

Ray

plouffe

unread,

Jan 18, 2010, 3:16:58 PM1/18/10

to

Hello,

1) I do no use precomputed values.
2) By using an old trick which is exp(1/2^m) and then squaring m
times
the time taken in maple can be reduced to about 2 minutes.
This is explained in the classic book of Henrici on numerical
analysis.
3) It is true that for Pi, Maple used to have the first 10000 digits
stored in the program, I do not know about E. 10000 digits is
small
, this takes a fraction of a second on a modern machine.

The problem I think is that the bignum package does not use Karatsuba
algorithm to multiply big numbers.

4) Even with a value of m = 4096 (see 2) the time taken is way too
long,
1/2^4096 is 10^(-1234) approx. It should go very fast, it does
not.
There is still the problem of squaring 4096 times which can be
speeded
up. My estimation is that : by using Maple own routines, some
refinements,
we get what : 1 min of processing time when Mathematica uses 1.11
seconds.

I remember that people at Mathematica as early as 1997 were already
making
plans to implement fast routines for the computation of the
inverses, etc.
with sophisticated algorithms.

I was surprised that even in 2010, Maple 13 is way behind for the
implementation
of fast routines on the inverse and multiplication of big numbers.
in my opinion
this is the source of the problem.

Simon Plouffe

Richard Fateman

unread,

Jan 18, 2010, 4:55:22 PM1/18/10

to

The simplest way for a computer algebra
system to do a decent job at low cost is to use someone else's package,
some of which are benchmarked by D.J. Bernstein,

http://cr.yp.to/arith.html

The developers of a computer algebra system might independently
implement a top-notch huge-number multiplication (etc) system tuned to
each of its platforms, but this would be a kind of surprising
expenditure of effort to speed up certain kinds of computation for what
is probably a very small percentage of the audience.

I think that acceptable behavior for operations on numbers of (say) 5 to
100 decimal digits is probably far more important. Especially at the low
end, since those numbers are used very often for mundane operations like
array indexing, counting, etc.

RJF

Dann Corbit

unread,

Jan 18, 2010, 4:57:54 PM1/18/10

to

In article <d9ockrv...@ada0.ifam.uni-hannover.de>, axio...@yahoo.de
says...

Store a billion digits in a memory mapped database {for common, popular
constants} and return the subset that they request. If they request
more than a billion digits, then use the billion digit value as a
starting point for iterated improvement using a suitable, rapidly
converging algorithm.

It would be interesting to see the timing for MPIR:
http://www.mpir.org/
which is very active in development right now.

Anyway, for calculating the values I guess that good advice might be
found here:
http://numbers.computation.free.fr/Constants/constants.html

Dann Corbit

unread,

Jan 18, 2010, 5:23:43 PM1/18/10

to

In article <4B54D8CA...@cs.berkeley.edu>, fat...@cs.berkeley.edu
says...

plouffe

unread,

Jan 18, 2010, 5:47:13 PM1/18/10

to

Hello mr Fateman,

I agree with you that these considerations
are important to very few people. Maple is the
product of hundreds of years-person in development.
I know personaly many of them as well as
people from wolfram.

But also, what was true before the year 2000 is
less true today since the pc's are much faster.

Also, the team of developpers at wolfram are professionnal
mathematicians and they are paid, many developers
in the free-world of open-source area are volunteers,
this is part of the case with Maple. This makes a
big difference.

Despite this, Pari-Gp is faster in many computations
and compares favorably to wolfram, at least we
have the choice to use one or the other (this is my case).

Also, condider this, not long time ago, I began the
computation of Bernoulli numbers and realized that the
Maple implementation was kind of very slow. Then I began
to develop a program with Greg Fee to compute them
using a very simple formula for the asymptotics and
it worked very fine, I could compute B(10000) in a few
seconds when maple was taking forever to compute B(1000).

Then I began to compute larger value, I pushed the program
written in Maple to B(750000), it was the record for a short
time. Then it catched the attention of many people on that
problem. Today the record computation is something like
B(100000000) which uses a very specialized algorithm.
Of course my record computation looked like mickey mouse
compared to these high performance records. But still,
it pushed the thing a little bit since now on Pari-GP
and Mathematica it is fast,
I do not know what they are using but
the computation of Bernoulli numbers is really fast
compared to the old recurrence formula (which was a waste
of cpu time). So, by pushing the computation to that
limit helped to improve the general behavior of those
programs, it forced them to re-consider that particular
computation.

Sometimes, you have to shake the cage a little.

I submitted a 'compte-rendu' of this story and the
algorithm to a journal, essentialy I have been told
that despite the originality of my results , this
is 'dépassé', well, what a surprise...
In the mean time, maple is still slow on that topic
compared to mathematica and pari-gp.

I am not that certain that these computations to
millions of digits are not important. In my opinion
it is.

Best regards,

Simon Plouffe

Hello mr Fateman,

I agree with you that these considerations
are important to very few people. Maple is the
product of hundreds of years-person in development.
I know personaly many of them as well as
people from wolfram.

But also, what was true before the year 2000 is
less true today since the pc's are much faster.

Also, the team of developpers at wolfram are professionnal
mathematicians and they are paid, many developers
in the free-world of open-source area are volunteers,
this is part of the case with Maple. This makes a
big difference.

Despite this, Pari-Gp is faster in many computations
and compares favorably to wolfram, at least we
have the choice to use one or the other (this is my case).

Also, condider this, not long time ago, I began the
computation of Bernoulli numbers and realized that the
Maple implementation was kind of very slow. Then I began
to develop a program with Greg Fee to compute them
using a very simple formula for the asymptotics and
it worked very fine, I could compute B(10000) in a few
seconds when maple was taking forever to compute B(1000).

Then I began to compute larger value, I pushed the program
written in Maple to B(750000), it was the record for a short
time. Then it catched the attention of many people on that
problem. Today the record computation is something like
B(100000000) which uses a very specialized algorithm.
Of course my record computation looked like mickey mouse
compared to these high performance records. But still,
it pushed the thing a little bit since now on Pari-GP
and Mathematica it is fast,
I do not know what they are using but
the computation of Bernoulli numbers is really fast
compared to the old recurrence formula (which was a waste
of cpu time). So, by pushing the computation to that
limit helped to improve the general behavior of those
programs, it forced them to re-consider that particular
computation.

Sometimes, you have to shake the cage a little.

I submitted a 'compte-rendu' of this story and the
algorithm to a journal, essentialy I have been told
that despite the originality of my results , this
is 'dépassé', well, what a surprise...
In the mean time, maple is still slow on that topic
compared to mathematica and pari-gp.

I am not that certain that these computations to
millions of digits are not important. In my opinion
it is.

Best regards,

Simon Plouffe

clicl...@freenet.de

unread,

Jan 18, 2010, 7:53:01 PM1/18/10

to

plouffe schrieb:

>
> The problem I think is that the bignum package does not use Karatsuba
> algorithm to multiply big numbers.
>

The Karatsuba algorithm, being an n^1.585 algorithm, isn't very
interesting for 10^6 decimal digits. Fast Fourier Transform gives rise
to an n*log(n) algorithm, see the chapter on Less Numerical Algorithms
in Numerical Recipes by Press, Teukolsky, Vetterling, Flannery.

Martin.

Dann Corbit

unread,

Jan 18, 2010, 9:04:27 PM1/18/10

to

In article <4B55026D...@freenet.de>, clicl...@freenet.de says...

If the Plouffe you are responding to is this one:
http://www.lacim.uqam.ca/~plouffe/
who is the author of this thing (among other things):
http://pi.lacim.uqam.ca/

then I guess he knows a lot about multiple precision mathamatics.

Be that as it may, ffts are usually used for very large precisions. Not
sure where Karatsuba coughs out and fft takes over, but surely by the
millions of digits and probably by the tens of thousands for most
implementations.

In any case, there appears to be a defect in the bignum package in use
(presumably, it is using the "schoolbook" algorithm for large
precisions, which would be pretty awful.

clicl...@freenet.de

unread,

Jan 19, 2010, 12:13:31 AM1/19/10

to

Dann Corbit schrieb:

>
> In any case, there appears to be a defect in the bignum package in use
> (presumably, it is using the "schoolbook" algorithm for large
> precisions, which would be pretty awful.

The Maple timings in the link you provided,

<http://pari.math.u-bordeaux.fr/benchs/timings-mpfr.html>,

show Karatsuba scaling out to n = 10^4 decimal digits. So Maple's
Karatsuba multiplication must start to fail somewhere between n = 10^4
and 10^6. This should be easy to pin down.

Martin.

Roman Pearce

unread,

Jan 20, 2010, 8:04:13 PM1/20/10

to

It would be silly to cache a million digits of a constant like this.
Far more useful to simply write a fast routine. It's a 1.1MB number.
That's not big. Maybe it would have been in 1995.

G. A. Edgar

unread,

Jan 21, 2010, 2:08:25 PM1/21/10

to

It seems Maple computes e as hypergeom([],[],1) ...
Now in fact computing e to 1 million places is a one-time computation:
you don't do that exact thing repeatedly.

Maybe it would also be useful to compare these different CASs in
a computation that they haven't been able to prepare in advance for.
Say... hypergeom([1,1],[1/2,3/2],1) to a million places.
I got 117 seconds (nearly 2 minutes) in Maple 13 on this old 2.4 GHz
Mac.

--
G. A. Edgar http://www.math.ohio-state.edu/~edgar/

Axel Vogt

unread,

Jan 21, 2010, 2:58:23 PM1/21/10

to

G. A. Edgar wrote:
> It seems Maple computes e as hypergeom([],[],1) ...
> Now in fact computing e to 1 million places is a one-time computation:
> you don't do that exact thing repeatedly.
>
> Maybe it would also be useful to compare these different CASs in
> a computation that they haven't been able to prepare in advance for.
> Say... hypergeom([1,1],[1/2,3/2],1) to a million places.
> I got 117 seconds (nearly 2 minutes) in Maple 13 on this old 2.4 GHz
> Mac.
>

Or staying to exp: how do they compare for exp(+-sqrt(2.0)) ?

Phil Carmody

unread,

Jan 23, 2010, 6:29:47 PM1/23/10

to

A N Niel <ann...@nym.alias.net.invalid> writes:

Entirely possibly from modern bloatware - no-body will notice anyone
secreting a million digits of E somewhere dark and dank. However, what
stops Newton-Raphson from giving you quadratic convergence?

Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1

Phil Carmody

unread,

Jan 23, 2010, 6:32:32 PM1/23/10

to

plouffe <simon....@gmail.com> writes:
> Hello,
>
> I made this simple test, computing exp(1) or E to 1 million digits
> with
> Maple 13, Mathematica 7.0 and Pari Gp 2.3.4 on a 2x4 Intel E5405
> running
> at 2.0 Ghz with 4 gigs of RAM. All installations are 64 bits on a
> Ubuntu 9.10.
>
> Maple 13 : ------------------------- 165 seconds
> Mathematica 7: ------------------ 1.11 seconds
> Pari GP 2.3.4 : -------------------- 936 seconds
>
> What is this!??

Pari/GP 2.3.1 on 1 core of a 2GHz core2duo (the other core is
thrashing wine and crossover office currently, so this figure
is not as low as it might be)

? exp(1);
time = 1mn, 48,306 ms.

It looks like you have a seriously misconfigured Pari.

Phil Carmody

unread,

Jan 23, 2010, 6:34:57 PM1/23/10

to

plouffe <simon....@gmail.com> writes:
> Hello,
>
> 1) I do no use precomputed values.
> 2) By using an old trick which is exp(1/2^m) and then squaring m
> times
> the time taken in maple can be reduced to about 2 minutes.
> This is explained in the classic book of Henrici on numerical
> analysis.
> 3) It is true that for Pi, Maple used to have the first 10000 digits
> stored in the program, I do not know about E. 10000 digits is
> small
> , this takes a fraction of a second on a modern machine.
>
> The problem I think is that the bignum package does not use Karatsuba
> algorithm to multiply big numbers.

*Using* Karatsuba would be a dreadful mistake on the million-digit
numbers you mention.

David Bernier

unread,

Jan 24, 2010, 4:15:26 AM1/24/10

to

[...]

Obviously, I get results for Pari/GP much more in-line with
yours, but I can't explain Simon Plouffe's Pari-GP "anomalous"
timings.

For 1 million of exp(1) , I estimated 1min 17 seconds.
For 2 million digits of exp(1), I estimated 7 min 12 seconds.

The default allocation of 8,000,000 bytes wasn't enough,
[ the PARI stack overflows ! ]
and neither was 50,000,000 bytes.

But 250,000,000 bytes was enough for the 'e' to
1 million and 'e' to 2 million digits computations.

(Stack size increased via allocatemem(250000000) at the command-line.)

My set-up in software and hardware:

GP/PARI CALCULATOR Version 2.3.4 (released)
amd64 running linux (x86-64/GMP-4.2.4 kernel) 64-bit version
compiled: Feb 26 2009, gcc-4.4.0 20090219 (Red Hat 4.4.0-0.21) (GCC)
Intel QuadCore, Q6600 at 2.4 GHz .

P.S. Fedora 11 for x86_64 4 Gig Ram, and one 1-core process
running apart from Pari-gp computation of 'e'.

David Bernier

plouffe

unread,

Jan 28, 2010, 10:30:20 AM1/28/10

to

HEllo,

of course I called GP with the option -s 1000000000
because the standard option cannot compute E to
1 million digits.

I used the pari GP 2.3.4 64 bits on a E5405 2x4 intel processor
with 4 gigs of ram and Ubuntu 9.10

simon plouffe