May 2, 2008, 2:34:35 PM5/2/08

to sage-devel

Oleksandr Pavlyk reports on the Wolfram Blog that he has computed the

10 millionth Bernoulli number using Mathematica:

http://blog.wolfram.com/2008/04/29/today-we-broke-the-bernoulli-record-from-the-analytical-engine-to-mathematica/

How does sage's Bernoulli number implementation compare? I'd like to

see bernoulli(10^7) in sage beating Mathematica's time. And then

computing the 20 millionth Bernoulli number...

Fredrik

May 2, 2008, 2:56:30 PM5/2/08

to sage-devel

It takes about 30 seconds on my machine to get the 10^5-th Bernoulli

number. The Mathematica blog says it took a "development" version of

Mathematica 6 days to do the 10^7 calc. So it would probably take

some work, but we are not that badly off as is.

-M. Hampton

On May 2, 12:34 pm, Fredrik Johansson <fredrik.johans...@gmail.com>

wrote:

> Oleksandr Pavlyk reports on the Wolfram Blog that he has computed the

> 10 millionth Bernoulli number using Mathematica:http://blog.wolfram.com/2008/04/29/today-we-broke-the-bernoulli-recor...
May 2, 2008, 3:00:26 PM5/2/08

to sage-...@googlegroups.com

On May 2, 2008, at 2:56 PM, mhampton wrote:

> It takes about 30 seconds on my machine to get the 10^5 Bernoulli

> number. The mathematica blog says it took a "development" version of

> mathematica 6 days to do the 10^7 calc. So it would probably take

> some work, but we are not that badly off as is.

But what about the asymptotics? I tried 10^5 and 2*10^5 and 4*10^5

and it wasn't pretty.

david

May 2, 2008, 3:01:38 PM5/2/08

to sage-...@googlegroups.com

I might take a look at this, as there are some ways of computing B nos

which are very much faster than others, and not everyone knows them.

Pari has something respectable, certainly.

John

2008/5/2 mhampton <hamp...@gmail.com>:

May 2, 2008, 3:10:10 PM5/2/08

to sage-...@googlegroups.com

ok, so the docstring reveals (1) that the pari version is "by far the

fastest" as I suspected, but also that for n>50000 that we use a gp

interface rather than the pari library " since the C-library interface

to PARI

is limited in memory for individual operations" -- whatever that means!

fastest" as I suspected, but also that for n>50000 that we use a gp

interface rather than the pari library " since the C-library interface

to PARI

is limited in memory for individual operations" -- whatever that means!

That might explain David's timing observations.

I think the pari implementation is actually quite simple (and there is

a huge amount about Bernoulli nos in Henri Cohen's latest book too)

which suggests that doing a cython implementation would not be hard.

Maybe this is time for a repeat performance of the partitions

competition with M**ca!

John

2008/5/2 John Cremona <john.c...@gmail.com>:

May 2, 2008, 3:11:51 PM5/2/08

to sage-...@googlegroups.com

Sage's Bernoulli number command is *just* PARI/GP's bernoulli number implementation.

Last time I tried timing Sage versus Mathematica's Bernoulli number command

(which was 2 years ago), Sage was twice as fast.

William

May 2, 2008, 3:40:00 PM5/2/08

to sage-...@googlegroups.com

I couldn't find any information about the hardware that guy used.

64-bit? 32-bit?

1.8GHz or 3GHz? Could somebody write and ask?

Also, when I tried

bernoulli(10^7+2)

directly in Sage there were a couple of issues that arose, since that command

is much more designed for smaller input. I fixed those small issues.

I guess we'll see in a week ..

William

May 2, 2008, 3:41:48 PM5/2/08

to sage-...@googlegroups.com

On May 2, 2008, at 3:40 PM, William Stein wrote:

> Also, when I tried

>

> bernoulli(10^7+2)

>

> directly in Sage there were a couple of issues that arose, since

> that command

> is much more designed for smaller input. I fixed those small issues.

> I guess we'll see in a week ..

I hope you did:

sage: x = bernoulli(10^7 + 2)

and not

sage: bernoulli(10^7 + 2)

david

May 2, 2008, 3:43:27 PM5/2/08

to sage-...@googlegroups.com

On Fri, May 2, 2008 at 3:40 PM, William Stein <wst...@gmail.com> wrote:

>

> On Fri, May 2, 2008 at 11:34 AM, Fredrik Johansson

> <fredrik....@gmail.com> wrote:

> >

>

> > Oleksandr Pavlyk reports on the Wolfram Blog that he has computed the

> > 10 millionth Bernoulli number using Mathematica:

> > http://blog.wolfram.com/2008/04/29/today-we-broke-the-bernoulli-record-from-the-analytical-engine-to-mathematica/

> >

> > How does sage's Bernoulli number implementation compare? I'd like to

> > see bernoulli(10^7) in sage beating Mathematica's time. And then

> > computing the 20 millionth Bernoulli number...

>

> I couldn't find any information about the hardware that guy used.

> 64-bit? 32-bit?

> 1.8GHz or 3GHz? Could somebody write and ask?

I did earlier, and I hope he will answer.

didier

May 2, 2008, 3:43:49 PM5/2/08

to sage-devel

I think the asymptotics aren't going to go our way if we use pari. It

takes 11s for 10^5 and I've been sitting here for quite a few minutes

and didn't get 10^6 yet.

I think pari uses the zeta function to compute bernoulli numbers.

If I'm reading the code right, it first computes 1/zeta(n) using the

Euler product, then computes the numerator of the bernoulli number to

the required precision using this value, then divides by the required

denominator, which is just a product of primes.
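To illustrate that Euler-product step, here is a minimal sketch of my reading of it (not PARI's actual code; assumes Python with mpmath and sympy):

from mpmath import mp, mpf
from sympy import primerange

def inv_zeta(n, P, prec_bits=256):
    # 1/zeta(n) = product over primes p of (1 - p^(-n)), truncated at P.
    mp.prec = prec_bits
    r = mpf(1)
    for p in primerange(2, P + 1):
        r *= 1 - mpf(p) ** (-n)
    return r

Truncating at P leaves an error of very roughly 1/P^(n-1), which is what lets P stay small for large n.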

Bill.

On 2 May, 20:11, "William Stein" <wst...@gmail.com> wrote:

> On Fri, May 2, 2008 at 11:34 AM, Fredrik Johansson

>

> <fredrik.johans...@gmail.com> wrote:

>

> > Oleksandr Pavlyk reports on the Wolfram Blog that he has computed the

> > 10 millionth Bernoulli number using Mathematica:

> > http://blog.wolfram.com/2008/04/29/today-we-broke-the-bernoulli-recor...

May 2, 2008, 3:45:38 PM5/2/08

to sage-...@googlegroups.com, Kevin McGown

On Fri, May 2, 2008 at 12:10 PM, John Cremona <john.c...@gmail.com> wrote:

>

> ok, so the docstring reveals (1) that the pari version is "by far the

> fastest" as I suspected, but also that for n>50000 that we use a gp

> interface rather than the pari library " since the C-library interface

> to PARI

> is limited in memory for individual operations" -- whatever that means!

>

That's out of date now that Gonzalo T. and I fixed the pari interface

so that it automatically doubles the stack if needed. The

code needs to be fixed to never use gp by default. If you explicitly

use algorithm='pari' it will still switch to gp; changing this was

one of the fixes I made so I could try this.

Tom is trying the whole calculation directly in GP.

He also did a log fit to timings and estimates it should take

about a week in Sage on his machine. We'll see.

>

> That might explain David's timing observations.

>

> I think the pari implementation is actually quite simple (and there is

> a huge amount about Bernoulli nos in Henri Cohen's latest book too)

> which suggests that doing a cython implementation would not be hard.

>

> Maybe this is time for a repeat performance of the partitions

> competition with M**ca!

The complexity mostly depends on the precision one uses in

computing a certain Euler product approximation to zeta

and also the number of factors in the product. If you look

at the PARI source code the comments do *not* inspire confidence in

its correctness. I had a student give a provable bound on precision

and number of factors needed and wasn't able to get anything

as good as what PARI uses.

Here's the funny part of the PARI code (in trans3.c):

/* 1.712086 = ??? */

t = log( gtodouble(d) ) + (n + 0.5) * log(n) - n*(1+log2PI) + 1.712086;

-- William

May 2, 2008, 3:46:07 PM5/2/08

to sage-...@googlegroups.com

On May 2, 2008, at 3:43 PM, Bill Hart wrote:

> I think the asymptotics aren't going to go our way if we use pari. It

> takes 11s for 10^5 and I've been sitting here for quite a few minutes

> and didn't get 10^6 yet.

So far I have on a 2.6GHz opteron:

sage: time x = bernoulli(60000)

Wall time: 3.79

sage: time x = bernoulli(120000)

Wall time: 16.97

sage: time x = bernoulli(240000)

Wall time: 118.24

sage: time x = bernoulli(480000)

Wall time: 540.25

and I'll report back with 960000 hopefully within an hour.

david

May 2, 2008, 3:46:36 PM5/2/08

to sage-...@googlegroups.com

I did:

t=cputime(); pari.allocatemem(); pari.allocatemem();

pari.allocatemem(); pari.allocatemem(); pari.allocatemem();

pari.allocatemem(); pari.allocatemem(); pari.allocatemem(); b =

bernoulli(10^7+2, algorithm='pari'); b.save('bern.sobj');

save(t,'time.sobj')

after patching Sage to always use the PARI C library if algorithm='pari'.

-- William

May 2, 2008, 3:55:54 PM5/2/08

to sage-...@googlegroups.com

On May 2, 2008, at 3:45 PM, William Stein wrote:

> The complexity mostly depends on the precision one uses in

> computing a certain Euler product approximation to zeta

> and also the number of factors in the product. If you look

> at the PARI source code the comments do *not* inspire confidence in

> its correctness. I had a student give a provable bound on precision

> and number of factors needed and wasn't able to get anything

> as good as what PARI uses.

>

> Here's the funny part of the PARI code (in trans3.c):

>

> /* 1.712086 = ??? */

> t = log( gtodouble(d) ) + (n + 0.5) * log(n) - n*(1+log2PI) +

> 1.712086;

One way to check it is to use the bernoulli_mod_p_single() function,

which computes B_k mod p for a single p and k, and uses a completely

independent algorithm.

sage: x = bernoulli(240000)

sage: p = next_prime(500000)

sage: bernoulli_mod_p_single(p, 240000)

498812

sage: x % p

498812

sage: p = next_prime(10^6)

sage: bernoulli_mod_p_single(p, 240000)

841174

sage: x % p

841174

So I would say the answer is correct.

david

May 2, 2008, 4:04:10 PM5/2/08

to sage-...@googlegroups.com

I've done numerous similar tests, and

I definitely don't think PARI is giving wrong answers.

The issue is just that I've written a paper to generalize

the algorithm to generalized Bernoulli numbers, and was

very annoyed that I couldn't prove that even the algorithm

used by PARI worked.

-- William

May 2, 2008, 4:08:52 PM5/2/08

to sage-...@googlegroups.com

Funny this should come up. William just gave a take-home midterm in which we had to predict the runtime for various computations, so I wrote some generic code to help. According to my code, and some liberal assumptions, it should take 5.1 days. I've attached the plots that show the curves I fit to some runtime data (x-axis is log(n, 1.5), y-axis is seconds).

However, this same code predicted that computing the determinant of a 10000x10000 matrix with single-digit entries would take 20 hours, but it really took 30 hours. So my estimates are not to be trusted too much as the numbers grow...
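A minimal sketch of that kind of fit-and-extrapolate (not the actual midterm code, which wasn't posted; assumes numpy, and borrows the bernoulli() timings David posted earlier in the thread as sample data):

import numpy as np

# (n, seconds) pairs: the bernoulli() timings reported above in this thread.
ns = np.array([60000.0, 120000.0, 240000.0, 480000.0])
times = np.array([3.79, 16.97, 118.24, 540.25])

# Fit log(time) = a*log(n) + b, i.e. model the runtime as C * n^a,
# then extrapolate to n = 10^7. Very sensitive to noisy data points,
# as the determinant example shows.
a, b = np.polyfit(np.log(ns), np.log(times), 1)
print(np.exp(b + a * np.log(1e7)) / 86400.0, "days")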

May 2, 2008, 4:13:15 PM5/2/08

to sage-...@googlegroups.com

On May 2, 2008, at 4:08 PM, boo...@u.washington.edu wrote:

> Funny this should come up. William just gave a take-home midterm

> in which we had to predict the runtime for various computations, so

> I wrote some generic code to help. According to my code, and some

> liberal assumptions, it should take 5.1 days. I've attached the

> plots that show the curves I fit to some runtime data (x-axis is log

> (n,1.5) y-axis is seconds).

Sorry, could you please say more precisely what the two axes are? I'm

seeing negative time the way I interpret your statement.

david

May 2, 2008, 4:15:44 PM5/2/08

to sage-...@googlegroups.com

Sorry, the y-axis in the lower plot is log(time in seconds).

On Fri, 2 May 2008, David Harvey wrote:

May 2, 2008, 4:24:05 PM5/2/08

to sage-...@googlegroups.com

One more data point (2.6GHz opteron):

sage: time x = bernoulli(60000)

Wall time: 3.79

sage: time x = bernoulli(120000)

Wall time: 16.97

sage: time x = bernoulli(240000)

Wall time: 118.24

sage: time x = bernoulli(480000)

Wall time: 540.25

sage: time x = bernoulli(960000)

Wall time: 2436.06

The ratios between successive times are:

4.47757255936675

6.96758986446671

4.56909675236806

4.50913465987969

If we guess that it's "really" 4.5, then the complexity is N^(log

(4.5) / log(2)) = N^2.17. This is puzzling; I thought the algorithm

was supposed to be better than quadratic. Does anyone know what the

theoretical complexity is supposed to be?
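A quick plain-Python check of those ratios and the exponent:

>>> from math import log
>>> times = [3.79, 16.97, 118.24, 540.25, 2436.06]
>>> [round(b / a, 2) for a, b in zip(times, times[1:])]
[4.48, 6.97, 4.57, 4.51]
>>> log(4.5) / log(2)
2.169925001442312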

Anyway, extrapolating gives about 4.5 days, pretty much the same as

what Tom estimates. I'm going to start it running now.

david

May 2, 2008, 4:34:28 PM5/2/08

to sage-...@googlegroups.com

Here is some more information about the machine used to compute this:

---------- Forwarded message ----------

From: Oleksandr Pavlyk <pav...@gmail.com>

Date: Fri, May 2, 2008 at 4:29 PM

Subject: Re: Today We Broke the Bernoulli Record: From the Analytical

Engine to Mathematica

To: didier deshommes <dfde...@gmail.com>

Hi Didier,

I used Linux, with 64 bit AMD processor:

AMD Opteron(tm) Processor 250

cpu MHz : 1000.000

cache size : 1024 KB

and 8GB of memory, but as I say in the blog, I did

not use that much.

The calculations were done using development build of

Mathematica, but calculations will go through in any

flavor of Mathematica version 6 as well, to the best of

my knowledge. Just run

Timing[ result = BernoulliB[10^7]; ]

It will take about twice as long on 32-bit processors,

thus about 2 weeks.

Please do not hesitate to ask further questions.

Sincerely,

Oleksandr Pavlyk

On Fri, May 2, 2008 at 2:12 PM, didier deshommes <dfde...@gmail.com> wrote:

> Hi Dr. Pavlyk,

> My question is in referrence to your blog post:

> http://blog.wolfram.com/2008/04/29/today-we-broke-the-bernoulli-record-from-the-analytical-engine-to-mathematica/

>

> - Do you have the specs of the machine you ran this off? CPU, memory, etc.

> - I assume this function is in the development version of mathematica?

>

> Thanks for your informative post!

>

> didier

May 2, 2008, 4:38:30 PM5/2/08

to sage-devel

On May 2, 10:34 pm, "didier deshommes" <dfdes...@gmail.com> wrote:

> Here is some more information about the machine used to compute this:

> Hi Didier,

>

> I used Linux, with 64 bit AMD processor:

>

> AMD Opteron(tm) Processor 250

> cpu MHz : 1000.000

> cache size : 1024 KB

The 1000 MHz reading is presumably just power management idling the CPU; I
assume that it would run at full speed during the computation :)

> and 8GB of memory, but as I say in the blog, I did

> not use that much.

Cheers,

Michael

May 2, 2008, 5:22:24 PM5/2/08

to sage-devel

I did some computations using von Staudt's theorem, and up to 400000 there were

no errors. Of course that doesn't prove anything for much larger n.
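For reference, this is the kind of check meant (a minimal Sage sketch, not the code actually run): von Staudt-Clausen says that B_n plus the sum of 1/p over primes p with (p-1) | n is an integer.

sage: n = 1000
sage: s = bernoulli(n) + sum(1/p for p in prime_range(2, n + 2) if n % (p - 1) == 0)
sage: s in ZZ
True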

Bill.

On 2 May, 21:04, "William Stein" <wst...@gmail.com> wrote:

May 2, 2008, 7:30:44 PM5/2/08

to sage-devel

The theoretical complexity of all the algorithms that rely on

recurrences is supposed to be n^2. But this doesn't take into account

the size of the numbers themselves. When you do this they are all

about n^3 as far as I can see. You can use Ramanujan identities, the

Akiyama-Tanigawa algorithm, the identity used by Lovelace, but all are

n^3 or so.

The theoretical complexity of the version using the zeta function

looks something like n log n steps at precision n log n, i.e. time n^2

(log n)^2.
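For concreteness, here is one of those recurrence methods, the Akiyama-Tanigawa scheme, as a minimal plain-Python sketch with exact fractions (fine for small n, hopeless for n = 10^7):

from fractions import Fraction

def bernoulli_akiyama_tanigawa(n):
    # Quadratically many Fraction operations, but the entries grow,
    # which is where the ~n^3 bit complexity comes from.
    a = [Fraction(1, m + 1) for m in range(n + 1)]
    for j in range(1, n + 1):
        for m in range(n - j + 1):
            a[m] = (m + 1) * (a[m] - a[m + 1])
    return a[0]  # B_n, with the B_1 = +1/2 convention

For example, bernoulli_akiyama_tanigawa(12) returns Fraction(-691, 2730).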

Bill.

May 2, 2008, 8:21:06 PM5/2/08

to sage-devel

Actually, it might be n/log(n) steps, so the time might be something

like n^2 though there are other terms involved.

Bill.

May 3, 2008, 8:12:36 AM5/3/08

to sage-devel

I think I nearly understand what Pari does.

The value of B_n is given by zeta(n)*(2*n!)/(2^n pi^n). However,

zeta(n) is *very* close to 1 for large n. So one starts by computing

zeta to a precision given by the size of (2*n!)/(2^n pi^n) (which is

basically the size of B_n) with 3 added to the precision to take care

of small values of n.

To compute how large (2*n!)/(2^n pi^n) can be, one uses Stirling's

formula for n!, i.e. n! ~ sqrt(2n*pi)*n^n*e^(-n). The log of this

expression is never more than 0.1 of the log of n!. Taking the log of

the expression one gets when plugging in Stirling's approximation, one

gets that the log of B_n does not exceed (n + 0.5) * log(n) -

n*(1+log2PI) + log(2*sqrt(2*Pi)) + 0.1 except for very small n (for

which it is sufficient to add 3 to the precision).

Now we need to know how big the numerator can be in B_n. But we can

compute the denominator precisely using the Clausen-von Staudt

formula. Pari does this and calls the denominator d.

Now the log of the numerator is no bigger than t = log(d) + (n + 0.5)

* log(n) - n*(1+log2PI) + log(2*sqrt(2*Pi)) + 0.1 = log(d) + (n +

0.5) * log(n) - n*(1+log2PI) + 1.712086.
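And indeed log(2*sqrt(2*Pi)) + 0.1 is 1.712086 to six decimals, which would explain the "/* 1.712086 = ??? */" comment in trans3.c quoted earlier. A quick plain-Python check:

>>> import math
>>> round(math.log(2 * math.sqrt(2 * math.pi)) + 0.1, 6)
1.712086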

If this is how big the numerator of B_n is, we only need to compute it

to half a part in e^t. Thus we only need to compute 1/zeta(n) to half

a part in e^t. This is done by computing the inverse of the Euler

product for sufficiently many primes. Suppose we compute this for

primes up to p. Then the error in the inverse of the zeta function is

less than sum 1/p^n + 1/(p+1)^n + ... Consider this sum p terms at a

time. The first p are <= 1/p^n, the second p terms are <= 1/(2p)^n,

etc. Thus the error is quite a bit less than zeta(n) * p/p^n ~ 1/

p^(n-1).

Thus if we want this error to be less than half a part in e^t we need

to choose p to be about exp(t)^(1/(n-1)). Pari uses exp((t -

log(n-1)) / (n-1)) and I'm not sure where the extra log(n-1) comes

from, but either they use a slightly better approximation than me, or

they perhaps note that one doesn't need the first 1/p^n in the

approximation.

Pari could be improved by noting that one does not need the last few

terms of the numerator, as they can be recovered from Clausen-von

Staudt, which actually gives the fractional part of B_n. But for large

n, this is not a significant saving.
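Putting the recipe just described together, here is a minimal end-to-end sketch (a reconstruction for illustration, not PARI's actual code; assumes Python with mpmath and sympy, and even n >= 2):

from fractions import Fraction
from mpmath import mp, mpf, factorial, pi, log, exp, nint
from sympy import divisors, isprime, primerange

def bernoulli_zeta(n):
    assert n >= 2 and n % 2 == 0
    # Denominator d from Clausen-von Staudt: product of primes p with (p-1) | n.
    d = 1
    for k in divisors(n):
        if isprime(k + 1):
            d *= k + 1
    # t bounds the (natural) log of the numerator of B_n, as derived above.
    mp.prec = 64
    t = log(d) + (n + 0.5) * log(n) - n * (1 + log(2 * pi)) + 1.712086
    mp.prec = max(64, int(float(t) / 0.6931) + 32)  # nats -> bits, plus guard bits
    # Truncated Euler product for 1/zeta(n), primes up to exp((t - log(n-1))/(n-1)).
    P = int(exp((t - log(n - 1)) / (n - 1))) + 1
    inv_zeta = mpf(1)
    for p in primerange(2, P + 1):
        inv_zeta *= 1 - mpf(p) ** (-n)
    # |B_n| = 2 * n! * zeta(n) / (2*pi)^n; scale by d and round to an integer.
    num = int(nint(2 * factorial(n) / ((2 * pi) ** n * inv_zeta) * d))
    return Fraction(num if n % 4 == 2 else -num, d)

For example, bernoulli_zeta(12) returns Fraction(-691, 2730).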

Bill.

May 3, 2008, 8:14:19 AM5/3/08

to sage-devel

Sorry, this:

The log of this expression is never more than 0.1 of the log of n!.

should read:

The log of this expression is always within 0.1 of the log of n!.

Bill.

May 3, 2008, 9:57:38 AM5/3/08

to sage-devel

I probably also mean:

Then the error in the zeta function

not:

Then the error in the inverse of the zeta function

May 5, 2008, 4:02:54 PM5/5/08

to sage-...@googlegroups.com

My computation of bernoulli(10^7+4) using GP version 2.3.3 has completed in 217417011 milliseconds. That's about 2 days, 12 hours. Anybody know how I can print the thing to a file?

Machine:

Quad-core 2.0GHz Xeon, 1333MHz FSB, 32GB RAM.

Currently, my gp session is using 4GB of RAM.

May 5, 2008, 4:09:12 PM5/5/08

to sage-...@googlegroups.com

On Mon, May 5, 2008 at 1:02 PM, <boo...@u.washington.edu> wrote:

>

> My computation of bernoulli(10^7+4) using GP version 2.3.3 has completed in 217417011 milliseconds. That's about 2 days, 12 hours. Anybody know how I can print the thing to a file?

So PARI is already over twice as fast as Mathematica? Cool!

BTW, I started almost same computation at the same time on a 32-bit

build of PARI under os x, and it will probably take days more.

We need 64-bit Sage on OS X!

> Machine:

> Quad-core 2.0Ghz Xeon, 1333MHz FSB, 32GB RAM.

>

> Currently, my gp session is using 4GB of RAM.

Use the write command:

? write("foo", 92834028340)

?

--

William Stein

Associate Professor of Mathematics

University of Washington

http://wstein.org

May 6, 2008, 12:53:53 PM5/6/08

to sage-devel

That certainly merits a blog post somewhere - ?

May 6, 2008, 1:09:18 PM5/6/08

to sage-...@googlegroups.com

You might also want to check the result using the

bernoulli_mod_p_single function that I mentioned a few days ago on

this thread.

david

May 6, 2008, 1:15:00 PM5/6/08

to sage-devel

William has mentioned some congruence tests that we can perform -- I'd like to make sure that I got the right answer before we pat ourselves on the back too much.

May 6, 2008, 1:18:53 PM5/6/08

to sage-...@googlegroups.com

On Tue, May 6, 2008 at 10:15 AM, <boo...@u.washington.edu> wrote:

>

> William has mentioned some congruence tests that we can perform -- I'd like to make sure that I got the right answer before we pat ourselves on the back too much.

David Harvey's congruence tests would be pretty good. Just choose

*any* prime p > 10^7 + 10, say, and compute B_{10^7+4} modulo it

using David Harvey's function:

sage: p = next_prime(10^7+10)

sage: time bernoulli_mod_p_single(p, 10^7+4)

CPU times: user 0.49 s, sys: 0.00 s, total: 0.49 s

Wall time: 0.51

8087583

Pretty cool, eh?

William

May 6, 2008, 1:57:00 PM5/6/08

to sage-...@googlegroups.com

... and the natural question is: could you then reconstruct the

actual bernoulli number, by computing it modulo sufficiently many

primes? Well, I did the estimates a few days ago, and it turns out

that the asymptotic behaviour of this proposed algorithm is pretty

much *the same* as the one using the zeta function (i.e. the one Pari

uses); they are both around n^2 log^2(n), if I'm not mistaken.
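A back-of-envelope Sage sketch of that reconstruction (illustration only; the loop bound cheats by using the known answer to size the product of moduli, where a real implementation would use the numerator size estimate from earlier in the thread):

sage: n = 1000
sage: B = bernoulli(n); d = B.denominator()
sage: moduli = []; residues = []; M = 1; p = next_prime(n + 10)
sage: while M <= 2 * B.numerator().abs():
....:     residues.append(Integer(bernoulli_mod_p_single(p, n)) * d % p)
....:     moduli.append(p); M *= p; p = next_prime(p)
....:
sage: c = CRT_list(residues, moduli)     # numerator of B_n mod M, in [0, M)
sage: num = c if c <= M // 2 else c - M  # lift to the symmetric range
sage: num / d == B
True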

Unfortunately, I did some further tests, and even if I sped up

bernoulli_mod_p_single() by the largest factor I could conceive of,

overall it would still be maybe 50x slower than Pari. If anyone can

find an extra constant factor of 50x in this algorithm, I'd love to

hear about it.

david

May 6, 2008, 2:08:11 PM5/6/08

to sage-...@googlegroups.com

I think maybe yours parallelizes much more easily than

Pari's, so just use 1000 computers :-).

-- William

May 6, 2008, 2:12:30 PM5/6/08

to sage-...@googlegroups.com

I think a blog post with PARI timings and then timings for a modular

dsage approach would be cool.

dsage approach would be cool.

--Mike

May 6, 2008, 2:16:09 PM5/6/08

to sage-...@googlegroups.com

On May 6, 2008, at 2:12 PM, Mike Hansen wrote:

> I think a blog post with PARI timings and then timings for a modular

> dsage approach would be cool.

Probably not so cool, since it would be like 50 machines vs one machine.

david

May 6, 2008, 2:19:33 PM5/6/08

to sage-...@googlegroups.com

> Probably not so cool, since it would be like 50 machines vs one machine.

Sure, but the Mathematica blog post is about scalability: "In Mathematica, a

core principle is that everything should be scalable. So in my job of

creating algorithms for Mathematica I have to make sure that

everything I produce is scalable."

--Mike

May 6, 2008, 2:25:02 PM5/6/08

to sage-devel

I agree, I think demonstrating a distributed algorithm would be very

cool. From what I can tell of processor trends, we won't see enormous

gains in speed but we might see an awful lot of processors (like

Intel's prototype 80-core chip).

May 6, 2008, 2:55:15 PM5/6/08

to sage-...@googlegroups.com

But Pari's algorithm is already parallelisable (just farm off a bunch

of euler factors to each thread).

The only advantage my algorithm has in "scalability" is that each

thread needs only O(1) memory, instead of O(n log n) which would be

required for Pari's algorithm. So if memory were tight, but you had

lots of processors, and your n was really big, then perhaps this

algorithm would win. But when I think about the actual numbers

involved, the economics just don't work out. Even 80 processors is a

pathetically small number, given a 50x serial slowdown in the

algorithm. You would need thousands of cpus to even consider this

approach. And even then, it would only be worthwhile if each cpu had

very limited memory.

david

May 6, 2008, 5:40:30 PM