
Mar 22, 2002, 7:01:01 PM3/22/02


I've added some more programs. It looks like this now.

MockMMA                5.8 seconds (including 2.1 in GC)
Macsyma (in ACL 6.1)   6.9 seconds (including 2.0 in GC)
Hashing                8.6 seconds (including 2.2 in GC)
Fermat                 9.2 seconds
Macsyma (maxima5.5)   32.0 seconds (? GC not reported, second trial)
Macsyma (commercial)  37.5 seconds (including 29.78 in GC)
Macsyma (maxima5.5)   53.5 seconds (? GC not reported, first trial)
Maple7                50.0 seconds (? GC, it was slower on 2nd trial)
Mathematica 4.1       82.3 seconds (? it was slower on 2nd trial)
MuPad 1.4.2 Pro      118.6 seconds

for multiplying the expanded version of (1+x+y+z)^20 by itself.

details on http://www.cs.berkeley.edu/~fateman/papers/fastmult.pdf

comments invited.

RJF
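The benchmark itself is easy to re-create. Here is a minimal Python sketch (illustrative code, not any of the systems measured above) using a dictionary keyed by exponent tuples; Python's built-in arbitrary-precision integers stand in for the Lisp bignums:

```python
# Re-creation of the benchmark: expand (1+x+y+z)^20, then square the
# result.  A polynomial is a dict mapping an exponent tuple (ex, ey, ez)
# to its (arbitrary-precision) integer coefficient.

def mul(p, q):
    res = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = (e1[0] + e2[0], e1[1] + e2[1], e1[2] + e2[2])
            res[e] = res.get(e, 0) + c1 * c2
    return res

def power(p, k):
    res = {(0, 0, 0): 1}          # the polynomial 1
    for _ in range(k):
        res = mul(res, p)
    return res

base = {(0, 0, 0): 1, (1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}  # 1+x+y+z
p20 = power(base, 20)   # C(23,3) = 1771 monomials
p40 = mul(p20, p20)     # C(43,3) = 12341 monomials
print(len(p20), len(p40))
```

The monomial counts (1771 and 12341, i.e. all monomials of total degree at most 20 and 40 in three variables) give a quick sanity check on any implementation.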

Mar 23, 2002, 10:30:33 AM3/23/02


Richard Fateman wrote:

Interesting challenge, just tested with Giac (static readline version

compiled with gcc 2.96 CXXFLAGS=-O2)

35s on a Celeron 766Mhz, 100M RAM, 128K cache (kernel=Linux 2.4.3)

Memory used was around 10M.

Giac uses an ordered distributed representation for polynomials. The monomial

exponents are stored in a vector of ints. Bignum arithmetic uses GMP. There

is no garbage collector.

I believe that it would be interesting to know the exact representation

used for polynomials in the systems you tested, as well as the limits of the

representations.

For example, recursive representation is obviously faster on this

problem than distributed representation (where sorting is needed)...

Mar 23, 2002, 11:48:57 AM3/23/02


Thanks for the additional data. I would guess that Giac, normalized

for the difference between 933Mhz and 766Mhz, would be about 28 seconds.

The exact representation for the systems I know about can be

viewed directly (I can provide the source code) from MockMMA,

Macsyma (public domain version), and Hashing. I cannot provide

the commercial Macsyma, though I think it is probably the same,

nor can I provide Maple, Mathematica or Mupad. There is

some description of Maple in the literature.

The reference URL has some details, but the essence is that

MockMMA uses recursive representation, dense-vectors of bignums

at each level x,y,z.

Macsyma uses recursive representation, sparse lists of

exponent/coefficient (bignum) at each level.

Hashing uses a distributed form, unordered hashtable with

key = exponents compressed into a short integer

value = bignum.

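For concreteness, the recursive representation can be sketched in a few lines of Python (an illustration of the idea, not MockMMA's actual code): a polynomial is either a number or a dense list of coefficients of the outermost variable, each coefficient itself a polynomial one level down.

```python
# Illustrative sketch of a recursive dense representation, in the spirit
# of the MockMMA description above.  A polynomial is either an integer
# or a list [c0, c1, ...] of coefficients of the outermost variable;
# each coefficient is itself a polynomial one variable down.

def radd(a, b):
    if not isinstance(a, list) and not isinstance(b, list):
        return a + b                 # bignum addition at the bottom level
    if not isinstance(a, list): a = [a]
    if not isinstance(b, list): b = [b]
    if len(a) < len(b): a, b = b, a
    return [radd(x, y) for x, y in zip(a, b)] + a[len(b):]

def rmul(a, b):
    if not isinstance(a, list) and not isinstance(b, list):
        return a * b                 # bignum multiplication at the bottom
    if not isinstance(a, list): a = [a]
    if not isinstance(b, list): b = [b]
    res = [0] * (len(a) + len(b) - 1)   # dense convolution of coefficients
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            res[i + j] = radd(res[i + j], rmul(ca, cb))
    return res

# 1+x+y+z with z outermost: z^0 coefficient is 1+x+y, z^1 coefficient is 1
p = [[[1, 1], [1]], [[1]]]
print(rmul(p, p))   # expansion of (1+x+y+z)^2
```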

As far as limits: the way I implemented hashing, all the

exponents are stored in one small integer. Generalizing this

to a vector of integers is simple but would increase storage

and slow down computation. The hashing representation is

much smaller than macsyma or mockmma, and so could handle

larger problems. There are no limits on the largest exponent

in the Macsyma representation because x^(1000)+1 is simply

a list (x 1000 1 0 1). In the Mockmma representation,

there is a limit in that x^(1000)+1 would be stored in an

array of 1001 coefficients, and so you would run out of

memory sooner.
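The exponent packing can be sketched in a few lines of Python (my own illustration, using the same 8-bits-per-variable layout as the C++ program posted later in the thread; the actual field widths are an implementation choice):

```python
# Pack three exponents into one machine word: x at bit 0, y at bit 8,
# z at bit 16.  With 8 bits per field each exponent must stay below 256,
# and adding two packed keys adds the exponents componentwise -- so a
# monomial product is a single integer addition -- provided no field
# overflows into its neighbor.

def pack(ex, ey, ez):
    return ex | (ey << 8) | (ez << 16)

def unpack(key):
    return key & 0xFF, (key >> 8) & 0xFF, (key >> 16) & 0xFF

k1 = pack(3, 5, 7)
k2 = pack(1, 2, 3)
assert k1 + k2 == pack(4, 7, 10)   # x^3 y^5 z^7 * x y^2 z^3 = x^4 y^7 z^10
print(unpack(k1 + k2))
```

The overflow caveat is real: pack(255,0,0) + pack(1,0,0) carries into the y field, which is why the exponent range is bounded by the field width.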

Maple, at least on some machines, had a limit on the

number of terms in an expression. This was occasionally

encountered. I don't know if this limit has been removed

for Maple 7.

I think that each system has some maximum bignum

size; while many systems seem to have a design that

would allow one to fill all of memory with one long

chain of bignum digits, in practice there is usually

something else that stops it.

RJF


Mar 23, 2002, 2:07:43 PM3/23/02


Richard Fateman <nos...@nowhere.net> wrote:

: Thanks for the additional data. I would guess that Giac, normalized

: for the difference between 933Mhz and 766Mhz, would be about 28 seconds.


Please be careful with such comparisons. I have seen huge (factor 2)

differences on one machine when the malloc algorithm was changed. Also,

memory access latency is usually more important than CPU cycles. To be

sure, use one machine only and run all the tests under the same OS.

[...]

: Maple, at least on some machines, had a limit on the

: number of terms in an expression. This was occasionally

: encountered. I don't know if this limit has been removed

: for Maple 7.

Yes, it has for Maple 6.

: I think that each system has some maximum bignum

: size; While many systems seem to have a design that

: would allow one to fill all of memory with one long

: chain of bignum digits, in practice there is usually

: something else that stops it.

Really??? What stops it on Maple or Mathematica? In CLN's exact

types we certainly have no such limits.

Regards

-richy.

--

Richard B. Kreckel

<Richard...@Uni-Mainz.DE>

<http://wwwthep.physik.uni-mainz.de/~kreckel/>

Mar 23, 2002, 1:50:22 PM3/23/02


> Hashing uses a distributed form, unordered hashtable with

> key = exponents compressed into a short integer

> value = bignum.

>


By short integer, you mean something that fits in a register of the CPU?

That would for 3 vars mean a max exponent of about 1000 on a 32 bit CPU?

> As far as limits: the way I implemented hashing, all the

> exponents are stored in one small integer. Generalizing this

> to a vector of integers is simple but would increase storage

> and slow down computation. The hashing representation is

> much smaller than macsyma or mockmma, and so could handle

> larger problems.

Correct me if I'm wrong, but this representation would not

work if you had too many variables because the short int

representation would not allow some powers?

Mar 23, 2002, 2:58:39 PM3/23/02


Parisse Bernard wrote:

>>Hashing uses a distributed form, unordered hashtable with

>>key = exponents compressed into a short integer

>>value = bignum.

>>

>>

>

> By short integer, you mean something that fits in a register of the CPU?

> That would for 3 vars mean a max exponent of about 1000 on a 32 bit CPU?

That's right. It would require no change in the source code in lisp,

but would make it run slower if you made the exponents larger than

a word. Or we could hash on lists, which would be both larger and

slower.

>

>

>>As far as limits: the way I implemented hashing, all the

>>exponents are stored in one small integer. Generalizing this

>>to a vector of integers is simple but would increase storage

>>and slow down computation. The hashing representation is

>>much smaller than macsyma or mockmma, and so could handle

>>larger problems.

>>

>

> Correct me if I'm wrong, but this representation would not

> work if you had too many variables because the short int

> representation would not allow some powers?

It would still work, but be slower since the exponents

would be encoded as bignums. I guess I could time some

version of that too.

>

Mar 24, 2002, 5:00:20 AM3/24/02


Richard Fateman wrote:

>

> Parisse Bernard wrote:

>

> >>Hashing uses a distributed form, unordered hashtable with

> >>key = exponents compressed into a short integer

> >>value = bignum.

> >>

> >>

> >

> > By short integer, you mean something that fits in a register of the CPU?

> > That would for 3 vars mean a max exponent of about 1000 on a 32 bit CPU?

>

> That's right. It would require no change in the source code in lisp,

> but would make it run slower if you made the exponents larger than

> a word. Or we could hash on lists, which would be larger and

> slower both.

>


I have made a corresponding C++ program (see below) and I get around 18s

on my Celeron 766Mhz; memory used was around 2M.

You can replace giac and gen by any library/class representing bignums

(gen is a generic type that can represent other data types than just

large integers.)

#include <giac/giac.h>
#include <iostream>
#include <map>
#include <ctime>

using namespace std;
using namespace giac;

typedef map<int,gen> pol;

void fastmult(const pol & p1, const pol & p2, pol & product){
  product.clear();
  pol::const_iterator it1=p1.begin(), it1end=p1.end();
  pol::const_iterator it2, it2end=p2.end();
  pol::iterator prod_it;
  int sum1, sum;
  gen coeff1;
  for (; it1!=it1end; ++it1){
    sum1=it1->first;
    coeff1=it1->second;
    for (it2=p2.begin(); it2!=it2end; ++it2){
      sum=sum1+it2->first;
      prod_it=product.find(sum);
      if (prod_it==product.end())
        product[sum]=coeff1*it2->second;
      else
        prod_it->second = prod_it->second + coeff1*it2->second;
    }
  }
}

void fastpower(const pol & p, int k, pol & res){
  if (k==0){
    res.clear();
    res[0]=1;
    return;
  }
  res=p;
  pol tmp;
  for (int i=1; i<k; ++i){
    tmp=res;
    fastmult(tmp, p, res);
  }
}

ostream & operator << (ostream & of, const pol & p){
  pol::const_iterator it=p.begin(), itend=p.end();
  for (; it!=itend; ++it)
    of << it->second << "[" << it->first << "] ";
  return of;
}

int main(){
  int k;
  cout << "Enter power: ";
  cin >> k;
  pol xyz;
  xyz[0]=1;      // constant coefficient
  xyz[1]=1;      // x coefficient
  xyz[1<<8]=1;   // y coefficient
  xyz[1<<16]=1;  // z coefficient
  cout << xyz << endl;
  // compute xyz^k (k=20 for the benchmark)
  pol xyz20, xyz40;
  fastpower(xyz, k, xyz20);
  cout << xyz20.size() << endl;
  clock_t start=clock();
  fastmult(xyz20, xyz20, xyz40);
  // cout << xyz40 << endl;
  clock_t end=clock();
  cout << end-start << endl;
  cout << "Number of monomials: " << xyz40.size() << endl;
}

Mar 24, 2002, 3:53:45 PM3/24/02


Thanks to all of you for contributing!

I think that multiplication using hashtables

in Lisp looks somewhat neater than writing it in C++,

at least if you are familiar with Lisp. I have removed

some error checking from the code below, but the n^2

operation is just a doubly-nested "maphash".

(defun hashtimes (r1 r2)   ; r1 and r2 are hash tables
  (let ((res (make-hash-table :test #'eq)))
    (maphash #'(lambda (key val)
                 (maphash #'(lambda (key2 val2)
                              (incf (gethash (+ key key2) res 0)
                                    (* val val2)))   ; * may be bignum mult
                          r2))
             r1)
    ;; optional: remove 0 terms from the hashtable result
    (maphash #'(lambda (key val) (if (eql 0 val) (remhash key res))) res)
    ;; make the form of the result acceptable to macsyma
    `(($hashtable simp ,res ,vars) |$encoded| ,(hash-table-count res))))

I think that to make these programs much faster we would

have to

(a) make bignum arithmetic faster

or

(b) make bignum arithmetic cause less storage allocation.

One way to do this is to change bignum mult or add so that

we compute x:=a*b+x

by computing a*b in a temporary location that is immediately

reclaimed and doesn't need to be garbage collected (in lisp

this would be called stack-allocated variables),

and by re-using the memory cells used for x in the new

value for x.

A really clever lisp compiler could know all this because

the inner statement is

(incf x (* v v2)) where x= hashtable access.

It is obvious that the value (* v v2) is not used anywhere

else because it is inaccessible. It is obvious that x

can be overwritten because the statement (incf x y) means

x:=x+y.

The compiler I'm using is not nearly so clever, and in

fact it even recomputes the hash location twice. If it

were changed to compute once, it would improve the time

by some 10%.

RJF

Mar 25, 2002, 7:29:08 AM3/25/02


Richard Fateman <fat...@cs.berkeley.edu> wrote in message news:<3C9BC5B...@cs.berkeley.edu>...

I tried your example on meditor: it took 204.5 seconds on a 700MHz

processor. As pointed out by R. Kreckel, you should test it on your

computer (with an up-to-date JVM) for accuracy. The polynomials are

represented in distributed form, ordered using a beta-tree with

monomials as keys and coefficients as values. The monomials are arrays

of integer exponents. The coefficients are java BigIntegers. Here is

the program:

import spoke.math.*;
import spoke.math.analyse.*;

public class Essai {
    public static void main(String args[]) {
        Arithmetique a = Polynome.valueOf(Expression.valueOf("(1+x+y+z)^20"),
            new Variable[] {new Constante("x"), new Constante("y"),
                            new Constante("z")},
            Lexicographique.comparer, 0);
        System.out.println(a);
        long t = System.currentTimeMillis();
        Arithmetique b = a.multiply(a);
        System.out.println(System.currentTimeMillis() - t);
    }
}


Mar 26, 2002, 2:32:00 PM3/26/02

to

"Richard B. Kreckel" a écrit :

>

> Richard Fateman <nos...@nowhere.net> wrote:

> : Thanks for the additional data. I would guess that Giac, normalized

> : for the difference between 933Mhz and 766Mhz, would be about 28 seconds.

>

> Please be careful with such comparisons. I have seen huge (factor 2)

> differences on one machine when the malloc algorithm was changed. Also,

> memory access latency is usually more important than CPU cycles. To be

> sure, use one machine only and run all the tests under the same OS.

>


Right. I have tested on a PIII 800Mhz 256K cache and I get 19.5s for

Giac

and 13.5s for the C++ program.

Mar 27, 2002, 11:05:18 PM3/27/02


Here are some results on my 700 MHz PIII Linux machine using (not yet released

but getting close) Maxima 5.9.0:


Maxima 5.9.0 (old, using GCL) : 112.21 seconds
Maxima 5.9.0 (new*, using GCL):  18.80 seconds
Maxima 5.9.0 (using clisp)    :  31.30 seconds
Maxima 5.9.0 (using CMUCL)    :   7.91 seconds

*The new version includes some default memory preallocation with GCL. The

preallocation hasn't made it into cvs yet.

Given that the numbers below come from a 933 MHz machine, I think the above

numbers compare pretty well.

--Jim

Mar 28, 2002, 11:35:37 AM3/28/02


In article <ua55h5k...@corp.supernews.com>, James Amundson

<amun...@fnal.gov> wrote:


I tried this on my 533 MHz G4 PowerMac and got the following:

Maple7 57.0 seconds

MapleV 21.1 seconds

Does anyone know if this is true in general, that Maple7 is 2 to 3 times

slower than MapleV?

John

Mar 28, 2002, 11:44:26 AM3/28/02


John Greene <jgr...@d.umn.edu> wrote:

> [...]

> Does anyone know if this is true in general, that Maple7 is 2 to 3 times

> slower than MapleV?

Certainly not. See ?updates,Maple7,efficiency for counterexamples.

--

Thomas Richard

Maple Support

Scientific Computers GmbH

http://www.scientific.de

Mar 28, 2002, 2:52:31 PM3/28/02


I think the evidence is that this computation

has gotten slower, not that EVERYTHING is slower!

See below for the latest collection of times

for this problem.

I got a copy of FORM 3.0 for my machine.

Really fast, but quite different from the other

systems.

RJF


Thomas Richard wrote:

> John Greene <jgr...@d.umn.edu> wrote:

>

>

>>[...]

>>Does anyone know if this is true in general, that Maple7 is 2 to 3 times

>>slower than MapleV?

>>

>

> Certainly not. See ?updates,Maple7,efficiency for counterexamples.

>

>

Form 3.0*              0.9 sec (or 94 sec, depending...)
Reduce 3.7 (in CSL)    5.0 sec (including 1.3 in GC)
MockMMA (in ACL 6.1)   5.8 sec (including 2.1 in GC)
Macsyma (in ACL 6.1)   6.9 sec (including 2.0 in GC)
Hashing (in ACL 6.1)   8.6 sec (including 2.2 in GC)
Fermat                 8.6 sec (fastest of 2 runs)
Maxima5.5 (in GCL)    32.0 sec (? GC not reported, second trial)
Macsyma (commercial)  37.5 sec (including 29.78 in GC)
Maxima5.5 (in GCL)    53.5 sec (? GC not reported, first trial)
Maple7                50.0 sec (? GC, it was slower on 2nd trial)
Mathematica 4.1       82.3 sec (? it was slower on 2nd trial)
MuPad 1.4.2 Pro      118.6 sec

Times on other machines:

Mupad 1.4.2*         119.2 sec
Mupad 2.0*           117.0 sec
Maxima5.6*            51.0 sec
Giac2.96*             35.0 sec
Giac2.96* Hashing     18.0 sec
Ginac 1.0.8*          57.3 sec
MapleVR3*             16.7 sec
MapleVR4*             17.9 sec
MapleVR5*             18.4 sec
Meditor*             204.5 sec
Maxima 5.9.0 (GCL-1)* 112.21 sec
Maxima 5.9.0 (GCL-2)*  18.80 sec
Maxima 5.9.0 (clisp)*  31.30 sec
Maxima 5.9.0 (CMUCL)*   7.91 sec


*(Giac 2.96 compiled with gcc 2.96 CXXFLAGS=-O2 on a Celeron 766Mhz, 100M

RAM, 128K cache, kernel=Linux 2.4.3; data points provided by Parisse

Bernard. Numerous tests were run by Richard Kreckel on a P-III 1GHz

512kB cache Linux machine. Times for his machine seem to be about the same

as mine for MuPad 1.4.2, Maxima 5.6, and Mathematica, so the numbers for

Form, Ginac 1.0.8, Mupad 2.0, and MapleVR3,4,5 are probably about

comparable. Meditor took 204.5 sec on a 700Mhz processor; Meditor was run

by Raphael Jolly. The collection of Maxima 5.9 timings is from James

Amundson, running on a 700Mhz P-III Linux machine. The GCL-1 time is, I

believe, for the system that took 53 seconds on our machine. The GCL-2

time is given more memory allocation. Clisp is a byte-coded Common Lisp

that is expected to be slower. CMUCL is a highly optimizing implementation

of Common Lisp based on work at Carnegie Mellon Univ. If it scales up

comparably to the other measurements, it would be the fastest of the Lisp

systems. Currently CMUCL runs only on Linux.)

Mar 29, 2002, 1:15:21 AM3/29/02


> *(Giac 2.96 compiled with gcc 2.96 CXXFLAGS=-02 on Celeron 766Mhz 100M

> RAM 128K cache, kernel=Linux 2.4.3, data points provided by Paraisse

> Bernard.


Please note that it's not Giac 2.96, it's Giac 0.2.2 compiled with gcc

2.96.

And I get 19.5s (giac) or 13.5s (C++ prog) on a PIII 800Mhz with 256K

cache.

For those who want to test, the giac Linux executable can be downloaded

at

ftp://fourier.ujf-grenoble.fr/pub/hp48/giac.gz

(the source is available in the same directory).

Mar 30, 2002, 12:25:32 AM3/30/02


Hello *,

Here is one more data point:

Axiom 2.3 13.56 (evaluation) + 6.56 (GC)

This is Duron-600 running Linux.

Andrey

Mar 30, 2002, 12:28:57 AM3/30/02


And one more data point:

REDUCE 3.7 6.02 (evaluation) + 0.22 (GC)

PSL-based REDUCE on the same Duron-600 running Linux

Andrey

Apr 1, 2002, 3:37:58 PM4/1/02


>>>>> "Richard" == Richard Fateman <fat...@cs.berkeley.edu> writes:

Richard> lisp systems. Currently CMUCL runs only on Linux.

Just a quick correction. CMUCL runs on Linux and Solaris systems. I

think it also works on OpenBSD and NetBSD systems. Older versions ran

on Alphas, MIPS, and HPs. There has been some talk about reviving

some of these other platforms. The limiting factor has been access

and desire for these other platforms.

SBCL, a derivative of CMUCL, runs on several platforms as well,

including Linux/PPC. I expect about the same performance.

Ray

Apr 3, 2002, 3:04:59 AM4/3/02


Richard Fateman <fat...@cs.berkeley.edu> writes:

> MuPad 1.4.2 Pro 118.6 seconds

>

> for multiplying the expanded version of (1+x+y+z)^20 by itself.

>

> details on http://www.cs.berkeley.edu/~fateman/papers/fastmult.pdf

While this is certainly not satisfactory, I'd like to point out that

your comparison seems to be a bit unfair: If I read your paper

correctly, your own code uses a data structure holding only

polynomials, while your MuPAD code uses general expressions. Try

this version:

>> a:=expand((1+x+y+z)^20):
>> p:=poly(a):
>> q:=poly(a+1):
>> time(expand(p*q))

30000

I.e., 30 seconds. Not really good, but better.

--

Attachment? Nein: http://piology.org/ILOVEYOU-Signature-FAQ.html

begin LOVE-LETTER-FOR-YOU.txt.vbs

Christopher Creutzig, MuPAD group, Tel.: +49-5251-60-5525

end

Apr 3, 2002, 10:02:49 AM4/3/02


Your assumption about data structures is correct. I was not

aware of the poly() construction in MuPad and will test

your version soon.

Thanks for the suggestion. I'll have to learn about poly().


RJF

Apr 3, 2002, 11:54:20 AM4/3/02


Mupad's time is reduced from 118.6 to 29.4 seconds on my machine.

I was disappointed to see that

r := x+y;
poly(r)+1;        is not a poly
poly(poly(r)+1);  returns FAIL.

And that "poly" form is not contagious as is rat form in macsyma,

Except that poly(r)*poly(r) = poly(r)*r = poly(r^2) = expand(poly(r)*r).

So sometimes it is.

There is, in mupad, a substantial substructure of domains.

This poly form is part of that structure, but does not

seem to do what I would intuitively expect, at least in

returning FAIL.

RJF

Apr 6, 2002, 12:27:35 AM4/6/02


Just out of curiosity, since it seems you are using a dense representation,

how well does it fare on expressions like

p = (x^(2^100-1)+x^123+x^57+x^13)

?

It seems unfair that you use only one test, one sample point, one for which your representation

is efficient due to the sparse nature of (1+x+y+z)^20 (due to the '1').

Of course computers are good at dealing with arrays. But with sparse objects?

Wouldn't a general test suite be more fair? Wouldn't a sparse representation

be better overall?

Also, out of curiosity, are all the tests run on the same machine? Are you using

the CPU speed as the sole measure of the speed of the machine?

Ayal Pinkus

Apr 5, 2002, 5:11:30 PM4/5/02

to apinkus

apinkus wrote:

> Just out of curiosity, since it seems you are using a dense representation,

> how well does it fare on expressions like

>

> p = (x^(2^100-1)+x^123+x^57+x^13)

>

> ?

OK, there are 3 different representations of interest here.

the vector representation used by MockMMA

the list representation used by Macsyma (also Reduce... )

the hashtable representation used by (un-named 1-page lisp program).

For vector, it would be a big loss since I would have to allocate a vector

[1 0 0 0..... 1 0 0 0 0..... 1 ....1 0 0 0 0...] and that itself would

be impossible. However if one did this: let y=x^(2^100-1), then this

would possibly win.

For list, there would be no problem at all.

(x bignumber 1 123 1 57 1 13 1) would be about as fast.

For hash, I would have to have a much longer exponent key to encode this

stuff,

so the exponent would be a few hundred bits. It would slow things down

a little,

but not as much as making the coefficients really huge.
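For what it's worth, the hash version of this particular example can be sketched in Python (illustrative code, not any of the systems discussed), where dictionary keys may be arbitrarily large integers, so the few-hundred-bit exponent key costs only a slower hash:

```python
# Sparse univariate polynomial as a dict {exponent: coefficient}.
# The exponents themselves can be bignums, as in the example p above.

def mul1(p, q):
    res = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            s = e1 + e2                    # exponent key may be a bignum
            res[s] = res.get(s, 0) + c1 * c2
    return res

p = {2**100 - 1: 1, 123: 1, 57: 1, 13: 1}  # x^(2^100-1)+x^123+x^57+x^13
sq = mul1(p, p)
print(len(sq))   # distinct exponents in p^2
```

Squaring the 4-term p gives 10 distinct exponent sums here, since no two pairwise sums collide; a dense vector of length 2^101 would of course be out of the question.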

>

> It seems unfair you use only one test, one sample point, one for which your representation

> is efficient due to the sparse nature of (1+x+y+z)^20 (due to the '1').

I don't think it would make much difference to leave out the 1.

>

> Of course computers are good at dealing with arrays. But with sparse objects?

The hash table is a sparse object. So is the list.

>

> Wouldn't a general test suite be more fair? Wouldn't a sparse representation

> be better overall?

I think that judiciously switching between sparse and dense is a way

to win. If you only had one representation, sparse would be more

plausible.

More test cases are possible, but there has to be some selection

to make a point..

>

> Also, out of curiosity, are all the tests run on the same machine?

Yes, a 933 MHz Pentium III (?) that is in my office, running Windows 2000

5.00.2195. It says x86 family 6, model 8, stepping 6, 523,568KB RAM.

> Are you using

> the CPU speed as the sole measure of the speed of the machine?

Yes

Apr 6, 2002, 3:27:39 AM4/6/02

to Richard Fateman

Let me start off by saying that it is indeed an interesting test.

I maintain Yacas, and it did show the limitations of the system

very clearly. So thanks for sharing the information!

However, you compare the times the systems take, making it a

competition between systems based on this test.

>

> OK, there are 3 different representations of interest here.

> the vector representation used by MockMMa

> the list representation used by Macsyma (also Reduce... )

> the hashtable representation used by (un-named 1-page lisp program).

>

> For vector, it would be a big loss since I would have to allocate a vector

> [1 0 0 0..... 1 0 0 0 0..... 1 ....1 0 0 0 0...] and that itself would

> be impossible. However if one did this: let y=x^(2^100-1), then this

> would possibly win.

>

True, but then you would be helping the CAS. The whole point is to have

it done automatically? You can help the CAS even more by writing this in C++.

I'm sure you could hand-optimize it to outperform any Lisp compiler.

Write a C++ program specifically for this test. If I wanted to cheat I could

have the compiler unroll some template classes. Then runtime would be low.

If you are indeed interested in the coefficients of the expression, if they

really were of interest for some research, then this would be a valid and

fruitful approach.

Or, a closed form expression for the coefficients of the terms might be found.

> For list, there would be no problem at all.

> (x bignumber 1 123 1 57 1 13 1) would be about as fast.

>

Slower than your vector/array representation though, since arrays

can be dealt with more efficiently by computers. The point: other

systems might be slower, because they use sparse representations,

but at the same time they can solve problems quickly that would be

troublesome in your system. So that is why it is perhaps not a fair

test.

>

> I think that judiciously switching between sparse and dense is a way

> to win. If you only had one representation, sparse would be more

> plausible.

>

> More test cases are possible, but there has to be some selection

> to make a point..

>

Forgive me if this sounds rude, but in making the point, wouldn't it be fair

to discuss disadvantages also? What is the point of the article by the way?

1) That dense representations are good for some specific calculations?

Because then it would have been best to compare various representations

implemented under the same lisp system.

2) That lisp is the way to go? Then the same algorithm should have been

implemented under different systems, in different languages, for comparison.

3) The quality of MockMMA, but then an extensive test suite would have been

appropriate.

4) That it is an interesting test for CAS systems? That is true. But it is

unfair to make a time comparison between your 'special purpose' code and general

all-purpose systems, which have to perform reasonably for many more different inputs.

>

> > Are you using

> > the CPU speed as the sole measure of the speed of the machine?

>

> Yes

>

Ok, that is fair, I suppose.

But an interesting test nonetheless!

Ayal Pinkus
