multivariate factoring - use Maxima?


Roman Pearce

Mar 31, 2008, 6:39:17 PM
to sage-devel
Please excuse a (possibly naive) suggestion, but why not use Maxima
for multivariate gcds and factorization? I looked at the source code
and it appears to do Hensel lifting for both. That is the correct
algorithm that Sage appears to need. I'm not sure how to run it mod p
or over GF(p^q), but it must be possible.

Mike Hansen

Mar 31, 2008, 7:27:11 PM
to sage-...@googlegroups.com
Hi Roman,

It seems that for characteristic 0, Maxima is quite a bit faster than
Singular, but there's some overhead in talking to Maxima over the
pexpect interface. In any case, it is still _way_ slower than what we
need. For example, Maxima takes around 10 seconds to do the bivariate
gcd of degree 100 listed here (
http://magma.maths.usyd.edu.au/users/allan/gcdcomp.html ). Magma on
the other hand takes 0.06 seconds.
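For anyone who wants to reproduce this kind of comparison, here is a
minimal sketch of how to time both backends from a Sage session (the
polynomials below are small stand-ins made up for illustration, not
the actual degree-100 example from Allan Steel's page):

sage: R.<x,y> = QQ[]
sage: g = (x + y + 1)^20
sage: p1 = g*(x - y + 2)^5; p2 = g*(x + 2*y - 1)^5
sage: time h1 = p1.gcd(p2)         # Singular, via libSingular
sage: mp1 = maxima(p1); mp2 = maxima(p2)
sage: time h2 = mp1.gcd(mp2)       # Maxima, over the pexpect interface

Note that the second timing includes the pexpect round-trips, which is
part of the overhead mentioned above.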

--Mike

William Stein

Mar 31, 2008, 7:38:48 PM
to sage-...@googlegroups.com
On Mon, Mar 31, 2008 at 4:27 PM, Mike Hansen <mha...@gmail.com> wrote:
>
> Hi Roman,
>
> It seems that for characteristic 0, Maxima is quite a bit faster than
> Singular, but there's some overhead in talking to Maxima over the
> pexpect interface. In any case, it is still _way_ slower than what we
> need. For example, Maxima takes around 10 seconds to do the bivariate
> gcd of degree 100 listed here (
> http://magma.maths.usyd.edu.au/users/allan/gcdcomp.html ). Magma on
> the other hand takes 0.06 seconds.

Even with the overhead Maxima is totally and completely unacceptable.
To take Joel Mohler's example of

t=-p10^170*X1^10*X2^10+p10^130*X1^10*X2^5+p10^130*X1^5*X2^10-p10^90*X1^5*X2^5+p10^80*X1^5*X2^5-p10^40*X1^5-p10^40*X2^5+1

I just tried this in Maxima, and it's been sitting there for about
*FIFTEEN MINUTES*:

sage: time maxima.eval('factor(-p10^170*X1^10*X2^10+p10^130*X1^10*X2^5+p10^130*X1^5*X2^10-p10^90*X1^5*X2^5+p10^80*X1^5*X2^5-p10^40*X1^5-p10^40*X2^5+1)')
... fifteen minutes and still going ...

Magma, on the other hand, is instant!

sage: R.<p10,g0,g1,g2,g3,g4,X1,X2>=ZZ[]
sage: t=-p10^170*X1^10*X2^10+p10^130*X1^10*X2^5+p10^130*X1^5*X2^10-p10^90*X1^5*X2^5+p10^80*X1^5*X2^5-p10^40*X1^5-p10^40*X2^5+1
sage: s = magma(t)
sage: print magma.eval('time Factorization(%s)'%s.name())
[
<p10^8*X2 - 1, 1>,
<p10^8*X1 - 1, 1>,
<p10^18*X1*X2 - 1, 1>,
<p10^32*X2^4 + p10^24*X2^3 + p10^16*X2^2 + p10^8*X2 + 1, 1>,
<p10^32*X1^4 + p10^24*X1^3 + p10^16*X1^2 + p10^8*X1 + 1, 1>,
<p10^72*X1^4*X2^4 + p10^54*X1^3*X2^3 + p10^36*X1^2*X2^2 + p10^18*X1*X2 + 1, 1>
]
Time: 0.000


Since the goal of Sage is to be a viable alternative to Maple, Magma,
Matlab, and Mathematica, and Maxima just doesn't hack it speed-wise,
the only choice is to implement something ourselves. E.g., the code
Joel posted at http://trac.sagemath.org/sage_trac/ticket/2179 does the
above example in about 0.10 seconds. That's a good start.

The important thing isn't what algorithm is implemented, but that the result
is fast(er than Magma).

-- William

> --Mike
>
>
>
> On Mon, Mar 31, 2008 at 3:39 PM, Roman Pearce <rpea...@gmail.com> wrote:
> >
> > Please excuse a (possibly naive) suggestion, but why not use Maxima
> > for multivariate gcds and factorization? I looked at the source code
> > and it appears to do Hensel lifting for both. That is the correct
> > algorithm that Sage appears to need. I'm not sure how to run it mod p
> > or over GF(p^q), but it must be possible.

--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

Joel B. Mohler

Mar 31, 2008, 8:41:37 PM
to sage-...@googlegroups.com
On Monday 31 March 2008 07:38:48 pm William Stein wrote:
> On Mon, Mar 31, 2008 at 4:27 PM, Mike Hansen <mha...@gmail.com> wrote:
> >  Hi Roman,
> >
> >  It seems that for characteristic 0, Maxima is quite a bit faster than
> >  Singular, but there's some overhead in talking to Maxima over the
> >  pexpect interface.  In any case, it is still _way_ slower than what we
> >  need.  For example, Maxima takes around 10 seconds to do the bivariate
> >  gcd of degree 100 listed here (
> >  http://magma.maths.usyd.edu.au/users/allan/gcdcomp.html ).  Magma on
> >  the other hand takes 0.06 seconds.
>
> Even with the overhead Maxima is totally and completely unacceptable.
> To take Joel Mohler's example of
>
> t=-p10^170*X1^10*X2^10+p10^130*X1^10*X2^5+p10^130*X1^5*X2^10-p10^90*X1^5*X2
>^5+p10^80*X1^5*X2^5-p10^40*X1^5-p10^40*X2^5+1
>
> I just tried this in Maxima, and it's been sitting there for about
> *FIFTEEN MINUTES*:

I'm amazed Maxima is so bad at that example -- it's really about as
easy as it gets (there are 3 factors, one does not contain X1 and one
does not contain X2 -- you really don't have to factor at all, just
gcd a couple of times; then each of these 3 factors splits again as a
difference of 5th powers).
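Indeed, the whole polynomial is just minus a product of three
differences of 5th powers, which a quick Sage check confirms (using a
smaller ring than William's, since the g's don't occur in t):

sage: R.<p10,X1,X2> = ZZ[]
sage: t = -p10^170*X1^10*X2^10+p10^130*X1^10*X2^5+p10^130*X1^5*X2^10-p10^90*X1^5*X2^5+p10^80*X1^5*X2^5-p10^40*X1^5-p10^40*X2^5+1
sage: u, v, w = p10^8*X1, p10^8*X2, p10^18*X1*X2
sage: t == -(u^5 - 1)*(v^5 - 1)*(w^5 - 1)
True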

I've been working very hard at this factoring for the last little bit,
as it's really caught my imagination. There's a bivariate factorization
scheme by Gao using differential equations which seems like it might be
a really good tool. The simple approach of specializing a variable at a
prime to reduce the number of variables results in humongous
coefficients and leads to a very slow univariate factoring problem. It
also has other technical difficulties, like spurious factors, which are
hard to resolve quickly.

I think there are 2 parts left to solving the factoring riddle for ZZ
(and possibly other base rings, but I've been focusing on ZZ):
1) Fast basic arithmetic with a polydict -- I'm making headway.
2) *Super-duper fast* gcd algorithms -- this is going to be much harder
than #1 and the factoring bit. It's utterly crucial to the factoring
methods -- crucial for finding squarefree decompositions (see the
sketch below), crucial for picking off easy factors like my "hard"
example has... We need to implement every algorithm we can find for
this, benchmark them every way we can, and develop good heuristics to
choose amongst them. It can't be stressed enough that gcd is central
to the entire undertaking.
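To illustrate how central gcd is, here is a toy version of Yun's
squarefree decomposition for a univariate polynomial in characteristic
0 -- it is essentially nothing but repeated gcds and exact divisions
(a sketch only; the multivariate and small-characteristic cases need
more care):

def yun_squarefree(f):
    # f univariate over QQ; returns [a1, a2, ...] with
    # f = lc(f) * a1^1 * a2^2 * ... and each ai squarefree
    fp = f.derivative()
    g = f.gcd(fp)
    b, c = f // g, fp // g
    d = c - b.derivative()
    parts = []
    while b.degree() > 0:
        a = b.gcd(d)              # the part of multiplicity i
        parts.append(a)
        b = b // a
        c = d // a
        d = c - b.derivative()
    return parts

sage: R.<x> = QQ[]
sage: yun_squarefree((x + 1) * (x + 2)^2)
[x + 1, x + 2]

Every pass through the loop is one gcd plus exact divisions, so a slow
gcd makes even this "prefix" of factoring slow.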

I'm hoping to have a first stab at industrial-strength mpoly factoring
over ZZ in time for 3.0. It won't have industrial-strength speed,
because the gcd goes to Singular, but a drop-in gcd replacement will
make a world of difference. Singular manages to suck in utterly grand
fashion at both gcd and factoring. I'm absolutely certain that we can
beat them handily (but probably not in the next month :-).

--
Joel

Roman Pearce

Mar 31, 2008, 9:48:44 PM
to sage-devel
> The important thing isn't what algorithm is implemented, but that the
> result is fast(er than Magma).

The important thing is whether users can hope to get an answer at all.
The old Singular factoring code was hopeless, and I bet multivariate
gcd is still hopeless. And what about correctness? The new code you
linked to looks like a probabilistic algorithm. There's no point in
being fast if you're wrong, and developing good routines will take
months. I simply wanted to point out the existence of what appear to
be correct routines, even if they are slow.

Also, to Joel Mohler:
You need "Algorithms for Computer Algebra" by Geddes, Czapor, and
Labahn:
Chapter 5: Chinese Remainder Theorem
Chapter 6: Newton's Iteration and Hensel Lifting
Chapter 7: Polynomial GCD
Chapter 8: Polynomial Factorization

What you are trying to do is not a "weeks long" project; it is one
of the central achievements of the entire field. It took a decade
to do the first time, so don't expect to have "industrial strength"
routines soon. It will realistically take months of full time work.
There's about 100 pages of material in that book, when you take out
the exercises, etc. You need it all.

The people on this list seem to hilariously underestimate the depth
of this problem, and that concerns me. I want Sage to succeed, and
you can't succeed with that attitude. This is a massive undertaking,
and treating it like it's not is ultimately very discouraging.

I'd spell it all out -- the things you need to do, the problems and
subproblems -- but it's easier and better if you just read the book.

Mike Hansen

Apr 1, 2008, 12:21:07 AM
to sage-...@googlegroups.com
I've posted some benchmarks at
http://wiki.sagemath.org/MultivariateGCDBenchmarks .

--Mike

William Stein

Apr 1, 2008, 12:36:46 AM
to sage-...@googlegroups.com
On Mon, Mar 31, 2008 at 9:21 PM, Mike Hansen <mha...@gmail.com> wrote:
>
> I've posted some benchmarks at
> http://wiki.sagemath.org/MultivariateGCDBenchmarks .
>

Just out of curiosity, is Magma vastly faster
than everything else in every single benchmark? I'm having
some trouble matching up what's at the above website with
the corresponding (?) timings on the Magma website. The
Magma timings are *all* < 0.65 seconds, whereas timings
on the above page are often many seconds to minutes.

By the way, Richard Fateman pointed out to me offlist that
Maxima running on top of clisp _might_ be much
slower than Maxima on gcl. This could be relevant to
our benchmarking.

-- William


William Stein

Apr 1, 2008, 12:55:57 AM
to sage-...@googlegroups.com
On Mon, Mar 31, 2008 at 6:48 PM, Roman Pearce <rpea...@gmail.com> wrote:
>
> > The important thing isn't what algorithm is implemented, but that the
> > result is fast(er than Magma).
>
> The important thing is whether users can hope to get an answer at all.
> The old Singular factoring code was hopeless, and I bet multivariate
> gcd is still hopeless. And what about correctness? The new code you
> linked to looks like a probabilistic algorithm.

Just because an algorithm is probabilistic
doesn't mean it necessarily gives wrong answers. By the way, see

http://www-fourier.ujf-grenoble.fr/~parisse/publi/gcdheu.pdf

which I think is a paper that proves correctness of the algorithm
we're talking about. (I've not carefully read the above short
paper, so I'm not vouching for it.)
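The heuristic in that paper ("gcdheu") is easy to state: evaluate both
polynomials at a large enough integer, take an ordinary integer gcd,
and read the candidate gcd's coefficients back off as balanced base-xi
digits. Here is a toy univariate version in plain Python (coefficient
lists, lowest degree first); a real implementation must verify the
candidate by trial division and retry with a new evaluation point on
failure, which is omitted here:

from math import gcd

def poly_eval(coeffs, x):
    v = 0
    for c in reversed(coeffs):      # Horner evaluation
        v = v * x + c
    return v

def heugcd_toy(a, b):
    # candidate gcd of a, b in ZZ[x]; the caller must still check that
    # the result divides both inputs (and retry with a larger xi if not)
    xi = 2 * max(map(abs, a + b)) + 2    # evaluation point
    g = gcd(poly_eval(a, xi), poly_eval(b, xi))
    digits = []
    while g:
        c = g % xi
        if c > xi // 2:                  # balanced digits in (-xi/2, xi/2]
            c -= xi
        digits.append(c)
        g = (g - c) // xi
    return digits

For example, with a = (x+1)(x-2) = [-2, -1, 1] and b = (x+1)(x+3) =
[3, 4, 1] we get xi = 10; the integer gcd of a(10) = 88 and b(10) = 143
is 11, and the digits come back as [1, 1], i.e. x + 1.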

> There's no point in being
> fast if you're wrong, and developing good routines will take months.

In some cases being fast but possibly wrong is very useful. I think
this is not the situation here for mpoly gcd. It is the case for many
interesting computations with algebraic number fields, and people just
have to live with it. By the way, for factoring and gcd there are
double checks.
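In Sage both double checks are one-liners -- a sketch (for gcd,
divisibility alone only certifies a common divisor, not maximality;
for that one would also check that the cofactors are coprime):

sage: R.<x,y> = ZZ[]
sage: t = (x*y - 1) * (x^2 + y + 1)
sage: prod(f^e for f, e in t.factor()) == t      # factor() double check
True
sage: g = (x + y)^3; a = g*(x + 1); b = g*(y - 2)
sage: d = a.gcd(b)
sage: d.divides(a) and d.divides(b)              # gcd double check
True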

> I simply wanted to point out the existence of what appears to be
> correct routines, even if they are slow.

I do greatly appreciate that, and I think I misunderstood your comment
to suggest that we should forget about implementing our own code
and just use Maxima and be done with it. I apologize. I was responding
off-list to a lot of "use lisp/Maxima or else" emails from somebody,
and was perhaps short-tempered as a result. Hopefully you understand.

> Also, to Joel Mohler:
> You need "Algorithms for Computer Algebra" by Geddes, Czapor, and
> Labahn:
> Chapter 5: Chinese Remainder Theorem
> Chapter 6: Newton's Iteration and Hensel Lifting
> Chapter 7: Polynomial GCD
> Chapter 8: Polynomial Factorization
>
> What you are trying to do is not a "weeks long" project; it is one
> of the central achievements of the entire field. It took a decade
> to do the first time, so don't expect to have "industrial strength"
> routines soon. It will realistically take months of full time work.
> There's about 100 pages of material in that book, when you take out
> the exercises, etc. You need it all.

We'll see. By the way, who "did it"? Does *any* program do
multivariate polynomial gcd and factorization well except Magma?
I'm not being rhetorical -- I just don't know the answer. If only
Magma does it well, we should just find out what algorithms they
use, and if they aren't just copying from the above book, then we
*need* to be more clever than whatever is in that book. Allan
Steel did it -- by himself -- and so can we.

By the way, the book you suggest above was published in 1992.
Allan Steel implemented all the fast gcd/factoring code in Magma
many years later. It's quite likely he made great progress over
what's in that book. Maybe we can appeal to his vanity and
ask him.

> The people on this list seem to hilariously underestimate the depth
> of this problem, and that concerns me. I want Sage to succeed, and
> you can't succeed with that attitude. This is a massive undertaking,
> and treating it like it's not is ultimately very discouraging.
>
> I'd spell it all out -- the things you need to do, the problems and
> subproblems -- but it's easier and better if you just read the book.

I'm curious -- have you actually implemented multivariate gcd and/or
multivariate factorization? You talk as if you have.

-- William

William Stein

Apr 1, 2008, 1:03:35 AM
to sage-...@googlegroups.com

Wait, actually Allan says what Magma does on this page:
http://magma.maths.usyd.edu.au/magma/htmlhelp/text314.htm#1994
And he says in each case that he uses algorithms of his own based on
those from the book you cite. For example he writes: "... (3) a
recursive multivariate evaluation-interpolation algorithm (similar to
that in [GCL92, section 7.4]), which in fact works generically over
{Z} or most fields."
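To make that concrete, here is a bare-bones evaluation/interpolation
gcd for two variables in Sage. The example is deliberately rigged so
the gcd is monic in x, which dodges the leading coefficient problem;
real code also needs degree bounds, detection of unlucky evaluation
points, and content handling:

sage: R.<x,y> = QQ[]
sage: g = x^2 + y*x + 1
sage: a = g*(x + y); b = g*(x - y + 3)
sage: pts = [0, 1, 2]   # more points than deg_y of the gcd's coefficients
sage: uni = [a.subs(y=c).univariate_polynomial().gcd(
....:        b.subs(y=c).univariate_polynomial()) for c in pts]
sage: S.<T> = QQ[]
sage: d = uni[0].degree()
sage: cand = sum(S.lagrange_polynomial([(c, u[i]) for c, u in zip(pts, uni)])(y)*x^i
....:            for i in range(d + 1))
sage: cand == g
True

Each evaluation reduces the problem to a univariate gcd; Lagrange
interpolation then recovers each x-coefficient as a polynomial in y.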

William

root

Apr 1, 2008, 2:21:20 AM
to wst...@gmail.com, sage-...@googlegroups.com
William,

>By the way, Richard Fateman pointed out to me offlist that
>Maxima running on top of clisp _might_ be much
>slower than Maxima on gcl. This could be relevant to
>our benchmarking.

Not to start an implementation war, but GCL compiles to C, which
compiles to machine code, whereas clisp is interpreted. Both Axiom
and Maxima have implementations on top of GCL. GCL includes a
routine that will output the lisp type declarations, and if these
are loaded into the image with a recompile, the resulting code is
even faster. So if you use Maxima, use the GCL version.

CMUCL/SBCL are capable of slightly tighter optimizations because
they grew out of the SPICE Lisp project (under Scott Fahlman)
and concentrated on code optimizations. Under CMUCL I managed to
optimize a function down to a single machine instruction so it IS
possible to get maximal optimizations. But GCL also lays down some
very tight code since proper declarations can usually eliminate
type dispatch, which is important in the inner loops.

You can see the generated code by calling (disassemble ....)

On the other hand you're likely to get small factors of improvement
by changing to compiled lisps (but certainly much faster than python).
My experience shows that it's the algorithm changes that matter most.


On a related note, I found it odd that someone commented that
they wish to remove Lisp from Sage. Do you have any idea what
might motivate that?

Tim Daly
Axiom


William Stein

Apr 1, 2008, 1:19:24 AM
to da...@axiom-developer.org, sage-...@googlegroups.com
On Mon, Mar 31, 2008 at 11:21 PM, root <da...@axiom-developer.org> wrote:
> William,
>
>
> >By the way, Richard Fateman pointed out to me offlist that
> >Maxima running on top of clisp _might_ be much
> >slower than Maxima on gcl. This could be relevant to
> >our benchmarking.
>
> Not to start an implementation war but GCL compiles to C which
> compiles to machine code whereas clisp is interpreted.

Not at all. Many thanks for your helpful remarks below...!

> Both Axiom
> and Maxima have implementations on top of GCL. GCL includes a
> routine that will output the lisp type declarations and if these
> are loaded into the image with a recompile the resulting code is
> even faster. So if you use Maxima, use the GCL version.

Unfortunately we've had a lot of problems building GCL on all
our target platforms and have worries since it hasn't had a single
official release in about 3 years. GCL's fine on Linux, but not
elsewhere. I would have very very much liked to use GCL
for Sage originally, but for the life of me I couldn't get it to build
on the Sage platforms, unfortunately. We've tried several times
in the subsequent years, but always failed. Maybe it is time to
try yet again...

> CMUCL/SBCL are capable of slightly tighter optimizations because
> they grew out of the SPICE Lisp project (under Scott Fahlman)
> and concentrated on code optimizations. Under CMUCL I managed to
> optimize a function down to a single machine instruction so it IS
> possible to get maximal optimizations. But GCL also lays down some
> very tight code since proper declarations can usually eliminate
> type dispatch, which is important in the inner loops.
>
> You can see the generated code by calling (disassemble ....)
>
> On the other hand you're likely to get small factors of improvement
> by changing to compiled lisps (but certainly much faster than python).
> My experience shows that it's the algorithm changes that matter most.

Very interesting. I wonder what the differences are in
speed between gcl and clisp maxima... A factor of 2 or 10,
or maybe it varies a lot depending on the operation...

>
>
>
>
> On a related note, I found it odd that someone commented that
> they wish to remove Lisp from Sage. Do you have any idea what
> might motivate that?

Michael Abshoff made that comment. He's motivated by wanting
to port Sage to a wide range of architectures and keep everything
maintainable, since he works incredibly hard on that. He suffers
a huge amount trying to deal with build issues on various platforms
such as solaris, Linux PPC, etc. I'm sure you understand well
how hard building complicated math software is on a range of platforms!

By the way, his comment was not an "official opinion" of the
Sage project or anything, and packages don't leave Sage without
a lot of discussion, voting, etc.

-- William

root

Apr 1, 2008, 2:58:31 AM
to wst...@gmail.com, da...@axiom-developer.org, sage-...@googlegroups.com
>Michael Abshoff made that comment. He's motivated by wanting
>to port Sage to a wide range of architectures and keep everything
>maintainable, since he works incredibly hard on that. He suffers
>a huge amount trying to deal with build issues on various platforms
>such as solaris, Linux PPC, etc. I'm sure you understand well
>how hard building complicated math software is on a range of platforms!

I understand. I have ported Axiom from Maclisp/VMLisp on mainframes to
workstations to PCs, under a dozen lisps and a dozen opsys versions,
including DOS, so I understand his pain. In fact, I think that porting
Sage is going to absorb a very large portion of your available time
and energy. The port to the next version of Python should be fun.
Trust me, you're underpaying Michael :-)

Tim Daly
Axiom

Michael Brickenstein

Apr 1, 2008, 1:52:50 AM
to sage-devel, bur...@erocal.org
Hi!
I think multivariate gcd is a never-ending topic, discussed very often
in many communities.
Actually, I never tried to implement it and hopefully I never will.
But I *enjoyed* ;-) many discussions about it.

So I think, if you want to have success, the first task is what
Roman said: learning that multivariate GCD is a big project that
cannot be accomplished in a week's project or by studying a few
examples.

I have seen many cases where multivariate gcd computations (as part
of a big complex computation) took "forever" in one version of
Singular and were immediately calculated in the next version (but for
another set of examples we had just the opposite phenomenon).
And this isn't a Singular problem; it is inherent to the problem
itself.

So stop this nonsense of looking for the *easy secret* of fast
multivariate gcd, and realize that it is hard work.
In fact, we know that part of the strength of the implementation in
Magma is the hard work Allan Steel put in, treating all these special
cases.

I didn't try maxima and I don't know anything about the efficiency of
its polynomial data structures.
But I know that this statement looks very true:
>My experience shows that it's the algorithm changes that matter most.
So it might be the case that the code in Maxima or Singular provides
a good basis for a really fast gcd (but that needs work).

A statement about the Singular gcd:
As far as I know the gcd in Singular is part of Singular-factory, a
quite independent C++ library.
It features a recursive polynomial representation (which is probably a
good data structure for multivariate gcds).
Pro:
- quite small
- also used in Macaulay 2
- improving it will speed up other parts of Sage's commutative algebra
(which relies on libSingular)
Contra:
- quite old (not everything works)

Best regards,
Michael

William Stein

Apr 1, 2008, 1:54:00 AM
to da...@axiom-developer.org, sage-...@googlegroups.com
On Mon, Mar 31, 2008 at 11:58 PM, root <da...@axiom-developer.org> wrote:
> >Michael Abshoff made that comment. He's motivated by wanting
> >to port Sage to a wide range of architectures and keep everything
> >maintainable, since he works incredibly hard on that. He suffers
> >a huge amount trying to deal with build issues on various platforms
> >such as solaris, Linux PPC, etc. I'm sure you understand well
> >how hard building complicated math software is on a range of platforms!
>
> I understand. I have ported Axiom from Maclisp/VMLisp on mainframes to
> workstations to PCs under a dozen lisps and a dozen opsys versions,
> including Dos so I understand his pain.

Wow, you should volunteer to help us :-). (Just kidding; I'm sure you're
very busy with Axiom, etc.)

> In fact, I think that porting
> Sage is going to absorb a very large portion of your available time
> and energy.

Not *my* time. But you're right that it will absorb a lot of time.
It is very much worth the effort though.

> The port to the next version of Python should be fun.

Fortunately we can leave that mostly to the Python community.
Or, do you mean porting Sage to run under Python 3.0? Yep,
that's going to be a whopper. Our plan is to wait until the
many Python components of Sage transition forward; when that
is done, then we will too. I think it won't be too bad for us
*after* that happens, since many of the changes in Python 3.0
are things we really want; trying to transition before all the
3rd party packages transition would be insanely hard.

Fortunately, we've been careful only to include very well
supported active 3rd party Python modules in Sage (I hope! Knock
on wood.).
Here are the 3rd party Python modules we use in Sage:
* matplotlib -- plotting
* mercurial -- revision control
* moinmoin -- wiki
* networkx -- graph theory
* numpy -- numerical linear algebra
* pexpect -- pseudo tty
* pycrypto
* scipy -- numerics
* scons
* sqlalchemy
* sympy
* twisted
* weave
* zodb3

> Trust me, you're underpaying Michael :-)

Michael is amazing. He's a hero.

-- William

Timothy G Abbott

Apr 1, 2008, 2:10:30 AM
to sage-...@googlegroups.com, wst...@gmail.com
Since the Debian distribution of SAGE uses Maxima with GCL lisp, I
figured I'd run the benchmarks Mike posted on my installation. The
SAGE times are comparable to those in Mike's test, while the Maxima
times are faster:

sage: load /home/tabbott/fermat_gcd_1var.py
sage: time a = p1.gcd(p2, algorithm='ezgcd')
Wall time: 0.03 s (compare with 0.02)
sage: mp1 = maxima(p1)
sage: mp2 = maxima(p2)
sage: time a = mp1.gcd(mp2)
Wall time: 0.07 s (compare with 0.20)

sage: load /home/tabbott/fermat_gcd_4var.py
sage: time a = p1.gcd(p2, algorithm='ezgcd')
Wall time: 0.12 s (compare with 0.13)
sage: mp1 = maxima(p1)
sage: mp2 = maxima(p2)
sage: time a = mp1.gcd(mp2)
Wall time: 0.51 s (compare with 1.04)

So, it looks like one's getting a 2-3x speedup by using Maxima with GCL
lisp on this particular test.

Of course, the fact that I'm running with GCL lisp also means that my SAGE
fails all the Maxima doctests involving real numbers due to roundoff
differences.

-Tim Abbott

Roman Pearce

Apr 1, 2008, 2:50:11 AM
to sage-devel
On Mar 31, 10:55 pm, "William Stein" <wst...@gmail.com> wrote:
> On Mon, Mar 31, 2008 at 6:48 PM, Roman Pearce <rpear...@gmail.com> wrote:
>
> > You need "Algorithms for Computer Algebra" by Geddes, Czapor, and
> > Labahn:
> > Chapter 5: Chinese Remainder Theorem
> > Chapter 6: Newton's Iteration and Hensel Lifting
> > Chapter 7: Polynomial GCD
> > Chapter 8: Polynomial Factorization
...
> > There's about 100 pages of material in that book, when you take out
> > the exercises, etc. You need it all.

> We'll see. By the way, who "did it"?

For Maple, I believe Michael Monagan and Laurent Bernardin wrote a lot
of the main routines in use today. But many other people have
contributed over a long period of time. A lot of old code has been
replaced, and the full list of contributors would have at least 20
people on it.

> Does *any* program doing
> multivariate polynomial gcd and factorization well except Magma?

Yes, Maple and Mathematica. It's not just factoring over Z either.
You will eventually want to factor over algebraic number fields and
function fields. Factoring and GCDs are a huge problem and getting
them implemented is a significant accomplishment. You will also need
resultants, probably.

> By the way, the book you suggest above was published in 1992.
> Allan Steel implemented all the fast gcd/factoring code in Magma
> many years later. It's perhaps highly likely he made great progress over
> what's in that book.

The book describes the basic core approach which everyone needs to
understand. People like Allan Steel work on these algorithms over and
over and develop deep insight into them. That's how you make
fundamental improvements. Often an improvement will be something
simple, but thinking of it and understanding why it works is the
product of a massive amount of experience and expertise. People can't
just go and get that out of a book. You have to be fully immersed in
the problem to understand what is going on.

I am not an expert in this area, but if I had to write this thing I
would start with that book. You can work through all the algorithms
carefully. A careful choice of data structures and pre-existing
routines would produce a first version that "works". It wouldn't be
great, but it would get Sage out of a rut. Then you go back and try
again, with better data structures and better routines and a little
more insight. By the third version or so, you can make fundamental
improvements. There are many, many things you can do. It's a whole
area of computer algebra. The best strategy is to find a student who
is interested in factorization. Sage has some nice libraries (fast
GMP, FLINT) and a nice programming language that can be compiled. So
there is the potential here to write fast code. But getting the
algorithms right is a big commitment, say 1-2 months for a first
version (using pre-existing routines) and 2-4 months for each version
after that (when you're rewriting things from scratch). It could be
longer, depending on how it goes.

> I'm curious -- have you actually implemented multivariate gcd and/or
> multivariate factorization?

During (most of) a course in computer algebra, I implemented the
algorithms on top of Maple. Of course my code was garbage, but I
worked through the algorithms and got an appreciation for them. It's
not a giant impossible task (unless you insist on beating Magma), but
it can't be done "part-time" either. And you have to start by saying
"our code might be slower on a lot of examples, but we're going to use
the right algorithms." That's a difficult pill to swallow. One big
leg up is that Sage already has good univariate factorization mod p
and over GF(p^q). I suspect that in terms of code, there is not
really that much to write, but acquiring the necessary experience will
take time and a lot of effort.

One final piece of advice - you can't win by copying your
competitors. So forget about what Magma does for now. Magma may be
the fastest system out there, but I doubt it's the fastest system
*possible*. Allan Steel surely knows "there's no such thing as the
fastest code". This is really true.

Bill Hart

Apr 1, 2008, 5:05:24 AM
to sage-devel
Roman, I thoroughly agree with you that the multipoly gcd and
factoring problem is not going to go away overnight. I'm sure from
your comments that you can guess what we've been doing in FLINT for
univariate gcd and how long even that is taking.

Also, I too get frustrated by some of the simple-minded comments that
get made about this problem. I also get tired of posting the
information available about what Magma does for GCD and factoring and
about the various papers available on this subject. But this is
partly my own fault, since Mike Hansen set up a webpage to blog
everything we know about the subject in order to come up with a
strategy for dealing with this problem, and I've been too lazy (or
busy) to put my notes on there.

But let me say, from off-list discussions with Joel Mohler, that I
think he just might be going to prove us all wrong. There just might
be a simple set of algorithms which give the answer quickly almost all
the time (if you are comparing to Magma) for multipoly factoring. So
I'm not ready to make a call on this one. I'm not saying I'm off to
the local bookmaker to place money on Joel's approach, but simply
that he's already proved me wrong on a number of points of
"established wisdom", and his overall plan looks very promising.

But having said that, there is no *single* algorithm which will always
work quickly.

There have also been some recent improvements in algorithms which will
take much of the hard work out of this problem. This is an advantage
that people in the past didn't have. It pains me to admit it, but
possibly "doing things properly" by going through all the "really hard
work" that people have done in the past, just may not be necessary any
more. How sad, because I was looking forward to this challenge! But
we'll see.

Regarding probabilistic algorithms, we need to be careful to
distinguish Las Vegas algorithms from Monte Carlo algorithms (you can
check Wikipedia for what I am taking as my definitions). Victor Shoup
actually implements both in NTL. One kind of probabilistic algorithm
is perfectly fine for inclusion in SAGE so long as early bailout is
implemented (something SINGULAR is guilty of not doing). The other
kind of probabilistic algorithm is fine too, if proper checks of
correctness are made and if there is a fallback to another algorithm
when the checks fail. I essentially do this in FLINT for GCD at one
point, due to a comment of Allan Steel on the Magma website, and the
result is something like a 100 times speedup in getting the correct
answer. At one point, what I do in FLINT is asymptotically (and in
practice much) faster than what Pari does because of this simple
observation.
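In code, the second pattern is just a few lines of scaffolding around
the two algorithms. A generic sketch (not FLINT's API; both function
arguments are whatever the caller supplies):

def checked_gcd(a, b, fast_unverified_gcd, certified_gcd):
    # Try the fast, possibly-wrong method first and verify its output;
    # fall back to the slow certified algorithm only when the check
    # fails, so the common case pays almost nothing for correctness.
    cand = fast_unverified_gcd(a, b)
    if cand is not None and cand.divides(a) and cand.divides(b):
        return cand   # verified common divisor (maximality check elided)
    return certified_gcd(a, b)

The point is that the verification is usually far cheaper than running
the certified algorithm unconditionally.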

By the way, it turns out that a single algorithm is sufficient to beat
Magma at univariate GCD over ZZ. But at the very least one can
actually be asymptotically faster (in increasing size of
coefficients), so why stop there! Let's plan to eventually look well
past Magma.

Bill.

mabshoff

Apr 1, 2008, 5:57:29 AM
to sage-devel


On Apr 1, 7:19 am, "William Stein" <wst...@gmail.com> wrote:
> On Mon, Mar 31, 2008 at 11:21 PM, root <d...@axiom-developer.org> wrote:

<SNIP>

[Apologies in advance for going somewhat off topic in this thread and
ranting a little too much :)]

Hi Tim,

> > Both Axiom
> >  and Maxima have implementations on top of GCL. GCL includes a
> >  routine that will output the lisp type declarations and if these
> >  are loaded into the image with a recompile the resulting code is
> >  even faster. So if you use Maxima, use the GCL version.
>
> Unfortunately we've had a lot of problems building GCL on all
> our target platforms and have worries since it hasn't had a single
> official release in about 3 years.  GCL's fine on Linux, but not
> elsewhere.   I would have very very much liked to use GCL
> for Sage originally, but for the life of me I couldn't get it to build
> on the Sage platforms, unfortunately.  We've tried several times
> in the subsequent years, but always failed.   Maybe it is time to
> try yet again...

I am pretty sure it is well known that gcl is faster than clisp if you
start looking around a little.

> >  CMUCL/SBCL are capable of slightly tighter optimizations because
> >  they grew out of the SPICE Lisp project (under Scott Fahlman)
> >  and concentrated on code optimizations. Under CMUCL I managed to
> >  optimize a function down to a single machine instruction so it IS
> >  possible to get maximal optimizations. But GCL also lays down some
> >  very tight code since proper declarations can usually eliminate
> >  type dispatch, which is important in the inner loops.
>
> >  You can see the generated code by calling (disassemble ....)
>
> >  On the other hand you're likely to get small factors of improvement
> >  by changing to compiled lisps (but certainly much faster than python).
> >  My experience shows that it's the algorithm changes that matter most.
>
> Very interesting. I wonder what the differences are in
> speed between gcl and clisp maxima... A factor of 2 or 10,
> or maybe it varies a lot depending on the operation...
>
>
>
> >  On a related note, I found it odd that someone commented that
> >  they wish to remove Lisp from Sage. Do you have any idea what
> >  might motivate that?
>
> Michael Abshoff made that comment.


My exact comment was

We are currently in the process of implementing symbolic
calculus in Sage itself and will hopefully in the next year move
differentiation, limits and integration and so on away from Maxima
into the Sage core. This is motivated by the need for performance and
the fact that we want to dump any lisp based code from the core of
Sage in the long term.

See http://groups.google.com/group/sci.math.symbolic/browse_thread/thread/11d3054c63e76aff/922f2b5de7b38ac2#922f2b5de7b38ac2
for the whole thread that is still on going and is quite an
interesting read.

"we" in that context wasn't meant as an officially sanctioned
statement from the Sage community, but it certainly expresses the
opinion several other people and I have. The motivation is not hatred
of lisp since I have never used it beyond playing around with it a
little in Emacs. I understand it solves certain problems quite nicely,
but in the end it isn't part of my tool set and due to my experiences
listed below which do include the porting issues I am not very bullish
on lisp.

>  He's motivated by wanting
> to port Sage to a wide range of architectures and keep everything
> maintainable, since he works incredibly hard on that.   He suffers
> a huge amount trying to deal with build issues on various platforms
> such as solaris, Linux PPC, etc.  I'm sure you understand well
> how hard building complicated math software is on a range of platforms!

After having played extensively with compiling open source lisp
implementations [only on Solaris, Linux and OSX, much less *BSD, AIX
and so on] let me give you my impression. I know of five open source
implementations of lisp:

Do not need to bootstrap themselves:

a) gcl: no Solaris support, depends on libbfd [and its elf
precursor!], no stable release in nearly five years. I broke the
compilation trivially on RedHat Enterprise 5 - a platform which should
be well supported. The issue was easy to fix, but that is a different
story.
b) clisp: broken with gcc 4.2.x and gcc 4.3.0 for at least six months.
In effect a single developer. It compiles on Solaris with gcc 4.2.2,
but segfaults in make check more or less instantly; it also segfaults
building Maxima when built with gcc 3.4.6 and again during "make
check" [but later than with gcc 4.2.2]. gcc-specific stack pointer
code, so no option to use the Sun Forte compiler on Solaris. This was
fixed in CVS two or three days ago, but now the autoconf script for
Solaris is broken.
c) ecls: the ray of hope? But Maxima can't be built on top of it due
to issues with saving and loading an image. FriCAS has been ported to
it. Slower than clisp at the moment, but things are improving [Waldek
Hebisch gave a lot of input on pretty printing and closures over the
last two months or so, and consequently things did improve in that
area *a lot*]. Like clisp, in effect a single developer is working on
this, so the code has a bad bus factor.

Need an existing lisp to bootstrap and compile themselves:

d) sbcl: it works well on Solaris. But since it needs another lisp
instance it doesn't fit our requirements. Of the open lisp
implementations this seems to be the most portable, and it generally
delivers very good performance.
e) cmucl: it runs, but I managed to segfault it with the current
Maxima on Solaris.

So: that limits my choices of lisp on Solaris to (d) - which makes it
useless for Sage. There are commercial implementations, but those do
not ship with sources and are not acceptable to certain parties for
that reason.

If Maxima were ported to ecls I would put a lot of energy into
switching from clisp to ecls, regardless of potential performance
problems in the short term. Maxima isn't competitive as is for many
things, and the reduced pain of not having to compile clisp will more
than make up for a constant slowdown by a factor of three IMHO. This
is also not the official Sage opinion, but I am sure William will not
disagree with me.

So what is my conclusion from all of the above? The lisp ecosystem is
not exactly thriving, because the tools it depends upon are in bad
shape. Look for binaries for some of the more exotic arches [i.e. non
x86 and x86-64 Linux] for cmucl or sbcl. If you don't find any, people
do not care for those platforms, and that is always a bad sign. There
are probably blastwave binaries available [for clisp it is an ancient
2.39, but as non-root I cannot install those].

From my experience porting Sage or any other code I have worked on, I
have drawn the following conclusion: if project X compiles fine,
passes its test suite and valgrinds clean, it *does not* mean that the
code is even remotely close to bug free. It just means that you have
by design not found the bugs that do not affect you or that are hidden
by your particular choice of compiler. Exposing Sage to Solaris or
even PPC Linux has pointed me to a rather large number of bugs. Other
code I have worked on has improved by porting it to Sun Forte's
compiler, MSVC or also IBM's C compiler. So that leads me to believe
that porting Sage to platforms of insignificant commercial value [PPC
Linux for example - no commercial software vendor would port their
mathematical code to it; there is no Maple or Mathematica for PPC
Linux] might seem like a waste of resources, but it makes the code
better. In my experience it is only a question of time until some bug
that is hidden by your choice of tool set bites you in the ass. A
prime example is

http://trac.sagemath.org/sage_trac/ticket/1337

which was caught instantly on PPC Linux since it segfaulted Sage on
exit every time. Depending on which patches you merged it would do the
same on x86-64. We didn't merge a fix for a huge leak in the coercion
system because of that bug. There was a lot of discussion about that
bug off list and during SD7 and 8, and we finally fixed it because I
really, really wanted to merge that bugfix. But that is the reason I
consider exotic arches the "canary in the coal mine" for Sage and
really for any other piece of code in general. Right now Solaris has
exposed a bunch of issues regarding alignment, endianness, assumptions
about pointer size and ints and so on. And all of them can potentially
become a problem for your favorite arch. So dismissing arches because
they are not mainstream is a bad thing.

> By the way, his comment was not an "official opinion" of the
> Sage project or anything, and packages don't leave Sage without
> a lot of discussion, voting, etc.

And even if it were to happen it will be a long, drawn out process.

>  -- William

Cheers,

Michael

Bill Hart

Apr 1, 2008, 6:44:59 AM
to sage-devel


On 1 Apr, 05:21, "Mike Hansen" <mhan...@gmail.com> wrote:
> I've posted some benchmarks at http://wiki.sagemath.org/MultivariateGCDBenchmarks .
>
> --Mike

I can't do timings for the degree 1000 or 2000 problems (at least
Allan Steel gives it as degree 2000, whereas your page, Mike, seems to
say it is degree 10000), since FLINT doesn't have the asymptotically
fast half gcd implemented yet for polynomials over Z/pZ, which is a
prerequisite for fast times on these problems.

But for the degree 100 problem I can give a rough preliminary timing.
FLINT computes it in 0.0008949s. Magma currently takes 0.0008300s. So
I guess that is about 24 times faster than ezgcd, 30 times faster
than modular, 200 times faster than Maxima and 2000 times faster
than NTL (or is it Pari? - I don't know what the (univariate) timing
means).

FLINT will be at least 50% faster by the time I'm done with this
function. It's not even properly tuned yet!

Bill.



mabshoff

Apr 1, 2008, 7:09:24 AM
to sage-devel
To make my comment somewhat clearer: I did not mean the removal of
lisp based code in Sage in general, but from the supported, standard
components. I.e. having interfaces to Maxima and Axiom [or FriCAS or
OpenAxiom] is desirable, but I do not believe those should be in the
core of Sage. If one person is willing to maintain an interface and an
optional spkg for $CAS, Sage should support that. But it is quite
different to make it a core component, since that requires build
support and also the ability on our end to potentially fix things. So
far there is insufficient expertise to deal with code in Axiom or
Maxima IMHO.

> Trust me, you're underpaying Michael :-)

Thanks. I am aware of that fact. Indeed, I walked away from a
financially potentially much more lucrative project. So: what is my
motivation to work on Sage? It is fun, and as long as I can pay my
bills I will always choose the more fun project. Another aspect is
that decisions are made on technical merit and in the open. One of the
reasons I walked away from the last two jobs I held was precisely
because my technical recommendations were ignored.

> Tim Daly
> Axiom

Cheers,

Michael

parisse

Apr 8, 2008, 11:21:06 AM
to sage-devel
> Just because an algorithm is probabilistic
> doesn't mean it necessarily gives wrong answers. By the way, see
>
> http://www-fourier.ujf-grenoble.fr/~parisse/publi/gcdheu.pdf
>
> which is I think a paper that proves correctness of the algorithm
> we're talking about. (I've not carefully read the above short
> paper, so I'm not vouching for it though.)

More precisely, it fills a little gap in the proof of correctness for
multivariate polynomials and extends the coefficients from Z to Z[i].
Gcdheu is one of about 4 main algorithms (ezgcd, gcdheu, modular with
sparse and dense variants, subresultant), all implemented in Giac.
http://www-fourier.ujf-grenoble.fr/~parisse/giac.html
I have added a benchmark link with Fermat gcd tests; giac seems 5 to
10 times faster than maxima. I don't have magma; it is most probably
another factor of 10 faster.
Sage developers may of course decide to write their own gcd
algorithms, but that's a large piece of code to write and test. Why
not make a link to giac (which is a C++ library) instead? Sage and
giac could then benefit from improvements, and more interesting pieces
of code could be written, like bivariate factorization (which would
improve giac's multivariate factorization) or absolute factorization
over C. Moreover, sage could benefit from the calculus code of giac
(series, integration, etc.). I would be ready to contribute to writing
such a link, but I would need help on the sage side; I don't know
python at all.

B. Parisse

Mike Hansen

Apr 8, 2008, 3:25:40 PM
to sage-...@googlegroups.com
> I have added a benchmark link with Fermat gcd tests, giac seems 5 to
> 10 * faster than maxima. I don't have magma, it is most probably
>
> another factor of 10 * faster.

I think that comparison with Magma is a little optimistic, especially
in the modular case.

http://www-fourier.ujf-grenoble.fr/~parisse/giac/benchmarks/gcd_timings
http://magma.maths.usyd.edu.au/users/allan/gcdcomp.html

--Mike

Joel B. Mohler

Apr 8, 2008, 4:44:41 PM
to sage-...@googlegroups.com
On Tuesday 08 April 2008 03:25:40 pm Mike Hansen wrote:

Now, it's true that we don't want to re-invent the wheel. However, it
seems to me that there aren't any open-source packages that manage the
range of base rings that we'd want sage to handle for multivariate
polynomial rings. Singular is probably the closest, but its ZZ support
is only experimental.

So, while I may be reinventing the wheel for mpoly factoring over ZZ
(although it appears somewhat feasible that the wheel will be
re-invented better), the effort has already produced a number of
positive changes in the generic multivariate polydict polynomial
implementation in sage. I think the generic-ness of the code is
something worth pushing forward, to see how fast we can be with a
generic base ring. I (and Martin, too) am interested in this generic
approach. Indeed, even the factoring algorithm has a large "prefix"
(square-free decomposition and the like) which is largely independent
of the base ring.

So, my bottom line is this: there might be special-purpose
implementations for specific base rings, but I don't think that is a
reason to abandon pushing our own implementation, because I think our
implementation can handle base rings that nobody else can. Of course,
it is likely that this generic-ness will cost us some speed. But it is
not obvious to me that that cost is necessarily prohibitive.

--
Joel

parisse

Apr 9, 2008, 6:40:19 AM
to sage-devel


On 8 avr, 21:25, "Mike Hansen" <mhan...@gmail.com> wrote:
> > I have added a benchmark link with Fermat gcd tests, giac seems 5 to
> > 10 * faster than maxima. I don't have magma, it is most probably
>
> > another factor of 10 * faster.
>
> I think that comparison with Magma is a little optimistic, especially
> in the modular case.
>
>

Right, I thought the Opteron was faster than it really is. I have
compiled giac on an Opteron (244) to be able to make a more
meaningful comparison and updated the timings. The improvement of
magma between version 2.09 (where giac has comparable timings in the
non-modular case) and 2.12 suggests that either new algorithms were
found (but I never heard of them) or existing algorithms were improved
(tested with these benchmarks). And maybe magma 2.12 has speed
improvements related to the 64 bit architecture.
Anyway, using giac would improve the current sage gcd speed, and
working on efficiency would benefit both. Now, if sage developers are
not interested, it's not a main concern for me; giac has its own
interfaces. But I would find it a little bit stupid if these two
open-source systems, which share the goal of providing an alternative
to maple/mathematica, did not cooperate.

Martin Albrecht

Apr 11, 2008, 5:39:03 PM
to sage-...@googlegroups.com

Hi,

sorry for joining in so late. Generally speaking, it is not necessarily
easy to determine what the "Sage developers" want, since we are all
scattered across the globe and all have different
agendas/preferences/ideas. Also, all one needs is one Sage developer
(oneself can be that developer) to start working on something. Thus,
the lack of response doesn't mean that "the Sage developers" don't want
to collaborate. Each one of us might just have too much on her plate
right now.

To the topic: I don't really get the timings you posted, i.e. I cannot
really relate them to the timings posted on the Sage Wiki. Could you
try to provide data that is easier to relate, or maybe I'm missing
something?

I would say the best way to cooperate (since both projects aim to be a
frontend, and IIRC you also ship Pari etc.) is to help each other out
on the C(++) library level. Especially for multivariate factoring, it
would make sense to collaborate on a C/C++ library all projects could
use. This way everybody wins.

I don't really know the area, but the "Factory" might be a good
starting point. Factory is the stand-alone (!) library used by Singular
to do multivariate factoring. Of course the defects of Factory started
this thread in the first place, but it might still provide enough of a
framework that one can start developing the missing algorithms. I would
like to hear from people with more experience in that area what they
think about Factory's data structures (Roman?). Another big plus for
contributing to Factory is that it is somewhat widely used: Singular,
Macaulay2 and Sage. Btw. Michael, what is the status of multivariate
factoring in CoCoALib?

To put some code where my mouth is, I can provide the following: we had
a direct interface to Factory once, and I could revive it. I.e. I could
provide a Python interface to Factory that looks as much as possible
like the C++ interface in terms of class names, methods, etc. Then
people can start going crazy and implement their favorite factoring
algorithms, heuristics and tricks in the Sage interpreter rather than
at the C level, and make use of all the Sage library (I guess linear
algebra could be relevant). As a crucial second step -- if the result
doesn't totally suck -- this all gets ported down to C++. I can't do
the "creative" steps, where one has to write new algorithms etc., but I
can certainly help with the first (to Python) and the last (from
Python) steps. But maybe this whole idea is just bollocks, I don't
know.

Thoughts?

Martin

PS: In any case, If anyone wants to work on an optional GIAC spkg, please
speak up!


--
name: Martin Albrecht
_pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
_www: http://www.informatik.uni-bremen.de/~malb
_jab: martinr...@jabber.ccc.de

Joel B. Mohler

Apr 11, 2008, 10:23:38 PM
to sage-...@googlegroups.com
On Friday 11 April 2008 05:39:03 pm Martin Albrecht wrote:
> PS: In any case, If anyone wants to work on an optional GIAC spkg, please
> speak up!

I tried to build GIAC and it didn't go so well. The seemingly obvious
way to have it pick up sage pre-installed components (--prefix=...)
allowed configure to go through fine, but the compile did not. I
manually fixed this in one directory, but then another failed. I know
that this is way too vague to be a real bug report.

Another thing is that I think giac should be separated out from xcas
so as not to introduce an X dependency. Here too, I'm not sure that I
did my own research adequately.

Nevertheless, I say all that to say that I did (quickly) attempt to try
out giac, and when I get another bit of time to return to these
pursuits it is near the top of the list of things to try.

But I'm also interested in making the sage sparse multivariate
polynomials a very fast alternative for simple arithmetic, since we can
offer a breadth of choices for base rings that appears unparalleled.
But, indeed, I'm simply enjoying learning about mpoly factorization
myself ... and one way (the best way?) to learn it is to try and *do*
it for oneself.

--
Joel

mabshoff

Apr 11, 2008, 10:57:13 PM
to sage-devel

Hi,

On Apr 12, 4:23 am, "Joel B. Mohler" <j...@kiwistrawberry.us> wrote:
> On Friday 11 April 2008 05:39:03 pm Martin Albrecht wrote:
>
> > PS: In any case, If anyone wants to work on an optional GIAC spkg, please
> > speak up!
>
> I tried to build GIAC and it didn't go so well.  The seemingly obvious way to
> have it pick up sage pre-installed components (--prefix=...) allowed
> configure to go through fine, but the compile did not.  I manually fixed this
> in one directory, but another failed then.  I know that this is way to vague
> to be a real bug report.
>
> Another thing is that I think giac should be separated out from xcas so as to
> not introduce an X dependency.  This too, I'm not sure that I did my own
> research adequately.

You can disable the X component of GIAC, so that isn't too much of a
problem. But the fact that it isn't cleanly separated is something I
would consider non-optimal.

> Nevertheless, I say all that to say that I did (quickly) attempt to try out
> giac and when I get another bit of time to return to these pursuits that is
> near the top of the list of things to try.
>
> But, I'm also interested in making the sage sparse multivariate a very fast
> alternative for simple arithmetic since we can offer a breadth of choices for
> base rings that appears unparalleled.  But, indeed, I'm simply enjoying
> learning about mpoly factorization myself ... and one way (the best way?) to
> learn it is to try and *do* it for oneself.

Aside from that, we did discuss GIAC again in IRC a couple of days ago
[some people did look at it in the past and it has come up before] and
there were a couple of comments [my recollection, not an official
statement]:

* Any standard component of Sage for now has to be GPL V2+
compatible. GIAC seems to be GPL V3[+] now.
* The code is sparingly commented and seems to conform to its own
coding style. We specifically looked at the Risch code and also at
infrastructure like vecteur - this is really a big issue, since
dealing with sparingly documented code has in the past proven to be a
huge pain when we had to fix bugs.
* It is unclear that a tenfold speedup over the current code is
worth dealing with 150k lines of code. It is another polynomial class
to deal with in coercion, build issues and so on. If it were "Magma"
speed, that probably would be something we would be willing to live
with ;)
* The code doesn't seem to have MSVC support, and due to its size the
size/benefit ratio doesn't look good. Since I will likely end up
porting the code I am not too excited to have to deal with another
150k lines of code.
* There are CoCoALib leftover files in the src directory, namely Tmp*
- I am sure this is an oversight.
* It is likely that improved algorithms will need fast linear algebra
or some other goodies, and writing interfaces to GIAC when we already
have them in Sage seems of little benefit to us.

All of the above can be overcome; somebody else should take a look,
check whether my impression is correct, and correct me if I am wrong.

> --
> Joel

Cheers,

Michael

parisse

Apr 12, 2008, 2:52:03 AM
to sage-devel

;; This buffer is for notes you don't want to save, and for Lisp
evaluation.
;; If you want to create a file, visit that file with C-x C-f,
;; then enter the text in that file's own buffer.

> You can disable the X component of GIAC, so that isn't too much of a
> problem. But the fact that it isn't cleanly separate is something I
> would consider non-optimal.
>

The main reason is that a proper splitting has not been a top-level
priority. And for example readrgb uses the FLTK image library. But as
you say, it can be compiled without X.

>
> Aside from that we did discuss GIAC again in IRC a couple days ago
> [some people did look at it in the past and it has come up in the
> past] and there were a couple comments [my recollection, not some
> official statement]:
>
> * Any standard component of Sage for now has to be GPL V2+
> compatible. GIAC seems to be GPL V3[+] now

This should not be a problem; I can switch back to v2. But the normal
direction should be for v2 software to move to v3.

> * The code is sparingly commented and seems to conform to its own
> coding style. We specifically looked at the Risch code and also
> infrastructure like vecteur - this is really a big issue since dealing
> with sparingly documented code in the past had proven to be a huge
> pain in case we had to fix bugs

I believe that someone looking at someone else's code will always find
it difficult (unless they worked together); I find it hard myself to
read other people's code. I certainly do not pretend to comment it
perfectly (and I don't plan to write a detailed description of the math
behind it); I comment enough that I can fix bugs myself using gdb.
Sometime in the future I'll certainly have to write some documentation
for programmers, but I'll make it a priority only if I'm sure active
developers are using giac.

> * It is unclear that a ten time speedup over the current code is
> worth dealing with 150k lines of code. It is another polynomial class
> to deal with in coercion, build issues and so on. If it were "Magma"
> speed that probably would be something we would be willing to live
> with ;)

The current bottleneck, which explains at least a part of the speed
difference with magma, is sparse multivariate division; I'm going to
improve it and I'll update my benchmarks. Now, I want to emphasize
that giac is not just another polynomial library. It is a full-
featured CAS that sage could call as a library (unlike other CASes,
where communication is most probably handled by text streams).
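For what it's worth, the operation in question is easy to state even
if it is hard to make fast. A naive dict-based sketch of exact sparse
division (an illustration only, not giac's code; exponent-tuple ->
coefficient dicts, lex order via tuple comparison):

def exact_divide(num, den):
    # return q with q*den == num; num, den map exponent tuples to ints
    num, q = dict(num), {}
    lead = max(den)                   # divisor's lex-leading exponent
    while num:
        e = max(num)                  # remainder's lex-leading exponent
        qe = tuple(i - j for i, j in zip(e, lead))
        if min(qe) < 0 or num[e] % den[lead]:
            raise ArithmeticError("not exactly divisible")
        c = num[e] // den[lead]
        q[qe] = c
        for f, d in den.items():      # subtract c*x^qe * den from num
            k = tuple(i + j for i, j in zip(qe, f))
            num[k] = num.get(k, 0) - c*d
            if num[k] == 0:
                del num[k]
    return q

Each quotient term costs a full pass over the divisor plus a rescan
for the new leading term; the heap-based techniques used in fast
implementations exist precisely to avoid those rescans.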

> * The code doesn't seem to have MSVC support, and given its size the
> size/benefit ratio doesn't look good. Since I will likely end up
> porting the code, I am not too excited about having to deal with
> another 150k lines of code.

I don't see any interest in MSVC support myself; giac/xcas is
ported to windows with cygwin. I bet, however, that it is easier to
port code to another compiler than to implement, test and fix the
algorithms all over again :-)

> * There are CoCoALib leftover files in the src directory, namely Tmp*
> - I am sure this is an oversight.

These files are not part of libcocoa.a (I have not rechecked since the
last time I downloaded cocoa), but they are required for the F5
implementation in cocoa. This F5 is not used by default inside giac;
it seems slower than cocoa's standard reduction algorithm, which is
used instead (if CoCoA is detected) so that giac has a correct
Groebner basis implementation. I'll certainly try to build a link to
singular when I have some time (I didn't try before since it was not
linkable as a library).

> * It is likely that improved algorithms will need fast linear algebra
> or other goodies, and writing interfaces to GIAC for things we already
> have in Sage seems of little benefit to us.
>

I believe that the main advantage for sage would be to use giac as an
alternative to maxima for calculus, and that would be useful for giac
since the number of users would grow. I'm working on the efficiency of
the basic algorithms of giac (polynomial arithmetic, but also linear
algebra, such as determinants over the integers) mainly because they
are required for that. They could be used inside sage until a better
alternative is ready (and people could choose to implement new
functions first instead). I'm also interested in using sage components
to improve giac where possible, as I did with pari, ntl and cocoa
(would you, for example, implement polynomial arithmetic so that it
could be used from C++?).
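
For instance, one of the primitives I have in mind can be exercised
from Sage like this (a generic illustration with an arbitrary matrix
size, not one of my benchmarks):

sage: A = random_matrix(ZZ, 200, x=-10^6, y=10^6)
sage: d = A.det()    # exact determinant over the integers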

mabshoff

unread,
Apr 12, 2008, 10:31:43 AM4/12/08
to sage-devel


On Apr 12, 8:52 am, parisse <bernard.pari...@ujf-grenoble.fr> wrote:

Hi,

> ;; This buffer is for notes you don't want to save, and for Lisp
> evaluation.
> ;; If you want to create a file, visit that file with C-x C-f,
> ;; then enter the text in that file's own buffer.

Emacs people ;)

> > You can disable the X component of GIAC, so that isn't too much of a
> > problem. But the fact that it isn't cleanly separate is something I
> > would consider non-optimal.
>
> The main reason is that a proper split is not a top-level priority.
> And, for example, readrgb uses the FLTK image library.
> But as you say, it can be compiled without X.
>
>
>
> > Aside from that we did discuss GIAC again in IRC a couple days ago
> > [some people did look at it in the past and it has come up in the
> > past] and there were a couple comments [my recollection, not some
> > official statement]:
>
> >  * Any standard component of Sage for now has to be GPL V2+
> > compatible. GIAC seems to be GPL V3[+] now
>
> This should not be a problem. I can switch back to v2. But the normal
> path should be for v2 software to move to v3.

Nope, it isn't. After initially switching to GPL V3+ we have decided
to remain at GPL V2+ for now. Since we have discussed this quite
extensively in the past, there is no need to rehash it here. I don't
need another drawn-out licensing discussion.

> >  * The code is sparingly commented and seems to conform to its own
> > coding style. We specifically looked at the Risch code and also
> > infrastructure like vecteur - this is really a big issue since dealing
> > with sparingly documented code in the past had proven to be a huge
> > pain in case we had to fix bugs
>
> I believe that someone looking at someone else's code will always find
> it difficult (unless they worked together);

Well, I can see whether documentation is available or not. The giac
code isn't particularly hard to read or understand, but it is quite
difficult to extend the code without spending some time with it.

> I find it hard myself to
> read other people's code. I certainly do not pretend to comment it
> perfectly (and I don't plan to write a detailed description of the
> math behind it);

That isn't really the problem. I just don't want to spend time reading
code all over the library to drill down to one local problem.

> I comment so that I can fix bugs myself using gdb. Sometime
> in the future I'll certainly have to write some documentation for
> programmers, but I'll make it a priority only if I'm sure active
> developers are using giac.

This is certainly a chicken-and-egg problem. I did revisit risch.[cc|h]
and vecteur.[cc|h], and risch.cc was better commented than I
remembered. But what I am missing is doxygen-style documentation of
input and output parameters.

> >  * It is unclear that a tenfold speedup over the current code is
> > worth dealing with 150k lines of code. It is another polynomial class
> > to deal with in coercion, build issues and so on. If it were "Magma"
> > speed, that would probably be something we would be willing to live
> > with ;)
>
> The current bottleneck, which explains at least part of the speed
> difference with magma, is sparse multivariate division; I'm going to
> improve it and I'll update my benchmarks. Now, I want to emphasize
> that giac is not just another polynomial library.

Sure, we are aware of that.

> It is a full-featured CAS that sage could call as a library (unlike
> other CASes, where communication is most probably handled by text
> streams).

Yes, we agree here too, but at the moment we are only interested in
factorization. Everything else seems to be covered by other components
in Sage.

> >  * The code doesn't seem to have MSVC support, and given its size the
> > size/benefit ratio doesn't look good. Since I will likely end up
> > porting the code, I am not too excited about having to deal with
> > another 150k lines of code.
>
> I don't see any interest in MSVC support myself;

Since I and some other people are porting Sage to MSVC, I prefer not
to add to my workload unless there is a good reason to do so. Adding
150k lines of code with unknown problems, and keeping that code
working and running without any interest from upstream, is far from an
ideal situation. The effort to replace Maxima with something not
written in lisp is rooted precisely there, not in any dislike of lisp
and/or Maxima on my part. It is just a pain to support lisp. And since
its performance is far from stellar, replacing it will result in a
faster and smaller Sage.

> giac/xcas is
> ported to windows with cygwin. I bet, however, that it is easier to
> port code to another compiler than to implement, test and fix the
> algorithms all over again :-)

Well, I tend to disagree on that: porting the code is only a small
part of the effort. And it isn't only a port to MSVC; it is also a
port to Sun's Forte compiler and potentially other compilers down
the road. We are interested in only a small part of the functionality,
and giac has been looked at in the past [that discussion happened
face to face at Sage Days or in IRC] and we never came to the
conclusion that it was worth the effort [i.e. the benefit was not
worth the cost]:

* Gary Furnish has been working on Symbolics for the last four
months and is funded over the summer via an REU to finish that work.
Parts of that code might even make it into a 3.x release before Dev1.
By the end of the summer we will likely have a Risch implementation,
and since Gruntz is implemented in sympy we plan to use that. Should
sympy not get their ass in gear soon, I am sure that somebody will
port those 2000 lines of code to our symbolics framework.
* Mike Hansen and two students from the Singular team are currently
working on the gcd issues of Singular. Since a fast gcd is essential
to some of the things Mike is doing, he is certainly motivated to
solve that problem. I am not sure what the status of the students is,
but I wouldn't be surprised if their work makes it into the next main
Singular release this fall.

So, that leaves multivariate factorization. I am no expert on this,
but if people like Bill Hart say that it is a hard problem to get
right and fast, I am sure it will take quite some effort to get it
done. giac has potential problems [please correct me if I am wrong]:

* a low bus factor, i.e. few active developers, if not just one
* no really visible developer community
* no public CVS or version control system
* unsatisfactory documentation
* unknown build issues: ask, for example, John Cremona what happened
when his eclib code got merged into Sage. We discovered quite a number
of issues with NTL as well as with newer compilers, which took a while
to fix. And that was with a 20-year-old codebase that had been used
quite intensely by a number of people in the field. It took months to
get all of this sorted out, not because John is a bad coder [he is
stellar in my book], but because when you write code you only hit the
bugs that show up with your particular combination of toolchain and
environment. Once you throw the code into the wild, different things
tend to happen.
* unknown memory leak issues [Did you ever run valgrind? If not I
would highly recommend it]
* unknown portability issues: Sun Forte, MSVC, alignment problems,
endianness issues, i.e. I could point to a number of issues in
libSingular on Solaris, for example, or in symmetrica on Sparc or PPC,
and I am sure you get the idea.
* unknown number of external library users: I am not aware of anybody
external using giac as a library. From my experience with libSingular
I am not too optimistic that we will not run into a number of issues
here. Look at #2822: PolyBoRi is designed as a library to be used from
Python. But I spent about two days chasing down some issue that was
caused by the way we use the library.
* small test suite: it seems that only the files in $GIAC/check do
some testing. That is only a handful of files compared to 52,000
doctests in Sage (see the sketch after this list for what a doctest
looks like). Are the files in doc/dxcas used for testing? From past
experience, Sage's doctests have caught issues in components used by
Sage that were either not caught by their own test suite or not caught
at all, since the project had no automated testing.
* murky licensing situation - see below
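
For concreteness, a doctest is just a worked example embedded in a
docstring which our test harness executes and compares against the
expected output. A minimal illustration (a made-up function, not an
actual test from the Sage library):

def double(n):
    """
    Return twice the input.

    EXAMPLES::

        sage: double(21)
        42
    """
    return 2*n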

The unknowns would all have to be evaluated before I would give my "+1"
on integration, and since nobody at the moment is jumping up and down
to do the evaluation and integration work, I am skeptical about the
prospects. Since I am doing the ports, and the Solaris as well as the
MSVC port of Sage is considered to be of ultimate strategic value, my
opinion in that department does count. I am sure all of the issues
above can be overcome [in case they do exist], but I am not going to
work on any of this unless factorization becomes more important than
the ports. And I don't see that happening, because certain institutions
are paying me to port to Solaris and Windows. And whoever pays the
band decides what music is played.

So: What should you do? Start with an optional or experimental spkg
and prove me wrong. ;)

> >  * there are CoCoALib leftover files in the src directory, namely Tmp*
> > - I am sure this is an oversight.
>
> These files are not part of libcocoa.a (I have not rechecked since the
> last time I downloaded cocoa),

No, they are not. They *used* to be part of CoCoALib [up to 0.98.1, I
believe], but the file was removed early on in the 0.99.x series, and
it is not part of CoCoALib 0.99.12, which relicensed the library from
GPL V2 only to GPL V3 only. But the copyright reads:

// Copyright (c) 2006 Stefan Kaspar

// This file is part of the source of CoCoALib, the CoCoA Library.

// CoCoALib is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License (version 3)
// as published by the Free Software Foundation. A copy of the full
// licence may be found in the file COPYING in this directory.

Since Stefan Kaspar sits in the office next to mine I am certain that
I would be aware of a license change. The current file looks like the
following:

// This file is part of the source of ApCoCoALib, the ApCoCoA Library.
//
// Copyright (c) ApCoCoA Project (Prof. Dr. Martin Kreuzer, Uni Passau)
//
// Author: 2006, 2007 Stefan Kaspar
//
// Visit http://apcocoa.org/ for more information regarding ApCoCoA
// and ApCoCoALib.
// Visit http://www.apcocoa.org/wiki/ApCoCoA:KnownIssues for bugs, problems
// and known issues.
//
// ApCoCoALib is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License (version 2 or later)
// as published by the Free Software Foundation. A copy of the full
// licence may be found in the file COPYING in this directory.

Five months ago it was:

// Copyright (c) 2006, 2007 Stefan Kaspar
//
// Visit http://apcocoa.org/ for more information regarding ApCoCoA
// and ApCoCoALib.
// visit http://www.apcocoa.org/wiki/ApCoCoA:KnownIssues for bugs, problems
// and known issues.
//
// ApCoCoALib is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License (version 2 or later)
// as published by the Free Software Foundation. A copy of the full
// licence may be found in the file COPYING in this directory.

And three months before that it was:

// Copyright (c) 2006, 2007 Stefan Kaspar
//
// Visit http://apcocoa.org/ for more information regarding ApCoCoA
// and ApCoCoALib.
// visit http://www.apcocoa.org/wiki/ApCoCoA:KnownIssues for bugs, problems
// and known issues.
//
// ApCoCoALib is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License (version 2)
// as published by the Free Software Foundation. A copy of the full
// licence may be found in the file COPYING in this directory.

The current CoCoALib 0.99.5 does not contain any FGLM-related file.

So I am quite curious where a GPL V3 only TmpFGLM.c (the same applies
to LESystemSolver.[CH], too) is coming from. I worked closely with
the CoCoA team for 3+ years and have more than 150 patches in CoCoA as
well as CoCoALib, so I am quite familiar with the people as well as
the code. Since the CoCoA team does not own the copyright on those
files (they are part of ApCoCoALib), they cannot relicense them.

> but they are required for the F5
> implementation in cocoa. This F5 is not used by default inside giac;
> it seems slower than cocoa's standard reduction algorithm, which is
> used instead (if CoCoA is detected) so that giac has a correct
> Groebner basis implementation. I'll certainly try to build a link to
> singular when I have some time (I didn't try before since it was not
> linkable as a library).
>
> >  * It is likely that improved algorithms will need fast linear algebra
> > or some other goodies and writing interfaces to GIAC if we already
> > have them in Sage seems of little benefit to us.
>
> I believe that the main advantage for sage would be to use giac as an
> alternative to maxima for calculus, and that would be useful for giac
> since the number of users would grow. I'm working on the efficiency
> of the basic algorithms of giac (polynomial arithmetic, but also
> linear algebra, such as determinants over the integers) mainly
> because they are required for that. They could be used inside sage
> until a better alternative is ready (and people could choose to
> implement new functions first instead). I'm also interested in using
> sage components to improve giac where possible, as I did with pari,
> ntl and cocoa (would you, for example, implement polynomial
> arithmetic so that it could be used from C++?).

Well, fast symbolics is already being implemented and is already quite
far along. The gcd situation is also being looked at. That leaves
factorization, and a tenfold speedup is probably not enough of an
incentive to get giac into Sage right now. It might be different
in the future, but from my perspective the "rapid growth phase",
during which Sage integrated huge external projects with a lot of
overlap with existing functionality into the standard distribution,
ended with ATLAS. What is now being merged is usually small,
specialized libraries that do a couple of things and do them
particularly well. If factorization could be broken out of giac into
a reasonably small package (like factory in Singular) we might have
something to discuss, but as a whole I do not see this as a good fit.

Cheers,

Michael

Michael Brickenstein

unread,
Apr 12, 2008, 11:33:00 AM4/12/08
to sage-devel

> * unknown number of external library users: I am not aware of anybody
> external using giac as a library. From my experience with libSingular
> I am not too optimistic that we will not run into a number of issues
> here. Look at #2822: PolyBoRi is designed as a library to be used from
> Python. But I spent about two days chasing down some issue that was
> caused by the way we use the library.

We appreciate all the work involved in the SAGE integration of our
library, which resulted in a cleaner PolyBoRi.
Just to illustrate Michael's argument:
he has only seen the sunny side of PolyBoRi, that is, the period since
it was designed to be used from Python.
Before that, we also had a Python interface. But the C++ interface of
CUDD made it impossible to have automatic resource management.
So essentially I had to make sure, in every Python script, that these
resources were deallocated in the correct order.
The only solution was to write our own C++ interface to CUDD.
Michael

mabshoff

unread,
Apr 12, 2008, 11:53:06 AM4/12/08
to sage-devel


On Apr 12, 5:33 pm, Michael Brickenstein <brickenst...@mfo.de> wrote:
> >  * unknown number of external library users: I am not aware of anybody
> > external using giac as a library. From my experience with libSingular
> > I am not too optimistic that we will not run into a number of issues
> > here. Look at #2822: PolyBoRi is designed as a library to be used from
> > Python. But I spent about two days chasing down some issue that was
> > caused by the way we use the library.

Hi Michael B.,

> We appreciate all the work involved in the SAGE integration of our
> library, which resulted in a cleaner PolyBoRi.
> Just to illustrate Michael's argument:
> he has only seen the sunny side of PolyBoRi, that is, the period since
> it was designed to be used from Python.

;)

> Before that, we also had a Python interface. But the C++ interface of
> CUDD made it impossible to have automatic resource management.

Now you are scaring me.

> So essentially I had to make sure, in every Python script, that these
> resources were deallocated in the correct order.
> The only solution was to write our own C++ interface to CUDD.
> Michael

Hehe, it looks like there might still be clouds in the sky, at least
with our interface. Valgrinding pbori.pyx results in

==6362== 61,866,220 bytes in 59 blocks are still reachable in loss record 13,368 of 13,369
==6362== at 0x4A1BE1B: malloc (vg_replace_malloc.c:207)
==6362== by 0x159EE986: MMalloc (safe_mem.c:62)
==6362== by 0x15A076BD: Cudd_Init (cuddInit.c:151)
==6362== by 0x159A26AE:
polybori::BoolePolyRing::BoolePolyRing(unsigned, int, bool)
(CCuddCore.h:120)
==6362== by 0x15964DEB:
__pyx_pf_4sage_5rings_10polynomial_5pbori_21BooleanPolynomialRing___init__(_object*,
_object*, _object*) (ccobject.h:50)
==6362== by 0x458E90: type_call (typeobject.c:436)
==6362== by 0x415592: PyObject_Call (abstract.c:1860)
==6362== by 0x482271: PyEval_EvalFrameEx (ceval.c:3775)
==6362== by 0x48531A: PyEval_EvalCodeEx (ceval.c:2831)
==6362== by 0x4840A4: PyEval_EvalFrameEx (ceval.c:494)
==6362== by 0x48531A: PyEval_EvalCodeEx (ceval.c:2831)
==6362== by 0x483A3C: PyEval_EvalFrameEx (ceval.c:3660)
==6362==
==6362==
==6362== 494,929,760 bytes in 59 blocks are still reachable in loss record 13,369 of 13,369
==6362== at 0x4A1BE1B: malloc (vg_replace_malloc.c:207)
==6362== by 0x159EE986: MMalloc (safe_mem.c:62)
==6362== by 0x15A2F10E: cuddInitCache (cuddCache.c:148)
==6362== by 0x15A0765F: Cudd_Init (cuddInit.c:146)
==6362== by 0x159A26AE:
polybori::BoolePolyRing::BoolePolyRing(unsigned, int, bool)
(CCuddCore.h:120)
==6362== by 0x15964DEB:
__pyx_pf_4sage_5rings_10polynomial_5pbori_21BooleanPolynomialRing___init__(_object*,
_object*, _object*) (ccobject.h:50)
==6362== by 0x458E90: type_call (typeobject.c:436)
==6362== by 0x415592: PyObject_Call (abstract.c:1860)
==6362== by 0x482271: PyEval_EvalFrameEx (ceval.c:3775)
==6362== by 0x48531A: PyEval_EvalCodeEx (ceval.c:2831)
==6362== by 0x4840A4: PyEval_EvalFrameEx (ceval.c:494)
==6362== by 0x48531A: PyEval_EvalCodeEx (ceval.c:2831)

Since we are not sure yet whether it is the interface's or PolyBoRi's
fault, I haven't pinged you about this. Malb said that he would poke
around this weekend since he was working with PolyBoRi anyway. The
above is new in 0.3.1.

Cheers,

Michael

parisse

unread,
Apr 12, 2008, 3:44:57 PM4/12/08
to sage-devel
>
> Nope, it isn't. After initially switching to GPL V3+ we have decided
> to remain at GPL V2+ for now. Since we have discussed this quite
> extensively in the past here is no need to rehash this here. I don't
> need another drawn out licensing discussion.
>

Well, I don't see why it is a concern since giac can be relicensed to
version 2.

> > I do so that I can fix bug myself using gdb. Sometimes
> > in the future, I'll certainly have to make some documentation for
> > programmers, but I'll move it as a priority only if I'm sure active
> > developers are using giac.
>
> This is certainly a chicken/egg problem. I did revisit risch.[cc|h]
> and vecteur.[cc|h] and rish.cc was better commented than I did
> remember. But what I am missing is doxygen style documentation of
> input and output parameters.
>

This is the developer doc I plan to write someday, but it is currently
not a main priority.

>
> Yes, we agree here too, but at the moment we are only interested in
> factorization. Everything else seems to be covered by other
> components in Sage.
>

I believed that you would appreciate seeing other developers focus on
the aspects you do not want to develop yourselves, like integration,
which BTW is far more than just implementing the Risch algorithm. If
you don't see any interest in integrating giac into sage, fine; as I
said earlier, I do not *need* sage. Of course I would appreciate
feedback from its users, but I can continue on my way, expanding giac
and integrating the same libraries into giac (as sage does), and I
will certainly not lose any more time justifying myself here when I
read the tone of some of your comments, like:

> We are interested in only a small part of the functionality,
> and giac has been looked at in the past [that discussion happened
> face to face at Sage Days or in IRC] and we never came to the
> conclusion that it was worth the effort [i.e. the benefit was not
> worth the cost]:
>

>giac has potential problems [please correct me if I am wrong]:
>
> * a low bus factor, i.e. few active developers, if not just one
> * no really visible developer community
> * no public CVS or version control system

Integration into sage would certainly help!

> * unsatisfactory documentation

The focus is on xcas user documentation.

> * unknown build issues:

Of course, since they are unknown. Note, however, that it is ported to
mac os, windows, arm linux and wince.

> * unknown memory leak issues [Did you ever run valgrind? If not I
> would highly recommend it]

yes

> * unknown portability issues: Sun Forte, MSVC, alignment problems,
> endianness issues,

See the arm port. I never cared about sun or msvc.

> * small test suite: it seems that only the files in $GIAC/check
> do some testing. That is only a handful of files compared to
> 52,000 doctests in Sage.

You would be surprised by the number of bugs that are caught by
definite integration. Giac is certainly not bug-free, but it can be
used, and is beginning to be used, as an alternative to maple.

>I am sure all of the issues
> above can be overcome [in case they do exist], but I am not going to
> work on any of this unless factorization becomes more important than
> the ports. And I don't see that happening, because certain
> institutions are paying me to port to Solaris and Windows. And
> whoever pays the band decides what music is played.
>
> So: What should you do? Start with an optional or experimental spkg
> and prove me wrong. ;)
>

That is not the way I see collaboration. I will not do all the work
myself.

>If
> factorization could be broken out of giac into a reasonably small
> package (like factory in Singular) we might have something to discuss,
> but as a whole I do not see this as a good fit.
>

Obviously factorization depends on all the basic algorithms, like gcd,
polynomial arithmetic, etc., and it cannot be isolated easily. And if
that's the only thing in giac you would be interested in, then there
is indeed no need to continue this discussion :-)

Alexander Dreyer

unread,
Apr 12, 2008, 8:04:12 PM4/12/08
to sage-devel
On 12 Apr., 17:53, mabshoff <Michael.Absh...@mathematik.uni-dortmund.de> wrote:
[...]

> Since we are not sure yet if it is the interface's or PolyBoRi's fault
> I haven't pinged you about this. Malb said that he would poke around
> this weekend since we was working with PolyBoRi anyway. The above is
> new in 0.3.1.
I've attached another patch to
http://trac.sagemath.org/sage_trac/ticket/2822

Maybe it solves some of those leak problems.

Regards,
Alexander

mabshoff

unread,
Apr 12, 2008, 8:24:25 PM4/12/08
to sage-devel


On Apr 12, 9:44 pm, parisse <bernard.pari...@ujf-grenoble.fr> wrote:

Hi,

> > Nope, it isn't. After initially switching to GPL V3+ we have decided
> > to remain at GPL V2+ for now. Since we have discussed this quite
> > extensively in the past, there is no need to rehash it here. I don't
> > need another drawn-out licensing discussion.
>
> Well, I don't see why it is a concern since giac can be relicensed to
> version 2.

Sure, I was just pointing out that not everybody prefers the GPL V3.

> > > I do so that I can fix bug myself using gdb. Sometimes
> > > in the future, I'll certainly have to make some documentation for
> > > programmers, but I'll move it as a priority only if I'm sure active
> > > developers are using giac.
>
> > This is certainly a chicken-and-egg problem. I did revisit risch.[cc|h]
> > and vecteur.[cc|h], and risch.cc was better commented than I
> > remembered. But what I am missing is doxygen-style documentation of
> > input and output parameters.
>
> This is the developer doc I plan to write someday, but it is currently
> not a main priority.
>

Sure, I understand that, and since it is your time it is your call how
you allocate your resources. But it would be much easier for us to
wrap the library if documentation existed.

>
> > Yes, we agree here too, but at the moment we are only interested in
> > factorization. Everything else seems to be covered by other
> > components in Sage.
>
> I believed that you would appreciate seeing other developers focus on
> the aspects you do not want to develop yourselves, like integration,
> which BTW is far more than just implementing the Risch algorithm.

For us, integration == Risch and limits == Gruntz; they are not to be
taken literally.

> If you don't see any interest in integrating giac into sage, fine;

That is not the case here. There is certainly interest, but it is no
longer "easy" to get into Sage. We support, or intend to support,
platforms different from the ones you run on, and whether your code
runs on them is a concern. Since factorization is something essential
it should not be taken lightly, and since I am the person who in the
end will have to deal with those issues a lot, I am concerned, because
I have been burned way too often during the last six months.

> as I
> said earlier, I do not *need* sage. Of course I would appreciate
> feedback from its users, but I can continue on my way, expanding giac
> and integrating the same libraries into giac (as sage does), and I
> will certainly not lose any more time justifying myself here when I
> read the tone of some of your comments, like:
>
> > We are interested in only a small part of the functionality,
> > and giac has been looked at in the past [that discussion happened
> > face to face at Sage Days or in IRC] and we never came to the
> > conclusion that it was worth the effort [i.e. the benefit was not
> > worth the cost]:

Well, that reflects my personal opinion, and around here we prefer to
be honest. After discussing GIAC in IRC right after your original post
I waited a while for somebody else to jump into the discussion and/or
volunteer to do some work. That didn't happen, so instead of just
ignoring you I wrote my honest opinion on things. Email certainly
isn't the best medium for discussion between people who don't know
each other, and I had no intention of offending you personally or of
saying bad things about your code.

> >giac has potential problems [please correct me if I am wrong]:
>
> >  * a low bus factor, i.e. few active developers, if not just one
> >  * no really visible developer community
> >  * no public CVS or version control system
>
> Integration into sage would certainly help!

William did look at GIAC way back before he selected Maxima for Sage
years ago. One of the issues was build support. Another one was that
it didn't seem to have a large community. Not much seems to have
changed in that regard. So if you start providing things like a devel
mailing list and a public svn/hg/git, it would give people a chance to
become involved. There are plenty of people around here who lurk and
might have taken a look at GIAC as a consequence of this discussion.

> >  * unsatisfactory documentation
>
> The focus is on xcas user documentation.
>
> >  * unknown build issues:
>
> Of course, since they are unknown. Note, however, that it is ported to
> mac os, windows, arm linux and wince.

Ok, arm support is good.

> >  * unknown memory leak issues [Did you ever run valgrind? If not I
> > would highly recommend it]
>
> yes

Good.

> >  * unknown portability issues: Sun Forte, MSVC, alignment problems,
> > endianness issues,
>
> See the arm port. I never cared about sun or msvc.

Well, here I would have really preferred to hear something like "I am
glad to integrate changes you might provide".

> >  * small test suite: it seems that only the files in $GIAC/check
> > do some testing. That is only a handful of files compared to
> > 52,000 doctests in Sage.
>
> You would be surprised by the number of bugs that are caught by
> definite integration. Giac is certainly not bug-free, but it can be
> used, and is beginning to be used, as an alternative to maple.

Sure, but in any complex system, changes at one end can cause
potential issues in other places. When we upgraded Maxima to 5.14.0 it
started getting certain limits wrong. I am sure they tested, but in my
experience an extensive test suite is important, and our review
policy *mandates* doctests for any patch to be merged. I am not
implying your code is buggy, I am just saying that I would prefer a
large test suite, because that is how we caught numerous build issues
in, for example, eclib.

> >I am sure all of the issues
> > above can be overcome [in case they do exist], but I am not going to
> > work on any of this unless factorization becomes more important than
> > the ports. And I don't see that happening, because certain
> > institutions are paying me to port to Solaris and Windows. And
> > whoever pays the band decides what music is played.
>
> > So: What should you do? Start with an optional or experimental spkg
> > and prove me wrong. ;)
>
> That is not the way I see collaboration. I will not do all the work
> myself.

Sure, the idea behind Sage is collaboration, but somebody has to get
the ball rolling. And that is usually the developer behind the code or
a Sage person who is interested in the functionality. Since nobody has
stepped up so far [that might change] the ball is in your court.

> >If
> > factorization could be broken out of giac into a reasonably small
> > package (like factory in Singular) we might have something to discuss,
> > but as a whole I do not see this as a good fit.
>
> Obviously factorization depends on all the basic algorithms, like gcd,
> polynomial arithmetic, etc., and it cannot be isolated easily. And if
> that's the only thing in giac you would be interested in, then there
> is indeed no need to continue this discussion :-)

I know that, but as Singular shows with libfactory, it can be done. I
don't consider this discussion a waste of your time. If you consider
certain other features of GIAC to be better than anything else in
Sage, we need to know about them. It is your code, so you should be
able to tell us what to look at. E.g., a solver for non-linear
systems would also be interesting.

Cheers,

Michael

mabshoff

unread,
Apr 12, 2008, 9:08:20 PM4/12/08
to sage-devel
On Apr 12, 9:44 pm, parisse <bernard.pari...@ujf-grenoble.fr> wrote:

<SNIP>

Hi Bernard,

> >  * unknown memory leak issues [Did you ever run valgrind? If not I
> > would highly recommend it]
>
> yes
<SNIP>

Ok. What about the following then?

[mabshoff@www bin]$ /usr/local/valgrind-3.3.0/bin/valgrind --tool=memcheck --leak-resolution=high ./icas
==14493== Memcheck, a memory error detector.
==14493== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et
al.
==14493== Using LibVEX rev 1804, a library for dynamic binary
translation.
==14493== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==14493== Using valgrind-3.3.0, a dynamic binary instrumentation
framework.
==14493== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et
al.
==14493== For more details, rerun with: -v
==14493==
Using locale /usr/local/share/locale/
Help file aide_cas not found
// Unable to find keyword file doc/en/keywords
Added 0 synonyms
Welcome to giac readline interface
(c) 2001,2007 B. Parisse & others
Homepage http://www-fourier.ujf-grenoble.fr/~parisse/giac.html
Released under the GPL license 2.0 or above
See http://www.gnu.org for license details
-------------------------------------------------
Press CTRL and D simultaneously to finish session
Type ?commandname for help
0>> 1+1
2

// Time 0.33
1>> ==14493==
==14493== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 23 from 1)
==14493== malloc/free: in use at exit: 14,896,431 bytes in 1,615 blocks.
==14493== malloc/free: 7,838 allocs, 6,223 frees, 15,279,507 bytes allocated.
==14493== For counts of detected errors, rerun with: -v
==14493== searching for pointers to 1,615 not-freed blocks.
==14493== checked 6,629,940 bytes.
==14493==
==14493== LEAK SUMMARY:
==14493== definitely lost: 6,888 bytes in 685 blocks.
==14493== possibly lost: 6,886 bytes in 166 blocks.
==14493== still reachable: 14,882,657 bytes in 764 blocks.
==14493== suppressed: 0 bytes in 0 blocks.
==14493== Rerun with --leak-check=full to see details of leaked memory.

In detail:

==14568== LEAK SUMMARY:
==14568== definitely lost: 4,184 bytes in 166 blocks.
==14568== indirectly lost: 2,704 bytes in 519 blocks.
==14568== possibly lost: 6,886 bytes in 166 blocks.
==14568== still reachable: 14,882,657 bytes in 764 blocks.
==14568== suppressed: 0 bytes in 0 blocks.

The big one is likely the pari stack:

==14576== 10,000,000 bytes in 1 blocks are still reachable in loss record 1,229 of 1,229
==14576== at 0x4C9C828: malloc (vg_replace_malloc.c:207)
==14576== by 0x8BF887A: (within /home/mabshoff/usr/local/bin/icas)

There are plenty of

==14576== 81,940 bytes in 5 blocks are still reachable in loss record 1,227 of 1,229
==14576== at 0x4C9D004: operator new(unsigned) (vg_replace_malloc.c:224)
==14576== by 0x807A2D1: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x9E343A: pthread_once (in /lib/libpthread-2.5.so)
==14576== by 0x807D810: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x84DEF97: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x85DDEB6: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x8D6B274: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x805374C: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x8D6B1BA: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x8ACD90: (below main) (in /lib/libc-2.5.so)

or things like:

==14576== 25 bytes in 1 blocks are possibly lost in loss record 826 of 1,229
==14576== at 0x4C9D004: operator new(unsigned) (vg_replace_malloc.c:224)
==14576== by 0x4D52BFA: std::string::_Rep::_S_create(unsigned,
unsigned, std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.8)
==14576== by 0x4D539F4: (within /usr/lib/libstdc++.so.6.0.8)
==14576== by 0x4D53C06: std::string::string(char const*,
std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.8)
==14576== by 0x85D5135: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x8D6B274: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x805374C: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x8D6B1BA: (within /home/mabshoff/usr/local/bin/icas)
==14576== by 0x8ACD90: (below main) (in /lib/libc-2.5.so)

Now, I used the latest binary and didn't build from source. A couple
of observations:

* You ship your own copy of libstdc++.so.6 in $GIAC/lib, and I
consider that a bad, bad idea since libstdc++.so depends on kernel
headers.
* icas as well as xcas are statically linked against pari, CoCoALib
and some other mathematical libs. Since both of them link against
those libraries, why not link dynamically and save the space?
* Note that the interface says "Released under the GPL license 2.0 or
above" and not GPL V3.

Cheers,

Michael

mabshoff

unread,
Apr 12, 2008, 9:11:31 PM4/12/08
to sage-devel


On Apr 13, 2:04 am, Alexander Dreyer
<jan.alexander.dre...@googlemail.com> wrote:
> On 12 Apr., 17:53, mabshoff <Michael.Absh...@mathematik.uni-dortmund.de> wrote:
>
> [...]
>
> > Since we are not sure yet whether it is the interface's or PolyBoRi's
> > fault, I haven't pinged you about this. Malb said that he would poke
> > around this weekend since he was working with PolyBoRi anyway. The
> > above is new in 0.3.1.

Hi Alexander,

> I've attached another patch to
> http://trac.sagemath.org/sage_trac/ticket/2822
>
> Maybe it solves some of those leak problems.

It doesn't seem to make a difference. I did test and build twice to
make sure I got the right numbers. I am not convinced yet that this
isn't a problem in our interface, so I would suggest I start working
on "pure" PolyBoRi + python to nail down any potential problems there
before moving on to Sage. Let me get 3.0 out and then we can revisit
the issue.

> Regards,
>   Alexander

Cheers,

Michael

parisse

unread,
Apr 13, 2008, 12:22:43 PM4/13/08
to sage-devel
> > This is the developer doc I plan to write someday, but it is currently
> > not a main priority.
>
> Sure, I understand that, and since it is your time it is your call how
> you allocate your resources. But it would be much easier for us to
> wrap the library if documentation existed.
>

That's why I plan to do it if I have a reasonable hope that people
will use it.

>
> For us, integration == Risch and limits == Gruntz; they are not to be
> taken literally.
>

Ok, but in fact the Risch algorithm is never called for most input,
because other heuristics give much faster and more usable output. And
definite integration adds other code.

> Well, that reflects my personal opinion, and around here we prefer to
> be honest. After discussing GIAC in IRC right after your original post
> I waited a while for somebody else to jump into the discussion and/or
> volunteer to do some work. That didn't happen, so instead of just
> ignoring you I wrote my honest opinion on things. Email certainly
> isn't the best medium for discussion between people who don't know
> each other, and I had no intention of offending you personally or of
> saying bad things about your code.
>

Ok, let's forget about that.

>
> William did look at GIAC way back before he selected Maxima for Sage
> years ago. One of the issues was build support. Another one was that
> it didn't seem to have a large community. Not much seems to have
> changed in that regard.

On the developer side that's correct, but it's slowly changing on the
user side (I mean people using giac under xcas).

> So if you start providing things like a devel
> mailing list and a public svn/hg/git, it would give people a chance to
> become involved. There are plenty of people around here who lurk and
> might have taken a look at GIAC as a consequence of this discussion.
>

There is a phpbb2 forum for giac/xcas (mainly used for xcas today) and
an svn at sourceforge. The svn is currently used only as a backup.

> > > * unknown memory leak issues [Did you ever run valgrind? If not I
> > > would highly recommend it]
>
> > yes
>
> Good.
>

As an answer to your next post: I have used valgrind a few times and
it helped me correct a few bugs, but I have no idea how to handle the
kind of output you copied. About the linkage of the static binaries
(using a provided libstdc++): it is the result of user problems
installing xcas_root.tgz; the current binaries have the advantage of
being usable on almost every Linux PC. Debian and Ubuntu users may
install a common debian package which is dynamically linked with the
libstdc++, but, again for the user's convenience, all the additional
libraries (like NTL, pari, cocoa, etc.) are statically linked in. Xcas
users are *not* expert Linux users (well, most xcas users are not
Linux users at all).

> > > * unknown portability issues: Sun Forte, MSVC, alignment problems,
> > > endianness issues,
>
> > See the arm port. I never cared about sun or msvc.
>
> Well, here I would have really preferred to hear something like "I am
> glad to integrate changes you might provide".
>

Maybe my sentence was not clear (English is not my mother tongue...).
I thought it was obvious that I would make any reasonable change in
the source code to make giac compatible with other compilers, but that
I would not take care of compilers other than gcc myself.

> > You would be surprised by the number of bugs that are caught by
> > definite integration. Giac is certainly not bug-free, but it can be
> > used, and is beginning to be used, as an alternative to maple.
>
> Sure, but in any complex system, changes at one end can cause
> potential issues in other places.

I'm well aware of that, but in my experience, many of them will
interact with the integration code and will therefore be caught by
testintegrate (testlimit is also a good checker).

>I am not
> implying your code is buggy, I am just saying that I would prefer a
> large test suite, because that is how we caught numerous build issues
> in, for example, eclib.
>

I fully agree: giac needs more tests. It's just that I do not have
much time; I have to choose what I do.

> Sure, the idea behind Sage is collaboration, but somebody has to get
> the ball rolling. And that is usually the developer behind the code or
> a Sage person who is interested in the functionality. Since nobody has
> stepped up so far [that might change] the ball is in your court.
>

I hope someone will be interested, since I don't have time to do it
alone, and it would most probably be a waste of time if nobody is
interested. But of course I'm ready to work with someone who knows
what to do on the sage side.

> I know that, but as Singular shows with libfactory, it can be done. I
> don't consider this discussion a waste of your time. If you consider
> certain other features of GIAC to be better than anything else in
> Sage, we need to know about them. It is your code, so you should be
> able to tell us what to look at. E.g., a solver for non-linear
> systems would also be interesting.

I do not pretend that feature xyz of giac is better than anything else
in the open-source world; it's just that giac provides *today* many
interlinked features that one expects in a CAS (by "one" I mean a
typical math user, not someone doing research in a very specialized
computer algebra topic), and it seems that some sage resources are or
will be devoted to writing code for features that are already
available in giac. Hence I asked myself: why don't they link against
giac and focus on new functionality? That of course does not mean sage
should not write its own code for such features, but it's not that
urgent, and it could be done in such a way that both sage and xcas
benefit.

Bye,
Bernard

rjf

unread,
Apr 26, 2008, 4:36:39 PM4/26/08
to sage-devel, Richard


Sorry to jump in late; I just found this discussion by googling
something else.

Maxima was berated for being too slow in factoring this:
-p10^170*X1^10*X2^10+p10^130*X1^5*X2^10+p10^130*X1^10*X2^5-
p10^90*X1^5*X2^5+p10^80*X1^5*X2^5-p10^40*X2^5-p10^40*X1^5+1

which apparently did not terminate after more than 15 minutes.

After observing that the polynomial is actually a polynomial in x1,
x2, and x3=p10^10, I made that substitution and tried factoring.

Maxima 5.14.0 factored it into 3 factors in 0.8 seconds on my 3GHz
pentium.
The remaining decomposition, factoring these factors further with
p10^10 restored, took 18 seconds more.

Factoring the polynomial as given took 600 seconds.
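
In the notation of the Sage/Maxima interface used earlier in this
thread, the substitution experiment looks roughly like this (a sketch
only; x3 stands for p10^10, and the timings are the ones reported
above):

sage: s = '-x3^17*X1^10*X2^10 + x3^13*X1^5*X2^10 + x3^13*X1^10*X2^5 - x3^9*X1^5*X2^5 + x3^8*X1^5*X2^5 - x3^4*X2^5 - x3^4*X1^5 + 1'
sage: maxima.eval('factor(%s)' % s)  # 3 factors in about 0.8 seconds
sage: # then restore p10^10 for x3 in each factor and factor further
sage: # (about 18 seconds more)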

Whether the observed slowness was caused by CLISP rather than GCL, by
hardware differences, or by a bad algorithm that could easily be
improved is hard to say.

If Maxima's factoring program is too slow, it could always call some
other program. Maybe it should be set up to call SAGE? That way the
programmer would have the advantage of all the interactivity of
Maxima (error messages, debugging, graphical front end) as well as
the ability to program in the Maxima language and lisp, which are
compilable languages.

I also noticed a discussion of multivariate GCD heuristics and
simulated annealing. If you think that you can characterize the
domain of GCD inputs in such a way as to cluster inputs that are best
done by a particular algorithm, lots of luck. The choice of the best
GCD algorithm seems to depend most heavily on the ANSWER, not the
input. That is, if the GCD result is 1, you have one choice. If the
GCD(p,q) is p, you have another.
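
A minimal Sage illustration of the point, with arbitrary polynomials
exercising the two extreme answers:

sage: R.<x,y> = ZZ[]
sage: p = x^3*y + x + 1
sage: q = x*y^2 + y + 1
sage: gcd(p, q)      # coprime inputs: the answer is 1
1
sage: gcd(p, p*q)    # one input divides the other: the answer is p
x^3*y + x + 1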

Regards