Random banter about Sage standards


kcrisman

Aug 24, 2010, 9:06:44 AM
to sage-support, sage-...@googlegroups.com
I've put this on sage-devel where it belongs.

On Aug 24, 5:14 am, "Dr. David Kirkby" <david.kir...@onetel.net>
wrote:
> On 08/23/10 04:20 PM, kcrisman wrote:
>
> > Well, in general it seems to me that most Sage bugs come from things/
> > functionality that didn't exist before, and once they exist people
> > want to start using them.
>
> Well, there are an awful lot of open bugs in trac. Some have been open a very
> long time. They are assigned to person X (for example I get the Solaris ones),
> but person X does not work on them, but spends time writing new code.

"Assigned to" just means that there is some system default for where a
given category of ticket goes. It does not mean "homework for or you
can't work on Sage any more". But I agree that this is very
confusing. Perhaps someone who has permission to make such changes on
Trac could make that more obvious.

> > But unlike a commercial system, the only
> > realistic way we have to look for these bugs is for people to use the
> > system.  I just don't see how else to do it;
>
> There are several packages in Sage which have test suites that we could run from
> spkg-check. But people have not added the spkg-check file to Sage, so we can not
> run the tests.
>
> http://trac.sagemath.org/sage_trac/ticket/9767  # cliquer
> http://trac.sagemath.org/sage_trac/ticket/9613  # linbox
> http://trac.sagemath.org/sage_trac/ticket/9311  # ratpoints

That would be great. I think this is somewhat orthogonal to fixing
bugs in the Sage library, though; obviously we want to make sure
upstream is good, but again this is a different thing. I guess one
could add this as a requirement for any upgrades of spkgs.

> and *many* others. Of the 100 or so standard packages in sage, only around 20
> have the ability to run any self-tests of the packages when they are built. In
> some cases, those test suites are *far* more comprehensive than the ones in
> Sage. (I believe Pari is an exception, where the Sage doctests are more
> comprehensive than the Pari tests).
>
> If you look at the first and second columns of this ticket:
>
> http://trac.sagemath.org/sage_trac/ticket/9281
>
> you will see those where there is an spkg-check file, and those where there is
> not one. Not every package in sage has the ability to do any self checks during
> the build process, so we will never be able to get that list complete, but there
> are a lot missing.

That's a great ticket.

Anyway, I think (as you have correctly noted before) we have a bit of
a culture clash between software engineering and mathematics.
However, as far as I can tell, the only way to solve it is to vastly
increase the user base until enough of them become developers that the
load of these things does not fall on just a few people, nearly *all*
of whom (including you, and you have done enormous high-quality work)
are doing it on a volunteer basis.

So we will add what makes Sage better for us. This is definitely not
ideal project management in the sort of setting you are talking about,
but the alternative is Sage staying completely stagnant, I think.
It's a matter of motivating the troops in terms of things like
documentation, testing, etc. And I, for one, would rather have Sage
add lots of useful content for my courses than have it pass every spkg
test - not that those are bad, far from it!

It's a long evolution from "William's private replacement for Magma
written on top of Python" to "highest-quality possible replacement for
M*", and we aren't there yet. But I think if you look at how things
have changed over the last three or four years, the release process
(for instance) that Mitesh et al. are currently doing is vastly
different from the one long ago; five years from now it could be
nearly up to your standards, I hope - because you are right, they are
the right ones to have.

Just have patience with those of us who aren't from a software
background - and trust that we are trying hard to internalize your
lessons, but that we have more immediate needs to fill as well for our
next course or paper. I think that just as Minh's messages about
documentation are slowly taking hold in the whole ecosystem, so are
yours about software engineering.

- kcrisman

Dr. David Kirkby

Aug 27, 2010, 2:25:40 AM
to sage-...@googlegroups.com
On 08/24/10 02:06 PM, kcrisman wrote:
>
> Anyway, I think (as you have correctly noted before) we have a bit of
> a culture clash between software engineering and mathematics.

<SNIP>


> Just have patience with those of us who aren't from a software
> background - and trust that we are trying hard to internalize your
> lessons, but that we have more immediate needs to fill as well for our
> next course or paper. I think that just as Minh's messages about
> documentation are slowly taking hold in the whole ecosystem, so are
> yours about software engineering.
>
> - kcrisman


Just to make a point, my own background is not software engineering. My first
degree is in electrical and electronic engineering, my masters in microwaves and
optoelectronics and my PhD in medical physics. Apart from a very brief spell
(about 6 months), I have never worked in the IT industry.

I first became aware of the subject of software engineering when an Australian
guy joined the department I worked in at University College London. Russel's task
was to develop some hardware and software for a research project. He quite
rightly realised that developing software "by the seat of your pants" as he
called it was not the way to go about it. So before starting to write the
software, he purchased a book on the subject of software engineering.

I never gave this topic much more thought until I started working on Sage. I
then began to realise that Sage needs to take a more professional approach to
development, as it seems a bit ad hoc to me.

My own view is I'd rather have something with fewer features, which I could rely
on, than lots of features I don't trust. When there is little in the way of
project management, and a culture of not doing anything properly, that attitude
tends to spread like a virus.

I'm currently running the doctests 100 times on a machine, with the same build
of Sage that passed all doctests. This is an interesting failure I observed:

sage -t -long devel/sage/doc/en/constructions/linear_algebra.rst
**********************************************************************
File
"/export/home/drkirkby/sage-4.5.3.alpha2/devel/sage-main/doc/en/constructions/linear_algebra.rst",
line 202:
sage: A.eigenvalues()
Expected:
[3, 2, 1]
Got:
[3, 1]
**********************************************************************


The tests have been run 41 times now, and only once has that test failed. The
answer looks quite reasonable, but I assume it is wrong, as the other 40 times the
code gave the expected value. It's these sorts of things that concern me. Why
should the same build of Sage, running exactly the same doctests each time, not
produce repeatable results?

There have been a few failures, though that is the only one I've noticed where the
answer looks very reasonable, but is in fact incorrect.

Dave

John Cremona

Aug 27, 2010, 4:17:41 AM
to sage-...@googlegroups.com

That is very worrying. The matrix A here is

[1 1 0]
[0 2 0]
[0 0 3]

over the rationals, so if eigenvalues are being missed it is in
finding the roots of a rational cubic whose roots are 1,2,3. I tried
tracing through the call to A.eigenvalues() but that is hard to do
since it spends ages doing things whose necessity is hard to
understand (for example, there are calls to cputime()!).

John

>
> The tests have been run 41 times now, and only once has that test failed.
> The answer looks quite reasonable, but I assume is wrong, as the other 40
> times the code gave the expected value. It's these sorts of things that
> concern me. Why should the same build of Sage, running exactly the same
> doctests each time, not produce repeatable results?
>
> There's been a few failures, though that is the only one I've noticed where
> the answer looks very reasonable, but is in fact incorrect.
>
> Dave
>


Dr. David Kirkby

Aug 27, 2010, 5:03:08 AM
to sage-...@googlegroups.com

Thank you for looking at this, John.

I think you have just proved one of the points I tried to make, John.

* I spent some time running the tests repeatedly and observed that failure once
in 47 runs of the doctests.

* You look at the code and find it's dubious. Calling cputime() when computing
eigenvectors does seem a bit odd. Even I know that. But this is getting past the
review process.

If you want, I can create a trac ticket for this, or perhaps it's better if you
do it, since you know more about the code. The test procedure was

* Sun Ultra 27
* 3.33 GHz quad core Xeon (hyperthreaded)
* OpenSolaris 06/2009.
* 12 GB RAM
* Totally unmodified sage-4.5.3.alpha2
* Running 'make ptestlong' in a loop which executes that 100 times (see the
sketch below).
* The failure was observed once in 47 runs to date.
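
The sketch I have in mind for narrowing it down further is to loop the
suspect computation on its own, outside the doctest framework (the loop
bound is arbitrary; this exercises only the eigenvalue code, not the test
harness):

sage: A = matrix(QQ, [[1,1,0],[0,2,0],[0,0,3]])
sage: for i in range(1000):
....:     ev = A.eigenvalues()
....:     if sorted(ev) != [1, 2, 3]:
....:         print "run %s: got %s" % (i, ev)

If this never trips while 'make ptestlong' keeps failing occasionally,
that would point at the testing framework rather than the mathematics.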

There are other suspicious failures I've observed, but that one struck me as
particularly worrying as the result seemed to look believable. When you get a
traceback, it's obvious something has gone wrong. But in this case it's less
obvious.

Dave

John Cremona

Aug 27, 2010, 5:17:35 AM
to sage-...@googlegroups.com
Wait a minute. I did not really look at the code, and I know nothing
about it at all. The short glance I took at the trace showed me that
I did not understand it at all, and I do not propose to spend more
time looking at it. (That is not because I do not care about code
quality and correctness, I just have other things I need to do!)

John

Dr. David Kirkby

Aug 27, 2010, 5:35:17 AM
to sage-...@googlegroups.com
On 08/27/10 10:17 AM, John Cremona wrote:
> Wait a minute. I did not really look at the code, and I know nothing
> about it at all. The short glance I took at the trace showed me that
> I did not understand it at all, and I do not propose to spend more
> time looking at it. (That is not because I do not care about code
> quality and correctness, I just have other things I need to do!)
>
> John

OK, thanks for filling me in.

Dave

Alex Ghitza

Aug 27, 2010, 7:47:27 AM
to Dr. David Kirkby, sage-...@googlegroups.com

(Sigh... And I had promised myself not to get involved in this thread...)

On Fri, 27 Aug 2010 10:03:08 +0100, "Dr. David Kirkby" <david....@onetel.net> wrote:
> On 08/27/10 09:17 AM, John Cremona wrote:
> > On 27 August 2010 07:25, Dr. David Kirkby<david....@onetel.net> wrote:
> >>
> >> My own view is I'd rather have something with fewer features, which I could
> >> rely on, than lots of features I don't trust. When there is little in the
> >> way of project management, and a culture of not doing anything properly,
> >> then attitude tends to spread like a virus.

I'll deal with this in decreasing order of importance:

1. "there is little in the way of project management" is an undeserved
jab at William, and "[there is] a culture of not doing anything
properly" is an undeserved jab at the *majority* of Sage developers.
Maybe the crusade for more quality control can proceed without this type
of comment.


2. referring to "My own view is I'd rather have something with fewer
features, which I could rely on, than lots of features I don't trust":
there might be such somethings out there. Maybe Maxima does everything
you want without anything you don't. Maybe a calculator does. Maybe
Ondrej Certik's SPD project would be a good starting point if you would
prefer something similar to Sage but much leaner (and without all the
badly-written number theory code we keep hearing about).

> * You look at the code and find it's dubious. Calling cputime() when computing
> eigenvectors does seem a bit odd. Even I know that. But this is getting past the
> review process.

Ah good, another chance to take a swing at the review process. Based on
a bug we haven't yet reproduced, don't know where it is, and without
having even peeked at the code.

> If you want, I can create a trac ticket for this, or perhaps its better if you
> do it, since you know more about the code. The test procedure was
>
> * Sun Ultra 27
> * 3.33 GHz quad core Xeon (hyperthreaded)
> * OpenSolaris 06/2009.
> * 12 GB RAM
> * Totally unmodified sage-4.5.3.alpha2
> * Running 'make ptestlong' in a loop which executes that 100 times.
> * The failure was observed once in 47 runs to date.
>
> There are other suspicious failures I've observed, but that one struck me as
> particularly worrying as the result seemed to look believable. When you get a
> traceback, it's obvious something has gone wrong. But in this case it's less
> obvious.

I don't know about you, but if I get...

Expected:
[3, 2, 1]
Got:
[3, 1]

... I look at what the test is trying to do:

sage: A = matrix(QQ, [[1,1,0],[0,2,0],[0,0,3]])
sage: A


[1 1 0]
[0 2 0]
[0 0 3]

sage: A.eigenvalues()

So A is an upper triangular matrix with diagonal entries 1, 2, 3. What do you
think the eigenvalues are? To me [3, 1] looks pretty obviously wrong.

Next step is looking at the code. Here's what I get from spending less
than two minutes on it: the eigenvalues are obtained by computing the
characteristic polynomial of A, and then factoring this polynomial. The
factorisation of the polynomial is outsourced to Pari. The actual
computation of the characteristic polynomial should, in this case, be
outsourced to Linbox.
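
In session form, the pipeline is roughly this (the delegation to Linbox
and Pari happens inside these calls; a sketch of the stages, not a claim
about where the bug lives):

sage: A = matrix(QQ, [[1,1,0],[0,2,0],[0,0,3]])
sage: f = A.charpoly()   # computed via Linbox for a dense matrix over QQ
sage: F = f.factor()     # factorisation handed off to Pari
sage: # f = (x - 1)*(x - 2)*(x - 3), so each root should appear exactly once

A dropped eigenvalue therefore means one of those two stages (or the glue
between them) silently misbehaved.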

So the heisenbug is very likely to be either in Pari or in Linbox or in
how we communicate with them. Maybe it's for some reason only seen on
OpenSolaris. (No, I have no real reason to believe this other than
doctesting that file 100 times on 32- and 64-bit Linux machines and not
seeing any failure.) Maybe it's cosmic radiation (no, I'm not serious).
Maybe it's the doctesting code getting bored of running the same test
over and over again and deciding to get creative with it (also not
serious about this one).

Best,
Alex

--
Alex Ghitza -- http://aghitza.org/
Lecturer in Mathematics -- The University of Melbourne -- Australia

Tim Daly

Aug 27, 2010, 7:59:05 AM
to sage-...@googlegroups.com
tl;dr we need to raise the standards and "get it right".

On getting different answers...

Some algorithms in computational mathematics use random
values. Depending on the source of random values you might
get different but correct answers. Random algorithms can
be very much faster than deterministic ones and might even
be the only available algorithms.

In Axiom the test is marked as using a random value so a
test which does not produce the same result every time is
still considered valid.

I do not know if Sage uses such algorithms and I do not
know if that is the source of your failure. If not, then
as you point out, the result is quite troubling. This might
be hard to establish in Sage as it uses several different
subsystems in some computations and who knows if they use
random algorithms?


On software engineering....

One needs to understand a problem to write correct software.
What is not understood will be where the problems arise.
Sage has the advantage of being written by domain experts
so the algorithms are likely of high quality. Unfortunately,
they rest on many layers of algorithms with many assumptions
(it is kittens all the way down).

Most current systems, at least in this generation, have the
participation of some of the original authors so problems
can be understood and fixed. That will not be true in the
future (some of Axiom's many authors have died).

In software engineering terms the question is, what happens
when the world expert in some area, such as Gilbert Baumslag
in Infinite Group Theory, writes a program to solve a problem
which later fails? The failure could be due to his lack of
understanding of underlying implementation details or it could
be because an upstream system has changed an assumption, as
is being discussed on another thread. That is, "code rot".

Without Gilbert, who can debug it? Do you throw the subsystem
away, as has been discussed with SYMPOW? If you do then you
have given up real expertise. If you don't then you can't trust
the results. It could be many years before another infinite
group theory AND computational mathematics expert can find and
fix the bug. Meanwhile, the code will continue to rot as more
things change under it.

Who can debug this in 30 years? ...

There are three potential attacks on this problem: documentation,
standard test suites, and program proofs.

Sage has been making doctests which are useful for bringing
potential failures to attention. Other than simple checks and
possible user examples they are useless for other purposes.

*Documentation* should consist of a detailed discussion of
the algorithm. Since we are doing computational mathematics the
discussion has to have both a mathematical part and an implementation
part. Since "computational mathematics" is not (yet) a widespread
and recognized department of study at many U.S. universities,
the authors of today are likely weak in either the mathematics or
the implementation. Hopefully this will change in the future.

*Standard test suites* involve testing the results against
published reference books (Schaums, Luke, A&S, Kamke, etc.). This can
make the testing less ad hoc. These tests also allow all of the
other systems to publish their compliance results. All software
engineers know that you can't write your own tests so this is a
good way to have a test suite and an excellent way to show users
that your system gets reasonable results.
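
For concreteness, a reference-backed check in Sage's doctest notation
might look like this (just a sketch I am inventing here; the expected
values come straight from the standard tables, so any system could be
tested against the same identities):

sage: integrate(exp(-x^2), x, -oo, oo)   # the Gaussian integral
sqrt(pi)
sage: integrate(1/(1 + x^2), x, 0, 1)    # arctan(1)
1/4*pi

The expected results are fixed by the mathematics, not by whatever the
system happened to print when the test was first written.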

*Program proofs* are important but not yet well established in this
area. In prior centuries a "proof" in mathematics involved a lot
of "proof by authority" handwaving. In the 20th century the standard
of a mathematical proof was much more rigorous. In computational
mathematics, if we have proofs at all they are likely just some
pre-rigorous handwaving that ignores implementation details.
Worse yet, almost none of the systems so far have any kind of
theoretical scaffolding on which to hang a proof. If you don't
know what a Ring is and you don't have a strong definition of a
Ring underlying your implementation choices, how can you possibly
prove the code correct?


Developing computational mathematics for the long term....

For *documentation*, I believe that we should require literate programs
which contain both the theory and the implementation. Each algorithm
should be of publication quality and peer-reviewed. (Carlo Traverso and
I have proposed a computational mathematics journal that accepts only
literate programs.) You should be able to go to a conference talk,
download the literate paper from a URL (hopefully a peer-reviewed
journal), and "execute the paper" during the presentation on your
system. Imagine the power of pointing out an algorithm failure
"in real time" and the pressure to make sure you get it right.

For *standard test suites*, I believe we should have these in many
different areas. We should collect a set of reference books that are
fairly comprehensive and jointly develop tests that run on all systems.
Rubi is doing this with rule based integration.
http://www.apmaths.uwo.ca/~arich/
Axiom is doing this with the Computer Algebra Test Suite
http://axiom-developer.org/axiom-website/CATS/index.html

If we publish and maintain the test suite results for all systems
there will be great pressure to conform to these results, great
pressure to develop standard algorithms that everyone uses, and
great pressure to improve areas of weakness.

For *program proofs* I can see where ACL2 and Coq can be integrated
into systems so that a published paper can include an automated
proof. This will raise the standard expected for professional
publications in computational mathematics. It will also move us
away from depending on domain experts who have the unfortunate
tendency to die off, both personally and professionally.

The 30 year horizon ...

We are nearly 50 years into computational mathematics.

It is time to develop departments and textbooks, raise the level
of expectation for peer-reviewed publications, develop not only
software engineering practices but also computational engineering
practices. Computational mathematics is the one area of software
where you can independently check your answers and you can expect
to write timeless software that gets the same answers 30 years later.
It really is worth the time to get it right.

To bring this diatribe back to earth and the original subject,
you have identified a symptom of a much larger problem with the
current systems. Unfortunately there is no peer-pressure to have
higher standards. I was hoping that the NIST Digital Library of
Mathematical Functions would be the touchstone of algorithms but
that is clearly not the case.

Perhaps someday there will be a sub-department of the NSF that
tries to organize, guide, and fund it, but nobody there has
"the vision and the will" to get it done. Maybe we should
nominate William as an NSF director :-)


Tim Daly

mhampton

Aug 27, 2010, 2:24:10 PM
to sage-devel
For the record, I tried the above calculation at least 250,000 times
on two macs (running OSX 10.5 and 10.6) and on Ubuntu 9.10 with an i7
860 processor, and had no errors. This was on Sage-4.5.2. I guess I'll
try again with 4.5.3; maybe it's related to the Pari upgrade.

Otherwise +1 to Alex's comments.

Marshall

Dr. David Kirkby

Aug 27, 2010, 3:37:50 PM
to sage-...@googlegroups.com
Pari has not been upgraded in 4.5.3 - it will be 4.6.0 before that's updated.

There have been many reports of where doctests fail when running

make ptest
or
make ptestlong

which later pass if run individually. I've lost count of the number of times
I've seen that reported, on every operating system under the Sun.

I'll post a list of the failures once I've run them 100 times.

I've got a feeling the frequency of failures might be related to system load,
though that's hard to prove. There have been long periods where all tests passed
every time, and I believe those were when I was not using the machine for much
else. The failures occurred on runs numbered 4, 11, 17, 21, 22, 23, 26, 34, 60, 66.

So 25 times in succession all tests pass, but then there are periods where there
are a number of failures.

Anyway, I'll just leave it running, and see what happens.

Dave

Dr. David Kirkby

Aug 27, 2010, 9:54:02 PM
to sage-...@googlegroups.com
On 08/27/10 12:59 PM, Tim Daly wrote:
> tl;dr we need to raise the standards and "get it right".
>
> On getting different answers...
>
> Some algorithms in computational mathematics use random
> values. Depending on the source of random values you might
> get different but correct answers. Random algorithms can
> be very much faster than deterministic ones and might even
> be the only available algorithms.

In this case the answer I got was just plain wrong.

So the problem is not a result of a valid algorithm giving different, but
correct answers each time.

> In Axiom the test is marked as using a random value so a
> test which does not produce the same result every time is
> still considered valid.

But this is not the problem here. A root was not found, for reasons we do not
know.

> I do not know if Sage uses such algorithms and I do not
> know if that is the source of your failure. If not, then
> as you point out, the result is quite troubling. This might
> be hard to establish in Sage as it uses several different
> subsystems in some computations and who knows if they use
> random algorithms?

It's not very reproducible. I've only seen it once in seventy-odd runs.

I suspect it could be an issue with the actual testing framework.
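
That said, if randomness were the culprit, Sage does expose a global hook
for pinning its own random state down, so a test that uses a randomised
algorithm can still be made reproducible. A minimal sketch (whether
subsystems such as Linbox respect this seed is something I have not
checked):

sage: set_random_seed(42)
sage: a = ZZ.random_element()
sage: set_random_seed(42)
sage: ZZ.random_element() == a   # same seed, same value
True

But here a root simply went missing, which no amount of seeding explains.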

> On software engineering....
>
> One needs to understand a problem to write correct software.
> What is not understood will be where the problems arise.
> Sage has the advantage of being written by domain experts
> so the algorithms are likely of high quality. Unfortunately,
> they rest on many layers of algorithms with many assumptions
> (it is kittens all the way down).

There is however a difference between good mathematical skills and the ability
to write good software. It is very clear to me that some software in Sage has
been written by experts in their fields of mathematics who are quite poor
at writing software.

> Most current systems, at least in this generation, have the
> participation of some of the original authors so problems
> can be understood and fixed. That will not be true in the
> future (some of the Axiom's many authors have died).

Though people may choose not to fix problems even if they are alive. People lose
interest. I could imagine that if someone went to work for Wolfram Research, they
would not appreciate him/her maintaining software used by Sage.

> In software engineering terms the question is, what happens
> when the world expert in some area, such as Gilbert Baumslag
> in Infinite Group Theory, writes a program to solve a problem
> which later fails? The failure could be due to his lack of
> understanding of underlying implementation details or it could
> be because an upstream system has changed an assumption, as
> is being discussed on another thread. That is, "code rot".
>
> Without Gilbert, who can debug it? Do you throw the subsystem
> away, as has been discussed with SYMPOW? If you do then you
> have given up real expertise. If you don't then you can't trust
> the results. It could be many years before another infinite
> group theory AND computational mathematics expert can find and
> fix the bug. Meanwhile, the code will continue to rot as more
> things change under it.


> Who can debug this in 30 years? ...

> There are three potential attacks on this problem, documentation,
> standard test suites, and program proofs.
>
> Sage has been making doctests which are useful for bringing
> potential failures to attention. Other than simple checks and
> possible user examples they are useless for other purposes.

Sage also has the ability to run the test suites provided by the developers of
the constituent parts. So for example the Python, mpir, mpfr and other test
suites can be run.

Unfortunately, most of the packages making up Sage do not have the required
spkg-check file to execute the tests, which is a real shame, as it often needs
to consist of little more than

make test

or

make check

cliquer, docutils and many other programs in Sage have test suites of their own,
which currently can't be executed.

> *Documentation* should consist of a detailed discussion of
> the algorithm. Since we are doing computational mathematics the
> discussion has to have both a mathematical part and an implementation
> part.

True. A lot of code does lack this.

> Since "computational mathematics" is not (yet) a widespread
> and recognized department of study at many U.S. universities,
> the authors of today are likely weak in either the mathematics or
> the implementation. Hopefully this will change in the future.

Maybe, though there may be cases where a person with reasonable mathematics
skills, but good programming skills, could work with someone who really
understands the maths, but not the computer implementation. It does not
necessarily follow that these have to be implemented by the same person.

> *Standard test suites*, which involves testing the results against
> published reference books (Schaums, Luke, A&S, Kamke, etc) This can
> make the testing less ad-hoc. These tests also allow all of the
> other systems to publish their compliance results.

Yes, though those tests need a lot of work. From the links you provided about
Rubi there were basically four outcomes:

1) Software produced the correct and simplest result.
2) Software produced a correct, but overly complex result.
3) Software failed to compute an answer.
4) Software gave the wrong answer.

I would imagine taking a large number of those tests, and putting them into Sage
would be a huge undertaking.

It's a shame all these systems (Mathematica, Maple, MATLAB, Sage, Axiom etc) all
have different syntaxes. If we were testing C compilers, they would all take the
same code. With the maths software, the tests have to be written for each and
every piece of software.

> All software
> engineers know that you can't write your own tests so this is a
> good way to have a test suite and an excellent way to show users
> that your system gets reasonable results.

It would be good engineering practice for the person writing the code not to
be the one writing the test suite. In practice, that will be less easy with
open-source software.

I honestly don't know how many Sage developers would actually have picked up a
book on software engineering, and read things like the person writing the code
should not be the person writing the test. My guess is not very many - which is
not helped by the fact that the books tend to be quite expensive.

> *Program proofs* are important but not yet well established in this
> area. In prior centuries a "proof" in mathematics involved a lot
> of "proof by authority" handwaving. In the 20th century the standard
> of a mathematical proof was much more rigorous. In computational
> mathematics, if we have proofs at all they are likely just some
> pre-rigorous handwaving that ignores implementation details.
> Worse yet, almost none of the systems so far have any kind of
> theoretical scaffolding on which to hang a proof. If you don't
> know what a Ring is and you don't have a strong definition of a
> Ring underlying your implementation choices, how can you possibly
> prove the code correct?

> Developing computational mathematics for the long term....
>
> For *documentation*, I believe that we should require literate programs
> which contain both the theory and the implementation. Each algorithm
> should be of publication quality and peer-reviewed.

Sounds good for a long term aim.

> For *standard test suites*, I believe we should have these in many
> different areas. We should collect a set of reference books that are
> fairly comprehensive and jointly develop tests that run on all systems.
> Rubi is doing this with rule based integration.
> http://www.apmaths.uwo.ca/~arich/

I read that - see above comments.

> Axiom is doing this with the Computer Algebra Test Suite
> http://axiom-developer.org/axiom-website/CATS/index.html
>
> If we publish and maintain the test suite results for all systems
> there will be great pressure to conform to these results, great
> pressure to develop standard algorithms that everyone uses, and
> great pressure to improve areas of weakness.

But if a test suite was considered a "gold standard" then I doubt it would be
too hard to add a bit of code to Sage which can do a particular integral in the
test suite. Just adding code for the purpose of passing a test would be
tempting. I'm sure if Wolfram Research thought it was in their commercial
interests, they could get Mathematica to pass all those Rubi tests.

I noticed that Mathematica appeared to do quite a bit better than Maple in those
tests. That did not totally surprise me. You probably are aware of Vladimir
Bondarenko, who is a somewhat strange character. But from some discussions I've
had with him, he is of the opinion that Wolfram Research took quality control
and bug reports more seriously than Maplesoft. But his way of reporting bugs was
not exactly conventional.

Dave

Robert Bradshaw

Aug 29, 2010, 2:07:57 AM
to sage-...@googlegroups.com
On Fri, Aug 27, 2010 at 12:37 PM, Dr. David Kirkby
<david....@onetel.net> wrote:
> On 08/27/10 07:24 PM, mhampton wrote:
>>
>> For the record, I tried the above calculation at least 250,000 times
>> on two macs (running OSX 10.5 and 10.6) and on Ubuntu 9.10 with an i7
>> 860 processor, and had no errors.  This was on Sage-4.5.2.  I guess I'll
>> try again with 4.5.3; maybe it's related to the Pari upgrade.
>>
>> Otherwise +1 to Alex's comments.
>>
>> Marshall
>>
> Pari has not been upgraded in 4.5.3 - it will be 4.6.0 before that's
> updated.
>
> There have been many reports of where doctests fail when running
>
> make ptest
> or
> make ptestlong
>
> which later pass if run individually. I've lost count of the number of times
> I've seen that reported, on every operating system under the Sun.

I'm willing to bet this is a problem with the doctesting system, not
Sage itself. (Actually, there's at least one known bug that I've
personally run into when running at high levels of parallelization:
http://trac.sagemath.org/sage_trac/ticket/9739 )

> I'll post a list of the failures once I've run them 100 times.
>
> I've got a feeling the frequency of failures might be related to system
> load, though that's hard to prove. There have been long periods where all
> tests passed every time, and I believe those were when I was not using the
> machine for much else. The failures occurred on runs numbered 4, 11, 17, 21,
> 22, 23, 26, 34, 60, 66.
>
> So 25 times in succession all tests pass, but then there are periods where
> there are a number of failures.
>
> Anyway, I'll just leave it running, and see what happens.
>

I ran the eigenvalues test 100,000 times on my Mac (Sage 4.5.2) and
never had it fail once. It looks like linbox is called to create the
charpoly, and IIRC linbox occasionally uses probabilistic methods by
default (with tiny but non-zero chance of failure). I wouldn't rule
out cosmic rays or other hardware failure--is your RAM ECC? With 12GB,
you're likely to see a bit flipped per day. Also, you're using a
(relatively) uncommon operating system and set of hardware. This can
be good for exposing bugs, but that's not a good thing when you're
trying to use the software. I have more confidence in software working
correctly on the system(s) used by its developers than in a port.
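
If the probabilistic path in linbox is the suspect, one sanity check would
be to force the algorithm choice and compare the two paths; a sketch,
assuming the algorithm keyword of charpoly for matrices over QQ accepts
these values (I haven't re-verified the exact spellings on 4.5.x):

sage: A = matrix(QQ, [[1,1,0],[0,2,0],[0,0,3]])
sage: A.charpoly(algorithm='linbox')    # the default, fast path
sage: A.charpoly(algorithm='generic')   # pure-Sage fallback, for comparison

If only the linbox path can ever drop a root, that narrows it down
considerably.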

In terms of the general rant, there are two points I'd like to make.
The first is that there's a distinction between the Sage library
itself and the many other spkgs we ship. By far the majority of your
complaints have been about various arcane spkgs. Sage is a
distribution, and we do try to keep quality up, but it's important to
note that much of this software is not as directly under our control,
and just because something isn't as good as it could be from a
software engineering perspective doesn't mean that it won't be
extremely useful to many people. Even if it has bugs. We try to place
the bar high for getting an spkg in but blaming the Sage community for
poor coding practices in external code is a bit unfair. I hold the
Sage library itself to a much higher standard.

The second point is that much of the older, crufty code in Sage was
put in at a time when standards were much lower, or even before there
was a referee process at all. I think this was necessary for the
time--Sage wouldn't have gotten off the ground if it hadn't been
useful so quickly. This includes in particular many of the spkgs that
have been grandfathered in and wouldn't make the cut now, but it takes
time to remove/replace/clean them up. Of course there's room for
improvement, but do you think the current review process is
insufficient and lots of new bad code is being written and included?
If so, what should we do better?

- Robert

Tim Daly

Aug 29, 2010, 4:56:40 AM
to sage-...@googlegroups.com
tl;dr old curmudgeon flaming on about the dead past, not "getting it"
about Sage.

Robert Bradshaw wrote:
>
>
> In terms of the general rant, there are two points I'd like to make.
> The first is that there's a distinction between the Sage library
> itself and the many other spkgs we ship. By far the majority of your
> complaints have been about various arcane spkgs. Sage is a
> distribution, and we do try to keep quality up, but it's important to
> note that much of this software is not as directly under our control,
> and just because something isn't as good as it could be from a
> software engineering perspective doesn't mean that it won't be
> extremely useful to many people. Even if it has bugs. We try to place
> the bar high for getting an spkg in but blaming the Sage community for
> poor coding practices in external code is a bit unfair. I hold the
> Sage library itself to a much higher standard.
>

The point that "software is not as directly under our control" is not
really valid.

This is a *design* decision of Sage, not a necessary evil. Sage
specifically chose to build on dozens of other pieces of software. Once
you make spkg functionality part of Sage functionality, you own it.

The statement that Sage tries "to place the bar high for getting an spkg
in" isn't actually much of a claim. I've watched the way spkgs get voted
onto the island and it usually involves a +1 by less than half a dozen
people. Would you really consider this to be placing "the bar high"? I'd
consider developing a test suite, or an API function-by-function code
review, or a line-by-line code review to be placing the bar high. At the
moment I see Sage writing test cases for python code but I don't see the
same test cases being pushed into the spkgs. Even where external test
cases are available (e.g. the computer algebra test suites for Schaums
and Kamke) I don't see them being run.

From a software engineering perspective there are some things that *are*
directly under Sage control such as the pexpect interfaces. How carefully
are these designed? Just yesterday I saw comments about the gensym
(question-mark variables) connections to Maxima not being handled. This
syntax is not a new Maxima feature so a pexpect interface design could
have taken this into account but it did not. Each pexpect interface
should be designed to be automatically constructed from the BNF of the
underlying spkg. This would eliminate the mismatch by design and be good
software engineering.

The conclusion that "blaming the Sage community for poor coding practices
in external code" as being "a bit unfair" is not valid. While it is
grossly unfair to
assume that spkgs are of poor quality, if your *design* calls for using
materials
of "unknown quality" it seems that a very large portion of your effort
*must*
involve quality reviews of spkgs. End users just see Sage.

Still to come will be the "code rot" issue. Open source packages tend to
have a very small number of active contributors. Projects tend to stop
when those people drift away. Once a package is no longer maintained it
stops working due to a lot of factors such as incompatible standards like
python 3.0, operating system changes like include files, architecture
changes like parallel builds, loss of primary development platforms like
the implosion of open solaris, etc. Recent examples of this in Sage might
be the Atlas 64bit issue (architecture), the Sympow issue (author loss),
the loss of pointful effort due to the death of open solaris (platform
death), the python GIL issue on multicore (software architecture), the
rise of python 3.x (software standards change), etc.

Now that the wave of new spkg adoption has slowed I expect to see a
growing need for maintaining "upstream" code. By *design*, their problems
are now your problems. Who will debug a problem that exists in 500,000
lines of upstream code? Who will understand the algorithms (e.g. sympow)
written by experts, some of whom are unique in the world, and debug them?

Writing new code is always fun. Maintaining old code you didn't write is
painful. But from an end-user perspective "it is all Sage" so all bugs
are "Sage bugs". That may seem unfair but the end-user won't know or care.

The belief that Sage will gradually rewrite the code pile it has (5
million lines?) into higher quality seems odd. For comparison, Axiom is
about 1 million things-of-code (lisp doesn't have "lines"). It took over
20 years and over 40 million dollars of funding. Scaling linearly, Sage
would take 100 years and 200 million dollars to be rewritten into "all
python". Frankly, I think the spkgs are going to be around for a very
long time.


> The second point is that much of the older, crufty code in Sage was
> put in at a time when standards were much lower, or even before there
> was a referee process at all.

When Axiom was written we were using Liskov's ideas directly from the
primary papers. I believe that we were the first system to dispatch not
only on the type of the arguments but also on the type of the return
(something that is still not common). But Axiom was developed as research
software, not with the intention of being brought to market as a product
(free or commercial). Sage is being developed with this intention.

Our choice of "standards" was to build on abstract algebra. There were a
great many debates about the right way to do things and we always went
back to the touchstone of what abstract algebra implied. At the time (40
years ago) there were no existing examples of computational mathematics
for many of the ideas so we had to invent them. Axiom set the standards
(e.g. integration) and they were quite high (Axiom still has the most
complete implementation). Sage has existing examples to guide it.

So at the time Sage was being developed there *were* standards in place.
You seem to feel that Sage was started "pre-standard" (2005?) and
"pre-referee" (ISSAC?).


> I think this was necessary for the
> time--Sage wouldn't have gotten off the ground if it hadn't been
> useful so quickly. This includes in particular many of the spkgs that
> have been grandfathered in and wouldn't make the cut now, but it takes
> time to remove/replace/clean them up. Of course there's room for
> improvement, but do you think the current review process is
> insufficient and lots of new bad code is being written and included?
> If so, what should we do better?
>

I *do* feel that the current review process in Sage is insufficient (see
my earlier diatribe).

I see reviews of bug fixes but I don't see reviews of spkgs. We are now
over 50 years into the development of computational mathematics and Sage
has the goal of competing with systems developed in the 1970/1980s, over
30 years ago. This would be a great thing if Sage were to deeply document
the algorithms, develop the standards, and/or prove the code correct but
I don't see anyone advocating any of these. I don't see anyone advocating
alternative ideas that would "raise the bar" in computational mathematics.

Even in the area of education I don't see anyone hammering on the NSF to
fund more efforts in computational mathematics. I don't see pushback to
NIST to standardize the algorithms. Obama wants to bring science back to
life and encourage research. As the largest group of academics I would
wish that you would petition the funding sources. Even if all of the
funds went to Sage I'd still feel that this was worthwhile.

In short, I don't see *change*.

Tim Daly
(curmudgeon and rjf wannabe)

Dr. David Kirkby

Aug 29, 2010, 6:47:25 AM
to sage-...@googlegroups.com
On 08/29/10 07:07 AM, Robert Bradshaw wrote:
> On Fri, Aug 27, 2010 at 12:37 PM, Dr. David Kirkby

>> There have been many reports of where doctests fail when running


>>
>> make ptest
>> or
>> make ptestlong
>>
>> which later pass if run individually. I've lost count of the number of times
>> I've seen that reported, on every operating system under the Sun.
>
> I'm willing to bet this is a problem with the doctesting system, not
> Sage itself. (Actually, there's at least one known bug that I've
> personally run into when running at high levels of parallelization:
> http://trac.sagemath.org/sage_trac/ticket/9739 )

I tend to agree.

I've run the parallel doctests 100 times now, and am now in the process of
running the serial doctests 100 times using the same compile of Sage. I will see
if serial is more reliable. I'm testing in 12 different directories, to speed
the process up. Otherwise it would take too long to run the long tests serially
100 times.


> I ran the eigenvalues test 100,000 times on my Mac (Sage 4.5.2) and
> never had it fail once. It looks like linbox is called to create the
> charpoly, and IIRC linbox occasionally uses probabilistic methods by
> default (with tiny but non-zero chance of failure). I wouldn't rule
> out cosmic rays or other hardware failure--is your RAM ECC? With 12GB,
> you're likely to see a bit flipped per day.

Yes, the machine has ECC RAM. It's also not a cheap PC that's been overclocked.

For example the RAM is 1333 MHz, but Sun reduce the clock speed of the RAM when
there's more than 6 GB. So it runs at a bit under that (I forget what). The
disks are enterprise grade disks, mirrored, with a ZFS file system which should
detect problems, correct them and record them. I've never had a single disk
problem. (The machine is under a year old anyway).

Of course it could be a hardware error, but if so it was not logged as such. But
I've only seen that error once, and can't reproduce it, unlike some other
issues, which seem to be related to pexpect. But I'll post more details later,
when I have done some more testing.

> Also, you're using a
> (relatively) uncommon operating system and set of hardware.

Yes.

> This can
> be good for exposing bugs, but that's not a good thing when you're
> trying to use the software. I have more confidence in software working
> correctly on the system(s) used by its developers than in a port.

Yes, I can see that. Though in mitigation I'd say that in the case of Solaris
(but not OpenSolaris), the underlying operating system is probably more stable
than typical (if not all) Linux systems. But I am using OpenSolaris on this
machine, despite knowing it's not as stable as Solaris.

> In terms of the general rant, there are two points I'd like to make.
> The first is that there's a distinction between the Sage library
> itself and the many other spkgs we ship. By far the majority of your
> complaints have been about various arcane spkgs. Sage is a
> distribution, and we do try to keep quality up, but it's important to
> note that much of this software is not as directly under our control,

Robert, I was going to reply earlier, before I saw Tim's post, but basically I
was going to say exactly the same as him. There is simply no logic to this
argument.

Sage is a collection of bash/python/perl/C/Fortran/Lisp etc and part of that
collection is the Sage library. But that is Sage. I think I used the analogy the
other day: Airbus can't say when a plane crashes "it's not our fault, we did not
make the part that failed". Airbus are responsible for the design of the
aircraft. What they choose is up to them. It's a design decision. Sage made a
design decision to include all these .spkg files.

I think Tim has made this point, so I don't really need to say more.

> and just because something isn't as good as it could be from a
> software engineering perspective doesn't mean that it won't be
> extremely useful to many people. Even if it has bugs.

Agreed. But from my own perspective, as a non-mathematician, I don't feel I can
trust Sage enough to do work that I'd want to do with it, because I have serious
concerns about the development process.

> We try to place
> the bar high for getting an spkg in but blaming the Sage community for
> poor coding practices in external code is a bit unfair.

As Tim and I both point out, that is not a valid argument.

> I hold the
> Sage library itself to a much higher standard.

I've not looked at that in detail, but I see comments from people developing
code in the library that lead me to think I would not trust too much what they do.

> The second point is that much of the older, crufty code in Sage was
> put in at a time when standards were much lower, or even before there
> was a referee process at all. I think this was necessary for the
> time--Sage would have gotten off the ground if it couldn't have been
> useful so quickly.

I have some sympathy with that view, though not an awful lot.

> This includes in particular many of the spkgs that
> have been grandfathered in and wouldn't make the cut now, but it takes
> time to remove/replace/clean them up. Of course there's room for
> improvement, but do you think the current review process is
> insufficient and lots of new bad code is being written and included?

I don't want to get into a rant, but there's one developer, who I would rather
refer to anonymously, but he complained when I did that, so I'll name him -
Nathann Cohen.

He seems a nice guy, but wrote in an email that was circulated to about 6 people
(myself and William included) that he did not want to worry too much about how
code might fail, but rather fix the problems if bugs are reported. I think
that is very bad practice. But not one single person on that list pointed out
the flaw in his logic. I raised the issue - perhaps I overreacted, but there
was nobody who actually told him that his methods were wrong.

I think there's simply a lack of the right mix of skills in developers.

> If so, what should we do better?

* I think a good start would be to try to attract some computer science students
to do some projects looking at how quality could be improved. In essence, I
don't believe Sage has the right mix of skill sets. An interesting project or
two could be made by getting someone with a more appropriate skill set to come
up with some suggestions.

Doing this might broaden the user base of Sage too.

* I don't know if William has funding to buy some books on software engineering,
and to encourage developers to read them. I think there's a lack of awareness of what
is considered good practice, and what is just asking for trouble.

Software engineering is not my background, though I believe I know more about it
than many of the developers. That's only because I've realised that to be
professional at something, one must know what the professionals do. As a
chartered engineer, I try to maintain professional standards.

He can buy me a copy of

http://www.amazon.com/Software-Engineering-9th-Ian-Sommerville/dp/0137035152/ref=sr_1_1?ie=UTF8&s=books&qid=1283077786&sr=8-1

if he wants.

* If there was funding, then pay someone with a strong background in a
commercial environment with knowledge of how to develop software. Someone good
would cost a lot of money. He/she should be interviewed by someone who knows the
subject well. Perhaps ask a prof from the CS department to be on an interview panel.

* Have regular "bug fix only" releases, where the aim is just to increase the
quality, and not to add features.

Nathann Cohen has said "-1" to that idea. William has said it would put people
off, and at least one other developer (I forget who) did not like it. But I feel
it's probably one of the easiest ways to improve Sage.

* Have a system like gcc, where the release numbers x.y.z mean something. Only
have z != 0 on bug-fix-only releases. Make it clear on the web site that the
x.y.0 releases are less well tested.

* Make release candidates available for anyone to report on.

* Have longer times between the "final" release candidate and a release date. I
expressed concern that 4.5.3.alpha2 was going to be released last Monday and
4.5.3 released on Friday. Luckily that has not happened.

* Something I suggested before, which would not be perfect, but would be useful,
is to have a "risk factor" on trac tickets.

* I think William is spot on in this blog post.

http://389a.blogspot.com/2010/08/computational-projects-part-1.html

There should be a difference between the code quality one develops for oneself
and what gets put into Sage. It sounds like William will be developing things
for himself, which won't be committed to Sage, as he will not have time to
document/test them sufficiently well. That's a good sign.

But I think a lot of the code in Sage is very specialised stuff that is only
useful to a very small number of people. IMHO, that should be in external
packages which people include if they want them. These bits of code will be
unmaintainable if the person including them leaves. I don't think they really
should be in the core Sage code. I think the R model is more appropriate.

* Spend more time on implementing well the things we have, rather than going on
to something else. There was for example a complaint on sage-support about a lack
of documentation for plotting 3D, with not all options documented.

Someone said the R interface is "rough around the edges". I'd like to see less
emphasis on sticking things into Sage, and a bit more on not having things that
are "round around the edges".

* Have design documents, which document how specific areas will be done. It
seems at the minute that someone has an idea for a new bit of code, they create
some code for the library, it gets reviewed and committed. I don't see any real
design documents.

* Run the self-tests for the packages. In many cases upstream packages have
test code, but the people that added the .spkg's never bothered looking at
running the tests.

* Tim, who clearly knows more about maths than me, has also given a list.

* There are more I could think of, but it's pointless listing them all. It
really does need someone with knowledge of best industry practices to look at
what happens.

* Stable and unstable branches, though that might be impractical due to the
lack of people wanting to be release managers. I think the bug-fix-only releases
would reduce the need for this a little, especially if people wanting a stable
release were advised to download a bug-fix-only release.

> - Robert
>


BTW, there's a useful little post on the Wolfram Research web site, where
people make comments about the next Mathematica release

http://blog.wolfram.com/2009/11/12/the-rd-pipeline-for-mathematica/

I guess the comments are genuine, but of course they are moderated. But a couple
strike me as showing what *users* of software want when they get a package like
Mathematica, and I think this will apply to many end users (non-developers) of Sage.

==========================================================================
I'd think the timescale involved is a couple of years, not months.

Let WRI use that time to get it right. Any releases with such amazing features
should be industrial-strength, not proof-of-concept. (Please, no more
proof-of-concept.)

Posted by Vince Virgilio
===========================================================================
I second Vince. Take the time to make sure that when it is released it is as
near to bug-free as possible.

Posted by Paul
============================================================================


Sorry the post is so long.

Dave

Dr. David Kirkby

Aug 29, 2010, 6:09:22 PM
to sage-...@googlegroups.com
On 08/29/10 09:56 AM, Tim Daly wrote:
> tl;dr old curmudgeon flaming on about the dead past, not "getting it"
> about Sage.
>
> Robert Bradshaw wrote:
>>
>>
>> In terms of the general rant, there are two points I'd like to make.
>> The first is that there's a distinction between the Sage library
>> itself and the many other spkgs we ship. By far the majority of your
>> complaints have been about various arcane spkgs. Sage is a
>> distribution, and we do try to keep quality up, but it's important to
>> note that much of this software is not as directly under our control,
>> and just because something isn't as good as it could be from a
>> software engineering perspective doesn't mean that it won't be
>> extremely useful to many people. Even if it has bugs. We try to place
>> the bar high for getting an spkg in but blaming the Sage community for
>> poor coding practices in external code is a bit unfair. I hold the
>> Sage library itself to a much higher standard.
> The point that "software is not as directly under our control" is not
> really valid.

Agreed.

> The statement that Sage tries "to place the bar high for getting an spkg
> in" isn't actually much of a claim. I've watched the way spkgs get voted
> onto the island and it usually involves a +1 by less than half a dozen
> people. Would you really consider this to be placing "the bar high"?

No, I don't think it places a high bar either.

It is probably seen as a high bar by those that do not have a software
engineering background. To those that do, I suspect they would conclude the same
as you and I.

Take a look at Sqlite's testing procedures. The test code is 647 times larger
than the actual code for the database. I doubt that level of attention to
detail would be practical in Sage development.

One needs to find a sensible compromise.

> I'd consider developing a
> test suite,
> or an API function-by-function code review, or a line-by-line code
> review to
> be placing the bar high.

Yes, though one does need to be practical about it. Those sorts of things are
essential in code for specific applications (medical, aeronautical), but are
probably not practical for Sage. I doubt anyone at Wolfram Research has ever
gone through every line of ATLAS code, but they use ATLAS.

> At the moment I see Sage writing test cases for
> python
> code but I don't see the same test cases being pushed into the spkgs.
> Even where
> external test cases are available (e.g. the computer algebra test suites
> for Schaums
> and Kamke) I don't see them being run.

That is changing. I've gone through the packages and created a list of those
that are missing the spkg-check files that will allow the self tests to be run.

http://trac.sagemath.org/sage_trac/ticket/9281

The new Pari package will run the test suite if SAGE_CHECK is set to "yes". I've
personally sorted out a couple of packages recently and am just doing cliquer
now.

Robert agreed with me the other day that running short test suites from
spkg-install (i.e. every build) was reasonable.
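
For anyone who hasn't looked at one: an spkg-check file is, in practice, a
small shell script that the build system runs after a package is built,
whenever the environment variable SAGE_CHECK is set to "yes". Purely to
illustrate the control flow - this is a sketch, not a real spkg-check, and
the "make check" target and path are placeholders - the logic is roughly:

    # Sketch only: real spkg-check files are shell scripts. This just
    # shows the control flow such a hook performs.
    import os
    import subprocess
    import sys

    def spkg_check(build_dir):
        """Run a package's own test suite after it has been built."""
        if os.environ.get("SAGE_CHECK") != "yes":
            return  # self-tests are opt-in; skip them by default
        # "make check" is the conventional test target for many upstream
        # packages; others use "make test" or a custom script.
        if subprocess.call(["make", "check"], cwd=build_dir) != 0:
            sys.exit("Error: the package's test suite failed.")

    spkg_check("/path/to/spkg/src")  # placeholder path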

> The conclusion that "blaming the Sage community for poor coding practices
> in external code" as being "a bit unfair" is not valid.

Agreed. The Sage community, in the most general sense, made decisions.

> Still to come will be the "code rot" issue. Open source packages tend to
> have a
> very small number of active contributors. Projects tend to stop when
> those people
> drift away.

I think this can be avoided to some extent by not adding to the core Sage
library very specialised items that are only of use to a few people. Just
because person X develops some code during his PhD, no matter how useful that
may be to him, I don't think it needs to be a standard part of Sage if it's only
going to be used by very few people.

> Now that the wave of new spkg adoption has slowed I expect to see a growing
> need for maintaining "upstream" code. By *design*, their problems are
> now your
> problems. Who will debug a problem that exists in 500,000 lines of
> upstream code?
> Who will understand the algorithms (e.g. sympow) written by experts,
> some of
> whom are unique in the world, and debug them?

How do you expect Wolfram Research, Maplesoft and similar deal with such issues?
They must hit them too. I suspect they have a few nightmares with this, but the
best way is probably to have decent documentation. If code is well commented,
and has references to papers where the algorithms are published, then it will
probably be maintainable.

> Writing new code is always fun. Maintaining old code you didn't write is
> painful.
> But from an end-user perspective "it is all Sage" so all bugs are "Sage
> bugs".
> That may seem unfair but the end-user won't know or care.

Exactly.

> The belief that Sage will gradually rewrite the code pile it has (5
> million lines?) into
> higher quality seems odd.

As you say, it will not happen.

> So at the time Sage was being developed there *were* standards in place.
> You seem
> to feel that Sage was started "pre-standard" (2005?) and "pre-referee"
> (ISSAC?).

> I see reviews of bug fixes but I don't see reviews of spkgs. We are now
> over 50 years
> into the development of computational mathematics and Sage has the goal
> of competing
> with systems developed in the 1970/1980s, over 30 years ago. This would
> be a great
> thing if Sage were to deeply document the algorithms, develop the
> standards, and/or
> prove the code correct but I don't see anyone advocating any of these. I
> don't see anyone
> advocating alternative ideas that would "raise the bar" in computational
> mathematics.

Given what you said a few days back - that there were few institutions teaching
computational mathematics - would you agree with my point that getting more
developers with computer science skills into Sage is a step in the right
direction towards raising the bar?

Architects design buildings. Builders build them. The architect and the builder
communicate and the result is a decent building.

> Tim Daly
> (curmudgeon and rjf wannabe)
>

Dave

Bill Hart

unread,
Aug 29, 2010, 8:41:10 PM8/29/10
to sage-devel
Why is this entire thread not on sage-flame? What does software
engineering, documentation, test code, etc. have to do with "Creating
a viable free open source alternative to Magma, Maple, Mathematica and
Matlab."?

I found the entire thread really amusing. I would parody the hell out
of it, but there are people here who I do not want to offend.

Software often seeks to be:

1) Fast
2) Correct
3) Comprehensible
4) Complete
5) User friendly
6) Developed in a timely manner

Open Source projects also have the aims:

7) Attract developers
8) Compete with commercial products

These are all competing aims. No one is going to rewrite all the
underlying C libraries in Python, ever, because of aim 1. No one cares
about broken spkgs because of 4, 6, 7. No one is going to take the
time to learn about software engineering, program proving or even
literate programming because of 6, 7 and often 1.

Any software project which chose Python as the primary language
certainly doesn't really care about 1. Any project which uses a whole
pile of inconsistent C libraries written by mathematicians, without
proper test suites, certainly doesn't care much about 2. Any project
which doesn't demand that all code be written in a literate style,
fully documented, doesn't really care about 3. Any system which is
developed by individuals implementing their fancy, maybe whatever
their research interest currently is, doesn't really care about 4. Any
software which gives a Python stack trace when something goes wrong,
probably didn't care that much about 5. Any project which isn't
completely modularised and which requires code review and merging of
patches into a single monolithic code base probably didn't care about
6. Any project which doesn't immediately merge contributions of
contributors before they bit rot isn't really hitting number 7. Any
project which has very few sources of serious funding certainly might
not manage 8.

Why attack Sage. It is what it is. Why defend it. It certainly didn't/
doesn't get everything right. One thing is for sure. Whatever is wrong
with Sage, it is almost certainly too late to fix it. Whatever is
right with Sage certainly made it popular.

If you want to make Sage seriously innovate to solve one or more of
the above, you need a *large* group of like-minded volunteers who can
help you. You won't do it on your own, no matter how many years you
work, nor how hard.

Many of the things Tim and David say resonate with me. I'd really,
really love a tremendously efficient, well-documented, reliable, Open
Source mathematical project. Having seen how insanely difficult even
just goal number 1 is for just the domain of arithmetic, I honestly
think we haven't got a chance. Not ever. The expertise doesn't exist in
sufficient quantity. And even those with the expertise don't have the
time. So, looks like we are stuck with what we got.

By the way, does anyone know what the current state of program proving
is? How does this work? Does one write the proof by hand then
formalise it in code in a formal system until it "parses"? Can someone
give me some references on this. I am genuinely interested. I've read
some comments about SML being good for program proving (why?), and
that the Haskell type system amounts to giving a "proof" under some
circumstances. But it is baking my noodle how any of this has anything
to do with proving programs, especially in mathematics. The "monad"
idea in Haskell gave me some hope, because it sounds mathematical, but
I simply didn't understand it, at the end of the day.

Moreover, what is the current state of artificial intelligence, with
respect to automated theorem proving? Do we understand anything at all
about the process which leads to the discovery of proofs? I can't
shake the feeling that language design is intimately interwoven with
these concepts. But I can't figure out if it is just because Lisp was
developed as part of a funded AI programme, or if there is something
more to it. Is there more to it than that a computer language is a
"way of interacting with a machine", which is a kind of "artificial
intelligence"?

Will Sage eventually be artificially intelligent?

Bill.

Tim Daly

unread,
Aug 29, 2010, 10:09:53 PM8/29/10
to sage-...@googlegroups.com

Bill Hart wrote:
> Why is this entire thread not on sage-flame? What does software
> engineering, documentation, test code, etc. have to do with "Creating
> a viable free open source alternative to Magma, Maple, Mathematica and
> Matlab."?
>

Despite what appears to be competitive badgering I really do want Sage
to succeed.
But I don't want Sage to spread the impression that mathematical
software can't be trusted.

I badger because computational mathematics is my chosen field of study
and I feel it is
*vital* to raise the standards to professional, high quality,
trustworthy levels. If Sage is going
to be the M$ of mathematical software, will it also convince everyone
that math software
just gives highly questionable answers? Every program makes mistakes but
hand-waving
about "it's not my problem, it's the upstream code" gives the whole
field a bad reputation.
And I very much care about that. This *isn't* software, this is
*computational mathematics*.

If advocating such project goals is considered "sage-flame" material
then we all lose.

...[snip]...

> Why attack Sage. It is what it is. Why defend it. It certainly didn't/
> doesn't get everything right. One thing is for sure. Whatever is wrong
> with Sage, it is almost certainly too late to fix it. Whatever is
> right with Sage certainly made it popular.
>

*sigh*


> If you want to make Sage seriously innovate to solve one or more of
> the above, you need a *large* group of like-minded volunteers who can
> help you. You won't do it on your own, no matter how many years you
> work, nor how hard.
>
> Many of the things Tim and David say resonate with me. I'd really,
> really love a tremendously efficient, well-documented, reliable, Open
> Source mathematical project. Having seen how insanely difficult even
> just goal number 1 is for just the domain of arithmetic, I honestly
> think we haven't got a chance. Not ever. The expertise doesn't exist in
> sufficient quantity. And even those with the expertise don't have the
> time. So, looks like we are stuck with what we got.
>

Re: "don't have the time"... Unlike any other software effort, programs
like Sage are timeless.
The integral of the sin(x) will always give the same answer, now or 30
years from now. Any
one individual does not have the time but "the project" does, assuming
it lasts. Would you
read a mathematics journal with the editorial policy "we'll print
anything because we don't
have the time?".


> By the way, does anyone know what the current state of program proving
> is? How does this work? Does one write the proof by hand then
> formalise it in code in a formal system until it "parses"? Can someone
> give me some references on this. I am genuinely interested. I've read
> some comments about SML being good for program proving (why?), and
> that the Haskell type system amounts to giving a "proof" under some
> circumstances. But it is baking my noodle how any of this has anything
> to do with proving programs, especially in mathematics. The "monad"
> idea in Haskell gave me some hope, because it sounds mathematical, but
> I simply didn't understand it, at the end of the day.
>

See http://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence
and http://userweb.cs.utexas.edu/users/kaufmann/tutorial/rev3.html#top
and http://coq.inria.fr/a-short-introduction-to-coq

The first one shows that writing programs and proving programs have a
correspondence. The second one shows ACL2 which is basically a way of
writing program proofs in lisp. The third one shows COQ which has a very
strong connection between mathematics and types.

In systems like Maxima and Axiom it would be possible to embed ACL2
directly into the system (lisp is lisp after all) and perform the proofs
inline.
Given the mathematical definition of a Ring and a function on the Ring
elements you could prove the function correct. For Axiom this is part of
the stated long term goals of the project.
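
As a tiny illustration of the genre (a machine-checked proof that a function
meets its specification), here is a sketch in Lean 4 - chosen only for
compactness; it is not ACL2 or Coq, the function is a toy, and it assumes a
recent Lean toolchain with the omega tactic:

    -- Define a function, then prove a specification about it. The
    -- checker accepts the file only if the proof is actually valid.
    def double (n : Nat) : Nat := n + n

    theorem double_correct (n : Nat) : double n = 2 * n := by
      unfold double   -- goal becomes: n + n = 2 * n
      omega           -- decision procedure for linear arithmetic

The Ring version is the same idea: state the property over a ring's elements
(a typeclass in place of Nat) and the proof is checked mechanically.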

On the other side (at the machine level) there is a program called FX, that
is, Function Extraction. FX grew out of the work by Richard Linger,
Harlan Mills, and Bernard Witt. It is being constructed at the Software
Engineering Institute. FX reads machine language, extracts the semantics,
and composes the instructions into blocks of behavior. You can't fully test
a program but FX covers *all* of the program behavior so you can identify
failing cases. See
http://daly.axiom-developer.org/TimothyDaly_files/publications/sei/Man75586.pdf
(disclaimer: I am one of the authors of the FX technology)

"Testing programs" is as ineffective as "testing theorems".
No matter how many examples you create, you don't have a proof.

Tim Daly

Bill Hart

unread,
Aug 29, 2010, 11:13:09 PM8/29/10
to sage-devel


On 30 Aug, 03:09, Tim Daly <d...@axiom-developer.org> wrote:
> Bill Hart wrote:
> > Why is this entire thread not on sage-flame? What does software
> > engineering, documentation, test code, etc. have to do with "Creating
> > a viable free open source alternative to Magma, Maple, Mathematica and
> > Matlab."?
>
> Despite what appears to be competitive badgering I really do want Sage
> to succeed.
> But I don't want Sage to spread the impression that mathematical
> software can't be trusted.

Whilst I certainly understand that you want Sage to succeed (I do too)
and I understand your concerns (I share them), I am much more
circumspect about what is actually possible.

>
> I badger because computational mathematics is my chosen field of study
> and I feel it is
> *vital* to raise the standards to professional, high quality,
> trustworthy levels.

I'll be blunt. I don't think Sage is ever going to do this. It was
once stated that having code accepted through the review process into
Sage should be considered equivalent to having a paper reviewed in a
top journal.

Whilst I once supported this as a goal for Sage, I no longer support
it. Knowing what I do now, as opposed to 9 months ago, say, about
coding, software, programming languages and even Sage, I cannot take
this seriously any more.

Sage is not designed to "raise these standards" and nor do I think it
should be a project goal to do so. It would be fatal to the project.

> If Sage is going
> to be the M$ of mathematical software, will it also convince everyone
> that math software
> just gives highly questionable answers?

I hope so. Because many people are currently under the illusion that
it does not.

The reason Sage is becoming the M$ of mathematical software (not quite
the analogy I'd make, but I think I understand your point), is that it
has stuck to some very important (social) rules, right from the start:

1) Anything is possible in 24 hours. Absolutely anything at all. Once
you start putting constraints on me, then less is possible. So if I am
not constrained, and I can do whatever I need, to satisfy you, then I
can do it in 24 hours. And why wait 24 years, when I can do it in 24
hours. The time it takes to get something done, even something useful,
depends only on how many constraints you add.

2) If some problem is exposed, there is no need to develop some
complicated, time consuming methodology to avoid the problem in
future. Simply work harder. If there are bugs, patch them. If there
are missing features, send me a patch.

3) Mathematicians respond very well to challenges. If you really want
something done, tell everyone it is really, really hard and you don't
think anyone will be able to do it, but you'd like to see someone try.
Add that you really think it will be nearly impossible. Make sure you
underline the fact that it is really, really hard, and no one has
succeeded so far. Keep repeating this. Eventually someone will take
the bait. This especially applies to problems all the experts agree are
hard. But here you need *young* players, who *know* the experts must
be wrong and that *they* will be the one to solve the problem (often
they are right).

4) If anyone does *anything* at all, make a big fuss about it and
email everyone you know who might be even tangentially interested in
what they have done. Under all circumstances encourage them to do
more.

> Every program makes mistakes but
> hand-waving
> about "it's not my problem, it's the upstream code" gives the whole
> field a bad reputation.

I don't recall who wrote that comment above, and I'm not going to
check. But they'll have a hard time living it down. I assume they were
not serious, or not thinking. Anyhow, it has been thoroughly debunked.

> And I very much care about that. This *isn't* software, this is
> *computational mathematics*.

Ah, that non-academic pursuit called *computational mathematics*,
which everyone would like to become a hard-core, well-funded academic
subject.

That is not going to happen because of Sage. That will happen because
of a single individual who has a brilliant new idea that puts the
whole subject on a solid theoretical footing. That new idea is going
to have to be absolutely brilliant. It will make our petty squabbling
look like just that. The fact is, none of us has had that big idea. It's
still out there. Literate programming, using more "scientific"
languages like Lisp, theorem proving, program proving. None of these
is the solution. Sage surely has nothing to offer in this regard. It
is a social success, and it is giving a lot of people interesting
things to do. As a community, it is very valuable to be part of. The
software also happens to do a *lot* of stuff, even useful stuff. It is
very successful, but not in the sense you want it to be. That idea is
still waiting to be had.

Of course this "single individual with a brilliant new idea" seems to
contradict what I say below about needing a large group of volunteers
to help you meet your goals. Maybe the two are not talking about the
same thing. Maybe they are and don't actually contradict one another.
I haven't decided yet.

>
> If advocating such project goals is considered "sage-flame" material
> then we all lose.

It's only sage-flame material in that it misses the fact that the
community simply doesn't seem that interested in these goals.

>
> ...[snip]...
>
>
>
>
>
> > Why attack Sage. It is what it is. Why defend it. It certainly didn't/
> > doesn't get everything right. One thing is for sure. Whatever is wrong
> > with Sage, it is almost certainly too late to fix it. Whatever is
> > right with Sage certainly made it popular.
>
> *sigh*

Seriously? Do you think there is a missed opportunity here? Do you
really think this crowd should be gainfully employed working on a
project with different goals? Do you think they would?

> > If you want to make Sage seriously innovate to solve one or more of
> > the above, you need a *large* group of like-minded volunteers who can
> > help you. You won't do it on your own, no matter how many years you
> > work, nor how hard.
>
> > Many of the things Tim and David say resonate with me. I'd really,
> > really love a tremendously efficient, well-documented, reliable, Open
> > Source mathematical project. Having seen how insanely difficult even
> > just goal number 1 is for just the domain of arithmetic, I honestly
> > think we haven't got a chance. Not ever. The expertise doesn't exist in
> > sufficient quantity. And even those with the expertise don't have the
> > time. So, looks like we are stuck with what we got.
>
> Re: "don't have the time"... Unlike any other software effort, programs
> like Sage are timeless.
> The integral of sin(x) will always give the same answer, now or 30
> years from now. Any
> one individual does not have the time but "the project" does, assuming
> it lasts.

The problem is, that socially, the project will not continue to
succeed, unless it appears to meet the goals of those involved in the
project. Go ahead and prove me wrong, but I don't think any amount of
"advocating" is going to fundamentally change the goals of all those
people. Nor is it going to rewrite those 500,000 lines of code.

> Would you
> read a mathematics journal with the editorial policy "we'll print
> anything because we don't
> have the time?".

I don't believe the editorial policy, nor the peer review process
produces more accurate mathematical papers any more than it produces
bug free software. I am not under any illusions about the correctness
of the published literature any more than I am about the correctness
of code in Sage.

> By the way, does anyone know what the current state of program proving
> > is? How does this work? Does one write the proof by hand then
> > formalise it in code in a formal system until it "parses"? Can someone
> > give me some references on this. I am genuinely interested. I've read
> > some comments about SML being good for program proving (why?), and
> > that the Haskell type system amounts to giving a "proof" under some
> > circumstances. But it is baking my noodle how any of this has anything
> > to do with proving programs, especially in mathematics. The "monad"
> > idea in Haskell gave me some hope, because it sounds mathematical, but
> > I simply didn't understand it, at the end of the day.
>
> See http://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence
> and http://userweb.cs.utexas.edu/users/kaufmann/tutorial/rev3.html#top
> and http://coq.inria.fr/a-short-introduction-to-coq

Thanks. I had heard of Coq. I will devour the other references with
gusto. Some computer science type question bothered me for 9 years,
until I finally saw that it had been solved. I hope this one doesn't
keep me occupied for as long. I cannot get it out of my mind.

>
> The first one shows that writing programs and proving programs have a
> correspondence. The second one shows ACL2 which is basically a way of
> writing program proofs in lisp.

I read an example of this earlier this evening. It was on using Lisp
to prove the complexity of some sorting algorithm or something. I was
pretty unconvinced, however, that this was anything more than using
Lisp as a glorified calculator to symbolically prove the induction
required to prove the complexity.

Nonetheless, I nearly swallowed my food the wrong way when I realised
that Lisp separates symbols from the values they are bound to. And I
couldn't shake the feeling that the secret to program proving lay here
somewhere. I hope your references might lead me to be able to
understand why I feel that.

Somehow I also have the feeling that the fly-in-the-ointment with this
sort of thing turns out to be arithmetic. Sorry I can't put this gut
feeling into words that make sense. So I won't try.

> The third one shows COQ which has a very
> strong connection between mathematics and types.

Ok.

>
> In systems like Maxima and Axiom it would be possible to embed ACL2
> directly into the system (lisp is lisp after all) and perform the proofs
> inline.

Of course, though it sounds like I can just use Lisp, and don't need
Axiom or Maxima.

> Given the mathematical definition of a Ring and a function on the Ring
> elements you could prove the function correct. For Axiom this is part of
> the stated long term goals of the project.

Just out of interest, what do you mean by prove correct? Do you mean,
prove the algorithm correct, or the implementation, or the complexity,
or that the function means what you agreed it means according to some
contract?

>
> On the other side (at the machine level) there is a program called FX, that
> is, Function Extraction. FX grew out of the work by Richard Linger,
> Harlan Mills, and Bernard Witt. It is being constructed at the Software
> Engineering Institute. FX reads machine language, extracts the semantics,
> and composes the instructions into blocks of behavior. You can't fully test
> a program but FX covers *all* of the program behavior so you can identify
> failing cases. See http://daly.axiom-developer.org/TimothyDaly_files/publications/sei/Man75586.pdf
> (disclaimer: I am one of the authors of the FX technology)

Sounds interesting. I will take a look. Thanks.

>
> "Testing programs" is as ineffective as "testing theorems".
> No matter how many examples you create, you don't have a proof.

Yes. This is the level at which people should have confidence in
(current) mathematical software.

>
> Tim Daly

Bill.

David Kirkby

unread,
Aug 30, 2010, 2:41:15 AM8/30/10
to sage-...@googlegroups.com
On 30 August 2010 01:41, Bill Hart <goodwi...@googlemail.com> wrote:
> Why is this entire thread not on sage-flame? What does software
> engineering, documentation, test code, etc. have to do with "Creating
> a viable free open source alternative to Magma, Maple, Mathematica and
> Matlab."?

Everything!

Correct software engineering (which by definition includes
documentation and test code) is a prerequisite for creating a viable
alternative to Magma, Maple, Mathematica and MATLAB.

I find it strange that you even ask such a question.

Dave

Dr. David Kirkby

unread,
Aug 30, 2010, 4:59:53 AM8/30/10
to sage-...@googlegroups.com
On 08/30/10 03:09 AM, Tim Daly wrote:
>
>
> Bill Hart wrote:
>> Why is this entire thread not on sage-flame? What does software
>> engineering, documentation, test code, etc. have to do with "Creating
>> a viable free open source alternative to Magma, Maple, Mathematica and
>> Matlab."?
> Despite what appears to be competitive badgering I really do want Sage
> to succeed.
> But I don't want Sage to spread the impression that mathematical
> software can't be trusted.

I don't think mathematical software can be trusted - at least not 100%. But then
no code can.

Humans make errors, and if I recall correctly there's a 10^-18 chance of an
uncorrected bit error in the CPU I use (Xeon 3.33 GHz W3580). When you think of
the clock rate, errors are probably more frequent than we would like.
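
(A back-of-the-envelope reading, assuming that 10^-18 figure is per clock
cycle: 3.33*10^9 cycles/s * 10^-18 = 3.3*10^-9 errors per second, or roughly
one uncorrected error per decade of continuous running on a single core -
small, but no longer negligible across a cluster or a months-long build.)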

In any case, I don't think the world at large will judge mathematical software
on the basis of Sage. People are not going to say "I don't trust Mathematica,
Maple or MATLAB" just because errors have been found in Sage.

One thing to be aware of is the probability of errors in tables you use - but
then I'm sure you are aware of that. Wolfram Research claim to have found
numerous errors in tables in maths books. Of course, in general they don't do
the ethical thing and report such errors in a way that everyone can verify their
claims, and if correct put a line through that result in their book.

But there are exceptions.

http://library.wolfram.com/infocenter/TechNotes/4196/

has a title of "Errors found by Mathematica in Gradshteyn and Ryzhik, "Tables of
Integrals, Series, and Products" (4th Edition)."

The contents are just a text file

http://library.wolfram.com/infocenter/TechNotes/4196/GradshteynRyzhik.txt

> If Sage is going
> to be the M$ of mathematical software, will it also convince everyone
> that math software
> just gives highly questionable answers? Every program makes mistakes but
> hand-waving
> about "it's not my problem, it's the upstream code" gives the whole
> field a bad reputation.

No, doing that gives Sage a bad reputation - you cannot generalise that to all
mathematical software. Robert's statement, which I suspect he wishes he had not
made, would enhance the reputation of other software packages in comparison to
Sage.

As strange as he is, and as irritating as he is sometimes, Vladimir Bondarenko
has shown that there are numerous bugs in Maple and Mathematica. I don't know
all the techniques he uses (he keeps them secret as he hopes to market his
software), though I could postulate at some of them.

>> Many of the things Tim and David say resonate with me. I'd really,
>> really love a tremendously efficient, well-documented, reliable, Open
>> Source mathematical project. Having seen how insanely difficult even
>> just goal number 1 is for just the domain of arithmetic, I honestly
>> think we haven't got a chance. Not ever. The expertise don't exist in
>> sufficient quantity. And even those with the expertise, don't have the
>> time. So, looks like we are stuck with what we got.

I think Sage's mission statement is overly optimistic.

I don't believe for one minute that Sage can ever become a viable alternative
for many people. It no doubt is for some. But that does not make Sage useless.
If I thought Sage was useless, I would not bother subscribing to sage-devel,
nor devote considerable time to porting Sage to Solaris.

> "Testing programs" is as ineffective as "testing theorems".
> No matter how many examples you create, you don't have a proof.
>
> Tim Daly

Tim, you make some good points. You are clearly a talented person in your field.
But I suspect that it's simply not practical for the Sage community, Wolfram
Research, or MathWorks to prove their code correct. All such projects would
grind to a snail's pace if they tried to do it. You need to be realistic about what is, and
what is not reasonably practical.

I know Sage does not have an aim to execute Mathematica programs directly, but
it would be incredibly useful if there were a Mathematica -> Sage translator. The
Rubi test suite is already available in Mathematica format. Creating such a tool
would be an interesting student project. Unfortunately, it would require
knowledge of Sage, Mathematica, parsers and a whole lot of other skills that
would probably be hard to find in one individual student.
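
Just to give a flavour of the problem (and of why a real translator needs a
proper parser), here is a deliberately naive sketch. It handles only a
trivial fragment, the head table is hypothetical, and it would be wrong on
anything involving lists, Part, or pattern syntax:

    # Toy Mathematica -> Sage translator for a tiny expression fragment
    # (a few known heads, [] function application, and the ^ operator).
    import re

    HEADS = {"Sin": "sin", "Cos": "cos", "Log": "log", "Sqrt": "sqrt"}

    def translate(expr):
        # Rewrite known heads: Sin[...] -> sin(...)
        for mma, sage in HEADS.items():
            expr = re.sub(r"\b%s\[" % mma, sage + "(", expr)
        # Mathematica applies functions with []; Sage/Python use ().
        expr = expr.replace("[", "(").replace("]", ")")
        # Power is ^ in Mathematica, ** in Sage/Python source.
        return expr.replace("^", "**")

    print(translate("Sin[x]^2 + Cos[x]^2"))  # sin(x)**2 + cos(x)**2

A real tool would have to parse Mathematica's full grammar and map whole
families of functions, which is exactly why it would need the mix of skills
mentioned above.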

Dave

Robert Bradshaw

unread,
Aug 30, 2010, 3:51:10 PM8/30/10
to sage-...@googlegroups.com
On Sun, Aug 29, 2010 at 1:56 AM, Tim Daly <da...@axiom-developer.org> wrote:
> tl;dr old curmudgeon flaming on about the dead past, not "getting it" about
> Sage.
>
> Robert Bradshaw wrote:
>>
>>
>> In terms of the general rant, there are two points I'd like to make.
>> The first is that there's a distinction between the Sage library
>> itself and the many other spkgs we ship. By far the majority of your
>> complaints have been about various arcane spgks. Sage is a
>> distribution, and we do try to keep quality up, but it's important to
>> note that much of this software is not as directly under our control,
>> and just because something isn't as good as it could be from a
>> software engineering perspective doesn't mean that it won't be
>> extremely useful to many people.  Even if it has bugs. We try to place
>> the bar high for getting an spkg in but blaming the Sage community for
>> poor coding practices in external code is a bit unfair. I hold the
>> Sage library itself to a much higher standard.
>>
>
> The point that "software is not as directly under our control" is not really
> valid.
>
> This is a *design* decision of Sage, not a necessary evil. Sage specifically
> chose
> to build on dozens of other pieces of software.

I think reusing the large body of existing code is a necessary evil to
achieve our goals. Unfortunately, I think it is also a huge source of
problems, and we can all agree the code is of varying quality (some of
it is great, some not so much...).

> Once you make spkg functionality part of Sage functionality, you own it.

To fully "own" the code we need to either (1) fork (2) get the code
fixed upstream or (3) re-write it entirely ourselves. We have taken
all three of these routes in various cases, but all take a huge amount
of duplicated effort.

Sometimes the route we currently take is "this code exists, let's try
to make use of it" which has compromises. So there is the large number
of spkgs, some of which cause many headaches, that few people work on,
and then there is the core library which, though not without its
issues, I feel is in better shape and where most of the work goes.

> The statement that Sage tries "to place the bar high for getting an spkg in"
> isn't
> actually much of a claim. I've watched the way spkgs get voted onto the
> island
> and it usually involves a +1 by less than half a dozen people. Would you
> really
> consider this to be placing "the bar high"? I'd consider developing a test
> suite,
> or an API function-by-function code review, or a line-by-line code review to
> be placing the bar high. At the moment I see Sage writing test cases for
> python
> code but I don't see the same test cases being pushed into the spkgs. Even
> where
> external test cases are available (e.g. the computer algebra test suites for
> Schaums
> and Kamke) I don't see them being run.

You're right, the bar isn't high. My main point is that we are trying
to raise it. It used to take almost nothing for an spkg to go in.

> From a software engineering perspective there are some things that *are*
> directly under Sage control such as the pexpect interfaces. How carefully
> are these designed? Just yesterday I saw comments about the gensym
> (question-mark variables) connections to Maxima not being handled. This
> syntax is not a new Maxima feature so a pexpect interface design could have
> taken this into account but it did not. Each pexpect interface should be
> designed
> to be automatically constructed from the BNF of the underlying spkg. This
> would eliminate the mismatch by design and be good software engineering.
>
> The conclusion that "blaming the Sage community for poor coding practices
> in external code" as being "a bit unfair" is not valid. While it is grossly
> unfair to
> assume that spkgs are of poor quality, if your *design* calls for using
> materials
> of "unknown quality" it seems that a very large portion of your effort
> *must*
> involve quality reviews of spkgs. End users just see Sage.

Personally, I think there's a distinction between "the Sage community
writes code of questionable quality" and "the Sage community uses code
of questionable quality." Now I'm not saying that everyone here has
excellent software development skills (which is far from the truth)
but what I do think is disingenuous are comments of the form "spkg x.y.z
has compiler warnings, therefore the Sage community doesn't know how to
write good code."

Would I like to see such issues fixed? Yes, for sure. But sometimes
treating an spkg as a black box that does what you ask it to gets the
job done. Hopefully over time the poorly-written or poorly maintained
packages get fixed/replaced. (I see the spkg model staying with us for
a long time, hopefully the average quality going up--there are a lot
of solid ones.)

Yes, spkgs are a problem.

> We are now over 50 years
> into the development of computational mathematics and Sage has the goal of
> competing
> with systems developed in the 1970/1980s, over 30 years ago. This would be a
> great
> thing if Sage were to deeply document the algorithms, develop the standards,
> and/or
> prove the code correct but I don't see anyone advocating any of these. I
> don't see anyone
> advocating alternative ideas that would "raise the bar" in computational
> mathematics.
>
> Even in the area of education I don't see anyone hammering on the NSF to
> fund more
> efforts in computational mathematics. I don't see pushback to NIST to
> standardize the
> algorithms. Obama wants to bring science back to life and encourage
> research. As the
> largest group of academics I would wish that you would petition the funding
> sources.
> Even if all of the funds went to Sage I'd still feel that this was
> worthwhile.
>
> In short, I don't see *change*.

If I understand you correctly, you want to set the goal for Sage much
higher than just a free, open alternative to the Ma*s.

- Robert

Robert Bradshaw

unread,
Aug 30, 2010, 4:11:52 PM8/30/10
to sage-...@googlegroups.com

Yep. One quote I like is "Add tests until fear turns into boredom"
(not sure who that's originally from). Of course relative tolerance
of fear and boredom can vary from individual to individual.

>> I'd consider developing a
>> test suite,
>> or an API function-by-function code review, or a line-by-line code
>> review to
>> be placing the bar high.
>
> Yes, though one does need to be practical about it. Those sorts of things
> are essential in code for specific applications (medical, aeronautical), but
> are probably not practical for Sage. I doubt anyone at Wolfram Research has
> ever gone through every line of ATLAS code, but they use ATLAS.

+1. We certainly have room for improvement here (and kudos to you,
David, in particular for doing a lot of work in cleaning/testing what
we do have). It's a question of allocation of limited resources.

>> Still to come will be the "code rot" issue. Open source packages tend to
>> have a
>> very small number of active contributors. Projects tend to stop when
>> those people
>> drift away.
>
> I think this can be avoided to some extent by not adding to the core Sage
> library very specialised items that are only of use to a few people. Just
> because person X develops some code during his PhD, no matter how useful
> that may be to him, I don't think it needs to be a standard part of Sage if
> it's only going to be used by very few people.

I agree. The gulf between standard and optional is too large.

>> Now that the wave of new spkg adoption has slowed I expect to see a
>> growing
>> need for maintaining "upstream" code. By *design*, their problems are
>> now your
>> problems. Who will debug a problem that exists in 500,000 lines of
>> upstream code?
>> Who will understand the algorithms (e.g. sympow) written by experts,
>> some of
>> whom are unique in the world, and debug them?
>
> How do you expect Wolfram Research, Maplesoft and similar deal with such
> issues? They must hit them too. I suspect they have a few nightmares with
> this, but the best way is probably to have decent documentation. If code is
> well commented, and has references to papers where the algorithms are
> published, then it sill probably be maintainable.

They don't have a public bug repository, show their code, or let
people compile it on their own machines. As an employer, they have
more power to tell people what to work on. And they've got a lot more
experience at it. Whatever their "secret," I doubt they'll divulge it if
we ask them :). In what I've seen, they are also aiming at (or at
least hitting) a much smaller target audience in terms of cutting edge
research.

- Robert

Robert Bradshaw

unread,
Aug 30, 2010, 4:51:18 PM8/30/10
to sage-...@googlegroups.com
On Sun, Aug 29, 2010 at 3:47 AM, Dr. David Kirkby
<david....@onetel.net> wrote:
> On 08/29/10 07:07 AM, Robert Bradshaw wrote:
>>
>> On Fri, Aug 27, 2010 at 12:37 PM, Dr. David Kirkby
>
> Of course it could be a hardware error, but if so it was not logged as such.
> But I've only seen that error once, and can't reproduce it, unlike some
> other issues, which seem to be related to pexpect. But I'll post more
> details later, when I have done some more testing.

Does sound like your hardware is solid (though you never know :-).

There are two different goals that people have here. One is to build a
solid, bug free piece of mathematical software, ideally conforming to
all the good software engineering principles, building everywhere,
well tested, etc. The other goal that people have here is to make Sage
into something useful for their current research and teaching needs.
While these two goals certainly complement each other, they have
vastly different priorities.

My concern is that if too much effort is placed on the former, Sage as
a platform for mathematicians to develop and share *useful* code will
suffer. No matter how well designed and bug free a piece of software
is, if it isn't useful, or won't be for another 30 years, than it
looses value and will have a hard time prospering. (Some things are
worth laying foundations for decades into the future, but I wouldn't
call software one of them.)

I don't have a solution to this dilemma. Streamlining the review
process would help, but might not be enough. Perhaps we need to have an
unstable branch (don't know how we'd create a "stable" trusted branch,
take stuff out?) that has lower standards, and a stable branch with
higher standards, and code could migrate from one to the other. This
is both a technical and social question.

It's still a huge step up from "I wrote some code, here's what I
found" or even "I'll post a tarball to my website, good luck."

>> This includes in particular many of the spkgs that
>> have been grandfathered in and wouldn't make the cut now, but it takes
>> time to remove/replace/clean them up. Of course there's room for
>> improvement, but do you think the current review process is
>> insufficient and lots of new bad code is being written and included?
>
> I don't want to get into a rant, but there's one developer, who I would
> rather refer to anonymously, but he complained when I did that, so I'll name
> him - Nathann Cohen.
>
> He seems a nice guy, but wrote in an email that was circulated to about 6
> people (myself and William included), that he did not want to worry too much
> about how code might fail, but rather fix the problems if there are bugs
> reported. I think that is very bad practice. But not one single person on
> that list pointed out this flaw in this logic. I raised the issue - perhaps
> I overreacted, but there was nobody that actually told him that his methods
> were wrong.

I don't know if I saw this email, but I'd have a problem with this
development style as well.

> I think there's simply a lack of the right mix of skills in developers.
>
>> If so, what should we do better?
>
> * I think a good start would be to try to attract some computer science
> students to do some projects looking at how quality could be improved. In
> essence, I don't believe Sage has the right mix of skill sets.

Welcome, all CS students! On a more serious note, we have had one CS
student look at the security of the notebook for a master's thesis,
but more would be nice.

> An interesting project or two could be made by getting someone with a more
> appropriate skill set to come up with some suggestions.
>
> Doing this, might broaden the user base of Sage too.
>
> * I don't know if William has funding to buy some books on software
> engineering, and to encourage developers to read them. I think there's a lack of awareness
> of what is considered good practice, and what is just asking for trouble.
>
> Software engineering is not my background, though I believe I know more
> about it than many of the developers. That's only because I've realised that
> to be professional at something, one must know what the professionals do. As
> a chartered engineer, I try to maintain professional standards.
>
> He can buy me a copy of
>
> http://www.amazon.com/Software-Engineering-9th-Ian-Sommerville/dp/0137035152/ref=sr_1_1?ie=UTF8&s=books&qid=1283077786&sr=8-1
>
> if he wants.

Some of what we do may be due to ignorance, but, as I've said before,
it's often a question of allocation of resources (mostly time).

> * If there was funding, then pay someone with a strong background in a
> commercial environment with knowledge of how to develop software. Someone
> good would cost a lot of money. He/she should be interviewed by someone who
> knows the subject well. Perhaps ask a prof from the CS dependent to be on an
> interview panel.

If there was funding, most people would probably rather spend it on
someone who could develop new algorithms to further the state of
mathematical research. If there were funding for many people, perhaps
a percentage could go to funding someone with a CS background just to
focus on software development issues.

> * Have regular "bug fix only" releases, where the aim is just to increase
> the quality, and not to add features.
>
> Nathann Cohen has said "-1" to that idea. William has said it would put
> people off, and at least one other developer (forget who) did not like it.
> But I feel it's probably one of the easiest ways to improve Sage.

I'm unconvinced that this would help, and agree it could put people
off. We could bump all non-bugfix tickets to the next release but I
don't think it'd increase the actual number of bugs fixed. (Bug-fix
Sage days work well though.)

> * Have a system like gcc, where the release numbers x.y.z mean something.
> Only have z != 0 on bug-fix-only releases. Make it clear on the web site
> that the x.y.0 releases are less well tested.

+1

> * Make release candidates available for anyone to report on.

+1

> * Have longer times between the "final" release candidate and a release
> date. I expressed concern that 4.5.3.alpha2 was going to be released last
> Monday and 4.5.3 released on Friday. Luckily that has not happened.

+1

> * Something I suggested before, which would not be perfect, but would be
> useful, is to have a "risk factor" on trac tickets.

That's an interesting idea.

> * I think William is spot on this blog post.
>
> http://389a.blogspot.com/2010/08/computational-projects-part-1.html
>
> There should be a difference between the quality of code one develops for
> oneself, and what gets put into Sage. It sounds like William will be
> developing things for himself, which won't be committed to Sage, as he will
> not have time to document/test them sufficiently well. That's a good sign.
>
> But I think a lot of the code in Sage consists of very specialised things that are
> only useful to a very small number of people. IMHO, that should be in
> external packages which people include if they want them.  These bits of
> code will be unmaintainable if the person including them leaves. I don't
> think they really should be in the core Sage code. I think the R model is
> more appropriate.

I do feel that code improves going through the review process and
getting it up to snuff to get into Sage. It'd be sad if there's
decreased motivation to do so (but I totally understand).

Something like adding doctests to otherwise stable code could make a
student project.
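
For anyone reading along who hasn't written one: a doctest is just an example
session embedded in a docstring, which the test runner replays and compares
against the recorded output. A minimal sketch (a toy function, not Sage's own
fibonacci):

    def fib(n):
        r"""
        Return the n-th Fibonacci number.

        EXAMPLES::

            sage: fib(10)
            55
            sage: fib(0)
            0
        """
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a

Stable code plus doctests like these is exactly the kind of bounded,
reviewable task a student could take on.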

>  * Spend more time on implementing well the things we have, rather than going
> on to something else. There was, for example, a complaint on sage-support about
> a lack of documentation for plotting 3D, with not all options documented.

This is a volunteers/scratch your own itch issue. It would be nice and
useful if it were easier for end users to add to the documentation.

> Someone said the R interface is "rough around the edges". I'd like to see
> less emphasis on sticking things into Sage, and a bit more on not having
> things that are "round around the edges".

Yes. Sometimes we just take the best that's out there rather than not
have it at all.

>  * Have design documents, which document how specific areas will be done. It
> seems at the minute that someone has an idea for new bit of code, they
> create some code for the library, it gets reviewed and committed. I don't
> see any real design documents.

We used to do "sage enhancement proposals" now and then, but I haven't
seen one in a while.

>  * Run the self-tests for the packages. In many cases upstream packages have
> test code, but the people that added the .spkg's never bothered looking at
> running the tests.

+1

>  * Tim, who clearly knows more about maths than me, has also given a list.
>
>  * There are more I could think of, but it's pointless listing them all. It
> really does need someone with knowledge of best industry practices to look
> at what happens.
>
>  * Stable and unstable branches, though that might be impractical due to the
> lack of people wanting to be release managers. I think the bug-fix-only
> releases would reduce the need for this a little, especially if people
> wanting a stable release were advised to download a bug-fix-only release.

The spkgs don't fit well into the revision control/branching concept,
but perhaps it could be done.

- Robert

Tim Daly

unread,
Aug 30, 2010, 5:59:58 PM8/30/10
to sage-...@googlegroups.com

> How do you expect Wolfram Research, Maplesoft and similar deal with such
> issues? They must hit them too. I suspect they have a few nightmares with
> this, but the best way is probably to have decent documentation. If code is
> well commented, and has references to papers where the algorithms are
> published, then it will probably be maintainable.


Actually I've had the same "literate programming", deep documentation
discussion with
developers from Wolfram and Maple. According to them, neither program
has good
internal documentation. They can rely on the fact that they can pay
people to maintain
the source and the "knowledge continuity". I have begged them to improve it.

Why would *I* care? Because of the future. Think about things in the
long term.

There are very few companies that last 50 years. Indeed, Maplesoft was
bought recently.
What would happen if Wolfram and MapleSoft died? Well, in the worst
case, the failing
company just disappears. Nobody can run MMA or Maple code anymore. This
creates
a *huge* black hole in computational mathematics. (Think Derive and
Texas Instruments).

There is the possibility that the code might be released but it is a
very slim chance.
Companies that go bankrupt get their assets sold. The source code would
be the major
asset and there would be a lot of people wanting to get paid so they
won't likely make
the code open source. (Think Macsyma and Symbolics)

But let's assume the best case, that MMA and/or Maple became open source.
Here I have a great deal of experience because *I* got the open source
version of
Axiom which was a very large, very complex program. Axiom was sold
commercially
as "one of the big 3". Internally it was mostly code with about 3 lines
of useful
documentation. The "released" version of the code lacked the front end
(saturn),
the back end (nag libraries), the new compiler (aldor), and required a
running
Axiom in order to build Axiom so it had to be restructured to
self-bootstrap.

The only reason that Axiom came back to life is that I was one of the
original
developers and had a deep understanding of the system internals. I'm not
bragging,
I'm simply pointing out that it is difficult to organize and repair a
"million things of code".
Even with my deep understanding of internals (some of which I wrote) it
still took a
year to make it available. I found that I couldn't understand my own
code and I always
try to write dirt-simple code, unfortunately without comments.

Now imagine that someone handed you the (partial but mostly complete)
source code for
MMA or Maple. And imagine that it contained 3 useful comments.
Is there any chance of rebuilding these programs into runnable systems?
I think not.

Now imagine that MMA and Maple were fully literate and could be read
like a novel.

Which would you prefer?

Which scenario benefits computational mathematics in the long term?

Now apply the same lesson to Sage. Assume that 30 years from now, none
of the
original developers are connected with the code and there is no one to
ask. It will happen.

Tim Daly

rjf

unread,
Aug 30, 2010, 8:12:45 PM8/30/10
to sage-devel

I read this thread with some interest. I'm not sure anything new is
being said; just different
people saying it. Well, not all different.

I am curious as to the claim that Sage is very popular.
Compared to?

Maxima for Windows has seen something like 285,000 downloads. But
that includes people
who downloaded various versions. The most popular one had 80,600
downloads;
the most recent release (8/13/2010) has had 7,500 in the last 17 days.

Presumably some people use Maxima who don't download it from
sourceforge, where I found
this data. I suppose people who download Sage do not count in these
totals.

I know that Sage does all kinds of other things, simulating Magma and
whatever. Is that
what you are comparing it to?

Is there some statistical summary somewhere?

As for whether this belongs on sage-flame or not... I think this
discussion is totally mainstream,
but then, so are a number of discussions on sage-flame.

RJF


Bill Hart

unread,
Aug 30, 2010, 9:09:30 PM8/30/10
to sage-devel
I think/suspect Sage is more popular than Maxima if you go by all time
download statistics.

I don't think it is as popular as Maple or Mathematica yet.

I think the claim was that it is becoming the M$ of mathematical
software. I suspect that means "default standard" or something.
Actually, I didn't ask. Tim, what does it mean?

Bill.

Bill Hart

unread,
Aug 30, 2010, 9:37:43 PM8/30/10
to sage-devel


On Aug 30, 8:51 pm, Robert Bradshaw <rober...@math.washington.edu>
wrote:
I think the thing that is required is:

(4) carefully test the upstream functions that we use in Sage.
(5) review the upstream code, at least each function used in Sage.

I know it won't happen though, because there's too much of it to
review and too many functions to test. It's too late to fix this
problem.

Sage may never have gotten off the ground if this were a requirement.
It would grind to a halt if this became one. So it will never happen.

>
> Sometimes the route we currently take is "this code exists, let's try
> to make use of it" which has compromises. So there is the large number
> of spkgs, some of which cause many headaches, that few people work on,
> and then there is the core library which, though not without its
> issues, I feel is in better shape and where most of the work goes.

I think you mean that most of the *reviewing* work goes into the Sage
library code.

There's plenty of work going into the spkgs. That work just isn't
necessarily under the direct control of the Sage project.

How many lines of Sage python/cython code are there? About 1.5M or so?

GMP/MPIR and FLINT together get up towards 0.5M lines of code, and
that's just two of nearly 100 spkgs....

>
>
>
>
>
> > The statement that Sage tries "to place the bar high for getting an spkg in"
> > isn't
> > actually much of a claim. I've watched the way spkgs get voted onto the
> > island
> > and it usually involves a +1 by less than half a dozen people. Would you
> > really
> > consider this to be placing "the bar high"? I'd consider developing a test
> > suite,
> > or an API function-by-function code review, or a line-by-line code review to
> > be placing the bar high. At the moment I see Sage writing test cases for
> > python
> > code but I don't see the same test cases being pushed into the spkgs. Even
> > where
> > external test cases are available (e.g. the computer algebra test suites for
> > Schaums
> > and Kamke) I don't see them being run.
>
> You're right, the bar isn't high. My main point is that we are trying
> to raise it. It used to take almost nothing for an spkg to go in.

It's not really just about raising a bar. One part of a spkg may be
very solid and another part very poor. Without some finer control over
what is used by Sage, the problem will always exist.

As it is at the moment, spkgs get in based on needed functionality and
a cursory examination of a few vitals, not because of a systematic
review of all the code in them.

Of course, excluding packages that are not actively maintained might
be helpful. The trouble is, any rule you make, there'll always be
exceptions.

>
>
>
>
>
> > From a software engineering perspective there are some things that *are*
> > directly under Sage control such as the pexpect interfaces. How carefully
> > are these designed? Just yesterday I saw comments about the gensym
> > (question-mark variables) connections to Maxima not being handled. This
> > syntax is not a new Maxima feature so a pexpect interface design could have
> > taken this into account but it did not. Each pexpect interface should be
> > designed
> > to be automatically constructed from the BNF of the underlying spkg. This
> > would eliminate the mismatch by design and be good software engineering.
>
> > The conclusion that "blaming the Sage community for poor coding practices
> > in external code" as being "a bit unfair" is not valid. While it is grossly
> > unfair to
> > assume that spkgs are of poor quality, if your *design* calls for using
> > materials
> > of "unknown quality" it seems that a very large portion of your effort
> > *must*
> > involve quality reviews of spkgs. End users just see Sage.
>
> Personally, I think there's a distinction between "the Sage community
> writes code of questionable quality" and "the Sage community uses code
> of questionable quality." Now I'm not saying that everyone here has
> excellent software development skills (which is far from the truth)
> but what I do think is disingenuous are comments of the form "spkg x.y.z
> has compiler warnings, therefore the Sage community doesn't know how to
> write good code."

You forget that numerous members of the Sage community write spkgs,
not just Python. And just because C gives compiler warnings instead of
leaving everything to runtime or not giving feedback at all, doesn't
make the python/cython code more solid. I am certain the python/cython
code in Sage is just as much to blame for the bugs in Sage.

>
> Would I like to see such issues fixed? Yes, for sure. But sometimes
> treating an spkg as a black box that does what you ask it to gets the
> job done. Hopefully over time the poorly-written or poorly maintained
> packages get fixed/replaced. (I see the spkg model staying with us for
> a long time, hopefully the average quality going up--there are a lot
> of solid ones.)
>

The other option would be to reimplement everything in Cython, except
for a very small core which needs to be in assembly.
I don't think there is anything wrong with Tim's sentiments. He's
really just directing them at the wrong project. Sage doesn't have
those aims, nor do any of the MA*'s. He needs a project with those
specific aims that he can direct those sentiments towards.

Unfortunately, the number of people who will think it is sexy to work
on such a project is pretty small. So it really is a project that will
take 30 years. Who knows if it can keep pace with the development of
computers over that time.

Bill.

rjf

unread,
Aug 31, 2010, 12:23:50 AM8/31/10
to sage-devel
The idea that Sage persons should re-implement something in Python or
Cython is based on a notion that there is a problem with existing and
(apparently) working code, because it is written in an
(allegedly) unsuitable implementation language. Furthermore, this
notion also extends to a claim that this problem can be remedied by
assigning this procedure to an (apparently) inexperienced programmer
who is (apparently) not necessarily mathematically well-versed, and
has limited time (e.g. summer) to do the job.

Yet even "easy" tasks like multiplying polynomials can require some
subtlety to do fast.
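
For instance, the schoolbook method costs O(n^2) coefficient
operations, while the Karatsuba trick already gets that down to about
O(n^1.59), and FFT-based methods do better still. A toy Python sketch
of Karatsuba -- dense coefficient lists, lowest degree first; real
libraries such as FLINT are vastly more sophisticated:

    from itertools import zip_longest

    def schoolbook(f, g):
        # Quadratic-time baseline, used for small inputs below.
        if not f or not g:
            return []
        out = [0] * (len(f) + len(g) - 1)
        for i, a in enumerate(f):
            for j, b in enumerate(g):
                out[i + j] += a * b
        return out

    def karatsuba(f, g):
        # Split f = f0 + x^m*f1 and g = g0 + x^m*g1; the cross terms
        # come from (f0+f1)*(g0+g1) - f0*g0 - f1*g1, trading one
        # recursive multiplication for a handful of additions.
        if not f or not g:
            return []
        n = max(len(f), len(g))
        if n <= 16:
            return schoolbook(f, g)
        m = n // 2
        f0, f1 = f[:m], f[m:]
        g0, g1 = g[:m], g[m:]
        low, high = karatsuba(f0, g0), karatsuba(f1, g1)
        fs = [a + b for a, b in zip_longest(f0, f1, fillvalue=0)]
        gs = [a + b for a, b in zip_longest(g0, g1, fillvalue=0)]
        mid = karatsuba(fs, gs)
        out = [0] * (len(f) + len(g) - 1)
        for i, c in enumerate(low):
            out[i] += c
            out[i + m] -= c
        for i, c in enumerate(high):
            out[i + m] -= c
            out[i + 2 * m] += c
        for i, c in enumerate(mid):
            out[i + m] += c
        return out

Getting the same idea to run *fast* -- cache behaviour, crossover
thresholds, memory management -- is where the real subtlety lies.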

And "difficult" tasks like (say) symbolic integration fail to have
even one entirely adequate implementation to use as a model.

Using other people's programs is, of course, not such a great idea if
those programs are buggy, but distributing your own programs with bugs
is not really much of an improvement.

Just my few cents.


RJF




Robert Bradshaw

unread,
Aug 31, 2010, 1:10:06 AM8/31/10
to sage-...@googlegroups.com

My point exactly.

>> Sometimes the route we currently take is "this code exists, let's try
>> to make use of it" which has compromises. So there is the large number
>> of spkgs, some of which cause many headaches, that few people work on,
>> and then there is the core library which, though not without its
>> issues, I feel is in better shape and where most of the work goes.
>
> I think you mean that most of the *reviewing* work goes into the Sage
> library code.

Yes, though I would say that most of the work done by people, e.g. on
this list, goes into the Sage library code as well.

> There's plenty of work going into the spkgs. That work just isn't
> necessarily under the direct control of the Sage project.
>
> How many lines of Sage Python/Cython code are there? About 1.5M or so?
>
> GMP/MPIR and FLINT together get up towards 0.5M lines of code, and
> that's just two of nearly 100 spkgs....

I don't mean to discount the work that goes into spkgs at all--after
all I work on Cython myself. And there are spkgs like Python and
SQLite that are heavily worked on completely outside our community.

>> Would I like to see such issues fixed? Yes, for sure. But sometimes
>> treating an spkg as a black box that does what you ask it to gets the
>> job done. Hopefully over time the poorly-written or poorly maintained
>> packages get fixed/replaced. (I see the spkg model staying with us for
>> a long time, hopefully the average quality going up--there are a lot
>> of solid ones.)
>>
>
> The other option would be to reimplement everything in cython, except
> for a very small core which needs to be in assembly.

I'm not holding my breath :).

Like Axiom.

I wasn't saying that those weren't worthwhile goals, just that they're
not Sage's.

- Robert

Rob Beezer

unread,
Aug 31, 2010, 2:01:31 AM8/31/10
to sage-devel
On Aug 29, 5:41 pm, Bill Hart <goodwillh...@googlemail.com> wrote:
> Why attack Sage. It is what it is. Why defend it. It certainly didn't/
> doesn't get everything right. One thing is for sure. Whatever is wrong
> with Sage, it is almost certainly too late to fix it. Whatever is
> right with Sage certainly made it popular.

"Whatever is right" is easy. When you want to explore a mathematical
topic programmatically you don't need to start from scratch. There's
high-precision arithmetic (Bill Hart did that), there's graph
isomorphism (Robert Miller did that), there's exact linear algebra
(William Stein, Robert Bradshaw, David Kohel, etc, etc did that). You
can write what interests you and pull in great code for the parts you
need but can't write or don't want to write. And when it is wrong
(not if), you can isolate the problem, and if you can't fix it,
there's a good chance somebody else will care enough to fix it.
Sometimes even promptly. And in the process Sage gets incrementally
better. That's the beauty of Sage for me and I don't believe it can
be said about any other project.
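
As a toy illustration of that composability (a made-up session; each
line leans on a different contributed component):

    sage: factorial(2010) % 2011   # bignum arithmetic; 2011 is prime
    2010
    sage: graphs.PetersenGraph().is_isomorphic(graphs.KneserGraph(5, 2))
    True
    sage: matrix(QQ, 3, 3, range(9)).rank()   # exact linear algebra
    2

(The first line is Wilson's theorem in action; none of us wrote the
arithmetic, the isomorphism test, or the rank computation ourselves.)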

Rob

Tim Daly

unread,
Aug 31, 2010, 3:15:06 AM8/31/10
to sage-...@googlegroups.com

> I think the claim was that it is becoming the M$ of mathematical
> software. I suspect that means "default standard" or something.
> Actually, I didn't ask. Tim, what does it mean?
>
I was making the assumption that Sage managed to achieve "success" by
being widely adopted and replacing the 4Ms. The bulk of the discussion
rests on that assumption. If that assumption is not true and Sage
disappears, nobody cares.

Tim Daly

Tim Daly

unread,
Aug 31, 2010, 4:46:21 AM8/31/10
to sage-...@googlegroups.com

>
> If I understand you correctly, you want to set the goal for Sage much
> higher than just a free, open alternative to the Ma*s.
>
> - Robert
>
>
Yes, but why am I trying to do that?

Computational mathematics is a new field of study, at least in the
symbolic area. It is the child of the union of mathematics and
computers. Unlike other forms of software, computational mathematics
software (CMS) has the property that it will always be able to give
timeless answers, making it useful forever.

Being useful forever does not imply that the software survives forever.
You can argue that this is good Darwinian attrition. But CMS software
is very hard to build and requires a great deal of very scarce
expertise that can disappear at any time (e.g. changing jobs, being
hired into a company like MapleSoft/Wolfram/Magma/etc., companies being
bought or closed). When that happens, and it will, that portion of the
software becomes unmaintainable or unavailable.

The natural response to dead software is to write new software. You can
see this in CMS-land, as there are over one hundred attempts at
building CAS programs (I collected them at one point, similar to the
Sage goal). But due to the expertise issue these programs don't get
very far. Sage wants to rewrite the Trager/Bronstein integration, but
that seems unlikely as the person with the required expertise
(Bronstein) has died and the code isn't documented (yet).

Sage is trying to avoid the Darwin problem by gluing together existing
systems. This is a very clever idea, a "second generation" response.

What I am trying to do is ask the question...
"What does a computational mathematics system need to do to live forever?"
in particular, in this forum,
"What does Sage need to do to live forever?"

Sage is exciting, fast moving, has no time to do it right, would die of
documentation ossification, couldn't possibly prove anything (as if proofs
are foreign to mathematics), needs to be released twice a day, is in a
foot-race with the 4Ms for mind-share, needs quantity not quality, will
let the next-generation-slubs document their work, is 3ns faster than
M*, etc.

I am one of the first-generation dinosaurs trying to impart some of the
lessons learned to the next generation. I am taking the long view,
trying to figure out how to impart computational mathematics to my
grandchildren. Will they still be writing new polynomial packages and
arguing over ZZ? Will they have to watch yet another hundred threads
discussing the same issues? Will they suggest that William Stein should
move his comments to the flame thread? Or will they have a broad legacy
of excellent CMS work upon which to build?

My experience tells me that William will move on, Python will move on,
the funding will dry up, the students will be hired into real jobs, and
Sage will die of code rot as GCC, Python, architectures, and build
complexity all work away at its foundations. The system will devolve
into tracking down hard bugs in "other people's code"; people will find
that hard without documentation, and not worth doing because it isn't
flashy. Suppose William had instead proposed that "Sage was a project
to find and fix all the bugs in Maxima". How many people would join?
Now imagine Barry Brightspot proposing "a project to find and fix all
the bugs in Sage"....

To those who work on Sage... Why? Do you just want to build yet-another-CAS?
Do you just want a platform for your thesis work? Do you just want to
write code that gets thrown away? Or would you rather have your name
appear in the credits list of Sage-2090 as one of the Newtons of
computational mathematics?

I am advocating that Sage set its goals much higher than replacing the
Ms. If you don't, then my experience tells me that the project will
die. If you do, then the project may die anyway, but other people can
build on the remains rather than starting from scratch. Or you just
might have a formula for the long term.

What do *you* think needs to be done to make Sage live forever? I have
thought about this question for a long time and I'm trying to pass it
on. Your experience may tell you that my suggestions are wrong, but
you'll only be able to know after the fact, and by then it will be too
late.

Anyway, I've said about all I want to say so I'm abandoning this topic.
Good luck and thanks for all the fish.

Tim Daly

Harald Schilly

unread,
Aug 31, 2010, 6:54:19 AM8/31/10
to sage-devel
On Aug 30, 11:59 pm, Tim Daly <d...@axiom-developer.org> wrote:
> Now apply the same lesson to Sage. Assume that 30 years from now, none
> of the
> original developers are connected with the code and there is no one to
> ask. It will happen.

I didn't read this thread but just about that comment: I think the
solution to that is constant reinvention. Hopefully new people will
join the project and from time to time parts of the system will be
rewritten. Sometimes forced (Python 2 -> Python 3) or sometimes just
out of necessity (coercion system). So, just like a living organism,
old parts die or are replaced by new parts that do the same or do it
better ... (right now, for example, I want to code a smartphone client
that communicates with Sage, but the simple server API doesn't do what
I want. I guess I'll have to rewrite it. That's an example of old code
that will be rejuvenated out of necessity.)

H

Dr. David Kirkby

unread,
Aug 31, 2010, 6:32:36 PM8/31/10
to sage-...@googlegroups.com
On 08/30/10 09:51 PM, Robert Bradshaw wrote:
> On Sun, Aug 29, 2010 at 3:47 AM, Dr. David Kirkby
>
> There are two different goals that people have here. One is to build a
> solid, bug free piece of mathematical software, ideally conforming to
> all the good software engineering principles, building everywhere,
> well tested, etc. The other goal that people have here is to make Sage
> into something useful for their current research and teaching needs.
> While these two goals certainly complement each other, they have
> vastly different priorities.

IMHO, if Sage is to be a viable alternative to the 4M's, it needs to address the
former.

>>> If so, what should we do better?
>>
>> * I think a good start would be to try to attract some computer science
>> students to do some projects looking at how quality could be improved. In
>> essence, I don't believe Sage has the right mix of skill sets.
>
> Welcome, all CS students! On a more serious note, we have had one CS
> student look at the security of the notebook for a master's thesis,
> but more would be nice.

That's something William and others need to actively seek out though - ask in CS
departments.

One CS student project is useful, but that must be a very small fraction of the
total number of student projects.

>> He can buy me a copy of
>>
>> http://www.amazon.com/Software-Engineering-9th-Ian-Sommerville/dp/0137035152/ref=sr_1_1?ie=UTF8&s=books&qid=1283077786&sr=8-1
>>
>> if he wants.
>
> Some of what we do may be due to ignorance, but, as I've said before,
> it's often a question of allocation of resources (mostly time).

But if anyone is going to spend a lot of time doing something, it would
make sense to reduce one's level of ignorance.

I believe the time/effort spent finding out how to write better
software would reap benefits in the medium and long term.

>> * If there was funding, then pay someone with a strong background in a
>> commercial environment with knowledge of how to develop software. Someone
>> good would cost a lot of money. He/she should be interviewed by someone who
>> knows the subject well. Perhaps ask a prof from the CS department to be on an
>> interview panel.
>
> If there was funding, most people would probably rather spend it on
> someone who could develop new algorithms to further the state of
> mathematical research. If there were funding for many people, perhaps
> a percentage could go to funding someone with a CS background just to
> focus on software development issues.

I suspect you are right - people would rather spend the money on someone
developing algorithms. But what area? Ask 20 random Sage developers and you are
likely to get 15 different answers.

I think this was clear when William tried to get a list of the most annoying
bugs. There were barely any common ones. People have different interests. As
such, an extra person working on algorithms in field X is probably only going
to benefit a small fraction of the Sage community. In contrast, someone
who could improve the procedures could benefit everyone.

A year spent cleaning up Sage's procedures would benefit everyone - a year spent
developing algorithms would probably have far less overall positive impact.

>> * Have regular "bug fix only" releases, where the aim is just to increase
>> the quality, and not to add features.
>>
>> Nathann Cohen has said "-1" to that idea. William has said it would put
>> people off, and at least one other developer (forget who) did not like it.
>> But I feel it's probably one of the easiest ways to improve Sage.
>
> I'm unconvinced that this would help, and agree it could put people
> off. We could bump all non-bugfix tickets to the next release but I
> don't think it'd increase the actual number of bugs fixed. (Bug-fix
> Sage days work well though.)

I should elaborate on what I mean by a "bug-fix release". I would include:

* Bug fixes
* Added documentation
* Extra test code

I think that having releases where new features were not introduced,
but those three areas were addressed, would result in a concentrated
effort to reduce the bugs.

But, that was NOT the main reason for suggesting it.

My reason was that, basically, by having "bug fix" releases you would
create some releases which are more stable than others. Those would be
more suitable for people who don't want to keep upgrading every couple
of weeks. They might choose stability in preference to features. I
believe there are a lot of people in that category.

>> * Have a system like gcc, where the release numbers x.y.z mean something.
>> Only have z != 0 on bug-fix-only releases. Make it clear on the web site
>> that the x.y.0 releases are less well tested.
>
> +1

But to do that effectively, one needs to have bug-fix only releases - just like
gcc does.

>> * Make release candidates available for anyone to report on.
>
> +1

I was pretty sure you were against that a few weeks ago - saying they should
subscribe to sage-devel and would be aware of them. Perhaps it was someone else.
Sorry if I'm mistaken.

I think making the release candidates public for a couple of weeks would be
good. I've lost count of the number of times a release is made, only for it to
be realised there's a fairly serious problem a couple of days later.

>> * Have longer times between the "final" release candidate and a release
>> date. I expressed concern that 4.5.3.alpha2 was going to be released last
>> Monday and 4.5.3 released on Friday. Luckily that has not happened.
>
> +1

We do seem to agree on quite a bit.

>> * Something I suggested before, which would not be perfect, but would be
>> useful, is to have a "risk factor" on trac tickets.
>
> That's an interesting idea.

I think it's one worth pursuing.

>> * I think William is spot on this blog post.
>>
>> http://389a.blogspot.com/2010/08/computational-projects-part-1.html
>>
>> There should be a difference between the code quality one develops for
>> oneself, and what gets put into Sage. It sounds like William will be
>> developing things for himself, which won't be committed to Sage, as he will
>> not have time to document/test them sufficiently well. That's a good sign.
>>
>> But I think a lot of the code in Sage consists of very specialised things
>> that are only useful to a very small number of people. IMHO, that should be in
>> external packages which people include if they want them. These bits of
>> code will be unmaintainable if the person including them leaves. I don't
>> think they really should be in the core Sage code. I think the R model is
>> more appropriate.
>
> I do feel that code improves going through the review process and
> getting it up to snuff to get into Sage. It'd be sad if there's
> decreased motivation to do so (but I totally understand).

Yes, I agree code would improve with the review process. I would not
discount having optional packages reviewed myself. I just feel that
having very specialised code in the core is not such a good idea.
Arbitrarily, I'd say that if fewer than 100 people are likely to use
it, then don't add it to the core. That would make the core less
susceptible to bit rot.

If you look at the Wolfram Research Library, there are a whole load of
optional packages available, contributed by users. I assume they have
gone through at least some form of review before being put on the
Wolfram web site.

Actually, it's quite funny: since 2004 there has been a very nice
library for a polar plot of a list of data.

http://library.wolfram.com/infocenter/MathSource/2061/

Then in Mathematica 6, Wolfram Research added ListPolarPlot[]

http://reference.wolfram.com/mathematica/ref/ListPolarPlot.html

The funny thing is, the old contributed code for creating polar plots of a list
of data is *much* better than what's in the core Mathematica code.

> Something like adding doctests to otherwise stable code could make a
> student project.

Yes.
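
For concreteness, adding a doctest mostly means writing an EXAMPLES
block that the test harness (sage -t) executes and compares against
the stated output. A made-up toy function, purely for illustration:

    def reverse_coefficients(f):
        r"""
        Return the polynomial whose coefficients are those of ``f``
        in reverse order.

        (Invented for illustration; the point is the ``EXAMPLES``
        block, which ``sage -t`` runs automatically.)

        EXAMPLES::

            sage: R.<x> = QQ[]
            sage: reverse_coefficients(3*x^2 + 2*x + 1)
            x^2 + 2*x + 3
        """
        return f.parent()(list(reversed(f.list())))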

But an equally good, perhaps even better, student project would be to
look at whether there are more appropriate ways to test some parts of
Sage. Again, that is one for a CS-type student.

>> Someone said the R interface is "rough around the edges". I'd like to see
>> less emphasis on sticking things into Sage, and a bit more on not having
>> things that are "rough around the edges".
>
> Yes. Sometimes we just take the best that's out there rather than not
> have it at all.

But R itself is not "rough". It is a well respected package, with commercial
users and commercial support. But I'm told the Sage integration to R is rough.
I'd like to see a bit more emphasis on having things with smooth edges before
they get integrated.

>> * Have design documents, which document how specific areas will be done. It
>> seems at the minute that someone has an idea for new bit of code, they
>> create some code for the library, it gets reviewed and committed. I don't
>> see any real design documents.
>
> We used to do "sage enhancement proposals" now and then, but I haven't
> seen one in a while.

I've not seen one at all.

I have on and off worked on Sage for at least 5 years, though I lost interest at
one point when it was clear the Solaris fixes Michael had were not being
integrated.

>> * Run the self-tests for the packages. In many cases upstream packages have
>> test code, but the people that added the .spkg's never bothered looking at
>> running the tests.
>
> +1

This one really is a "no brainer" but unfortunately a very small fraction of
packages in Sage do this.
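
The mechanics are not the hard part. An spkg-check script only has to
run the upstream test suite and fail loudly; a hypothetical minimal
version might be (real spkg-check files are typically shell scripts,
but the shape is the same in any language):

    #!/usr/bin/env python
    # Hypothetical minimal spkg-check -- illustrative only.  Assumes
    # the usual spkg layout, with upstream sources unpacked in src/.
    import os, subprocess, sys

    os.chdir("src")
    if subprocess.call(["make", "check"]) != 0:
        sys.exit("Error: the test suite for this package failed.")

The hard part is the one-off effort of finding each package's test
target and deciding what to do when it fails.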

Dave

Dr. David Kirkby

unread,
Aug 31, 2010, 7:18:46 PM8/31/10
to sage-...@googlegroups.com
On 08/31/10 11:32 PM, Dr. David Kirkby wrote:
>
> If you look at the Wolfram Research Library, there are a whole load of
> optional packages available, contributed by users. I assume they have
> gone through at least some form of review before being put on the
> Wolfram web site.
>
> Actually, it's quite funny: since 2004 there has been a very nice
> library for a polar plot of a list of data.
>
> http://library.wolfram.com/infocenter/MathSource/2061/
>
> Then in Mathematica 6, Wolfram Research added ListPolarPlot[]
>
> http://reference.wolfram.com/mathematica/ref/ListPolarPlot.html
>
> The funny thing is, the old contributed code for creating polar plots of
> a list of data is *much* better than what's in the core Mathematica code.
> Dave
>

I conclude that Ted Ersek, from the Naval Air Warfare Center, Aircraft Division,
actually needed to create polar plots of lists of data, and knew how they should
be presented. In contrast, the person who wrote the code for the Mathematica
core was probably told to do so by their manager. They did not seem to
appreciate how people might want to plot the data.

That's one way open-source does have an advantage.

Dave

Bill Hart

unread,
Sep 1, 2010, 11:32:20 AM9/1/10
to sage-devel
Tim,

all screwing around aside for a moment. I broadly agree with your
sentiments. However, there are also some issues with what you are
suggesting. And I mean to make these observations in all seriousness.

One of the reasons we have been rewriting things like ZZ and ZZ[x] is
that there has been an enormous amount of new research in these
directions recently. It's perhaps not appreciated by people outside of
number theory, but multiplying huge integers and polynomials is an
*important problem* in some parts of computational number theory.

I have the utmost respect for Lisp. It is an innovative language which
is still way ahead of its time. The original papers on Lisp are some
of the most important in Computer Science. They are still being mined
actively today. I also have a respect for the desire to produce
something worth passing on to our grandchildren. I see Lisp as being
precisely such a thing.

But what doesn't seem to have been appreciated by the "old dinosaurs"
is that much of the work in systems like Axiom and Maxima has simply
been superseded by more recent research. This is a significant problem
CAS's have. There is no way of producing something which will stand
for all time. The very work it relies on is superseded. This is
completely independent of changes in computer architecture, compiler
design, language evolution, and other transients. This is one of those
pure things. Algorithms for doing these things are simply better than
they were 30 years ago. The way we compute with these fundamental
objects has changed, for all time. No amount of care in the past could
have prevented that old work from being obsoleted by these new
algorithmic advances. So I just don't think the goals you have are
actually in any sense reasonable.

Now if we stop talking about completely pure things for a moment and
return to reality, we also find that computer architecture is
changing. Software that once ran, needs to be ported to the new
architectures to keep it running. So it has to be constantly
rewritten. With some care, we can mitigate such things, but they
cannot be avoided altogether. In fact the only thing which doesn't
change is the mathematics itself. As you say, sin(x) is timeless. But
it is a timeless mathematical function, not a timeless piece of
software. The latter simply doesn't exist.

Again, if you look at languages, these are evolving too. Even Lisp
evolves. We've had Common Lisp, Scheme, PLT Scheme (now Racket), and
numerous revisions of the Scheme Report. Lisp is having new features
added in response to new theoretical developments in computer
languages. All languages are. This means that eventually you want to
rewrite all your code. Now rewriting 100's of thousands of lines of
code is a major undertaking. So you have to add that cost to the
burden of "keeping up".

The point I am trying to make is that creating something timeless is
impossible. It is impossible to keep up with the rate of advance in
all these areas, even if we plan very carefully. I realise that maybe
Axiom tries to define its own language so that if Lisp changes, only a
smallish core needs changing. Well, Axiom's language will also suffer
from obsolescence, eventually, no matter how clever it is. Imagine if
you just get done documenting all of Axiom and making it literate and
adding more mathematics, i.e. getting to a satisfactory point with
respect to all your project goals, and then there is a revolution in
computing which fundamentally changes the way hardware works,
algorithms work or languages work. Suppose we communicate with
computers of the future in a completely different way. Axiom will
never have reached the point of being useful because it always lagged
so far behind the current theoretical cutting edge that it actually
became obsolete before it ever received widespread use.

When I work on FLINT, I want my code to have a heavy influence on the
generations which follow, much as I see NTL as having had a heavy
influence on me. This is in fact why I made the decision to rewrite
the whole of FLINT completely from scratch about a year ago or so. I
was very unhappy with what was on record. I wanted it to be "right".
But I do not expect my software to be used by future generations. It
would be nice if it influenced its successors. That is all I hope for
it.

Instead of thinking that we should take the time to get everything
right now, I have a completely different perspective on this (leave
aside most of my comments in earlier posts above, which weren't
terribly serious). What I see instead is that the theoretical
algorithmic work (the timeless part of our enterprise) is actually
advancing *faster* than we can keep up, even writing crappy code. It
might surprise you to learn that there are significant theoretical
advances in arithmetic with large integers which have never been
implemented. This is despite the fact that for many, many years there
has been feverish effort to keep up. The same is true for linear
algebra and polynomial arithmetic too. No package out there reflects
the state-of-the-art in theoretical development. Just the other day I
believe I found a way to compute polynomial signatures asymptotically
faster. This may have to wait until next year for implementation, as
we simply have so much to do that there just isn't time to implement
this now.

If I look at projects like Maxima and Axiom, I am not sure I see them
keeping up. Are they? My understanding is that Maxima may influence
future generations of people dealing with symbolics. But I had a look
at what Maxima has to offer in my field, and I really didn't find
anything. What does Axiom offer? What theoretically timeless things
does it pass on to the next generation? Surely not that sin(x) is
computed correctly. My hand held calculator does that (modulo some of
the digits perhaps being unreliable, depending on the calculator).

I did try to understand the Axiom project. I downloaded the source
code and documentation. I'm afraid that it seemed really incomplete
and I couldn't understand much of it at all. I imagined that the
"literate programming" was supposed to make it so that anyone could
understand the code. Having looked at the Axiom project, I am
uncertain that we have the same concept of what literate programming
is. And if that is the case, then perhaps it is a very personal thing.
That is a concerning conclusion. Can you formalise the notion of
literate programming? Does it have a solid basis in theory?

What I think we can pass on to our grandchildren, what is timeless and
pure, is not a computer algebra system, but theoretical advances, i.e.
ideas. Even if Lisp passes into obscurity, it has still influenced
radically almost all modern languages. The ideas in those original
papers are timeless. They have value, not the original Lisp
implementation, nor any current one.

Sage on the other hand, I think, just takes the view, "I want
something useful, now". For some people, they want something useful
for their research. For others, they want it for their teaching. For
others they want it to play with. For others, they want it as a
glorified calculator, or perhaps for modelling work. It means
different things to different people. I do not see it as an enterprise
which is necessarily trying to innovate in any of the areas of
language design, software engineering, program proving, symbolic
integration or any of these other "pure" areas. That doesn't mean such
innovations don't arise from time to time in Sage. But it isn't, as I
see it, the primary goal of such a project. Perhaps it shouldn't be.

Bill.

Jason Grout

unread,
Sep 1, 2010, 9:08:21 PM9/1/10
to sage-...@googlegroups.com
On 9/1/10 10:32 AM, Bill Hart wrote:
> Tim,
>
> all screwing around aside for a moment. I broadly agree with your
> sentiments. However, there are also some issues with what you are
> suggesting. And I mean to make these observations in all seriousness.


I'm reading this thread with great interest. Though I don't have strong
feelings one way or the other at this time, your post made a lot of
sense to me. Thanks for the post.

Jason

--
Jason Grout

William Stein

unread,
Sep 2, 2010, 5:28:32 AM9/2/10
to sage-...@googlegroups.com

I've written a blog post that is relevant to this thread:

http://389a.blogspot.com/2010/09/purple-sage.html

--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

Tim Daly

unread,
Sep 2, 2010, 8:54:48 AM9/2/10
to sage-...@googlegroups.com

William Stein wrote:
> On Wed, Sep 1, 2010 at 6:08 PM, Jason Grout <jason...@creativetrax.com> wrote:
>
>> On 9/1/10 10:32 AM, Bill Hart wrote:
>>
>>> Tim,
>>>
>>> all screwing around aside for a moment. I broadly agree with your
>>> sentiments. However, there are also some issues with what you are
>>> suggesting. And I mean to make these observations in all seriousness.
>>>
>> I'm reading this thread with great interest. Though I don't have strong
>> feelings one way or the other at this time, your post made a lot of sense to
>> me. Thanks for the post.
>>
>> Jason
>>
>
> I've written a blog post that is relevant to this thread:
>
> http://389a.blogspot.com/2010/09/purple-sage.html
>
>

As I understand your post, you are creating software for your own
personal use. Clearly the only standards that matter are your own. Do
you agree that creating software that the world should use might
require different standards? What would you expect those standards to
be? Should the code be required to do more than just compile and run a
simple test?

kcrisman

unread,
Sep 2, 2010, 10:14:47 AM9/2/10
to sage-devel


On Sep 2, 5:28 am, William Stein <wst...@gmail.com> wrote:
> On Wed, Sep 1, 2010 at 6:08 PM, Jason Grout <jason-s...@creativetrax.com> wrote:
> > On 9/1/10 10:32 AM, Bill Hart wrote:
>
> >> Tim,
>
> >> all screwing around aside for a moment. I broadly agree with your
> >> sentiments. However, there are also some issues with what you are
> >> suggesting. And I mean to make these observations in all seriousness.
>
> > I'm reading this thread with great interest.  Though I don't have strong
> > feelings one way or the other at this time, your post made a lot of sense to
> > me.  Thanks for the post.
>
> > Jason
>
> I've written a blog post that is relevant to this thread:
>
>    http://389a.blogspot.com/2010/09/purple-sage.html
>

"PSAGE could in the long run lead to a more modular approach to Sage
itself. For example, the PSAGE library will be a Python library like
any other, and it should be possible to just install it into an
existing Sage as an optional package. "

But could it become incompatible with other Sage stuff? For instance,
probably even in existing spkgs there could be competing definitions
of this or that thing... or maybe not? It would be helpful to have it
be an optional package from the beginning, so that if we ever (!) got
automated testing, one could just take a fresh build, install psage,
and then run doctests.

If what I'm saying doesn't make sense (maybe because of namespaces not
conflicting?), sorry for the noise. It would just be nice to make
sure that some of the prophecies occasionally made on this list about
Sage's death didn't happen prematurely because a lot of active
developers realized psage was more useful to them without it being
totally compatible from the start :)

- kcrisman

rjf

unread,
Sep 2, 2010, 3:03:01 PM9/2/10
to sage-devel
Is it pronounced Piss-ige or Pee-Sage? What does the P stand for?
I know that dogs, etc. mark locations this way, so maybe that has to
do with geometry?

see sage-flame for a snarkier comment.