Using valgrind to find segfaults

Bill Hart

Nov 3, 2010, 1:29:29 AM
to sage-devel
Hi all,

I've been down with the flu for a few days and so have been amusing
myself in ways that don't make my head hurt too much. So, for fun, I
just read through the *very* long trac ticket for getting the new
Pari into Sage.

Firstly, thank you to all the people who took the time to work on
putting the new MPIR and Pari into Sage.

(By the way, I don't understand why MPIR has been updated to 2.1.2 and
not 2.1.3 which fixes a serious bug in the mpf functions. Nor do I
understand why MPIR has been updated and the thread for this hasn't
been closed. Also FLINT hasn't been updated, even though I explicitly
stated it isn't safe to build the old flint against the new MPIR.)

Anyhow, whilst reading the long Pari trac ticket, and associated
tickets, a few things stood out to me (a C programmer) that just might
not be obvious to everyone. Apologies if this is already known to
everyone here.

At some point the new Pari + new MPIR caused a segfault in one of the
doctests. Now, segfaults are in some ways the easiest types of bugs to
track down. Here's how:

You simply compile the relevant C libraries with gcc -g (this adds
debugging symbols and line number information to the compiled
libraries). Next, you run the program under valgrind. You don't need
to do anything special to set this up. It just works.

If you normally type "blah" at the command line to run your program,
just type "valgrind blah" instead. It will take much longer to run
(usually 25-100 times longer), but it will tell you precisely which
lines of the C code caused the segfault and if it was reading or
writing to an invalid memory address at the time! Its output is a bit
like a stack trace in Python.
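
For the curious, the whole workflow looks something like this (the
file and program names are made up for illustration):

    gcc -g -O0 -o myprog myprog.c   # -g adds debug info; -O0 keeps line numbers honest
    valgrind ./myprog

and a typical complaint looks roughly like:

    ==1234== Invalid read of size 8
    ==1234==    at 0x40055E: my_function (myprog.c:42)
    ==1234==    by 0x400601: main (myprog.c:57)

i.e. the file, function and line of the bad access, plus the call
chain that led there.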

Note you can actually do all this with a Sage doctest, because after
all, Sage is just a program you run from the command line.

Once you find out which lines of C code the segfault occurs at, you
can put a trace in to see if the data being fed to the relevant
function is valid or not. This tells you if the library is at fault or
your higher level Python/Cython code is somehow responsible for
feeding invalid data (e.g. some C object wasn't initialised).
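
For example, something as simple as the following, just before the
call that valgrind flagged, is often enough (all the names here are
made up):

    /* dump the arguments to see whether the caller is already
       feeding garbage to the library */
    fprintf(stderr, "my_div: n = %p, d = %p, size = %ld\n",
            (void *) n, (void *) d, (long) size);
    my_div(q, n, d, size);

If the trace shows a null pointer or a nonsense size, the caller is
at fault; if the inputs look sane, suspect the library itself.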

Once upon a time, Michael Abshoff used to valgrind the entire Sage
test suite and fix all the myriad bugs that showed up!

So valgrind is the bug hunter's friend.

A second observation, made by Leif I think, is spot on. This all quite
possibly shows up a problem with insufficient doctesting in Sage.

Now the MPIR test code is pretty extensive and really ought to have
picked up this bug. We put a lot of time into the test code for that
MPIR release, so this is unfortunate.

However, the entire Pari test suite and the entire Sage test suite
(with an older version of Pari) passed without picking up this pretty
serious bug in the MPIR division code!

I think this underscores something I have been saying for a long time.
Sage doesn't test the C libraries it uses well enough. As a result of
that, it is taking inordinate amounts of developers' time to track
down bugs turned up by Sage doctests when spkgs are updated. In some
cases there is actually woefully inadequate test code in the C library
itself. But even when this is not the case, it makes sense for Sage to
do some serious testing before assuming the library is bug free. This
is particularly easy to do in Python, and much harder to do at the
level of the C library itself, by the way.

I have been saying this for a very long time, to many people. *ALL*
mathematical libraries are broken and contain bugs. If you don't test
the code you are using, it *is* broken. The right ratio of test code
to code is really pretty close to 50/50. And if you think I don't do
this myself when I write code (even Sage code), well you'd be wrong.

One solution would be for everyone to test more widely. If you write
code that depends on feature Y of module X and module X doesn't
properly test feature Y, assume it is broken and write doctests for
that code as well as the code you are writing yourself.

To give an example, Andy Novocin and I have been working on new
polynomial factoring code in FLINT for a couple of years now. Around 6
months ago we had a long test of some 100,000 or so polynomials
factoring correctly. We also had a long test of some 20-odd very
difficult polynomials factoring correctly. Thus there was no reason at
all to suppose there were *ANY* bugs in the polynomial factoring code
or any of the functions it made use of. By Sage standards I think this
is an insane level of testing.

But I insisted that every function we have written have its own test
code. This has meant 6 months more work (there were something like
40,000 lines of new code to test). But I cannot tell you how many new
serious bugs (and also performance problems too) that we turned up.
There must be dozens of serious bugs we've fixed, many of which would
have led to incorrect factorisations of whole classes of polynomials.
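
To make the idea concrete, a per-function test in this style pits the
function against an invariant (or an independent implementation) on a
large number of random inputs. Here is a generic C sketch -- purely
illustrative, not FLINT's actual test harness:

    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in for the function under test; in real life this would
       be a library routine such as a bignum division. */
    static void divrem(long *q, long *r, long n, long d)
    {
        *q = n / d;
        *r = n % d;
    }

    int main(void)
    {
        srand(42);  /* fixed seed, so any failure is reproducible */

        for (int i = 0; i < 100000; i++)
        {
            long n = (long) rand() - RAND_MAX / 2;  /* random numerator */
            long d = (long) rand() % 1000 + 1;      /* random positive denominator */
            long q, r;

            divrem(&q, &r, n, d);

            /* invariants that must hold for any valid (n, d) */
            if (q * d + r != n || labs(r) >= d)
            {
                printf("FAIL: n = %ld, d = %ld, q = %ld, r = %ld\n",
                       n, d, q, r);
                abort();
            }
        }

        printf("PASS\n");
        return 0;
    }

One test like this per function is what takes the code-to-test-code
ratio towards 50/50.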

The lesson for me was: just because my very extensive 5 or 6 doctests
passed for the very complex new functionality I added, does not mean
there aren't incredibly serious bugs in the underlying modules I used,
nor does it mean that my new code is worth printing out and using as
toilet paper.

Detecting bugs in Sage won't make Sage a viable alternative to the
MA*'s (that's a whole 'nother thread). After all, testing standards in
those other packages are quite possibly much worse. But testing more
thoroughly will mean less time is wasted trying to track down bugs in
an ad hoc manner and, eventually, much more time is available for
addressing those issues that are relevant to becoming a viable
alternative.

Bill.

leif

Nov 3, 2010, 2:19:14 AM
to sage-devel
On 3 Nov., 06:29, Bill Hart <goodwillh...@googlemail.com> wrote:
> [...]
> Firstly, thank you to all the people who took the time to work on
> putting the new MPIR and Pari into Sage.
>
> (By the way, I don't understand why MPIR has been updated to 2.1.2 and
> not 2.1.3 which fixes a serious bug in the mpf functions. Nor do I
> understand why MPIR has been updated and the thread for this hasn't
> been closed. Also FLINT hasn't been updated, even though I explicitly
> stated it isn't safe to build the old flint against the new MPIR.)

Em, we haven't yet upgraded MPIR in Sage (see #8664), it's still
1.2.2.

(I recently sent you and thempirteam an e-mail regarding the [rather
trivial] exec stack problem of MPIR 2.1.x with Fedora 14. I couldn't
find it on mpir-devel, and MPIR's trac was down.)

Hopefully we'll get the new MPIR into an early alpha of 4.6.1, but
there's still work to do to make /upgrading Sage/ work with that,
since currently not all dependent parts of the /Sage library/ get
automatically properly rebuilt. I think we made a step forward with
the 4.6 release, since now at least dependent /spkgs/ get rebuilt.

W.r.t. 2.1.3, somebody else said we're currently not using any of the
mpf functions in Sage.
A long way to go... ;-)

I don't think people would like a complete feature (and perhaps
component upgrade) freeze for e.g. 6 months.

But there's work in progress to at least better support more automatic
testing on a wide(r) variety of platforms and systems. If we get new
weird doctest or build errors with every (pre-)release, there remains
little time to solve the problems that have been around for a long
time.


-Leif

Mitesh Patel

Nov 3, 2010, 7:09:09 AM
to sage-...@googlegroups.com
On 11/03/2010 12:29 AM, Bill Hart wrote:
> I've been down with the flu for a few days and so amusing myself in
> ways that don't make my head hurt too much. So, for fun, I just read

I hope you get better soon!

Thanks for the tip. I think we could add a valgrind builder to the Sage
buildbot.

> A second observation, made by Leif I think, is spot on. This all quite
> possibly shows up a problem with insufficient doctesting in Sage.
>
> Now the MPIR test code is pretty extensive and really ought to have
> picked up this bug. We put a lot of time into the test code for that
> MPIR release, so this is unfortunate.
>
> However, the entire Pari test suite and the entire Sage test suite
> (with an older version of Pari) passed without picking up this pretty
> serious bug in the MPIR division code!
>
> I think this underscores something I have been saying for a long time.
> Sage doesn't test the C libraries it uses well enough. As a result of
> that, it is taking inordinate amounts of developers' time to track
> down bugs turned up by Sage doctests when spkg's are updated. In some
> cases there is actually woefully inadequate test code in the C library
> itself. But even when this is not the case, it makes sense for Sage to
> do some serious testing before assuming the library is bug free. This
> is particularly easy to do in Python, and much harder to do at the
> level of the C library itself, by the way.
>
> I have been saying this for a very long time, to many people. *ALL*
> mathematical libraries are broken and contain bugs. If you don't test
> the code you are using, it *is* broken. The right ratio of test code
> to code is really pretty close to 50/50. And if you think I don't do
> this myself when I write code (even Sage code), well you'd be wrong.

Does anyone have an estimate of this ratio for the Sage library?

Bill Hart

Nov 3, 2010, 8:30:06 AM
to sage-devel
Hi Leif,

On 3 Nov, 06:19, leif <not.rea...@online.de> wrote:
> On 3 Nov., 06:29, Bill Hart <goodwillh...@googlemail.com> wrote:
>
> > [...]
> > Firstly, thank you to all the people who took the time to work on
> > putting the new MPIR and Pari into Sage.
>
> > (By the way, I don't understand why MPIR has been updated to 2.1.2 and
> > not 2.1.3 which fixes a serious bug in the mpf functions. Nor do I
> > understand why MPIR has been updated and the thread for this hasn't
> > been closed. Also FLINT hasn't been updated, even though I explicitly
> > stated it isn't safe to build the old flint against the new MPIR.)
>
> Em, we haven't yet upgraded MPIR in Sage (see #8664), it's still
> 1.2.2.

I just noticed this myself. I misread the version number of the spkg
in Sage 4.6 as 2.1.2 and not 1.2.2. This explains why the ticket
hasn't been closed. It also explains why flint will build against the
MPIR in Sage.

It's blowing my mind that Sage is still using an 18-month-old MPIR
which is almost uniformly half the speed!! I predict the doctest times
for some modules will drop noticeably when you put the new MPIR in.

Hasn't David Harvey been maintaining an optional GMP 5.0.1 spkg? I
can't believe anyone has still been using Sage with the old MPIR spkg
when that is available. It is also uniformly twice as fast!!

Didn't there used to be something about Sage wanting to be a viable
alternative to Magma....

>
> (I recently sent you and thempirteam an e-mail regarding the [rather
> trivial] exec stack problem of MPIR 2.1.x with Fedora 14. I couldn't
> find it on mpir-devel, and MPIR's trac was down.)

I forwarded it just now. That may save a day or two before the thempirteam
email address gets read (I am no longer in charge of MPIR, Jason
Moxham is). At any rate, I'm pretty sure mpir 1.2.2 is no different
with regard to this issue than the latest version.

We got asked to add some code to the bottom of a large number of files
in MPIR for some "security issue" a while back by some distro, but I
didn't know what to make of the request. I guess maybe nothing
happened?

>
> Hopefully we'll get the new MPIR into an early alpha of 4.6.1, but
> there's still work to do to make /upgrading Sage/ work with that,
> since currently not all dependent parts of the /Sage library/ get
> automatically properly rebuilt. I think we made a step forward with
> the 4.6 release, since now at least dependent /spkgs/ get rebuilt.

I see.

>
> W.r.t. 2.1.3, somebody else said we're currently not using any of the
> mpf functions in Sage.
>

cddlib, ecl, mpmath, mpfr, singular, sympy all, as far as I can see,
make extensive use of mpf functions.
This seems to me to be a self compounding issue. The longer these
packages are put off, the more difficult the problems become. We're
talking nearly 18 months since some spkgs have been updated. I'm glad
we're getting around to it, but what I think I'm seeing is the
following work flow:

1) Try to build the whole of Sage on top of package X (without
necessarily having updated packages X relies on or testing package X
in isolation on supported platforms)
2) Bizarre doctest failures result
3) Find some bug, report upstream and pull package X back out
4) Upstream fixes bug (if upstream even exists)
5) Package doesn't get put in again because new work on package X that
has happened in the mean time might potentially cause more failures
6) Long delay whilst everyone involved licks their wounds
7) Rinse, lather, repeat

I don't feel like that is a sustainable model.

I don't wish to appear mean, but the (fairly humorous) image that
comes to mind is that we're taping a whole pile of old vacuum tubes
and radio parts together and telling people it's a Mercedes-Benz.
There doesn't seem to be a coherent plan to rationalise the core of
Sage, modularise, or meet the objectives Sage has. All I see in the
future is more taping together of ever more broken bits and pieces and
trying to make it all cooperate.

>
> But there's work in progress to at least better support more automatic
> testing on a wide(r) variety of platforms and systems. If we get new
> weird doctest or build errors with every (pre-)release, there remains
> little time to solve problems a long time in.

This seems to have happened since the very first Sage releases.
I don't remember a time when there weren't new unusual bugs that
showed up on platform X.Y.Z.

To me, this speaks of a need to break the process down a little. Most
supported platforms surely don't cause us regular problems, because
developers are testing on those platforms anyway. So initially release
a Sage which is fine on those "easy" and most widely used platforms.

Then for each other supported platform X, have a group (not the same
one doing the initial release) who then are tasked with getting it
running on platform X. They maintain their own repo, apply their own
patches and release their own "platform X certified Sage" when they
are good and ready. Eventually, they feed their patches back into the
"main" Sage so as to keep the number of patches they have to regularly
apply to a minimum. This model would prevent all this pointless
to-ing and fro-ing I see because something or other doesn't pass on
Solaris, or whatever. You can't seriously expect to do this all on the
day of release! Lots of people are just gonna get burned out if we
keep demanding that of them.

When gcc releases a new compiler, it isn't instantly available on
Ubuntu and certainly not on Solaris. I don't see why Sage should be
any different.

And we need to put aside this naive notion that if Sage passes its
doctests on all supported platforms then somehow it is substantially
more reliable. There's probably 100,000 bugs in Sage. If three
doctests fail, big whoop. If they all pass on platform X, that should
be enough for platform X.

You will be able to tell when Sage doesn't have 100,000 bugs because
you'll be able to afford to offer a $3300 bounty for fixing critical
bugs. :-)

Another big issue is modularity. Different groups should be
responsible for different parts of Sage. There should be a "core Sage"
community, a graph theory community, a symbolics community, a number
theory community, etc (with some overlap of membership obviously).
Each community should be able to test, update and release their part
of Sage independently of the other parts. I've argued for this for
years. No one has ever agreed with me.

No large project can ultimately succeed without modularity. This is
why module systems were added to every serious programming language,
because it was realised that modularity was essential to the success
of large projects (small projects didn't need it and neither do toy
languages).

Sage has been showing the strain for over 2 years now.

Modularity would also help things like porting to new platforms. E.g.
how could anyone think Sage will ever become a viable alternative to
the MA*'s if it never runs natively on Windows? But we'll never
achieve the latter if there isn't something smaller than Sage to port
before you port everything else. Modularity is the key to this.

Finally, testing each individual spkg (against its dependencies) on
all supported platforms *before* having to download and build the
whole of the latest Sage seems to me to be a logical first step. I'm
not seeing even that happen at the moment. This again is a kind of
modularity. If the new Pari doesn't even build on the Solaris, what is
the point of spending a whole day building and testing the whole of
Sage on Solaris? And if Pari doesn't even have a comprehensive test
suite and a new stable release I'm not getting why we are even using
it the way we are. We surely need to be much more sceptical about it
and test the hell out of it before trying to put it into Sage. OK it's
in now, but is it really worth doing it that way again in the future?

And thank goodness ticket #4000 finally got closed. I haven't even sat
down to try and analyse what went on there. There *has* to be a lesson
or two to learn from that process.

>
> -Leif

John Cremona

Nov 3, 2010, 8:55:15 AM
to sage-...@googlegroups.com
> Sage on Solaris? And if Pari doesn't even have a comprehensive test
> suite and a new stable release I'm not getting why we are even using
> it the way we are. We surely need to be much more sceptical about it
> and test the hell out of it before trying to put it into Sage. OK it's
> in now, but is it really worth doing it that way again in the future?

Pari has a test suite of about 3200 lines. This may not be as
comprehensive as you would like, but it includes calls to every gp
function, and extra tests are added every time a bug is reported and
fixed. Moreover, this test suite is run on installation in Sage (if
SAGE_CHECK or something similar is set, e.g. during the testing
process before the new spkg is accepted).

Secondly, partly as a result of all the testing of Pari by Sage
developers (and bugs found and fixed, some by Sage developers and
others by one of the two Pari developers), Pari has announced a new
stable release, essentially the one which is now in Sage, once it has
been through a lot more tests (there's a build log at
http://pari.math.u-bordeaux.fr/buildlog.html which to my amateur eye
looks similar to http://build.sagemath.org/sage/one_line_per_build).

Bill, wouldn't you be an ideal person to help get MPIR 2.1.3 into
Sage? I would certainly like everything to run twice as fast!

John

Bill Hart

Nov 3, 2010, 10:42:16 AM
to sage-devel


On 3 Nov, 11:09, Mitesh Patel <qed...@gmail.com> wrote:
<SNIP>

> > I have been saying this for a very long time, to many people. *ALL*
> > mathematical libraries are broken and contain bugs. If you don't test
> > the code you are using, it *is* broken. The right ratio of test code
> > to code is really pretty close to 50/50. And if you think I don't do
> > this myself when I write code (even Sage code), well you'd be wrong.
>
> Does anyone have an estimate of this ratio for the Sage library?
>

I don't know how to measure this. I noted that tests seem to appear
inside docstrings r""" .... """ in sections labelled TESTS. There were
lots of other lines in docstrings in sections labelled EXAMPLES,
INTERNAL DOCUMENTATION or REFERENCES.

So I wrote a short C program and script to cover the .py files
recursively in sage-4.6/devel/sage-main/sage and the output is here:

http://selmer.warwick.ac.uk/output

The final column should give the % of nonblank lines in the file which
are in a docstring after a header TESTS, but not in any of the other
kinds of sections in a docstring. (This is not the most scientific
measurement in the world but a first approximation.) The count
includes the TESTS header itself.

The program skips blank lines and lines containing only whitespace and
does not count them for any purpose.
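
For reference, here is a rough C sketch of how such a counter can
work. This is an illustrative reconstruction, not the actual program
(in particular the docstring detection is deliberately crude):

    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        if (argc != 2)
        {
            fprintf(stderr, "usage: %s file.py\n", argv[0]);
            return 1;
        }

        FILE *f = fopen(argv[1], "r");
        if (f == NULL)
        {
            perror(argv[1]);
            return 1;
        }

        char line[4096];
        int in_doc = 0, in_tests = 0;
        long total = 0, tests = 0;

        while (fgets(line, sizeof line, f) != NULL)
        {
            const char *p = line + strspn(line, " \t");
            if (*p == '\n' || *p == '\0')
                continue;                    /* blank lines don't count */
            total++;

            if (strstr(p, "\"\"\"") != NULL) /* crude docstring open/close */
            {
                in_doc = !in_doc;
                in_tests = 0;
                continue;
            }

            if (in_doc)
            {
                if (strncmp(p, "TESTS", 5) == 0)
                    in_tests = 1;            /* the header itself is counted */
                else if (strncmp(p, "EXAMPLES", 8) == 0 ||
                         strncmp(p, "INTERNAL", 8) == 0 ||
                         strncmp(p, "REFERENCES", 10) == 0)
                    in_tests = 0;
                if (in_tests)
                    tests++;
            }
        }
        fclose(f);

        printf("%s: %ld of %ld nonblank lines in TESTS (%.1f%%)\n",
               argv[1], tests, total, total ? 100.0 * tests / total : 0.0);
        return 0;
    }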

If you tell me what other kinds of sections I should look for, I can
modify the program. I can also easily get it to print the number of
lines of references, etc, if that is useful info.

<SNIP>

Bill.

William Stein

Nov 3, 2010, 10:46:25 AM
to sage-...@googlegroups.com
On Wed, Nov 3, 2010 at 5:30 AM, Bill Hart <goodwi...@googlemail.com> wrote:
> Finally, testing each individual spkg (against its dependencies) on
> all supported platforms *before* having to download and build the
> whole of the latest Sage seems to me to be a logical first step. I'm
> not seeing even that happen at the moment. This again is a kind of
> modularity. If the new Pari doesn't even build on the Solaris, what is
> the point of spending a whole day building and testing the whole of
> Sage on Solaris? And if Pari doesn't even have a comprehensive test
> suite and a new stable release I'm not getting why we are even using
> it the way we are. We surely need to be much more sceptical about it
> and test the hell out of it before trying to put it into Sage. OK it's
> in now, but is it really worth doing it that way again in the future?

Just for the record, for people reading this who might think Bill
speaks for the whole project,
I am absolutely against ever removing PARI from Sage. (This is an
important distinction
to make, since I *do* often get offlist questions from people whenever
some random person posts
opinions like the above on Sage devel.)

I guess my summary of Bill's other suggestions above is:

* Bill says we should go ahead and release Sage anyways even if we
know it is "broken" on some widely used platforms, where broken means
"some doctests fail", and people using those platforms can use an
older or different version of Sage. He says this is fine to do,
since that's what everybody else does. I disagree with this, and
think GCC is not everybody.

* Bill remarks his test suite for some project has about one line
devoted to testing for every line of actual code, and Sage should too.
I looked, and Sage probably has about 3 lines of code for every line
of testing, since there are 134,329 lines of input doctests, and
probably around 300,000-400,000 lines of actual code in the sage
library.

* Bill remarks that "modularity" is helpful, and we should use a
language that supports it (we do!), and that it would make a windows
port much easier. I think that as a community we should also support
modularity -- I'm working on making that even easier for developers --
http://code.google.com/p/purplesage/ is a prototype, and I'll be
writing more about how other groups can do something similar. I think
a smallish library version of Sage would in fact be easier to port to
Windows. That said, the biggest obstruction to Sage on Windows has
always been lack of focused work on the port by people who knew what
they're doing. Sage only got ported to Solaris because David Kirkby
put in an _enormous_ amount of effort making that happen. Something
similar is needed for windows. Blair Sutton started in that
direction, but ran out of steam after about 4 months; David Kirkby was
much, much more longterm persistent.

* Bill suggests that the development of the core of Sage should be
broken up into 20 or so special groups, etc. I disagree with this;
personally I think specialized groups in different research areas
should instead develop packages *on top* of Sage, while *everybody*
contributes to making the core of Sage much more bugfree and stable
than it already is. But this shouldn't happen until we iron out the
bugs about how to best develop packages on top of Sage (see remarks
above).

-- William

--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

Bill Hart

Nov 3, 2010, 10:47:02 AM
to sage-devel

On 3 Nov, 12:55, John Cremona <john.crem...@gmail.com> wrote:
> > Sage on Solaris? And if Pari doesn't even have a comprehensive test
> > suite and a new stable release I'm not getting why we are even using
> > it the way we are. We surely need to be much more sceptical about it
> > and test the hell out of it before trying to put it into Sage. OK it's
> > in now, but is it really worth doing it that way again in the future?
>
> Pari has a test suite of about 3200 lines.

There's about 50,000 lines of test code in flint and it does a
fraction of what Pari does in terms of functionality.

> This may not be as
> comprehensive as you would like, but it includes calls to every gp
> function, and extra tests are added every time a bug is reported and
> fixed.  Moreover, this test suite is run on installation in Sage (if
> SAGE_CHECK or something similar is set, e.g. during the testing
> process before the new spkg is accepted).
>
> Secondly,  partly as a result of all the testing of Pari by Sage
> developers (and bugs found and fixed, some by Sage developers and
> others by on of the two Pari developers), Pari has announced a new
> stable release, essentially the one which is now in Sage, once it has
> been through a lot more tests (there's a build log at
> http://pari.math.u-bordeaux.fr/buildlog.html which to my amateur eye
> looks similar to http://build.sagemath.org/sage/one_line_per_build)
>

That is good news! I was very disconcerted when we were told we should
be putting Pari SVN in Sage. I understand the issues with developer
manpower there, but this is precisely why I think we've not been doing
the right thing by expecting that it all Just Works.

> Bill, wouldn't you be an ideal person to help get MPIR 2.1.3 into
> Sage?   I would certainly like everything to run twice as fast!

If I understand correctly, there aren't issues with the MPIR spkg. The
issues are with Sage rebuilding library dependencies when they are
updated. That's not something I have any expertise worth a dime to
help with.

Bill.

William Stein

Nov 3, 2010, 11:41:36 AM
to sage-...@googlegroups.com
On Wed, Nov 3, 2010 at 7:47 AM, Bill Hart <goodwi...@googlemail.com> wrote:
>
> On 3 Nov, 12:55, John Cremona <john.crem...@gmail.com> wrote:
>> > Sage on Solaris? And if Pari doesn't even have a comprehensive test
>> > suite and a new stable release I'm not getting why we are even using
>> > it the way we are. We surely need to be much more sceptical about it
>> > and test the hell out of it before trying to put it into Sage. OK it's
>> > in now, but is it really worth doing it that way again in the future?
>>
>> Pari has a test suite of about 3200 lines.
>
> There's about 50,000 lines of test code in flint and it does a
> fraction of what Pari does in terms of functionality.

Michael Abshoff posted the SLOC count for Pari a while ago (here:
http://www.mail-archive.com/sage-...@googlegroups.com/msg06440.html).
The line for pari was:

120578 pari-2.3.2.p3

He remarks "There are some real surprises on that list. MPFR has many
more lines of code than I thought, Pari many fewer lines of code. It's
amazing what it achieves with such a small code base."

So, despite FLINT doing only a fraction of what Pari does in terms of
functionality, Pari isn't that much bigger than FLINT. In fact,
Michael even goes on to remark that "FLINT is a bloated pig...". :-)

>> This may not be as
>> comprehensive as you would like, but it includes calls to every gp
>> function, and extra tests are added every time a bug is reported and
>> fixed.  Moreover, this test suite is run on installation in Sage (if
>> SAGE_CHECK or something similar is set, e.g. during the testing
>> process before the new spkg is accepted).
>>
>> Secondly,  partly as a result of all the testing of Pari by Sage
>> developers (and bugs found and fixed, some by Sage developers and
>> others by on of the two Pari developers), Pari has announced a new
>> stable release, essentially the one which is now in Sage, once it has
>> been through a lot more tests (there's a build log athttp://pari.math.u-bordeaux.fr/buildlog.htmlwhich to my amateur eye
>> looks similar tohttp://build.sagemath.org/sage/one_line_per_build
>>
>
> That is good news! I was very disconcerted when we were told we should
> be putting Pari SVN in Sage. I understand the issues with developer
> manpower there, but this is precisely why I think we've not been doing
> the right thing by expecting it all Just Works.

We switched to Pari SVN for several reasons including being told by
Karim that the time was right and that a stable release based on it
was "coming soon".

>> Bill, wouldn't you be an ideal person to help get MPIR 2.1.3 into
>> Sage?   I would certainly like everything to run twice as fast!
>
> If I understand correctly, there aren't issues with the MPIR spkg. The
> issues are with Sage rebuilding library dependencies when they are
> updated. That's not something I have any expertise worth a dime to
> help with.
>
> Bill.
>


Bill Hart

Nov 3, 2010, 11:54:56 AM
to sage-devel


On 3 Nov, 14:46, William Stein <wst...@gmail.com> wrote:
> On Wed, Nov 3, 2010 at 5:30 AM, Bill Hart <goodwillh...@googlemail.com> wrote:
> > Finally, testing each individual spkg (against its dependencies) on
> > all supported platforms *before* having to download and build the
> > whole of the latest Sage seems to me to be a logical first step. I'm
> > not seeing even that happen at the moment. This again is a kind of
> > modularity. If the new Pari doesn't even build on the Solaris, what is
> > the point of spending a whole day building and testing the whole of
> > Sage on Solaris? And if Pari doesn't even have a comprehensive test
> > suite and a new stable release I'm not getting why we are even using
> > it the way we are. We surely need to be much more sceptical about it
> > and test the hell out of it before trying to put it into Sage. OK it's
> > in now, but is it really worth doing it that way again in the future?
>
> Just for the record, for people reading this who might think Bill
> speaks for the whole project,

Wait, what!??? ...why on earth would anyone think that!!? Have I ever
held some kind of executive capacity in the project?

No!! Anything I suggest is nothing more than that, a suggestion, and
an opinion.

> I am absolutely against ever removing PARI from Sage.    

I hope no one was confused about this. I wrote:

"And if Pari doesn't even have a comprehensive test
suite and a new stable release I'm not getting why we are even using
it __the way__ we are. We surely need to be much more sceptical about
it
and __test the hell out of it__ before trying to put it into Sage. OK
it's
in now, but is it really worth doing it __that way__ again in the
future? "

I don't think there is any suggestion there that we should remove Pari
from Sage. That would be impractical. I've added emphasis to the
important points I was making.

>  (This is an
> important distinction
> to make, since I *do* often get offlist questions from people whenever
> some random person posts
> opinions like the above on Sage devel.)
>
> I guess my summary of the above other suggestions by Bill are:
>
>     * Bill says we should go ahead and release Sage anyways even if we
> know it is "broken" on some widely used platforms,

I think I said the main Sage release should pass doctests on widely
used platforms. I'm talking about having separate groups manage
separate Sage releases on other less widely used platforms.

If we add Windows to the list of supported platforms eventually, will
that get added to the list of platforms that will need to be tested on
and fixed every other week when there is a new Sage release? Did
anyone do this with the Cygwin port to check for regressions/new
doctest failures before 4.6 came out?

At what point will the list of supported platforms just become too
large to manage simultaneous releases on all of them every other
week?

I'm trying to think through some of the standard ways that large
projects manage the complexity of software releases to see if there is
a way to streamline things. By that I mean non-blocking threads.

Another oft suggested strategy is to have a sage-stable and sage-
devel. In some ways, purple-sage looks like it might become some
people's sage-devel, I don't know.

> where broken means
> "some doctests fail", and people using those platforms can use an
> older or different version of Sage.    He says this is fine to do,
> since that's what everybody else does.    I disagree with this, and
> think GCC is not everybody.

For a vast number of projects, ports to platforms that are not in the
general category of "widely used linux distros" are managed
separately. It's not a GCC thing, but certainly a prevalent thing.

>
>     * Bill remarks his test suite for some project has about one line
> devoted to testing for every line of actual code, and Sage should too.
>   I looked, and Sage probably has about 3 lines of code for every line
> of testing, since there are 134,329 lines of input doctests, and
> probably around 300,000-400,000 lines of actual code in the sage
> library.

The biggest problem I see here is each doctest only seems to do a
single iteration of the function in question. I am sure there are
exceptions to this. But I don't see how you hope to catch corner cases
by testing fewer times than I have fingers and toes.

Still, one line in four, if accurate, doesn't sound like a bad ratio.
Writing test code in Python is easier than in some other languages, so
I don't know what is a good ratio for python code.

>
>     * Bill remarks that "modularity" is helpful, and we should use a
> language that supports it (we do!),

Quite obviously it does. So my point was obviously not about the
language!

> and that it would make a windows
> port much easier.   I think that as a community we should also support
> modularity -- I'm working on making that even easier for developers -- http://code.google.com/p/purplesage/ is a prototype, and I'll be
> writing more about how other groups can do something similar.  I think
> a smallish library version of Sage would in fact be easier to port to
> Windows.  

I agree with this. I think the purplesage concept actually addresses
some of the major issues.

> That said, the biggest obstruction to Sage on Windows has
> always been lack of focused work on the port by people who knew what
> they're doing.  Sage only got ported to Solaris because David Kirkby
> put in an _enormous_ amount of effort making that happen.  Something
> similar is needed for windows.  Blair Sutton started in that
> direction, but ran out of steam after about 4 months; David Kirkby was
> much, much more longterm persistent.

There is almost infinitely more work in a native Windows port than a
Solaris port. Solaris is similar enough to other unixes to make a port
doable by one person. I think it is completely unfair to say Blair
failed because David was more persistent. Blair failed because it is a
much, much harder job, and he got tired.

One person will never finish the job on Windows. It needs a team of
skilled individuals. But a smaller bite-sized useful chunk of Sage at
a time is far more likely to be ported than the whole thing at once.

>
>     * Bill suggests that the development of the core of Sage should be
> broken up into 20 or so special groups, etc.   I disagree with this;
> personally I think specialized groups in different research areas
> should instead develop packages *on top* of Sage, while *everybody*
> contributes to making the core of Sage much more bugfree and stable
> than it already is.   But this shouldn't happen until we iron out the
> bugs about how to best develop packages on top of Sage see remarks
> above).

I don't recall suggesting the "core of Sage" should be broken into 20
pieces. That would be totally pointless.

Or are you suggesting that everything that is currently in that 300mb
tarball should be considered "core Sage". I think that is a completely
and utterly unsustainable model.

The core should be *much* smaller. Then communities should be building
on top of that smaller core with packages relevant to their area.
That's what I mean by modularisation.

I cannot think of a single other project out there that has a single
monolithic 300mb source tarball released every other week. Anyhow, I
don't think we disagree on the general principle that modularity is a
good thing.

Bill.

Bill Hart

Nov 3, 2010, 12:07:33 PM
to sage-devel


On 3 Nov, 15:41, William Stein <wst...@gmail.com> wrote:
> On Wed, Nov 3, 2010 at 7:47 AM, Bill Hart <goodwillh...@googlemail.com> wrote:
>
> > On 3 Nov, 12:55, John Cremona <john.crem...@gmail.com> wrote:
> >> > Sage on Solaris? And if Pari doesn't even have a comprehensive test
> >> > suite and a new stable release I'm not getting why we are even using
> >> > it the way we are. We surely need to be much more sceptical about it
> >> > and test the hell out of it before trying to put it into Sage. OK it's
> >> > in now, but is it really worth doing it that way again in the future?
>
> >> Pari has a test suite of about 3200 lines.
>
> > There's about 50,000 lines of test code in flint and it does a
> > fraction of what Pari does in terms of functionality.
>
> Michael Abshoff posted the SLOC count for Pari a while ago (here: http://www.mail-archive.com/sage-...@googlegroups.com/msg06440.html).
> The line for pari was:
>
>    120578  pari-2.3.2.p3

So 117378 lines of code vs 3200 lines of test code for Pari.

Around 20000 lines of code vs 20000 lines of test code for flint2.

>
> He remarks "There are some real surprises on that list. MPFR has many
> more lines of code than I thought, Pari many fewer lines of code. It's
> amazing what it achieves with such a small code base."
>
> So, despite FLINT doing only a fraction of what Pari does in terms of
> functionality, Pari isn't that much bigger than FLINT.   In fact,
> Michael even goes on to remark that "FLINT is a bloated pig...".  :-)
>

Yeah, but it runs really fast, and he was right. Why do you think we
rewrote it from scratch?

>
> >> This may not be as
> >> comprehensive as you would like, but it includes calls to every gp
> >> function, and extra tests are added every time a bug is reported and
> >> fixed.  Moreover, this test suite is run on installation in Sage (if
> >> SAGE_CHECK or something similar is set, e.g. during the testing
> >> process before the new spkg is accepted).
>
> >> Secondly,  partly as a result of all the testing of Pari by Sage
> >> developers (and bugs found and fixed, some by Sage developers and
> >> others by on of the two Pari developers), Pari has announced a new
> >> stable release, essentially the one which is now in Sage, once it has
> >> been through a lot more tests (there's a build log at
> >> http://pari.math.u-bordeaux.fr/buildlog.html which to my amateur eye
> >> looks similar to http://build.sagemath.org/sage/one_line_per_build)
>
> > That is good news! I was very disconcerted when we were told we should
> > be putting Pari SVN in Sage. I understand the issues with developer
> > manpower there, but this is precisely why I think we've not been doing
> > the right thing by expecting it all Just Works.
>
> We switched to Pari SVN for several reasons including being told by
> Karim that the time was right and that a stable release based on it
> was "coming soon".

Great! Git version of flint 1.6 is available here:

http://selmer.warwick.ac.uk/FLINT.git

branch test_code.

The time is right and a stable release based on it is "coming soon".

In fact I will do you one better. No known issues!

Please report any bugs and we'll fix in git.

>
> >> Bill, wouldn't you be an ideal person to help get MPIR 2.1.3 into
> >> Sage?   I would certainly like everything to run twice as fast!
>
> > If I understand correctly, there aren't issues with the MPIR spkg. The
> > issues are with Sage rebuilding library dependencies when they are
> > updated. That's not something I have any expertise worth a dime to
> > help with.
>
> > Bill.
>

kcrisman

Nov 3, 2010, 1:34:00 PM
to sage-devel


> > Just for the record, for people reading this who might think Bill
> > speaks for the whole project,
>
> Wait, what!??? ...why on earth would anyone think that!!? Have I ever
> held some kind of executive capacity in the project?
>
> No!! Anything I suggest is nothing more than that, a suggestion, and
> an opinion.
>

Of course, and regular readers know this :) But I think it's true
that this happens (non-regular readers happening upon one message and
thinking this is 'official'). Probably even some now-regular readers
did this early on in their Sage life cycle. It takes a while to get
used to an open development culture! Which this thread shows is very
healthy in Sage :)

FWIW,
- kcrisman

William Stein

Nov 3, 2010, 3:25:16 PM
to sage-...@googlegroups.com
On Wed, Nov 3, 2010 at 8:54 AM, Bill Hart <goodwi...@googlemail.com> wrote:
>
>
> On 3 Nov, 14:46, William Stein <wst...@gmail.com> wrote:
>> On Wed, Nov 3, 2010 at 5:30 AM, Bill Hart <goodwillh...@googlemail.com> wrote:
>> > Finally, testing each individual spkg (against its dependencies) on
>> > all supported platforms *before* having to download and build the
>> > whole of the latest Sage seems to me to be a logical first step. I'm
>> > not seeing even that happen at the moment. This again is a kind of
>> > modularity. If the new Pari doesn't even build on the Solaris, what is
>> > the point of spending a whole day building and testing the whole of
>> > Sage on Solaris? And if Pari doesn't even have a comprehensive test
>> > suite and a new stable release I'm not getting why we are even using
>> > it the way we are. We surely need to be much more sceptical about it
>> > and test the hell out of it before trying to put it into Sage. OK it's
>> > in now, but is it really worth doing it that way again in the future?
>>
>> Just for the record, for people reading this who might think Bill
>> speaks for the whole project,
>
> Wait, what!??? ...why on earth would anyone think that!!? Have I ever
> held some kind of executive capacity in the project?

I have no idea, but often people read some random posting of an idea
on a list, then contact me off list worried that there has been some
scary official statement that X will happen with Sage. It happens all
too often. It's not your fault...

> If we add Windows to the list of supported platforms eventually, will
> that get added to the list of platforms that will need to be tested on
> and fixed every other week when there is a new Sage release?

Yes.

> Did
> anyone do this with the Cygwin port to check for regressions/new
> doctest failures before 4.6 came out?

There still isn't a version on Cygwin that even *builds*
automatically, let alone works without further work. Tests still
fail, etc.
So this is different.

> At what point will the list of supported platforms just become too
> large to be managing simultaneous releases on all platforms
> simultaneously every other week?

There was a recent long discussion about supported platforms on
sage-devel. The suggestion was that we *define*
the set of supported platforms to be the ones where we do automated
testing (with buildbot, etc., and all tests pass),
and be sure to include platforms in that list that people funding sage
development care a lot about. I now think this is a
really good idea. Then the answer to your question is: "The list of
supported platforms is *exactly* the size that we
can clearly manage, since we are managing it."

> Another oft suggested strategy is to have a sage-stable and sage-
> devel. In some ways, purple-sage looks like it might become some
> people's sage-devel, I don't know.

That's a completely wrong way to think about what purple-sage is,
though I can see how the misconception could arise,
especially because initially I didn't know, and I thought it might be
that. But Purple sage is:

(1) a stripped down Sage, with a very small number of
modifications to the sage library so that this stripped down version
works, and

(2) a completely *new* library of code, 100% outside of the "sage"
namespace.

As an example to make things more concrete, yesterday Fredrik
Johansson pointed me to a trac ticket, and said "maybe you could
include this in psage", since he was viewing psage as "sage + any
patches william can think to apply to sage". I think that would be a
hellish nightmare to maintain... and that is not what psage is. The
code in psage is either completely new code, or code that might never
ever go into Sage, or if it does, it won't happen for a while. It's
by people doing research math, who don't want to worry about having
stable API's, 100% coverage, etc. Right now it's got code for
computing Maass forms, Siegel modular forms, Hilbert modular forms,
and L-functions of elliptic curves over function fields.


>>     * Bill remarks his test suite for some project has about one line
>> devoted to testing for every line of actual code, and Sage should too.
>>   I looked, and Sage probably has about 3 lines of code for every line
>> of testing, since there are 134,329 lines of input doctests, and
>> probably around 300,000-400,000 lines of actual code in the sage
>> library.
>
> The biggest problem I see here is each doctest only seems to do a
> single iteration of the function in question. I am sure there are
> exceptions to this. But I don't see how you hope to catch corner cases
> by testing fewer times than I have fingers and toes.

That's a matter of convention. To encourage people to write more
useful tests, here's one I wrote. Look at
sage/matrix/matrix_integer_dense_hnf.py at the function "def
sanity_checks(times=50, n=8, m=5, proof=True, stabilize=2,
check_using_magma = True)". It has doctests that do HNF using
several different systems on random input and compare the results.
It could be made to run longer than it does now though. Amusingly,
when I wrote that function, I immediately found a serious bug in
PARI... (which the pari devs immediately fixed, of course).

> Still, one line in four, if accurate, doesn't sound like a bad ratio.
> Writing test code in Python is easier than in some other languages, so
> I don't know what is a good ratio for python code.

Well the docstrings are mainly useful for users and people reading the
code. I really hope we also add tons of unit tests, especially when
the doctest coverage gets to 100%...

>> That said, the biggest obstruction to Sage on Windows has
>> always been lack of focused work on the port by people who knew what
>> they're doing.  Sage only got ported to Solaris because David Kirkby
>> put in an _enormous_ amount of effort making that happen.  Something
>> similar is needed for windows.  Blair Sutton started in that
>> direction, but ran out of steam after about 4 months; David Kirkby was
>> much, much more longterm persistent.
>
> There is almost infinitely more work in a native Windows port than a
> Solaris port. Solaris is similar enough to other unixes to make a port
> doable by one person. I think it is completely unfair to say Blair
> failed because David was more persistent. Blair failed because it is a
> much, much harder job, and he got tired.

He got tired after about 1/10th of the time had elapsed.
I agree that the "native windows port" will be harder though, because it will
involve fundamentally rewriting how some of Sage works. E.g.,
switching entirely to
libGAP (a C interface to GAP) would be a good step.

> One person will never finish the job on Windows. It needs a team of
> skilled individuals. But a smaller bite-sized useful chunk of Sage at
> a time is far more likely to be ported than the whole thing at once.

I'm sure one person could do it. I think I could do it. Or Mike
Hansen. Or Carl Witty. But I'm probably not going to, since I have
more important things to do.
I might come back to it in a few years though, if nobody else has done
it first.

>>     * Bill suggests that the development of the core of Sage should be
>> broken up into 20 or so special groups, etc.   I disagree with this;
>> personally I think specialized groups in different research areas
>> should instead develop packages *on top* of Sage, while *everybody*
>> contributes to making the core of Sage much more bugfree and stable
>> than it already is.   But this shouldn't happen until we iron out the
>> bugs about how to best develop packages on top of Sage see remarks
>> above).
>
> I don't recall suggesting the "core of Sage" should be broken into 20
> pieces. That would be totally pointless.

Cool. I'm glad I misunderstood.

> Or are you suggesting that everything that is currently in that 300mb
> tarball should be considered "core Sage". I think that is a completely
> and utterly unsustainable model.

Yes, that's what I mean by "core sage".

> The core should be *much* smaller. Then communities should be building
> on top of that smaller core with packages relevant to their area.
> That's what I mean by modularisation.
>
> I cannot think of a single other project out there that has a single
> monolithic 300mb source tarball released every other week.

Sage is unfortunately not released every other week.

Sage isn't about copying other projects. Just because you don't know
of other projects that are like Sage, doesn't mean Sage must fail or
is "unsustainable".

> Anyhow, I
> don't think we disagree on the general principle that modularity is a
> good thing.

Yes. I also agree that growing the "core of Sage", whatever that
means, too much is not the right next direction to go in. The next
direction should be to have a library version of Sage, root out bugs,
make the core Sage library even higher quality and harder to get code
into, and make it really, really easy for people to build their own
packages on Sage (and host them on PyPI, say), etc. The Python
community has done tons for us already, and we should reap the
benefits of all that.

>> Michael even goes on to remark that "FLINT is a bloated pig...". :-)
>Yeah but it runs really fast, and he was right. Why do you think we
> rewrote it from scratch.

I had to include a funny mabshoff quote. I for one definitely
appreciate FLINT!

>> We switched to Pari SVN for several reasons including being told by
>> Karim that the time was right and that a stable release based on it
>> was "coming soon".
>
> Great! Git version of flint 1.6 is available here:
> http://selmer.warwick.ac.uk/FLINT.git
> branch test_code.
> The time is right and a stable release based on it is "coming soon".

OK, let me add that in addition to being told the time was right, Robert
Bradshaw, John Cremona, and I were all in the same place at Sage Days
looking for a challenging 3-day project. And John really wanted the
new PARI since it fixed some bugs. Also, he offered us a free
dinner...

-- William

John Cremona

Nov 3, 2010, 4:20:16 PM
to sage-...@googlegroups.com
> OK, let me add that in addition to being told the time was right, Robert
> Bradshaw, John Cremona, and I were all in the same place at Sage Days
> looking for a challenging 3-day project.   And John really wanted the
> new PARI since it fixed some bugs.  Also, he offered us a free
> dinner...
>

Did I really? I had completely forgotten -- and I'm sure there were
no conditions attached!

John

John H Palmieri

Nov 3, 2010, 5:20:28 PM
to sage-devel
I think you should include EXAMPLES also, since those actually contain
most of the doctests in the Sage library. "TESTS" blocks are not used
very systematically, I think.

--
John

Bill Hart

Nov 3, 2010, 10:04:31 PM
to sage-devel
I've updated it, and yep, that makes a very big difference. Pretty
healthy looking.

I should really check a few and make sure there aren't other sections
that are getting counted as tests (I have to explicitly exclude any
sections for them to not be counted). But for the couple of files I
checked the numbers looked right.

Bill.

Dr David Kirkby

Nov 3, 2010, 10:09:04 PM
to sage-devel


On Nov 3, 5:29 am, Bill Hart <goodwillh...@googlemail.com> wrote:
> Hi all,

Hi Bill

> Now the MPIR test code is pretty extensive and really ought to have
> picked up this bug. We put a lot of time into the test code for that
> MPIR release, so this is unfortunate.

Bugs are inevitable. Anybody that believes one can write a non-trivial
piece of bug-free code is mistaken. NASA can't do it. The aviation
industry can't do it.

So any thoughts of writing bug-free code are just pointless. It will
never happen. The best we can do is to reduce the probability of
bugs.

> However, the entire Pari test suite and the entire Sage test suite
> (with an older version of Pari) passed without picking up this pretty
> serious bug in the MPIR division code!
>
> I think this underscores something I have been saying for a long time.
> Sage doesn't test the C libraries it uses well enough.

I agree. There tends to be a trust that the upstream developers are
good mathematicians and their code is often widely used, so it must be
ok. But I'm afraid that in many cases in Sage, the upstream
developers' skill set is often not in software, but in mathematics.

> As a result of
> that, it is taking inordinate amounts of developers' time to track
> down bugs turned up by Sage doctests when spkg's are updated. In some
> cases there is actually woefully inadequate test code in the C library
> itself.

Agreed. I think however there is now an increased awareness of this,
and the situation is improving. I know recently someone proposed
adding some code to Sage, and I asked if an audit had been done of the
library. It was clear that in that case the developers were clearly
quite skilled at writing software. It was well commented, clean etc. I
did not check the maths of it, but it passed the first stage in that
the developers were clearly vigilant in what they were doing. It had
test code etc.

> But even when this is not the case, it makes sense for Sage to
> do some serious testing before assuming the library is bug free. This
> is particularly easy to do in Python, and much harder to do at the
> level of the C library itself, by the way.

> I have been saying this for a very long time, to many people. *ALL*
> mathematical libraries are broken and contain bugs.

Of course. Any non-trivial piece of code will contain bugs.

> If you don't test
> the code you are using, it *is* broken.

Even if you do test, if the code is non-trivial, it will contain bugs.
There is no such thing as a non-trivial bug-free program.

> The right ratio of test code
> to code is really pretty close to 50/50. And if you think I don't do
> this myself when I write code (even Sage code), well you'd be wrong.

Where do you get this 50:50 figure from, Bill? Did you call rand(), or
is it based on any hard facts?

What appears to be a rather extreme example is the SQLite database
which is in Sage, where the test code is 647 times bigger than the
library itself:

http://www.sqlite.org/testing.html

If you pick up any decent book on software engineering you will find
that the most expensive part of developing commercial software is the
maintenance costs. So it does not surprise me one bit that a lot of
time is spent in Sage in resolving bug problems - that seems pretty
normal in the computer industry.

> One solution would be for everyone to test more widely. If you write
> code that depends on feature Y of module X and module X doesn't
> properly test feature Y, assume it is broken and write doctests for
> that code as well as the code you are writing yourself.

Unfortunately that will often become impractical, and I think would
easily consume more than the 50:50 mix you mention above.

> To give an example, Andy Novocin and I have been working on new
> polynomial factoring code in FLINT for a couple of years now. Around 6
> months ago we had a long test of some 100,000 or so polynomials
> factoring correctly. We also had a long test of some 20 odd very
> difficult polynomials factoring correctly. Thus there was no reason at
> all to suppose there were *ANY* bugs in the polynomial factoring code
> or any of the functions it made use of.

That's just rubbish. If this code is non-trivial, then you must expect
bugs.

> By Sage standards I think this
> is an insane level of testing.

It's certainly higher than normal in Sage, I would agree. There is a
wide mix of Sage developers: some who pay very scant attention to code
quality, others who do take it seriously.


> But I insisted that every function we have written have its own test
> code. This has meant 6 months more work (there was something like
> 40,000 lines of new code to test). But I cannot tell you how many new
> serious bugs (and also performance problems too) that we turned up.
> There must be dozens of serious bugs we've fixed, many of which would
> have led to incorrect factorisations of whole classes of polynomials.

I'm really puzzled you are surprised at this.

> The lesson for me was: just because my very extensive 5 or 6 doctests
> passed for the very complex new functionality I added, does not mean
> there aren't incredibly serious bugs in the underlying modules I used,
> nor does it mean that my new code is worth printing out and using as
> toilet paper.

You do not need 6 months' worth to prove this - you can pick up any
decent book on software engineering and these issues are discussed.

> Detecting bugs in Sage won't make Sage a viable alternative to the
> MA*'s (that a whole nuther thread).

Agreed. But failing to detect too many bugs will make Sage non-viable
for a large number of users.

> After all, testing standards in
> those other packages are quite possibly much worse.

I do believe Wolfram Research take testing quite seriously. Whilst I'm
aware that some of what is written at

http://reference.wolfram.com/mathematica/tutorial/TestingAndVerification.html

is total rubbish, I doubt they actually lie about the testing
methods they use. They certainly claim to use a lot of techniques
which Sage does not.

> But testing more
> thoroughly will mean less time is spent wasted trying to track down
> bugs in an ad hoc manner, and eventually, much more time available for
> addressing those issues that are relevant to becoming a viable
> alternative.

Agreed. But one has to balance the conflicting demands of testing code
better, and adding more features. Finding the right balance point is
not easy. Different developers each have their own ideas.

> Bill.

Bill Hart

Nov 4, 2010, 12:00:52 AM
to sage-devel
Hi David.
Well, I only have to find one person who agrees with me :-)

http://blog.flipbit.co.uk/2009/06/what-code-coverage-percentage-should.html

But it is roughly based on the idea that each call to a function from
elsewhere in a project should equal one more test for that function,
and this somehow seems to naturally work out to about that amount.
Usually slightly less I think. But it depends on the language and the
type of testing done obviously.

>
> What appears to be a rather extreme example is the SQLite
> database which is in Sage, where the test code is 647 times bigger
> than the library itself:
>
> http://www.sqlite.org/testing.html

Now that's my kind of test suite!!

>
> If you pick up any decent book on software engineering you will find
> that the most expensive part of developing commercial software is the
> maintenance costs. So it does not surprise me one bit that a lot of
> time is spent in Sage in resolving bug problems - that seems pretty
> normal in the computer industry.
>
> > One solution would be for everyone to test more widely. If you write
> > code that depends on feature Y of module X and module X doesn't
> > properly test feature Y, assume it is broken and write doctests for
> > that code as well as the code you are writing yourself.
>
> Unfortunately that will often get impractical, and I think would
> easily consume more than the 50:50 mix you mention above.
>
> > To give an example, Andy Novocin and I have been working on new
> > polynomial factoring code in FLINT for a couple of years now. Around 6
> > months ago we had a long test of some 100,000 or so polynomials
> > factoring correctly. We also had a long test of some 20 odd very
> > difficult polynomials factoring correctly. Thus there was no reason at
> > all to suppose there were *ANY* bugs in the polynomial factoring code
> > or any of the functions it made use of.
>
> That's just rubbish. If this code is non-trivial, then you must expect
> bugs.

Of course.

>
> > By Sage standards I think this
> > is an insane level of testing.
>
> It's certainly higher than normal in Sage I would agree. There are a
> wide mix of Sage developers, so who pay very scant attention to code
> quality, others who do take it seriously.
>
> > But I insisted that every function we have written have its own test
> > code. This has meant 6 months more work (there was something like
> > 40,000 lines of new code to test). But I cannot tell you how many new
> > serious bugs (and also performance problems too) that we turned up.
> > There must be dozens of serious bugs we've fixed, many of which would
> > have led to incorrect factorisations of whole classes of polynomials.
>
> I'm really puzzled you are surprised at this.

I am puzzled you think I was expressing surprise. I was well aware of
what would turn up with better test code.

>
> > The lesson for me was: just because my very extensive 5 or 6 doctests
> > passed for the very complex new functionality I added, does not mean
> > there aren't incredibly serious bugs in the underlying modules I used,
> > nor does it mean that my new code is worth printing out and using as
> > toilet paper.
>
> You do not need 6 months worth to prove this - you can pick up any
> decent book on software engineering and these issues are discussed.
>
> > Detecting bugs in Sage won't make Sage a viable alternative to the
> > MA*'s (that a whole nuther thread).
>
> Agreed. But failing to detect too many bugs will make Sage non-viable
> for a large number of users.
>
> > After all, testing standards in
> > those other packages are quite possibly much worse.
>
> I do believe Wolfram Research take testing quite seriously. Whilst I'm
> aware of some of which is written at
>
> http://reference.wolfram.com/mathematica/tutorial/TestingAndVerificat...
>
> is total rubbish, I doubt they actually lie about some of the testing
> methods they use. They certainly claim to use a lot of techniques
> which Sage does not.
>
> > But testing more
> > thoroughly will mean less time is spent wasted trying to track down
> > bugs in an ad hoc manner, and eventually, much more time available for
> > addressing those issues that are relevant to becoming a viable
> > alternative.
>
> Agreed. But one has to balance the conflicting demands of testing code
> better, and adding more features. Finding the right balance point is
> not easy. Different developers each have their own ideas.
>

True.

Bill.