Re: [sage-devel] Re: bug wranglers

William Stein

Oct 21, 2010, 1:24:13 AM

to sage-...@googlegroups.com, ps...@googlegroups.com

On Wed, Oct 20, 2010 at 9:33 PM, Robert Bradshaw
<robe...@math.washington.edu> wrote:
> On Wed, Oct 20, 2010 at 5:10 AM, Johan S. R. Nielsen
> <J.S.R....@mat.dtu.dk> wrote:
>> I think that Burcin's suggestion is excellent. Development of Sage
>> should definitely move towards more structure, documentation, testing
>> and other software engineering practices, but as for any Open Source-
>> project, these things should come naturally as the project grows and
>> matures; as has already happened with Sage a lot, it seems (for a
>> relative new-comer like me). To require too much too soon would kill
>> the joy of working on the project and thus kill the project itself.
>
> +1 I have definitely seen that the level of bureaucracy going up,
> especially in the last year or two, has turned off a lot of potential
> and even former developers.

Another +1. The level of bureaucracy has gone up so much in the last
year that it has very seriously turned me off.

> The focus on software engineering and
> testing can certainly be good for quality (though that's not an
> immediate or certain implication), but the problem is that too much
> emphasis on it has a significant chilling effect on contributions. The
> lag time between hacking out some code and getting it in is way too
> high these days; it discourages contribution and sucks up a lot of
> development time and energy with endless rebases and waiting.

It kills a large portion of potential contributions of code for
advanced research, perhaps 80% or more. I've talked to sooooo many
people about this in the last few months...

> And
> though we all want to produce bug-free code, holding that up as the
> primary objective (as opposed to producing useful code) I think
> dissuades people from submitting or refereeing code.

Also, anybody who has significant experience with software engineering
knows that insisting on bug-free code is a recipe for producing almost no
code.

I was talking to somebody today who works on Microsoft Windows
(actually implementing the OS there), and who has also written a lot
of code for Sage (at a "research level" -- advanced number theory
stuff). He said that internally at Microsoft code gets into the system,
and gets used by a lot of people (internally), much more quickly
than with Sage. Instead of the very "all or nothing" model that we
tend to have, they have many levels of review that code goes through.
Sage would benefit from something similar. That's basically what
http://purple.sagemath.org/ is about: a way to get code out there and
distributed, used, etc., but without all the bureaucracy. As an
example, I'll sit down this coming Tuesday with Sal Baig, and get his
and Chris Hall's library for computing with elliptic curves over
function fields into PSAGE, and have it be in the next release.
That's code that isn't stable yet and is mainly for research. For a
year they have been trying to get it into Sage, but it just isn't
happening, since they care and know much more about *using* the code
for research purposes, than about how to make a proper
makefile/autoconf setup, so it builds fine on Solaris and OS X 10.4.

I think PSAGE will show that the increasingly bureaucratic and
plodding way in which code gets into Sage isn't necessarily bad, in
the same sense that Debian can be very plodding and bureaucratic, but
it still provides a good foundation for many other much more svelte
and useful Linux distributions.

> I'm not sure I have a solution, but one thing that keeps coming to
> mind is that Sage is trying to span several audiences with different
> goals and criteria, and perhaps the various audiences would be best
> met by a stable vs. unstable model like Debian. Purple sage is an
> extreme move in this direction (more like Debian experimental)--I can
> certainly see the attraction and value, but I just hope it doesn't
> become an incompatible fork.

A difference from Debian experimental is that PSAGE starts by removing
over 20 standard packages from Sage. In fact, so far, that is
essentially *all* that PSAGE is. Also, my intention is that most
Python code in PSAGE go into a different Python module than the "sage"
one.

>> Burcin's suggestion seems to fit this curve pretty well at this time.
>> New developers and bugfixers -- with little overview of the monster
>> that is Sage -- would feel more confident in reporting and fixing bugs
>> if there was a feeling that there was someone (or a group of someones)
>> with overview and structure. If some enthusiastic veterans could be
>> found and agree on the exact model of this, I think it would improve
>> bug-tracking and -fixing in a number of ways:
>> - overview of bugs, their severity and class (by cleaning up,
>> removing duplicates, collating related tracs, and reclassifying)
>> - better classification of bugs by everyone else (by monkey-see-
>> monkey-do)
>> - better overview over bugs to fix before releases (by better
>> overview over all bugs)
>> - shorter pickup-time between a trac being filed (possibly by
>> someone not interested in fixing it) and someone looking at it
>> - assurance that a veteran has looked at the trac, accepted it, and
>> maybe even given an approving nod after positive review
>> - and all of this gives more confidence to developer-rookies
>> I think the system should entirely supersede the automatic-owner
>> system that is currently in Sage. Software-speaking, this would
>> provide an abstract interface between the tracs and those responsible
>> for it, which makes it more flexible to have either one, several or
>> many owners of a trac or class of tracs.
>>
>> Personally, I think the one-week-at-a-time suggestion (though several
>> people should be on duty each week perhaps) sounds best. However, it's
>> easy for me to say, as I don't think I have the required experience to
>> undertake this duty ("You are not bug-ninja material, SOLDIER! Drop
>> down and give me a recursive function generating the Fibonacci
>> sequence!"). When and if the time comes, I would be happiest with a
>> one-week-at-a-time schedule, though.
>>
>>> Burcin wrote:
>>> Perhaps we should come up with a locking mechanism, to prevent two
>>> different people from trying to sort the same issue at the same time,
>>> but it feels like too much organization at the beginning.
>> Maybe there would not need to be a locking-mechanism to begin with,
>> but surely a mechanism so that a bug-wrangler could see that no other
>> bug-wrangler has already looked at this new trac.
>
> I agree that a big part of the problem is that it's hard to get a big
> picture of all the bugs being worked on. The idea of a weekly
> "bug-wrangler" is an interesting one. I have a simpler proposal (which
> may be insufficient, but would complement a bug-wrangler's role and is
> much easier to implement).
>
> First, have the initial status of tickets be some pre-new stage.
> (Something like "unclassified".) This would make it so you don't have
> to be an expert to classify a bug. Volunteers could go and look at all
> unclassified tickets and file them appropriately (severity, defect vs.
> enhancement, component, duplicate, etc.) Of course, there could be a
> rotating bug-wrangler, but if it was easy enough for a "veteran" to
> hop on and categorize a bunch of them in one sitting this might not
> even be necessary.

This is perhaps already provided by:

http://spreadsheets.google.com/viewform?key=pCwvGVwSMxTzT6E2xNdo5fA

Harald Schilly could probably set things up so more people can wrangle the bugs
that come in through that page.

> Second, rather than have the default milestone be the next release,
> have some default future milestone.

There's no default milestone. See for yourself:

http://trac.sagemath.org/sage_trac/newticket

It being the next release is merely a social convention... That said,
of course you mean "instead of whatever is done now..."

> Only tickets that are actively
> being worked on get put on the current release. This would make it
> much easier to see what's being worked on (or at least cared about)
> and what to expect in the next release. Sufficiently important items
> (blockers) would also get added here.

Couldn't something like that be accomplished with a trac report
showing the 50 "most active" tickets? Sort of like how
http://ask.sagemath.org/questions/ by default shows the 50 most
active questions, and http://mathoverflow.net/ shows the top 50 or so
most active questions?
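
To make the "most active" idea concrete, here is a minimal sketch of
how such a ranking could be computed. It is not a trac report and does
not use the trac schema: the function name most_active, the (ticket_id,
changed_at) input shape, the 30-day window, and the sample ticket
numbers other than 9967 are all assumptions made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def most_active(changes, now, window_days=30, limit=50):
    """Rank ticket ids by number of changes in the last `window_days` days.

    `changes` is an iterable of (ticket_id, changed_at) pairs. This input
    shape is hypothetical; it is not the actual trac database schema.
    """
    cutoff = now - timedelta(days=window_days)
    # Count only the changes that fall inside the recent window.
    counts = Counter(tid for tid, when in changes if when >= cutoff)
    # Most changes first; ties are broken by the smaller ticket id.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tid for tid, _ in ranked[:limit]]


# Made-up sample data for illustration only.
now = datetime(2010, 10, 21)
changes = [
    (9967, datetime(2010, 10, 20)),
    (9967, datetime(2010, 10, 19)),
    (1234, datetime(2010, 10, 18)),
    (5678, datetime(2010, 6, 1)),  # older than the window, so ignored
]
print(most_active(changes, now))
```

Counting changes inside a recent window, rather than over a ticket's
whole lifetime, is one possible reading of "active"; a ticket with a
long but stale history would then drop off the list on its own.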

> Thirdly, I think the list of components could be cleaned up.

For the last year people kept asking to add new trac components (I
know, because I had to add them). There used to only be a handful.

Just out of curiosity, is trac still the best system to use for
managing what we're managing? You work at Google and they have their
own free http://code.google.com/ thing, which provides similar
capabilities to trac, but with integrated code review, the possibility
to delete comments (w00t!), etc., native support for Mercurial (which
trac barely supports, preferring SVN), easy forking, etc. I have so
far little experience with code.google.com, but I'm sort of curious
how good it is. I set it up for PSAGE here
http://code.google.com/p/purplesage/ since I didn't want to have to
manage a whole trac, hg repo, etc., yet again.

> All of these would help us get a better picture of the overall state
> of Sage. Finally, we need more automation. Refereeing code shouldn't
> have to involve downloading and applying patches and running all
> tests--that should all be done automatically (with failing tickets
> bounced right away, or at least in a 24-48 hour window). We should
> have a notebook server with sessions of Sage with various tickets
> already applied for quick refereeing (or a CLI interface on, say,
> boxen for those who prefer that--"telnet/ssh
> 10921.sage.math.washington.edu" would be cool, opening you immediately
> into a jailed sage session). I'll plug
> http://trac.sagemath.org/sage_trac/ticket/9967 which is a requirement
> for this and I just recently started to write a per-ticket build bot.
> This would help close the gap and lag between writing code and getting
> it into Sage, and I think that having such a small gap was one of the
> key ingredients in letting Sage explode like it did.

+1 You hit the nail on the head. The small gap absolutely *was* the
reason. I missed so many key things in (social and other) life
during the first 1.5 years of Sage in order to make the next releases,
because I knew how absolutely critical it was putting together new
releases in order to draw in developers and keep the momentum going.
Making a release 1 day earlier literally made a huge difference
during that time. Many developers who got involved then, because of
the fast cycle, have since stopped working on Sage. Now the release
cycle is about 2 months between releases. You can see here (look at
the dates) what it used to be: http://sagemath.org/src-old/

Again, I'm not necessarily claiming Sage itself needs to move that
quickly again. This is very difficult technically due to the larger
number of platforms supported, the larger test suite, codebase, etc.
But something does need to change, or some of the truly brilliant
computational mathematics researchers (like Mark Watkins, say) will be
unlikely to be drawn in again. For me, PSAGE will accomplish this
goal, while allowing me to continue to benefit from the incredibly
valuable high quality work that so many people are doing on making
Sage a solid, well tested, cross-platform system.

-- William

--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org
