What to do with research-y code.


Robert Bradshaw

Nov 26, 2012, 12:19:24 PM
to sage-devel
This is somewhat a continuation of the "permutations...again" thread,
but I think the topic is much broader than that. Over time
contributing to Sage has become increasingly bureaucratic with the goal
(I hope) of getting higher-quality, more stable code.

Raising the bar on Sage code quality creates this limbo area of code
that's good enough to be shared/built upon, but not good enough to be
included in Sage. The combinat folks seem to have realized this from
the beginning (hence the combinat queue) and this was also the
motivation for psage http://purple.sagemath.org/goals.html (see
especially "Change the development model"). I don't see this changing
anytime soon.

On the other hand, it's very important that code like this not get
lost, and there is value added by taking code to the next level (e.g.
http://sagemath.blogspot.com/2011/12/when-using-sage-to-support-research.html
) If something like the salvus model
http://sagemath.blogspot.com/2011/11/is-time-ripe-for-httpsagenbcom.html
takes off, and I'm optimistic it will, that might open up
opportunities for others to do this work, or (more interestingly)
researchers to secure funding that allows them to put on their
developer hat as part of their day job rather than squeezing it in on
nights and weekends between teaching and research. (Of course it would
also be nice if the academic landscape shifted to give credit to these
worthwhile endeavors, but that's not going to happen overnight.)

So, what can we do? We're looking to shake up the development model
next year, and I think it'd be useful to see how to best leverage this
"alpha-quality" research code from Sage *and* provide an easy (or at
least no-harder-than-necessary) way to migrate such code into Sage
itself once it's ready (with incentives). What tools, models, and
conventions would be conducive to this? The intent seems to be to
support sage-combinat-like queues or patch overlays, but is this the
best model? (It works, but it seems to be a lot of overhead and has
its drawbacks.) Long-lived git branches? Independent forks in separate
namespaces (like psage)? A very stable sage-core with various
(optional?) overlays? (Would this allow for an even more "sound
software engineering practices" base without the tension to apply
unrealistically high bars everywhere? I've seen good patches or even
bugfixes not go in for ages due to trivial documentation issues...
(being able to *easily* make trivial changes or build upon the work of
others will help here)) Decorators for "alpha-quality" methods that
don't exist in the "stable" model? Any other ideas?
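
To make the decorator idea concrete, here is a minimal sketch of what marking a method as "alpha-quality" could look like. The names `alpha_quality` and `AlphaQualityWarning` are made up for illustration and are not an existing Sage API; the toy function is only there to have something to decorate.

```python
import functools
import warnings


class AlphaQualityWarning(FutureWarning):
    """Hypothetical warning category for research-grade code."""


def alpha_quality(reason=""):
    """Mark a function as alpha-quality (illustrative helper, not a Sage API)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Warn on every call; Python's warning filters decide what is shown.
            warnings.warn(
                "%s is alpha-quality and may change or be wrong. %s"
                % (func.__name__, reason),
                AlphaQualityWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        # A flag that tools could inspect to list or hide such methods.
        wrapper.alpha_quality = True
        return wrapper
    return decorator


@alpha_quality("Only tested on small inputs.")
def count_inversions(seq):
    """Count pairs i < j with seq[i] > seq[j] (toy example)."""
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )
```

With something like this, a "stable" configuration could turn `AlphaQualityWarning` into an error via the standard warning filters, while a research configuration just lets the warning through.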

- Robert

kcrisman

Nov 26, 2012, 12:46:48 PM
to sage-...@googlegroups.com

Great thoughts, Robert; I don't know if it will go anywhere, but it's worth revisiting every so often.
 

> Raising the bar on Sage code quality creates this limbo area of code
> that's good enough to be shared/built upon, but not good enough to be
> included in Sage. The combinat folks seem to have realized this from
> the beginning (hence the combinat queue) and this was also the
> motivation for psage http://purple.sagemath.org/goals.html (see
> especially "Change the development model"). I don't see this changing
> anytime soon.


Right.
 
> nights and weekends between teaching and research. (Of course it would
> also be nice if the academic landscape shifted to give credit to these
> worthwhile endeavors, but that's not going to happen overnight.)


Yes; the places where this is most likely to change (like where I teach) are also least likely to have enough time to do it right - at which point I apologize for not dealing with many, many bugs I probably could but simply don't have the time to currently fix :(
 
> bugfixes not go in for ages due to trivial documentation issues...
> (being able to *easily* make trivial changes or build upon the work of

You are so right here, and I'm sure I've been guilty of some of them.  The GUI model on Github is good here (the rest of Github I could do without).  The effort to make those trivial changes is often not worth it.
 
> others will help here)) Decorators for "alpha-quality" methods that
> don't exist in the "stable" model? Any other ideas?


Honestly, I think that what makes sage-combinat work is not the queue, but the community.  Perhaps because of the very high bar of correctness mathematics has (or attempts to have), potential Sage developers are reluctant to review code that they don't 100% understand.  Then that goes double for the research-y ones.   Probably the best thing any research-y code developer could do is to train two colleagues/grad students/whatever somewhere on Earth to really understand their code, *even if in a separate email and not on Trac*.   Otherwise what I've observed is patches languishing because someone (again, often me) doesn't have time or energy to do that last 2% of verification that takes 98% of the review time, and once a few weeks have passed since the initial post of code, the original author has often moved on to something else for the time being.  Conversely, when people are together, lots of stuff happens, and even often after the initial Sage Days or what have you, because you feel more of a personal responsibility for seeing the project through.

That's a long way of saying maybe we need to encourage people posting this type of code (or any, really) to make the effort to identify people, whether current Sage developers or not, to work alongside - people who will have enough motivation to use the new code that they will care, but who aren't so invested in it that things like documentation or edge cases will seem pointless to check.  How does this piece work in psage?  In sage-combinat, again, the queue seems to solve this, and there is a lot of impressive teamwork.  I know that for my own research code I know of no one else in this category, so I haven't bothered, but presumably people with grad students or postdoc colleagues or faculty mentors or coauthors will have people to identify for this... ?

- kcrisman
 

P Purkayastha

Nov 26, 2012, 12:53:31 PM
to sage-...@googlegroups.com
I think the Sage community could quickly expand and there could be tens,
if not hundreds, of git development branches once the switch to git
occurs. It would be quite hard to keep track of all the different
branches and the individual modifications that people have in their
forks. I looked up scipy right now, and that itself has over 500
watchers and 200 forks. The situation is the same for matplotlib, and
almost the same for mathjax. It would be nice to see how those
communities cope with such a huge number of forks and development branches.

What I describe below is one way I think we could have access to the
many individual patches and "alpha-quality" code people might have.

To encourage people to contribute back high-quality versions of their
research projects into Sage, one thing that could be done is to enable a
wiki page where the developers can mention or list their current
project/unpolished code. The hope is that such a model will help the
person get feedback for his/her code and the person can get encouraged
to eventually submit it to trac and include it with Sage. It often
happens with me that I get a bit more motivation to finish/polish my
work once someone asks me for it - the feedback helps me know that the
code might be useful for someone else too! I wonder if other people here
have faced similar situations.

The wiki page could be similar to the one used in
http://godashboard.appspot.com/ (that's not a wiki page though). There,
a lot of projects are listed; though I am not sure how it is maintained.
The positive aspect of having a wiki page is that any developer who has
been given access to it can edit it and list his/her project or code.
This could also help to consolidate all the different and separate
development that goes on in psage and combinat, into one place where you
can access all those links and sub-projects (I wasn't even aware of
psage for instance).

A big question is, of course, how one can get easy access to all this
development code. I don't have a solution for this.

About getting trivial patches merged in - I don't know how it can be
effected. One option is to use the online editing facility of github
itself, which has already been used a bit in the notebook development. I
don't know where the Sage library will be hosted though.


Dima Pasechnik

Nov 27, 2012, 12:53:46 AM
to sage-...@googlegroups.com
On 2012-11-26, Robert Bradshaw <robe...@gmail.com> wrote:
> This is somewhat a continuation of the "permutations...again" thread,
> but I think the topic is much broader than that. Over time
> contributing Sage has become increasingly bureaucratic with the goal
> (I hope) of getting higher-quality more-stable code.
>
> Raising the bar on Sage code quality creates this limbo area of code
> that's good enough to be shared/built upon, but not good enough to be
> included in Sage. The combinat folks seem to have realized this from
> the beginning (hence the combinat queue)
the biggest issue with it is that they failed to sanitize the very old Sage
foundations they use: e.g. look at the badly abused Permutation_class, which
is used for all sorts of sequences, many of which have absolutely
nothing to do with permutations per se.
By the way, this currently makes Sage basically unsuitable for teaching
permutations to undergraduates, as they, quite understandably,
immediately get lost, dazed and confused.

I actually think that the Permutation_class issue deserves a coordinated
cleanup effort. Two patches Nathann just fixed,
#13742 and #13750, only touch the tip of the iceberg, it seems to me
as their reviewer.
One option: introduce a dedicated class Sequence or some such thing to cater for
"permutations" which are actually not permutations.
(Could be that this already is happening, I don't know).

Dima

Burcin Erocal

Nov 27, 2012, 2:48:13 AM
to sage-...@googlegroups.com
On Tue, 27 Nov 2012 01:53:31 +0800
P Purkayastha <ppu...@gmail.com> wrote:

<snip Robert's post>
> I think the Sage community could quickly expand and there could be
> tens, if not hundreds, of git development branches once the switch to
> git occurs. It would be quite hard to keep track of all the different
> branches and the individual modifications that people have in their
> forks. I looked up scipy right now, and that itself has over 500
> watchers and 200 forks. The situation is the same for matplotlib, and
> almost the same for mathjax. It would be nice to see how those
> communities cope with such a huge number of forks and development
> branches.

Note that most of the research code we are talking about is either a
single .sage file or a bunch of .py files in a directory, totaling
at most 1500 - 2000 lines of code. Here is a good example:

http://math.bu.edu/people/rpollack/OMS/OMS.zip

from http://math.bu.edu/people/rpollack/


I cannot imagine research mathematicians wrangling forks of the Sage
library (IIRC, ~ 500k lines of code) just to get a small piece working.
In most cases, these forks will contain old versions, untouched since
the paper was published, so even merging with the latest Sage release
will be nontrivial. Especially if the research code required changes in
some core Sage library class (add a function to number fields say), only
a few people really familiar with Sage (and the DVCS system in use) can
handle the merge.


This is all to say that git forks are not going to solve the problem
Robert brought up.

> What I describe below is one way I think we could have access to the
> many individual patches and "alpha-quality" code people might have.

Here is another way, which is not at all new. :) Do what William does
with psage:

If your code exceeds the "single .sage/.py file" threshold, it is
fairly easy to create a Python package out of it. With the myriad of
Python packaging solutions (easy_install, pip, etc.), installing such a
package given a URL is also trivial. So just publish the URL on your
home page, announce it to your colleagues and you're good to go.

There are several problems with this approach, which I mention below.

> To encourage people to contribute back high-quality version of their
> research projects into Sage, one thing that could be done is to
> enable a wikipage where the developers can mention or list their
> current project/unpolished code. The hope is that such a model will
> help the person get feedback for his/her code and the person can get
> encouraged to eventually submit it to trac and include it with Sage.
> It often happens with me that I get a bit more motivation to
> finish/polish my work once someone asks me for it - the feedback
> helps me know that the code might be useful for someone else too! I
> wonder if other people here have faced similar situations.

The wiki page is a good idea, but it would be filled with stale
information quickly if it is not supported by some infrastructure to
keep it up to date.

You're right that getting feedback is a great encouragement to polish
the code (which is a big advantage of the combinat model), but I don't
see why it also encourages people to submit it to Sage. The review
process can be quite painful after all (see #9016). In many cases, from
a professional/career perspective, this is even bad for you:

* once your code is readily available in Sage, people assume it's
standard functionality and stop citing/giving credit to the
implementation/paper

* the time spent to polish the code is wasted according to academic
assessment criteria, which usually only counts publications and
citations


We tried to address these problems to provide more of an incentive to
make people submit their work to Sage with the citation module:

http://trac.sagemath.org/sage_trac/ticket/3317

Here is some relevant discussion from sage-devel more than a year ago:

https://groups.google.com/d/topic/sage-devel/RtYTIxgn2io/discussion


Going back to the problems with individual packages:

- code that is not tested regularly against updates in Sage bitrots

This problem can be solved by a continuous integration system (like
patchbot) that runs the tests against changes in Sage. Depending on
hardware availability, this can happen with every beta and rc
releases, or even daily.

The developer has to commit to fixing the problems revealed by the
test suite. This would be in their interest, since it guarantees
that if somebody else ever wants to use the code in question, it
will (up to amount of test coverage) run on the latest Sage release.

- some research code goes beyond just a bunch of python files.

For example Simon's p-group cohomology package:

http://sage.math.washington.edu/home/SimonKing/Cohomology/

It includes MeatAxe, a C-library, and depends on an optional Sage
package. This provides a challenge to Salvus (or wakari:
http://continuum.io/blog/introducing-wakari) like models. Unless
they create a virtual machine for each user to play with, I don't
see how they can support installing arbitrary code.

Package dependencies for optional/experimental packages are beyond
the capabilities of the sage packaging system. Adding
"sage -i <package_name>" commands to spkg-install is a very ugly
hack.
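
As a rough illustration of the bitrot point above: the core of such a test job is small. The sketch below runs the doctests of one module and reports the failure count. The module `research_code` and the helper `check_package` are hypothetical stand-ins; a real setup (patchbot or otherwise) would import the actual package inside the Sage version being tested.

```python
import doctest
import types


def check_package(module):
    """Run the doctests found in `module`; return (failed, attempted)."""
    results = doctest.testmod(module, verbose=False, report=False)
    return results.failed, results.attempted


# Stand-in for a small research package. In practice this would simply be
# `import research_code` from wherever the author published it.
research_code = types.ModuleType("research_code")
exec(
    '''
def double(n):
    """
    >>> double(21)
    42
    """
    return 2 * n
''',
    research_code.__dict__,
)

failed, attempted = check_package(research_code)
status = "FAILED" if failed else "ok"
print("%s: %d of %d examples failed" % (status, failed, attempted))
```

Run against every beta or rc, a non-zero `failed` count is the signal that goes back to the package author.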


If you've read so far, maybe some self-plugging will be tolerated.
lmonade (http://www.lmona.de/) is designed to solve these problems,
with

- a flexible package manager that supports overlays and

- a continuous integration system (patchbot in software engineering
speak) where people can sign up to get their code tested when the
packages they depend on are updated.
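
One way the "get tested when your dependencies are updated" part could work, sketched under the assumption that each package declares its direct dependencies in a plain mapping. The package names and the function `packages_to_retest` are illustrative only and do not describe how lmonade is actually implemented.

```python
from collections import defaultdict, deque


def packages_to_retest(depends_on, updated):
    """Return every package that directly or indirectly depends on `updated`."""
    # Invert the mapping: for each package, who depends on it?
    dependents = defaultdict(set)
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(pkg)

    # Breadth-first walk over the reverse dependency graph.
    seen = set()
    queue = deque([updated])
    while queue:
        current = queue.popleft()
        for pkg in dependents[current]:
            if pkg not in seen:
                seen.add(pkg)
                queue.append(pkg)
    return sorted(seen)


# Hypothetical dependency declarations (package -> direct dependencies).
DEPENDS_ON = {
    "cohomology_pkg": ["sage", "c_library"],
    "small_research_pkg": ["sage"],
    "c_library": [],
}

print(packages_to_retest(DEPENDS_ON, "sage"))
print(packages_to_retest(DEPENDS_ON, "c_library"))
```

The design choice here is that authors only state direct dependencies; the test system works out the indirect ones.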

It still lacks many features due to lack of developer time. I plan to
change that soon (also encouraged by discussions like this one), but
any help is much appreciated nevertheless.


Cheers,
Burcin

David Kirkby

Nov 27, 2012, 3:45:49 AM
to sage-...@googlegroups.com
On 26 November 2012 17:19, Robert Bradshaw <robe...@gmail.com> wrote:

> Raising the bar on Sage code quality creates this limbo area of code
> that's good enough to be shared/built upon, but not good enough to be
> included in Sage. The combinat folks seem to have realized this from
> the beginning (hence the combinat queue) and this was also the
> motivation for psage http://purple.sagemath.org/goals.html (see
> especially "Change the development model") I don't see this changing
> anytime soon.
>
> On the other hand, it's very important that code like this not get
> lost, and there is value added by taking code to the next level (e.g.
> http://sagemath.blogspot.com/2011/12/when-using-sage-to-support-research.html

I feel the way to solve this is to have community contributed
packages, which don't form part of the core of Sage, but can be
installed by anyone if they wish to. Projects like R, Perl, autoconf
all have this.

I've never looked at the R system, but I know R has a policy of
keeping the core quite small. Perl has a huge range of
user-contributed packages, which are easy to search and easy to
install.

http://search.cpan.org/

Just to cut and paste the sections (which has lost the alphabetical
order, which I can't be bothered to sort out):

* Archiving Compression Conversion
* File Name Systems Locking
* Option Parameter Config Processing
* Bundles (and SDKs)
* Graphics
* Perl6
* Commercial Software Interfaces
* Internationalization Locale Pragmas
* Control Flow Utilities
* Language Extensions
* Security
* Data and Data Types
* Language Interfaces
* Server Daemon Utilities
* Database Interfaces
* Mail and Usenet News
* String Language Text Processing
* Development Support
* Miscellaneous User Interfaces
* Documentation
* Networking Devices IPC
* World Wide Web
* File Handle Input/Output
* Operating System Interfaces

One would need a way to search for modules, like Perl has. I just
did a search for Mathematica, and sure enough, someone has written a
Perl interface to Mathematica.

http://search.cpan.org/~jberger/Math-Mathematica-0.002/lib/Math/Mathematica.pm

Doing a search on "Prime" gives 262 packages. At least the first dozen
or so are all related to prime numbers - I did not bother looking at
them all.

You can also search by author. So if you have a research area of
maths, you could look for packages contributed by authors who are
active in your field of research.

To me that is a FAR better way forward than just adding more and more
to the core of Sage.

Obviously there would need to be a way for users to get packages added.
Given the amount of spam, it would be sensible to not leave that open,
but allow any Sage developer to add a package. If someone is not
really a developer, but feels they have something useful, they could
just drop an email to sage-support and ask that their package be
uploaded.

It *might* also be worth allowing a method for others to add comments
on the packages, but again spam would probably make that impractical.

Dave

Robert Bradshaw

Nov 27, 2012, 4:15:04 AM
to sage-devel
+1, for sure. And we have many other great sage communities.

Of course communities are hard to "engineer" compared to tools or
policies. But I think having good technology, conventions, and
practices can be conducive (or not) to community collaboration as well
as allowing individuals to spend their limited time more productively.

In particular, moving code from "I hacked something up to write a
paper" to "I wrote something I feel comfortable sharing with my
colleagues" is a huge step forward, and research-y communities are a
good stepping stone between the former and inclusion into Sage (as
well as being a source of users (aka testers), developers, and peer
reviewers of code that requires a high degree of expertise).

> Perhaps because of the very high bar of correctness
> mathematics has (or attempts to have), potential Sage developers are
> reluctant to review code that they don't 100% understand. Then that goes
> double for the research-y ones. Probably the best thing any research-y
> code developer could do is to train two colleagues/grad students/whatever
> somewhere on Earth to really understand their code, *even if in a separate
> email and not on Trac*. Otherwise what I've observed is patches
> languishing because someone (again, often me) doesn't have time or energy to
> do that last 2% of verification that takes 98% of the review time, and once
> a few weeks have passed since the initial post of code, the original author
> has often moved on to something else for the time being. Conversely, when
> people are together, lots of stuff happens, and even often after the initial
> Sage Days or what have you, because you feel more of a personal
> responsibility for seeing the project through.
>
> That's a long way of saying maybe we need to encourage people posting this
> type of code (or any, really) to make the effort to identify people, whether
> current Sage developers or not, to work alongside - people who will have
> enough motivation to use the new code that they will care, but who aren't so
> invested in it that things like documentation or edge cases will seem
> pointless to check. How does this piece work in psage?

It'd be interesting to get William's take on it, but
https://github.com/williamstein/psage/graphs/contributors is
informative. (Of course the salv.us effort started up this summer.)

> In sage-combinat,
> again, the queue seems to solve this, and there is a lot of impressive
> teamwork.

Yes, and I think it'd be really interesting to get one of them to
outline their goals (and how well the queue meets them)
rather than just their methods. Then we can either come up with
something even better, or make sure we support queues really well :).

> I know that for my own research code I know of no one else in
> this category, so I haven't bothered, but presumably people with grad
> students or postdoc colleagues or faculty mentors or coauthors will have
> people to identify for this... ?

Yep.

- Robert

Jan Groenewald

Nov 27, 2012, 4:20:33 AM
to sage-...@googlegroups.com
Hi

It would also be nice, if like R, one could run a system-wide install for users
(e.g. in a university lab environment) and users could have optional packages
either installed system-wide, or locally (e.g. .sage), and that those packages
gracefully stated their minimum sage version to work on, or gracefully (in some
sense) declined to work when there was a mismatch.
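
A minimal sketch of the "gracefully decline" part, assuming a package states its minimum Sage version as a dotted string. `require_sage`, `parse_version` and `IncompatibleSageVersion` are hypothetical names, and the installed version is passed in explicitly so the example runs without Sage. Suffixes such as "rc0" are simply ignored here, which a real implementation would probably want to handle more carefully.

```python
import re


class IncompatibleSageVersion(ImportError):
    """Raised when the installed Sage is older than a package requires."""


def parse_version(text):
    """Turn '5.4.1' or '5.5.rc0' into a tuple of its leading integers."""
    numbers = []
    for part in text.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'rc0'
        numbers.append(int(match.group()))
    return tuple(numbers)


def require_sage(minimum, installed):
    """Raise a readable error instead of failing later in some obscure way."""
    if parse_version(installed) < parse_version(minimum):
        raise IncompatibleSageVersion(
            "this package needs Sage >= %s, but found Sage %s"
            % (minimum, installed)
        )


# At the top of a package's __init__.py one might then write something like:
require_sage(minimum="5.4", installed="5.4.1")  # passes silently
```

Deriving from `ImportError` means existing code that guards optional imports with `try: ... except ImportError:` would keep working.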

Regards,
Jan









--
  .~.
  /V\     Jan Groenewald
 /( )\    www.aims.ac.za
 ^^-^^


Robert Bradshaw

Nov 27, 2012, 4:25:51 AM
to sage-devel
I think publishing git branches should be easy enough (read, a
one-liner from sage). Getting noticed/getting feedback is the hard
part.

Another crazy idea: encourage uploading of unfinished code with
optional nag reminders. I'd rather people upload code without tests
and people start reviewing it than hold onto it until they "get around
to" adding all the examples. Just having public "todo" lists can be
motivation to finish something up, or bug someone about finishing
something up.

> The wiki page could be similar to the one used in
> http://godashboard.appspot.com/ (that's not a wiki page though). There, a
> lot of projects are listed; though I am not sure how it is maintained. The
> positive aspect of having a wiki page is that any developer who has been
> given access to it can edit it and list his/her project or code. This could
> also help to consolidate all the different and separate development that
> goes on in psage and combinat, into one place where you can access all those
> links and sub-projects (I wasn't even aware of psage for instance).
>
> A big question is, of course, how can one get easy access to all these
> development code. I don't have a solution for this.
>
> About getting trivial patches merged in - I don't know how it can be
> effected. One option is to use the online editing facility of github itself,
> which has already been used a bit in the notebook development. I don't know
> where the Sage library will be hosted though.

Yep. Or at the very least:

1. Notice issue.
2. One-liner to get to a pristine version of the code.
3. Fix.
4. Send for review [also a one-liner]
5. One-liner to get back to where you were, with or without this edit.
Testing may optionally happen on your machine in the background, and
the patchbot will pick it up.

Step (2) would also include pulling someone's open ticket, so rather
than responding with "you forgot a period" you just add it in. (Online
is even better.)

- Robert

Robert Bradshaw

Nov 27, 2012, 4:56:38 AM
to sage-devel
On Mon, Nov 26, 2012 at 11:48 PM, Burcin Erocal <bur...@erocal.org> wrote:
> On Tue, 27 Nov 2012 01:53:31 +0800
> P Purkayastha <ppu...@gmail.com> wrote:
>
> <snip Robert's post>
>> I think the Sage community could quickly expand and there could be
>> tens, if not hundreds, of git development branches once the switch to
>> git occurs. It would be quite hard to keep track of all the different
>> branches and the individual modifications that people have in their
>> forks. I looked up scipy right now, and that itself has over 500
>> watchers and 200 forks. The situation is the same for matplotlib, and
>> almost the same for mathjax. It would be nice to see how those
>> communities cope with such a huge number of forks and development
>> branches.
>
> Note that most of the research code we are talking about is either a
> single .sage file or a bunch of .py files in a directory, totaling
> at most 1500 - 2000 lines of code.

Yes, that's one end of the continuum, and there's a lot of that.

> Here is a good example:
>
> http://math.bu.edu/people/rpollack/OMS/OMS.zip
>
> from http://math.bu.edu/people/rpollack/
>
>
> I cannot imagine research mathematicians wrangling forks of the Sage
> library (IIRC, ~ 500k lines of code) just to get a small piece working.
> In most cases, these forks will contain old versions, untouched since
> the paper was published, so even merging with the latest Sage release
> will be nontrivial.

But what if one could magically sync to the version they were using +
their code? Rebasing it to the present might still be work, but the
more code gets used (even in the context of an old sage) the greater
the chances it will get improved and eventually merged in by someone.

> Especially if the research code required changes in
> some core Sage library class (add a function to number fields say), only
> a few people really familiar with Sage (and the DVCS system in use) can
> handle the merge.
>
>
> This is all to say that git forks are not going to solve the problem
> Robert brought up.

Yep.

>> What I describe below is one way I think we could have access to the
>> many individual patches and "alpha-quality" code people might have.
>
> Here is another way, which is not at all new. :) Do what William does
> with psage:
>
> If your code exceeds the "single .sage/.py file" threshold, it is
> fairly easy to create a Python package out of it. With the myriad of
> Python packaging solutions (easy_install, pip, etc.), installing such a
> package given a URL is also trivial. So just publish the URL on your
> home page, announce it to your colleagues and you're good to go.
>
> There are several problems with this approach, which I mention below.
>
>> To encourage people to contribute back high-quality version of their
>> research projects into Sage, one thing that could be done is to
>> enable a wikipage where the developers can mention or list their
>> current project/unpolished code. The hope is that such a model will
>> help the person get feedback for his/her code and the person can get
>> encouraged to eventually submit it to trac and include it with Sage.
>> It often happens with me that I get a bit more motivation to
>> finish/polish my work once someone asks me for it - the feedback
>> helps me know that the code might be useful for someone else too! I
>> wonder if other people here have faced similar situations.
>
> The wiki page is a good idea, but it would be filled with stale
> information quickly if it is not supported by some infrastructure to
> keep it up to date.

For sure, this would have to be automated (or at least freshness
scores assigned).
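
To give a feel for what a "freshness score" might be: a simple option is exponential decay since the last time the code was known to pass its tests. The function name, the 180-day half-life and the project entries below are all assumptions for illustration, not an agreed-upon scheme.

```python
from datetime import date, timedelta


def freshness(last_tested, today, half_life_days=180):
    """Score in (0, 1]: 1.0 if tested today, halving every `half_life_days`."""
    age = max((today - last_tested).days, 0)  # dates in the future count as fresh
    return 0.5 ** (age / half_life_days)


# Hypothetical listing: project name -> date of last passing test run.
today = date(2012, 11, 27)
listing = {
    "project_a": today - timedelta(days=7),
    "project_b": today - timedelta(days=540),
}

# Show the most recently verified projects first.
ranked = sorted(
    listing, key=lambda name: freshness(listing[name], today), reverse=True
)
for name in ranked:
    print("%-10s %.2f" % (name, freshness(listing[name], today)))
```

If the dates come from an automated test run rather than from manual edits, the listing stays honest without anyone having to maintain it.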

> You're right that getting feedback is a great encouragement to polish
> the code (which is a big advantage of the combinat model), but I don't
> see why it also encourages people to submit it to Sage. The review
> process can be quite painful after all (see #9016). In many cases, from
> a professional/career perspective, this is even bad for you:
>
> * once your code is readily available in Sage, people assume it's
> standard functionality and stop citing/giving credit to the
> implementation/paper

Very interesting point.

> * the time spent to polish the code is wasted according to academic
> assessment criteria, which usually only counts publications and
> citations

Totally. Even if you get some credit for producing the code, polishing
it is way down on the list of being formally accredited and
appreciated (despite the appreciation of colleagues).

> We tried to address these problems to provide more of an incentive to
> make people submit their work to Sage with the citation module:
>
> http://trac.sagemath.org/sage_trac/ticket/3317
>
> Here is some relevant discussion from sage-devel more than a year ago:
>
> https://groups.google.com/d/topic/sage-devel/RtYTIxgn2io/discussion

It'd be nice to revisit this.

> Going back to the problems with individual packages:
>
> - code that is not tested regularly against updates in Sage bitrots
>
> This problem can be solved by a continuous integration system (like
> patchbot) that runs the tests against changes in Sage. Depending on
> hardware availability, this can happen with every beta and rc
> releases, or even daily.

As long as the doctests take to run when you're waiting for them, one
surprising revelation of the patchbot is how frequently you can
actually test the entire codebase. Currently, a single computer can
keep up with every patch that's uploaded to trac, and donating cycles
is very easy should we have the need.

> The developer has to commit to fixing the problems revealed by the
> test suite. This would be in their interest, since it guarantees
> that if somebody else ever wants to use the code in question, it
> will (up to amount of test coverage) run on the latest Sage release.

Typically, if someone finds something broken in my code, I am very
motivated to fix it. There's a question about the other way around:
what if their tests break because I changed the way polynomial rings
print? How easily can I go change their code? Do I have to wait for
them to accept my change? Are things just broken in the meantime
(which is actually really bad, because once things go red, you don't
notice or can't even detect further breaking changes)?

There's also the issue of cross-package changes. Maybe the code can be
refactored to minimize this (e.g. see the aspect oriented software
development thread) but I still think the coupling is quite strong
between different mathematical components, and fragmenting our library
into several packages/repositories makes development much more painful
(remember when docs and clib were their own spkg). It's possible that
research-y code is more leafy, but adding a method to number fields is
a prime example of something that would be a pain to put in a package
(at least if this became a regular pattern, it's not very scalable).

When, if ever, would this code get reviewed? Would there be a list of
standard (versioned) packages that are "part of" Sage? Would the
quality of a package be based on extrinsic factors (e.g. author
reputations)? We do already sweep this under the rug for the upstream
packages we include.

It's also nice to be able to say "Sage x.y.z" rather than "sage +
these packages, but not those packages" for reproducibility,
especially if behavioral changes as well as additional functionality
are involved. (Putting the list of packages under revision control,
with hermetic builds/environments, is one way to solve this issue.)
Also, shipping with "batteries included" is a really nice feature.
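
As a rough illustration of the parenthetical above (a package list kept
under revision control), here is a minimal sketch. The "name==version"
file format, the package names and the version strings are all made up
for the example; nothing here reflects an existing Sage tool.

```python
def parse_pins(text):
    """Parse 'name==version' lines; blank lines and '#' comments are ignored."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def diff_environment(pins, installed):
    """Return {name: (pinned, installed_or_None)} for every mismatch."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }


# Hypothetical manifest that would live in the repository next to a paper's code.
MANIFEST = """
sage==x.y.z            # placeholder, not a real release number
research_pkg_a==0.2    # made-up optional package
"""

pins = parse_pins(MANIFEST)
# Made-up "installed" environment that has drifted from the manifest.
drift = diff_environment(pins, {"sage": "x.y.z", "research_pkg_a": "0.3"})
```

An empty result from diff_environment would mean the environment matches
the recorded one, which is about as close as this sketch gets to being
able to say "Sage x.y.z plus exactly these packages".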

Still, the package model has advantages.

> - some research code goes beyond just a bunch of python files.
>
> For example Simon's p-group cohomology package:
>
> http://sage.math.washington.edu/home/SimonKing/Cohomology/
>
> It includes MeatAxe, a C-library, and depends on an optional Sage
> package. This provides a challenge to Salvus (or wakari:
> http://continuum.io/blog/introducing-wakari) like models. Unless
> they create a virtual machine for each user to play with, I don't
> see how they can support installing arbitrary code.
>
> Package dependencies for optional/experimental packages are beyond
> the capabilities of the sage packaging system. Adding
> "sage -i <package_name>" commands to spkg-install is a very ugly
> hack.

I think lmonade, or something like that, will have to be part of the
model. I'm certainly assuming something more powerful than spkgs...

> If you've read so far, maybe some self-plugging will be tolerated.
> lmonade (http://www.lmona.de/) is designed to solve these problems,
> with
>
> - a flexible package manager that supports overlays and
>
> - a continuous integration system (patchbot in software engineering
> speak) where people can sign up to get their code tested when the
> packages they depend on are updated.
>
> It still lacks many features due to lack of developer time. I plan to
> change that soon (also encouraged by discussions like this one), but
> any help is much appreciated nevertheless.
>
>
> Cheers,
> Burcin
>

Simon King

Nov 27, 2012, 4:58:56 AM11/27/12
to sage-...@googlegroups.com
Hi David,

On 2012-11-27, David Kirkby <david....@onetel.net> wrote:
> I feel the way to solve this is to have community contributed
> packages, which don't form part of the core of Sage, but can be
> installed by anyone if they wish to. Projects like R, Perl, autoconf
> all have this.

What is wrong with Sage's experimental or optional packages? Or
(analogously) with the packages of GAP, that also come with different
degrees of "official" support/approval.

OK, Sage does not have *that* many experimental/optional packages.

Later in your post, you stated (if I understand you correctly) that it
may be a step forward to *not* just add new stuff to the sage library,
hence, *not* to add it to the default distribution, and instead create
a new spkg for new stuff.

I am not sure whether I agree that it would be a step forward.

On the plus side:

* It would provide more flexibility for the user, when he/she can add
stuff needed for a particular application.
* The packages are (I think) supposed to be largely independent. Hence,
it would force the developer to modularise the code, which may
improve code quality (but IANASE [I am not a software engineer]).

On the minus side:

* If people get the new stuff without the need to install an additional
package, they are more likely to discover and use it. Hence, new stuff
would be more likely to be tested (good for quality).
* Imagine that stuff like integer programming would not be in the sage
library, but only available as an optional package. Then, there will
quite likely be users saying: "Sage does not even have integer
programming", being unaware that all they need to do is install
yet another optional package.
* If code is distributed in many small packages contributed by
individuals, it would mean that the version control would be
fragmented. That's totally opposed to the "single repository"
approach that seems to be part of Sage's switch to git.

Best regards,
Simon


David Kirkby

Nov 27, 2012, 8:11:54 AM11/27/12
to sage-...@googlegroups.com
On 27 November 2012 09:58, Simon King <simon...@uni-jena.de> wrote:
> Hi David,

Hi Simon.

> On 2012-11-27, David Kirkby <david....@onetel.net> wrote:
>> I feel the way to solve this is to have community contributed
>> packages, which don't form part of the core of Sage, but can be
>> installed by anyone if they wish to. Projects like R, Perl, autoconf
>> all have this.
>
> What is wrong with Sage's experimental or optional packages? Or
> (analogously) with the packages of GAP, that also come with different
> degrees of "official" support/approval.
>
> OK, Sage does not have *that* many experimental/optional packages.

There's no real search facility like I indicated with Perl. It's just
a list in alphabetical order.

At the moment, if you are a developer and want to add a bit of code to
the sage library to do X or Y, it seems like pretty much anything can
be added, as long as it is reviewed properly.

> Later in your post, you stated (if I understand you correctly) that it
> may be a step forward to *not* just add new stuff to the sage library,
> hence, *not* to add it to the default distribution, and instead create
> a new spkg for new stuff.

I believe there is an argument that if something is not going to be
used by many, it should not be in the default distribution. There are
26015 packages available for Python

http://pypi.python.org/pypi

which one can install. Do you think Python would be better if all
those packages were built in? Personally I don't think so.

> * Imagine that stuff like integer programming would not be in the sage
> library, but only available as an optional package. Then, there will
> quite likely be users saying: "Sage does not even have integer
> programming", being unaware that all they need to do is install
> yet another optional package.

Obviously you need to have a core set of code which is needed to make
a package useful. As a non-mathematician, I have no idea how useful
integer programming is, or how many people use it. But if it is useful
to a wide range of people, then include it in the Sage library.

R, Perl, Python, MATLAB all have an extensive range of external
packages which the developers don't include in the core of the
program.

Sage is pretty unique in basically bundling everything together. I do
understand some of the reasons for that design choice, but it
basically means none of the Linux distributions will include a Sage
package.



Dave

William Stein

Nov 27, 2012, 8:32:54 AM11/27/12
to sage-...@googlegroups.com
On Mon, Nov 26, 2012 at 11:48 PM, Burcin Erocal <bur...@erocal.org> wrote:
> On Tue, 27 Nov 2012 01:53:31 +0800
> P Purkayastha <ppu...@gmail.com> wrote:
>
> <snip Robert's post>
>> I think the Sage community could quickly expand and there could be
>> tens, if not hundreds, of git development branches once the switch to
>> git occurs. It would be quite hard to keep track of all the different
>> branches and the individual modifications that people have in their
>> forks. I looked up scipy right now, and that itself has over 500
>> watchers and 200 forks. The situation is the same for matplotlib, and
>> almost the same for mathjax. It would be nice to see how those
>> communities cope with such a huge number of forks and development
>> branches.
>
> Note that most of the research code we are talking about is either a
> single .sage file or a bunch of .py files in a directory, totaling
> at most 1500 - 2000 lines of code. Here is a good example:
>
> http://math.bu.edu/people/rpollack/OMS/OMS.zip
>
> from http://math.bu.edu/people/rpollack/

Or

https://github.com/haikona/OMS

for what that very code has evolved into... in the process of trying
to get it into Sage (via workshops, student projects, etc.)

>
>
> I cannot imagine research mathematicians wrangling forks of the Sage
> library (IIRC, ~ 500k lines of code) just to get a small piece working.
> In most cases, these forks will contain old versions, untouched since
> the paper was published, so even merging with the latest Sage release
> will be nontrivial. Especially if the research code required changes in
> some core Sage library class (add a function to number fields say), only
> a few people really familiar with Sage (and the DVCS system in use) can
> handle the merge.
>
> This is all to say that git forks are not going to solve the problem
> Robert brought up.

I don't follow you. For example, you say "only a few people can
handle the merge". If I fork the sage repo (on github say) today and
do nothing to it, then try to merge with sage in 6 months, the merge
would be automatic/trivial (as long as the history of sage moves
forward). If I make a few changes, merging will only involve those
changes.

Of course, as you say below, I agree that making separate Python
packages for individual projects is a reasonable way to go. This is
just the analogue of R packages, or npm modules (in node), etc., or
even PyPI (http://pypi.python.org/pypi). I wonder if people could
in fact post Python modules on PyPI that depend on the Sage library?

>> What I describe below is one way I think we could have access to the
>> many individual patches and "alpha-quality" code people might have.
>
> Here is another way, which is not at all new. :) Do what William does
> with psage:
>
> If your code exceeds the "single .sage/.py file" threshold, it is
> fairly easy to create a Python package out of it. With the myriad of
> Python packaging solutions (easy_install, pip, etc.), installing such a
> package given a URL is also trivial. So just publish the URL on your
> home page, announce it to your colleagues and you're good to go.

+1
I will (at some point) for Salvus.

> Package dependencies for optional/experimental packages are beyond
> the capabilities of the sage packaging system. Adding
> "sage -i <package_name>" commands to spkg-install is a very ugly
> hack.

What's the situation with PyPI or any other Python packaging
solutions? Surely they provide dependencies?

>
>
> If you've read so far, maybe some self-plugging will be tolerated.
> lmonade (http://www.lmona.de/) is designed to solve these problems,
> with
>
> - a flexible package manager that supports overlays and
>
> - a continuous integration system (patchbot in software engineering
> speak) where people can sign up to get their code tested when the
> packages they depend on are updated.
>
> It still lacks many features due to lack of developer time. I plan to
> change that soon (also encouraged by discussions like this one), but
> any help is much appreciated nevertheless.
>
>
> Cheers,
> Burcin
>



--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

kcrisman

Nov 27, 2012, 10:12:09 AM11/27/12
to sage-...@googlegroups.com


On Tuesday, November 27, 2012 4:59:14 AM UTC-5, Simon King wrote:
> Hi David,
>
> On 2012-11-27, David Kirkby <david....@onetel.net> wrote:
>> I feel the way to solve this is to have community contributed
>> packages, which don't form part of the core of Sage, but can be
>> installed by anyone if they wish to. Projects like R, Perl, autoconf
>> all have this.
>
> What is wrong with Sage's experimental or optional packages? Or
> (analogously) with the packages of GAP, that also come with different
> degrees of "official" support/approval.


The difference is that most R packages are continually tested and are not accepted on CRAN without meeting fairly stringent requirements that they work with R (and they always have a minimum version that they work with, with an intelligible error message if not).  There are other R repositories that are less stringent, and the R community knows what stuff is what.  The Sage community is probably a couple of orders of magnitude smaller, and has far fewer automated build resources, so we are nowhere near there.  As I pointed out somewhere, we don't even have an automated test of each release with all optional packages on the various platforms (I think that Jeroen and William occasionally do this by hand, though).

But in general having a well-designed way to include research code as optional packages that could be injected into a contrib/ folder or something similar, as a lot of people have suggested, could work.  Of course, Maxima basically includes all its contrib code, and sometimes it is not as easy to figure out as the ordinary stuff either, but I doubt there is a perfect point here.  How hard would it be to "import contrib.foo as modular.foo" without causing problems?  Because of course one wouldn't want one's research stuff to have to live in a different namespace if it was meant for Sage proper.
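
For what it's worth, "import contrib.foo as modular.foo" is not valid Python as written (the name after "as" cannot be dotted), but a similar effect can be had by registering the module under a second name in sys.modules. The following is only a sketch under stated assumptions: contrib, contrib.foo and modular are in-memory stand-ins built with types.ModuleType, and install_alias is a hypothetical helper, not something Sage provides.

```python
import sys
import types


def install_alias(real_name, alias_name):
    """Make `import alias_name` return the module already loaded as real_name."""
    module = sys.modules[real_name]
    existing = sys.modules.get(alias_name)
    if existing is not None and existing is not module:
        # Refuse to silently shadow something that is already there.
        raise ImportError("%s is already taken by another module" % alias_name)
    sys.modules[alias_name] = module
    # Also attach it to the parent package so attribute access works.
    parent_name, _, child = alias_name.rpartition(".")
    if parent_name in sys.modules:
        setattr(sys.modules[parent_name], child, module)
    return module


# In-memory stand-ins for a contrib package and a "proper" package.
contrib = types.ModuleType("contrib")
contrib_foo = types.ModuleType("contrib.foo")
contrib_foo.answer = lambda: 42
contrib.foo = contrib_foo
modular = types.ModuleType("modular")
modular.__path__ = []  # mark the stand-in as a package
sys.modules.update(
    {"contrib": contrib, "contrib.foo": contrib_foo, "modular": modular}
)

install_alias("contrib.foo", "modular.foo")

import modular.foo  # resolved from sys.modules, no files involved
```

After this, modular.foo and contrib.foo are the same module object. Whether that counts as "without causing problems" is another matter: pickling, reloading and error messages would still show the original name, so this is a demonstration of the mechanism rather than a recommendation.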

Keshav Kini

Nov 27, 2012, 3:00:42 PM11/27/12
to sage-...@googlegroups.com
William Stein <wst...@gmail.com> writes:
> On Mon, Nov 26, 2012 at 11:48 PM, Burcin Erocal <bur...@erocal.org> wrote:
>> Package dependencies for optional/experimental packages are beyond
>> the capabilities of the sage packaging system. Adding
>> "sage -i <package_name>" commands to spkg-install is a very ugly
>> hack.
>
> What's the situation with PyPI or any other Python packaging
> solutions? Surely they provide dependencies?

Of course, but not all our SPKGs are on PyPI or even have anything to do
with Python at all. In any case, Burcin's solution of using Gentoo
Prefix is appealing to me because it uses an installation system with
managed backwards compatibility (EAPI versioning) which has been used by
very large numbers of people for more than a decade now, whereas the
various Python package management solutions seem to keep getting
deprecated and replaced every couple of years... I think you've run into
just such problems, as evidenced by a post on your Google+ account back
in July :)

-Keshav

Keshav Kini

Nov 27, 2012, 3:06:33 PM11/27/12
to sage-...@googlegroups.com
Robert Bradshaw <robe...@math.washington.edu> writes:
> Another crazy idea: encourage uploading of unfinished code with
> optional nag reminders. I'd rather people upload code without tests
> and people start reviewing it than hold onto it until they "get around
> to" adding all the examples. Just having public "todo" lists can be
> motivation to finish something up, or bug someone about finishing
> something up.

+1, I don't think this is a crazy idea at all. In fact I think it's a
very good idea.

-Keshav

Keshav Kini

Nov 27, 2012, 3:04:54 PM11/27/12
to sage-...@googlegroups.com
Florent wrote this back in June:
https://github.com/kini/sage-workflow/blob/master/combinat.rst

It contains a bit about the Combinat team's goals.

-Keshav

David Kirkby

Nov 28, 2012, 10:32:31 PM11/28/12
to sage-...@googlegroups.com
On 27 November 2012 15:12, kcrisman <kcri...@gmail.com> wrote:
> How hard would it be to "import contrib.foo as modular.foo"
> without causing problems? Because of course one wouldn't want one's
> research stuff to have to live in a different namespace if it was for Sage
> proper stuff.

This reminds me of a situation that existed between National
Instruments and Sun.

I have a National Instruments GPIB board, which worked fine in Solaris
9. When I upgraded to Solaris 10, it would not work. Eventually the
problem was found by a Sun employee. National Instruments had for many
years had a Solaris driver for the GPIB board, which they called "ib".

Years later, Sun introduced Infiniband support in Solaris 10, and so
added a driver for Infiniband . No prizes for guessing what Sun called
the driver - yes "ib".

Needless to say, the National Instruments card would not work in
Solaris 10. A solution which worked for me was to remove the
Infiniband support from Solaris as I had no use for it. Then the
National Instruments driver "ib" would work properly.

Ideally, one would want a solution which stopped this sort of thing
happening in Sage, but with user contributed code, it might be quite
difficult to prevent.
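
One partial safeguard against the "ib" kind of clash, sketched here
under the assumption that contributed code is registered by top-level
module name: check the name before accepting it. name_is_taken and
claim_name are hypothetical helpers and the module names are made up;
only sys.modules and importlib.util.find_spec are real Python.

```python
import importlib.util
import sys


def name_is_taken(top_level_name):
    """True if the name is already loaded or can be found on the import path."""
    if top_level_name in sys.modules:
        return True
    return importlib.util.find_spec(top_level_name) is not None


def claim_name(top_level_name, claimed):
    """Record a contributed module name, refusing anything that would clash."""
    if top_level_name in claimed or name_is_taken(top_level_name):
        raise ValueError("module name %r is already in use" % top_level_name)
    claimed.add(top_level_name)
```

This only catches clashes with what is visible on the machine doing the
check, so two contributors choosing the same name independently would
still need a shared list of claimed names.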

Dave