When I started Sage I viewed it as a distribution of a bunch of math
software, and Python as just the interpreter language I happened to use
at the time. I didn't even know if using Python as the language would
last. However, it's also possible to think of Sage as a Python
library.
Anyway, it has occurred to me (a few times, and again recently) that
it would be possible to make much of the Sage distribution, without
Python of course, into a Python library. What I mean is the
following. You would have a big Python library called "sagemath",
say, and inside of that would be a huge HG repository. In that
repository, one would check in the source code for many of the
standard Sage spkg's... e.g., GAP, Pari, etc. When you type
python setup.py install
then GAP, Pari, etc., would all get built, controlled by some Python
scripts, then installed as package_data in the sagemath directory of
<your python>/site-packages/.
From a technical perspective, I don't see any reason why this couldn't
be made to work. HG can handle this much data, and "python setup.py
install" can do anything. It does lead to a very different way of
looking at Sage though, and it could help untangle things in
interesting ways.
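
A minimal sketch of what "installed as package_data" could mean at run time: the library looks up its bundled executables relative to its own package directory. The `local/bin` layout and the helper name `bundled_tool` are assumptions made for illustration; nothing in this thread fixes either.

```python
from pathlib import Path


def bundled_tool(package_dir, name):
    """Return the path of a bundled executable, or None if it is absent.

    package_dir is the installed package directory, e.g. the directory
    that holds sagemath/__init__.py inside site-packages.  The local/bin
    layout is a hypothetical choice made for this sketch.
    """
    candidate = Path(package_dir) / "local" / "bin" / name
    return candidate if candidate.is_file() else None
```

Inside the package itself one would typically pass `Path(__file__).parent` as `package_dir`.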
(1) Have a Python library called "sagecore", which is just the most
important standard spkg's (e.g., Singular, PARI, etc.), perhaps
eventually built *only* as shared object libraries (no standalone
interpreters).
(2) Have a Python library which is the current Sage library (we
already have this), and which can be installed assuming sagecore is
installed.
(3) Have other Python libraries (like psage:
http://code.google.com/p/purplesage/source/browse/), which depend on
(2). Maybe a lot of the "sage-combinat" code could also be moved to
such a library, so they can escape the "combinat patch queue" madness.
Maybe many other research groups in algebraic topology, differential
geometry, special functions, etc., will start developing such
libraries... on their own, and share them with the community (but
without having to deal directly with the sage project until they want
to).
To emphasize (3), when people want to write a lot of mathematics code
in some area, e.g., differential geometry, they would just make a new
library that depends on Sage (the library in (2)). We do the work
needed to make it easy for people to write code outside of the Sage
library, which depends on Sage. Especially writing Cython code like
this can be difficult and confusing, and we don't explain it all in
any Sage documentation. It actually took me quite a while to figure
out how to do it today (with psage).
The core Sage library (2) above would continue to have a higher and
higher level of code review, tough referee process etc. However, the
development models for (3) would be up to the authors of those
libraries.
The above is already how the ecosystems of Python
(http://pypi.python.org/pypi), Perl (http://www.cpan.org/), R, etc.,
work. Fortunately, Python already has reasonably good support for
this.
I think without a shift in this direction, Sage is going to be very
frustrating for people writing research oriented code.
Fortunately, it's possible to do everything I'm describing above
without disturbing the mainline Sage project itself, at least for
now.
-- William
--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org
First impression:
This may be easier on developers, but this could be a sturdy nail in
the coffin of "Sage as a viable replacement to M*". Fragmenting Sage
even further is going to make it harder to install, and harder to
ensure any standards of
* quality
* performance
* documentation
* portability (give up now on Windows, if you haven't already)
On second thought:
Matlab seems to make good money selling external packages, so
apparently, people are willing to go through some extra effort to get
the parts they need. If it was dead easy to install extra components
from within the notebook, we could seriously slim down the required
feature set. That way, we could *significantly improve the chances*
of a successful Windows port. Most of us here would continue
installing everything and the kitchen sink, but most users won't.
So I guess I'm for it. I'm a little worried that this is going to
fragment the developer base. But on the other hand, sage-combinat is
a pretty awesome community, and in my opinion, a wild success.
> --
> To post to this group, send an email to sage-...@googlegroups.com
> To unsubscribe from this group, send an email to sage-devel+...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/sage-devel
> URL: http://www.sagemath.org
>
Without commenting on your entire proposal, let me say that the idea of
having all of Sage inside *one* Mercurial repo sounds very nice. When
someone says "it worked in version 5.0!" I'd love to just do "hg up
5.0" and rebuild what has changed.
That would make bisecting much more powerful, since right now "hg
bisect" isn't very helpful with anything that involves different
versions of a spkg. I know that the "rebuild what has changed" part
presents a lot of difficulty, but the one-repo-to-rule-them-all idea is
very nice.
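
For readers unfamiliar with bisecting, here is a rough sketch of the binary search behind it. This is not Mercurial's implementation; `first_bad` and `is_bad` are hypothetical names. With a single repository, `is_bad` could update to a revision, rebuild whatever changed (including a changed spkg), and run the failing test.

```python
def first_bad(revisions, is_bad):
    """Return the first revision for which is_bad(rev) is true.

    Assumes revisions are ordered oldest to newest, the first one is good,
    the last one is bad, and once a revision is bad all later ones are too.
    """
    lo, hi = 0, len(revisions) - 1  # revisions[lo] is good, revisions[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]
```

The number of rebuilds grows only logarithmically with the number of revisions, which is why an expensive "rebuild what has changed" step can still be tolerable.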
Dan
--
--- Dan Drake
----- http://mathsci.kaist.ac.kr/~drake
-------
What about those people who install Sage because it's an easy way to get
a running version of Gap, Pari, etc? I think this is one nice selling
point of Sage right now: just last week, I was talking to a friend who
did a lot of his thesis work with Gap and (IIRC) Macaulay2. It's easy to
get someone using Sage if you tell them, "you can use those programs via
gap_console() and so on, but with a little bit of extra work, you can
work from Sage and have everything work seamlessly together."
Having the standalone interpreter available, as well as a shared
library or pexpect interface, makes it easy for people to take baby
steps while switching to Sage, which I think is very attractive for busy
mathematicians who worry about the sunk cost of their knowledge of Gap,
Pari, and so on.
That's a good point. I'm also dubious that, for most of these
programs, explicitly not building the command line interpreter would
be a significant savings. (The pexpect interfaces on the other
hand...)
- Robert
Putting all dependencies into our own repository rubs me the wrong way
from an aesthetic point of view, but the current divide between spkg
and non-spkg work is annoying, especially for ones that have close
ties and dependencies with the sage.* library. A technical hurdle
would be that the revision history would be huge, if we're saving
every version of every package.
A middle ground might be to have everything but the sources themselves
checked in (i.e. all spkgs under the same repo). Perhaps even optional
ones. The development model would be to work directly on the sources;
we could have hooks to just save "patch files" in the right places
(and a special kind of commit to sync to new sources). This, of
course, is moving in the opposite direction from making things more
modular.
Another idea would be to have the exact versioned dependencies (spkgs)
under the main revision control. Syncing to a revision and re-building
may require re-building (downloading?) a dependency. (On this note we
could explore whether we need to ship all these dependencies ourselves, but
being able to specify, and tweak, exact versions I think has been
essential to getting such a large number of packages all working
together on such a large number of platforms.) An overlay like psage
could then add/change the dependencies as well.
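
A small sketch of the "exact versioned dependencies" idea, under the assumption that the pins are kept as a plain name-to-version mapping in the main repository. The function names and all version strings below are made up for illustration.

```python
def packages_to_rebuild(pinned, installed):
    """Names whose pinned version differs from what is installed."""
    return sorted(name for name, version in pinned.items()
                  if installed.get(name) != version)


def apply_overlay(base, overlay):
    """Return the base pins with an overlay's additions and changes applied."""
    merged = dict(base)
    merged.update(overlay)
    return merged


# Hypothetical pins at some revision, and what is currently built.
pinned = {"pari": "2.4.3", "ntl": "5.4.2"}
installed = {"pari": "2.4.3", "ntl": "5.4.1"}
stale = packages_to_rebuild(pinned, installed)
```

After syncing to a revision, only the names in `stale` would need to be rebuilt (or downloaded). An overlay would pass `apply_overlay(pinned, its_own_pins)` to the same check.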
> (1) Have a Python library called "sagecore", which is just the most
> important standard spkg's (e.g., Singular, PARI, etc.), perhaps
> eventually built *only* as shared object libraries (no standalone
> interpreters).
>
> (2) Have a Python library which is the current Sage library (we
> already have this), and which can be installed assuming sagecore is
> installed.
I'm not sure that splitting things up into sagecore/sagelibrary would
be a significant advantage over a-bunch-of-spkgs/sagelibrary, because
it doesn't erase the line that causes the most trouble.
> (3) Have other Python libraries (like psage:
> http://code.google.com/p/purplesage/source/browse/), which depend on
> (2). Maybe a lot of the "sage-combinat" code could also be moved to
> such a library, so they can escape the "combinat patch queue" madness.
> Maybe many other research groups in algebraic topology, differential
> geometry, special functions, etc., will start developing such
> libraries... on their own, and share them with the community (but
> without having to deal directly with the sage project until they want
> to).
The advantage of the sage-combinat queue is that it provides a more
natural migration of stuff into the core. One concern I have with (3)
is that if several libraries go monkey-patching the core (e.g.
replacing the number field implementation) they could become mutually
incompatible very quickly. It also discourages building common
infrastructure, e.g. exact linear algebra (unless that itself became a
library).
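
To make the monkey-patching concern concrete, here is a toy sketch. The `core` object and the two "libraries" are placeholders and do not correspond to any real Sage module.

```python
import types

# Stand-in for a core module; "NumberField" is just a placeholder attribute.
core = types.SimpleNamespace(NumberField=lambda: "core implementation")


def load_library_a(core):
    # Library A replaces the core implementation with its own.
    core.NumberField = lambda: "implementation from A"


def load_library_b(core):
    # Library B does the same, unaware of A.
    core.NumberField = lambda: "implementation from B"


load_library_a(core)
load_library_b(core)

# Whichever library was loaded last wins, so A's code now silently runs
# against B's replacement.
result = core.NumberField()
```

The outcome depends only on import order, which is what makes this kind of incompatibility hard to notice.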
> To emphasize (3), when people want to write a lot of mathematics code
> in some area, e.g., differential geometry, they would just make a new
> library that depends on Sage (the library in (2)). We do the work
> needed to make it easy for people to write code outside of the Sage
> library, which depends on Sage. Especially writing Cython code like
> this can be difficult and confusing, and we don't explain it all in
> any Sage documentation. It actually took me quite a while to figure
> out how to do it today (with psage).
>
> The core Sage library (2) above would continue to have a higher and
> higher level of code review, tough referee process etc. However, the
> development models for (3) would be up to the authors of those
> libraries.
>
> The above is already how the ecosystem with Python
> (http://pypi.python.org/pypi), Perl (http://www.cpan.org/), R, etc.,
> work. Fortunately, Python has reasonably good support already for
> this.
>
> I think without a shift in this direction, Sage is going to be very
> frustrating for people writing research oriented code.
>
> Fortunately, it's possible to do everything I'm describing above
> without disturbing the mainline Sage project itself, at least for
> now.
To summarize, are you thinking of (3) as optional spkgs supported by
research communities? (The difference being the non-technical
distinction between a sage dependency vs. an additional library.) I'm
not seeing how (1) would help in this goal, though it's an interesting
topic to broach. As Tom mentions, if we made more packages optional,
but really easy to install, that could make porting efforts easier
too.
In many ways, Sage as a platform on which to build research code,
rather than Sage incorporating all research code, fits with the
current bureaucratic model. The shipping Sage could satisfy many
non-researchers, and if it's easy to add the research code in, those
"in the know" could be cutting edge.
One concern I have is that one thing that has made Sage as good as it
is is everyone going the extra mile to make things a bit nicer than if
they had to just code it up for themselves, or helping out on parts
that aren't directly related to their research. This takes time, but
helps keep things going smoothly. If everyone fragments into their own
communities with lower expectations, this could be bad for the
infrastructure they're trying to build on. It would also be a shame if
stuff doesn't ever get polished enough to make it "upstream" into
standard Sage. (Lots of people using the code could make the
refereeing process smoother, and it's easier to improve code as part
of a project than improve a patch on trac, but the motivation goes way
down...)
- Robert
One comment I have. I feel the approach taken by projects like Perl,
Mathematica, MATLAB, R, etc., is good. With these projects, there is a core system
containing functionality useful to a large group.
When people want specific code to do work for their research interests, they
develop that themselves, and it's made available for others to use if they want to.
For perl there is
http://www.cpan.org/
For Mathematica there is
http://library.wolfram.com/
For MATLAB there is File exchange
http://www.mathworks.com/matlabcentral/fileexchange/
For R there is
http://cran.r-project.org/web/packages/
Well set up, there should be no conflicts in using more than one package. In the
case of Mathematica, all commands should be in lower case, as that can never
cause a problem with Mathematica, where all commands start in upper case.
It seems to me that there are too many people adding what is quite obscure maths
into Sage, which probably has no users other than themselves. I'm not a
mathematician, but I've heard this said by Sage developers who are
mathematicians.
Of course Sage does have optional and experimental packages, but these are
actually quite small in number. For the Sage library, it seems that virtually
anything can get merged if it has some use to someone.
With all due respect to the authors, should a program to solve Rubik's cubes be
part of the core Sage, or an optional component? I would postulate Wolfram
Research would never integrate such functionality into Mathematica, but would
probably post a package into their user-contributed library.
Dave
I'm not talking about that... yet. But even if I were, for an end
sage user, there's nothing stopping us from importing everything into
sage exactly as now.
People seem to have a big concern that sage-as-it-is-now would cease
to exist were we to do what I'm describing. This is a misplaced fear.
In fact, it would just change how Sage itself is put together, and
make the work that people put into Sage available to a vastly wider
community of users.
-- William
I actually started doing something, which is at
http://sage.math.washington.edu/home/wstein/sagecore/
All I did was make a list of all the C/C++ libraries that come with
Sage and which the core Sage library depends on due to Cython code
that uses these libraries. These are also libraries that aren't "bog
standard". Here is the list:
cliquer
eclib
ecm
flint
givaro
gsl
iml
lcalc
libfplll
libm4ri
linbox
mpfi
mpfr
mpir
ntl
pari
polybori
pynac
ratpoints
singular
symmetrica
zn_poly
For each in the list, I copied (via a script I wrote), the spkg over
to a directory "sagecore/packages", extracted the package, got rid of
the explicit version number from it (it's in the SPKG.txt file
anyways), got rid of all the .hg repos in there, and the result is at
http://sage.math.washington.edu/home/wstein/sagecore/sagecore.tar.bz2
You can browse (for a limited time) the hg repo at
http://sage.math.washington.edu:8010/file/d569a5041000/packages
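
The script itself is not shown in the thread. As a rough sketch of just the "get rid of the explicit version number" step, assuming spkg directory names look like `<package>-<version>` with the version starting with a digit (the helper name and the example versions in the note below are made up):

```python
import re


def strip_version(spkg_name):
    """Drop the version suffix from an spkg-style name.

    Assumes names look like <package>-<version>, where the version starts
    with a digit; names without such a suffix are returned unchanged.
    """
    match = re.match(r"^(.*?)-\d", spkg_name)
    return match.group(1) if match else spkg_name
```

For example, `strip_version("boost-cropped-1.34.1")` keeps the hyphen inside the package name and only cuts at the hyphen that is followed by a digit.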
The next step is to write a script that goes through and builds all
the above packages into a directory.
Then make it so that gets installed into a standard Python package
data directory, and then build the Sage library on top of this.
William
> At the moment just keeping apace with the latter
> components is taking a large chunk of developers time.
This is very, very true, and your point about cvxopt later on nicely
emphasizes this.
>
> To begin with, I imagine one can look into ways PyPI manages
> dependencies, etc.
> I understand there is a mechanism that allows for pinning of
> particular versions, etc...
Yep, and also easily packaging up certain versions of all dependencies
with a package (sagenb uses this, actually).
PyPI rocks.
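
As an illustration of pinning, exact versions are commonly written as `name==version` lines in a requirements file. The sketch below parses only that simplest form; the function name is hypothetical, and it is not a replacement for the parser in pip or setuptools.

```python
def parse_pins(text):
    """Parse lines of the form name==version into a dict.

    Blank lines and '#' comments are skipped; anything that is not an
    exact '==' pin raises ValueError.  This handles only the simplest
    requirement syntax, not the full format that pip accepts.
    """
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep or not name.strip() or not version.strip():
            raise ValueError("not an exact pin: %r" % raw)
        pins[name.strip()] = version.strip()
    return pins
```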
William
Are you also considering to move away from the spkg based buildsystem,
or do you think it will stay?
Ondrej
I am merely suggesting a technical experiment. I'm unsure what the
implications are, and I don't think there is any easy way to know
until one just does it.
William
On Tue, Oct 26, 2010 at 06:44:21PM -0700, William Stein wrote:
> When I started Sage I viewed it as a distribution of a bunch of math
> software, and Python as just the interpreter language I happen to use
> at the time. I didn't even know if using Python as the language would
> last. However, it's also possible to think of Sage as a Python
> library.
>
> Anyway, it has occurred to me (a few times, and again recently) that
> it would be possible to make much of the Sage distribution, without
> Python of course, into a Python library. What I mean is the
> following. You would have a big Python library called "sagemath",
> say, and inside of that would be a huge HG repository. In that
> repository, one would check in the source code for many of the
> standard Sage spkg's... e.g., GAP, Pari, etc. When you type
>
> python setup.py install
>
> then GAP, Pari, etc., would all get built, controlled by some Python
> scripts, then installed as package_data in the sagemath directory of
> <your python>/site-packages/.
A big +1 for being able to use Sage in both ways (the current one, and
as a Python library). First from an aesthetic point of view (Sage
would be a library just like others), but more importantly from a
practical point of view. I have tried several times to use Sage
together with other tools (typically user interfaces like Spyder). Up
to now this has each time been a pain, for the only working solution I
found was to install all those tools and their dependencies within
Sage (including things like Qt, ...).
Cheers,
Nicolas
--
Nicolas M. Thiéry "Isil" <nth...@users.sf.net>
http://Nicolas.Thiery.name/
(I am CC'ing Sage-Combinat, which may be interested in this thread on
sage-devel)
On Tue, Oct 26, 2010 at 06:44:21PM -0700, William Stein wrote:
> (3) Have other Python libraries (like psage:
> http://code.google.com/p/purplesage/source/browse/), which depend on
> (2). Maybe a lot of the "sage-combinat" code could also be moved to
> such a library, so they can escape the "combinat patch queue" madness.
> Maybe many other research groups in algebraic topology, differential
> geometry, special functions, etc., will start developing such
> libraries... on their own, and share them with the community (but
> without having to deal directly with the sage project until they want
> to).
I agree that we are totally abusing what patch queues are designed
for; in fact, I am quite amazed that our patch queue remains quite
robust even with 25+ contributors and 200+ patches: we have no more
than one conflict every other week.
Combinatorics naturally interacts (and thrives on this interaction)
with many other parts of mathematics [1], and that translates into
strong code interactions (e.g. code from algebra using combinatorics,
which in turn uses algebra). My usual rant is that a typical research
project for us involves touching:
- 95% of general purpose stuff (groups, polynomials, categories,
basic combinatorics, linear algebra, ...)
- 5% of project specific code.
The main role of Sage-Combinat is to promote this kind of interaction,
as well as early and tight integration (see Robert's comments in this
thread which I totally second). We also want to keep a broad view and
refactor at large scale (hence our need for things like categories).
Right now, splitting up things into libraries with a clean acyclic
graph of dependencies would not only be hard, but would defeat those
purposes.
I agree that this goes against the principle of modularity, and that
some quite specific code out of the 5% may end up in Sage's library
where it does not really belong. But if this is the price to pay to
make sure that the generic stuff out of the 95% goes in, so be it.
Part of it is that we are still at a very early stage. It is my hope
that clearly defined subfields within Sage-Combinat will progressively
emerge once the dust of the common infrastructure settles (that's
actually already the case, and probably explains why the Sage-Combinat
queue does not explode, whereas it was on the verge of doing so last
year, with half as many patches). In that future, the
libraries approach could prove useful for the 5% of project/theme
specific code.
Cheers,
Nicolas
[1] other fields of mathematics certainly do too, but I don't feel
qualified to speak about them
This post is about:
(1) Concern about distutils/setuptools/etc. is misplaced.
(2) Python3 and librarifying Sage.
First, all this discussion about distutils/setuptools/David
Cournapeau, etc., is actually mostly IRRELEVANT to making the core
Sage library into a standalone library. The way it would work is
this:
1. You type "python setup.py develop" (or possibly "python setup.py install").
2. A function in setup.py builds all the non-standard C/C++ libraries
that the core Sage library depends on, which is the following 24
libraries:
boost-cropped  givaro    libm4ri  mpir      ratpoints
cliquer        gsl       libpng   ntl
eclib          iml       linbox   pari      singular
ecm            lcalc     mpfi     polybori  symmetrica
flint          libfplll  mpfr     pynac     zn_poly
This function in setup.py is a Python function, and it can do
*anything* it wants. distutils/setuptools/etc. are irrelevant!! In
fact, this can just be a very simple version of the current Sage build
system, and we can just include the 24 Sage packages corresponding to
the above-listed 24 libraries basically as is. Just for fun, I tried
this and wrote a sample setup.py sort of illustrating what I mean. (I
ran it, and it works, but you can't run it yet, since of course it
needs the source files; I'll post more later.) When I did this, by the
way, and deleted the .a files, leaving just the shared libraries, it
only took about 25MB compressed -- pretty interesting.
3. After the C/C++ libraries have all been built, then the regular
Sage library gets built, using some slight variation of the current
build scripts.
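
Step 2 above could be sketched roughly as follows. This is not the sample setup.py mentioned in this message (which is not shown here); `build_native` is a hypothetical helper, and the uniform configure/make build is a simplifying assumption, since the real packages each have their own build scripts.

```python
import subprocess
from pathlib import Path


def build_native(packages, src_root, prefix, run=subprocess.check_call):
    """Build and install each package under prefix, in the given order.

    Every package is assumed (for this sketch only) to live in
    src_root/<name> and to use a configure/make build; a real driver
    would dispatch to each package's own build script instead.
    """
    for name in packages:
        cwd = str(Path(src_root) / name)
        for cmd in (["./configure", "--prefix=%s" % prefix],
                    ["make"],
                    ["make", "install"]):
            # check_call raises CalledProcessError if a step fails,
            # which stops the whole build at the first broken package.
            run(cmd, cwd=cwd)
```

A setup.py could call such a function before handing over to the normal build of the Python/Cython sources. The `run` parameter exists only so the command sequence can be inspected without actually compiling anything.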
---
Anyway, since this thread sort of ended with some major misconceptions
that the setuptools weirdness was a serious issue, I wanted to correct
this misconception.
Another point I think is interesting is that the Sage library itself
seriously depends on the above 24 C/C++ libraries, which have little
or nothing to do with Python2 versus Python3, plus a very small number
of Python libraries: numpy, matplotlib, networkx. Sage uses scipy,
cvxopt, etc., a tiny, tiny bit, but nothing serious. Even matplotlib
is *only* used to draw pictures. Thus if we wanted a Python3 version
of the Sage library itself, if we had a library like I describe above,
this would only require a Python3 version of numpy and networkx, plus
the work of porting the Sage library itself. This doesn't sound so
far off, since there already is a Python3 version of numpy.
-- William
I don't quite agree with this interpretation. Even if
setuptools/distutils were "perfect" they would not be the right tool
for building those 24 libraries. The right tool is a Python script
that calls the native build system on each of those 24 libraries
(e.g., autoconf, perl, etc.).
>
>
>> [...]
>> 2. A function in setup.py builds all the non-standard C/C++ libraries
>> that the core Sage library depends on, which is the following 24
>> libraries:
>>
>> boost-cropped  givaro    libm4ri  mpir      ratpoints
>> cliquer        gsl       libpng   ntl
>> eclib          iml       linbox   pari      singular
>> ecm            lcalc     mpfi     polybori  symmetrica
>> flint          libfplll  mpfr     pynac     zn_poly
>> [...]
>
> I'd prefer having plain text files rather than a pickled build_db.
Thank you for looking at the code.
It is just a pickled pure python dictionary, which is flexible and
easy to work with.
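
To illustrate the trade-off being discussed, here is the same dictionary stored both ways. The contents of `build_db` below are invented for the example; the real structure is not shown in this thread.

```python
import json
import pickle

# Hypothetical contents; the real build_db keys are not shown in the thread.
build_db = {"mpir": {"built": True}, "ntl": {"built": False}}

as_pickle = pickle.dumps(build_db)                        # opaque bytes
as_text = json.dumps(build_db, indent=2, sort_keys=True)  # readable, diffable

# Both formats give back the same dictionary.
assert pickle.loads(as_pickle) == json.loads(as_text) == build_db
```

The round trip through JSON only works this cleanly while the keys are strings and the values are plain lists, dicts, numbers, strings, and booleans; pickle accepts arbitrary Python objects but cannot be read or diffed by eye.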
> Adding (formal) dependency specifications to the spkgs (a file, say,
> spkg-deps in each spkg) would be a step forward, too, such that we can
> also *generate* the "real Makefile" spkg/standard/deps in the
> traditional build process.
There's no deps file in what I'm doing. Also, among these 24
libraries the dependencies are nearly trivial. Basically, many depend
on MPIR, some on NTL, and beyond that there is nothing (?).
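
With dependencies this shallow, a build order can be derived from a small mapping instead of a deps file. The sketch below uses the standard-library `graphlib` module (Python 3.9+); the edges shown are illustrative only and are not a complete or authoritative dependency list for these packages.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Illustrative edges only: package -> packages it needs built first.
deps = {
    "mpir": [],
    "ntl": ["mpir"],
    "mpfr": ["mpir"],
    "mpfi": ["mpfr", "mpir"],
}

# static_order() yields every package after all of its dependencies.
build_order = list(TopologicalSorter(deps).static_order())
```

If the mapping ever contained a cycle, `static_order()` would raise `graphlib.CycleError` rather than produce a bogus order.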
> (In addition, a lot of what's currently performed in every spkg-
> install could be factored out.)
True, but orthogonal.
>
>
> -Leif
>
>> [...] Thus if we wanted a Python3 version
>> of the Sage library itself, if we had a library like I describe above,
>> this would only require a Python3 version of numpy and networkx, plus
>> the work of porting the Sage library itself. This doesn't sound so
>> far off, since there already is a Python3 version of numpy.
>
I again think that your above remarks are completely orthogonal to
what I'm proposing.
It may be that maintaining those 24 packages is a lot of work, but
that's work that is not relevant to making Sage into a standalone
library. It's work that has *already* been done (and is being done)
for the standalone version of Sage. If (and only if) Sage were to
switch to a different build system, then the standalone Sage library
could just adapt to that.
>> Another point I think is interesting is that the Sage library itself
>> seriously depends on the above 24 C/C++ libraries, which have little
>> or nothing to do with Python2 versus Python3, plus a very small number
>> of Python libraries: numpy, matplotlib, networkx. Sage uses scipy,
>> cvxopt, etc., a tiny, tiny bit, but nothing serious. Even matplotlib
>> is *only* used to draw pictures. Thus if we wanted a Python3 version
>> of the Sage library itself, if we had a library like I describe above,
>> this would only require a Python3 version of numpy and networkx, plus
>> the work of porting the Sage library itself. This doesn't sound so
>> far off, since there already is a Python3 version of numpy.
>>
>
> I'm all for slicing up the current rather monolithic Sage distribution
> into smaller, more manageable parts. Having an independent "Sage
> core", the transition to Python 3 will certainly be less painful (the
> question is not whether, but only when we'll do this). It is even
> thinkable to have some parts using Python 3 (the Sage core, say) and
> some parts still using Python 2 (SageNB? or which parts of the Sage
> distribution were relying on that old version of Twisted?) at one and
> the same time, even in officially shipped distributions ...
I really, really hope that having to ship both never happens, but yes,
it is technically possible.
My plan for migrating the Sage notebook to not use Twisted anymore is
to switch to Flask (http://flask.pocoo.org/). Flask is a small
"microframework", but it only works with Python 2.x, and they have no
plans at present to support Python 3.x. Evidently, they believe that
there is no good mod_wsgi support in Python 3.x yet.
>
> Cheers,
> Georg
Citation: http://flask.pocoo.org/docs/installation/
"At the time of writing, the WSGI specification is not yet finalized for
Python 3, so Flask cannot support the 3.x series of Python."
So it sounds like it's just a matter of time...
Jason