I recently adapted the merge scripts to deal with spkg's in a new way.
Most importantly, changes inside a spkg are automatically *committed*
before merging the spkg into Sage (the spkg is extracted, hg commit is
done using a commit message coming from SPKG.txt, an hg tag is added and
the spkg is repacked). I hope this will make authoring and reviewing
spkg's slightly easier.
This implies that a merged spkg is no longer byte-for-byte identical to
the spkg made by a ticket author.
The new script also adds several sanity checks for a spkg:
1) Inside the spkg, there must be a top-level directory whose name is
the same as the spkg, but with the extension ".spkg" removed.
2) SPKG.txt must contain a line of the form
=== cliquer-1.2.p9 (Jeroen Demeyer, 4 May 2011) ===
(more precisely, it must match /^==* ${spkg_name_and_version} /)
3) There must also be such a line for the previous spkg version (e.g.
any future numpy spkg must mention "numpy-1.5.1" in its SPKG.txt, which
is the version currently in Sage). This is to ensure that a spkg is
based on the most recent version.
4) SPKG.txt and spkg-install must be under hg control.
Further ideas, suggestions, complaints are welcome.
Jeroen.
I agree with the sentiment expressed elsewhere in previous threads that
the changelog should be in the hg log, and not necessarily in the
SPKG.txt file. In other words, I feel like the changes you made should
be reversed---the hg log messages should be insisted on, and the
changelog inside the SPKG.txt should be generated from the hg log. But
it doesn't matter enough to me to change what you've done.
Thanks,
Jason
With no resolution, so thanks for suggesting a solution.
> I do agree that getting SPKG.txt to be automatically generated from
> changelogs would be a nice way to get better changelogs, so maybe I am
> agreeing with Jason?
+1, I find it more natural to work with hg.
To support both workflows, another option is to support going both
ways--if there are uncommitted changes, make an hg entry based on the
spkg.txt, otherwise, update spkg.txt based on the changelog entry +
spkg filename.
- Robert
I like SPKG.txt. Personally I would have called the file "ChangeLog", in common
with just about every other software project, but SPKG.txt does the job. I think it
summarises the changes much better than "hg log" does, where often
numerous changes are made while a ticket is being reviewed.
> I do agree that getting SPKG.txt to be automatically generated from
> changelogs would be a nice way to get better changelogs, so maybe I am
> agreeing with Jason?
>
> - kcrisman
Automatic generation would be more accurate and more detailed. I don't feel
however it would be better. I suspect it will have a lot of details that one
does not see (or want), when trying to get a quick overview of what has happened
to a package.
I agree with David on this, but maybe that is partially because I'm not
very fluent with hg. My personal spkg workflow is NOT to commit changes
until at the very last moment, such that "hg diff" always gives the diff
against the last version. So I modeled the merger script to make this
workflow easier (by not having to do the last step of committing the
changes).
Jeroen.
I guess this suits my workflow too--I would just make extra commits in
between versions. So for me:
1. make all the changes, committing as I go like I would normally do.
2. Make an entry in SPKG.txt which summarizes these changes, as sort of
a changelog for the version bump.
3. Upload the spkg so that Jeroen's script makes one more commit which,
in effect, tags the version number and commits a summary changelog in
SPKG.txt.
That sounds perfect! The details will still be in the hg log from my
commits as I go, and a high-level summary is in the SPKG.txt and
committed as one last commit to the repository.
Jason
I suppose for my spkg workflow (mostly Cython) the new spkg doesn't
usually involve anything more than swapping out the sources and
perhaps adding/removing a patch. Adding an SPKG.txt entry is entirely
redundant with the hg commit (if one is even needed).
- Robert
> Further ideas, suggestions, complaints are welcome.
> Jeroen.
>
I've often wondered if it would be possible to safely remove the write
permissions from the "src" directory and everything below it, so files can't be
accidentally changed.
I believe that would reduce the chances of the "src" being corrupted.
However, there may be the odd package which would fail to build if this was
done, in which case the default permissions should be used.
I would add on top of that for consideration that SPKG.txt often contains
more info than what you would find in a normal changelog. It often has
special instructions about the package, it is much more info than just a
changelog.
Francois
Oh, I agree, that's what SPKG.txt was created for. That's why it
wasn't called ChangeLog to begin with, but now people have been using
it as the changelog.
- Robert
>> I suppose for my spkg workflow (mostly Cython) the new spkg doesn't
>> usually involve anything more than swapping out the sources and
>> perhaps adding/removing a patch. Adding an SPKG.txt entry is entirely
>> redundant with the hg commit (if one is even needed).
>>
> I would add on top of that for consideration that SPKG.txt often contains
> more info than what you would find in a normal changelog. It often has
> special instructions about the package, it is much more info than just a
> changelog.
>
> Francois
Good point.
I know people criticize it, but for me, who has worked with .spkg
files a lot, I find it useful.
Dave
Yes, and +1 for keeping the other valuable information in SPKG.txt
updated and useful.
Jason
I think most packages build inside of that src directory, right?
Changing permissions would mess that up.
Jason
I can imagine something going wrong when src/ is updated to a new
upstream version, but permissions are not going to help that situation.
At that point, src/ is writable.
Jeroen.
There have been a number of packages in which changes to the contents under "src" have been
purposely made by people not knowing what they are doing. I've lost count of them.
Only a week or two ago (during the 4.7 release), there was a file which got
patched in "src" when "patch" was run from spkg-install. It was related to
building Python on some Linux version - I forget the ticket.
I suspect that with the increased use of "patch" and less use of "cp" when applying
patches, it will become easier to patch the upstream source by mistake.
So, if it were possible to protect against that, I think it would be a good idea.
Jeroen.
Using patch is more resistant to this, because it will refuse to apply
the same patch twice.
- Robert
Even better would be to checksum the source in a src.md5 file, and
have sage -spkg warn/error if the checksums don't match. Thus one
couldn't accidentally modify the src directory.
- Robert
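A rough sketch of the src.md5 idea, in Python. Everything here is an assumption made for illustration: the manifest format (one "digest  relative/path" line per file), the function names, and the use of hashlib in place of an external md5 tool. It is not how sage -spkg behaves.

```python
import hashlib
import os

def checksum_tree(root):
    """Map each file's path (relative to root) to the MD5 of its contents."""
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.md5(f.read()).hexdigest()
            sums[os.path.relpath(path, root)] = digest
    return sums

def write_manifest(root, manifest_path):
    """Record the checksums of everything under root (hypothetical format)."""
    sums = checksum_tree(root)
    with open(manifest_path, "w") as out:
        for rel in sorted(sums):
            out.write("%s  %s\n" % (sums[rel], rel))

def changed_files(root, manifest_path):
    """Return the sorted paths that were added, removed or modified."""
    recorded = {}
    with open(manifest_path) as f:
        for line in f:
            digest, rel = line.rstrip("\n").split("  ", 1)
            recorded[rel] = digest
    current = checksum_tree(root)
    return sorted(rel for rel in set(recorded) | set(current)
                  if recorded.get(rel) != current.get(rel))
```

A package tool could call changed_files("src", "src.md5") before building and warn or stop if the list is not empty. Because only file contents are hashed, touching a file without changing it is not reported.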
I think it would use the user's .hgrc file.
> I sometimes worry we're not really using Mercurial for anything.
I agree.
> http://hg.sagemath.org/ paints a very unrealistic picture of the development
> of Sage, for example, in comparison to how a real open source project's code
> repository should look - take for example matplotlib (
> http://github.org/matplotlib ), ipython ( http://github.com/ipython ),
> Octave ( http://hg.savannah.gnu.org/hgweb/octave ), etc.
Especially with the longer release cycles. At the very least it'd be
nice to have a public "devel" repo updated at every alpha release. And
I'd love to have a live head to test and rebase against. Something
like the sage merger script, but automated as every ticket on trac
with positive review + release manager approval + passing on these X
systems (on top of last previous head). It would just crawl forward
over time and always be (relatively) stable.
> To be frank, when
> even you, who are in ultimate charge of our source control, say you are not
> very fluent with the source control mechanism we use, it must mean we are
> all kind of stumbling around... Are we just using hg as a convenient way to
> generate patches and nothing more? I'm in no way an expert on Mercurial or
> on software development practices but these things do worry me.
Over the years, we've moved to using trac as our revision control
mechanism, and mercurial just to generate patches and keep track of
(limited) history. On top of that we don't have good automated/cli
interfaces for dealing with trac, so in some ways it's a step back
from just emailing patches around (though in others a step forward).
The pull-request model of google code/github I think is a much nicer
one, but momentum is hard to change. There's also the advantage of
iterating on the actual commits to produce a cleaner history (e.g.
folding patches together, making corrections after discussion), though
that could be incorporated into the fork/pull model as well (or
keeping patch queues under revision control, a la sage-combinat).
The other problem is that so much isn't under revision control (eg.
what versions of spkgs to use), or in multiple repositories that need
to be kept in sync. Were I to design the system from scratch, I'd put
all our code (devel/scripts/...) in a single repo, along with the
top-level files, and a list of dependencies (spkgs). Building sage
would fetch (locally or remotely) the dependencies listed and build
them in such a way that changing the list of dependencies and
re-building would be easily and cheaply reversible. I would probably
still build my own Python, but may require it (flexible version) as a
bootstrapping prerequisite. Whether the non-upstream parts of an spkg
belong in the spkgs or the main repo, I'm not sure, but I'd rather
*everything* be expressed as commit to a single repository (possibly
moving a pointer to some new, vanilla upstream source, rather than
putting all upstream sources in our repo).
If others have similar views, maybe we could move in that direction.
- Robert
Are you thinking about something like this, a file, say singular-version which
points to the current Singular version? Applying a patch which changes this
textfile implies updating the Singular SPKG? Then, patches can depend on other
patches in a clean way?
> Building sage
> would fetch (locally or remotely) the dependencies listed and build
> them in such a way that changing the list of dependencies and
> re-building would be easily and cheaply reversible. I would probably
> still build my own Python, but may require it (flexible version) as a
> bootstrapping prerequisite. Whether the non-upstream parts of an spkg
> belong in the spkgs or the main repo, I'm not sure, but I'd rather
> *everything* be expressed as commit to a single repository (possibly
> moving a pointer to some new, vanilla upstream source, rather than
> putting all upstream sources in our repo).
>
> If others have similar views, maybe we could move in that direction.
+1
Cheers,
Martin
Yes. Or perhaps a single file that lists all dependencies with
pointers, rather than a whole directory of one-line files.
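To make the single-file idea concrete, here is a small Python sketch. The file format (one "name pointer" pair per line, with "#" comments) and both function names are invented for this example; the thread does not settle on any format. The p9 version below is likewise hypothetical.

```python
def parse_deps(text):
    """Parse lines of the form 'name pointer' into a dict; '#' starts a comment."""
    deps = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, pointer = line.split(None, 1)
        deps[name] = pointer
    return deps

def packages_to_rebuild(old_text, new_text):
    """Names whose pointer was added or changed between two revisions."""
    old, new = parse_deps(old_text), parse_deps(new_text)
    return sorted(name for name in new if old.get(name) != new[name])

# Two revisions of the hypothetical dependency file.
old = "singular 3-1-1-4.p8\nnumpy 1.5.1\n"
new = "singular 3-1-1-4.p9   # bumped by a patch\nnumpy 1.5.1\n"

print(packages_to_rebuild(old, new))
```

With the list under revision control, a patch that bumps a version is an ordinary commit, and reverting that commit says exactly which package to rebuild at the old version.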
I actually developed such a system from scratch (it's called Qsnake:
http://qsnake.com) and pretty much followed your paragraph above, so I
pushed all the repos at github:
for example the Cython repo is here:
https://github.com/qsnake/cython
it is actually a fork of the official Cython repo. And then Qsnake is
clever enough, that when it is fetching the sources, and if there is
setup.py and no spkg-install, it creates spkg-install automatically
with "setup.py install" (and other default stuff), and then saves the
cython.spkg package into the spkg/standard/ directory.
Having all spkg-install scripts in the main repository (so in my case
in this repository: https://github.com/qsnake/qsnake) is a valid idea,
that I have been bouncing around too. I decided not to, to keep things
localized, as sometimes nontrivial modifications are needed to the
upstream packages, and the best way to do such modifications is to
simply create a couple of git patches. Also for testing, if there is
an spkg-install file committed, I can check out the git repo for a
particular package and
qsnake install .
and it will install the package. And I can debug it easily.
Ondrej
Oh, and then people can simply send pull requests to the respective
packages directly. So I think it solves lots of the problems raised in
this thread. For sage, it would have to move to github though, so it
might not be an option.
Ondrej
Jeroen.
I would not rely on that, as a different UID or GID on different
systems will give different results.
MD5 would not be good, as different systems have either no program to
compute an md5 checksum, or have them with different names. Although
many systems have a "sum" command, the algorithm will be system
dependent, so that too can't be used.
In contrast, POSIX defines "cksum" so using that as a checksum tool is
preferable.
Something like the following might be workable
drkirkby@laptop:~/sage-4.7.rc0/spkg/standard/singular-3-1-1-4.p8$ find
src -exec cksum {} \; | awk '{print $1}' | cksum | awk '{print $1}'
3766045910
That computes the checksum of each file (using cksum) and then
computes a checksum of the checksums.
> Oh, and then people can simply send pull requests to the respective
> packages directly. So I think it solves lots of the problems raised in
> this thread. For sage, it would have to move to github though, so it
> might not be an option.
Bitbucket has pull requests and forks, and mercurial has submodules,
IIRC. I don't know exactly how they work, as I'm a big fan of the
git/github branching model over mercurial, but it might be possible to
do these sorts of things with bitbucket and stay with mercurial.
Jason
OK:
find src -print -exec cksum {} \; | awk '{print $1}' | sort | cksum | awk '{print $1}'
gives what I assume will be the same output on all systems.
> I'd prefer it if we also store the checksum for each file together with the
> filename, so one can find out which file was changed. Though one would have
> to write a bit more than just a one-liner to implement it. While we are at
> it, cryptographically sign everything ;-)
I think once a file is known to have been changed, tracking down which one
manually should not be hard, so I don't really see the need to store the
checksums of every file.
But it isn't exposed as an hg repository anywhere, one only gets
snapshots now and then, and not all tests pass at every point. The
fact that so much is not under (a single) revision control makes this
harder.
- Robert
As would touching a file (e.g. to revert a patch or accidental
change), or any change keeping the length of the file the same (yes,
small patches do that sometimes).
> MD5 would not be good, as different systems have either no program to
> compute an md5 checksum, or have them with different names. Although
> many systems have a "sum" command, the algorithm will be system
> dependant, so that too can't be used.
>
> In contrast, POSIX defines "cksum" so using that as a checksum tool is
> preferable.
cksum is cryptographically weak, but strong enough for this purpose.
> Something like the following might be workable
>
> drkirkby@laptop:~/sage-4.7.rc0/spkg/standard/singular-3-1-1-4.p8$ find
> src -exec cksum {} \; | awk '{print $1}' | cksum | awk '{print $1}'
> 3766045910
>
>
> That computes the checksum of each file (using cksum) and then
> computes a checksum of the checksums.
>
Cool.
>> for example the Cython repo is here:
>>
>> https://github.com/qsnake/cython
>>
>> it is actually a fork of the official Cython repo. And then Qsnake is
>> clever enough, that when it is fetching the sources, and if there is
>> setup.py and no spkg-install, it creates spkg-install automatically
>> with "setup.py install" (and other default stuff), and then saves the
>> cython.spkg package into the spkg/standard/ directory.
>>
>> Having all spkg-install scripts in the main repository (so in my case
>> in this repository: https://github.com/qsnake/qsnake) is a valid idea,
>> that I have been bouncing around too. I decided not to, to keep things
>> localized, as sometimes nontrivial modifications are needed to the
>> upstream packages, and the best way to do such modifications is to
>> simply create couple git patches. Also for testing, if there is
>> spkg-install file, committed, I can checkout the git repo for a
>> particular package and
>>
>> qsnake install .
>>
>> and it will install the package. And I can debug it easily.
Is it a reversible install? What about concurrent versions? What about
dependencies (e.g. if the version of Cython was updated, would all the
stuff depending on Cython get re-compiled)?
> Oh, and then people can simply send pull requests to the respective
> packages directly. So I think it solves lots of the problems raised in
> this thread. For sage, it would have to move to github though, so it
> might not be an option.
Probably not. Is it really tied to github, or could any dvcs be plugged in?
- Robert
> The
> fact that so much is not under (a single) revision control makes this
> harder.
You mean that it's harder to rollback because spkg's are not under
revision control? With the current spkg system, that is hard to solve.
Jeroen.
You raise an interesting point. My solution with cksum would not
detect if the file has been touched whilst the contents remain the
same, since it's only checking the contents, not the date.
But if someone has touched a file, but did not change the contents, it
is not really an issue.
>> MD5 would not be good, as different systems have either no program to
>> compute an md5 checksum, or have them with different names. Although
>> many systems have a "sum" command, the algorithm will be system
>> dependant, so that too can't be used.
>>
>> In contrast, POSIX defines "cksum" so using that as a checksum tool is
>> preferable.
>
> cksum is cryptographically weak, but strong enough for this purpose.
Yes, there's a small probability of failure - I think 1 in 2^32,
though perhaps that's not true. It was not designed for cryptographic
purposes, but for just the sort of application being discussed here. I
suspect md5 is more computationally intensive, but the real problem
with md5 is there is no single command which will work on every
system.
Dave
Jeroen.
Well, with a shell script, if it works now, you know it will keep working. The
same can't be said for Python, as numerous backwards-incompatible changes occur
with different versions of Python. We can't upgrade to 2.7 yet, and 3.x is well
over the horizon.
That said, I can't find a one-liner as a shell script which will check the times
and dates too.
Dave
No, currently it's just like in Sage. For reversible install, one
would have to redo every single package to be able to take things from
SPKG_LOCAL (=SAGE_LOCAL in Sage), but install into something like
SPKG_INSTALL, so that one can package it and keep track of files.
Hand in hand with this go "binary packages". If one can do
"uninstall", then immediately one can start creating automatic binary
packages. That would be super cool.
I am currently undecided whether to go with this or not. No doubt it
would be useful. If Sage goes this way, then surely we'll follow too.
Otherwise probably not, as compatibility is important. People who know
Sage should find it quite easy to play with Qsnake and vice versa.
Learning a completely new system and how it works is an obstacle.
Adding uninstall and binary packages will make everything more complex
(imagine installing the wrong binary into incompatible base
install....). Right now it's all just source packages, and possibly
one binary for the whole thing (for each platform), which is
manageable.
> What about concurrent versions?
Only if you rename the package. In this I have the same (or similar)
vision as Sage, that is to create a well tested scientific environment
with just one tested version of each package, that works with
everything. If there are two incompatible versions, then one should
change the name of the package, e.g. python -> python2 + python3. Then
it can live side by side.
> What about
> dependencies (e.g. if the version of Cython was updated, would all the
> stuff depending on Cython get re-compiled?
No. Only the other way round -- if you want to install "phaml", that
uses Cython to wrap Fortran, it will also pull in "cython"
automatically (as well as all the other packages). However, since the
dependency tree is known, we can add this feature as well, to
recompile all "rdepends" (reverse dependencies, to use Debian
terminology).
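Since the dependency tree is known, finding the reverse dependencies is a short graph walk. The Python sketch below is only an illustration and is not taken from Qsnake: the function name and the dependency table are hypothetical, and only the "phaml uses cython" relation comes from this thread.

```python
def reverse_dependencies(depends, package):
    """Return every package that depends, directly or indirectly, on `package`."""
    # Invert the "depends on" map into an "is needed by" map.
    needed_by = {}
    for pkg, deps in depends.items():
        for dep in deps:
            needed_by.setdefault(dep, set()).add(pkg)
    # Walk outwards from `package`, collecting everything that needs it.
    result, stack = set(), [package]
    while stack:
        for user in needed_by.get(stack.pop(), ()):
            if user not in result:
                result.add(user)
                stack.append(user)
    return result

# Hypothetical dependency table, for illustration only.
depends = {
    "cython": ["python"],
    "phaml": ["cython", "python"],
    "numpy": ["python"],
}

print(sorted(reverse_dependencies(depends, "cython")))
```

The result is an unordered set; actually recompiling would still need the packages sorted so that each one is rebuilt after the things it depends on.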
>
>> Oh, and then people can simply send pull requests to the respective
>> packages directly. So I think it solves lots of the problems raised in
>> this thread. For sage, it would have to move to github though, so it
>> might not be an option.
>
> Probably not. Is it really tied to github, or could any dvcs be plugged in?
I just made two releases in the last couple of days; one can download them from here:
https://github.com/qsnake/qsnake/archives/master
and that is just one big source tarball, and installs completely
locally, no git is needed (it will actually install its own
automatically --- but it is not needed for the build system). Git is
only used to create the big source tarball using "qsnake -d". It can
be easily changed to "hg" in the build system (and then one needs to
move all the packages, which is simple but tedious).
Ondrej
That's actually the first issue I have created about Qsnake some time ago :)
https://github.com/qsnake/qsnake/issues/1
so I just updated it with your idea to automatically recompile reverse
dependencies. That's a great idea. Once implemented, just this should
keep you up to date with the latest git versions of all packages:
qsnake update
qsnake upgrade
Ondrej
That was my point, +1 to chksumming the files, not the listing.
Preserving dates is nice, but that will make it much harder to work on
files (e.g. if I patch a file to test something out, then I'd have to
touch it back to the old time, if I can even remember that, or
re-unpack the sources). If Makefiles are an issue, one can touch the
entire directory to have the same timestamp.
>>> MD5 would not be good, as different systems have either no program to
>>> compute an md5 checksum, or have them with different names. Although
>>> many systems have a "sum" command, the algorithm will be system
>>> dependant, so that too can't be used.
>>>
>>> In contrast, POSIX defines "cksum" so using that as a checksum tool is
>>> preferable.
>>
>> cksum is cryptographically weak, but strong enough for this purpose.
>
> Yes, there's a small probability of failure - I think 1 in 2^32,
> though perhaps that's not true. It was not designed for cryptographic
> purposes, but for just the sort of application being discussed here.
I brought up the issue because someone mentioned signatures.
> I suspect md5 is more computationally intensive, but the real problem
> with md5 is there is no single command which will work on every
> system.
Although finding that single command can be bothersome--with shell
scripting there are a whole lot of commands that work on one (or even
most) posix systems and not on all of them, or even in one shell and
not another, as evidenced by the amount of work spent porting the <1MB
shell scripts we have to Solaris. With Python, if it works on my
system it has a 99.9% chance of working on yours, and
version-to-version changes, though non-trivial, are small (e.g. Cython
runs with 2.3 through 2.7 on a single codebase). 3.x excluded, but
that was 5+ years of backwards incompatible changes all pushed at
once. And of course in Sage we end up using/interfacing with a lot of
C libraries, which are not so portable.
That being said, our hands are tied as Python is not a prerequisite
for building that first spkg.
- Robert
I could see this done with paths, with every spkg installing into its
own versioned directory, and the python/shell/include/library paths
set up to have the list of all currently enabled spkgs.
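As a sketch of what that could look like, assuming a layout of <install_root>/<name-version>/{bin,lib,include} (the layout, the function name and the version strings below are all made up for this example):

```python
import os

def search_path(install_root, enabled_spkgs, subdir):
    """Join the `subdir` of every enabled spkg into one search-path string."""
    dirs = [os.path.join(install_root, spkg, subdir) for spkg in enabled_spkgs]
    return os.pathsep.join(dirs)

# Hypothetical list of enabled spkgs; earlier entries take precedence.
enabled = ["python-2.6.4.p10", "cython-0.14.1"]

print(search_path("/opt/sage/spkgs", enabled, "bin"))
```

Trying out a new spkg would then mean changing one entry in the list and regenerating the paths, and going back would mean restoring the old entry, since nothing is overwritten in place.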
> Hand in hand with this go "binary packages". If one can do
> "uninstall", then immediately one can start creating automatic binary
> packages. That would be super cool.
>
> I am currently undecided whether to go with this or not. No doubt it
> would be useful. If Sage goes this way, than surely we'll follow too.
> Otherwise probably not, as compatibility is important. People who know
> Sage should find it quite easy to play with Qsnake and vice versa.
> Learning a completely new system and how it works is an obstacle.
>
> Adding uninstall and binary packages will make everything more complex
> (imagine installing the wrong binary into incompatible base
> install....). Right now it's all just source packages, and possibly
> one binary for the whole thing (for each platform), which is
> manageable.
>
>> What about concurrent versions?
>
> Only if you rename the package. In this I have the same (or similar)
> vision as Sage, that is to create a well tested scientific environment
> with just one tested version of each package, that works with
> everything.
Yes, that's what I'm thinking, but it'd be nice to be able to, e.g.,
try out a new spkg without hosing the entire install (in a possibly
irreversible manner).
> If there are two incompatible versions, then one should
> change the name of the package, e.g. python -> python2 + python3. Then
> it can live side by side.
Or python2.6p10.
>> What about
>> dependencies (e.g. if the version of Cython was updated, would all the
>> stuff depending on Cython get re-compiled?
>
> No. Only the other way round -- if you want to install "phaml", that
> uses Cython to wrap Fortran, it will also pull in "cython"
> automatically (as well as all the other packages). However, since the
> dependency tree is known, we can add this feature as well, to
> recompile all "rdepends" (reverse dependencies, to use Debian
> terminology).
That would be useful to get the known-stable environment, regardless
of the order you install packages. That's one of the things that's so
appealing about http://nixos.org/nix/
>>> Oh, and then people can simply send pull requests to the respective
>>> packages directly. So I think it solves lots of the problems raised in
>>> this thread. For sage, it would have to move to github though, so it
>>> might not be an option.
>>
>> Probably not. Is it really tied to github, or could any dvcs be plugged in?
>
> I just made two releases last couple days, one can download it from here:
>
> https://github.com/qsnake/qsnake/archives/master
>
> and that is just one big source tarball, and installs completely
> locally, no git is needed (it will actually install its own
> automatically --- but it is not needed for the build system). Git is
> only used to create the big source tarball using "qsnake -d". It can
> be easily changed to "hg" in the build system (and then one needs to
> move all the packages, which is simple but tedious).
I was thinking of something even simpler, where the dependent packages
would just be tarballs (at least as an option), so no revision control
is needed by the build system at all.
- Robert
I was thinking about this a lot yesterday, and there are a lot more
issues to resolve than it seems at first sight. In particular, some
packages, like Python, do some recompiling of modules (I think,
but maybe I am wrong) and some other stuff in the place where they are
installed. Some other packages (setuptools?) modify some stuff
as well (at least I read that somewhere).
Pretty much, as long as the "installation" is just copying of files,
then it should work. But if you also need to modify some stuff after
installing it (post install script in Debian/Ubuntu), then things
become more complex.
With our current approach, one is free to do any kind of necessary
tweaks in $SPKG_LOCAL (=SAGE_LOCAL) to make things work. Usually by
the build system of the package itself.
You can download tarballs from github, for example for the Qsnake's
cython package:
https://github.com/qsnake/cython/archives/master
without having git installed. So in principle the build system can use
it too. I just stick with git for now.
Ondrej
I suppose I was imagining checking on both ends, but that's not really
necessary.
- Robert
True, but I can't think of anything in Sage where one needs to modify
the environment any more than put a file somewhere that it's
accessible (though such a thing could be possible).
I agree a general solution is much more subtle.
Ah, yes.
> I just stick with git for now.
Nothing against github (in fact I really like it), it's just that I'm
wary of making my infrastructure heavily dependent on someone else's
for some things.
- Robert
One would have to try and see. Nice thing about the current SPKG
format is that it is extremely simple, and although it doesn't allow
uninstall, in my opinion it is completely general in terms of making
sure the result works.
My point is that in terms of both simplicity (=understandability,
maintenance, time for people to learn it, use it, reuse it, ....) and
functionality (=getting any package to work) together, it might not be
possible to beat the current system.
However, one should always try, that's for sure.
>>> [...]
>>> I was thinking of something even simpler, where the dependent packages
>>> would just be tarballs (at least as an option), so no revision control
>>> is needed by the build system at all.
>>
>> You can download tarballs from github, for example for the Qsnake's
>> cython package:
>>
>> https://github.com/qsnake/cython/archives/master
>>
>> without having git installed. So in principle the build system can use
>> it too.
>
> Ah, yes.
>
>> I just stick with git for now.
>
> Nothing against github (in fact I really like it), it's just that I'm
> wary of making my infrastructure heavily dependent on someone else's
> for some things.
No, you have a good point. Sage relies on the packages hosted here:
http://sagemath.org/packages/standard/
or (equivalently) on the packages distributed in the source tarball,
hosted here at any of these mirrors:
http://sagemath.org/download-source.html
So it's quite safe. I don't have many computers at my disposal, so I
chose github, which allows me to upload files larger than 100MB
(unlike google code).
I was thinking about this too yesterday, and I think there is a *high*
value in having a full source distribution, preferably with all the
git histories for all the subprojects (Qsnake currently strips the
.git repository from each package after downloading it from github for
space reasons, but that's trivial to change), so that if the internet
goes down, or github crashes, as long as enough people have downloads
of the sources, there is pretty much no harm done and one can happily
continue developing scientific applications, without any loss. Sage
currently also doesn't have, say, the sympy git history or the Cython
git history.
Ondrej
It just occurred to me, that it should be possible to keep the current
SPKG format, and implement uninstall. One just needs to keep track of
all files in SPKG_LOCAL, then see what new files were added + which
files have changed.
If a file has changed, then a warning should be produced, and we would
look at each case manually. Maybe it's possible to make the whole of
Sage (or Qsnake in my case) build without changing any files, only
adding them.
If the file was just added, we'll keep track of it. And when the
package is uninstalled, it will simply be removed. Currently we remove
the old files in the spkg-install for some packages, and that's a
hack.
I'll try to implement this, I think that this feature would be really
cool. With this, one can also (trivially) create a binary package, and
store it in let's say spkg/binary in the local install.
Wow, this is exciting!
Ondrej
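The tracking idea above can be sketched roughly as follows, assuming a content hash is an acceptable way to decide that a file "has changed". The names snapshot and diff_snapshots are hypothetical; nothing like this exists in the current SPKG scripts.

```python
import hashlib
import os


def snapshot(root):
    """Map each file under root (as a relative path) to a content digest."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Record where a symlink points instead of following it.
                digest = "link:" + os.readlink(path)
            else:
                with open(path, "rb") as f:
                    digest = hashlib.sha1(f.read()).hexdigest()
            state[os.path.relpath(path, root)] = digest
    return state


def diff_snapshots(before, after):
    """Return (added, changed) lists of relative paths."""
    added = sorted(p for p in after if p not in before)
    changed = sorted(p for p in after if p in before and after[p] != before[p])
    return added, changed
```

Taking one snapshot of SPKG_LOCAL before spkg-install runs and one after gives the "added" list to record for uninstall, and a non-empty "changed" list is the case that would produce the warning.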
> It just occurred to me, that it should be possible to keep the current
> SPKG format, and implement uninstall. One just needs to keep track of
> all files in SPKG_LOCAL, then see what new files were added + which
> files have changed.
Why not something like
find local -print > foobar.preinstall
install foobar.spkg
find local -print > foobar.postinstall
then if an uninstall is required, one deletes the files in foobar.postinstall
which are not in foobar.preinstall
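A rough Python equivalent of the find-based recipe, for illustration only. The helper names (list_files, write_manifest, read_manifest, uninstall) are hypothetical, and unlike `find local -print` this sketch records files only, not directories or symlinks to directories.

```python
import os


def list_files(root):
    """Collect the paths of all files below root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.join(dirpath, name))
    return found


def write_manifest(paths, manifest):
    """Write one path per line, sorted, like the output of find."""
    with open(manifest, "w") as f:
        for path in sorted(paths):
            f.write(path + "\n")


def read_manifest(manifest):
    """Read a manifest back into a set of paths."""
    with open(manifest) as f:
        return {line.rstrip("\n") for line in f if line.strip()}


def uninstall(preinstall, postinstall):
    """Delete the files listed in postinstall but not in preinstall."""
    removed = sorted(read_manifest(postinstall) - read_manifest(preinstall))
    for path in removed:
        if os.path.lexists(path):
            os.remove(path)
    return removed
```

The set difference is only meaningful if nothing else writes into the tree between the two listings, which is the weak point of any scheme that generates the list at install time.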
That's a good point, didn't occur to me, that it won't work for
parallel compilation.
Does Sage work with parallel installation of packages? Looking at the README:
http://boxen.math.washington.edu/sage/src/README.txt
it doesn't seem to be the default way? I also started to compile Sage
4.6.2 on my computer, and it seems to be compiling in sequential mode.
> so in the spkg that lists them. During single package build one could
> automatically check that it is up to date, but the actual list of files
> needs to be distributed with the spkg.
Personally (and that is just my opinion), I don't like to maintain a
list of files in the spkg itself, I don't think that's a good
solution.
I think that a better solution is to disable uninstall if the user
uses parallel compilation of packages. Note that parallel compilation
(make -j9) inside one package is ok.
> With that information it would be relatively easy to automatically translate
> spkgs into distribution source packages (e.g. srpm). So in the long run we
> could make use of native package management schemes...
Ondrej
On Tue, May 10, 2011 at 7:19 AM, Volker Braun <vbraun.name@gmail.com> wrote:
> IMHO the list of installed files is an integral piece of package management
> and should explicitly be part of the spkg. Automatically generating it is
> not an option during parallel compilation. There should be a "spkg-files" or [...]
This wouldn't be as painful if wildcards are allowed...
> With that information it would be relatively easy to automatically translate
> spkgs into distribution source packages (e.g. srpm). So in the long run we
> could make use of native package management schemes...
I still like the idea of everything installing into their own
directory, and the final "view" is the union of the directories (e.g.
via PATHS or symlinks) rather than trying to track/synchronize every
package stomping over the same (set of) directories.
- Robert
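The per-package-directory idea can be sketched as a symlink farm. This is a minimal illustration on a POSIX filesystem, with hypothetical names link_package and unlink_package and a hypothetical layout of one directory per package plus one shared "view" directory; it is not an existing Sage mechanism.

```python
import os


def link_package(pkg_dir, view_dir):
    """Symlink every file of pkg_dir into view_dir, refusing to overwrite."""
    planned = []
    for dirpath, _dirnames, filenames in os.walk(pkg_dir):
        rel = os.path.relpath(dirpath, pkg_dir)
        for name in filenames:
            target = os.path.abspath(os.path.join(dirpath, name))
            link = os.path.normpath(os.path.join(view_dir, rel, name))
            if os.path.lexists(link):
                raise FileExistsError(link + " is already provided by another package")
            planned.append((target, link))
    # Only touch the view once the whole package is known to be conflict-free.
    for target, link in planned:
        os.makedirs(os.path.dirname(link), exist_ok=True)
        os.symlink(target, link)
    return [link for _target, link in planned]


def unlink_package(pkg_dir, view_dir):
    """Remove the links in view_dir that point into pkg_dir."""
    prefix = os.path.abspath(pkg_dir) + os.sep
    removed = []
    for dirpath, _dirnames, filenames in os.walk(view_dir):
        for name in filenames:
            link = os.path.join(dirpath, name)
            if os.path.islink(link) and os.readlink(link).startswith(prefix):
                os.remove(link)
                removed.append(link)
    return sorted(removed)
```

With this layout uninstalling is unlink_package followed by deleting the package directory, and two packages providing the same path fail loudly at link time instead of silently overwriting each other.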
Sage uses SAGE_LOCAL, but SPKG_LOCAL is more project neutral, so I use that.
> packagename.versionnr before each install of an SPKG and see what
> breaks (probably a lot). But if we get things working again then
> uninstall is just as easy as deleting a directory. This would make the
> step to something like nix very small.
>
> Note that a lot of SPKG's also install stuff into something like
> python*/site-packages.
>
> And I second the idea that the list of files should not be
> autogenerated during install but be a part of the SPKG. Although the
> initial file lists in the SPKG can be autogenerated of course :).
What would be the advantage of having it in the SPKG itself?
Like if you want to install two packages that overwrite the same file?
Ondrej
>> Does Sage work with parallel installation of packages?
>>
>
> Absolutely. Do:
>
> $ export SAGE_PARALLEL_SPKG_BUILD=yes
> $ export MAKE='make -j8'
> $ make
>
> See the installation guide for information about the relevant environment
> variables. I think that we should set SAGE_PARALLEL_SPKG_BUILD to "yes"
> automatically -- it works very well, according to everyone I've talked to
> about it.
I agree with that too. I still think it would be wise to be able to disable it,
as one might want to build individual packages in parallel, but not have the
resources to build loads of different packages in parallel. It would certainly
be an issue on my laptop!
But I think it would be better to have this as the default, as for the vast
majority of cases it is beneficial.
Definitely, two packages should not override the same files.
So what you have in mind is:
* automatically generate the list of files from *sequential* builds,
store it in spkg (or possibly somewhere else)
* use this in all default builds (either parallel or sequential), Sage
would store the list of files somewhere, and use it for uninstall
Is that right?
Ondrej
On Thu, May 12, 2011 at 5:29 AM, Maarten Derickx
<m.derick...@gmail.com> wrote:
[...]
>>
>> What would be the advantage of having it in the SPKG itself?
>>
> That it will be compatible with parallel building as mentioned earlier.
>
>> Like if you want to install two packages that overwrite the same file?
> I don't think we should ever do (or even want) such a thing. I think
> every spkg should only touch its own files (or else we will get into
> an unpredictable mess if we also want uninstall and parallel
> building). If you really want an spkg to touch a file created by, for
> example, foo.spkg, one should instead make a patch for foo.spkg.
> Maybe we should add some code that checks if there are spkg's breaking
> this rule.
That's certainly a valid question! We usually first install blas, which produces
a libf77blas, then lapack, which produces liblapack. Finally it is ATLAS's
turn. ATLAS, if I am not mistaken, simply overwrites the libf77blas from blas.
Then it takes liblapack and modifies it.
Note that the current plans that we have with Volker for blas/lapack/atlas
involve getting rid of the blas spkg and of the individual lapack spkg, and
building {f77,c}blas and lapack in one go in the ATLAS spkg.
Francois
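The rule that every spkg should only touch its own files could be checked mechanically once per-package file lists exist. A minimal sketch, assuming the lists are available as a mapping from package name to installed paths; the function name find_conflicts and the file names in the usage example are hypothetical.

```python
from collections import defaultdict


def find_conflicts(manifests):
    """Return {path: [packages]} for every path claimed by several packages.

    manifests maps a package name to an iterable of the paths it installs.
    """
    owners = defaultdict(set)
    for package, paths in manifests.items():
        for path in paths:
            owners[path].add(package)
    return {path: sorted(pkgs) for path, pkgs in owners.items() if len(pkgs) > 1}


if __name__ == "__main__":
    # Illustrative file names only; the real library paths may differ.
    example = {
        "blas": ["lib/libf77blas.a"],
        "lapack": ["lib/liblapack.a"],
        "atlas": ["lib/libf77blas.a", "lib/liblapack.a", "lib/libatlas.a"],
    }
    for path, packages in sorted(find_conflicts(example).items()):
        print(path, "is installed by", ", ".join(packages))
```

A check like this would flag the blas/lapack/ATLAS situation described above, which is one reason merging them into a single spkg makes the rule easier to enforce.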
And short of making it the default, at least it could be documented in
the README file or the Makefile (in addition to the install guide):
wstein@ubuntu:~/sage-4.7$ grep "SAGE_PARALLEL" *
wstein@ubuntu:~/sage-4.7$
>
> --
> John
>
--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org
I don't think we should permit combinations - it makes testing for the variable
more difficult. When do we stop if we allow any combination of variable names?
IMHO, we should document this in a few places, so people are more likely to find
it. Perhaps in the top level README.txt.