So as you probably noticed, even though the 0.7.2 branch has existed
for over a month, we still haven't even pushed out a release
candidate. If anyone is interested, the current (and hopefully final)
blocker is
https://github.com/sympy/sympy/pull/1561 (and I guess also
https://github.com/sympy/sympy/pull/1559).
Now here's a timeline: 0.7.1 was released July 29, 2011, more than a
year and two months ago. 0.7.0 was released just over a month before
that, on June 28. 0.6.7 was released March 18, 2010, again over a
year before 0.7.0. In almost two years' time, we've had three
releases, and are struggling to get out a fourth. And it's not like
there were no changes; quite the opposite in fact. If you look at
SymPy 0.6.6 compared to the current master, the amount of change that
has gone in during that time is unbelievable. Since then we've had
the new polys, at least four completely new submodules
(combinatorics, sets, differential geometry, and stats), massive
improvements to integration and special functions, a ton of new stuff
in the physics module, literally thousands of bug fixes, and the list
goes on. Each of these changes on its own was enough to warrant a
release.
So in case I didn't make my point, let me state it explicitly: we need
to release more often. We need to release *way* more often.
Well, we've had this discussion before. Saying that we need to
release more often and actually releasing are different things. We
clearly want to release, but something is keeping it from happening.
The fact that 0.7.2 has existed for over a month should make that
clear. So here are the things that I see as the major blockers to
releases in the past two years:
1. Test failures in master make it impossible to release.
Fortunately, I believe that we may have finally overcome this barrier,
which is exciting. Thanks to Travis, SymPy Bot, and the feature of
GitHub that warns against merging a failing branch, we haven't had
test failures slip into master for a while (Travis doesn't seem to
have any kind of waterfall view, and they mix in pull request tests
with the main tests, so it's hard to quantify this, but I haven't
noticed any for about a month or so). So as far as this point goes, I
think we just need to keep making sure that we don't press the merge
button unless tests are passing and, if anything, add more to the
list of things that are tested (see below).
2. There are a dozen things that need to be done when we release.
They are all outlined at
https://github.com/sympy/sympy/wiki/new-release. If you go through
that page, you will see that there really are a ton of things to do.
And some of them are even more work than they look. For example, at
the end, there are 12 different independent sites that need to be
updated.
3. There is only one person who has volunteered to do the release work
(namely, me). Don't get me wrong. Lots of people help out by
submitting fixes for test failures and various other things, and I
very much appreciate it. But I'm talking specifically about the list of
things on that wiki page, which are specific to doing a release.
So here is what I think we **have** to do, or else we will never
release more often than our current rate of less than once a year:
- We have to automate the release process. I'm referring specifically
to the stuff on
https://github.com/sympy/sympy/wiki/new-release. We
should have some script, say bin/release.py, and to do a release, all
we should have to do is run that script. And literally everything
should be in that script. If there is some step in the release
process that is not automatable, we should either attempt to make it
automatable, or ask if it is really necessary to be done before each
release. One thing that I already know will be an issue is the example
notebooks (see
https://github.com/ipython/ipython/issues/1195). Also,
we will either need to automate the process of updating all the
various sites and packaging systems, or make concessions that someone
else will have to be responsible for doing so.
The script should first run the tests (or just check travis and the
review server). Additionally, it should run all the other little
things from that wiki page that are "tests", like seeing if plotting
works, testing the speed of import, and making sure AUTHORS and
.mailmap are up to date. If any of these fails, it should just stop,
and tell you what has to be done.
The parts that are not specific to a release we should test always
("always" meaning, against every pull request). We are already
starting to do this with the docs with sympy-bot. We should test
other things as well. This is essentially to fix 1. from above: the
fewer things that actually have to be done at release time, the less
time it will take.
If everything works, then it should change the version numbers, run
all the setup.py commands to create the tarballs (including the docs),
and upload them to the necessary sites. It should use oauth tokens
for GitHub, PyPI, (and Google Code?) so that the user only ever has to
enter their password once; after that it will get a token and save
it for future use (much like sympy-bot does).
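To make the "run the checks, stop at the first failure, and say what
has to be done" idea concrete, here is a minimal sketch of what the
skeleton of such a script might look like. Everything in it (the
names Step, command_step, and run_release, and the commands in the
usage note) is made up for illustration; it is not an existing SymPy
script, and the real list of steps would come from the wiki page.

```python
import subprocess
import sys
from typing import Callable, List, NamedTuple, Optional


class Step(NamedTuple):
    """One release step (hypothetical structure, for illustration only)."""
    name: str
    run: Callable[[], bool]  # returns True if the step succeeded
    hint: str                # what the releaser should do if it fails


def command_step(name: str, argv: List[str], hint: str) -> Step:
    """Wrap an external command; it passes if the exit status is 0."""
    def run() -> bool:
        return subprocess.call(argv) == 0
    return Step(name, run, hint)


def run_release(steps: List[Step]) -> Optional[Step]:
    """Run the steps in order and stop at the first failure.

    Returns the step that failed, or None if every step passed.
    """
    for step in steps:
        print("==> " + step.name)
        if not step.run():
            print("FAILED: " + step.name)
            print("What to do: " + step.hint)
            return step
    print("All steps passed.")
    return None
```

A release would then just be an ordered list of steps, for example
command_step("build tarball", [sys.executable, "setup.py", "sdist"],
"fix the packaging error and rerun"), followed by
sys.exit(1 if run_release(steps) else 0). The commands are
placeholders. The point is only that checks and uploads become
entries in one list, so adding a step from the wiki page means adding
one line.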
- One thing that I know cannot be automated, but still needs to be
done is the release notes. If someone has any good ideas on how to
make this smoother, I'd like to hear them. One way would be to make
it a policy that a pull request containing major changes should not be
merged until those changes are documented in the release notes. I'm
doubtful of this, though, because policies are useless unless well
enforced.
Perhaps it might be possible to have a tool that searches through the
git history and tries to heuristically tell what parts are major
changes. Maybe such a tool already exists?
Another idea would be to make it a policy for whoever merges to write
some notes in the little box that comes up when you click the merge
button, which is then put in the merge commit message. This could be
based on the OP of the PR. IPython does this a lot. We could then
gather these messages into the release notes.
Or, perhaps this problem will be solved if we can solve the other
problems. After all, if a release consists of only a few hundred
commits (or say, a few dozen pull requests), it won't be that big of a
deal to comb through them and write down what has changed.
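As a rough sketch of the merge-message idea: assuming mergers really
do write a short note in the merge commit body, a first draft of the
release notes could be pulled straight out of git. The function names
below are hypothetical, and the output would still need to be edited
by hand; this only does the gathering.

```python
import subprocess

# ASCII unit/record separators, which are unlikely to occur in commit text.
FIELD_SEP = "\x1f"
RECORD_SEP = "\x1e"


def merge_log(rev_range):
    """Return raw `git log` output for the merge commits in rev_range."""
    # %h = short hash, %s = subject, %b = body, %xNN = literal byte.
    return subprocess.check_output(
        ["git", "log", "--merges", "--format=%h%x1f%s%x1f%b%x1e", rev_range],
        universal_newlines=True,
    )


def parse_merges(raw):
    """Split raw log output into (sha, subject, body) tuples."""
    entries = []
    for record in raw.split(RECORD_SEP):
        # Strip newlines only: a bare strip() would also eat FIELD_SEP,
        # which Python treats as whitespace.
        record = record.strip("\n")
        if not record:
            continue
        sha, subject, body = record.split(FIELD_SEP, 2)
        entries.append((sha, subject.strip(), body.strip()))
    return entries


def draft_notes(entries):
    """Turn merge entries into a bulleted draft, one line per merge."""
    lines = []
    for sha, subject, body in entries:
        # Prefer the first paragraph the merger typed; otherwise fall
        # back to the automatic "Merge pull request ..." subject line.
        summary = body.split("\n\n")[0].replace("\n", " ") if body else subject
        lines.append("* %s (%s)" % (summary, sha))
    return "\n".join(lines)
```

Calling draft_notes(parse_merges(merge_log(rev_range))) with a range
from the last release tag to HEAD would print one bullet per merged
pull request. Merges without a note show up with only the automatic
subject line, which also makes it easy to see which ones still need a
description.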
- Regarding point number 3, I'm really not sure how to approach this
one. There are really very few things that require even push access,
much less any other kind of administrative access to do. And those
things that do, I will **gladly** give to anyone with push access if
they are willing to do it (e.g., admin access to PyPI). My goal with
the release script is that anyone with push access should be able to
run it. I again am open to suggestions on this one. I think the only
real way to fix this is to create a culture in the SymPy community
that anyone can do a release. I am **not** the release manager for
SymPy. I am simply the person who has volunteered to do the release
work.
- I think that one other thing that has held back many releases is the
feeling of "wait, we should put this in the release". The use of a
release branch has helped keep master moving along independently, but
there still seems to be the feeling with many branches of, "this is a
nice feature, it ought to go in the release." My hope is that by
making the release process smoother, we can release more often, and
this feeling will go away, because it won't be a big deal if something
waits until the next release. As far as deprecations go, the real
issue with them is time, not release numbers. So if we deprecate a
feature today vs. one month from today, it's not a big deal (as
opposed to today vs. a year from today), regardless of how many
versions are in between.
I read about what GitHub does for their Windows product regarding
releasing often on their blog:
https://github.com/blog/1271-how-we-ship-github-for-windows (they
actually have this philosophy for all their products). One thing that
they said is, "And by shipping updates so often, there is less anxiety
about getting a particular feature ready for a particular release. If
your pull request isn’t ready to be merged in time for today’s
release, relax. There will be another one soon, so make that code
shine!" I think that is exactly the point here. Another thing that
they noted is that automation is the key to doing this, which is what
I am aiming for with the above point.
- Once we start releasing very often (and believe me, this is way down
the road, but I'm trying to be forward looking here), we can do away
with release candidates. A release candidate lives in the wild for a
week before the full release. But if we are capable of releasing
literally every week, then having release candidates is pointless. If
a bug slips into a release, we just fix it and it will be in the next
release.
So, to summarize, I think the ideal situation would be:
- Everything "test"-wise that needs to be done for a release would
always be working in master, because we would test every pull request
against it. So there would not even be a need to check it before
releasing, other than to just check that it's working in master (and
if we are testing every pull request, that will be clear).
- Making a release would just be a matter of running release.py and
doing what little needs to be done manually (like emailing the list
and compiling the release notes).
- Everyone, if they have push access, is also trusted enough to do a release.
- We should release *at least* once a month. I think that if the
process is automated enough, this will be very possible (as
opposed to the current situation, where the release branch lasts
longer than a month). In times of high activity, we can release more
often than that (e.g., after a big pull request is merged, we can
release).
I'm open to suggestions, thoughts, concerns, and (most importantly of
all) pull requests starting to implement release.py. If anyone from
another open source community has experience with this sort of thing
(either positive or negative), I'd love to hear it.
Finally, I'd like to keep this thread on the topic of releases in
general, but if I've inspired someone to help out with this release,
just let me know, and I'll tell you what still needs to be done. But
let's keep that discussion over at
https://github.com/sympy/sympy/pull/1507.
Aaron Meurer