Releasing more often: what needs to be done if you want it to happen


Aaron Meurer

Oct 7, 2012, 11:53:59 PM
to sy...@googlegroups.com
So as you probably noticed, even though the 0.7.2 branch has existed
for over a month, we still haven't even pushed out a release
candidate. If anyone is interested, the current (and hopefully final)
blocker is https://github.com/sympy/sympy/pull/1561 (and I guess also
https://github.com/sympy/sympy/pull/1559).

Now here's a timeline: 0.7.1 was released July 29, 2011, more than a
year and two months ago. 0.7.0 was released just over a month before
that, on June 28. 0.6.7 was released March 18, 2010, again over a
year before 0.7.0. In over two and a half years, we've had three
releases, and are struggling to get out a fourth. And it's not like
there were no changes; quite the opposite in fact. If you look at
SymPy 0.6.6 compared to the current master, it's unbelievable the
amount of changes that have gone forward in that time. We've had
since then the new polys, at least four completely new submodules
(combinatorics, sets, differential geometry, and stats), massive
improvements to integration and special functions, a ton of new stuff
in the physics module, literally thousands of bug fixes, and the list
goes on. Each of these changes on its own was enough to warrant a
release.

So in case I didn't make my point, let me state it explicitly: we need
to release more often. We need to release *way* more often.

Well, we've had this discussion before. Saying that we need to
release more often and actually releasing are different things. We
clearly want to release, but something is keeping it from happening.
The fact that 0.7.2 has existed for over a month should make that
clear. So here are the things that I see as the major blockers to
releases in the past two years:

1. Test failures in master make it impossible to release.
Fortunately, I believe that we may have finally overcome this barrier,
which is exciting. Thanks to Travis, SymPy Bot, and the feature of
GitHub that warns against merging a failing branch, we haven't had
test failures slip into master for a while (Travis doesn't seem to
have any kind of waterfall view, and they mix in pull request tests
with the main tests, so it's hard to quantify this, but I haven't
noticed any for about a month or so). So I think actually as far as
this point, we just need to continue to make sure that we don't press
the merge button unless tests are passing, and if anything else, add
more to the list of things that are tested (see below).

2. There are a dozen things that need to be done when we release.
They are all outlined at
https://github.com/sympy/sympy/wiki/new-release. If you go through
that page, you will see that there really are a ton of things to do.
And some of them are even more work than they look. For example, at
the end, there are 12 different independent sites that need to be
updated.

3. There is only one person who has volunteered to do the release work
(namely, me). Don't get me wrong. Lots of people help out by
submitting fixes for test failures and various other things, and I
very much appreciate it. But I'm talking specifically about the list of
things on that wiki page, which are specific to doing a release.

So here is what I think we **have** to do, or else we will never
release more often than our current pace of less than once a year:

- We have to automate the release process. I'm referring specifically
to the stuff on https://github.com/sympy/sympy/wiki/new-release. We
should have some script, say bin/release.py, and to do a release, all
we should have to do is run that script. And literally everything
should be in that script. If there is some step in the release
process that is not automatable, we should either attempt to make it
automatable, or ask if it is really necessary to be done before each
release. One thing that I already know will be an issue is the example
notebooks (see https://github.com/ipython/ipython/issues/1195). Also,
we will either need to automate the process of updating all the
various sites and packaging systems, or make concessions that someone
else will have to be responsible for doing so.

The script should first run the tests (or just check travis and the
review server). Additionally, it should run all the other little
things from that wiki page that are "tests", like seeing if plotting
works, testing the speed of import, and making sure AUTHORS and
.mailmap are up to date. If any of these fails, it should just stop,
and tell you what has to be done.

The parts that are not specific to a release we should test always
("always" meaning, against every pull request). We are already
starting to do this with the docs with sympy-bot. We should test
other things as well. This is essentially to fix 1. from above: the
fewer things that actually have to be done at release time, the less
time it will take.

If everything works, then it should change the version numbers, run
all the setup.py commands to create the tarballs (including the docs),
and upload them to the necessary sites. It should use oauth tokens
for GitHub, PyPI, (and Google Code?) so that the user only has to ever
enter his password once; and after that it will get a token and save
it for future use (much like sympy-bot does).
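
As a rough sketch of the "change the version numbers" step, assuming
the version is stored as a single `__version__ = "..."` line in one
file (where that line lives is an assumption here, and set_version is a
made-up helper name):

```python
import re

# Matches one line of the form: __version__ = "0.7.1"
VERSION_LINE = re.compile(
    r'^__version__[ \t]*=[ \t]*["\'][^"\']*["\'][ \t]*$', re.MULTILINE
)


def set_version(source_text, new_version):
    """Return source_text with its __version__ assignment set to new_version.

    Raises ValueError unless exactly one __version__ line is found, so
    the release stops instead of silently writing a wrong file.
    """
    new_text, count = VERSION_LINE.subn(
        lambda match: '__version__ = "%s"' % new_version, source_text
    )
    if count != 1:
        raise ValueError(
            "expected exactly one __version__ line, found %d" % count
        )
    return new_text
```

Working on the file contents as a string keeps the step easy to test;
the script would read the file, call set_version, and write the result
back before building the tarballs.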

- One thing that I know cannot be automated, but still needs to be
done is the release notes. If someone has any good ideas on how to
make this smoother, I'd like to hear them. One way would be to make
it a policy that a pull request containing major changes should not be
merged until those changes are documented in the release notes. I'm
doubtful of this, though, because policies are useless unless well
enforced.

Perhaps it might be possible to have a tool that searches through the
git history and tries to heuristically tell what parts are major
changes. Maybe such a tool already exists?

Another idea would be to make it a policy for whoever merges to write
some notes in the little box that comes up when you click the merge
button, which is then put in the merge commit message. This could be
based on the OP of the PR. IPython does this a lot. We could then
gather these messages into the release notes.

Or, perhaps this problem will be solved if we can solve the other
problems. After all, if a release consists of only a few hundred
commits (or say, a few dozen pull requests), it won't be that big of a
deal to comb through them and write down what has changed.
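
The merge-message idea above could be sketched roughly as follows. This
assumes merges are made with the GitHub merge button, so that the
subject line looks like "Merge pull request #N from user/branch" and
any notes written at merge time follow after a blank line. The function
names are made up for illustration.

```python
import re
import subprocess

MERGE_SUBJECT = re.compile(r"^Merge pull request #(\d+) from (\S+)")


def read_merge_messages(since_tag):
    """Return the full messages of all merge commits made after since_tag."""
    output = subprocess.check_output(
        ["git", "log", "--merges", "--format=%B%x00",
         "%s..HEAD" % since_tag],
        universal_newlines=True,
    )
    # Messages are separated by NUL bytes, which cannot occur inside them.
    return [msg.strip() for msg in output.split("\x00") if msg.strip()]


def draft_release_notes(messages):
    """Turn GitHub-style merge messages into draft release-note bullets."""
    lines = []
    for message in messages:
        subject, _, body = message.partition("\n")
        match = MERGE_SUBJECT.match(subject)
        if match is None:
            continue  # not a pull request merge, e.g. a branch sync
        note = body.strip().splitlines()
        summary = note[0] if note else "(no description written at merge time)"
        lines.append("- %s (#%s)" % (summary, match.group(1)))
    return "\n".join(lines)
```

The output is only a draft: someone still has to decide which entries
are major enough to keep, but pull requests merged without any note
show up explicitly instead of being forgotten.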

- Regarding point number 3, I'm really not sure how to approach this
one. There are really very few things that require even push access,
much less any other kind of administrative access to do. And those
things that do, I will **gladly** give to anyone with push access if
they are willing to do it (e.g., admin access to PyPI). My goal with
the release script is that anyone with push access should be able to
run it. I again am open to suggestions on this one. I think the only
real way to fix this is to create a culture in the SymPy community
that anyone can do a release. I am **not** the release manager for
SymPy. I am simply the person who has volunteered to do the release
work.

- I think that one other thing that has held back many releases is the
feeling of "wait, we should put this in the release". The use of a
release branch has helped keep master moving along independently, but
there still seems to be the feeling with many branches of, "this is a
nice feature, it ought to go in the release." My hope is that by
making the release process smoother, we can release more often, and
this feeling will go away, because it won't be a big deal if something
waits until the next release. As far as deprecations go, the real
issue with them is time, not release numbers. So if we deprecate a
feature today vs. one month from today, it's not a big deal (as
opposed to today vs. a year from today), regardless of how many
versions are in between.

I read about what GitHub does for their Windows product regarding
releasing often on their blog:
https://github.com/blog/1271-how-we-ship-github-for-windows (they
actually have this philosophy for all their products). One thing that
they said is, "And by shipping updates so often, there is less anxiety
about getting a particular feature ready for a particular release. If
your pull request isn’t ready to be merged in time for today’s
release, relax. There will be another one soon, so make that code
shine!" I think that is exactly the point here. Another thing that
they noted is that automation is the key to doing this, which is what
I am aiming for with the above point.

- Once we start releasing very often (and believe me, this is way down
the road, but I'm trying to be forward looking here), we can do away
with release candidates. A release candidate lives in the wild for a
week before the full release. But if we are capable of releasing
literally every week, then having release candidates is pointless. If
a bug slips into a release, we just fix it and it will be in the next
release.

So, to summarize, I think the ideal situation would be:

- Everything "test"-wise that needs to be done for a release would
always be working in master, because we would test every pull request
against it. So there would not even be a need to check it before
releasing, other than to just check that it's working in master (and
if we are testing every pull request, that will be clear).

- Making a release would just be a matter of running release.py and
doing what little needs to be done manually (like emailing the list
and compiling the release notes).

- Everyone, if they have push access, is also trusted enough to do a release.

- We should release *at least* once a month. I think that if the
process is automated enough, that this will be very possible (as
opposed to the current situation, where the release branch lasts
longer than a month). In times of high activity, we can release more
often than that (e.g., after a big pull request is merged, we can
release).

I'm open to suggestions, thoughts, concerns, and (most importantly of
all) pull requests starting to implement release.py. If anyone from
another open source community has experience with this sort of thing
(either positive or negative), I'd love to hear it.

Finally, I'd like to keep this thread on the topic of releases in
general, but if I've inspired someone to help out with this release,
just let me know, and I'll tell you what still needs to be done. But
let's keep that discussion over at
https://github.com/sympy/sympy/pull/1507.

Aaron Meurer

Ondřej Čertík

Oct 8, 2012, 2:25:15 PM
to sy...@googlegroups.com
On Sun, Oct 7, 2012 at 8:53 PM, Aaron Meurer <asme...@gmail.com> wrote:
[...]
> - We should release *at least* once a month. I think that if the
> process is automated enough, that this will be very possible (as
> opposed to the current situation, where the release branch lasts
> longer than a month). In times of high activity, we can release more
> often than that (e.g., after a big pull request is merged, we can
> release).

We should definitely automate it. I've had great experience with Vagrant,
here are my scripts to automate the NumPy release:

https://github.com/certik/numpy-vendor

Besides the Linux tgz, it even builds a binary for Windows. The advantage
of Vagrant is that anyone can easily run it, on both Mac and Linux, and
the environment is 100% the same. (Travis CI also uses Vagrant btw.)

Aaron, are you able to run Vagrant on your Mac? Let me know if you
are in favor of that, and I can write the initial release script using Vagrant,
and then we keep improving it (all of us).


-------------

Yes, releasing each week, or each month would be great.

I think we are too worried about each release being "perfect". I wouldn't worry
about #1561. I think we can improve the notebook in the next release.
I think it's more important to get the main code base released and make
sure that all tests work on all platforms. I think that's the only issue and
I think we are pretty good at that.

Ondrej

Aaron Meurer

Oct 8, 2012, 2:58:53 PM
to sy...@googlegroups.com
On Mon, Oct 8, 2012 at 12:25 PM, Ondřej Čertík <ondrej...@gmail.com> wrote:
> On Sun, Oct 7, 2012 at 8:53 PM, Aaron Meurer <asme...@gmail.com> wrote:
> [...]
>> - We should release *at least* once a month. I think that if the
>> process is automated enough, that this will be very possible (as
>> opposed to the current situation, where the release branch lasts
>> longer than a month). In times of high activity, we can release more
>> often than that (e.g., after a big pull request is merged, we can
>> release).
>
> We should definitely automate it. I've had great experience with Vagrant,
> here are my scripts to automate the NumPy release:
>
> https://github.com/certik/numpy-vendor
>
> Besides the Linux tgz, it even builds a binary for Windows. The advantage
> of Vagrant is that anyone can easily run it, on both Mac and Linux, and
> the environment is 100% the same. (Travis CI also uses Vagrant btw.)
>
> Aaron, are you able to run Vagrant on your Mac? Let me know if you
> are in favor of that, and I can write the initial release script using Vagrant,
> and then we keep improving it (all of us).

It seems to work (at least I am able to install it). Is there a simple
way that I can test that it really works?

>
>
> -------------
>
> Yes, releasing each week, or each month would be great.
>
> I think we are too worried about each release being "perfect". I wouldn't worry
> about #1561. I think we can improve the notebook in the next release.

I just noticed that the notebooks are not even included in the tarball
by default anyway. So I think I will do this.

And anyway, I really think we should have *all* examples be notebooks,
and we should be doctesting them, etc.

So I'll just merge Sean's IPython extension branch (assuming it
works), and make a release candidate. I hopefully will do all that
this evening.

> I think it's more important to get the main code base released and make
> sure that all tests work on all platforms. I think that's the only issue and
> I think we are pretty good at that.

It's not the only issue, because as I mentioned, for example, there
are a dozen sites to update after the release, and that takes a bunch
of time too. And there's always the release notes (which actually, I
still need someone to go through and verify that all important changes
from 0.7.2 are included at
https://github.com/sympy/sympy/wiki/Release-Notes-for-0.7.2).

So the only way is to automate everything: tests, deployment, post
deployment, everything.

Aaron Meurer


Ondřej Čertík

Oct 8, 2012, 4:42:09 PM
to sy...@googlegroups.com
On Mon, Oct 8, 2012 at 11:58 AM, Aaron Meurer <asme...@gmail.com> wrote:
> On Mon, Oct 8, 2012 at 12:25 PM, Ondřej Čertík <ondrej...@gmail.com> wrote:
>> On Sun, Oct 7, 2012 at 8:53 PM, Aaron Meurer <asme...@gmail.com> wrote:
>> [...]
>>> - We should release *at least* once a month. I think that if the
>>> process is automated enough, that this will be very possible (as
>>> opposed to the current situation, where the release branch lasts
>>> longer than a month). In times of high activity, we can release more
>>> often than that (e.g., after a big pull request is merged, we can
>>> release).
>>
>> We should definitely automate it. I've had great experience with Vagrant,
>> here are my scripts to automate the NumPy release:
>>
>> https://github.com/certik/numpy-vendor
>>
>> Besides the Linux tgz, it even builds a binary for Windows. The advantage
>> of Vagrant is that anyone can easily run it, on both Mac and Linux, and
>> the environment is 100% the same. (Travis CI also uses Vagrant btw.)
>>
>> Aaron, are you able to run Vagrant on your Mac? Let me know if you
>> are in favor of that, and I can write the initial release script using Vagrant,
>> and then we keep improving it (all of us).
>
> It seems to work (at least I am able to install it). Is there a simple
> way that I can test that it really works?

Yes --- follow the instructions:

https://github.com/certik/numpy-vendor#how-to-use

If it starts doing something, then it works. It takes a few hours to
actually build everything in Wine inside it, so you don't have to wait,
just kill it with ctrl-C.

You will need to have Fabric installed (https://github.com/fabric/fabric).

That is the tool that allows automatic manipulation of remote servers,
in this case Vagrant VM. Later, we can extend our scripts to do some stuff
on our Linode server or some other servers.

>
>>
>>
>> -------------
>>
>> Yes, releasing each week, or each month would be great.
>>
>> I think we are too worried about each release being "perfect". I wouldn't worry
>> about #1561. I think we can improve the notebook in the next release.
>
> I just noticed that the notebooks are not even included in the tarball
> by default anyway. So I think I will do this.
>
> And anyway, I really think we should have *all* examples be notebooks,
> and we should be doctesting them, etc.
>
> So I'll just merge Sean's IPython extension branch (assuming it
> works), and make a release candidate. I hopefully will do all that
> this evening.
>
>> I think it's more important to get the main code base released and make
>> sure that all tests work on all platforms. I think that's the only issue and
>> I think we are pretty good at that.
>
> It's not the only issue, because as I mentioned, for example, there
> are a dozen sites to update after the release, and that takes a bunch
> of time too. And there's always the release notes (which actually, I

I can help with the sites. But I think we don't need to update them.
I would just push in the git tag, put tarballs at google code and update pypi.
This can be done by hand.

Ondrej

Matthew Brett

Oct 8, 2012, 4:47:27 PM
to sy...@googlegroups.com
Hi,
I've got windows machines set up as buildbots and am hand-triggering
bdist builds on a windows xp and windows 7 machine which upload to a
web-accessible directory. That could be automated by watching for a
release-like git tag I suppose.
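
A small sketch of what "release-like git tag" detection could look
like. The tag naming used here ("sympy-0.7.2", with an optional ".rc1"
suffix) is an assumption made for illustration, as is the function
name; the pattern would need adjusting to whatever convention the
project actually uses.

```python
import re

# Assumed convention: sympy-MAJOR.MINOR.MICRO with an optional .rcN suffix.
RELEASE_TAG = re.compile(r"^sympy-(\d+)\.(\d+)\.(\d+)(?:\.rc(\d+))?$")


def parse_release_tag(tag):
    """Return (major, minor, micro, rc) for a release-like tag, else None.

    rc is None for a final release and an integer for a release candidate.
    """
    match = RELEASE_TAG.match(tag)
    if match is None:
        return None
    major, minor, micro, rc = match.groups()
    return (int(major), int(minor), int(micro),
            int(rc) if rc is not None else None)
```

A buildbot watching the repository would call this on each new tag and
trigger the bdist builds only when the result is not None.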

Of course y'all would be welcome to use these,

Cheers,

Matthew

Aaron Meurer

Oct 8, 2012, 7:59:47 PM
to sy...@googlegroups.com
Awesome, thanks! It is possible to create a 32-bit Windows installer
in Linux/Mac OS X, but it seems that it's only possible to create a
64-bit installer in Windows (see
http://code.google.com/p/sympy/issues/detail?id=1235).

Aaron Meurer

Aaron Meurer

Oct 9, 2012, 12:36:17 AM
to sy...@googlegroups.com
I went ahead and did the whole process, and it worked just fine.

Aaron Meurer

>
>>
>>>
>>>
>>> -------------
>>>
>>> Yes, releasing each week, or each month would be great.
>>>
>>> I think we are too worried about each release being "perfect". I wouldn't worry
>>> about #1561. I think we can improve the notebook in the next release.
>>
>> I just noticed that the notebooks are not even included in the tarball
>> by default anyway. So I think I will do this.
>>
>> And anyway, I really think we should have *all* examples be notebooks,
>> and we should be doctesting them, etc.
>>
>> So I'll just merge Sean's IPython extension branch (assuming it
>> works), and make a release candidate. I hopefully will do all that
>> this evening.
>>
>>> I think it's more important to get the main code base released and make
>>> sure that all tests work on all platforms. I think that's the only issue and
>>> I think we are pretty good at that.
>>
>> It's not the only issue, because as I mentioned, for example, there
>> are a dozen sites to update after the release, and that takes a bunch
>> of time too. And there's always the release notes (which actually, I
>
> I can help with the sites. But I think we don't need to update them.
> I would just push in the git tag, put tarballs at google code and update pypi.
> This can be done by hand.

Well, half of the sites are our own (homepage, SymPy Live/Gamma, blog,
etc.). So we definitely should update those :)

The rest, like Wikipedia or Freshmeat, I guess are not as important.
It helps for marketing purposes, but that's it. As for trying to get
it into the linux packaging repos, I'm not even going to bother. That
really should be someone else's job (if you want, we can add tasks for
it for GCI).

Ondřej Čertík

Oct 9, 2012, 9:55:50 AM
to sy...@googlegroups.com
Cool, thanks for trying. Especially the Wine thing is highly
nontrivial; it took me days to figure out how to set it up. But once
it's done, it works like a charm in Vagrant.

So we can use this approach for our automation.


> Well, half of the sites are our own (homepage, SymPy Live/Gamma, blog,
> etc.). So we definitely should update those :)
>
> The rest, like Wikipedia or Freshmeat, I guess are not as important.
> It helps for marketing purposes, but that's it. As for trying to get
> it into the linux packaging repos, I'm not even going to bother. That
> really should be someone else's job (if you want, we can add tasks for
> it for GCI).

Yes. Other people can help with updating pages, even our own.
E.g. I can update our pages.

Ondrej

Joachim Durchholz

Oct 11, 2012, 6:14:45 PM
to sy...@googlegroups.com
+1, just a few quick notes:

On 08.10.2012 05:53, Aaron Meurer wrote:
> If anyone is interested, the current (and hopefully final)
> blocker is https://github.com/sympy/sympy/pull/1561 (and I guess also
> https://github.com/sympy/sympy/pull/1559).

That might be part of the problem: I have something different on my
agenda right now, and I guess it's similar for many others.

It's probably a good idea to give release blockers high visibility.
Say, a web page that lists blockers (could be a redirect to the proper
search results).
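
As a sketch of the "redirect to the proper search results" idea: the
page only needs a stable URL that runs a saved search. The example
below assumes a GitHub-style issue search (the `is:open label:"..."`
query syntax) and takes the tracker URL and the label name as
parameters, since both are placeholders here and not the project's
actual setup.

```python
from urllib.parse import urlencode


def blocker_search_url(tracker_url, label):
    """Build a search URL listing open issues that carry the given label."""
    query = 'is:open label:"%s"' % label
    return "%s?%s" % (tracker_url, urlencode({"q": query}))
```

The blockers page would then be a one-line redirect to the returned
URL, so the list is always current without anyone maintaining it.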

That way, people looking for "what's next" might get the blockers on
their shortlist.

> So in case I didn't make my point, let me state it explicitly: we need
> to release more often. We need to release *way* more often.
>[...]
> - I think that one other thing that has held back many releases is the
> feeling of "wait, we should put this in the release". The use of a
> release branch has helped keep master moving along independently, but
> there still seems to be the feeling with many branches of, "this is a
> nice feature, it ought to go in the release." My hope is that by
> making the release process smoother, we can release more often, and
> this feeling will go away, because it won't be a big deal if something
> waits until the next release. As far as deprecations go, the real
> issue with them is time, not release numbers. So if we deprecate a
> feature today vs. one month from today, it's not a big deal (as
> opposed to today vs. a year from today), regardless of how many
> versions are in between.

+10.

> 2. There are a dozen things that need to be done when we release.
> They are all outlined at
> https://github.com/sympy/sympy/wiki/new-release. If you go through
> that page, you will see that there really are a ton of things to do.
> And some of them are even more work than they look. For example, at
> the end, there are 12 different independent sites that need to be
> updated.
> [...]
> - We have to automate the release process. I'm referring specifically
> to the stuff on https://github.com/sympy/sympy/wiki/new-release. We
> should have some script, say bin/release.py, and to do a release, all
> we should have to do is run that script. And literally everything
> should be in that script. If there is some step in the release
> process that is not automatable, we should either attempt to make it
> automatable, or ask if it is really necessary to be done before each
> release.

Third option: Make it part of the requirements before a pull request can
go in.

Example in point:

> One thing that I already know will be an issue is the example
> notebooks (see https://github.com/ipython/ipython/issues/1195). Also,
> we will either need to automate the process of updating all the
> various sites and packaging systems, or make concessions that someone
> else will have to be responsible for doing so.

Here's how to make it part of the pull request:
1) Unify everything that needs to be updated into a single directory
tree, so people can make all the relevant changes part of the same pull
request and nothing gets forgotten. Here, the notebooks would become
part of that SymPy Release Tree.
2) If a notebook exercises a specific piece of Python code, the Python
code should mention that in a comment. Even better, write unit tests to
verify that the notebooks actually work.
3) Make the release script regenerate the notebooks from the SymPy
Release Tree.

I do not think that everything in the release process can be automated
or removed, but I think the inevitable manual intervention can be moved
into the pull request reviewing process.

> - One thing that I know cannot be automated, but still needs to be
> done is the release notes. If someone has any good ideas on how to
> make this smoother, I'd like to hear them. One way would be to make
> it a policy that a pull request containing major changes should not be
> merged until those changes are documented in the release notes. I'm
> doubtful of this, though, because policies are useless unless well
> enforced.

Enforce it then as a part of the pull request policy.

The more important aspect: State that part of the policy.
Describe the line between changes that need and need not be described.

> Perhaps it might be possible to have a tool that searches through the
> git history and tries to heuristically tell what parts are major
> changes. Maybe such a tool already exists?

Such a tool would have to emit the same output, whether the same set of
changes is pulled as a single request or split into a dozen small pull
requests.
I doubt that such a tool is even feasible.

> Or, perhaps this problem will be solved if we can solve the other
> problems. After all, if a release consists of only a few hundred
> commits (or say, a few dozen pull requests), it won't be that big of a
> deal to comb through them and write down what has changed.

Quite possible.
It's certainly not a decision that needs to be made early, since no
other decisions depend on it.

> - Everyone, if they have push access, is also trusted enough to do a release.

I'd wait with that until the release process is indeed fully automated.

I'm not sure if we really want the possibility of a daily release. Well,
maybe we want the possibility, but we'll probably have nontechnical
issues to consider (such as not annoying users with daily updates vs.
fixing really important bugs). And we want these decisions to be
vaguely consistent, so it's going to be a single person who decides when
to release, even if the actual release is no big thing.

I'm not 100% sure whether I'm spot on with this, though.