Gerrit is used for Android and Eclipse, for example.
Thanks,
Jason
[1] http://code.google.com/p/gerrit/
Talk on gerrit: http://vimeo.com/23609339
Short Tutorial on gerrit/jenkins: http://vimeo.com/20084957
An old post about gerrit:
http://unethicalblogger.com/2009/12/07/code-review-with-gerrit-a-mostly-visual-guide.html
After looking into it a bit, I've found some things that I really don't
like, for example the following admonition from `a wiki page about
gerrit workflows`_::
> If the master repository has changed since you started, you should
> rebase your changes to the current state. And if you have made many
> small commits, you should squash them so that they do not show up in
> the public repository. Remember: each commit will become a change in
> Gerrit, and need to be approved separately. If you are making one
> "change" to the project, squash your many "checkpoint" commits into
> one commit for public consumption.
This strikes me as very bad. Rebasing, OK, though personally I think
merging is usually a better strategy, especially since it preserves
history more nicely. Squashing commits is definitely a bad idea, though,
as it makes future rebasing harder when you have large single commits
instead of many small commits, which is in fact one of the problems with
our current patch-based system.
.. _a wiki page about gerrit workflows:
http://wiki.openstack.org/GerritWorkflow#Long-lived_Topic_Branches
-Keshav
----
Join us in #sagemath on irc.freenode.net !
Reviewed changes should be very small so that they are easy to verify and review. Also, most changes are bugfixes or minor feature adds, so it should not create large, hard to merge commits.
If you have such a large change, you should probably be splitting into multiple reviews anyway.
I know the authors of gerrit, and would be willing to ask them a few questions if people had specific concerns.
--Christopher
--
To post to this group, send an email to sage-...@googlegroups.com
To unsubscribe from this group, send an email to sage-devel+...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/sage-devel
URL: http://www.sagemath.org
However, if I have a "topic" I'd like to commit, and all the commits
really should be reviewed as a package together, or if I'm collaborating
with others on a branch, then it is incredibly convenient to look at
patchsets at a topic-branch level (ala github) rather than the
individual commit level, like gerrit seems to be insisting. It seems
that the only way to get a topic-level view of things in gerrit is to
insist that you squash all the commits into one commit, which is usually
bad, in my opinion.
It is nice that gerrit preserves the history of each individual commit
that is reviewed. Can this history be pushed or mirrored somewhere else
(e.g., can I clone the gerrit repository and get these fake
ref/for/master branches?)
Jason
Cool! Thanks!
I'll have to think more about whether I agree with the "each commit
should stand on its own" philosophy mentioned below. It certainly makes
bisection easier. But sometimes huge patches make me think that each
*branch* should stand on its own, and be merged with no fast-forward.
Then it's much easier to tease apart logically separate, but
interrelated, sets of changes. Maybe all that is going on here is
miscommunication about what "stand on its own" means. Does it mean that
the software will run after each commit? Or does it mean that each
commit represents a single logical change, though it may temporarily put
the software into an unreliable or error-ridden state?
Jason
Jason
To unsubscribe from this group, send an email to sage-devel+unsubscribe@googlegroups.com
Thanks. That sheds light on the subject.
Jason
It gets a bit fuzzier when there are multiple authors on a single
feature, spread out through time. I think this is more common for a
project like Sage (lots of very-part-time contributors). It can also
be handy to spread out history into smaller or logical chunks (e.g.
big automatic renames/file moves, and then manual fixups, where the
former might not give a consistant state but is easy to review because
it's just the result of a sed script). I do, however, like the model
of iteratively working on a feature, including incorporating peer
feedback and fixes, adn then squashing the history down into a single
coherent commit at the end. Perhaps we could require that every state
of the main line be consistant, and if it's worthwhile to preserve
inconsistent history we merge it as a branch rather than rebasing.
- Robert
That seems very reasonable to me.
Thanks,
Jason
Many thanks for this super useful discussion! From my own experience
with github's branch-based review model, I lean strongly towards the
view you summarize above. I prefer to let authors make very liberal
use of the commit abilities in git and review with the branch as the
'atomic unit' I worry about, giving only secondary consideration to
individual commits.
Obviously if the commit history is a complete mess we may require that
it gets rebased and cleaned up, but in general we tend to be
relatively lax with how the individual commits stack up, as long as
the entire branch diff is sensible and easy to analyze on its own.
From our own experience in ipython with github and how unbelievably
fluid this has made the process both for new contributors and for
reviewers, I think I really prefer this over the view that gerrit
seems to propose. But it's been very enlightening to read their
rationale, and I certainly see it as a coherent perspective one can
choose to adopt.
Cheers,
f
f
Yes. A pull request is just a remote branch. Anyone with commit access
to the repository can merge the remote branch and push to the github
repository.
>
> Also, in my vastly finite experience with github, I've had problems with
> the fact that pull requests seem to be immutable: once a request has
> been created, it doesn't seem to be easy to add more commits to it, or
> change the commits. (This may just be my na�vety.)
Again, a pull request is just a remote branch. If I have submitted a
pull request, that just means that I've submitted a certain branch for
merging. If I push more commits to that branch, they automatically show
up in the pull request. In fact, I can rebase the branch and push -f
the branch and the new commits will overwrite whatever was there.
One thing I like about github is that it is so close to the underlying
git philosophy, so that once you understand git, it's easy to understand
github.
> But other than, I think github does have a great model. Is github on the
> table for a future platform to move to?
Yes. We've already moved the Sage notebook and the Sage Cell server to
github.
>
> I think gerrit is definitely a heavy-handed approach to reviews that
> seems to work better with larger products (like Android), since you do
> sort of have to adapt to its style of reviews. And I don't know if it
> is the right tool for Sage (not sure if anything is), but it is
> certainly worth looking at.
>
Thanks again for your help with understanding gerrit better!
Jason
Yes. Typically one or more core committers participate in the
discussion, and it's easy to informally choose who does the final
merge. But if you want to be more formal about it, there's an
assignee field and a project can choose to use it explicitly and
require that person to be the one who does the actual final merge.
> Also, in my vastly finite experience with github, I've had problems with the
> fact that pull requests seem to be immutable: once a request has been
> created, it doesn't seem to be easy to add more commits to it, or change the
> commits. (This may just be my naïvety.)
No, you can keep pushing to the branch you created the PR from, and
new commits show up as they are made. You can even rebase and force
push, and the PR will get rebased too. We do the first all the time,
and the second also, though less frequently.
Cheers,
f
Interesting. Where is this assignee field? Is it the assignee field in
the Issues? Does setting that require that the committer be the assignee?
Thanks,
Jason
I think we'd still keep our code review process, and the release
manager would to the actual merge, but it would make submitting,
pulling, revewing, collaborating on, and merging patches much easier.
>> Also, in my vastly finite experience with github, I've had problems with the
>> fact that pull requests seem to be immutable: once a request has been
>> created, it doesn't seem to be easy to add more commits to it, or change the
>> commits. (This may just be my naïvety.)
>
> No, you can keep pushing to the branch you created the PR from, and
> new commits show up as they are made. You can even rebase and force
> push, and the PR will get rebased too. We do the first all the time,
> and the second also, though less frequently.
What happens to (inline) comments in this case?
I think they get preserved as static snippets in the discussion page,
but I'm not 100% sure. I know we have PRs that have gone through
this, but I wouldn't know how to find one right now to verify; if
you're really curious we could do a quick experiment with a fake PR to
test what happens, it would not take long.
Cheers,
f
It's in the gray bar below the title for the PR. Anyone in the team
who owns the target repo can be assignee, it's a drop-list that shows
all the committers to the target repo.
Cheers,
f
+1. IMO, merging into master is a more "correct" way of squashing
multiple commits into one commit (the merge commit, from the point of
view of the main trunk), since you still have the original history.
Merging into the main line (the master branch) would be equivalent to
what right now we call getting your patch merged.
Though for a single commit, a rebase is cleaner IMHO than a merge, and
squashing commits improves the signal-to-noise ratio of the history
(and it's nice to be able to squash away stuff like typos).
The problem with rebasing is that unless you do it manually, the actual
commits (i.e. snapshots of the codebase) do not necessarily represent code that
anyone has actually looked at. Just because a merge or rebase is
possible to resolve automatically doesn't mean that it the result is
what you intended. With merging, you isolate this badness in the merge
commit, and can commit on top of your merge to fix any incompatibilities
between your original code and the current state of the other branch:
X --------------- Y --- Z
\ \
A --- B --- C --- D --- M --- E
Changes made in A through D do not directly conflict with changes made
in Y and Z, but the changes in Y and Z make the code added in A through
D perform differently from expected. The change in E fixes this. Every
commit here except M represents a state of the codebase that was
considered coherent and correct by whoever authored the commit.
Now in the case of rebasing, you get this:
X --- Y --- Z
\ \
\ A'--- B'--- C'--- D' --- E
\
A - - B - - C - - D <- - (these will disappear)
As before, changes made in A through D do not directly conflict with
changes made in Y and Z, so the user rebases A through D to A' through
D' automatically. But now, none of A' through D' represent states of the
codebase that were considered coherent and correct by the people who
made A through D respectively, even though A' through D' bear the names
of those people in their headers.
Furthermore, once any problems in the interaction between Y or Z and A
through D become apparent, they are recorded in E, but these changes are
already several nodes away from Z, and so appear out of their natural
context.
Worse, A through D will either hang around in other people's
repositories, confusing matters, or, if you were wise and didn't share
commits that you later rebased, they have disappeared and, once garbage
collected, are not recoverable.
I think the above is still true when A through D is actually A through A
(i.e. a single commit).
> and
> squashing commits improves the signal-to-noise ratio of the history
> (and it's nice to be able to squash away stuff like typos).
You can `git commit --amend` away stuff like typos, but squashing is bad
for various reasons. The obvious ones are preserving history, observing
developer intent in context, logical breakdown of large code changes,
etc., but another important one is that merges can be done automatically
more easily if commits are small.
If we follow IPython's lead and never commit directly to the master
branch, only merging into it, then `git log --merges` basically gives
you your squashed commits without actually destroying the originals.
That was my original position, but I recently changed my mind after
discussions with Min Ragan-Kelley and Aaron Smeurer (sympy lead). The
main reason (in addition to Keshav's valid arguments) was to have the
merge commit serve as an indicator of the fact that there was review,
and who did the review, straight into the log. Seen in that way, the
merge commit is signal, not noise.
More details behind my reasoning here:
http://mail.scipy.org/pipermail/ipython-dev/2012-January/008630.html
We've made the change and we're all quite happy with the new approach.
Cheers,
f