Luckily, there are alternatives, specifically local git clients and
gitweb. However, these don't work when git's change model is broken by
the use of git commit --amend.
For commits with a small number of files, such changes are reviewable
by the use of the "patch history" table in the diff views. But when
there are a large number of files, it becomes difficult to find the
files which have changed, and if there are a lot of changed files, to
produce a compact combined diff.
So if there are no objections, I'm going to change [[Git/Workflow]] to
restrict the recommended applications of "git commit --amend", and to
recommend plain "git commit" as an alternative. A plain commit seems
to work just fine. It gives you a separate commit to analyse with
Gerrit, gitweb and client-side tools, and it provides a link to the
original change in the "dependencies" section of the change page.
-- Tim Starling
_______________________________________________
Wikitech-l mailing list
Wikit...@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
It sounds similar to what i said in the thread "consecutive commits in
Gerrit", so i probably support it, but i don't completely understand
how will it work with the `git review' command, which doesn't like
multiple commits. If the documentation will explain how to use `git
review' with follow up commits, it will be fine.
--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
“We're living in pieces,
I want to live in peace.” – T. Moore
- Using it to fix a typo or minor error in a commit = awesome.
- Using it to pile up tons of changes across tons of files = not awesome.
The former makes review EASIER, the latter makes review HARDER.
- Trevor
$ git fetch https://gerrit.wikimedia.org/r/p/analytics/udp-filters
refs/changes/22/3222/3 && git branch foo FETCH_HEAD
$ git fetch https://gerrit.wikimedia.org/r/p/analytics/udp-filters
refs/changes/22/3222/4 && git branch bar FETCH_HEAD
$ git diff foo..bar
The two 'git fetch' commands (or at least the part before the &&) can
be taken from the change page in Gerrit.
> For commits with a small number of files, such changes are reviewable
> by the use of the "patch history" table in the diff views. But when
> there are a large number of files, it becomes difficult to find the
> files which have changed, and if there are a lot of changed files, to
> produce a compact combined diff.
>
> So if there are no objections, I'm going to change [[Git/Workflow]] to
> restrict the recommended applications of "git commit --amend", and to
> recommend plain "git commit" as an alternative. A plain commit seems
> to work just fine. It gives you a separate commit to analyse with
> Gerrit, gitweb and client-side tools, and it provides a link to the
> original change in the "dependencies" section of the change page.
>
I have mixed feelings towards this. One time at the office, over
lunch, I was bitching about how Gerrit used amends instead of real
branches, and after discussing this for 15 minutes we felt like we'd
just reverse-engineered the Gerrit developers' decision process
(RobLa: "I think we just figured out why the Gerrit people did it this
way.")
There are several advantages to using Gerrit the way it's meant to be
used (with amends rather than followup commits):
* Review comments of one logical change are centralized, rather than
being scattered across multiple changes (this is what I said I'd do
differently if I was writing a code review system now)
* Conflicts must be resolved by rebasing the conflicted topic branch
onto the HEAD of master, so there aren't any merge commits containing
conflict resolutions unless you deliberately create them (and someone
else approves them). I imagine I'd be quite annoyed/confused if I was
using git bisect to track down a bug and it blamed a complex merge
commit that had conflicts
* Every commit that ends up being merged into master is "clean", in
that it's been approved and passes the tests. This is the entire point
of continuous integration / pre-commit review / gated trunk / etc.,
and it's a major advantage because:
** you can rewind master to any point in history and it'll be in a sane state
** you can start a branch (including a deployment branch) off any
point in master and be confident that that's a reasonably sane branch
point
** if you find subtle breakage later, you can use git bisect on master
to find out where it broke, and there will not be any botched
intermediate states confusing bisect (e.g. if there's a commit
somewhere that breaks half the code and you merge it in together with
a followup commit that fixes it, bisect might find that commit and
wrongly blame it for the breakage you're looking for; this probability
increases as merged-in feature branches get longer)
Of course you can't blindly place absolute trust in any random commit
on master, but at least you know that it 1) has been reviewed as a
unit and approved, and once we have better Jenkins integration you'll
know that 2) it passed the lint checks and 3) it passed the tests as
they existed at the time. Approving commits that are broken because
you know they were fixed later in some follow-up commit somewhere
violates this entire model, and it exposes you to the danger of
merging the broken commit but not the follow-up, if the follow-up is
held up for some reason. Fortunately, once we have proper Jenkins
integration, this will not be possible because Jenkins will mark the
broken commit as having failed the tests, and you will not be able to
approve and merge it without manually overriding Jenkins's V:-1
review.
An unrelated disadvantage of "stacked changes" in Gerrit (i.e. changes
that depend on other unmerged changes) is that if B.1 (change B
patchset 1) depends on A.1, and someone amends A.1 later to produce
A.2, B.1 will still depend on A.1 and will need to be rebased to
depend on A.2 instead. I've done this before with a stack of four
commits (amended B and had to rebase C and D), and I can tell you it's
not fun. I think I've figured out a trick for it now (use an
interactive rebase from the top of the stack), but it's not intuitive
and I've never tried it.
That said, if you have multiple commits that are related/dependent but
both represent valid, sane, non-broken states (e.g. you introduce a
function, then use it somewhere), then by all means chain them
together. I'll +1 Trevor there, amending is probably not a good idea
if you're adding lots of changes. And if all else fails, you can just
abandon the intermediate changes, squash them together into one
omnibus change, and submit that for review instead.
As for Amir's question about git-review: git-review warns you against
stacked changes, because it's considered an anti-pattern in Gerrit.
However, the warning ends with a question like "Are you sure this is
what you meant to do?", and if you type "yes" it'll continue. You can
also suppress this warning with the -y flag.
Roan
I've done a few test commits, it will work. The procedure is to copy
the download command out of the Gerrit change page and paste it into a
shell, same as amending. Git gives you some output which includes:
: If you want to create a new branch to retain commits you
: create, you may do so (now or later) by using -b with the
: checkout command again. Example:
: git checkout -b new_branch_name
You follow its instructions, creating a branch:
: git checkout -b my_test_commit
Then edit the files, then instead of amend, just a regular
: git commit -a
: git review
It complains that you're committing two things:
: You have more than one commit that you are about to submit.
: The outstanding commits are:
:
: 6e4f490 (HEAD, test2) test 2
: 634b5d7 (junk-2) Test commit 1
:
: Is this really what you meant to do?
: Type 'yes' to confirm:
You answer "yes", then it's done. Here's what the result looks like:
https://gerrit.wikimedia.org/r/#change,3794
git review --no-rebase may be necessary to reduce noise on the parent
change pages, I haven't tested that yet.
It can be done with slightly fewer steps if you make the steps more
complicated.
-- Tim Starling
It doesn't work, I'm afraid. Because of the implicit rebase on push,
usually subsequent changesets have a different parent. So when you
diff between the two branches, you get all of the intervening commits
which were merged to the master.
Examples from today:
https://gerrit.wikimedia.org/r/#change,3367
Patchsets 1 and 2 have different parents.
https://gerrit.wikimedia.org/r/#change,3363
Patchsets 1, 2 and 3 have different parents.
It's possible to get a diff between them, and I did, but it's tedious.
I figure we should pick a workflow that doesn't waste the reviewer's
time quite so much.
> I have mixed feelings towards this. One time at the office, over
> lunch, I was bitching about how Gerrit used amends instead of real
> branches, and after discussing this for 15 minutes we felt like we'd
> just reverse-engineered the Gerrit developers' decision process
> (RobLa: "I think we just figured out why the Gerrit people did it this
> way.")
I could not find a recommendation to use --amend in the Gerrit manual.
The manual says that it is possible to submit subsequent commits to
the same change by just adding a Change-Id footer to the commit
message. That seems to be the reason for the copy button next to the
Change-Id on the change page.
When I tried to do a test push containing several commits which
depended on each other, but with the same Change-Id, Gerrit rejected
it. But I'm not sure if that's a configuration issue or if it just
expected them to be done in separate pushes.
If it was possible to follow the Gerrit manual in this way, and submit
non-amending followup commits with the same Change-Id, then we'd have
the comment grouping advantages without the code review disadvantages.
> * Every commit that ends up being merged into master is "clean", in
> that it's been approved and passes the tests. This is the entire point
> of continuous integration / pre-commit review / gated trunk / etc.,
> and it's a major advantage because:
> ** you can rewind master to any point in history and it'll be in a sane state
> ** you can start a branch (including a deployment branch) off any
> point in master and be confident that that's a reasonably sane branch
> point
> ** if you find subtle breakage later, you can use git bisect on master
> to find out where it broke, and there will not be any botched
> intermediate states confusing bisect (e.g. if there's a commit
> somewhere that breaks half the code and you merge it in together with
> a followup commit that fixes it, bisect might find that commit and
> wrongly blame it for the breakage you're looking for; this probability
> increases as merged-in feature branches get longer)
I think significantly increasing review time and breaking authorship
and history information would be a high price to pay for these advantages.
Sure, you can bisect, but when you find the commit and discover that
there were 5 people amending it, how do you work out who was responsible?
I don't think bisect is such a useful feature that it's worth throwing
away every other feature in git just to get it.
> you will not be able to
> approve and merge it without manually overriding Jenkins's V:-1
> review.
It seems like that will be a lot easier and a lot more rare than
finding a diff between amended commits with different parents.
> An unrelated disadvantage of "stacked changes" in Gerrit (i.e. changes
> that depend on other unmerged changes) is that if B.1 (change B
> patchset 1) depends on A.1, and someone amends A.1 later to produce
> A.2, B.1 will still depend on A.1 and will need to be rebased to
> depend on A.2 instead. I've done this before with a stack of four
> commits (amended B and had to rebase C and D), and I can tell you it's
> not fun. I think I've figured out a trick for it now (use an
> interactive rebase from the top of the stack), but it's not intuitive
> and I've never tried it.
Tell people to not amend their commits then.
-- Tim Starling
Yes, that works. And actually, I think it's a little bit easier. It's a
little more "manual" that trying git one-liners, but it seems to at
least *always* work (and I've used it a couple of times to fix
changes with totally botched history).
>> An unrelated disadvantage of "stacked changes" in Gerrit (i.e. changes
>> that depend on other unmerged changes) is that if B.1 (change B
>> patchset 1) depends on A.1, and someone amends A.1 later to produce
>> A.2, B.1 will still depend on A.1 and will need to be rebased to
>> depend on A.2 instead. I've done this before with a stack of four
>> commits (amended B and had to rebase C and D), and I can tell you it's
>> not fun. I think I've figured out a trick for it now (use an
>> interactive rebase from the top of the stack), but it's not intuitive
>> and I've never tried it.
>
> Tell people to not amend their commits then.
>
Also making sure people don't have unrelated commits showing up as
dependencies makes the process much easier. If two things aren't
related--don't relate them!
-Chad
I was hoping that someone was going to say "you're wrong, making those
diffs is easy, here's how." But I take it by the silence that I'm not
wrong, and it really is hard.
Also, I'm concluding based on Roan's objections that I'm going to have
a hard time convincing people to stop amending their commits. So I
wrote this script that provides changeset diffs for reviewers:
http://tstarling.com/gerrit-differ.php
It fetches both commits from a local mirror into a temporary branch,
rebases the branches to a common ancestor, and then diffs them.
I just tried to push second commit to https://gerrit.wikimedia.org/r/#change,3841
patchet three.
If you don't start from "scratch" i.e. base your commit on the parent:
8824515e571eadd4a63b09e1331f35309315603f
(now I have
$ git log HEAD ^HEAD^^^
commit e67af5bbd843db3062cc0082254b69aae3d1241b
Author: saper <sa...@saper.info>
Date: Wed Mar 28 22:06:17 2012 +0200
An example how a foreign key should be added to the table
Change-Id: I0da5b25f4b4499facac6c410fa7ab74250935288
commit 96692fb23c00cb726144290b108623896cf24834
Author: Marc A. Pelletier <ma...@uberbox.org>
Date: Tue Mar 27 22:44:32 2012 -0400
(bug 5445) remove autoblocks when user is unblocked
(...comment truncated...)
Change-Id: I4aa820ae9bbd962a12d0b48b6c638a1b6ff4efc9
This is the current HEAD:
commit 8824515e571eadd4a63b09e1331f35309315603f
Author: Santhosh Thottingal <santhosh....@gmail.com>
Date: Wed Mar 28 11:25:45 2012 +0530
Trying to commit e67af5bbd843db3062cc0082254b69aae3d1241b
makes gerrit say:
! [remote rejected] HEAD -> refs/changes/3841 (squash commits first)
It does not matter if I use the same change ID or not. It knows
exactly where it should go but it still refuses it.
I have managed to workaround this by creating a branch, doing
lots of commits there, merging it, and push the merge to gerrit.
But then it uploads lots of unrelated changets:
https://gerrit.wikimedia.org/r/#change,3706
https://gerrit.wikimedia.org/r/#change,3707
https://gerrit.wikimedia.org/r/#change,3708 (but this was outside of the branch)
https://gerrit.wikimedia.org/r/#change,3709
The commit tree looked like:
private branch: 3706 --- 3707 ---
/ \
62562768cf8f2696 + -------- 3708 ----+ 3709 (merge)
As you can see, although there were so many changesets,
they all have dependencies set properly.
Is this a better way? I don't know...
I wonder why in this case gerrit does not complain
with its usual (squash commits first)
Having private branches with other people
would certainly help to work together on issues.
I tried to submit an improvement to
https://gerrit.wikimedia.org/r/#change,3841 and
it seems I can't do this the other way than
rebasing my changes to the parent of the changeset
(*not* master). Not sure how to make a branch
out of it (maybe I should merge it with the parent
commit?)
//Saper
There is an interesting failure mode of the whole plan :)
As you noticed on IRC comparing patchset 4 and 5 makes no sense.
Patchset 5 actually does not change anything except the commit
message relative to the patchset 4, but that's not the issue.
Side note:
I have a small tool (https://github.com/saper/gerrit-fetch-all)
to fetch all changesets from gerrit and branch the as
"change/3841/4". Then you can diff your changesets locally,
in the git repository.
Here's why: patchsets 1 till 4 are based off revision
$ git log --pretty=oneline change/3841/4 ^change/3841/4^^
fcc05dee9b93e080b13fc4b0d5b83a1c75d34362 (bug 5445) remove autoblocks when user is unblocked
8824515e571eadd4a63b09e1331f35309315603f Fix Bug 30681 - Wrong escaping for inexistent messages.
While patchset 5 has been rebased to a newer changeset
from master:
$ git log --pretty=oneline change/3841/5 ^change/3841/5^^
4f2ff743ff1bc93d922ab9b5b3135786df5c7b69 (bug 5445) remove autoblocks when user is unblocked
571e63cd2c2bac9a033e1816f5ad8b6a14b4f42b Merge "Use local context to get messages"
95c35e52113b9a98accc1e9b0e9fffc15b1661a8 Use local context to get messages
$ git branch -vv |grep change/3841/
change/3841/1 89daac5 Remove autoblocks when original block goes (Bug #5445)
change/3841/2 b9090b3 (bug 5445) remove autoblocks when user is unblocked
change/3841/3 96692fb (bug 5445) remove autoblocks when user is unblocked
change/3841/4 fcc05de (bug 5445) remove autoblocks when user is unblocked
* change/3841/5 4f2ff74 (bug 5445) remove autoblocks when user is unblocked
So here's how patchsets 4 and 5 differ according to git:
$ git log --pretty=oneline change/3841/4...change/3841/5
* 4f2ff743ff1bc93d922ab9b5b3135786df5c7b69 (bug 5445) remove autoblocks when user is unblocked
* 571e63cd2c2bac9a033e1816f5ad8b6a14b4f42b Merge "Use local context to get messages"
|\
| * 95c35e52113b9a98accc1e9b0e9fffc15b1661a8 Use local context to get messages
* | 681a170f290ca0a7b0d771155ddc59f091a5576d Merge "Add phpunit testcases for Bug 30681"
|\ \
| * | b91ffd7b09b445224cdef27a3a40bc9ded1fb8c7 Add phpunit testcases for Bug 30681
| /
* | dde3821ac130486a24a7f7a97eaf0eb6d67e55d2 (bug 35541) ns gender aliases for Croatian (hr)
|/
* cc2f70df0d106f84877591113d3973214bcfd36a gitignore mwsql script history file
* fcc05dee9b93e080b13fc4b0d5b83a1c75d34362 (bug 5445) remove autoblocks when user is unblocked
The common revision for both changesets is the next one:
$ git merge-base change/3841/4 change/3841/5
8824515e571eadd4a63b09e1331f35309315603f
(this is the parent of change/3841/1..4)
So it is clear that diff between them will include all the changes merged to
master in between.
My git-fu is limited, so I don't know how to compare such revisions.
I think we generally run into git architectural assumption
- that git is meant to store trees of files and "diffs"
or "changes" are not an object in the git world at all.
So I think that rebasing the patchset to the current master is a bad idea
for review. However, this increases likelihood of merge conflict
once the change is approved. Maybe we should have a workflow like this:
patchset 1 - proposed change
patchset 1 - review, negative
patchset 2 - updated change
patchset 2 - review, negative
patchset 3 - updated change
patchset 3 - review, possitve - change approved
patchset 4 - patchset 3 rebased to the master branch
patchset 4 - merged, closed (if ok)
Would that work in practice?
//Saper
You could have created a gist to host that shell script :-D
Anyway, for historical tracking, the way to fetch all changes is:
git fetch gerrit refs/changes/*:refs/changes/*
Then to diff patchset 1 and 2 of change 2476:
git diff changes/76/2476/{1,2}
Marcin, maybe your feature could make it in git-review ?
--
Antoine "hashar" Musso
The problem here is the implicit rebase. As long as the review
backlog isn't long and/or people aren't submitting conflicting
changes, rebasing amended changes against master creates
more harm than good.
For amending commits, you should use `git review -R` so you
don't rebase the change (again) against master (see for example
[0], difference between patch 2 and 3). I've updated the docs[1],
but they are, briefly:
git review -d 123
(make changes)
git commit -a --amend
git review -R
If you're not using git-review and have been using the alias, your
amended patchsets have not been creating this problem.
-Chad
[0] https://gerrit.wikimedia.org/r/#change,4020
[1] https://www.mediawiki.org/wiki/Git/Workflow#Amend_your_change