Undetected rename, how do undo related pathset

858 views
Skip to first unread message

Ash

unread,
Jul 25, 2011, 9:20:54 AM7/25/11
to Repo and Gerrit Discussion
A coder did a 'rename' of a java source code file, then edited the
file, all within one commit, and then this was pushed to an existing
change in gerrit.

The rename has not been detected in the gerrit diffs, probably because
a rename should be individually pushed to gerrit so be detected
whereas he had some of edits within the renamed file.

After the fact, how can I possibly correct this? I wouldn't mind
"undoing" the patchset (& re-pushing the rename etc.), if this was
possible, but it appears not to be.

Help!

--
Ash

Magnus Bäck

unread,
Jul 25, 2011, 10:05:35 AM7/25/11
to Repo and Gerrit Discussion
On Monday, July 25, 2011 at 15:20 CEST,
Ash <ashi...@gmail.com> wrote:

> A coder did a 'rename' of a java source code file, then edited the
> file, all within one commit, and then this was pushed to an existing
> change in gerrit.
>
> The rename has not been detected in the gerrit diffs, probably because
> a rename should be individually pushed to gerrit so be detected
> whereas he had some of edits within the renamed file.

It's perfectly fine to rename a file and change it in the same commit.
If it doesn't work it's clearly a Gerrit bug.

When you say "has not been detected", what do you mean? Is the file
displayed with the new name? Is it displayed with the old name?
Something else?

It *should* be displayed with the new name with the text "renamed from
<old name>" in italics just below the filename.

> After the fact, how can I possibly correct this? I wouldn't mind
> "undoing" the patchset (& re-pushing the rename etc.), if this was
> possible, but it appears not to be.

You can always upload a new patch set where you've undone the changes,
e.g. by renaming the file back to the original name and amend the commit.
But this should not be necessary.

--
Magnus Bäck Opinions are my own and do not necessarily
SW Configuration Manager represent the ones of my employer, etc.
Sony Ericsson

Ash

unread,
Jul 25, 2011, 10:39:57 AM7/25/11
to Repo and Gerrit Discussion
inline:

On Jul 25, 3:05 pm, Magnus Bäck <magnus.b...@sonyericsson.com> wrote:
> On Monday, July 25, 2011 at 15:20 CEST,
>      Ash <ashiru...@gmail.com> wrote:
>
> > A coder did a 'rename' of a java source code file, then edited the
> > file, all within one commit, and then this was pushed to an existing
> > change in gerrit.
>
> > The rename has not been detected in the gerrit diffs, probably because
> > a rename should be individually pushed to gerrit so be detected
> > whereas he had some of edits within the renamed file.
>
> It's perfectly fine to rename a file and change it in the same commit.
> If it doesn't work it's clearly a Gerrit bug.
>
> When you say "has not been detected", what do you mean? Is the file
> displayed with the new name? Is it displayed with the old name?
> Something else?

It's displayed as a delete of the old file and an add of the new name.
I'm using gerrit 2.2.1

There were a few edits to file that was renamed, perhaps enough to go
over the rename-detection threshold.

>
> It *should* be displayed with the new name with the text "renamed from
> <old name>" in italics just below the filename.
>
> > After the fact, how can I possibly correct this? I wouldn't mind
> > "undoing" the patchset (& re-pushing the rename etc.), if this was
> > possible, but it appears not to be.
>
> You can always upload a new patch set where you've undone the changes,
> e.g. by renaming the file back to the original name and amend the commit.
> But this should not be necessary.

So if the bad patchset was patchset 5 "PS5" (which was responding to
comments in PS4):
1. go back two patchsets (PS4), push to gerrit (presumably I have to
do a git commit -amend so its gets a new SHA1)
2. do the rename, commit, immediately push (this becomes a new
patchset PS6 that will show the rename)
3. redo the edits from PS5, commit, push (this becomes PS7)
4. diff the new PS7 against PS4 in order to see the original comments
against which he made changes (in the original PS5)

Ouch! If only there was some way to discard a pushed patchset!

Git itself knows the history of the file as it varied between
patchsets and show the diffs between patchsets, but in the final push,
and after that if clones by another user the renames will not be
detected as patchsets are gerrit concept and they all get squashed
into one commit.

Since git can't detect renames if the file as been extensively edited,
and many reviews will ask for file renames and edits, don't these
renames always remain undetectable in the final repo (although they
are visible within the gerrit change)?

Martin Fick

unread,
Jul 25, 2011, 11:36:53 AM7/25/11
to repo-d...@googlegroups.com, Ash
On Monday 25 July 2011 08:39:57 am Ash wrote:
>
> There were a few edits to file that was renamed, perhaps
> enough to go over the rename-detection threshold.

Gerrit and git do not share the exact same threshold.
There is not much you can do about this.


> > You can always upload a new patch set where you've
> > undone the changes, e.g. by renaming the file back to
> > the original name and amend the commit. But this should
> > not be necessary.
>
> So if the bad patchset was patchset 5 "PS5" (which was
> responding to comments in PS4):
> 1. go back two patchsets (PS4), push to gerrit
> (presumably I have to do a git commit -amend so its gets
> a new SHA1)
> 2. do the rename, commit, immediately push (this becomes
> a new patchset PS6 that will show the rename)
> 3. redo the edits from PS5, commit, push (this becomes
> PS7) 4. diff the new PS7 against PS4 in order to see the
> original comments against which he made changes (in the
> original PS5)
>
> Ouch! If only there was some way to discard a pushed
> patchset!

Separate patchsets isn't going to help, you would need
separate commits (changes) to force detection.

-Martin

Ash

unread,
Jul 25, 2011, 12:02:23 PM7/25/11
to Repo and Gerrit Discussion


On Jul 25, 4:36 pm, Martin Fick <mf...@codeaurora.org> wrote:
> On Monday 25 July 2011 08:39:57 am Ash wrote:
>
>
>
> > There were a few edits to file that was renamed, perhaps
> >  enough to go over the rename-detection threshold.
>
> Gerrit and git do not share the exact same threshold.
> There is not much you can do about this.
>

I have access to the server and I'm happy to tweak server settings or
code if need be.

> > > You can always upload a new patch set where you've
> > > undone the changes, e.g. by renaming the file back to
> > > the original name and amend the commit. But this should
> > > not be necessary.
>
> > So if the bad patchset was patchset 5 "PS5" (which was
> >  responding to comments in PS4):
> > 1. go back two patchsets (PS4), push to gerrit
> >  (presumably I have to do a git commit -amend so its gets
> >  a new SHA1)
> > 2. do the rename, commit, immediately push (this becomes
> >  a new patchset PS6 that will show the rename)
> > 3. redo the edits from PS5, commit, push (this becomes
> >  PS7) 4. diff the new PS7 against PS4 in order to see the
> >  original comments against which he made changes (in the
> >  original PS5)
>
> > Ouch! If only there was some way to discard a pushed
> >  patchset!
>
> Separate patchsets isn't going to help, you would need
> separate commits (changes) to force detection.

Surely if the rename was in its own patchset with little to no edits,
it would always be detected?

We could then do the heavy editing on the newly-named files a follow-
up patchset which presumably can then be diffed against the original
file on an earlier patchset. Or am I misunderstanding the way
patchsets interact with git?

>
> -Martin

Martin Fick

unread,
Jul 25, 2011, 12:18:25 PM7/25/11
to repo-d...@googlegroups.com
On Monday 25 July 2011 10:02:23 am Ash wrote:
> On Jul 25, 4:36 pm, Martin Fick <mf...@codeaurora.org>
wrote:
> > On Monday 25 July 2011 08:39:57 am Ash wrote:
> > > There were a few edits to file that was renamed,
> > > perhaps enough to go over the rename-detection
> > > threshold.
> >
> > Gerrit and git do not share the exact same threshold.
> > There is not much you can do about this.
>
> I have access to the server and I'm happy to tweak server
> settings or code if need be.

No Gerrit settings that I know of. You would have to look
at the jgit code to figure it out. I am not sure why it
matters, there will always be times when they are not
caught, even in cgit. It is just a heuristic.


> > > > You can always upload a new patch set where you've
> > > > undone the changes, e.g. by renaming the file back
> > > > to the original name and amend the commit. But this
> > > > should not be necessary.
> > >
> > > So if the bad patchset was patchset 5 "PS5" (which
> > > was responding to comments in PS4):
> > > 1. go back two patchsets (PS4), push to gerrit
> > > (presumably I have to do a git commit -amend so its
> > > gets a new SHA1)
> > > 2. do the rename, commit, immediately push (this
> > > becomes a new patchset PS6 that will show the rename)
> > > 3. redo the edits from PS5, commit, push (this
> > > becomes PS7) 4. diff the new PS7 against PS4 in order
> > > to see the original comments against which he made
> > > changes (in the original PS5)
> > >
> > > Ouch! If only there was some way to discard a pushed
> > > patchset!
> >
> > Separate patchsets isn't going to help, you would need
> > separate commits (changes) to force detection.
>
> Surely if the rename was in its own patchset with little
> to no edits, it would always be detected?

Yes, but patchsets replace each other, it will serve no
purpose if it is detected in a previous patchset. That's
why I said that you want it in a separate change. Are you
confusing patchsets and changes?

-Martin

--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum

Ash

unread,
Jul 25, 2011, 12:29:38 PM7/25/11
to Repo and Gerrit Discussion


On Jul 25, 5:18 pm, Martin Fick <mf...@codeaurora.org> wrote:
> On Monday 25 July 2011 10:02:23 am Ash wrote:
>
> > On Jul 25, 4:36 pm, Martin Fick <mf...@codeaurora.org>
> wrote:
> > > On Monday 25 July 2011 08:39:57 am Ash wrote:
> > > > There were a few edits to file that was renamed,
> > > > perhaps enough to go over the rename-detection
> > > > threshold.
>
> > > Gerrit and git do not share the exact same threshold.
> > > There is not much you can do about this.
>
> > I have access to the server and I'm happy to tweak server
> >  settings or code if need be.
>
> No Gerrit settings that I know of.  You would have to look
> at the jgit code to figure it out.  I am not sure why it
> matters, there will always be times when they are not
> caught, even in cgit.  It is just a heuristic.
>

If gerrit doesn't detect a rename, it doesn't display a diff against
the original file making it very difficult to see the original
comments and what changes were actually made in response to a code
review.

It's just an heuristic in git but when it breaks down it means you
lose file history which is one of the main points of using a SCM.
Patchsets don't "replace each other" while a review is ongoing - the
inter-patchset diffs are important to seeing what has changed in an
ongoing review. Without rename -detection, you have no diffs. Indeed,
I have no diffs and it's quite frustrating.

Ash

unread,
Jul 26, 2011, 5:42:36 AM7/26/11
to Repo and Gerrit Discussion, Ash
So it turns out that I'm a silly boy and the rename was detected, but
only when diffing the rename patchset against the immediately-previous
patchset, whereas gerrit by default compares it against patchset 1
which doesn't show the review.

The question still remains, since this renames are often just one
patchset amongst multiple edits, in the final merge, they will be
swallowed within one commit and therefore the rename detection (jgits
or "C git") will almost certainly get lost. I'm interested to know
what the community usually does to mitigate this?

(Is there a way to adjust the rename treshhold in jgit/gerrit?)

--
Ash

Swindells, Thomas

unread,
Jul 26, 2011, 6:01:22 AM7/26/11
to Ash, Repo and Gerrit Discussion
> The question still remains, since this renames are often just one patchset
> amongst multiple edits, in the final merge, they will be swallowed within one
> commit and therefore the rename detection (jgits or "C git") will almost
> certainly get lost. I'm interested to know what the community usually does to
> mitigate this?
>
[Swindells, Thomas] I just live with it personally, though more generally I've been having the opposite problem where with small files the size of our companies copyright notice at the top means that they are considered renames when they aren't!

> (Is there a way to adjust the rename treshhold in jgit/gerrit?)


**************************************************************************************
This message is confidential and intended only for the addressee. If you have received this message in error, please immediately notify the postm...@nds.com and delete it from your system as well as any copies. The content of e-mails as well as traffic data may be monitored by NDS for employment and security purposes. To protect the environment please do not print this e-mail unless necessary.

NDS Limited. Registered Office: One London Road, Staines, Middlesex, TW18 4EX, United Kingdom. A company registered in England and Wales. Registered no. 3080780. VAT no. GB 603 8808 40-00
**************************************************************************************

Magnus Bäck

unread,
Jul 26, 2011, 10:29:34 AM7/26/11
to Repo and Gerrit Discussion
On Tuesday, July 26, 2011 at 11:42 CEST,
Ash <ashi...@gmail.com> wrote:

> So it turns out that I'm a silly boy and the rename was detected, but
> only when diffing the rename patchset against the immediately-previous
> patchset, whereas gerrit by default compares it against patchset 1
> which doesn't show the review.
>
> The question still remains, since this renames are often just one
> patchset amongst multiple edits, in the final merge, they will be
> swallowed within one commit and therefore the rename detection (jgits
> or "C git") will almost certainly get lost.

Sorry, I don't quite follow. The way you write makes me believe you're
using patchsets of a change as a way of splitting commits into multiple
parts that you combine (basically squash-merge) when you submit the
change. Surely this isn't the case?

IIRC the rename detection threshold in C Git triggers when about 50%
of the lines in a file differ (for JGit the threshold appears to be
60%). For reasonably sized commits in any standard-looking programming
language this shouldn't trigger very often except in actual cases of
files being renamed.

> I'm interested to know what the community usually does to mitigate
> this?

I doubt this is even perceived as a problem. To my knowledge this hasn't
come up on this list before, and I don't think I've ever seen a question
from any of our users (~1000) during the two years we've been using
Gerrit. Sure, all Git/Gerrit questions don't reach my inbox (thank God),
but you get the idea.

> (Is there a way to adjust the rename treshhold in jgit/gerrit?)

JGit seems to make it configurable (RenameDetector.java), but Gerrit
doesn't expose this configuration point.

Ash Irus

unread,
Jul 26, 2011, 12:05:39 PM7/26/11
to Repo and Gerrit Discussion
On Tue, Jul 26, 2011 at 3:29 PM, Magnus Bäck <magnu...@sonyericsson.com> wrote:
On Tuesday, July 26, 2011 at 11:42 CEST,
    Ash <ashi...@gmail.com> wrote:

> So it turns out that I'm a silly boy and the rename was detected, but
> only when diffing the rename patchset against the immediately-previous
> patchset, whereas gerrit by default compares it against patchset 1
> which doesn't show the review.
>
> The question still remains, since this renames are often just one
> patchset amongst multiple edits, in the final merge, they will be
> swallowed within one commit and therefore the rename detection (jgits
> or "C git") will almost certainly get lost.

Sorry, I don't quite follow. The way you write makes me believe you're
using patchsets of a change as a way of splitting commits into multiple
parts that you combine (basically squash-merge) when you submit the
change. Surely this isn't the case?

I'm using Gerrit in the following way:
1. Coder submits code using push,
2. Reviewer comments on code
3. Coder makes changes according the review, and commits the changes using the changeId of the first submission in the commit message, furthermore he must amend the original commit (via 'git rebase' or 'git commit --amend') and then push
Repeat 2 & 3 until no more changes needed
n. Reviewer (or Approver) approves and only the final commit is merged with the the gerrit repo branch.

I've been led to believe (although there is no official documentation explicitly explaining a basic gerrit workflow) that this is standard usage.

Now, the reviewer may ask the coder to rename a class to a better name, and hence in Java this would necessitate a filename change. The ongoing review may require code changes in the renamed file in later comments. The problem is that all this must occur under one commit (i.e. you have to commit --amend or rebase to generate patchsets to an ongoing review).

If there are many edits following the rename, these may fall beyond the threshold.

I agree that ideally code for review should involve few changes and hence less chances of renames followed by extensive editing, however not all of us smaller shops have the luxury of having many people able to code review, so they tend to pile up into larger submissions making this scenario quite likely.
 

IIRC the rename detection threshold in C Git triggers when about 50%
of the lines in a file differ (for JGit the threshold appears to be
60%). For reasonably sized commits in any standard-looking programming
language this shouldn't trigger very often except in actual cases of
files being renamed.

I wasn't aware it was this high. I was under the impression it was quite a bit less by default (20% ?). If it is this high you are probably right that is very unlikely for a rename to be entirely missed.
 

> I'm interested to know what the community usually does to mitigate
> this?

I doubt this is even perceived as a problem. To my knowledge this hasn't
come up on this list before, and I don't think I've ever seen a question
from any of our users (~1000) during the two years we've been using
Gerrit. Sure, all Git/Gerrit questions don't reach my inbox (thank God),
but you get the idea.

> (Is there a way to adjust the rename treshhold in jgit/gerrit?)

JGit seems to make it configurable (RenameDetector.java), but Gerrit
doesn't expose this configuration point.

It would be nice if it did, if the threshold is much lower than 50%.
 

--
Magnus Bäck                   Opinions are my own and do not necessarily
SW Configuration Manager      represent the ones of my employer, etc.
Sony Ericsson

Magnus Bäck

unread,
Jul 26, 2011, 5:46:01 PM7/26/11
to Repo and Gerrit Discussion
On Tuesday, July 26, 2011 at 18:05 CEST,
Ash Irus <ashi...@gmail.com> wrote:

> I'm using Gerrit in the following way:
> 1. Coder submits code using push,
> 2. Reviewer comments on code
> 3. Coder makes changes according the review, and commits the changes
> using the changeId of the first submission in the commit message,
> furthermore he must amend the original commit (via 'git rebase' or
> 'git commit --amend') and then push
> Repeat 2 & 3 until no more changes needed
> n. Reviewer (or Approver) approves and only the final commit is merged
> with the the gerrit repo branch.
>
> I've been led to believe (although there is no official documentation
> explicitly explaining a basic gerrit workflow) that this is standard
> usage.

Yes, it is.

> Now, the reviewer may ask the coder to rename a class to a better
> name, and hence in Java this would necessitate a filename change. The
> ongoing review may require code changes in the renamed file in later
> comments. The problem is that all this must occur under one commit
> (i.e. you have to commit --amend or rebase to generate patchsets to an
> ongoing review).
>
> If there are many edits following the rename, these may fall beyond
> the threshold.

Yes, although it should be quite rare.

> I agree that ideally code for review should involve few changes and
> hence less chances of renames followed by extensive editing, however
> not all of us smaller shops have the luxury of having many people able
> to code review, so they tend to pile up into larger submissions making
> this scenario quite likely.

I don't think the size of the shop is relevant, it's the ratio of coders
to approvers that matters. The number of approvers is limited by
competence, how the organization chooses to prioritize code reviews,
and possibly internal politics.

Reviewing large chunks of code at a time will either take *much* longer
time for the reviewers (I'd expect the review time as a function of the
commit delta to grow exponentially), or the reviews will have much lower
quality.

[...]

Reply all
Reply to author
Forward
0 new messages