Follow-up from today's conversations

83 views
Skip to first unread message

Ehsan Akhgari

unread,
Mar 17, 2014, 6:31:54 PM3/17/14
to mozilla-c...@googlegroups.com
1. Steven convinced me that the model that he is proposing for multiple pushes to the same bug will work.  I think we should at least try to prototype it.  He will provide more details on what he is envisioning tomorrow.

2. I convinced Steven that we don't want the multiple units of works per bug model.  Instead, people can just file a new bug for a new unit of work, the same way they do today.

We didn't really get anywhere on the question of how many review requests to create per bug.  I still think my idea of creating one review request per bug no matter how many changesets/commits are being pushed is better.  ;-)  Thoughts?

Cheers,

Taras Glek

unread,
Mar 17, 2014, 6:41:29 PM3/17/14
to Ehsan Akhgari, mozilla-c...@googlegroups.com


Monday, March 17, 2014 15:31
1. Steven convinced me that the model that he is proposing for multiple pushes to the same bug will work.  I think we should at least try to prototype it.  He will provide more details on what he is envisioning tomorrow.

2. I convinced Steven that we don't want the multiple units of works per bug model.  Instead, people can just file a new bug for a new unit of work, the same way they do today.

We didn't really get anywhere on the question of how many review requests to create per bug.  I still think my idea of creating one review request per bug no matter how many changesets/commits are being pushed is better.  ;-)  Thoughts?
I think that' breaks the mq workflow. People usually have a patch per logical thing to review.

Taras

Cheers,
--
You received this message because you are subscribed to the Google Groups "mozilla-code-review" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mozilla-code-re...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ehsan Akhgari

unread,
Mar 18, 2014, 9:06:14 AM3/18/14
to Taras Glek, mozilla-c...@googlegroups.com
On Mon, Mar 17, 2014 at 6:41 PM, Taras Glek <tg...@mozilla.com> wrote:


Monday, March 17, 2014 15:31
1. Steven convinced me that the model that he is proposing for multiple pushes to the same bug will work.  I think we should at least try to prototype it.  He will provide more details on what he is envisioning tomorrow.

2. I convinced Steven that we don't want the multiple units of works per bug model.  Instead, people can just file a new bug for a new unit of work, the same way they do today.

We didn't really get anywhere on the question of how many review requests to create per bug.  I still think my idea of creating one review request per bug no matter how many changesets/commits are being pushed is better.  ;-)  Thoughts?
I think that' breaks the mq workflow. People usually have a patch per logical thing to review.

There is unfortunately a lot of the IRC conversation context missing here which I don't have a link to, but I don't think this is an issue here.  The model that Steven had in mind was for the case where for example you land something in a bug and then later on discover that fix was not sufficient and then you reopen the bug to do another round of fixes (another logical unit of work.)  When this currently happens we ask people to file a new bug for the new work, which is why I said we do not need to address this use case.

Steven MacLeod

unread,
Mar 18, 2014, 5:24:53 PM3/18/14
to Ehsan Akhgari, Taras Glek, mozilla-c...@googlegroups.com
To describe what Ehsan and I discussed, I'll walk through an example and with
diagrams and explain the workflow. We'll start with the Reviews repository, and
a local developer repository in the following state:

Inline image 1

We would then push to "Bug 1" on the Reviews repo, which would result in
commits "C5" and "C6" being pushed. The push would return some URL indicating
the bug and the push (We'll assume bug 1 going forward). Visiting this page would
show an interface similar to:

Inline image 2

We would select the commits we would like to post for review using the
checkboxes, and then submit them. Assuming we select both commits and submit,
we would end up with the following state for review requests:

Inline image 3

We would have the "permalink" review request "/r/1" which contains the "squashed"
diff. This review request would link out to a review request for each commit
("/r/2" and "/r/3").

Now, let's assume the developer modifies commit C6, resulting in C6', and
introduces a new commit C7. The state of the repositories is the following:

Inline image 4

We would then push to "Bug 1" on the Reviews repo again, which would result
in commits "C6'" and "C7" being pushed. The push would return some URL
indicating the bug and push. Visiting the page would now show a new interface
where you not only select the commits for review, but match them to the
previous push for that bug.

Inline image 5

Only the new commits C6' and C7 would be shown to the right, since they were
what appeared in the push. We would need to click "Click To Show More Commits"
so that C5 would appear allowing us to match it as well. Clicking it would result in
the following:

Inline image 6

Now, we would select C7 so that it would be a new Review Request, and then drag
C6' onto C6 (/r/3) and drag C5 onto C5 (/r/2). The reason we require matching C5
to C5, is that if any of the old review requests don't have a match, we will close them
and assume they have been folded into another commit, or removed. After doing this
we would see something like the following:

Inline image 7

This interface would be flexible enough to support rebases including reordering of
commits. By matching these commits we are able to update the associated review
requests, and Review Board will automatically give us nice interdiffs for each commit.

Once we submit this matching, we would end up with the following review request
states:

Inline image 9


That is the general workflow I imagined for this. It should support the type of
development I use with separate branches for each feature, as well as Ehsan's
workflow with many features on a single branch where commits must be
selected out of the push.

In the case where there are commits intermingled between the ones you would
like reviewed, you just don't select them. Review Board is smart enough to filter
out most changes which aren't part of the commits you're reviewing, especially
when they are in files you did not touch in your change.

I'm not going to provide more diagrams showing the different workflows unless
someone is unclear and would like something specific explained.

- Steve

Ehsan Akhgari

unread,
Mar 18, 2014, 5:51:46 PM3/18/14
to Steven MacLeod, Taras Glek, mozilla-c...@googlegroups.com
Thanks for posting this, Steven!

George, can you please work on a quick prototype of this?  I have thought about this and have managed to convince myself that it should be fairly simple to implement so I would like to know ASAP if you hit any roadblocks on this.  This is pretty fundamental to the workflow of submitting something for review so we should focus on seeing whether this idea works fine in practice.

Thanks,

--

George Miroshnykov

unread,
Mar 19, 2014, 6:56:25 AM3/19/14
to Steven MacLeod, Ehsan Akhgari, Taras Glek, mozilla-c...@googlegroups.com
Hey, that looks nice!

I still have a ton of questions, but I’ll try to implement what I can.

Meanwhile, a question: do we want to set the limit to one (parent) review request per bug?
Having multiple (competing) fixes from different contributors to the same bug looks feasible, but we definitely don’t want to bulk them into a single review request.

The current algorithm would work like this:

1. User runs "BUG=31337 hg push -e 'ssh -o SendEnv=BUG’” (see https://github.com/ehsan/reviewtool/issues/2 for details)
2. User gets the URL like http://review.example.org/?changeset=<something>&bug=31337
3. User opens this URL
4. We look up a parent review request for bug 31337
5. If there’s no existing review request, we create a new one and its children
6. If there is existing review request, we update it and its children

That workflow will fail if someone wants to create a second review request for the same bug.
We could probably identify review requests not just by bug number, but by a tuple of (bug number, user), but that means two users who collaborate on a common fix won’t be able to incrementally push their changesets for review - that would create separate review requests per user.

What do you think about it?

Thanks,
George

Ehsan Akhgari

unread,
Mar 19, 2014, 9:43:54 AM3/19/14
to George Miroshnykov, Steven MacLeod, Taras Glek, mozilla-c...@googlegroups.com
On Wed, Mar 19, 2014 at 6:56 AM, George Miroshnykov <gmiros...@rebbix.com> wrote:
Hey, that looks nice!

I still have a ton of questions, but I’ll try to implement what I can.

Please ask them!
 
Meanwhile, a question: do we want to set the limit to one (parent) review request per bug?

Yep.
 
Having multiple (competing) fixes from different contributors to the same bug looks feasible, but we definitely don’t want to bulk them into a single review request.

We discourage multiple people working on competing fixes on a bug.  Sometimes different people work on the bug at different times, but they should all be able to add their review requests under the same parent entry.
 
The current algorithm would work like this:

1. User runs "BUG=31337 hg push -e 'ssh -o SendEnv=BUG’” (see https://github.com/ehsan/reviewtool/issues/2 for details)
2. User gets the URL like http://review.example.org/?changeset=<something>&bug=31337
3. User opens this URL
4. We look up a parent review request for bug 31337
5. If there’s no existing review request, we create a new one and its children
6. If there is existing review request, we update it and its children

Yeah that is exactly what I had in mind.
 
That workflow will fail if someone wants to create a second review request for the same bug.
We could probably identify review requests not just by bug number, but by a tuple of (bug number, user), but that means two users who collaborate on a common fix won’t be able to incrementally push their changesets for review - that would create separate review requests per user.

Differentiating based on the user is not needed, see above please.

Cheers,
Ehsan

George Miroshnykov

unread,
Mar 19, 2014, 10:24:09 AM3/19/14
to Ehsan Akhgari, Steven MacLeod, Taras Glek, mozilla-c...@googlegroups.com
Great, then let’s carry on with one review request per bug.

The main question right now is here:

The second question is how do we generate a squashed diff of arbitrary changesets, preferably without touching the working copy of the repository (to avoid locks etc.)?
E.g. we have changesets C1 => C2 => C3.
If user picks (C1, C2, C3) or (C1, C2) or (C2, C3) - we can be sure there are no conflicts and can simply use “hg diff” to get a squashed diff.
But if user picks (C1, C3) there’s a chance that C3 won’t apply on C1 without C2.
The only way to detect this that I can think of is to try to apply those changesets one by one on the working copy.
Is there a better way?

Next, I’m not sure why do we need to match C5 to C5?
We can simply show that C5 already has a review request and optionally allow to discard it from review.
The idea is that if I’ve pushed 10 changesets for review, I don’t want to match them all when I push 11-th.
Am I missing something?

Thanks,
George

Ehsan Akhgari

unread,
Mar 19, 2014, 12:10:15 PM3/19/14
to George Miroshnykov, Steven MacLeod, Taras Glek, mozilla-c...@googlegroups.com
On Wed, Mar 19, 2014 at 10:24 AM, George Miroshnykov <gmiros...@rebbix.com> wrote:
Great, then let’s carry on with one review request per bug.

The main question right now is here:

I'll take a look at that later today, it requires a bit of thought.
 
The second question is how do we generate a squashed diff of arbitrary changesets, preferably without touching the working copy of the repository (to avoid locks etc.)?
E.g. we have changesets C1 => C2 => C3.
If user picks (C1, C2, C3) or (C1, C2) or (C2, C3) - we can be sure there are no conflicts and can simply use “hg diff” to get a squashed diff.
But if user picks (C1, C3) there’s a chance that C3 won’t apply on C1 without C2.
The only way to detect this that I can think of is to try to apply those changesets one by one on the working copy.
Is there a better way?

We don't need to bother with that -- we can just use hg diff with the first and last changesets.  If there are irrelevant commits in the middle, they _will_ appear in the squashed diff.

Note that for the majority of the cases, the squashed diff will not be reviewed or even looked at.  In case someone looks at it and is bothered by the potentially irrelevant stuff, they can just ask the change author to resubmit without that stuff.
 
Next, I’m not sure why do we need to match C5 to C5?

I assume you mean C5 to C5'.  The reason why someone would want to do that is for RB to be able to show you the interdiff of what changed between C5 and C5'.  It's optional of course, and is left at the discretion of the patch author, so they will do that where it makes sense.
 
We can simply show that C5 already has a review request and optionally allow to discard it from review.
The idea is that if I’ve pushed 10 changesets for review, I don’t want to match them all when I push 11-th.
Am I missing something?

If you just add the 11th changeset on top, you don't need to match anything, you'll just request an additional review on the 11th commit.  If however you have rebased and/or changed the other 10 commits, you would want to tell this tool which old commit corresponds to which new one.  If you haven't done any changes to those and you don't want review on them, you can just do nothing, and they no review request will be created for them.

Does this help explain things more?

Cheers,
Ehsan

George Miroshnykov

unread,
Mar 19, 2014, 1:12:13 PM3/19/14
to Ehsan Akhgari, Steven MacLeod, Taras Glek, mozilla-c...@googlegroups.com

On March 19, 2014 at 18:10:57 , Ehsan Akhgari (ehsan....@gmail.com) wrote:

On Wed, Mar 19, 2014 at 10:24 AM, George Miroshnykov <gmiros...@rebbix.com> wrote:
Great, then let’s carry on with one review request per bug.

The main question right now is here:

I'll take a look at that later today, it requires a bit of thought.

Thanks!


 
The second question is how do we generate a squashed diff of arbitrary changesets, preferably without touching the working copy of the repository (to avoid locks etc.)?
E.g. we have changesets C1 => C2 => C3.
If user picks (C1, C2, C3) or (C1, C2) or (C2, C3) - we can be sure there are no conflicts and can simply use “hg diff” to get a squashed diff.
But if user picks (C1, C3) there’s a chance that C3 won’t apply on C1 without C2.
The only way to detect this that I can think of is to try to apply those changesets one by one on the working copy.
Is there a better way?

We don't need to bother with that -- we can just use hg diff with the first and last changesets.  If there are irrelevant commits in the middle, they _will_ appear in the squashed diff.

Got it.



Note that for the majority of the cases, the squashed diff will not be reviewed or even looked at.  In case someone looks at it and is bothered by the potentially irrelevant stuff, they can just ask the change author to resubmit without that stuff.
 
Next, I’m not sure why do we need to match C5 to C5?

I assume you mean C5 to C5'.  The reason why someone would want to do that is for RB to be able to show you the interdiff of what changed between C5 and C5'.  It's optional of course, and is left at the discretion of the patch author, so they will do that where it makes sense.

Nope, I did mean matching C5 to C5, because C5 is not changed in the case that Steven described:

> The reason we require matching C5 to C5, is that if any of the old review requests don't have a match, we will close them and assume they have been folded into another commit, or removed.

I just thought that it would be better to revere this behaviour: when we can automatically match some existing changesets, we should do it.
If user doesn’t want those changesets reviewed, they can be explicitly removed.
What do you think?

 
We can simply show that C5 already has a review request and optionally allow to discard it from review.
The idea is that if I’ve pushed 10 changesets for review, I don’t want to match them all when I push 11-th.
Am I missing something?

If you just add the 11th changeset on top, you don't need to match anything, you'll just request an additional review on the 11th commit.  If however you have rebased and/or changed the other 10 commits, you would want to tell this tool which old commit corresponds to which new one.  If you haven't done any changes to those and you don't want review on them, you can just do nothing, and they no review request will be created for them.

Oh, I probably misunderstood Steven.

I’m also concerned about edge cases of history rewrite.
The approach we’ve discussed so far covers following cases:

  1. Creating a new review request
  2. Adding changesets to existing review request
  3. Removing changesets from existing review request
  4. Rewriting a changeset, then replacing old one (C6) with a new one (C6’).

But what should we do in case of squashed commits?
As far as I can see, there’s no way for us to preserve review comments from the initial commits if they were later squashed.
Unless we track comments of the parent (squashed) review request too, like GitHub does.



Does this help explain things more?

It sure does, thanks!

Ehsan Akhgari

unread,
Mar 19, 2014, 2:40:32 PM3/19/14
to George Miroshnykov, Steven MacLeod, Taras Glek, mozilla-c...@googlegroups.com
On Wed, Mar 19, 2014 at 1:12 PM, George Miroshnykov <gmiros...@rebbix.com> wrote:

On March 19, 2014 at 18:10:57 , Ehsan Akhgari (ehsan....@gmail.com) wrote:

On Wed, Mar 19, 2014 at 10:24 AM, George Miroshnykov <gmiros...@rebbix.com> wrote:
Great, then let’s carry on with one review request per bug.

The main question right now is here:

I'll take a look at that later today, it requires a bit of thought.

Thanks!


 
The second question is how do we generate a squashed diff of arbitrary changesets, preferably without touching the working copy of the repository (to avoid locks etc.)?
E.g. we have changesets C1 => C2 => C3.
If user picks (C1, C2, C3) or (C1, C2) or (C2, C3) - we can be sure there are no conflicts and can simply use “hg diff” to get a squashed diff.
But if user picks (C1, C3) there’s a chance that C3 won’t apply on C1 without C2.
The only way to detect this that I can think of is to try to apply those changesets one by one on the working copy.
Is there a better way?

We don't need to bother with that -- we can just use hg diff with the first and last changesets.  If there are irrelevant commits in the middle, they _will_ appear in the squashed diff.

Got it.



Note that for the majority of the cases, the squashed diff will not be reviewed or even looked at.  In case someone looks at it and is bothered by the potentially irrelevant stuff, they can just ask the change author to resubmit without that stuff.
 
Next, I’m not sure why do we need to match C5 to C5?

I assume you mean C5 to C5'.  The reason why someone would want to do that is for RB to be able to show you the interdiff of what changed between C5 and C5'.  It's optional of course, and is left at the discretion of the patch author, so they will do that where it makes sense.

Nope, I did mean matching C5 to C5, because C5 is not changed in the case that Steven described:

> The reason we require matching C5 to C5, is that if any of the old review requests don't have a match, we will close them and assume they have been folded into another commit, or removed.

I just thought that it would be better to revere this behaviour: when we can automatically match some existing changesets, we should do it.
If user doesn’t want those changesets reviewed, they can be explicitly removed.
What do you think?


The only case where we would be able to do this matching automatically is if the changeset sha1 has not changed, right?  Which means that there is nothing to create a new review request based on.  I think Steven was talking about the case where C5 is "essentially" the same patch, but rebased or something.  With that, what I said above should still apply.
 
 
We can simply show that C5 already has a review request and optionally allow to discard it from review.
The idea is that if I’ve pushed 10 changesets for review, I don’t want to match them all when I push 11-th.
Am I missing something?

If you just add the 11th changeset on top, you don't need to match anything, you'll just request an additional review on the 11th commit.  If however you have rebased and/or changed the other 10 commits, you would want to tell this tool which old commit corresponds to which new one.  If you haven't done any changes to those and you don't want review on them, you can just do nothing, and they no review request will be created for them.

Oh, I probably misunderstood Steven.

I’m also concerned about edge cases of history rewrite.
The approach we’ve discussed so far covers following cases:

  1. Creating a new review request
  2. Adding changesets to existing review request
  3. Removing changesets from existing review request
  4. Rewriting a changeset, then replacing old one (C6) with a new one (C6’).

But what should we do in case of squashed commits?
As far as I can see, there’s no way for us to preserve review comments from the initial commits if they were later squashed.
Unless we track comments of the parent (squashed) review request too, like GitHub does.


For squashed commits, the author can either choose to reflect that as a new review request, *or* connect that to either one of the two commits that have been squashed.  Depending on what they do, we will fall back to one of the "easy" cases you listed above.

I didn't get the part about being able to preserve review comments, can you please elaborate on that?

Thanks!
Ehsan

George Miroshnykov

unread,
Mar 19, 2014, 3:39:36 PM3/19/14
to Ehsan Akhgari, Steven MacLeod, Taras Glek, mozilla-c...@googlegroups.com

Let’s say you push two changesets for review.
A reviewer comes and leaves two inline comments, one for each changeset.
Then you decide to squash those two changesets into one.
At this point you have two options:

  1. Create a new review request for your newly squashed changeset. You lose both comments.
  2. Discard one of the previous changesets and associate the newly squashed squashed with the old one that’s left. You lose one comment on the discarded changeset.
GitHub avoids this issue by tracking comments on the squashed diff, not on individual commits.
They also hide comments that are no longer relevant automatically after each push.
This works for squashes and other forms of history rewrites.

It can be better demonstrated by live example:

Of course, inline comments on individual commits are also available (at the bottom):

Should we try to do something similar with RB?

Thanks,
George

Steven MacLeod

unread,
Mar 20, 2014, 12:09:49 PM3/20/14
to George Miroshnykov, Ehsan Akhgari, Taras Glek, mozilla-c...@googlegroups.com
Sorry for the confusion from my example, I'll try and clear a few things up:

I just thought that it would be better to revere this behaviour: when we can automatically match some existing changesets, we should do it.
If user doesn’t want those changesets reviewed, they can be explicitly removed.
What do you think?


The only case where we would be able to do this matching automatically is if the changeset sha1 has not changed, right?  Which means that there is nothing to create a new review request based on.  I think Steven was talking about the case where C5 is "essentially" the same patch, but rebased or something.  With that, what I said above should still apply.
So, George was correct, I was talking about the case where C5 == C5, meaning it is the *exact* commit. Confusingly in my example I showed manual matching of this commit because I wasn't trying to add extra intelligence to this system for a first run.

In this case, we definitely could automatically match C5 and C5 by walking backwards until we hit a changeset that was not part of the previous review. So, we'd check the parent of the pushed C6', see it was C5, and match. Then check the parent of C5, see it was C4, and stop walking the commits. This auto matching will break down in the case where there were interleaved commits not put up for review, in which case we'd just require manual matching like I showed.

Now, the reason why I think the matching needs to be shown explicitly, even for the case where C5 is up for review and unchanged, is it gives the ability to remove a commit from review. We will *discard* any review request for a commit that didn't get matched in the latest push (Discard is a specific term to Review Board here, which closes the review request, marking it as *discarded*. This process preserves the diffs, and reviews, and the review request could be reopened in the future).

Now, I'm not sure if this UX of *not* matching a commit to a previous one causing the review request to be discarded is the best, but I thought it was a simple model that fit in with the rest, giving us a way to discard.
 
We can simply show that C5 already has a review request and optionally allow to discard it from review.
The idea is that if I’ve pushed 10 changesets for review, I don’t want to match them all when I push 11-th.
Am I missing something?

If you just add the 11th changeset on top, you don't need to match anything, you'll just request an additional review on the 11th commit.  If however you have rebased and/or changed the other 10 commits, you would want to tell this tool which old commit corresponds to which new one.  If you haven't done any changes to those and you don't want review on them, you can just do nothing, and they no review request will be created for them.

Oh, I probably misunderstood Steven.

This again was me just making my example have a less intelligent tool than is possible. This auto matching when there is no rebase is certainly possible.
 

I didn't get the part about being able to preserve review comments, can you please elaborate on that?

Let’s say you push two changesets for review.
A reviewer comes and leaves two inline comments, one for each changeset.
Then you decide to squash those two changesets into one.
At this point you have two options:

  1. Create a new review request for your newly squashed changeset. You lose both comments.
  2. Discard one of the previous changesets and associate the newly squashed squashed with the old one that’s left. You lose one comment on the discarded changeset.
GitHub avoids this issue by tracking comments on the squashed diff, not on individual commits.
They also hide comments that are no longer relevant automatically after each push.
This works for squashes and other forms of history rewrites.

It can be better demonstrated by live example:

Of course, inline comments on individual commits are also available (at the bottom):

Should we try to do something similar with RB?


When we rebase and remove or squash commits, the pusher will need to select which Review Request to become the new commit. Generally this should be the child commit in the chain, because if this is done, the way Review Board generates interdiffs, it will only show the changes that are new for the squashed commit, not the entire parent commit as well (You'll of course see the entire commits diff in the normal diff view).

As for preserving review comments, we will not delete the other review requests, we will only *discard* them. As mentioned previously, this will preserve all history, it just marks the review request. I'm unsure if we should list them somehow in the squashed review request, but links to them will be preserved in the history of the squashed review request anyway. Basically, nothing is lost, it just won't be presented as part of the current state.

George Miroshnykov

unread,
Mar 20, 2014, 1:53:43 PM3/20/14
to Steven MacLeod, Taras Glek, mozilla-c...@googlegroups.com, Ehsan Akhgari

On March 20, 2014 at 6:09:51 PM, Steven MacLeod (ste...@smacleod.ca) wrote:

Sorry for the confusion from my example, I'll try and clear a few things up:

I just thought that it would be better to revere this behaviour: when we can automatically match some existing changesets, we should do it.
If user doesn’t want those changesets reviewed, they can be explicitly removed.
What do you think?


The only case where we would be able to do this matching automatically is if the changeset sha1 has not changed, right?  Which means that there is nothing to create a new review request based on.  I think Steven was talking about the case where C5 is "essentially" the same patch, but rebased or something.  With that, what I said above should still apply.
So, George was correct, I was talking about the case where C5 == C5, meaning it is the *exact* commit. Confusingly in my example I showed manual matching of this commit because I wasn't trying to add extra intelligence to this system for a first run.

In this case, we definitely could automatically match C5 and C5 by walking backwards until we hit a changeset that was not part of the previous review. So, we'd check the parent of the pushed C6', see it was C5, and match. Then check the parent of C5, see it was C4, and stop walking the commits. This auto matching will break down in the case where there were interleaved commits not put up for review, in which case we'd just require manual matching like I showed.

Now, the reason why I think the matching needs to be shown explicitly, even for the case where C5 is up for review and unchanged, is it gives the ability to remove a commit from review. We will *discard* any review request for a commit that didn't get matched in the latest push (Discard is a specific term to Review Board here, which closes the review request, marking it as *discarded*. This process preserves the diffs, and reviews, and the review request could be reopened in the future).

Now, I'm not sure if this UX of *not* matching a commit to a previous one causing the review request to be discarded is the best, but I thought it was a simple model that fit in with the rest, giving us a way to discard.

Okay, now I think I got it.

We’re looking at this interface from somewhat different perspectives.
My current implementation (still work in progress) looks like this:
https://cloudup.com/cb5eMMDNkfj

As you’ve suggested, we now have two tables: existing changesets and new changesets.
Existing changesets are looked up as following:

  1. find a parent review request by bug number (taken from current URL params)
  2. find all children review requests for that parent review request
  3. find all changesets associated with those children review requests
New changesets are looked up as following:
  1. find all changesets descendant from the provided changeset (taken from current URL params), inclusive
  2. ignore any changeset that is a part of existing changesets (above)
  3. TODO: figure out what to do if the changeset is already a part of another review request
This way we don’t have to match C5 to C5, because we already know that C5 was added for review before.
If we want to discard it, we can explicitly click that “[X]” button on the right.

I’m not as good with frontend stuff as I’d wish, so I’m still struggling to implement a proper drag and drop support for commits matching.

So my main intention for this UI is to provide a hassle-free common flow.
E.g. you could simply create a review request for all pushed changes like this:
  • mach review-request 31337
  • <open the URL in browser>
  • all changesets are pre-selected, so you briefly review the list and click “Create review request”
  • done
Want to update existing review requests with newly pushed changesets? Same flow.
Want to discard some changesest? Click [X].
Want to match rewritten changesets? Well, you use drag and drop, unless we come up with something better.

Am I veering off too much from your idea?


  
We can simply show that C5 already has a review request and optionally allow to discard it from review.
The idea is that if I’ve pushed 10 changesets for review, I don’t want to match them all when I push 11-th.
Am I missing something?

If you just add the 11th changeset on top, you don't need to match anything, you'll just request an additional review on the 11th commit.  If however you have rebased and/or changed the other 10 commits, you would want to tell this tool which old commit corresponds to which new one.  If you haven't done any changes to those and you don't want review on them, you can just do nothing, and they no review request will be created for them.

Oh, I probably misunderstood Steven.

This again was me just making my example have a less intelligent tool than is possible. This auto matching when there is no rebase is certainly possible.
 

I didn't get the part about being able to preserve review comments, can you please elaborate on that?

Let’s say you push two changesets for review.
A reviewer comes and leaves two inline comments, one for each changeset.
Then you decide to squash those two changesets into one.
At this point you have two options:

  1. Create a new review request for your newly squashed changeset. You lose both comments.
  2. Discard one of the previous changesets and associate the newly squashed squashed with the old one that’s left. You lose one comment on the discarded changeset.
GitHub avoids this issue by tracking comments on the squashed diff, not on individual commits.
They also hide comments that are no longer relevant automatically after each push.
This works for squashes and other forms of history rewrites.

It can be better demonstrated by live example:

Of course, inline comments on individual commits are also available (at the bottom):

Should we try to do something similar with RB?


When we rebase and remove or squash commits, the pusher will need to select which Review Request to become the new commit. Generally this should be the child commit in the chain, because if this is done, the way Review Board generates interdiffs, it will only show the changes that are new for the squashed commit, not the entire parent commit as well (You'll of course see the entire commits diff in the normal diff view).

As for preserving review comments, we will not delete the other review requests, we will only *discard* them. As mentioned previously, this will preserve all history, it just marks the review request. I'm unsure if we should list them somehow in the squashed review request, but links to them will be preserved in the history of the squashed review request anyway. Basically, nothing is lost, it just won't be presented as part of the current state.

You’re right, technically we’re not losing history.
But unless we’re showing this history somewhere, it will be effectively lost for the reviewer and reviewee.

Consider this case:
User pushes three changesets for review.
The first feedback he gets is “please squash all changesets into one”, followed by some other suggestions.
User squashes the changesets as requested and pushes new single squashed changeset for review.
He is presented with the UI listing three existing outdated changesets and the new squashed changeset. He’s not sure what to do now.
Moreover, even if he manages to match one changeset and discard the rest, he now has to go over every discarded changeset’s review request and look for the feedback he may have missed / not yet applied.
He’ll have no indication if this feedback is still relevant or he already fixed the issue in the newer commit(s).
I’m still not sure how we can deal with this...

Thanks,
George

Gregory Szorc

unread,
Mar 20, 2014, 3:56:40 PM3/20/14
to George Miroshnykov, Steven MacLeod, Taras Glek, mozilla-c...@googlegroups.com, Ehsan Akhgari
On 3/20/14, 10:53 AM, George Miroshnykov wrote:
>
> On March 20, 2014 at 6:09:51 PM, Steven MacLeod (ste...@smacleod.ca
> 1. find a parent review request by bug number (taken from current URL
> params)
> 2. find all children review requests for that parent review request
> 3. find all changesets associated with those children review requests
>
> New changesets are looked up as following:
>
> 1. find all changesets descendant from the provided changeset (taken
> from current URL params), inclusive
> 2. ignore any changeset that is a part of existing changesets (above)
> 3. TODO: figure out what to do if the changeset is already a part of
> another review request
>
> This way we don’t have to match C5 to C5, because we already know that
> C5 was added for review before.
> If we want to discard it, we can explicitly click that “[X]” button on
> the right.
>
> I’m not as good with frontend stuff as I’d wish, so I’m still struggling
> to implement a proper drag and drop support for commits matching.
>
> So my main intention for this UI is to provide a hassle-free common flow.
> E.g. you could simply create a review request for all pushed changes
> like this:
>
> * mach review-request 31337
> * <open the URL in browser>
> * all changesets are pre-selected, so you briefly review the list and
> click “Create review request”
> * done
>> 1. Create a new review request for your newly squashed changeset.
>> You lose both comments.
>> 2. Discard one of the previous changesets and associate the newly
There were some comments about looking up changesets that are
descendants/children of existing changesets.

This kind of operation can be computationally expensive, especially when
we have thousands of commit "branches." You'll almost certainly want to
have a database of some kind indexing things. But I encourage you to
measure this yourself before you reach that conclusion!

I started a project over Christmas to index repository data and make it
available as a web service. The source lives at
https://bitbucket.org/indygreg/moz-tree-utopia. It's deployed at
https://hg.stage.mozaws.net/ and http://moztree.stage.mozaws.net/docs/.

It's backed by a Postgres DB. But, finding all descendants of a single
commit is very inefficient with a relational database (tons of
sub/follow-up queries). I've been wanting to port the backend to a graph
database, like Neo4j.

Anyway, if you want to lean on the web API for performing queries
against repository data, go for it. If you have any feature requests for
new APIs, send a pull request my way or let me know. I could also have
it start indexing the review repository. Let me know.

George Miroshnykov

unread,
Mar 20, 2014, 4:25:49 PM3/20/14
to Gregory Szorc, Taras Glek, mozilla-c...@googlegroups.com, Ehsan Akhgari, Steven MacLeod

On March 20, 2014 at 21:56:42 , Gregory Szorc (g...@mozilla.com) wrote:

There were some comments about looking up changesets that are 
descendants/children of existing changesets. 

This kind of operation can be computationally expensive, especially when 
we have thousands of commit "branches." You'll almost certainly want to 
have a database of some kind indexing things. But I encourage you to 
measure this yourself before you reach that conclusion! 

Yeah, I’m looking forward to deal with all those challenging performance bottlenecks :)
A database (or just some cache) would definitely be helpful.
The good part is that because changesets are immutable, I believe we can cache ancestry info indefinitely.

BTW, right now my naive approach to resolving ancestry involves sorting changesets by their local rev numbers.
Is it good enough for our use case or should I search for a different solution?


I started a project over Christmas to index repository data and make it 
available as a web service. The source lives at 
https://bitbucket.org/indygreg/moz-tree-utopia. It's deployed at 
https://hg.stage.mozaws.net/ and http://moztree.stage.mozaws.net/docs/. 

It's backed by a Postgres DB. But, finding all descendants of a single 
commit is very inefficient with a relational database (tons of 
sub/follow-up queries). I've been wanting to port the backend to a graph 
database, like Neo4j. 

Anyway, if you want to lean on the web API for performing queries 
against repository data, go for it. If you have any feature requests for 
new APIs, send a pull request my way or let me know. I could also have 
it start indexing the review repository. Let me know. 

Sweet! I’ll take a look.
I didn’t know you can use ZFS in Ubuntu, that’s cool…

Thanks,
George

Steven MacLeod

unread,
Mar 20, 2014, 4:27:46 PM3/20/14
to George Miroshnykov, Taras Glek, mozilla-c...@googlegroups.com, Ehsan Akhgari
For the most part that sounds good to me, I do have a few issues though:

I don't know if we should have all changesets pre-selected. Changes that have landed but not been pushed to the review server might appear along with a developers push. Also, Taking Ehsan's workflow for example, changes for other work might be pushed along as well and it might be easier to select the single commit that is for this bug, rather than unselect all the unrelated changes.

Maybe we could be clever and autoselect commits with the matching Bug in the commit message? I'll let others weigh in on if we should use a heuristic like that.

Additionally, I think I'd prefer the previous commits, and the new commits, appear side by side, rather than on top of each other. I think trying to match things up and compare the pushes is harder when they are listed like the way you show in the screenshot.

As for drag and drop, I don't really care if it actually is "Drag and Drop", I just want a nice way to quickly match the commits, and some sort of visual indicator for how they are matched. I'm no UX expert, drag and drop was just the first thing that came to my mind :D.

You’re right, technically we’re not losing history.
But unless we’re showing this history somewhere, it will be effectively lost for the reviewer and reviewee.

These "discarded" review requests can still be visited, and they'll be linked to in the change history of the squashed review request as they're removed from the depends on field or the description etc. They aren't hard to get to if someone wants to view them. 

Consider this case:
User pushes three changesets for review.
The first feedback he gets is “please squash all changesets into one”, followed by some other suggestions.
User squashes the changesets as requested and pushes new single squashed changeset for review.
He is presented with the UI listing three existing outdated changesets and the new squashed changeset. He’s not sure what to do now.
Moreover, even if he manages to match one changeset and discard the rest, he now has to go over every discarded changeset’s review request and look for the feedback he may have missed / not yet applied.
He’ll have no indication if this feedback is still relevant or he already fixed the issue in the newer commit(s).
I’m still not sure how we can deal with this...


As for squashing and matching, to cover the 3 commits squashed into one case, in the tool the user would have to match 3 old commits all to the same new commit. We could then just use the most child commit of the three as the review request we update, and discard the other two.

Say we have commits "C<-B<-A", where A is the parent of B, which is the parent of C. and these all become a single D commit, with additional changes. The user would match C, B, and A all to D. We would then use the review request for C as the one to update.

When you visit this consolidated review request and viewed the diff, you would see the entire change of D. But, the way Review Board's interdiffs work, if you look at the interdiff you'd only see changes which weren't in B and C. That is to say, those two commits wouldn't appear as part of the interdiff, just the new changes that were on top of the 3 commits before. This is a result of actually diffing file contents behind the scenes to generate interdiffs.

So, I think that solves the use case of the reviewer knowing, and seeing what has changed. I also think getting much better than this is a "hard problem".

Gregory Szorc

unread,
Mar 20, 2014, 4:35:01 PM3/20/14
to George Miroshnykov, Taras Glek, mozilla-c...@googlegroups.com, Ehsan Akhgari, Steven MacLeod
On 3/20/14, 1:25 PM, George Miroshnykov wrote:
>
> On March 20, 2014 at 21:56:42 , Gregory Szorc (g...@mozilla.com
> <mailto:g...@mozilla.com>) wrote:
>
>> There were some comments about looking up changesets that are
>> descendants/children of existing changesets.
>>
>> This kind of operation can be computationally expensive, especially when
>> we have thousands of commit "branches." You'll almost certainly want to
>> have a database of some kind indexing things. But I encourage you to
>> measure this yourself before you reach that conclusion!
>
> Yeah, I’m looking forward to deal with all those challenging performance
> bottlenecks :)
> A database (or just some cache) would definitely be helpful.
> The good part is that because changesets are immutable, I believe we can
> cache ancestry info indefinitely.

That is correct.

> BTW, right now my naive approach to resolving ancestry involves sorting
> changesets by their local rev numbers.
> Is it good enough for our use case or should I search for a different
> solution?

It is guaranteed that a child will not have a numeric revision less than
it's parent. But, revision N is not necessarily N+1's parent. It's also
not guaranteed that revision N+1 was committed *after* revision N.

Revision numbers are a local repo only identifier. You cannot rely on
those numbers persisting across clones.

I highly recommend that any automated system not store numeric
revisions. Store the full 40 character hex SHA-1 or the 20 byte raw
SHA-1 (not the 12 character abbreviation that commonly appears in the UI).

Keep in mind that Mercurial and Git share almost an identical concept of
how commits are structured. The only major incompatibility is that Git
supports commits with N>2 parents (octopus merges) where Mercurial is
limited to 2 parents.

>> Anyway, if you want to lean on the web API for performing queries
>> against repository data, go for it. If you have any feature requests for
>> new APIs, send a pull request my way or let me know. I could also have
>> it start indexing the review repository. Let me know.
>
> Sweet! I’ll take a look.
> I didn’t know you can use ZFS in Ubuntu, that’s cool…

Performance isn't superb though. Run FreeBSD or Illumos for that. I
really wanted to play around with snapshots and de-dupe but I don't
trust btrfs yet :)

George Miroshnykov

unread,
Mar 20, 2014, 4:48:39 PM3/20/14
to Steven MacLeod, Taras Glek, mozilla-c...@googlegroups.com, Ehsan Akhgari

When I’ve discussed this with Ehsan, I believe we agreed that whenever you push something into review repo, chances that you’ll see some irrelevant commits are extremely small, unless you previously pulled them knowingly.
Most (all?) stuff that will land in mozilla-central should go through the review repo.
Review repo should also periodically pull from mozilla-central, maybe even real-time.
So I think the vast majority of review pushes will only show commits actually intended for review, that’s why I've assumed that pre-selecting them is safe.


Maybe we could be clever and autoselect commits with the matching Bug in the commit message? I'll let others weigh in on if we should use a heuristic like that.

That’s definitely doable.


Additionally, I think I'd prefer the previous commits, and the new commits, appear side by side, rather than on top of each other. I think trying to match things up and compare the pushes is harder when they are listed like the way you show in the screenshot.

Yeah, this is a temporary layout, it wouldn’t fit in two columns, but we’ll figure it out :)


As for drag and drop, I don't really care if it actually is "Drag and Drop", I just want a nice way to quickly match the commits, and some sort of visual indicator for how they are matched. I'm no UX expert, drag and drop was just the first thing that came to my mind :D.

I don’t like the idea of manually matching commits at all, but I don’t have any alternatives to suggest, so for now I’m pushing forward...


You’re right, technically we’re not losing history.
But unless we’re showing this history somewhere, it will be effectively lost for the reviewer and reviewee.

These "discarded" review requests can still be visited, and they'll be linked to in the change history of the squashed review request as they're removed from the depends on field or the description etc. They aren't hard to get to if someone wants to view them. 

Consider this case:
User pushes three changesets for review.
The first feedback he gets is “please squash all changesets into one”, followed by some other suggestions.
User squashes the changesets as requested and pushes new single squashed changeset for review.
He is presented with the UI listing three existing outdated changesets and the new squashed changeset. He’s not sure what to do now.
Moreover, even if he manages to match one changeset and discard the rest, he now has to go over every discarded changeset’s review request and look for the feedback he may have missed / not yet applied.
He’ll have no indication if this feedback is still relevant or he already fixed the issue in the newer commit(s).
I’m still not sure how we can deal with this...


As for squashing and matching, to cover the 3 commits squashed into one case, in the tool the user would have to match 3 old commits all to the same new commit. We could then just use the most child commit of the three as the review request we update, and discard the other two.

Say we have commits "C<-B<-A", where A is the parent of B, which is the parent of C. and these all become a single D commit, with additional changes. The user would match C, B, and A all to D. We would then use the review request for C as the one to update.

When you visit this consolidated review request and viewed the diff, you would see the entire change of D. But, the way Review Board's interdiffs work, if you look at the interdiff you'd only see changes which weren't in B and C. That is to say, those two commits wouldn't appear as part of the interdiff, just the new changes that were on top of the 3 commits before. This is a result of actually diffing file contents behind the scenes to generate interdiffs.

So, I think that solves the use case of the reviewer knowing, and seeing what has changed. I also think getting much better than this is a "hard problem".


I’m not that concerned about interdiffs, but rather about losing the review comments.
Unless we start messing with RB internals and transplanting comments from old review requests to the new ones, I’m not sure what to do about it...

George Miroshnykov

unread,
Mar 20, 2014, 4:54:25 PM3/20/14
to Gregory Szorc, Taras Glek, Steven MacLeod, Ehsan Akhgari, mozilla-c...@googlegroups.com

Yup.


I highly recommend that any automated system not store numeric 
revisions. Store the full 40 character hex SHA-1 or the 20 byte raw 
SHA-1 (not the 12 character abbreviation that commonly appears in the UI). 

Sure, I’m not storing them.
All review requests use full 160-bit node IDs to identify changesets.
For now, all I need is to sort the array of node IDs in some meaningful order :)
Also, I have to find edges (first and last changeset) to generate a squashed diff.

Gregory Szorc

unread,
Mar 20, 2014, 5:04:51 PM3/20/14
to George Miroshnykov, Taras Glek, Steven MacLeod, Ehsan Akhgari, mozilla-c...@googlegroups.com
On 3/20/14, 1:54 PM, George Miroshnykov wrote:
>
> On March 20, 2014 at 22:35:06 , Gregory Szorc (g...@mozilla.com
This is trivial to do with Mercurial:

$ hg log -r '<nodeA>::<nodeB>' --template '{node}\n'

"::" is the DAG range operator. ".." is the same thing.

You can also do things like:

$ hg log -r 'children(cf485c48b52f283ceab119341b965dcf75234721)'

$ hg log -r 'descendants(cf485c48b52f283ceab119341b965dcf75234721)'
which is the same as
$ hg log -r 'cf485c48b52f283ceab119341b965dcf75234721::'

I'm pretty sure there are ways to do some of this with git rev-list. But
there's no denying Mercurial's querying capabilities are much more
robust than Git's.

George Miroshnykov

unread,
Mar 20, 2014, 5:12:21 PM3/20/14
to Gregory Szorc, mozilla-c...@googlegroups.com

Yeah, I already appreciate those querying capabilities :)

The problem is that I have more than two IDs and I don’t know their order in advance.
So first I have to sort them (or at least find the first and the last one) and then I can do the diff.
And for that kind of sorting I’m using local rev numbers, so I hope it’s fine.

Thanks,
George

George Miroshnykov

unread,
Mar 24, 2014, 10:29:56 AM3/24/14
to mozilla-c...@googlegroups.com
Hey everyone,

I’ve updated changeset-chooser to automatically attach squashed diffs to parent review requests.

I had an issue with RB integration - I tried to use base_commit_id to specify the parent commit on which the squashed diff should be applied.
But unfortunately that didn’t work as RB was trying to apply the squashed diff to revision UNKNOWN (i.e. falling back to ‘tip’).
Obviously that didn’t apply cleanly, so viewing squashed diffs failed like this:

To make it work, I had to manually patch RB like this:
As usual, there’s a big chance that I did something wrong and RB itself is fine, so please let me know if I can fix this without that patch.

Here’s an example of the squashed diff:

Known issue: deleting child review requests using “[X]” actually deletes them instead of discarding.
This renders parent review request unusable (error 500), because it still has the reference of the deleted child.
Not sure if this is a RB bug or not.

Next step: actually allow matching existing commits via drag and drop (or some other UI trick)
And rearrange the tables to be side-by-side, not one under another.

Please try it out and let me know what you think.

Cheers,
George

George Miroshnykov

unread,
Mar 25, 2014, 11:45:44 AM3/25/14
to mozilla-c...@googlegroups.com
A small update: you can now safely discard changesets from review (and re-add them later).
Corresponding children review requests are marked as discarded and removed from dependencies of the parent review requests.
All history is preserved, as we’ve discussed before.
Here is the fragment of the parent review request, last changeset was discarded and re-added:

The last major missing feature: matching up changesets after history rewrite.

Please try it out and let me know what you think.

Cheers,
George

ja...@hoppipolla.co.uk

unread,
Mar 27, 2014, 5:41:23 PM3/27/14
to mozilla-c...@googlegroups.com
On Tuesday, 25 March 2014 15:45:44 UTC, George Miroshnykov wrote:

> The last major missing feature: matching up changesets after history rewrite.

Another way to deal with history rewrites is to preserve the original changesets with their associated comments in the review but create an "equivalent merge commit" that represents the effect of the rebase. For a (somewhat complex) example of this see [1]. This approach has the advantage that you keep the comments on the initial commits, no manual work is required to match changesets and you get the chance to explicitly review the changes that went into the rebase.

[1] https://critic.hoppipolla.co.uk/r/684

George Miroshnykov

unread,
Mar 27, 2014, 5:49:31 PM3/27/14
to mozilla-c...@googlegroups.com, ja...@hoppipolla.co.uk
Hello James,

I’m not sure I got your idea right.
Are you suggesting that when the history has been rewritten we should generate a (merge?) commit that will basically undo all previous changes and apply all rewritten changes?

James Graham

unread,
Mar 27, 2014, 6:07:10 PM3/27/14
to mozilla-c...@googlegroups.com
On 27/03/14 21:49, George Miroshnykov wrote:
> Hello James,
>
> I’m not sure I got your idea right. Are you suggesting that when the
> history has been rewritten we should generate a (merge?) commit that
> will basically undo all previous changes and apply all rewritten
> changes?


So let's say that you have a branch

P-B1-B2-B3

where P is the parent of the branch and B1-B3 are the commits on the
branch. Now you do a move type rebase so that you have:

P1-B1'-B2'-B3'

In the code review system this could be represented as

B1-B2-B3-E

Where "E" is an effective merge commit. One way to think about it is
that it's the commit that you would have got if instead of rebasing onto
master you had instead done a merge with master like:

P - (some commits) - P2 --\
\ E
\ - B1 - B2 - B3 -------/

This means that the code review system doesn't represent the "real"
branch history. But as I mentioned before there are some advantages to
this approach; you can review E just like any other commit (and see both
sides of the merge), and you don't have to manually match commits or
move comments around.

George Miroshnykov

unread,
Mar 28, 2014, 6:46:03 AM3/28/14
to mozilla-c...@googlegroups.com, James Graham, eh...@mozilla.com
I think I know what you mean, but unfortunately we can’t use this approach.

In order not to enforce a specific Mercurial workflow upon contributors, we had to make a number of compromises.
Internally, we don’t have a concept of feature branches.
Every push to review repository will be a force-push because:
  1. every commit is done on branch “default”
  2. any push can create a new head (e.g. rewrite history)
The second point is basically a consequence of the first point - every contributor will have a slightly different version of the “default” branch.
With this setup, during a force-push the only usable info we can get from server-side Mercurial hook is the commit ID of the first incoming changeset.
To get the list of pushed commits, we go down the ancestry chain from the first commit to the tip.
At this point, we don’t know anything about previously pushed commits.

Compare this to e.g. GitHub API:
They require you to explicitly tell the head (feature branch name or just a tip commit ID) and the base (the branch you’re changes are based on, e.g. “master”).

As we don’t have a concept of master branch and feature branches, we can’t distinguish between an actual history rewrite and a regular force push.
The only thing we can do is to show both incoming commits and the commits that were previously associated with a given bug number.

Personally I don’t like this approach, so if anyone has a better idea on how to implement this - that would be much appreciated.

Thanks,
George

James Graham

unread,
Mar 28, 2014, 7:39:44 AM3/28/14
to mozilla-c...@googlegroups.com
On 28/03/14 10:46, George Miroshnykov wrote:
> I think I know what you mean, but unfortunately we can’t use this approach.
>
> In order not to enforce a specific Mercurial workflow upon contributors, we had to make a number of compromises.

FWIW I think "don't require a particular workflow" should be a non-goal.
We shouldn't prefer a tool that provides mediocre support for an
ill-defined variety of workflows compared to a tool that provides
excellent support for a small number of well-defined workflows.

> Internally, we don’t have a concept of feature branches.
> Every push to review repository will be a force-push because:
> every commit is done on branch “default”

I don't really understand how the tool works then? Having each review be
a single branch in the VCS seems like the most obvious design.

> any push can create a new head (e.g. rewrite history)
> The second point is basically a consequence of the first point - every contributor will have a slightly different version of the “default” branch.
> With this setup, during a force-push the only usable info we can get from server-side Mercurial hook is the commit ID of the first incoming changeset.
> To get the list of pushed commits, we go down the ancestry chain from the first commit to the tip.
> At this point, we don’t know anything about previously pushed commits.
>
> Compare this to e.g. GitHub API:
> http://developer.github.com/v3/pulls/#create-a-pull-request
> They require you to explicitly tell the head (feature branch name or just a tip commit ID) and the base (the branch you’re changes are based on, e.g. “master”).
>
> As we don’t have a concept of master branch and feature branches, we can’t distinguish between an actual history rewrite and a regular force push.
> The only thing we can do is to show both incoming commits and the commits that were previously associated with a given bug number.

FWIW in critic all force-pushes to the review repo are explicitly
disabled unless you tell it that you are about to perform a rebase. When
you do perform a rebase you have to manually tell it what commit you are
rebasing onto.

George Miroshnykov

unread,
Mar 28, 2014, 8:21:21 AM3/28/14
to mozilla-c...@googlegroups.com, James Graham

On March 28, 2014 at 1:39:49 PM, James Graham (ja...@hoppipolla.co.uk) wrote:

On 28/03/14 10:46, George Miroshnykov wrote: 
> I think I know what you mean, but unfortunately we can’t use this approach. 
> 
> In order not to enforce a specific Mercurial workflow upon contributors, we had to make a number of compromises. 

FWIW I think "don't require a particular workflow" should be a non-goal. 
We shouldn't prefer a tool that provides mediocre support for an 
ill-defined variety of workflows compared to a tool that provides 
excellent support for a small number of well-defined workflows. 

I completely agree with you here, but unfortunately I’m not in position to make decisions.
Let’s hear out what others have to say.


> Internally, we don’t have a concept of feature branches. 
> Every push to review repository will be a force-push because: 
> every commit is done on branch “default” 

I don't really understand how the tool works then? Having each review be 
a single branch in the VCS seems like the most obvious design. 

It sure is the most obvious design approach.
And the most familiar to GitHub / Bitbucket pull request users too.
Unfortunately, we don’t have lightweight branches in Mercurial and I’m told that bookmarks are not used widely enough among mozillians to make them mandatory.


> any push can create a new head (e.g. rewrite history) 
> The second point is basically a consequence of the first point - every contributor will have a slightly different version of the “default” branch. 
> With this setup, during a force-push the only usable info we can get from server-side Mercurial hook is the commit ID of the first incoming changeset. 
> To get the list of pushed commits, we go down the ancestry chain from the first commit to the tip. 
> At this point, we don’t know anything about previously pushed commits. 
> 
> Compare this to e.g. GitHub API: 
> http://developer.github.com/v3/pulls/#create-a-pull-request 
> They require you to explicitly tell the head (feature branch name or just a tip commit ID) and the base (the branch you’re changes are based on, e.g. “master”). 
> 
> As we don’t have a concept of master branch and feature branches, we can’t distinguish between an actual history rewrite and a regular force push. 
> The only thing we can do is to show both incoming commits and the commits that were previously associated with a given bug number. 

FWIW in critic all force-pushes to the review repo are explicitly 
disabled unless you tell it that you are about to perform a rebase. When 
you do perform a rebase you have to manually tell it what commit you are 
rebasing onto. 

I found the docs:
https://critic-review.org/tutorial?item=rebase
That looks interesting and well-thought.
I have to think this over.
As we’re not using branches, but rather arrays of arbitrary commits, I’m not yet sure we can implement something like that as is.

Thanks,
George

Gregory Szorc

unread,
Mar 28, 2014, 5:57:39 PM3/28/14
to George Miroshnykov, mozilla-c...@googlegroups.com, James Graham
On 3/28/14, 5:21 AM, George Miroshnykov wrote:
>>
>> > Internally, we don’t have a concept of feature branches.
>> > Every push to review repository will be a force-push because:
>> > every commit is done on branch “default”
>>
>> I don't really understand how the tool works then? Having each review be
>> a single branch in the VCS seems like the most obvious design.
>
> It sure is the most obvious design approach.
> And the most familiar to GitHub / Bitbucket pull request users too.
> Unfortunately, we don’t have lightweight branches in Mercurial and I’m
> told that bookmarks are not used widely enough among mozillians to make
> them mandatory.

The statement "we don't have lightweight branches in Mercurial" is
completely false. Mercurial's "branches" are actually lighter than Git's!

Here is how you create a super light branch in Mercurial:

$ hg init
$ touch foo
$ hg commit -A -m 'Initial commit'
# Committed rev 0
$ echo "1" >> foo
$ hg commit -A -m 'Second commit'
# Committed rev 1
$ hg up 0
$ echo "2" >> foo
$ hg commit -A -m 'Third commit'
created new head

# You just created a new branch!

$ hg log --graph
@ changeset: 2:75926c90de5c
| tag: tip
| parent: 0:7e3d21fdb142
| user: Gregory Szorc <g...@mozilla.com>
| date: Fri Mar 28 14:51:58 2014 -0700
| summary: Third commit
|
| o changeset: 1:5a028f533db5
|/ user: Gregory Szorc <g...@mozilla.com>
| date: Fri Mar 28 14:51:14 2014 -0700
| summary: Second commit
|
o changeset: 0:7e3d21fdb142
user: Gregory Szorc <g...@mozilla.com>
date: Fri Mar 28 14:51:04 2014 -0700
summary: Initial commit

These branches are lighter than Git's because you don't need to
associate a named reference to them. Yes, you can do this in Git, but
the commit will get garbage collected because nobody is holding a ref to
it. And, if you don't know the commit, it will be very difficult to find
again.

Now, these unnamed branches aren't very useful because you need to
remember the rev/node to access them. That's why Mercurial has bookmarks.

$ hg up 1
$ hg bookmark gps/my-feature
$ echo "3" >> foo
$ hg commit -A -m "Fourth Commit"

$ hg log --graph

@ changeset: 3:56a5b19ec5c0
| bookmark: gps/my-feature
| tag: tip
| parent: 1:5a028f533db5
| user: Gregory Szorc <g...@mozilla.com>
| date: Fri Mar 28 14:54:50 2014 -0700
| summary: Fourth commit
|
| o changeset: 2:75926c90de5c
| | parent: 0:7e3d21fdb142
| | user: Gregory Szorc <g...@mozilla.com>
| | date: Fri Mar 28 14:51:58 2014 -0700
| | summary: Third commit
| |
o | changeset: 1:5a028f533db5
|/ user: Gregory Szorc <g...@mozilla.com>
| date: Fri Mar 28 14:51:14 2014 -0700
| summary: Second commit
|
o changeset: 0:7e3d21fdb142
user: Gregory Szorc <g...@mozilla.com>
date: Fri Mar 28 14:51:04 2014 -0700
summary: Initial commit

I do *all* of my feature development using bookmarks this way.

The second part of the statement "bookmarks are not used widely enough
among mozillians to make them mandatory" is likely correct. AFAICT most
Mozillians are still using mq. It is my personal mission to help move
more people to bookmarks and/or branches. But it will likely take a long
time to change the culture, sadly.

Gregory Szorc

unread,
Mar 28, 2014, 6:23:29 PM3/28/14
to James Graham, mozilla-c...@googlegroups.com
On 3/28/14, 4:39 AM, James Graham wrote:
> On 28/03/14 10:46, George Miroshnykov wrote:
>> I think I know what you mean, but unfortunately we can’t use this
>> approach.
>>
>> In order not to enforce a specific Mercurial workflow upon
>> contributors, we had to make a number of compromises.
>
> FWIW I think "don't require a particular workflow" should be a non-goal.
> We shouldn't prefer a tool that provides mediocre support for an
> ill-defined variety of workflows compared to a tool that provides
> excellent support for a small number of well-defined workflows.
>
>> Internally, we don’t have a concept of feature branches.
>> Every push to review repository will be a force-push because:
>> every commit is done on branch “default”
>
> I don't really understand how the tool works then? Having each review be
> a single branch in the VCS seems like the most obvious design.

We need to get our terminology in order.

"branch" is overloaded. It means different things depending on the context.

At the core of Mercurial and Git is a very similar mechanism for
representing commits: a directional graph. The simplest graph in linear:

A --> B --> C

Here, each commit has 1 parent and 1 child. Simple.

You have cases where a commit can have multiple children:

A --> B --> C
\
D

C and D have no children, so they are referred to as "heads."

That's how DAG representation in Git *and* Mercurial work.

Now, A B C D in reality are SHA-1 hashes. e.g.
2d2f3e3ebcf2418fede810b5def66300b65b68d8. But no human wants to remember
those, so version control systems assign human-friendly names to them.

This is where Git and Mercurial start to diverge.

Git maintains a very simple mapping of human friendly names to SHA-1. It
calls these refs. A special class of refs is called "branches." A Git
branch is just a named pointer to SHA-1 of a commit. Git also imposes
some commands and logic around manipulating branches/refs. You all know
how that works (I think).

Mercurial maintains a very simple mapping of human friendly names to
SHA-1. Mercurial calls these bookmarks. They behave very similarly to
Git branches: they are just movable labels associated with a single commit.

Mercurial maintains *another* commit tracking mechanism where a
human-friendly label is attached to each commit. Unlike Git
branches/refs and Mercurial bookmarks where the label is free to move
around, this label is affixed as part of the core data in the actual
commit and thus participates in the SHA-1. Mercurial calls this heavier
tracking mechanism "branches."

Mercurial maintains yet another commit management system called mq.
Here, instead of adding more and more commits into the DAG (like Git and
Mercurial with bookmarks or branches), patches/commits are standalone
files that are temporarily added and removed from the DAG. These commits
move around depending on where you push or pop them. Each mq patch has a
name attached to it. That name identifies the commit in the DAG. You
can't move the name around. It only applies to one commit.

As you can see, Mercurial has effectively 3 ways you can use names to
refer to commits/heads. We should be able to use the name of the
bookmark, branch, or mq patch to derive the review identifier. This is
very similar to using Git branch names to drive the review identifier.
The only difference is we may need a little client side magic to make
that process seamless, because doing things like pushing the
locally-applied mq patch to a bookmark/branch of the same name on a
remote is not trivial without an alias, short shell script, or extension.

George Miroshnykov

unread,
Mar 31, 2014, 1:47:42 PM3/31/14
to mozilla-c...@googlegroups.com
So the last feature is ready: rewriting history
Drag and drop would take too much time to develop properly, so I’ve settled for a simpler UI for now.
When you select exactly one changeset, you can see an extra button [R] appear next to each of the old changesets:

Clicking that button will replace the corresponding old changeset with the selected one, after confirmation:

This will allow you to deal with history rewrites of a single changeset.
As discussed before, if you have multiple changesets squashed into one, you’ll have to pick one changeset to match and discard the rest.

Looking forward for your feedback!

Thanks,
George

Ehsan Akhgari

unread,
Mar 31, 2014, 2:34:55 PM3/31/14
to George Miroshnykov, mozilla-c...@googlegroups.com, James Graham
On Fri, Mar 28, 2014 at 8:21 AM, George Miroshnykov <gmiros...@rebbix.com> wrote:

On March 28, 2014 at 1:39:49 PM, James Graham (ja...@hoppipolla.co.uk) wrote:

On 28/03/14 10:46, George Miroshnykov wrote: 
> I think I know what you mean, but unfortunately we can’t use this approach. 
> 
> In order not to enforce a specific Mercurial workflow upon contributors, we had to make a number of compromises. 

FWIW I think "don't require a particular workflow" should be a non-goal. 
One of the goals behind this project is building a review workflow which works for a number of different types of workflows that people use here at Mozilla, and as a result, we need to focus on the lowest common denominator, hence the "don't require a particular workflow" goal.  If you disagree with that goal, it's fine, but you should keep in mind that is what we're building here.
 
We shouldn't prefer a tool that provides mediocre support for an 
ill-defined variety of workflows compared to a tool that provides 
excellent support for a small number of well-defined workflows. 

I completely agree with you here, but unfortunately I’m not in position to make decisions.
Let’s hear out what others have to say.

James, I assume you're comparing what we are building here to Opera Critic.  Critic does enforce a number of things on your workflow (for example, the way that it requires you to do rebases when you want to submit the rebased result for review, or obviously using git, etc.)  As a result, a number of choices that make sense for Critic don't necessarily make sense for us.  If you have better suggestions for a push-based workflow which supports all of mq, hg with bookmarks, and git branches, I'm all ears.  These kinds of comparisons to Critic without remembering the goals of this project are not really helpful.

> Internally, we don’t have a concept of feature branches. 
> Every push to review repository will be a force-push because: 
> every commit is done on branch “default” 

I don't really understand how the tool works then? Having each review be 
a single branch in the VCS seems like the most obvious design. 

It sure is the most obvious design approach.
And the most familiar to GitHub / Bitbucket pull request users too.
Unfortunately, we don’t have lightweight branches in Mercurial and I’m told that bookmarks are not used widely enough among mozillians to make them mandatory.

Yes, bookmarks are used by *very* few people at Mozilla.  I cannot overstate this.  No matter what our personal preferences are, the facts on the ground is that the majority of our developers using hg are using mq for their local development.  Greg may care about this so much that he may spend the time and energy to convince people that bookmarks are better and get them to switch over, but that is _not_ my goal here.  We would have a much easier time if we lived in that world today (because basically hg bookmarks are conceptually a clone of git branches done in hg.)

As the goal of this project is building a review workflow that all Mozilla developers can switch to easily with minimal friction, reiterating how github and bitbucket do things without keeping this in mind always gets us back to the same issue: these systems are trying to solve different problems.

> any push can create a new head (e.g. rewrite history) 
> The second point is basically a consequence of the first point - every contributor will have a slightly different version of the “default” branch. 
> With this setup, during a force-push the only usable info we can get from server-side Mercurial hook is the commit ID of the first incoming changeset. 
> To get the list of pushed commits, we go down the ancestry chain from the first commit to the tip. 
> At this point, we don’t know anything about previously pushed commits. 
> 
> Compare this to e.g. GitHub API: 
> http://developer.github.com/v3/pulls/#create-a-pull-request 
> They require you to explicitly tell the head (feature branch name or just a tip commit ID) and the base (the branch you’re changes are based on, e.g. “master”). 
> 
> As we don’t have a concept of master branch and feature branches, we can’t distinguish between an actual history rewrite and a regular force push. 
> The only thing we can do is to show both incoming commits and the commits that were previously associated with a given bug number. 

FWIW in critic all force-pushes to the review repo are explicitly 
disabled unless you tell it that you are about to perform a rebase. When 
you do perform a rebase you have to manually tell it what commit you are 
rebasing onto. 

I found the docs:
https://critic-review.org/tutorial?item=rebase
That looks interesting and well-thought.
I have to think this over.
As we’re not using branches, but rather arrays of arbitrary commits, I’m not yet sure we can implement something like that as is.

The mq use case only does rebases in a loose conceptual sense.  That is one big difference with our workflow and Critic's.  In addition, requiring people to let the code review tool know when they're planning to do a local rebase adds one additional step to the workflow, and I don't think there is a good reason why we would want to do that.

Cheers,
Ehsan

Gregory Szorc

unread,
Mar 31, 2014, 5:12:06 PM3/31/14
to Ehsan Akhgari, George Miroshnykov, mozilla-c...@googlegroups.com, James Graham
On 3/31/14, 11:34 AM, Ehsan Akhgari wrote:
> On Fri, Mar 28, 2014 at 8:21 AM, George Miroshnykov
> <gmiros...@rebbix.com <mailto:gmiros...@rebbix.com>> wrote:
>
>
> On March 28, 2014 at 1:39:49 PM, James Graham
> (ja...@hoppipolla.co.uk <mailto:ja...@hoppipolla.co.uk>) wrote:
>
>> On 28/03/14 10:46, George Miroshnykov wrote:
>> > I think I know what you mean, but unfortunately we can’t use this approach.
>> >
>> > In order not to enforce a specific Mercurial workflow upon contributors, we had to make a number of compromises.
>>
>> FWIW I think "don't require a particular workflow" should be a
>> non-goal.
>
> One of the goals behind this project is building a review workflow which
> works for a number of different types of workflows that people use here
> at Mozilla, and as a result, we need to focus on the lowest common
> denominator, hence the "don't require a particular workflow" goal. If
> you disagree with that goal, it's fine, but you should keep in mind that
> is what we're building here.

I somewhat disagree with the statement "we need to focus on the lowest
common denominator." The trap here is that by focusing on the "lowest
common denominator," we'd be building a tool that works for everyone but
it sucks or isn't efficient.

I believe our current approach to Mercurial is doing just this. We've
effectively said "the way we can support bookmarks, branches, mq, and
anonymous heads at the same time is if we pass an environment variable
to the server." That solution - the lowest common denominator - sucks.

While we could (should?) support environment variables as a solution, I
think we should go out of our way to support more intuitive and
efficient mechanisms. Here's how.

1. For people doing bookmark-based development, we say "push your
bookmark to the server and a review will be initiated." The review ID is
the name of the bookmark. From client side, this is `hg push -B
<bookmark>`. Simple.

2. For people doing mq-based development, we say "install this simple
extension that allows you to run `hg push review`." Under the hood, that
extension exports a bookmark to the remote having the same name as the
current tip-most applied mq patch. The argument "we don't want to
require an extension" is inconsistent with the fact that mq is itself an
extension (albeit a built-in one) that people have already activated.
One-time extension installation is not a big deal. People will do it in
order to have better code review.

3. For people doing anonymous heads (I doubt there are many), we say
"specify an environment variable" or "install this extension."

And for the record, one of my unofficial goals is to get more Mozilla
developers off mq and using bookmarks. I'm bringing in the developer
behind changeset evolution for a brown bag in April. Hopefully we'll be
able to drum up more interest with that.

Ehsan Akhgari

unread,
Mar 31, 2014, 5:43:30 PM3/31/14
to Gregory Szorc, George Miroshnykov, mozilla-c...@googlegroups.com, James Graham
On Mon, Mar 31, 2014 at 5:12 PM, Gregory Szorc <g...@mozilla.com> wrote:
On 3/31/14, 11:34 AM, Ehsan Akhgari wrote:
On Fri, Mar 28, 2014 at 8:21 AM, George Miroshnykov
<gmiros...@rebbix.com <mailto:gmiroshnykov@rebbix.com>> wrote:


    On March 28, 2014 at 1:39:49 PM, James Graham
    (ja...@hoppipolla.co.uk <mailto:ja...@hoppipolla.co.uk>) wrote:

    On 28/03/14 10:46, George Miroshnykov wrote:
    > I think I know what you mean, but unfortunately we can’t use this approach.
    >
    > In order not to enforce a specific Mercurial workflow upon contributors, we had to make a number of compromises.

    FWIW I think "don't require a particular workflow" should be a
    non-goal.

One of the goals behind this project is building a review workflow which
works for a number of different types of workflows that people use here
at Mozilla, and as a result, we need to focus on the lowest common
denominator, hence the "don't require a particular workflow" goal.  If
you disagree with that goal, it's fine, but you should keep in mind that
is what we're building here.

I somewhat disagree with the statement "we need to focus on the lowest common denominator." The trap here is that by focusing on the "lowest common denominator," we'd be building a tool that works for everyone but it sucks or isn't efficient.

I agree with the general sentiment if we're talking in the abstract, sometimes focusing on the lowest common denominator is not a good idea.  But we're not talking about anything abstract here, we're talking about a very concrete system, so if you have specific issues about how this system handles a given use case, let's discuss those.

In order for us to have a useful discussion, you need to specify what things about this tool are inefficient for people and how it fails to deliver a good experience.
 
I believe our current approach to Mercurial is doing just this. We've effectively said "the way we can support bookmarks, branches, mq, and anonymous heads at the same time is if we pass an environment variable to the server." That solution - the lowest common denominator - sucks.

See, this statement is not really helpful: "That solution - the lowest common denominator - sucks."  What is it exactly about that solution which "sucks"?  And what is the problem exactly?  "sucks" is hardly actionable.  I suspect that a number of grievances that people are raising here are things that are actually non-issues.  The three points below are great examples of that in fact!
 
While we could (should?) support environment variables as a solution, I think we should go out of our way to support more intuitive and efficient mechanisms. Here's how.

1. For people doing bookmark-based development, we say "push your bookmark to the server and a review will be initiated." The review ID is the name of the bookmark. From client side, this is `hg push -B <bookmark>`. Simple.

2. For people doing mq-based development, we say "install this simple extension that allows you to run `hg push review`." Under the hood, that extension exports a bookmark to the remote having the same name as the current tip-most applied mq patch. The argument "we don't want to require an extension" is inconsistent with the fact that mq is itself an extension (albeit a built-in one) that people have already activated. One-time extension installation is not a big deal. People will do it in order to have better code review.

3. For people doing anonymous heads (I doubt there are many), we say "specify an environment variable" or "install this extension."

You're assuming that the environment variable we have here is something that the users will see.  That is not the case.  Here is how the user visible UI will work:

./mach review [bug_number]

If bug_number is omitted, we can make this client command do the following:

1. Look to see if you're using hg bookmarks, in which case, see if the name of your bookmark is a 6-7 digit integer, in that case assume it is the bug number.
2. Look to see if you're using mq, in which case, look at the commit message of your patches in mq, look for patterns similar to "Bug 123456" and whatnot, and try to extract a bug number from that.  (and perhaps other heuristics)
3. If you're on an anonymous head or if the above heauristics fail, show an error and ask the user to specify the bug number.
4. Once you have the bug number somehow, submit it through the environment variable.
5. (In the future) add similar smarts for git users.

The environment variable we're using is just an implementation detail.  The server side system is intentionally not trying to outsmart the client and therefore does not try to use any heuristics.  This means if some day someone comes around with yet another idea on how to determine the bug number, we will not need to change anything on the server side, and can just optimize the heuristics on the client side.

That is a good thing in my opinion.

(On your comment about extensions, did you see my latest reply here https://github.com/ehsan/reviewtool/issues/2?  I am not religiously opposed to extensions where they are useful, I'm opposed to forcing people to use them when they are unnecessary.  Delivering one integer from the client side to the server side is one such case, a mercurial extension is not necessary for that.  *Could* we implement this as a mercurial extension?  I have no doubt.  Would that buy us anything over the environment variable?  I don't think so.)
 
And for the record, one of my unofficial goals is to get more Mozilla developers off mq and using bookmarks. I'm bringing in the developer behind changeset evolution for a brown bag in April. Hopefully we'll be able to drum up more interest with that.

Please do that.  :-)  mq is one of the worst things out there about Mercurial, and I cannot wait for the day that it dies and nobody uses it (either at Mozilla or elsewhere!)  The reason why I want to support that workflow is not because I'm a fan of the fact that people are using it, but they're using it today.

Seriously, I cannot wait for the day when nobody at Mozilla uses mq for anything, it's just that I don't like to force this decision down on people today because they want to use a new shiny review tool.  The right thing to do is to publicize bookmarks as a better alternative and get people to switch to them gradually in order to get the benefits of not using mq.

And what we've built here is of course compatible with bookmarks as well, so things will keep working for people when they switch to use bookmarks.  But still for example the idea of reading the bug# from the bookmark means more than just forcing people to use bookmarks, it's forcing a very specific bookmark based workflow that is one bookmark per bug (aka git's famous one branch per bug/feature).  But in my experience, because of things such as slow build times, that workflow while ideal on paper doesn't work out well for everybody in the long run.  For example, I switched away from that workflow myself a while ago after using it for perhaps around a couple of years or more, just because switching from one branch to the other every time meant waiting several minutes for a build.  My current workflow is to develop almost everything on a single dev branch, and only switch to a new branch when doing some really disruptive work.  Had a tool at Mozilla forced me to use the one branch per bug workflow, I would be a less productive developer as a result (because I would need to spend more time waiting for builds all the time).

Hope this sheds some light on my viewpoint here.

Cheers,

Gregory Szorc

unread,
Mar 31, 2014, 6:06:24 PM3/31/14
to Ehsan Akhgari, George Miroshnykov, mozilla-c...@googlegroups.com, James Graham
On 3/31/14, 2:43 PM, Ehsan Akhgari wrote:
> On Mon, Mar 31, 2014 at 5:12 PM, Gregory Szorc <g...@mozilla.com
> <mailto:g...@mozilla.com>> wrote:
>
> On 3/31/14, 11:34 AM, Ehsan Akhgari wrote:
>
> On Fri, Mar 28, 2014 at 8:21 AM, George Miroshnykov
> <gmiros...@rebbix.com <mailto:gmiros...@rebbix.com>
> <mailto:gmiroshnykov@rebbix.__com
> <mailto:gmiros...@rebbix.com>>> wrote:
>
>
> On March 28, 2014 at 1:39:49 PM, James Graham
> (ja...@hoppipolla.co.uk <mailto:ja...@hoppipolla.co.uk>
> <mailto:ja...@hoppipolla.co.uk
I enjoyed reading that reply. Lots of good content. Most of it great to
hear!

If we're keying reviews off bug #'s, that seems to limit workflows quite
substantially. I often have heads involving patches from multiple bugs
(similar to you it sounds like). I get reviews "bottom up." What I
really want are feature-based branches/reviews where bug #'s merely fall
out of commit messages. I don't want rebases changing patch order to
lead to confusion in the review tool because a new bug # is on top of
the review "stack."

Furthermore, (and we can both relate to this as two of Mozilla's most
prolific patch writers - you were #2 in 2013 (only behind bholley) and I
was #8), I think Bugzilla overhead is a real thing. I foresee the day
where we loosen our requirements on bugs needing to exist for pretty
much every change. IMO, if an appropriate person signs off on a patch,
no bug is needed. I feel there are many "trivial" patches I'm leaving on
the table because I don't want to deal with overhead: I just want to
write a "10 second" patch and ask for review "10 second review." We
can't easily do this today because we are using Bugzilla for reviews.
(The overhead for bug management dominates patch writing and review
time.) But once reviews are moved elsewhere, Bugzilla's necessity wanes.
What I'm trying to say is that limiting our review tracking to bug
numbers (and thus requiring Bugzilla) may be a bit shortsighted. I hope
we can use arbitrary identifiers for tracking reviews (e.g.
branch/bookmark/mq names) so as to not require Bugzilla down the road.
(I understand this opinion might be controversial. But it's one I'm
willing to fight for in the name of increased productivity.)

James Graham

unread,
Mar 31, 2014, 6:08:36 PM3/31/14
to Ehsan Akhgari, George Miroshnykov, mozilla-c...@googlegroups.com
On 31/03/14 19:34, Ehsan Akhgari wrote:
> On Fri, Mar 28, 2014 at 8:21 AM, George Miroshnykov <gmiros...@rebbix.com
>> wrote:
>
>>
>> On March 28, 2014 at 1:39:49 PM, James Graham (ja...@hoppipolla.co.uk)
>> wrote:
>>
>> On 28/03/14 10:46, George Miroshnykov wrote:
>>> I think I know what you mean, but unfortunately we can’t use this
>> approach.
>>>
>>> In order not to enforce a specific Mercurial workflow upon contributors,
>> we had to make a number of compromises.
>>
>> FWIW I think "don't require a particular workflow" should be a non-goal.
>>
>> One of the goals behind this project is building a review workflow which
> works for a number of different types of workflows that people use here at
> Mozilla, and as a result, we need to focus on the lowest common
> denominator, hence the "don't require a particular workflow" goal. If you
> disagree with that goal, it's fine, but you should keep in mind that is
> what we're building here.

Well I don't really understand how we've arrived at the conclusion that
supporting arbitrary workflows is an immovable requirement, even if it
comes at the expense of the overall quality of the tool. To me it feels
like we are selling out the long-term productivity of the organization
in order to avoid some short term pain in getting people to adapt. I
would be much more sympathetic to this if there wasn't ample evidence
that people are willing and able to change VCS habits in order to get a
better experience and/or to meet the requirements of the project that
they are working with. For example GitHub has a workflow that is very
different to the patch based system that was common in pre-VCS days, but
the enormous success of GitHub doesn't seem to have been affected by the
fact that it insists that you use branches + PRs for development and
doesn't also provide the option to upload diff files, for example. Even
within Mozilla, people seem to cope with having some things in Mercurial
and other things on GitHub. Although it creates some process problems,
the developers themselves seem to be able to make the transitions back
and forth rather smoothly. And new people coming to the project are
substantially more likely to have experience of branch-based development
than mq-based development.

I am particularly troubled by mq because it feels like Mozilla might be
one of the few organisations still using the tool. At least whenever you
search for mq help on Google, you have about a 50% chance of getting a
MDN page as the top result. It also seems like the wider mercurial
community are starting to think that it was a bad idea. Therefore it
seems all too plausible that it will be increasingly marginalised in the
future, and we will be left depending on a tool that no one else is
interested in.

> As the goal of this project is building a review workflow that all Mozilla
> developers can switch to easily with minimal friction, reiterating how
> github and bitbucket do things without keeping this in mind always gets us
> back to the same issue: these systems are trying to solve different
> problems.

Without wanting to suggest that we copy the GitHub or Bitbucket systems
directly (we shouldn't; we can do substantially better), they aren't
solving a different problem. They even had the same initial problem we
had (how do we move users used to a centralised, patch-based workflow to
a DVCS workflow). I really think we should try to design the best
possible thing for a branchy development model and then if we find we
*can't* get people to switch how they work, add mq support, rather than
building the entire product around the assumption that people will
refuse to try something new even if it offers considerable benefits.

> The mq use case only does rebases in a loose conceptual sense. That is one
> big difference with our workflow and Critic's. In addition, requiring
> people to let the code review tool know when they're planning to do a local
> rebase adds one additional step to the workflow, and I don't think there is
> a good reason why we would want to do that.

Yes, having to tell critic you are doing a rebase is annoying. But it's
substantially less annoying than it would be to have to map each commit
in the new branch to a commit in the old branch by hand. In general
rebases should be relatively rare (because they alter published history
and create work for all the people collaborating on a branch), so having
one manual step isn't *too* bad.

FWIW I am trying to stand up a critic system (not connected to github)
with the gecko-dev repo, just so that people can try it out and see what
the workflow's like.

Gregory Szorc

unread,
Mar 31, 2014, 6:28:22 PM3/31/14
to James Graham, Ehsan Akhgari, George Miroshnykov, mozilla-c...@googlegroups.com
It's worth reminding people that manual notification/tracking of rebases
should be a thing of the past with Mercurial's changeset evolution. The
Mercurial repo will know that rebased changeset Y is a logical successor
of changeset X. The review tool can then query this metadata and update
the review accordingly. All automatically.

This is a game changer for version control. It should mitigate the
reasons behind "force push considered evil" and allow people to
collaborate on "published" history ("published" in quotes because that
term is overloaded in Mercurial).

Of course, Git users will still likely need manual intervention to teach
the review tool about rebases. That's the bed they choose to sleep in, I
suppose.

Ehsan Akhgari

unread,
Mar 31, 2014, 6:30:40 PM3/31/14
to Gregory Szorc, George Miroshnykov, mozilla-c...@googlegroups.com, James Graham
On Mon, Mar 31, 2014 at 6:06 PM, Gregory Szorc <g...@mozilla.com> wrote:
On 3/31/14, 2:43 PM, Ehsan Akhgari wrote:
On Mon, Mar 31, 2014 at 5:12 PM, Gregory Szorc <g...@mozilla.com
<mailto:g...@mozilla.com>> wrote:

    On 3/31/14, 11:34 AM, Ehsan Akhgari wrote:

        On Fri, Mar 28, 2014 at 8:21 AM, George Miroshnykov
        <mailto:gmiroshnykov@rebbix.__com

Glad to hear that!
 
If we're keying reviews off bug #'s, that seems to limit workflows quite substantially. I often have heads involving patches from multiple bugs (similar to you it sounds like). I get reviews "bottom up." What I really want are feature-based branches/reviews where bug #'s merely fall out of commit messages. I don't want rebases changing patch order to lead to confusion in the review tool because a new bug # is on top of the review "stack."

That is very similar to my workflow indeed (except for the part about the order in which you ask for review, I pretty much use a random order.)

I think what you want though is simply achievable on top of what we have.  Is that not the case?
 
Furthermore, (and we can both relate to this as two of Mozilla's most prolific patch writers - you were #2 in 2013 (only behind bholley) and I was #8), I think Bugzilla overhead is a real thing. I foresee the day where we loosen our requirements on bugs needing to exist for pretty much every change. IMO, if an appropriate person signs off on a patch, no bug is needed. I feel there are many "trivial" patches I'm leaving on the table because I don't want to deal with overhead: I just want to write a "10 second" patch and ask for review "10 second review." We can't easily do this today because we are using Bugzilla for reviews. (The overhead for bug management dominates patch writing and review time.) But once reviews are moved elsewhere, Bugzilla's necessity wanes. What I'm trying to say is that limiting our review tracking to bug numbers (and thus requiring Bugzilla) may be a bit shortsighted. I hope we can use arbitrary identifiers for tracking reviews (e.g. branch/bookmark/mq names) so as to not require Bugzilla down the road.
(I understand this opinion might be controversial. But it's one I'm willing to fight for in the name of increased productivity.)

That is not controversial at all!  I definitely feel the pain of a high overhead into getting simple stuff checked in.  But eliminating Bugzilla is not necessary (and neither the only solution) to fix that problem.  FWIW in the olden days (close to when Mozilla was open sourced first) a lot of people used to commit a whole bunch of changes with unrelated bugs and play it loose.  I'm sure they all managed to check in their stuff quicker, but I've come across the case where I found a bug in some old code, and have looked at the bug where that change was first introduced, just to see that the bug either contains no information relevant to the change, or that someone crammed in a large refactoring on top of a simple crash fix (presumably because they were too lazy to file a bug for it or something.)

In my future glorious world though, that will all go away, and you'd be able to do:

$ # make trivial change
$ ./mach review filebug # Get mach to file a bug for you before submitting the review request
$ # profit

With Autoland, that will literally be the only work you need to do.  Tools will finish all of the non-interesting parts of the process (such as filing the bug, etc.)

And again, that is all possible on top of what we're building here.

James Graham

unread,
Mar 31, 2014, 6:32:56 PM3/31/14
to mozilla-c...@googlegroups.com
On 31/03/14 23:06, Gregory Szorc wrote:

> Furthermore, (and we can both relate to this as two of Mozilla's most
> prolific patch writers - you were #2 in 2013 (only behind bholley) and I
> was #8), I think Bugzilla overhead is a real thing. I foresee the day
> where we loosen our requirements on bugs needing to exist for pretty
> much every change. IMO, if an appropriate person signs off on a patch,
> no bug is needed. I feel there are many "trivial" patches I'm leaving on
> the table because I don't want to deal with overhead: I just want to
> write a "10 second" patch and ask for review "10 second review." We
> can't easily do this today because we are using Bugzilla for reviews.
> (The overhead for bug management dominates patch writing and review
> time.) But once reviews are moved elsewhere, Bugzilla's necessity wanes.
> What I'm trying to say is that limiting our review tracking to bug
> numbers (and thus requiring Bugzilla) may be a bit shortsighted. I hope
> we can use arbitrary identifiers for tracking reviews (e.g.
> branch/bookmark/mq names) so as to not require Bugzilla down the road.
> (I understand this opinion might be controversial. But it's one I'm
> willing to fight for in the name of increased productivity.)

FWIW I strongly agree with this. With a trivial change and the reviewers
and patch author all well coordinated, it should be possible to get from
commit to review and checkin in around a minute. Insisting on
heavyweight bugzilla integration prevents this and really isn't
necessary when you consider the review system itself to be the source of
truth about changes to the code.

Ehsan Akhgari

unread,
Mar 31, 2014, 6:55:35 PM3/31/14
to James Graham, George Miroshnykov, mozilla-c...@googlegroups.com
On Mon, Mar 31, 2014 at 6:08 PM, James Graham <ja...@hoppipolla.co.uk> wrote:
On 31/03/14 19:34, Ehsan Akhgari wrote:
On Fri, Mar 28, 2014 at 8:21 AM, George Miroshnykov <gmiros...@rebbix.com
wrote:


On March 28, 2014 at 1:39:49 PM, James Graham (ja...@hoppipolla.co.uk)
wrote:

On 28/03/14 10:46, George Miroshnykov wrote:
I think I know what you mean, but unfortunately we can’t use this
approach.

In order not to enforce a specific Mercurial workflow upon contributors,
we had to make a number of compromises.

FWIW I think "don't require a particular workflow" should be a non-goal.

One of the goals behind this project is building a review workflow which
works for a number of different types of workflows that people use here at
Mozilla, and as a result, we need to focus on the lowest common
denominator, hence the "don't require a particular workflow" goal.  If you
disagree with that goal, it's fine, but you should keep in mind that is
what we're building here.

Well I don't really understand how we've arrived at the conclusion that supporting arbitrary workflows is an immovable requirement, even if it comes at the expense of the overall quality of the tool. To me it feels like we are selling out the long-term productivity of the organization in order to avoid some short term pain in getting people to adapt.

I can assure you that I'm not selling out on anything, James.  I gave reasons behind my thinking in the email you quoted, and gave an example of how that actually serves a benefit because it enables more freedom in the user's choice of workflow.  Your assertion seems to suggest that you may not have completely read my email.

If you have specific pain points, I'd be more happy to discuss them.
 
I would be much more sympathetic to this if there wasn't ample evidence that people are willing and able to change VCS habits in order to get a better experience and/or to meet the requirements of the project that they are working with. For example GitHub has a workflow that is very different to the patch based system that was common in pre-VCS days, but the enormous success of GitHub doesn't seem to have been affected by the fact that it insists that you use branches + PRs for development and doesn't also provide the option to upload diff files, for example. Even within Mozilla, people seem to cope with having some things in Mercurial and other things on GitHub. Although it creates some process problems, the developers themselves seem to be able to make the transitions back and forth rather smoothly.

I have no doubt that people are more than capable of handling a change to their workflow.  But believe it or not, there is quite a bit of subjectivity in the choice of workflow.  The thing that matters to me a lot is not just the initial cost of getting used to a new workflow but also the freedom to adopt to another one if something is not working out for me, like I explained in my previous email.  For example, it we started to require a workflow that forced me to use Mercurial for my local development tomorrow, my productivity would drop suddenly.  If we forced everyone to use Critic tomorrow, gps' productivity would drop because this meant that he would suddenly need to use git for local development (not meaning to point out gps here of course.)
 
And new people coming to the project are substantially more likely to have experience of branch-based development than mq-based development.

Sure.
 
I am particularly troubled by mq because it feels like Mozilla might be one of the few organisations still using the tool. At least whenever you search for mq help on Google, you have about a 50% chance of getting a MDN page as the top result. It also seems like the wider mercurial community are starting to think that it was a bad idea. Therefore it seems all too plausible that it will be increasingly marginalised in the future, and we will be left depending on a tool that no one else is interested in.

Yes, I know that Mozilla uses mq a lot.  And I know that it is a bad idea from experience (and I'm glad that the Mercurial community is finally getting to that conclusion!).  I think you're mistaking my intent to support it for endorsement.  Trust me, there are very few things in life that I hate more than mq.

But your assertion that "we will be left depending on a tool that no one else is interested in" is just false here, there is nothing mq-dependent in what we're building here.
 

As the goal of this project is building a review workflow that all Mozilla
developers can switch to easily with minimal friction, reiterating how
github and bitbucket do things without keeping this in mind always gets us
back to the same issue: these systems are trying to solve different
problems.

Without wanting to suggest that we copy the GitHub or Bitbucket systems directly (we shouldn't; we can do substantially better), they aren't solving a different problem. They even had the same initial problem we had (how do we move users used to a centralised, patch-based workflow to a DVCS workflow). I really think we should try to design the best possible thing for a branchy development model and then if we find we *can't* get people to switch how they work, add mq support, rather than building the entire product around the assumption that people will refuse to try something new even if it offers considerable benefits.

They _are_ solving a different problem: coming up with a review workflow for people without no historical baggage, while making serious assumptions about things such as what DVCS system they use, etc.  That is a much easier problem to solve, and I do wish that was what I was trying to solve too, but it isn't.

On the issue of forcing people to use a new tool, it's not a question of "can", it's a question of "want".  Can we *force* people to use a specific workflow?  Absolutely, it's actually not that hard, you take away their existing tools, give them new ones, tell them to use it and deal with it.  But that is not what I'm interested in.

I hope that I made it clear in my previous email that you partially quoted here that I'm not just building an entire product on the assumption that people will refuse to try something new.  Please re-read the last paragraph of that email.  If that's not clear, let's discuss it.

(Also I'm not sure why you keep referring to mq support.  If we dropped mq support today, I would still stand by the rest of what I'm saying here!)
 

The mq use case only does rebases in a loose conceptual sense.  That is one
big difference with our workflow and Critic's.  In addition, requiring
people to let the code review tool know when they're planning to do a local
rebase adds one additional step to the workflow, and I don't think there is
a good reason why we would want to do that.

Yes, having to tell critic you are doing a rebase is annoying. But it's substantially less annoying than it would be to have to map each commit in the new branch to a commit in the old branch by hand.

You only map the commits manually if you need to preserve the review history.  The more common case is people submitting only a small part of their branch for a second review round and not everything that was previously reviewed.  That will mean that you don't need to manually match commits in most cases.

(and please note that in the future we might look into adding simple heuristics to try to auto-match commits if needed.)
 
In general rebases should be relatively rare (because they alter published history and create work for all the people collaborating on a branch), so having one manual step isn't *too* bad.

The frequency of rebases again depends on your workflow.  Some people rebase very rarely, and some very often.
 
FWIW I am trying to stand up a critic system (not connected to github) with the gecko-dev repo, just so that people can try it out and see what the workflow's like.

Sure, I'm not trying to stop you!  But please note that unfortunately any code review tool that only supports git won't be an option for us at Mozilla.  We have to have something which works with both.

James Graham

unread,
Mar 31, 2014, 7:04:07 PM3/31/14
to mozilla-c...@googlegroups.com
On 31/03/14 23:55, Ehsan Akhgari wrote:

>> FWIW I am trying to stand up a critic system (not connected to github)
>> with the gecko-dev repo, just so that people can try it out and see what
>> the workflow's like.
>>
>
> Sure, I'm not trying to stop you! But please note that unfortunately any
> code review tool that only supports git won't be an option for us at
> Mozilla. We have to have something which works with both.

Yes I understand that, and I understand that it would be relatively lots
of work to add hg support to critic (although I believe a patch that did
so would be accepted). The idea is just to demo how it works and let
people play with it, because I still not sure we have the same ideas
about what a good workflow looks like and I think this might cause me
difficulty in understanding where you are coming from.

Ehsan Akhgari

unread,
Mar 31, 2014, 7:22:46 PM3/31/14
to James Graham, mozilla-c...@googlegroups.com
I'm really interested to know more about the Critic workflow and why you think what we're building here will not enable that.  It may very well be the case that we're just talking past each other.  But I'm really hoping that we're not preventing good workflows here (even while we're trying to not "enforce" them.)

George Miroshnykov

unread,
Apr 1, 2014, 5:46:16 AM4/1/14
to mozilla-c...@googlegroups.com
Hey, I’ve updated the API docs, you can see them here:

This is a bare minimum required just for the HTML UI to work.
It can (and most probably should) be extended in future if we have some CLI tools in mind, but for now it gets the job done.

As usual, please let me know what you think.
Pull requests are welcome as well :)

Thanks,
George
Reply all
Reply to author
Forward
0 new messages