Dumb Matching Commits

27 views
Skip to first unread message

Steven MacLeod

unread,
Apr 2, 2014, 3:57:38 PM4/2/14
to mozilla-c...@googlegroups.com
Now that we're planning to keep the initial (and default) push workflow UI-less, we need to automatically match an updated pushes commits with the previous pushes. I propose we do this by matching the commits one by one in a stack, starting from the most parent of the pushed commits.

Example:
Inline image 1


In this case we would like to match C1' to C1, C2' to C2, C3' to C3, and then C4 should have a new review request created.

The issue we have is if m-c2 / m-c3 have not been pushed to the reviews repo yet. In this case they would mess up our stack matching.

How should we deal with identifying commits that have landed on m-c, but were not pushed to the reviews repo yet? We could do some hacky things with commit messages, bug numbers, "r=<name>", etc. Is there a better way?

What about requiring the pusher to make sure his outgoing commits are only ones to be reviewed? If there are extra commits that would be transferred along with the push, should they be forced to push those first (as a non review push?)

Basically, how can we identify the actual parts of the push to be reviewed without UI, so that we can make an easy default matching?

George Miroshnykov

unread,
Apr 2, 2014, 4:05:35 PM4/2/14
to Steven MacLeod, mozilla-c...@googlegroups.com, Gregory Szorc
Things may also be complicated by the fact that every push to our review repo will be a force-push (as discussed before).

We can indeed move the burden on the user by asking him to run “hg outgoing” before the actual push, but that won’t really be user-friendly.

I recall Gregory suggested we could use Mercurial phases:

Thanks,
George
--
You received this message because you are subscribed to the Google Groups "mozilla-code-review" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mozilla-code-re...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

James Graham

unread,
Apr 2, 2014, 4:59:41 PM4/2/14
to mozilla-c...@googlegroups.com
On 02/04/14 21:05, George Miroshnykov wrote:
> Things may also be complicated by the fact that every push to our review repo will be a force-push (as discussed before).

Sorry, I don't know if I didn't see this discussion or didn't fully
understand it, but can you point me to the reason for that? It seems
very poor from a user's point of view, not least because it make it very
difficult to pull from the review repo and have multiple people push to
the same review branch. The possibility of having such a workflow is
very important to me.

Gregory Szorc

unread,
Apr 2, 2014, 5:12:43 PM4/2/14
to Steven MacLeod, mozilla-c...@googlegroups.com
On 4/2/14, 12:57 PM, Steven MacLeod wrote:
> Now that we're planning to keep the initial (and default) push workflow
> UI-less, we need to automatically match an updated pushes commits with
> the previous pushes. I propose we do this by matching the commits one by
> one in a stack, starting from the most parent of the pushed commits.
>
> Example:
> Inline image 1
>
>
> In this case we would like to match C1' to C1, C2' to C2, C3' to C3, and
> then C4 should have a new review request created.
>
> The issue we have is if m-c2 / m-c3 have not been pushed to the reviews
> repo yet. In this case they would mess up our stack matching.
>
> How should we deal with identifying commits that have landed on m-c, but
> were not pushed to the reviews repo yet? We could do some hacky things
> with commit messages, bug numbers, "r=<name>", etc. Is there a better way?

Mercurial has 2 solutions to this problem. Git has 1. One of those
solutions is the same.

The shared solution is for the review repo to be aware of the current
heads of central, inbound, and the other release and integration
branches. At the time a review request comes in, the review tool can
walk the parent commits until it finds a commit that is a head of a
repo. The shortest path is the base of the review.

e.g.

central ----> A --> B --> C
\
- D --> E --> inbound

If we push changesets F and G:

central ----> A --> B --> C --> fx-team
\
- D --> E --> inbound --> F --> G

We can identify that this review was based on "inbound" at the time of
the push and identify F and G as the relevant commits.

The Mercurial solution involves phases. If the review repo is pulling
changesets from central, inbound, etc, changesets pulled from those
repos will be "public." Changesets that are review only (so far) will be
"draft." You can walk the parents of the review commit until you arrive
at a "public" changeset.

Both solutions require the review repository to have knowledge about the
release and integration repos. That shouldn't pose a major technical
obstacle.

A third (and slightly crazier) solution is to have clients know what the
heads of the repos are and send this as part of the review. This is
doable with Mercurial. But for Git it would require an out-of-band data
transfer (or custom attribute on the commit object), which would require
client-side magic. Apparently this would be a non-starter for the stated
goals of this project.

Ehsan Akhgari

unread,
Apr 2, 2014, 9:15:25 PM4/2/14
to James Graham, mozilla-c...@googlegroups.com
The reason is because hg is stupid.  hg has the concept of multi-headed branches, and makes it very hard to work with those multiple heads in a sane way.  As a result, it prevents pushes that create remote heads by default and to work around that you need to force-push.

It seems like you're confusing hg push -f with git push -f which are very different things.  The cases you mentioned above will all work fine when force-pushing in hg.

Ehsan Akhgari

unread,
Apr 2, 2014, 9:28:37 PM4/2/14
to Gregory Szorc, Taras Glek, Ben Kero, Steven MacLeod, mozilla-c...@googlegroups.com
Unfortunately we can't rely on phases since the commits under review might have been pushed to some other public repo which makes them "public" by default unless you do some crazy client side dance.  And we need to be git compatible anyway.

For now, what that means is that we need to pull from our upstream remotes (let's say, mozilla-central/inbound/fx-team/etc.), update the hg bookmarks that point to those heads, and walk up the rest of the graph like Greg said below.

The sucky part of this is how to make sure that your head bookmarks are updated right when an incoming push comes in.  I don't know if there is a great solution here unfortunately.  One approach would be to pull from the remotes right at that time, and then do the graph walking.  This will make pushes slow.  Another approach would be to do that lazily, i.e., accept the push, dump the commits you received in a db somewhere, and then pull from those remotes lazily and do the graph walking there.  This will make pushes fast, but makes them async so you won't have a review URL to spit out to stdout when the push is done, and people need to wait a while before their review is available.

Maybe there are better solutions for that which I can't think of right now?  Taras, Grag, Ben, any ideas?

(FWIW, Greg's third solution also requires us to solve this problem.  Once we find a solution, I'd rather do it once on the server side than do it on every client...)

Cheers,
--
You received this message because you are subscribed to the Google Groups "mozilla-code-review" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mozilla-code-review+unsub...@googlegroups.com.

Gregory Szorc

unread,
Apr 2, 2014, 9:31:10 PM4/2/14
to Ehsan Akhgari, James Graham, mozilla-c...@googlegroups.com
`hg push -B <bookmark>` (push a bookmark to a remote) does *not* require
--force. In your defense, I believe this was recently changed (in 2.8 or
2.9).

Even if --force were required, this would be hidden from view via the
planned-to-be-written Mercurial extension (just like the try extensions
today abstract that detail).

Ehsan Akhgari

unread,
Apr 2, 2014, 9:34:01 PM4/2/14
to Gregory Szorc, James Graham, mozilla-c...@googlegroups.com
On Wed, Apr 2, 2014 at 9:31 PM, Gregory Szorc <g...@mozilla.com> wrote:
On 4/2/14, 6:15 PM, Ehsan Akhgari wrote:
On Wed, Apr 2, 2014 at 4:59 PM, James Graham <ja...@hoppipolla.co.uk
<mailto:ja...@hoppipolla.co.uk>> wrote:

    On 02/04/14 21:05, George Miroshnykov wrote:

        Things may also be complicated by the fact that every push to
        our review repo will be a force-push (as discussed before).


    Sorry, I don't know if I didn't see this discussion or didn't fully
    understand it, but can you point me to the reason for that? It seems
    very poor from a user's point of view, not least because it make it
    very difficult to pull from the review repo and have multiple people
    push to the same review branch. The possibility of having such a
    workflow is very important to me.


The reason is because hg is stupid.  hg has the concept of multi-headed
branches, and makes it very hard to work with those multiple heads in a
sane way.  As a result, it prevents pushes that create remote heads by
default and to work around that you need to force-push.

It seems like you're confusing hg push -f with git push -f which are
very different things.  The cases you mentioned above will all work fine
when force-pushing in hg.

`hg push -B <bookmark>` (push a bookmark to a remote) does *not* require --force. In your defense, I believe this was recently changed (in 2.8 or 2.9).

Hah, OK, thanks for mentioning this!  (FWIW my patience ran out around the 2.4-ish days ;-)
 
Even if --force were required, this would be hidden from view via the planned-to-be-written Mercurial extension (just like the try extensions today abstract that detail).

I thought we agreed on ./mach review? ;-)

Gregory Szorc

unread,
Apr 2, 2014, 9:55:16 PM4/2/14
to Ehsan Akhgari, Taras Glek, Ben Kero, Steven MacLeod, mozilla-c...@googlegroups.com
On 4/2/14, 6:28 PM, Ehsan Akhgari wrote:
> Unfortunately we can't rely on phases since the commits under review
> might have been pushed to some other public repo which makes them
> "public" by default unless you do some crazy client side dance. And we
> need to be git compatible anyway.

No, the server (not client) controls publishing.

If the server is marked as publishing, all changesets *pushed* to the
server will be marked as "public."

If the server is marked as non-publishing, all changesets *pushed* to
the server will be marked as "draft."

If the server *pulls* from another repository (central, inbound, etc),
it will inherit the phases from that other repo.

Therefore, if the server is marked as non-publishing and the server
periodically pulls changesets from central, inbound, etc, the only way a
changeset becomes public is if it were marked as public on a repo the
server is pulling from (central, inbound, etc).

I don't think it is fair to require every semantic detail to be
compatible. We know Mercurial's changeset evolution feature will enable
us to eventually track changesets across rebases (and solve this problem
in a more robust manner). I think we can agree that not supporting
changeset evolution because Git doesn't have this (powerful) feature
would be silly.

> For now, what that means is that we need to pull from our upstream
> remotes (let's say, mozilla-central/inbound/fx-team/etc.), update the hg
> bookmarks that point to those heads, and walk up the rest of the graph
> like Greg said below.
>
> The sucky part of this is how to make sure that your head bookmarks are
> updated right when an incoming push comes in. I don't know if there is
> a great solution here unfortunately. One approach would be to pull from
> the remotes right at that time, and then do the graph walking. This
> will make pushes slow. Another approach would be to do that lazily,
> i.e., accept the push, dump the commits you received in a db somewhere,
> and then pull from those remotes lazily and do the graph walking there.
> This will make pushes fast, but makes them async so you won't have a
> review URL to spit out to stdout when the push is done, and people need
> to wait a while before their review is available.
>
> Maybe there are better solutions for that which I can't think of right
> now? Taras, Grag, Ben, any ideas?

I think our push volume is low enough that having the code review repo
periodically pull will be sufficient. i.e. the window where someone
pushes up "published" commits as part of a review should be very short.
Assuming the code review repo is pulling in new commits every 10s or so,
the only way I see this happening is if the client pushes to inbound,
fx-team, etc *and* the code review repo as part of the same shell
command. Think about it: for a review to have published commits the code
review repo doesn't know about yet, it either needs to pull-rebase-push
or push-push. I don't think that will happen that often.

We could certainly work in a hook or extension that "paused" the push
until the code review repo is up to date.

Please note that iterating over all the repos takes time - about 10s per
loop for remote repos. Clients may have to wait for a repository lock
while that is occurring. I highly recommend pulling from a "unified"
repository from the code review repo to keep this window as short as
possible. You can either have a second repo on the server or you can
pull from a remote such as https://hg.stage.mozaws.net/gecko (which has
bookmarks tracking each repo). It shouldn't take more than a few seconds
to perform an incremental pull from a single unified repo.

Ehsan Akhgari

unread,
Apr 3, 2014, 10:55:48 AM4/3/14
to Gregory Szorc, Taras Glek, Ben Kero, Steven MacLeod, mozilla-c...@googlegroups.com
On Wed, Apr 2, 2014 at 9:55 PM, Gregory Szorc <g...@mozilla.com> wrote:
On 4/2/14, 6:28 PM, Ehsan Akhgari wrote:
Unfortunately we can't rely on phases since the commits under review
might have been pushed to some other public repo which makes them
"public" by default unless you do some crazy client side dance.  And we
need to be git compatible anyway.

No, the server (not client) controls publishing.

If the server is marked as publishing, all changesets *pushed* to the server will be marked as "public."

If the server is marked as non-publishing, all changesets *pushed* to the server will be marked as "draft."

If the server *pulls* from another repository (central, inbound, etc), it will inherit the phases from that other repo.

Therefore, if the server is marked as non-publishing and the server periodically pulls changesets from central, inbound, etc, the only way a changeset becomes public is if it were marked as public on a repo the server is pulling from (central, inbound, etc).

I'm not sure which server you're talking about.  I'm talking about third-party servers which are not under our control (let's say bitbucket, etc.)
 
I don't think it is fair to require every semantic detail to be compatible. We know Mercurial's changeset evolution feature will enable us to eventually track changesets across rebases (and solve this problem in a more robust manner). I think we can agree that not supporting changeset evolution because Git doesn't have this (powerful) feature would be silly.

I disagree quite strongly.  :-)  I think it's quite wise to 1) not bet on changeset evolution ending up being what we hope it will be and 2) not build something that cannot be supported just as well in git.

FWIW I would also disagree quite strongly if you were advocating to use a git-only feature which would make it miserable to support hg.
 

For now, what that means is that we need to pull from our upstream
remotes (let's say, mozilla-central/inbound/fx-team/etc.), update the hg
bookmarks that point to those heads, and walk up the rest of the graph
like Greg said below.

The sucky part of this is how to make sure that your head bookmarks are
updated right when an incoming push comes in.  I don't know if there is
a great solution here unfortunately.  One approach would be to pull from
the remotes right at that time, and then do the graph walking.  This
will make pushes slow.  Another approach would be to do that lazily,
i.e., accept the push, dump the commits you received in a db somewhere,
and then pull from those remotes lazily and do the graph walking there.
This will make pushes fast, but makes them async so you won't have a
review URL to spit out to stdout when the push is done, and people need
to wait a while before their review is available.

Maybe there are better solutions for that which I can't think of right
now?  Taras, Grag, Ben, any ideas?

I think our push volume is low enough that having the code review repo periodically pull will be sufficient. i.e. the window where someone pushes up "published" commits as part of a review should be very short. Assuming the code review repo is pulling in new commits every 10s or so, the only way I see this happening is if the client pushes to inbound, fx-team, etc *and* the code review repo as part of the same shell command. Think about it: for a review to have published commits the code review repo doesn't know about yet, it either needs to pull-rebase-push or push-push. I don't think that will happen that often.

Yeah, I agree that it won't be a problem in most cases, but I'm mainly worrying about the cases where some kind of problem prevents the repo on the server side to get updated (things such as network issues, hg process crashing, etc.)
 
We could certainly work in a hook or extension that "paused" the push until the code review repo is up to date.

Please note that iterating over all the repos takes time - about 10s per loop for remote repos. Clients may have to wait for a repository lock while that is occurring.

Yeah, that is my main worry.

FWIW I think RelEng has some setup in place for running the git mirror which basically calls a script every time that a repo is pushed to.  I am very interested to use something like that to avoid having to poll on our server side all the time.
 
I highly recommend pulling from a "unified" repository from the code review repo to keep this window as short as possible. You can either have a second repo on the server or you can pull from a remote such as https://hg.stage.mozaws.net/gecko (which has bookmarks tracking each repo). It shouldn't take more than a few seconds to perform an incremental pull from a single unified repo.

Ideally we wouldn't need to poll.  This just shifts the issue to that other server, right?  (Assuming that server also polls.)

I've run a polling git mirror from our hg upstream repos for 2+ years, and based on that experience we should really not poll if at all possible.  Ben, can you help with that please?

Cheers,
Ehsan

Gregory Szorc

unread,
Apr 3, 2014, 1:09:20 PM4/3/14
to Ehsan Akhgari, Taras Glek, Ben Kero, Steven MacLeod, mozilla-c...@googlegroups.com
On 4/3/14, 7:55 AM, Ehsan Akhgari wrote:
> On Wed, Apr 2, 2014 at 9:55 PM, Gregory Szorc <g...@mozilla.com
> <mailto:g...@mozilla.com>> wrote:
>
> On 4/2/14, 6:28 PM, Ehsan Akhgari wrote:
>
> Unfortunately we can't rely on phases since the commits under review
> might have been pushed to some other public repo which makes them
> "public" by default unless you do some crazy client side dance.
> And we
> need to be git compatible anyway.
>
>
> No, the server (not client) controls publishing.
>
> If the server is marked as publishing, all changesets *pushed* to
> the server will be marked as "public."
>
> If the server is marked as non-publishing, all changesets *pushed*
> to the server will be marked as "draft."
>
> If the server *pulls* from another repository (central, inbound,
> etc), it will inherit the phases from that other repo.
>
> Therefore, if the server is marked as non-publishing and the server
> periodically pulls changesets from central, inbound, etc, the only
> way a changeset becomes public is if it were marked as public on a
> repo the server is pulling from (central, inbound, etc).
>
>
> I'm not sure which server you're talking about. I'm talking about
> third-party servers which are not under our control (let's say
> bitbucket, etc.)

"Server" in this case is "the code review repository." Please note that
my first statement about "the server controls publishing" is only true
in the context of "server" == "code review repo." Generally speaking,
each Mercurial repository has a repository-local definition of phases.

Also, BitBucket offers a configuration option to control whether
repositories are publishing. The publishing settings of remotes should
not come into play. Worst case, we can write an extension to force
modify the phases to be what we need them to be.

>
> I don't think it is fair to require every semantic detail to be
> compatible. We know Mercurial's changeset evolution feature will
> enable us to eventually track changesets across rebases (and solve
> this problem in a more robust manner). I think we can agree that not
> supporting changeset evolution because Git doesn't have this
> (powerful) feature would be silly.
>
>
> I disagree quite strongly. :-) I think it's quite wise to 1) not bet
> on changeset evolution ending up being what we hope it will be and 2)
> not build something that cannot be supported just as well in git.
>
> FWIW I would also disagree quite strongly if you were advocating to use
> a git-only feature which would make it miserable to support hg.

First, I *do* agree we should not bet the farm on changeset evolution.
That being said, I do think the people investing in changeset evolution
will ensure it can eventually scale to meet our needs and that makes it
a relatively safe bet in the long term. To be clear, I'm stating that
when changeset evolution is ready, I think we should utilize it as part
of the review tool. In addition, the short to medium term solution for
Mercurial and Git will be the same, so we don't need to worry about this
now.

That being said, I disagree strongly on your opinion that we should not
eventually support changeset evolution (and other varying features)
someday. I disagree because that does not put user experience first. At
the end of the day, we're serving our users. Telling them "no, you
cannot use this shiny feature because we're artificially limiting
support to the least common denominator" is our choice and it doesn't
serve the needs of the masses. The only benefactor AFAICT is less
complexity on the review tool (2 code paths instead of 1), which makes
our lives easier. Trading review tool simplicity for a worse overall
user experience is not a good long-term choice.

While I appreciate the sentiment in "I would also disagree quite
strongly if you were advocating to use a git-only feature which would
make it miserable to support hg," I reject the premise. Changeset
evolution is not a feature that would make it "miserable" to support
Git, at least from the perspective of the review tool. Actually, no, I
take that back. After deploying the changeset evolution aware commit
matching code, we'll look at the pile of code that performs semantic
matching of commits (now obsoleted by changeset evolution and used only
for Git) and reach the conclusion that Mercurial is much easier to
support. This could lead to it being "miserable" to support Git. But
unless Git's user base dies off, we'll support Git, so I still reject
the premise of the argument because we are choosing to support Git *and*
Mercurial. Again, I think our responsibility is to best serving our
users. We do that by supporting features that make it easier to use.

FWIW, I seem to recall you making arguments (IRL?) that one of the
faults of ReviewBoard was that it aimed to support the lowest common
denominator of {CVS, Subversion, Mercurial, Git, etc} and in doing so
sacrificed some of its potential to excel at handling DVCSs. I agree
with that logic, which is why I'm concerned that things like rejecting
changeset evolution because Git doesn't have that feature violates that
rationale and could lead this review tool to being less than its potential.

Ehsan Akhgari

unread,
Apr 3, 2014, 1:45:41 PM4/3/14
to Gregory Szorc, Taras Glek, Ben Kero, Steven MacLeod, mozilla-c...@googlegroups.com
OK, that makes sense now.  I was referring to other servers, hence the confusion.
 
Also, BitBucket offers a configuration option to control whether repositories are publishing. The publishing settings of remotes should not come into play. Worst case, we can write an extension to force modify the phases to be what we need them to be.

OK, that's good to know.  But still that doesn't help if you push to servers other than bitbucket...

But there is another issue on using phases.  The first time that mercurial turned on phases by default, pushing to the try server led to disasters for people because their mq (or maybe other local changesets) would be marked as public and mercurial would prevent you from pushing them elsewhere.  I don't remember exactly what the workaround were that people were using locally before the try server was configured to not mark those changesets as public.  I wonder how many people still have remnants of that workaround somewhere in their local configs...

There is also the fact that you can modify the phase for a changeset locally using hg phase.  I've had to do that a few times myself, not sure if it's a common issue or not (or if there are ways to not get yourself into that situation in the first place)...

Honestly verifying things on the server side which we control seems a bit more robust.
 


    I don't think it is fair to require every semantic detail to be
    compatible. We know Mercurial's changeset evolution feature will
    enable us to eventually track changesets across rebases (and solve
    this problem in a more robust manner). I think we can agree that not
    supporting changeset evolution because Git doesn't have this
    (powerful) feature would be silly.


I disagree quite strongly.  :-)  I think it's quite wise to 1) not bet
on changeset evolution ending up being what we hope it will be and 2)
not build something that cannot be supported just as well in git.

FWIW I would also disagree quite strongly if you were advocating to use
a git-only feature which would make it miserable to support hg.

First, I *do* agree we should not bet the farm on changeset evolution. That being said, I do think the people investing in changeset evolution will ensure it can eventually scale to meet our needs and that makes it a relatively safe bet in the long term. To be clear, I'm stating that when changeset evolution is ready, I think we should utilize it as part of the review tool. In addition, the short to medium term solution for Mercurial and Git will be the same, so we don't need to worry about this now.

Why don't we have that conversation when changeset evolution is ready, people start to use it and think it's a good idea?  Seems like we both agree that we don't need to answer that question now.
 
That being said, I disagree strongly on your opinion that we should not eventually support changeset evolution (and other varying features) someday. I disagree because that does not put user experience first. At the end of the day, we're serving our users. Telling them "no, you cannot use this shiny feature because we're artificially limiting support to the least common denominator" is our choice and it doesn't serve the needs of the masses. The only benefactor AFAICT is less complexity on the review tool (2 code paths instead of 1), which makes our lives easier. Trading review tool simplicity for a worse overall user experience is not a good long-term choice.

The reason why I disagree is that I don't have as much faith in what changeset evolution will turn out to be as you do.  Please don't take this personally against yourself or Mercurial developers, but the track record here is quite poor.  I refer you to the first time that they turned on hg phases by default.  That being said, _if_ when people start to use it in the future they agree that it's a good idea, then I'm open to change my mind.  I just don't see why you're expecting me to vouch for something that doesn't exist in a concrete form out there yet and we have no experience using it.
 
While I appreciate the sentiment in "I would also disagree quite strongly if you were advocating to use a git-only feature which would make it miserable to support hg," I reject the premise. Changeset evolution is not a feature that would make it "miserable" to support Git, at least from the perspective of the review tool. Actually, no, I take that back. After deploying the changeset evolution aware commit matching code, we'll look at the pile of code that performs semantic matching of commits (now obsoleted by changeset evolution and used only for Git) and reach the conclusion that Mercurial is much easier to support. This could lead to it being "miserable" to support Git. But unless Git's user base dies off, we'll support Git, so I still reject the premise of the argument because we are choosing to support Git *and* Mercurial. Again, I think our responsibility is to best serving our users. We do that by supporting features that make it easier to use.

I am as much interested in providing a good UX to our users as you are.  Our difference of opinion is that I'm very conservative adopting the next new shiny thing that is going to come to mercurial, and you seem to be very eager to do that.  But really, this is getting way off topic, please let's take it off list.
 
FWIW, I seem to recall you making arguments (IRL?) that one of the faults of ReviewBoard was that it aimed to support the lowest common denominator of {CVS, Subversion, Mercurial, Git, etc} and in doing so sacrificed some of its potential to excel at handling DVCSs. I agree with that logic, which is why I'm concerned that things like rejecting changeset evolution because Git doesn't have that feature violates that rationale and could lead this review tool to being less than its potential.

You're quoting me without giving full context: I was referring to the fact that RB will need to support patch based reviews because of the support of things such as CVS.  Comparing hg+changeset evolution vs git to hg or git vs CVS is comparing apples with oranges.

Gregory Szorc

unread,
Apr 3, 2014, 6:58:32 PM4/3/14
to Ehsan Akhgari, Taras Glek, Ben Kero, Steven MacLeod, mozilla-c...@googlegroups.com
On 4/3/14, 10:45 AM, Ehsan Akhgari wrote:
> On Thu, Apr 3, 2014 at 1:09 PM, Gregory Szorc <g...@mozilla.com
> <mailto:g...@mozilla.com>> wrote:
>
> On 4/3/14, 7:55 AM, Ehsan Akhgari wrote:
>
> On Wed, Apr 2, 2014 at 9:55 PM, Gregory Szorc <g...@mozilla.com
> <mailto:g...@mozilla.com>
I agree that looking at bookmarks/branches/labels for identifying the
"base commit/tree" is more robust than phases.

FWIW, remember that the Try phase fiasco was due to Mozilla not
upgrading Mercurial and then taking seemingly ages to make a 2 line
change to the hgrc of the Try repository. If we had been actively
following Mercurial development, the whole issue could have been avoided.
I agree.

> That being said, I disagree strongly on your opinion that we should
> not eventually support changeset evolution (and other varying
> features) someday. I disagree because that does not put user
> experience first. At the end of the day, we're serving our users.
> Telling them "no, you cannot use this shiny feature because we're
> artificially limiting support to the least common denominator" is
> our choice and it doesn't serve the needs of the masses. The only
> benefactor AFAICT is less complexity on the review tool (2 code
> paths instead of 1), which makes our lives easier. Trading review
> tool simplicity for a worse overall user experience is not a good
> long-term choice.
>
>
> The reason why I disagree is that I don't have as much faith in what
> changeset evolution will turn out to be as you do. Please don't take
> this personally against yourself or Mercurial developers, but the track
> record here is quite poor. I refer you to the first time that they
> turned on hg phases by default. That being said, _if_ when people start
> to use it in the future they agree that it's a good idea, then I'm open
> to change my mind. I just don't see why you're expecting me to vouch
> for something that doesn't exist in a concrete form out there yet and we
> have no experience using it.

https://hg.stage.mozaws.net/gecko-collab is running Mercurial 2.9.1 with
obsolescence (the feature behind changeset evolution) enabled. I (and a
small band of others) are actively using changeset evolution locally.
It's real. It exists. I concede it needs a lot more work and to solve
some scaling issues. But a large company is investing in it and the
track record of that company tells me they will succeed :)

We're kind of in a weird situation. Mozilla doesn't seem eager to invest
in improvements to Mercurial (by staying up to date on releases, running
extensions on the server, etc). As a result, people don't know about new
features and there is no demand for them. Someone needs to push for
these things and I find myself being a (seeming) lone voice. I think
it's fair to assume that a lot of my replies about Mercurial are
targeting an audience that's ignorant of Mercurial's features and
capabilities, not just at you :) I'm also trying to ensure whatever
solutions we employ on this review tool are compatible as possible with
the long term. We know what's coming down the Mercurial pipe: I think
it's rational to prepare for the future when it's known.

I think we are mostly in agreement. Timelines are what matters. Let's
drop this sub-thread.

James Graham

unread,
Apr 4, 2014, 11:38:06 AM4/4/14
to mozilla-c...@googlegroups.com
I don't know what "without UI" means exactly, but if the default way to
start a review is via a mach command (and I hope the intent is that this
can desugar into a comprehensible set of VCS actions), can't the mach
command allow one to specify the set of commits that will make up the
review? For example

./mach review --base=HEAD~3

(syntax could be better ofc). On the backend, this would push the
current branch to the review-tool owned VCS and then actually create the
review in a second step using a REST API like
/create_review?base=sha1_of_base&branch=name_of_remote_branch_or_bookmark

Of course --base could have a sensible default like HEAD~ or the
upstream of the current branch, or whatever.

Ehsan Akhgari

unread,
Apr 4, 2014, 5:18:15 PM4/4/14
to Gregory Szorc, Taras Glek, Ben Kero, Steven MacLeod, mozilla-c...@googlegroups.com
Yep.
 
FWIW, remember that the Try phase fiasco was due to Mozilla not upgrading Mercurial and then taking seemingly ages to make a 2 line change to the hgrc of the Try repository. If we had been actively following Mercurial development, the whole issue could have been avoided.

Yes, I know.  Not trying to place guilt -- but it's also arguable that the hg client did the wrong thing by default when talking to older hg servers.  I don't actually know whether Mercurial supports using new clients with old servers, I'm just assuming that is the case.
Hopefully.  It's good that people at Mozilla are testing it.  I'm just echoing what I read on <http://mercurial.selenic.com/wiki/ChangesetEvolution> here.
 
We're kind of in a weird situation. Mozilla doesn't seem eager to invest in improvements to Mercurial (by staying up to date on releases, running extensions on the server, etc). As a result, people don't know about new features and there is no demand for them. Someone needs to push for these things and I find myself being a (seeming) lone voice. I think it's fair to assume that a lot of my replies about Mercurial are targeting an audience that's ignorant of Mercurial's features and capabilities, not just at you :) I'm also trying to ensure whatever solutions we employ on this review tool are compatible as possible with the long term. We know what's coming down the Mercurial pipe: I think it's rational to prepare for the future when it's known.
 
Understood, and it all makes sense.  Thanks for talking about this stuff!

Ehsan Akhgari

unread,
Apr 4, 2014, 5:20:40 PM4/4/14
to James Graham, mozilla-c...@googlegroups.com
We can do that of course, but I think that's a sucky UI because it makes the developer think about something that their VCS already knows.  :-)  I'm still hopeful that we can fix the server side issue to be able to detect things reliably on the server side, but this obviously is a fallback approach that we can use if we realize that won't work.

James Graham

unread,
Apr 6, 2014, 6:36:35 PM4/6/14
to mozilla-c...@googlegroups.com
I don't think the server can know. Apart from the practical problems in
simple cases, there are a whole load of edge cases where it is more or
less impossible for the server to figure out what's going on. For
example if I pull in some changes from one of my coworkers, that they
are going to push to review, and make an additional change on top of
their branch, that I push to review, I don't want my review to include
their changes, even if they haven't pushed them yet. Similarly, if I
have put some hacky patch that the remaining commits don't build on,
e.g. something to change logging or whatever, in my branch history, I
don't want the review to include that commit.

Basically I think the user has to be in ultimate control, and I think
that's much easier to achieve if the definition of the base commit is
provided by the client rather than the server. This still allows
sensible defaults that could use phases, or the merge base, or whatever
is available. It just means that simple things are simple but hard
things are also possible.

Ehsan Akhgari

unread,
Apr 7, 2014, 1:30:32 PM4/7/14
to James Graham, mozilla-c...@googlegroups.com
On Sun, Apr 6, 2014 at 6:36 PM, James Graham <ja...@hoppipolla.co.uk> wrote:
On 04/04/14 22:20, Ehsan Akhgari wrote:

On Fri, Apr 4, 2014 at 11:38 AM, James Graham <ja...@hoppipolla.co.uk <mailto:ja...@hoppipolla.co.uk>> wrote:


    I don't know what "without UI" means exactly, but if the default
    way to start a review is via a mach command (and I hope the intent
    is that this can desugar into a comprehensible set of VCS
    actions), can't the mach command allow one to specify the set of
    commits that will make up the review? For example

    ./mach review --base=HEAD~3

    (syntax could be better ofc). On the backend, this would push the
    current branch to the review-tool owned VCS and then actually
    create the review in a second step using a REST API like
    /create_review?base=sha1_of_base&branch=name_of_remote_branch_or_bookmark

    Of course --base could have a sensible default like HEAD~ or the
    upstream of the current branch, or whatever.


We can do that of course, but I think that's a sucky UI because it makes the developer think about something that their VCS already knows.  :-)  I'm still hopeful that we can fix the server side issue to be able to detect things reliably on the server side, but this obviously is a fallback approach that we can use if we realize that won't work.

I don't think the server can know. Apart from the practical problems in simple cases, there are a whole load of edge cases where it is more or less impossible for the server to figure out what's going on. For example if I pull in some changes from one of my coworkers, that they are going to push to review, and make an additional change on top of their branch, that I push to review, I don't want my review to include their changes, even if they haven't pushed them yet. Similarly, if I have put some hacky patch that the remaining commits don't build on, e.g. something to change logging or whatever, in my branch history, I don't want the review to include that commit.

Both of those scenarios are the kinds of things that our previous experiment (changeset chooser) tried to support, and we ended up deciding that the UI for supporting those will be too complicated, and we decided to support a simpler github-style PR model where you don't post such kinds of branches for review.  In that simplistic model, the only thing that the server needs to know is the head of the upstream branch that your changes are based off of.
 
Basically I think the user has to be in ultimate control, and I think that's much easier to achieve if the definition of the base commit is provided by the client rather than the server. This still allows sensible defaults that could use phases, or the merge base, or whatever is available. It just means that simple things  are simple but hard things are also possible.

I think we basically decided that we don't need to keep all things possible.  Is there anything with the github-style PR model which we won't be able to detect on the server side?

James Graham

unread,
Apr 7, 2014, 1:49:47 PM4/7/14
to mozilla-c...@googlegroups.com
I have personally run into problems with the way GH deals with PRs.
Specifically I have had the "I want to create a review containing only
some of the commits on the branch" problem, which doesn't seem to be
possible there.

I don't really understand the advantage of putting this logic in the
server compared to the client? I also haven't understood why "one
optional argument when creating a review" is considered too complicated?

Ehsan Akhgari

unread,
Apr 7, 2014, 1:59:22 PM4/7/14
to James Graham, mozilla-c...@googlegroups.com
Yep.  Those use cases will be explicitly unsupported.  Basically we're convinced that it's impossible to solve that problem without requiring some kind of a UI to ask people about which commits they want reviewed somehow, which is what we're trying to avoid here.

(Note that this is _exactly_ the same issue as supporting mq style patch queues.)
 
I don't really understand the advantage of putting this logic in the server compared to the client?

We're trying to create a workflow without prompting the user to tell us things about their branch.
 
I also haven't understood why "one optional argument when creating a review" is considered too complicated?

Not sure what you mean with this.  Can you please clarify?

James Graham

unread,
Apr 8, 2014, 9:05:39 AM4/8/14
to mozilla-c...@googlegroups.com
On 07/04/14 18:59, Ehsan Akhgari wrote:
> On Mon, Apr 7, 2014 at 1:49 PM, James Graham <ja...@hoppipolla.co.uk> wrote:

>> I have personally run into problems with the way GH deals with PRs.
>> Specifically I have had the "I want to create a review containing only some
>> of the commits on the branch" problem, which doesn't seem to be possible
>> there.
>>
>
> Yep. Those use cases will be explicitly unsupported. Basically we're
> convinced that it's impossible to solve that problem without requiring some
> kind of a UI to ask people about which commits they want reviewed somehow,
> which is what we're trying to avoid here.

I don't see why having an optional flag in the existing UI to allow
complex cases to work is bad.

> (Note that this is _exactly_ the same issue as supporting mq style patch
> queues.)

I don't see how this is the same issue. I am still using the model that
a review is a sequence of commits between a base point and a head. I'm
just suggesting that it should be possible to explicitly assign the base
point for cases where the server doesn't guess it correctly.

>> I don't really understand the advantage of putting this logic in the
>> server compared to the client?
>
>
> We're trying to create a workflow without prompting the user to tell us
> things about their branch.

I don't think I have suggested prompting the user.

>> I also haven't understood why "one optional argument when creating a
>> review" is considered too complicated?
>
>
> Not sure what you mean with this. Can you please clarify?

So apparently I haven't been at all clear. Let me try again. In case of
doubt, assume that the following uses git terminology.

My suggestion was that "mach review" would automatically use ambient
state to determine the head and base commits of the review. The head
commit would just come from the currently checked out branch head. In
order to determine the base commit, it would use inspection of that
branch and either use the merge point with origin/master, or the
upstream tracking branch, or assume that the review contains a single
commit, or even let the server work it out based on the commits in the
branch that don't already exist upstream. I don't favour this last
option because it's hard to reason about locally and, as has already
been extensively discussed, is vulnerable to error in case that the
review repo gets out of sync for whatever reason.

In addition to the above, I suggested the addition of an explicit
override for the base commit so that people who need control are given
it. One option for the syntax would just be "mach review --base=rev"
where rev would be something that the vcs could parse as a commit. It
seems to me that this is not adding unreasonable UI surface since you
seem to be committed to requiring mach to create a review (rather than
just allowing a push to create a review directly), but avoids having to
explicitly unsupport a number of reasonable cases, and gives people the
tools to do the case thing in the case the normal heuristics break down,
which they inevitably will.

Reply all
Reply to author
Forward
0 new messages