Is landing multi-part patches in separate commits really a good practice?

Felipe Gomes

unread,

Jan 30, 2012, 8:54:22 AM1/30/12

to

The mercurial workflow that most people use with the patch queues works really well for separating the patches into review units, but I can't think of a good reasoning for landing them without qfold'ing before.

When you try to bisect a bug, having commit states that are incomplete and don't even build (or worse, build but does not work properly) wastes a lot of time. IMO there should not be changesets that you can't uniquely back out without breaking the tree.
(of course there are exceptions to this rule, but they should be the exception rather than the common case).

The tree merges already break the logarithm aspect of bisect, and many unnecessary commits make it worse.

Opinions?

Jeff Muizelaar

unread,

Jan 30, 2012, 9:31:02 AM1/30/12

to mozilla.de...@googlegroups.com, dev-pl...@lists.mozilla.org

On 2012-01-30, at 8:54 AM, Felipe Gomes wrote:

> The mercurial workflow that most people use with the patch queues works really well for separating the patches into review units, but I can't think of a good reasoning for landing them without qfold'ing before.
>
> When you try to bisect a bug, having commit states that are incomplete and don't even build (or worse, build but does not work properly) wastes a lot of time. IMO there should not be changesets that you can't uniquely back out without breaking the tree.
> (of course there are exceptions to this rule, but they should be the exception rather than the common case).

Each commit should result in a completely working tree. Ideally, the split used for review has this property and each piece can be a separate commit. However, if the split used for review doesn't result in useful independent parts, I agree that these parts should be folded into a single commit.

If we had machine time to spare, it would be nice if instead of only building the top commit of each push, we built each push individually.

-Jeff

Bobby Holley

unread,

Jan 30, 2012, 9:38:37 AM1/30/12

to Jeff Muizelaar, dev-pl...@lists.mozilla.org

On Mon, Jan 30, 2012 at 3:31 PM, Jeff Muizelaar <jmuiz...@mozilla.com>wrote:

> Each commit should result in a completely working tree.
>

+1. This is actually extremely _helpful_ for bisect, because you end up
with a much more specific/useful commit. When I regressed something with a
23-part XPConnect refactor a few months ago, it was very easy to isolate
the precise line that went wrong, and probably saved quite a few hours of
debugging.

bholley

Lucas Rocha

unread,

Jan 30, 2012, 9:41:16 AM1/30/12

to mozilla.de...@googlegroups.com, dev-pl...@lists.mozilla.org

Hi,

If people are committing code that breaks build or anything else,
they're probably doing it wrong. Micro-committing is about splitting
your patch into a successive series of smaller self-contained
functional steps. Every commit should make sense on its own and result
in a working tree. Micro commits actually make it simpler to spot the
cause of regressions with bisect because each commit likely represents
a simpler/smaller change to the code.

I'm personally a big fan[1] of micro-commits and would love to see
more developers at Mozilla doing it :-)

--lucasr

[1] http://lucasr.org/2011/01/29/micro-commits/

Boris Zbarsky

unread,

Jan 30, 2012, 9:44:05 AM1/30/12

to

On 1/30/12 8:54 AM, Felipe Gomes wrote:
> The mercurial workflow that most people use with the patch queues works really well for separating the patches into review units, but I can't think of a good reasoning for landing them without qfold'ing before.

The best reason is that it gives you better bisection if you do it right.

It also gives people trying to understand the changes later a better
idea of what's going on. Fundamentally, understanding a change after
the fact is the same as review and has similar constraints.

> When you try to bisect a bug, having commit states that are incomplete and don't even build (or worse, build but does not work properly) wastes a lot of time.

Yes. If people are pushing patches like this, they're doing it wrong.

> IMO there should not be changesets that you can't uniquely back out without breaking the tree.

That's bunk, sorry. It's trivial to have changesets that are not even
in the same bug but depend on each other. Think a changeset that adds
an API, then later a changeset that uses it. Now you can't back out the
API-adding changeset without breaking the tree.

> Opinions?

1) People should never be landing changesets that don't compile. Ever.
It makes bisecting hell.
2) People should work hard to avoid landing changesets that don't pass
tests.
3) Landing a huge glob of code as a single changeset is sometimes
unavoidable, but leads to regression-fixing hell in my experience. If
you commit to maintaining all that code for the rest of the lifetime of
the project, I guess you can land things that way.

You did ask for opinions... ;)

-Boris

Felipe Gomes

unread,

Jan 30, 2012, 11:13:06 AM1/30/12

to

On Monday, January 30, 2012 3:44:05 PM UTC+1, Boris Zbarsky wrote:

> Yes. If people are pushing patches like this, they're doing it wrong.
>
> > IMO there should not be changesets that you can't uniquely back out without breaking the tree.
>
> That's bunk, sorry. It's trivial to have changesets that are not even
> in the same bug but depend on each other. Think a changeset that adds
> an API, then later a changeset that uses it. Now you can't back out the
> API-adding changeset without breaking the tree.

That's right, sorry. I actually wanted to get into what Jeff said but I reached the wrong conclusion. I think though that is because non unit-complete patches are pushed frequently enough that the problem always hits me.

However, it's easier to get into that situation than what it looks like. I'll give an example. Please take this as *only an example* that I chose just because you'll be able to tell if my assumption is right or wrong. I see that the same potential situation can happen often by reading commit messages.

The following commit http://hg.mozilla.org/mozilla-central/rev/5e6e63f3aed8 says: "part 23. Set a backup timer to update plugin geometry if we don't manage to do any painting or layout flushes for a while"

By reading the title and the code, this tells me that if I was trying to bisect a problem with plugin painting, and hg bisect took me to part *22*, I wouldn't be able to judge if the problem was caused by bug 598482 or not (because it would need part 23 to work properly). Also, if the problem was not caused by that bug but resulted in a similar behavior of the tree without patch 23, bisect would lead me into the wrong direction.

To me, it boils down to the way that the patch queue flattens all the patches in a bug. A bug broken into various patches should actually be a merge, as usually you're first interested in which bug introduced a problem, and then you can bisect into the bug to find what caused the regression, as bholley mentioned is very useful (and I believe it was also your point about doing it right).

Cheers,
Felipe

Boris Zbarsky

unread,

Jan 30, 2012, 11:38:04 AM1/30/12

to

On 1/30/12 11:13 AM, Felipe Gomes wrote:
> The following commit http://hg.mozilla.org/mozilla-central/rev/5e6e63f3aed8 says: "part 23. Set a backup timer to update plugin geometry if we don't manage to do any painting or layout flushes for a while"
>
> By reading the title and the code, this tells me that if I was trying to bisect a problem with plugin painting, and hg bisect took me to part *22*, I wouldn't be able to judge if the problem was caused by bug 598482 or not (because it would need part 23 to work properly). Also, if the problem was not caused by that bug but resulted in a similar behavior of the tree without patch 23, bisect would lead me into the wrong direction.

"Maybe" (in practice, this only matters for a very very rare plugin
painting case). That particular patch queue does have some issues; if I
had done it all from the start, it would have looked somewhat different.

In terms of bisect, this is no different from any two separate bugs
where the first causes a problem and the second fixes it. If you then
bisect something else similar, you have to deal with the fact that it
used to work, then broke, then got fixed, then broke again.... The way
to handle it in this case is that if a bisect lands you in the middle of
the 23-patch queue you then test the before and after changesets for
that queue. If those are both OK, then you need to find some other
point where things broke. If not, then something in that 23-patch queue
broke things but you don't know exactly what (which is all you'd know if
it were flattened).

In this particular case flattening might have made some sense. On the
other hand, having the separate changesets in fact allowed me to bisect
a half dozen or so issues to a particular changeset in there, so on net
I think having it as separate changesets is a win.

> To me, it boils down to the way that the patch queue flattens all the patches in a bug. A bug broken into various patches should actually be a merge

Ideally, yes. hg doesn't have any sane way to do it that way, though.

> as usually you're first interested in which bug introduced a problem, and then you can bisect into the bug to find what caused the regression, as bholley mentioned is very useful (and I believe it was also your point about doing it right).

This approach only works if the merge commit itself is trivial, which is
where hg doesn't really let you create the sort of thing we want here...

-Boris

David Rajchenbach-Teller

unread,

Jan 30, 2012, 11:54:58 AM1/30/12

to Boris Zbarsky, dev-pl...@lists.mozilla.org

On 1/30/12 3:44 PM, Boris Zbarsky wrote:
> Yes. If people are pushing patches like this, they're doing it wrong.

Well, it is rather common to push, say, the Makefile in one patch and
the actual code in another. If we decide that this is wrong (which makes
sense), this policy needs to be made clear and explicit somewhere.

Cheers,
David

--
David Rajchenbach-Teller, PhD
Performance Team, Mozilla

signature.asc

Boris Zbarsky

unread,

Jan 30, 2012, 11:59:48 AM1/30/12

to

On 1/30/12 11:54 AM, David Rajchenbach-Teller wrote:
> On 1/30/12 3:44 PM, Boris Zbarsky wrote:
>> Yes. If people are pushing patches like this, they're doing it wrong.
>
> Well, it is rather common to push, say, the Makefile in one patch and
> the actual code in another.

This is fine if the actual code patch comes before the Makefile patch
(so three patches, probably: add code, build it, use code).

> If we decide that this is wrong (which makes
> sense), this policy needs to be made clear and explicit somewhere.

It seems to fall under the general "do not push changesets that don't
compile" thing that seems like it should be obvious....

-Bori

Justin Lebar

unread,

Jan 30, 2012, 12:41:05 PM1/30/12

to Boris Zbarsky, dev-pl...@lists.mozilla.org

>> To me, it boils down to the way that the patch queue flattens all the patches in a bug. A bug broken into various patches should actually be a merge
> Ideally, yes. hg doesn't have any sane way to do it that way, though.

At the risk of stating the obvious: This should be doable as an hg
extension. I don't think hg log has push data, so you couldn't treat
each *push* as a merge, but you could get 95% of the way there by
looking at commit messages.

Even if nobody commits the Wrong Way going forward, we'll still have
plenty of bad patch queues in our tree. So maybe we should try to
work around all the commits, past and future, with a tool.

-Justin

> _______________________________________________
> dev-planning mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-planning

Ehsan Akhgari

unread,

Jan 30, 2012, 4:42:12 PM1/30/12

to Justin Lebar, Boris Zbarsky, dev-pl...@lists.mozilla.org

We record a commit -> push mapping in some data base as part of the
pushloghtml tool we have on hg.mozilla.org. I have no idea how that data
can be queried remotely though.

--
Ehsan
<http://ehsanakhgari.org/>

On Mon, Jan 30, 2012 at 12:41 PM, Justin Lebar <justin...@gmail.com>wrote:

> >> To me, it boils down to the way that the patch queue flattens all the
> patches in a bug. A bug broken into various patches should actually be a
> merge
> > Ideally, yes. hg doesn't have any sane way to do it that way, though.
>
> At the risk of stating the obvious: This should be doable as an hg
> extension. I don't think hg log has push data, so you couldn't treat
> each *push* as a merge, but you could get 95% of the way there by
> looking at commit messages.
>
> Even if nobody commits the Wrong Way going forward, we'll still have
> plenty of bad patch queues in our tree. So maybe we should try to
> work around all the commits, past and future, with a tool.
>
> -Justin
>
> On Mon, Jan 30, 2012 at 11:59 AM, Boris Zbarsky <bzba...@mit.edu> wrote:

Christian Legnitto

unread,

Jan 30, 2012, 7:24:18 PM1/30/12

to Ehsan Akhgari, Boris Zbarsky, Justin Lebar, dev-pl...@lists.mozilla.org

It's in a sqlite database in the same directory as the repo...so not easy to query externally but fine from a hook.

On Jan 30, 2012, at 1:42 PM, Ehsan Akhgari <ehsan....@gmail.com> wrote:

> We record a commit -> push mapping in some data base as part of the
> pushloghtml tool we have on hg.mozilla.org. I have no idea how that data
> can be queried remotely though.
>
> --
> Ehsan
> <http://ehsanakhgari.org/>
>
>
> On Mon, Jan 30, 2012 at 12:41 PM, Justin Lebar <justin...@gmail.com>wrote:
>
>>>> To me, it boils down to the way that the patch queue flattens all the
>> patches in a bug. A bug broken into various patches should actually be a
>> merge
>>> Ideally, yes. hg doesn't have any sane way to do it that way, though.
>>
>> At the risk of stating the obvious: This should be doable as an hg
>> extension. I don't think hg log has push data, so you couldn't treat
>> each *push* as a merge, but you could get 95% of the way there by
>> looking at commit messages.
>>
>> Even if nobody commits the Wrong Way going forward, we'll still have
>> plenty of bad patch queues in our tree. So maybe we should try to
>> work around all the commits, past and future, with a tool.
>>
>> -Justin
>>
>> On Mon, Jan 30, 2012 at 11:59 AM, Boris Zbarsky <bzba...@mit.edu> wrote:

Justin Dolske

unread,

Jan 31, 2012, 5:59:27 AM1/31/12

to

On 1/30/12 5:54 AM, Felipe Gomes wrote:

> When you try to bisect a bug, having commit states that are
> incomplete and don't even build (or worse, build but does not work
> properly) wastes a lot of time. IMO there should not be changesets
> that you can't uniquely back out without breaking the tree. (of
> course there are exceptions to this rule, but they should be the
> exception rather than the common case).

Is this a (real) problem other than for bisecting? It sounds to me like
maybe the bisecting tool just needs to be smarter?

In any case, an "every changeset must build" requirement seems
impossible to hit in practice. We still have bad code land, or hit
unexpected problems on a platform, or hit unexpected compiler bugs. And
even if it builds, there can still be uncaught (untested) bugs that
could interfere with regression hunting (such as this weekend's bug 722167).

But I don't feel strongly either way. From the perspective of "code
history", it's nice to have organized patchsets not just for reviewing
but for understanding when understanding what was happening with code.
OTOH, there's a fuzzy line somewhere between "orderly series of patches"
and "chaotic stream-of-consciousness microcommiting".

Justin

Ted Mielczarek

unread,

Jan 31, 2012, 8:27:47 AM1/31/12

to Ehsan Akhgari, Boris Zbarsky, Justin Lebar, dev-pl...@lists.mozilla.org

On Mon, Jan 30, 2012 at 4:42 PM, Ehsan Akhgari <ehsan....@gmail.com> wrote:
> We record a commit -> push mapping in some data base as part of the
> pushloghtml tool we have on hg.mozilla.org. I have no idea how that data
> can be queried remotely though.

There's actually a JSON output format for the pushlog Hg extension:
http://hg.mozilla.org/mozilla-central/json-pushes?startdate=2%20hours%20ago

It's not particularly well documented, but you can read the code:
http://hg.mozilla.org/hgcustom/pushlog/file/e99a36d3fd4a/pushlog-feed.py#l245

-Ted

Nicholas Nethercote

unread,

Jan 31, 2012, 6:46:15 PM1/31/12

to Justin Dolske, dev-pl...@lists.mozilla.org

On Tue, Jan 31, 2012 at 2:59 AM, Justin Dolske <dol...@mozilla.com> wrote:
>
> In any case, an "every changeset must build" requirement seems impossible to
> hit in practice. We still have bad code land, or hit unexpected problems on
> a platform, or hit unexpected compiler bugs. And even if it builds, there
> can still be uncaught (untested) bugs that could interfere with regression
> hunting (such as this weekend's bug 722167).
>
> But I don't feel strongly either way. From the perspective of "code
> history", it's nice to have organized patchsets not just for reviewing but
> for understanding when understanding what was happening with code. OTOH,
> there's a fuzzy line somewhere between "orderly series of patches" and
> "chaotic stream-of-consciousness microcommiting".

Yeah. Seems to me there are some basic principles:

- Small patches help with bisecting (and reviewing!)

- Unbuildable revisions complicate bisecting.

- Deliberately creating unbuildable revisions is a bad idea.

- Accidentally creating unbuildable revisions happens sometimes; try
not to do it, but don't freak out when it happens.

The application of those principles relies mostly on common sense.

Nick