Review Board, Git and patch-series

174 views
Skip to first unread message

Stephen Gallagher

unread,
Oct 31, 2011, 10:45:15 AM10/31/11
to reviewb...@googlegroups.com, revie...@googlegroups.com
I've been trying to work out for some time now how to accomplish
patch-series in Git with Review Board. This is a very important part of
my project's workflow, and the lack of this support has been preventing
us from deploying.

I think I came up with three ideas, each building on each other, to
allow this support to come to fruition gradually.

Step 1)
Modify post-review so that it will create an individual review on the
Review Board server for each patch in a branch. Post-review should also
amend the commit history of the patches in the branch to tag them with a
link to the review in Review Board. Post-review would then be made aware
of the presence of this tag, and in the case of review updates, it
should use this tag to determine which review to update.

Furthermore, post-review should attach a complete list of the patches in
review as part of the review description (this would probably require
going back after creating the initial reviews and adding this afterwards
as a comment).
Ideally, it would read something like:
http://reviews.example.com/r/1
http://reviews.example.com/r/2 <--
http://reviews.example.com/r/3

So you'd always have an easy way to identify which patch in the series
you were looking at, as well as a view on which other reviews were
related.

If the branch adds or removes patches, it should add a new comment with
the new list of patches, ordered appropriately.

Pros: Changes are almost entirely in post-review.
Cons: Could result in heavy amounts of email, since each patch would
have its own email thread.


Step 2)
Post-review should generate patches formatted with 'git format-patch -M
-C --full-index' and attach them as raw files to the review(s). There
are several ways this could be done; one patch per review, or all
patches on the first review in the series*.

As I understand it, raw file upload support is already planned for 1.7,
so this wouldn't be a major effort to implement. Mainly just extending
the changes from Step 1) to include generating and uploading the files.

* This might have issues if the first patch in a series is ever removed.
Might require some careful design.


Step 3) (Long-term ideas, some of which are taken from gerrit)
Make the Review Board server into a public git repository. Post-review
could then commit to this public repository in a private branch which
would then be used to generate the reviews for Step 1) and Step 2).
Users can then set up a remote repository with 'git remote add' to be
able to easily retrieve the changes and perform local tests.

Doing things this way would require quite a bit more work than Step 1)
or Step 2), but in the long-term, it would fit a lot more closely with
common git workflows. We probably wouldn't need to have post-review
generate the raw files at all, here. We could have Review Board itself
generate them on-request rather than storing them separately (since it
will have access to the repository directly).

Stephen Gallagher

unread,
Nov 10, 2011, 9:51:29 AM11/10/11
to reviewb...@googlegroups.com, revie...@googlegroups.com
I just thought I might mention this email again, since I saw no
responses the first time. I noticed someone else on the list today
asking about patch-sets (this time for Mercurial) and thought that maybe
some of these ideas might work there as well.

Stephen Gallagher

unread,
Apr 23, 2012, 10:46:04 AM4/23/12
to reviewb...@googlegroups.com, revie...@googlegroups.com
I thought it might be worth bumping this thread again, given Christian's
comments in the thread "Focus and Priorities". I would like to hear some
thoughts on my suggested approach here.

Christian Hammond

unread,
Apr 23, 2012, 2:38:29 PM4/23/12
to revie...@googlegroups.com, reviewb...@googlegroups.com, revie...@googlegroups.com
Hi Stephen,

Thanks for bumping this thread. I've been giving this functionality some thought lately, and here's what I'm currently leaning toward:

1) Give Review Board support for attaching multiple diffs (not just one) in any given update. It would work like today in that a user would run post-review and another user would see the latest changes, except it'd be broken up into several diffs.

2) Update post-review to support providing multiple diffs on upload automatically. There could be an option for squashing.

3) Store all the extra diff metadata and show it per-diff set in the diff viewer.

That's the first step and gets us patch series. The next would be deeper repo integration, which will benefit from all the deep hosting service work I'm doing right now for GitHub.

4) Provide an endpoint for WebHooks on different services (like GitHub) that will auto-post review requests when there are pull requests.

5) Tie into the APIs to accept/merge changes when it's possible (possibly easier for hosting services, but harder for standalone repos)

6) Add metadata for branches and make it possible for Review Board to tie in closer with a branch on an accessible repo, to make updating easier. This isn't fully thought out yet.

7) Eventually add some sort of tree browser. This gets very complicated on Git without a hosting service or without the Git repo being directly accessible.


Whatever we do, I think it's important that Review Board be itself aware of things like patchsets and DVCS workflows.

Christian

> --
> Want to help the Review Board project? Donate today at http://www.reviewboard.org/donate/
> Happy user? Let us know at http://www.reviewboard.org/users/
> -~----------~----~----~----~------~----~------~--~---
> To unsubscribe from this group, send email to reviewboard...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/reviewboard?hl=en

John J Lee

unread,
Apr 26, 2012, 7:20:21 PM4/26/12
to reviewb...@googlegroups.com
On Mon, 23 Apr 2012, Christian Hammond wrote:

> Hi Stephen,
>
> Thanks for bumping this thread. I've been giving this functionality some thought lately, and here's what I'm currently leaning toward:
>
> 1) Give Review Board support for attaching multiple diffs (not just one) in any given update. It would work like today in that a user would run post-review and another user would see the latest changes, except it'd be broken up into several diffs.
>
> 2) Update post-review to support providing multiple diffs on upload automatically. There could be an option for squashing.
>
> 3) Store all the extra diff metadata and show it per-diff set in the diff viewer.
>
> That's the first step and gets us patch series. The next would be deeper repo integration, which will benefit from all the deep hosting service work I'm doing right now for GitHub.
>
> 4) Provide an endpoint for WebHooks on different services (like GitHub) that will auto-post review requests when there are pull requests.
>
> 5) Tie into the APIs to accept/merge changes when it's possible (possibly easier for hosting services, but harder for standalone repos)
>
> 6) Add metadata for branches and make it possible for Review Board to tie in closer with a branch on an accessible repo, to make updating easier. This isn't fully thought out yet.
>
> 7) Eventually add some sort of tree browser. This gets very complicated on Git without a hosting service or without the Git repo being directly accessible.
>
>
> Whatever we do, I think it's important that Review Board be itself aware of things like patchsets and DVCS workflows.

This sounds fantastic.

Having used crucible (a commercial review system) recently, some of the
good points of RB have become clearer to me. Despite its faults, crucible
does have something that resembles patch series. But there *are* plenty
of faults: in particular there's not really a way to update a patch! You
can add new patches to a review. But that is so uncontrolled as to get
confusing quite quickly, at least without very constrained processes or
tools on top of crucible. Is a new patch part of a patch series, or is a
new version of an old patch, or something else? In crucible, it's not
clear. You can't delete or hide old patches that have comments, so it's
hard to clean up to make the review easier to understand. The best you
can do is start a new review and link it to the old one. RB handles this
in a much nicer way with its patch history, interdiffs, and dual comments
and code views (and unlike how I've found crucible, it doesn't feel slow
to browse through the code you're reviewing). Even ignoring the
hackability and not-going-away-ness of RB that you get from it being a
successful open source project, as an end user, if RB had even the first
few steps of what you describe I'd have trouble finding anything in
crucible to prefer over RB.

I wonder if the DVCS integration will allow for faster viewing of
interdiffs and squashed patches, similar to using gitk to diff between two
revisions?

One wish: I hope the increased DVCS integration doesn't lead to
inflexibility with respect to mapping DVCS concepts to patches. I want to
be able to contribute plugin or client code for reviewboard that lets me,
for example, turn a setquence of topgit branches into a patch series
review (and update that patch series review with a single postreview-like
command). Other times I might just want to upload all the commits from a
branch. And maybe more unusual and ad hoc things with a one-off script or
command invocation.


John

Christian Hammond

unread,
Jun 7, 2012, 3:40:39 PM6/7/12
to reviewb...@googlegroups.com
Hi Ryan,

None yet. I've been working on this GitHub API migration we needed to rush out, and my next task is to get RB 1.7 out the door. This work will be my priority after that.

Christian

--
Christian Hammond - chi...@chipx86.com
Review Board - http://www.reviewboard.org
VMware, Inc. - http://www.vmware.com


On Thu, Jun 7, 2012 at 6:10 AM, Ryan Raasch <ryan....@gmail.com> wrote:
Hi,

This looks great! This is the type of functionality we are looking for. The patch set functionality
is especially needed. Any progress on this? I will see how much I can read from the code, any pointers on how to start
adding this? I just thought I would give it a run... :)

Cheers,
Ryan
 

mwoehlk...@gmail.com

unread,
Dec 15, 2012, 2:13:59 PM12/15/12
to reviewb...@googlegroups.com, revie...@googlegroups.com, mwoehlk...@gmail.com
On Monday, October 31, 2011 10:45:15 AM UTC-4, Stephen Gallagher wrote:
I've been trying to work out for some time now how to accomplish
patch-series in Git with Review Board. [...]


Modify post-review so that it will create an individual review on the
Review Board server for each patch in a branch.

I actually wouldn't like this. First off, I don't see it doing anything even remotely reasonable if a change to a request is anything other than new commits at the end (e.g. rewriting a commit early in the series, or injecting commits into the middle of the branch history). I'd also be worried about how it handles rebases. Gerrit, for example, does very badly with rebases. RB currently, by treating requests as a single diff, does hugely better, since a straight rebase doesn't cause a lot of change in the overall diff.

I think the ideal solution is to slurp in both the overall diff (so that inter-revision diffs are sane), and also the per-commit diffs, with a way to drill down into a request to view the individual commits. Then to diff requests at that level, RB should be clever enough to match up commits (even if their SHA's are different, probably by some degree-of-similarity heuristic), show differences between corresponding commits, new commits in the chain, commits removed from the chain, etc... basically, diff patch sets like file trees.

Yes, it's more work (and mostly server side), but the end result is much better, and without the regressions versus the current 'squash the branch into one diff' that your approach would introduce.

Make the Review Board server into a public git repository.

I'd be careful with this... it's actually been one of the main PROBLEMS we have with gerrit. If you aren't careful, gerrit's repository gets out of sync with the actual repository, at which point both posting new requests and merging existing branches can quit working. It sounds like your thoughts for how to use this might not take it so far as to get into that trap, at least. (I think the main trick is to make it resilient against master having advanced beyond what RB considers 'latest'... or even to avoid RB having such a concept altogether, and just accept whatever SHA a request says is its upstream.)

-- 
Matthew

Matthew Woehlke

unread,
Dec 16, 2012, 1:28:46 PM12/16/12
to reviewb...@googlegroups.com, revie...@googlegroups.com
So I was thinking more about this, and came up with a vision how I think
the UI would work best.

The basic idea is to add a new option to diff display, 'view as patch
series'. The important parts are a) not available if a request is a
single commit, or is pre-commit (or e.g. repositories for which patch
series aren't available), and b) with the option OFF, things stay as
they are now. That is, the diff shows the whole request, diffs between
request revisions work as now (including good results when a rebase
occurs), etc.

With it ON, with a single request revision (i.e. versus base, not versus
a different revision of the request), the only difference is that the
file list is now a file TREE. The top-level nodes are commits, ordered
as in the commit history. The child nodes are files, as usual. There
would be some SMALL (e.g. a line of text is okay, a big block is too
much) divider between commits, which are presented in continuous pages
the way files are now. (The pagination doesn't really change, except
probably to add a bias to break pages between commits.)

When comparing revisions, commit matching is done as previously
mentioned, then each commit is compared as revision requests are
currently. Deleted or added commits would thus show the same as a diff
hunk in one request revision and not in the other. Rearranged commits
should be displayed like rearranged code.

(On that note, it would be nice if rearranged code could be improved to
show the original code opposite, for comparison, but clearly marked that
it came from elsewhere. Likewise then for rearranged commits.)

Note I haven't said anything about commit logs. I would actually ignore
those server-side in the first pass, and have post-review fake up diff
hunks as if a file (e.g.) ".gitlog" was created. For each individual
commit, this would be the log for just that commit, always as if the
file didn't previously exist (even though trying to apply such a patch
series would of course fail; the reason is so that RB always presents
the log as a strict creation), and likewise for the unified patch, but
with the entire log. This allows the log to be reviewed the same as
code, diffs of the full log against request revisions work neatly, etc.

In the long term, this could be automated server-side if there is enough
need/benefit, but the only thing that should change UI-wise would be for
the file header to be slightly different to indicate a commit log versus
an actual file in the repository. (I'm not convinced of the necessity,
however, and I can imagine it would be non-trivial to do server-side.)

Thoughts?

--
Matthew

Stephen Gallagher

unread,
Mar 12, 2013, 8:52:18 AM3/12/13
to reviewb...@googlegroups.com, revie...@googlegroups.com, Matthew Woehlke



On Sun 16 Dec 2012 01:28:46 PM EST, Matthew Woehlke wrote:
...
I was thinking it might be a good time to start bringing this discussion
up again. I know things are probably pretty intense with the 1.8 release
plans, but I know there are a lot of people interested in this topic and
I'd like to get it back into the public view.


I had an offline conversation with Christian about patch-series a month
or so ago and I think I should probably try to summarize/transcribe the
thoughts we had back then as well.



Christian:
I absolutely want to introduce patchset support. My thought is that
post-review (well, rbt post, the successor coming in RBTools 0.5) would
allow you to optionally post a series of commits as one patchset,
instead of squashing them.

Review Board would be able to differentiate patchsets vs. standard
diffs. When a review request doesn't contain any patchsets, it'd be just
like it is today. However, when a patchset is involved, we change things
a little bit.

The revision selector is still there, but a patchset now counts as a
revision. The patchset will show each and every commit, along with their
metadata (descriptions and such). They could be viewed squashed (same
view as today), or per-commit.

We'd have some intelligence to try to see if a patchset was just amended
when updated next, or if part of it was replOn Sun 16 Dec 2012 01:28:46
PM EST, Matthew Woehlke wrote:
...
aced, or what. We'd still have a new patchset entry internally, but we'd
take advantage of some of our diff data sharing to see what the common
parts are. We can then indicate to the reviewer that commits were added,
or changed, or where. That'd make it easier to see what commits actually
need to be reviewed.

Along with this, there'd be some other improvements. We'd have a field
indicating where the development branch/repository is (so that people
can pull from you, if it makes sense in their setup), and the
branch/revision where the commit was made (as part of the submitted
form, and also available to set via new RBTools commands and options).

So that's the basic thought right now. There is a lot on our plates, so
I don't know when... Might not be 1.8 at this rate, but parts of it
might hit it. We're hoping to do more frequent 1.x releases. 1.8 at this
point is planned to be a rewrite of parts of the JavaScript codebase to
make it more extensible, and add some student projects and a few other
nice new bits of UI, and I'm hoping we can release by April.




Stephen:
> We'd have some intelligence to try to see if a patchset was just
> amended when updated next, or if part of it was replaced, or what.
> We'd still have a new patchset entry internally, but we'd take
> advantage of some of our diff data sharing to see what the common
> parts are. We can then indicate to the reviewer that commits were
> added, or changed, or where. That'd make it easier to see what commits
> actually need to be reviewed.
>

That will definitely be some interesting logic, especially if patches
are split or merged. We'd probably want some heuristics to determine
when we should treat a patch as a modification of an earlier patch vs. a
whole new patch. (e.g. what happens when as part of a review, someone
asks you to split a patch into two).


> Along with this, there'd be some other improvements. We'd have a field
> indicating where the development branch/repository is (so that people
> can pull from you, if it makes sense in their setup), and the
> branch/revision where the commit was made (as part of the submitted
> form, and also available to set via new RBTools commands and options).
>

On a related note, it would be a fantastic feature if we could also tie
this into github pull requests. It looks like the github API[1] allows a
client to register for notification of pull requests. It would be
excellent to be able to automatically-generate a ReviewBoard review for
the requested branch.

Similarly, it would be pretty excellent if we could tie in a button to
push the Merge Button™, though that would necessitate adding a
permission feature on ReviewBoard to set who was allowed to perform
merges. We'd also probably want to implement a policy such as "The
review must have at least N 'Ship It!' values before it can be merged."



Christian:
> That will definitely be some interesting logic, especially if
> patches are split or merged. We'd probably want some heuristics to
> determine when we should treat a patch as a modification of an
> earlier patch vs. a whole new patch. (e.g. what happens when as part
> of a review, someone asks you to split a patch into two).

Yeah, we're going to have to play around with it, figure out what makes
sense.


> On a related note, it would be a fantastic feature if we could also
> tie this into github pull requests. It looks like the github API[1]
> allows a client to register for notification of pull requests. It
> would be excellent to be able to automatically-generate a ReviewBoard
> review for the requested branch.

We've discussed this a couple times. There's issues to work out here.
For instance, what happens if someone submits a pull request and doesn't
have a RB account? How does that work exactly? Maybe we'd need a "Log in
with GitHub" ability for it, but now we're adding new auth backend
infrastructure and possibly account linkage to the list of tasks.

There's a few integration things there we'd need to solve...

> Similarly, it would be pretty excellent if we could tie in a button
> to push the Merge Button™, though that would necessitate adding a
> permission feature on ReviewBoard to set who was allowed to perform
> merges. We'd also probably want to implement a policy such as "The
> review must have at least N 'Ship It!' values before it can be
> merged."

I have some notes in a notebook somewhere about such a button. I think
it'd be interesting to add. Again, lots of new bits to add to make that
happen.

The "at least N Ship Its!" is something that a lot of companies want,
but not all want the same logic. Some want just a total, some want from
certain groups, some from certain people, etc. I think we'd need
something more general that could have an extension for defining such
rules. I think it's separate work from the DVCS work.


There's no way the DVCS work will happen for 1.8 anyway. It's going to
be a 2.0 feature. 1.8 has a narrower scope and we want to keep it that
way so it doesn't turn into a year-long release.

Matthew Woehlke

unread,
Mar 12, 2013, 1:30:31 PM3/12/13
to Stephen Gallagher, reviewb...@googlegroups.com, revie...@googlegroups.com
Detailed comments below, but summarizing my personal short term priorities:

1. Custom field support.
2. Support DVCS revision in the changenum field.
3. Add a field for local branch name.

Of course, (1) means that (3) (and (2), sort-of) can be done with an
extension... which is one reason (1) is first, but also because we have
local need for custom fields that are probably not useful to a wide
enough audience to justify being built-in. This lack is the biggest pain
point for us right now.

I think pushing some things (especially patch set support) to 2.x sounds
reasonable, but please, *please* at least have custom field support in
1.8 :-).

(Besides, a number of the other features can be implemented to at least
some extent with better extension support. This would be a good path for
a wider developer base to work on them, and/or to 'try them out'
conceptually before bringing them into RB core... if that is even
valuable in the long term. It may end up that some features work so well
as extensions that there is no strong reason they can't just stay
extensions.)

On 2013-03-12 08:52, Stephen Gallagher wrote:
> [paraphrasing Christian:]
> I absolutely want to introduce patchset support. My thought is that
> post-review (well, rbt post, the successor coming in RBTools 0.5) would
> allow you to optionally post a series of commits as one patchset,
> instead of squashing them.
>
> Review Board would be able to differentiate patchsets vs. standard
> diffs. When a review request doesn't contain any patchsets, it'd be just
> like it is today. However, when a patchset is involved, we change things
> a little bit.
>
> The revision selector is still there, but a patchset now counts as a
> revision. The patchset will show each and every commit, along with their
> metadata (descriptions and such). They could be viewed squashed (same
> view as today), or per-commit.

So... while I understand that this is technically more feasible, I'd be
concerned with how it will handle e.g. rebases. Also, I think it will be
much more logical for reviewers to keep commits within a request
separate from revisions of the request, rather than mixing these concepts.

Basically, unless you're talking about a project with a draconian 'no
rewriting history' policy, request revisions and commits are different
(and incompatible) concepts. Trying to shoehorn the one into the other
is likely to result in cases where the user experience is less than it
would be if they were treated as being different.

> We'd have some intelligence to try to see if a patchset was just
> amended when updated next, or if part of it was replaced, or what.
> We'd still have a new patchset entry internally, but we'd take
> advantage of some of our diff data sharing to see what the common
> parts are. We can then indicate to the reviewer that commits were
> added, or changed, or where. That'd make it easier to see what
> commits actually need to be reviewed.

I still like my suggestion better; continue to upload a single patch
(which would create a single request revision), but accept patches that
are really 'git am' chains. Then detect these server-side and add a
/second layer/ with the option to inspect individual patches.

This should still be able to take advantage of diff caching and similar.
Detection of added/removed/reordered patches can then be done using the
usual diff mechanisms on the complete patch files (some heuristics will
be needed to interpret the results, but the actual comparison can use
existing algorithms).

> Along with this, there'd be some other improvements. We'd have a field
> indicating where the development branch/repository is (so that people
> can pull from you, if it makes sense in their setup), and the
> branch/revision where the commit was made (as part of the submitted
> form, and also available to set via new RBTools commands and options).

I expect I will rehash the above paragraph, but... IMO, minimum
requirements for decent DVCS support are two branch fields: one the
merge target branch (i.e. 'master' most of the time), and the name of
the head as pushed to the remote. Also a third field with the actual SHA
of the request (I would prefer to have this in the existing changenum
field; it seems to equate with what that field is intended to be).

I also wouldn't be worried about multiple repository support initially;
even some public open source projects use single repositories with
either branch-level commit access or just plain trusting people... and
this is probably the more common case (in fact, I would expect not
having a single canonical repository for code sharing to be rare) in
corporate settings.

> On a related note, it would be a fantastic feature if we could also tie
> this into github pull requests. It looks like the github API[1] allows a
> client to register for notification of pull requests. It would be
> excellent to be able to automatically-generate a ReviewBoard review for
> the requested branch.

I wouldn't even try to do this in RB proper in the short term (if ever);
it's too obviously excellent fodder for an extension or external tool...
which means it can be developed in parallel without affecting RB's
release schedule.

The only possible reason I can think of why it would ever need to be
fully integrated is if it is not otherwise possible to provide
reasonable security (i.e. if it would not be possible to secure access
to the well-permissioned user that the tool would need to have to talk
to RB).

> Similarly, it would be pretty excellent if we could tie in a button to
> push the Merge Button™, though that would necessitate adding a
> permission feature on ReviewBoard to set who was allowed to perform
> merges. We'd also probably want to implement a policy such as "The
> review must have at least N 'Ship It!' values before it can be merged."

For the record: this is one of the main reasons people give why RB is
not an acceptable alternative to gerrit.

> The "at least N Ship Its!" is something that a lot of companies want,
> but not all want the same logic. Some want just a total, some want from
> certain groups, some from certain people, etc. I think we'd need
> something more general that could have an extension for defining such
> rules. I think it's separate work from the DVCS work.

This sounds like a good excuse to develop the extension system to allow
an extension to implement the logic. Probably this just needs an
internal field if a request can be merged that can be set by the extension.

--
Matthew

Christian Hammond

unread,
Mar 12, 2013, 2:57:50 PM3/12/13
to revie...@googlegroups.com, Stephen Gallagher, reviewb...@googlegroups.com, revie...@googlegroups.com
#1 will absolutely happen for 1.8. It's in progress now.

One of our students is writing support for specifying a revision and branch a change was pushed as, after you close a review request. That should make 1.8.

Christian
> --
> Want to help the Review Board project? Donate today at http://www.reviewboard.org/donate/
> Happy user? Let us know at http://www.reviewboard.org/users/
> -~----------~----~----~----~------~----~------~--~---
> To unsubscribe from this group, send email to reviewboard...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/reviewboard?hl=en
> --- You received this message because you are subscribed to the Google Groups "reviewboard" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to reviewboard...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Matthew Woehlke

unread,
Mar 12, 2013, 4:00:47 PM3/12/13
to reviewb...@googlegroups.com
On 2013-03-12 14:57, Christian Hammond wrote:
> #1 will absolutely happen for 1.8. It's in progress now.

Awesome; thanks!

> One of our students is writing support for specifying a revision and
> branch a change was pushed as, after you close a review request. That
> should make 1.8.

Er... "after you close"? Goodness, I've been doing that for months with
rbt and git-rb.

They really should be present when a diff is uploaded. (The branch name
is easy enough to do in a custom field. The SHA I'd prefer to be able to
stick in changenum.)

...or do I misread something?

--
Matthew
Reply all
Reply to author
Forward
0 new messages