Re: [sympy] Merging instead of rebasing

67 views
Skip to first unread message

Aaron Meurer

unread,
Jul 15, 2015, 2:13:47 PM7/15/15
to sy...@googlegroups.com


On Tue, Jul 14, 2015 at 10:32 AM, Joachim Durchholz <j...@durchholz.org> wrote:
Am 14.07.2015 um 16:39 schrieb Jason Moore:
It wasn't ignored, I just don't see it. All I see are detailed agreements
or counters to each point that has been mentioned to be negative about
rebasing.

Indeed, I misrepresented that a bit.
I can't hope to discuss solution details if we don't even agree on the analysis. Even less if the solution isn't 100% complete yet.

Here is my two sentence solution:

Rebasing has enough substantial negative effects on contributions that we'd
like to avoid encouraging it and using it in SymPy development. The few
benefits that rebasing offers are not worth the cost of the loss
contributions.

Can you write a two sentence solution to solving the loss of contributions
due to git kung fu issues? I'm happy to read it if so.

1) I think the negative effects can be nullified by giving people a tried-and-true, undoable git workflow ("I think" is what I meant with "incomplete" above).

If all you care about is the ability to undo things, git revert works just fine. It has far fewer pitfalls than rebasing, and maintains the record that the thing was reverted.

2) Rebasing is the only way to clean up a PR that has undergone several rounds of review.

This clean up is very often unnecessary, and indeed, detrimental to anyone who wants to study the git history to see how things developed (and if you don't care about that, then the history doesn't matter anyway).

You shouldn't think of the git history as something that you should clean up and make nice before publishing. It's a record of your thought process. You can't go back and change your thought process. The final published item is the code itself (we don't include any metadata from git in the files we release on PyPI). 

Changing the history of your revisions is detrimental to the open philosophy that you should have when developing in open source. We should not be afraid to make mistakes, and even have it in a permanent record that we made those mistakes. Good open source software, certainly SymPy, is built in the bazaar, not the cathedral. 

Aaron Meurer


My two-sentence position on the current official policy:

A) Rebasing is indeed a more advanced use of git, so it should never be requested, and recommended only with a reference to the explanation of the workflow.
B) The current official policy is too strict, the justifications are either bogus or can be avoided.

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/55A52B99.70000%40durchholz.org.

For more options, visit https://groups.google.com/d/optout.

Ondřej Čertík

unread,
Jul 15, 2015, 2:23:55 PM7/15/15
to sympy
On Sun, Jul 12, 2015 at 2:53 AM, Joachim Durchholz <j...@durchholz.org> wrote:
> Am 12.07.2015 um 00:17 schrieb Jason Moore:
>>
>> FYI: We had a discussion at SymPy on how to synchronize with master and
>> decided that we no longer want to allow or encourage rebasing after a pull
>> request is made to the main SymPy repo.
>
>
> I think that's overstated.
>
> I see some reasons against merging:
>
> A) A history with merges is harder to read. You need to track multiple
> parallel developments, and that's particularly hard if merge conflicts were
> resolved (because some of the code you see didn't get into master, at least
> not in the form it's committed initially).
>
> B) The results from `git bisect` become harder to read if a bug stems from
> the combination of two changes that were made in parallel branches: the
> bisect will point at the merge commit, but the actual problem happened in
> some earlier commit.
> The more parallel branches in existence, the more prominent this will
> become.

I agree with this --- many times there is just a few lines change,
that took 50 commits to get done, because the author was learning how
to make it work.

For these cases I recommend to send a new PR with a polished (=rebased) history.

Then there is no problem.

Ondrej

>
>> The main reason is that we stall too many new contributors by forcing them
>> to do too much git kung fu.
>
>
> This problem can be mitigated via interactive rebasing, but I agree it's an
> extra hoop to jump through and cannot be asked of everyone.
>
>> But there are additional important reasons too:
>>
>>
>> - Once a pull request is "public" the history should not change because
>> other people may have pull it.
>
>
> Does anybody ever pull from a pull request?
> I can see that happening when collaborating on a fix, but it's not something
> affecting newbies.
>
>> - Rebasing with merge conflicts is much more difficult
>
>
> Only if you try to rebase everything in a single go. You may end up
> resolving the same conflict over and over for each commit, and that can get
> confusing very quickly, and it's also annoying and a lot of needless work.
>
> However, there is a process that's actually quite easy:
> 1) `git rebase --interactive` first; this is a rebase *within your working
> branch*.
> I also squash commits that were a continuation of previous work. I.e. if I
> have a sequence of 1.code - 2.reformat - 3.code - 4.bugfix - 5.reformat -
> 6.code - 7.bugfix - 8.reformat, I like to rearrange this to 1.code - 3.code
> - 4. bugfix - 6.code - 7.bugfix - 5.reformat - 7.reformat, and then I squash
> this into two commits, one with 13467 for the coding and 57 for
> reformatting.
> If this still is confusing, this can be broken down into a series of
> primitive steps: (a) swap two commits, leaving all others alone, (b) merge
> two adjacent commits, leaving all others alone; repeat until the history is
> rewritten in a way that's easy to read and review.
> 2) `git fetch master && git rebase master`. This will expose the merge
> conflicts, but now it's far less commits, and it's also giving you much less
> pain because most conflicts will apply just in one commit.
>
>> - Reviewers are no longer able to see commit by commit changes in a review
>> process.
>
>
> Only if you squash everything into a single commit.
> Reordering and squashing commits makes sense if some work jumped between
> multiple things, such as reformatting, fixing docstrings, and actual code
> changes (and if the code changes touched multiple loosely-related points, it
> may make sense to keep these in separate commits anyway).
>
>> - Github does not manage the diffs in a PR review if a rebase happens.
>
>
> You mean the "click here to see outdated diffs" thing?
> Sure, part of the discussion are removed one more click, which means that
> people won't know to look there and usually won't find them.
> However, if a discussion is relevant for posteriority, it should have been
> merged into the commit in some way, either into the commit comment or in a
> code comment or docstring. After all, people who're looking at `git
> annotate` or `git log` aren't going to see these discussions either.
>
>> If you want to rebase then you should do it before you make a pull
>> request,
>> either locally or on your fork's branch.
>
>
> Oh, now I see what's the concern - indeed, rebasing on the main repo is bad,
> very bad, and a real no-go. But that's a non-issue!
>
> For pull requests, they're on separate branches, and people don't work on
> the PR branch of anybody else (unless they explicitly have agreed to
> cooperate in that way, in which case the work branch should indeed not be
> rebased anymore unless all downstream collaborators agree).
>
> Newbies don't even have the permissions to modify the main repo.
>
> The one point where I see strong language against rebasing justified is when
> merging into master on the main repo. This is merge, and merge only, and
> should be written in bold letters into the project maintenance guidelines.
> For people who contribute pull requests, I do not think that applies.
>
> YMMV, but I'd be interested to hear why.
>
> Regards,
> Jo
>
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/55A22B0B.20405%40durchholz.org.

Jason Moore

unread,
Jul 15, 2015, 2:46:30 PM7/15/15
to sy...@googlegroups.com
Yes I agree, if your history is so heinous, and bothers you so much, and you really really want to rebase, just make a new PR for pure merging after the review is done (or discuss with you reviewers about it). But otherwise it doesn't really matter. For example, if git didn't have a rebase command, then everything would work just fine and we would not even care about this issue. Git works perfectly fine without the rebase command, it is a complete functioning and useful tool without it. Cleaning the history is 99% of the time an unnecessary step. The arguments for cleaning the history are very superficial and really amount to "I don't like that the history is messy.". There is very little gain from rebasing but the overhead is high as we've shown. The rebase command does not actually solve any problems, it just introduces them. The main goal here is that reviewers should not recommend rebasing so I'm not fond of Matthew's suggestion as it is written because it removes that guideline. We've also said that you can rebase your own PR if you discuss that with your reviewers. So any rebasing purists can work that out on their PRs.

I suggest that we let the proposal stand and that Jo can document all of the cases over the next several months where rebasing was necessary and where this guideline caused more problems that it solved. Once Jo builds his case with real data he can propose a new guideline and then we can discuss. Is that sufficient to move forward here?

AMiT Kumar

unread,
Jul 15, 2015, 2:54:41 PM7/15/15
to sy...@googlegroups.com
> Changing the history of your revisions is detrimental to the open philosophy that you should have when developing in open source. We should not be afraid to make 
> mistakes, and even have it in a permanent record that we made those mistakes. Good open source software, certainly SymPy, is built in the bazaar, not the 
> cathedral. 

Well said, I think, this point won the argument!


> I suggest that we let the proposal stand and that Jo can document all of the cases over the next several months where rebasing was necessary and where this 
> guideline caused more problems that it solved. Once Jo builds his case with real data he can propose a new guideline and then we can discuss. Is that sufficient to 
> move forward here?

+1

AMiT Kumar

Joachim Durchholz

unread,
Jul 15, 2015, 3:02:10 PM7/15/15
to sy...@googlegroups.com
Am 15.07.2015 um 20:23 schrieb Ondřej Čertík:
> I agree with this --- many times there is just a few lines change,
> that took 50 commits to get done, because the author was learning how
> to make it work.

That too, yes.

> For these cases I recommend to send a new PR with a polished (=rebased) history.
>
> Then there is no problem.

Ah, nice. Sounds like a good alternative approach.

Joachim Durchholz

unread,
Jul 15, 2015, 3:45:11 PM7/15/15
to sy...@googlegroups.com
Am 15.07.2015 um 20:54 schrieb AMiT Kumar:
> Well said, I think, this point won the argument!

Actually it didn't, each sentence is easily refuted:

>> Changing the history of your revisions is detrimental to the open
>> philosophy that you should have when developing in open source.

Open philosophy and open source are strictly orthogonal.

There are automotive companies that do "glass-pane production", i.e. you
can look into the actual production process; yet their products are as
proprietary as their legal departments can make them (e.g. apply "design
copyright" so nobody can sell a replacement mirror).

The converse is available as well: strictly open-source libraries of
highest quality that are developed mostly behind closed doors, witness
Google's Guava library (you *can* get your changes in, but it's a major
undertaking and you don't see the team's internal priorities so you
don't even know where a contribution would be welcome).

>> We should not be afraid to make
>> mistakes, and even have it in a permanent record that we made those
> mistakes.

It's also not about showing mistakes to the public. It's about making
the commit log more useful.

>> Good open source software, certainly SymPy, is built in the
>> bazaar, not the cathedral.

It's not an opposite, it's a trade-off. Bazaar tends to favor quantity
over quality, cathedral is the reverse.

Case in point is Linux, according to Con Kolivas of kernel scheduler
fame: http://ck-hack.blogspot.be/2010/10/other-schedulers-illumos.html
and also according to the list of CVEs for Linux (bazaar) vs. OpenBSD
(cathedral).

>> I suggest that we let the proposal stand and that Jo can document all of
>> the cases over the next several months where rebasing was necessary and
>> where this
>> guideline caused more problems that it solved. Once Jo builds his case
>> with real data he can propose a new guideline and then we can discuss. Is
>> that sufficient to
>> move forward here?
>
> +1

We'd just be arguing about whether rebasing in what particular case
would have helped or not.
Aaron's stance is that the commit log should be a
stream-of-consciousness thing. I see no value in that, and actually
think it's damaging (I found the log to be less than helpful to find out
what was changed and why), but essentially it's priorities and
preferences, both subjective, and there's no point in arguing that.

Regards,
Jo

Matthew Brett

unread,
Jul 15, 2015, 7:47:28 PM7/15/15
to sympy
Hi,
I agree the arguments against rebasing are overstated, but it sounds
like you have a compromise that could work - get agreement rebase to a
new PR...

Cheers,

Matthew

Ondřej Čertík

unread,
Jul 15, 2015, 8:22:46 PM7/15/15
to sympy
On Wed, Jul 15, 2015 at 12:46 PM, Jason Moore <moore...@gmail.com> wrote:
> Yes I agree, if your history is so heinous, and bothers you so much, and you
> really really want to rebase, just make a new PR for pure merging after the
> review is done (or discuss with you reviewers about it). But otherwise it
> doesn't really matter. For example, if git didn't have a rebase command,
> then everything would work just fine and we would not even care about this
> issue. Git works perfectly fine without the rebase command, it is a complete
> functioning and useful tool without it. Cleaning the history is 99% of the
> time an unnecessary step. The arguments for cleaning the history are very
> superficial and really amount to "I don't like that the history is messy.".
> There is very little gain from rebasing but the overhead is high as we've
> shown. The rebase command does not actually solve any problems, it just
> introduces them. The main goal here is that reviewers should not recommend
> rebasing so I'm not fond of Matthew's suggestion as it is written because it
> removes that guideline. We've also said that you can rebase your own PR if
> you discuss that with your reviewers. So any rebasing purists can work that
> out on their PRs.

Yes, once the PR is up and people start reviewing, and I start
following the author's branch and have it checked out (and possibly
have some commits of my own in there), then a rebase is a tremendously
impolite thing to do, as it screws my own git repo, and it also screws
the PR itself, as the old discussion is suddenly hanging there without
commits and it is not clear what changes were made. So for these
reasons, we should say no rebase after you submit a PR for review.


That being said, once you become experienced developer, you should
take pride in submitting a nice set of polished patches, with good git
commit logs, each patch logically self consistent, so that it is a
pleasure to look at your work, and easy to review for reviewers. And
for that you need rebase. There is high value in having good commit
history, for debugging or just learning what was done. So I agree with
Jo on this. But the point is that you do this before you post a PR.
From that point on, you just keep committing and merging.


And for cases when the history really gets messy, you just send a new
PR with polished commits. This happens for new authors, but sometimes
I do that myself. I see no harm in submitting a new polished PR and
closing the old one.


So this should be our recommendation. Jason already agreed to the
above. Yo and others, do you also agree with the above?

Ondrej

Aaron Meurer

unread,
Jul 15, 2015, 11:33:00 PM7/15/15
to sy...@googlegroups.com
Actually my primary argument is that rebasing adds unnecessary complexity to git, which is already hard enough to use. That's why the primary point of this policy is to not tell other people to rebase, especially people who are new to git. 

If you still want to rebase yourself, you may make things unnecessarily difficult for your reviewers, and you may shoot yourself in the foot, but no one can stop you. We do highly recommend that you don't, though, because you will make things harder for your reviewers, and you will shoot yourself in the foot. I personally have very little motivation to review most branches where every new commit comes in as a full rebase. And if I've warned you against rebasing a hundred times and you shoot yourself in the foot, you won't get my sympy, as it's said in some places on the internet. 

Also, if you're a habitual rebaser and you have push access, you had better have your origin remote set to read-only. 

Aaron Meurer


Regards,
Jo


--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.

Joachim Durchholz

unread,
Jul 16, 2015, 4:58:01 AM7/16/15
to sy...@googlegroups.com
Am 16.07.2015 um 05:32 schrieb Aaron Meurer:
> Also, if you're a habitual rebaser and you have push access, you had better
> have your origin remote set to read-only.

Very much agreed.
Maybe not just for habitual rebasers, merging with conflicts can go
wrong, too.

I hope to set up my own projects so that committers will have two
workdirs, one for day-to-day work preparing PRs and one for working on
the main branch.
The former can have its push access disabled; for the latter, I hope
there's a way to disable rebasing and merge conflict resolution.

Jason Moore

unread,
Jul 16, 2015, 11:36:00 AM7/16/15
to sy...@googlegroups.com
Jo,

Please stop derailing the conversation and answer our questions. In particular, answer Ondrej's question:


> So this should be our recommendation. Jason already agreed to the above. Yo and others, do you also agree with the above?

You are so focused on finding minor counter arguments to minor details that you are wasting all of our time and energy. The more you do this, the less we like you.

We are trying to solve a problem and you are bogging that down. Very little of what you have written, if anything at all, has helped us solve the problem at hand. I'm not sure you can even state the problem because you are so focused on defending the details of rebasing.

All I can gather is that you want us to encourage and promote rebasing which is the exact opposite of this proposed guideline. All of us believe that the overhead of rebasing is too high. We are willing to give up the minor benefits of a superbly clean git log to ensure that we have minimal overhead for contributions. Let me repeat, "you can still rebase your PR if your reviewers agree" and "if absolutely necessary you can submit a new 'clean' PR".

Can we move forward here and close this thread?

It is obvious that you want the last word and that you believe your personal needs and opinions trump that of the groups. This kind of behavior is not productive and is not gaining you any allies. You should seriously think deeply about this and think about changing your behavior.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.

Joachim Durchholz

unread,
Jul 16, 2015, 12:04:08 PM7/16/15
to sy...@googlegroups.com
> The more you do this, the less
> we like you.

Frankly, I do not care whether you like me. I care about code.
Also, I do not believe anybody gave you a mandate to tell me that they
don't like me anymore.

You accuse me of not helping moving stuff forward.
Ironically, you've been bogging this discussion down with personal
allegations about my stance, my motives, and my understanding,
detracting from factual issues. You complained I write too much, and I
tried to stick with two-sentence arguments; now you're simply ignoring
their factual content, and write a huge message that's 90% allegations
about my motives, my aims, and my understanding, i.e. the exact opposite
of what you've been demanding of me.

You're a hypocrite, sir.
Have a nice day.

Aaron Meurer

unread,
Jul 16, 2015, 12:22:11 PM7/16/15
to sy...@googlegroups.com
OK, guys, this thread has reached its end. If someone has a specific question or issue about rebasing on a specific pull request, let's discuss it there. 

If someone has any comments or concerns about the side discussion that Jason and Joachim are having please email me off list. 

Aaron Meurer

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.

Joachim Durchholz

unread,
Jul 16, 2015, 12:57:53 PM7/16/15
to sy...@googlegroups.com
Am 16.07.2015 um 18:21 schrieb Aaron Meurer:
> OK, guys, this thread has reached its end.

Fully agreed.
With 20/20 hindsight, I'd say it did many messages ago.

I'm not going to take 100% responsibility for this, but enough that I
want to offer my apologies.

Regards,
Jo

Jason Moore

unread,
Jul 16, 2015, 3:58:09 PM7/16/15
to sy...@googlegroups.com, Joachim Durchholz
Jo,

I've looked over the emails in this thread and I see that I said things several times that were too personal. I let my irritation and anger get the best of me. Those instances unfortunately tainted the core message I was trying to convey and hurt the situation instead of helping it. I apologize for anything that you feel was a personal attack. But I do stand by what I was trying to communicate wrt to merging vs rebasing, user contribution overhead, productive communication/decision making, and group needs over individual needs. I'm sorry I was ineffective at communicating that and calling you out instead of the issues at hand.

Sincerely,
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.

Aaron Meurer

unread,
Jul 16, 2015, 4:06:45 PM7/16/15
to sy...@googlegroups.com, Joachim Durchholz
Thank you for sending this Jason. It's important to not let personal frustrations spill over into personal attacks, or into what can be viewed as a personal attack. 

Joachim, thank you for your contributions to SymPy and for your input on this thread. You are of course free to do whatever you want, but I want you to know that your contributions to SymPy are always welcome. 

Aaron Meurer

Reply all
Reply to author
Forward
0 new messages