Too much fun with “git pull --rebase”

Note: this article refers to “git pull -r” and “git pull --rebase” interchangeably. They are the same command, except the merge-preserving variation can only be specified via the long form: git pull --rebase=preserve

Introduction

I’ve long known that “git pull --rebase” reconciles the local branch correctly against upstream amends, rebases, and reorderings. The official “git rebase” documentation attests to this:

‘Note that any commits in HEAD which introduce the same textual changes as a commit in HEAD..<upstream> are omitted (i.e., a patch already accepted upstream with a different commit message or timestamp will be skipped).’

Thanks to the git patch-id command it’s easy to imagine how this mechanism might work. Take two commits, look at their patch-ids, and if they’re the same, drop the local one.
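
To make the patch-id idea concrete, here is a minimal sketch in a throwaway repository. The branch name ‘topic’, the file name, and the commit messages are made up for illustration. It only shows that two commits with the same diff but different messages share a patch-id; it is not a claim about how the rebase is implemented internally.

```shell
#!/bin/sh
# Sketch: two commits that introduce the same textual change get the same
# patch-id, even though their commit ids and messages differ.
# Everything here (repo location, branch 'topic', file names) is illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > file; git add file; git commit -q -m 'base'
start=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# The same edit, committed twice with different messages.
git checkout -q -b topic
echo change >> file; git commit -q -am 'local wording of the change'
git checkout -q "$start"
echo change >> file; git commit -q -am 'upstream wording of the change'

# "git patch-id" reads a patch on stdin and prints "<patch-id> <commit-id>".
id_local=$(git show topic | git patch-id | cut -d' ' -f1)
id_upstream=$(git show "$start" | git patch-id | cut -d' ' -f1)

echo "commit ids: $(git rev-parse --short topic) vs $(git rev-parse --short "$start")"
echo "patch ids:  $id_local vs $id_upstream"
```

The two commit ids differ while the two patch ids match, which is the property the documentation quote above relies on.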

But what about squashes and other force-pushes where git patch-id won’t help? What does “git pull -r” do in those cases? I created a series of synthetic force-pushes to find out. I tried squashes, merge-squashes, dropped commits, merge-base adjustments, and all sorts of other force-push craziness.

I was unable to confuse “git pull --rebase,” no matter how hard I tried. It’s bulletproof, as far as I can tell.

Investigating ‘git pull --rebase’

The context here is not a master branch that’s advancing. The context is a feature branch that two people are working on in parallel, where either person might force-push at any time. Something like this:

The starting state. Evangeline and Gabriel are working together on branch ‘feature’. Note: ‘evangeline/feature’ is actually Evangeline’s local ‘feature’ branch, and ‘gabriel/feature’ is Gabriel’s local ‘feature’ branch. Clone it from here, or recreate it using the script below.
git init
echo 'a' > a; git add .; git commit -m 'a'
echo 'b' > b; git add .; git commit -m 'b'
echo 'c' > c; git add .; git commit -m 'c'
git checkout -b feature HEAD~1
echo 'd' > d; git add .; git commit -m 'd'
echo 'e' > e; git add .; git commit -m 'e'
echo 'f' > f; git add .; git commit -m 'f'
git checkout -b gabriel/feature
echo 'gf' > gf; git add .; git commit -m 'gf' --author='Gabriel Lajeunesse <gabriel@mergebase.com>'
git checkout -b evangeline/feature HEAD~1
echo 'ef' > ef; git add .; git commit -m 'ef' --author='Evangeline Bellefontaine <evangeline@mergebase.com>'

git push --mirror [url-to-an-empty-git-repo]

The Experiment

In each scenario Evangeline rewrites the history of origin/feature with a force-push of some kind, usually incorporating her own ‘ef’ commit into her push. Meanwhile, Gabriel has already made his own ‘gf’ commit to his local feature branch. For each scenario we want to see if Gabriel can use “git pull --rebase” to correctly reconcile his own work (his ‘gf’ commit) against Evangeline’s most recent push.

Preconditions

  1. We assume Gabriel has correctly set up remote tracking for his local feature branch. This is a reasonable assumption, since git sets this up by default when a user first types “git checkout feature” (provided exactly one remote has a branch named ‘feature’ and no local branch of that name exists yet).
  2. We only tested Git v2.14.1 and Git v1.7.2 for this experiment. Perhaps “git pull --rebase” behaves differently in other versions.
  3. Important: we only use “git pull --rebase” (or -r).  Some people claim “git fetch; git rebase origin/master” is equivalent to “git pull -r”, but it isn’t.
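
The first precondition is easy to check. The sketch below builds a throwaway origin, lets “git checkout feature” create the local branch, and then asks git which upstream it recorded. The directory names and the ‘scratch’ branch are assumptions made for the demo only.

```shell
#!/bin/sh
# Sketch: confirm that "git checkout feature" set up remote tracking.
# The repository layout below is a throwaway stand-in for a real origin.
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git

git init -q gabriel
cd gabriel
git config user.email "gabriel@example.com"
git config user.name "Gabriel"
# Name the initial branch explicitly so the demo does not depend on
# the init.defaultBranch setting.
git symbolic-ref HEAD refs/heads/scratch
echo d > d; git add d; git commit -q -m 'd'

# Publish the commit as origin's 'feature' branch, then fetch so that the
# remote-tracking ref origin/feature is known locally.
git remote add origin ../origin.git
git push -q origin HEAD:refs/heads/feature
git fetch -q origin

# No local 'feature' exists yet and exactly one remote has that name, so git
# creates the branch and configures origin/feature as its upstream.
git checkout -q feature

upstream=$(git rev-parse --abbrev-ref --symbolic-full-name '@{upstream}')
echo "feature tracks: $upstream"
git --version
```

If the rev-parse line fails with a “no upstream configured” error in a real clone, “git branch --set-upstream-to=origin/feature feature” repairs the tracking.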

Force-Push Scenarios

For each scenario we are on Gabriel’s local branch feature. The graph on the left shows both the state of origin/feature (thanks to Evangeline’s force-push), as well as the state of Gabriel’s local feature and how it relates to Evangeline’s force-push.  The graph on the right shows the result of Gabriel typing “git pull -r”.

A scenario is deemed successful if “git pull -r” results in Gabriel’s ‘gf’ commit sitting on top of origin/feature.  Since Gabriel does not push back in these scenarios, his ‘gf’ commit remains confined to his local feature branch.

  1. origin/feature rebased (against origin/master)
    This is the canonical example of why we prefer “git pull -r”.  The rebase notices that older commits ‘d’, ‘e’, and ‘f’ on Gabriel’s feature branch are patch-identical to the rebased ones on origin/feature, and thus it only replays the final ‘gf’ commit.

    [Figure: commit graphs before (left) and after (right) “git pull -r”]

    Result: Success!
  2. origin/feature squash-merged (with origin/master)
    This is the rebase + squash combo meal.  Evangeline takes all work on feature, squashes it down to a single commit, and rebases it on top of origin/master.  She probably did this via “git merge --squash.” I did not expect “git pull -r” to be able to handle this, but I was wrong.

    [Figure: commit graphs before (left) and after (right) “git pull -r”]

    Result: Success!
  3. origin/feature squashed in-place
    This is the classic squash. Evangeline types “git rebase --interactive origin/master”.  In the interactive screen she marks the first commit as “pick” and every other commit as “squash” or “fixup”.  This squashes feature down to a single commit, but leaves the merge-base alone (commit ‘b’ in this case). I also did not expect “git pull -r” to handle this one, but I was wrong here, too.

    [Figure: commit graphs before (left) and after (right) “git pull -r”]

    Result: Success!
  4. origin/feature dropped a commit
    For some reason Evangeline decided she wanted to drop commit ‘e’ from origin/feature. She ran “git rebase --interactive origin/master” and marked every commit as “pick,” except commit ‘e’, which she marked with “drop”.  I expected “git pull -r” to erroneously bring commit ‘e’ back.  I was wrong.  Running “git rebase” instead of “git pull -r” did bring commit ‘e’ back, and so there is obviously some deeper intelligence inside “git pull -r” enabling the correct behaviour here.

    [Figure: commit graphs before (left) and after (right) “git pull -r”]

    Result: Success!
  5. origin/feature lost their mind
    I have no idea what Evangeline was trying to do here.  If you look closely, you’ll see she reversed her branch (‘ef’ is now the oldest commit), she squashed the middle two commits, and she adjusted the merge-base so that origin/feature emerges from commit ‘a’ on the mainline instead of commit ‘b’.  This is one serious force-push!  I had no idea what to expect here.  I certainly did not expect “git pull -r” to nail it, but it did.

    [Figure: commit graphs before (left) and after (right) “git pull -r”]

    Result: Success!
  6. origin/feature went back to how things were (undoes the rewrite)
    Evangeline, either through her reflog or her photographic memory, happened to remember that origin/feature previously pointed to commit ‘325a76a’.  Here she force-pushes origin/feature back to ‘325a76a’ to undo her push from scenario 5. The command to do that is useful to know:  “git push --force origin 325a76a:refs/heads/feature”. Having stared in awe at how “git pull -r” did the right thing for scenario 5, all I could do was continue to stare when it did the same here. (Note: Gabriel’s start-state here is scenario 5, not the original start-state.)

    [Figure: commit graphs before (left) and after (right) “git pull -r”]

    Result: Success!
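
The article does not name the “deeper intelligence” seen in scenario 4. One plausible candidate is git’s documented fork-point logic: “git pull --rebase” consults the reflog of the remote-tracking branch to work out which commits were already upstream before the rewrite, whereas “git rebase <upstream>” with an explicit argument does not use the fork point by default. The sketch below replays scenario 4 in throwaway repositories under that assumption. The directory names and the ‘plain-rebase’ branch are invented for the demo, and the non-interactive “git rebase --onto” stands in for Evangeline’s interactive “drop”.

```shell
#!/bin/sh
# Sketch of scenario 4 (dropped commit) in throwaway repositories.
# Assumption: the difference between the two variants comes from git's
# fork-point logic, which reads the reflog of origin/feature.
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git

# Evangeline publishes d, e, f on 'feature'.
git init -q evangeline
cd evangeline
git config user.email "evangeline@example.com"
git config user.name "Evangeline"
git symbolic-ref HEAD refs/heads/feature
for c in d e f; do echo "$c" > "$c"; git add "$c"; git commit -q -m "$c"; done
git remote add origin ../origin.git
git push -q -u origin feature

# Gabriel fetches, tracks origin/feature, and adds his own commit gf.
cd "$work"
git init -q gabriel
cd gabriel
git config user.email "gabriel@example.com"
git config user.name "Gabriel"
git remote add origin ../origin.git
git fetch -q origin
git checkout -q -b feature --track origin/feature
echo gf > gf; git add gf; git commit -q -m 'gf'
git branch plain-rebase        # untouched copy for the comparison below

# Evangeline drops commit e (replaying only f onto d) and force-pushes.
cd "$work/evangeline"
git rebase -q --onto feature~2 feature~1 feature
git push -q --force origin feature

# Variant A: git pull -r on the tracking branch.
cd "$work/gabriel"
git pull -q -r
if [ -e e ]; then pull_result="e came back"; else pull_result="e stayed dropped"; fi

# Variant B: rebase the copy with an explicit upstream argument.
git checkout -q plain-rebase
git rebase -q origin/feature
if [ -e e ]; then rebase_result="e came back"; else rebase_result="e stayed dropped"; fi

echo "git pull -r:               $pull_result"
echo "git rebase origin/feature: $rebase_result"
```

To inspect the fork point directly in a real clone, “git merge-base --fork-point origin/feature feature” prints the commit that git considers the last shared upstream state, as long as the reflog for origin/feature still covers it.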

Caveats

Dropped Merges: Since “git pull -r” is a rebase, it drops all local merges during reconciliation.  This is usually what you want:  why keep a bunch of pointless sync-merges around? They just add noise and no value to the commit graph. But sometimes you do want to keep a merge. When you do, you can try git pull’s merge-preserving variation: git pull --rebase=preserve
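
Here is a small sketch of the difference between dropping and keeping a local merge. One caveat about versions: Git 2.18 and later offer “--rebase=merges” as the successor to “--rebase=preserve”, and to my knowledge the “preserve” mode was removed in Git 2.34, so the sketch swaps in “--rebase=merges”. The branch names ‘side’ and ‘flattened’ and the way the teammate’s push is simulated are assumptions for the demo.

```shell
#!/bin/sh
# Sketch: a local merge commit is flattened by "git pull --rebase" but kept by
# "git pull --rebase=merges" (swapped in here for --rebase=preserve; needs a
# Git version that supports the "merges" mode). All names are illustrative.
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git

git init -q gabriel
cd gabriel
git config user.email "gabriel@example.com"
git config user.name "Gabriel"
git symbolic-ref HEAD refs/heads/feature
echo a > a; git add a; git commit -q -m 'a'
git remote add origin ../origin.git
git push -q -u origin feature

# Local work that contains a deliberate merge commit.
git checkout -q -b side
echo s > s; git add s; git commit -q -m 's'
git checkout -q feature
echo l > l; git add l; git commit -q -m 'l'
git merge -q --no-ff -m 'merge side into feature' side

# Simulate a teammate's push by committing on a detached HEAD.
git checkout -q --detach origin/feature
echo u > u; git add u; git commit -q -m 'u'
git push -q origin HEAD:refs/heads/feature

# Variant A: plain rebase on a copy of the branch.
git checkout -q -b flattened feature
git pull -q --rebase origin feature
flat_merges=$(git rev-list --merges --count origin/feature..HEAD)

# Variant B: merge-preserving rebase on the real branch.
git checkout -q feature
git pull -q --rebase=merges
kept_merges=$(git rev-list --merges --count origin/feature..HEAD)

echo "merge commits after --rebase:        $flat_merges"
echo "merge commits after --rebase=merges: $kept_merges"
```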

Stash and Stash-Pop:  The “git rebase” command refuses to run if your worktree is dirty, whereas default “git pull” will proceed as long as incoming changes have no conflicts with unstaged edits.   This means “git pull” will run in many situations where “git pull -r” will refuse because of the dirty worktree.  The solution is to stash and then stash-pop, either manually, or via the “--autostash” flag:  git pull -r --autostash
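
A minimal sketch of that behaviour, again in a throwaway repository with made-up names. It assumes a Git version whose “git pull” accepts “--autostash”, and it sets rebase.autoStash to false locally so that a global setting cannot mask the refusal.

```shell
#!/bin/sh
# Sketch: "git pull -r" refuses a dirty worktree; "--autostash" stashes the
# edit, rebases, and re-applies it. Repo layout and file names are illustrative.
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git

git init -q gabriel
cd gabriel
git config user.email "gabriel@example.com"
git config user.name "Gabriel"
git config rebase.autoStash false   # keep a global setting from hiding the refusal
git symbolic-ref HEAD refs/heads/feature
echo d > d; git add d; git commit -q -m 'd'
git remote add origin ../origin.git
git push -q -u origin feature

# Simulate a teammate's push (commit ef) from a detached HEAD...
git checkout -q --detach
echo ef > ef; git add ef; git commit -q -m 'ef'
git push -q origin HEAD:refs/heads/feature

# ...then return to feature and add a local commit plus an unstaged edit.
git checkout -q feature
echo gf > gf; git add gf; git commit -q -m 'gf'
echo 'work in progress' >> d

# Without --autostash the pull stops before doing anything.
if git pull -q -r 2>/dev/null; then plain="ran"; else plain="refused"; fi

# With --autostash the edit is set aside, gf is rebased, and the edit returns.
git pull -q -r --autostash

echo "plain pull -r: $plain"
git status --short
```

To make this the default for every rebase, “git config --global rebase.autoStash true” is the corresponding setting.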

Conflicts, a.k.a. Rebase Hell:  Rebase hell happens when several commits on your branch edit the same area, and upstream also touched the same area.  The problem occurs because each conflict resolution will itself conflict with the subsequent commit in the series.  And since the conflict markers tend to touch the same areas again and again, it feels like an infinite hall of mirrors, and makes you lose your mind.

If you use “git pull -r”, you will eventually experience rebase hell. Here are some tips for surviving it:

  • Single-commit branches are immune from rebase hell.  Rebasing a branch with only a single commit can trigger at most a single conflict resolution.
  • If your branch has many commits, and you find yourself in rebase hell, try aborting, squashing, and then rebasing.   Squashing is always a viable way out of rebase hell.
  • Personally, I’ve never investigated the “git rerere” command, but it’s another tool available to help with rebase hell.
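
The “abort, squash, then rebase” tip can be sketched as follows. This is one way to squash without an interactive rebase, not the only one. The branch ‘upstream-sim’ is a hypothetical stand-in for the remote-tracking branch, so that the example runs without a remote.

```shell
#!/bin/sh
# Sketch of the "abort, squash, then rebase" escape hatch.
# 'upstream-sim' is a hypothetical stand-in for the remote-tracking branch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
echo base > file; git add file; git commit -q -m 'base'
git branch upstream-sim

# Three local commits that all edit the same line.
for n in 1 2 3; do
    echo "edit $n" > file
    git commit -q -am "edit $n"
done
tree_before=$(git rev-parse 'HEAD^{tree}')

# If a rebase is already in progress, "git rebase --abort" comes first.
# Squash: move the branch pointer back to the fork point while keeping the
# final content staged, then record it as one commit.
fork=$(git merge-base HEAD upstream-sim)
git reset -q --soft "$fork"
git commit -q -m 'edits 1-3, squashed'

tree_after=$(git rev-parse 'HEAD^{tree}')
commits_ahead=$(git rev-list --count upstream-sim..HEAD)
echo "commits ahead of upstream-sim: $commits_ahead"
```

The tree is unchanged by the squash, and with a single commit left the next “git pull -r” can stop for conflict resolution at most once.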

Conclusion: Time To Revise The Golden Rule

Supposedly, the golden rule of git is to never force-push public branches.

Of course I would never force-push against ‘master’ and ‘release/*’.  As a git admin, that’s always the first config I set for a new repo:  disallow all rewrites for ‘master’ and ‘release/*’.

But all public branches?  I find force-pushing feature branches incredibly useful.

Industry has arrived at a compromise: defer the rewrite to the final merge. Bitbucket, GitLab, and GitHub now offer “rebase” and “squash” flavours of PR merge. But it’s a silly compromise, because the golden rule itself is silly.  Instead of building complex merge machinery to dance around the golden rule, I think we’d be better served by reworking the rule itself. Three reasons:

  1. Force-pushes are useful!  Public amends, squashes, and rebases help us make better PRs for code review.
  2. What is the actual point of the golden rule?  Are we trying to prevent lost work on the mainlines (e.g., ‘master’ and ‘release/*’)?  If that’s the point, then we’re much better off setting appropriate branch permissions on our central git server for those branches.
  3. Is the point to prevent the spaghetti graphs caused by default “git pull” behaviour?  In that case a better golden rule would be never use default “git pull” and always use “git pull --rebase”, since it avoids spaghetti graphs, while allowing history rewrites.

I propose a new golden git rule (in haiku form):

We never force-push master
or release. But always,
for all branches: git pull -r

 

Alternatively, you can make “git pull -r” the default behaviour:

git config --global pull.rebase true

 

Git graphs in this article were generated using the Bit-Booster – Rebase Squash Amend plugin for Bitbucket Server.

8 Replies to “Too much fun with “git pull --rebase””

  1. Great post!

    Today I made an observation that makes me wonder whether pull.rebase=true is really what I want as a default, though. I’m recapitulating off the top of my head, so I might miss some important details. But anyway…

    I regularly merge the latest release branch into master so that bugfixes arrive there, too. So basically just a “git checkout master && git merge release_x_y && git push”. But in the time it took me to solve some merge conflicts, someone else pushed a new commit on the master branch so that my push was rejected.

    So, I just ran “git pull” (--rebase implied because I use the pull.rebase=true setting) which removed my merge commit, so in master’s log it seemed like the merged commits from the release_x_y branch were just added on top. That was fine with me, so I pushed and this time it was not rejected.

    However, the next time I merged the release_x_y branch into master, I merged in the very same commits (plus the newer ones) from release_x_y as in the previous merge followed by the “git pull --rebase”, i.e., those commits are now reachable from master twice!

    Well, they are probably not the “very same” commits but they have the same messages. It looked like the “git pull --rebase” after my previous merge somehow changed the semantics from a merge to something similar to “cherry-pick all commits from release_x_y which are not on master”.

    Too bad I can’t check the repository before Monday to investigate a bit more. But maybe you have a clue. At least that made me think about changing my default from pull.rebase=true to pull.rebase=preserve (and what is pull.rebase=merges???)…

    1. git pull -r is my default when working with topic branches (short lived branches).

      But I would never use it for merge-backs (a.k.a. merges to bring fixes on release branches back into master. Also includes merges to bring fixes on older release branches into newer release branches).

      I consider merge-backs from a release branch into master to be a best-practice (to prevent regressions), and I think it’s ideal to use proper merge commits for these and never rebase the release branch. I never rebase long-lived branches.

      The scenario you described is probably the one major headache one can potentially encounter when running a merge-back: conflict resolution merging back into a busy master. Fortunately in my experience master is usually not that busy, and conflict resolution is pretty rare (usually merges are clean for me).

      Two suggestions:

      1. Only ever do the merge-back during a quiet time (e.g., 7pm or later) to minimize the risk of origin/master getting ahead of you!

      2. But I think you are correct and “git pull --rebase=preserve” would do exactly the right thing here AFTER the initial failed push of the conflict-resolved merge. (Don’t do “git pull --rebase=preserve” at first… you need to complete that merge and get that merge commit into your history beforehand!).

      p.s. Thanks for your note! I never thought of “git pull -r” when doing a merge-back. Sadly, you’ve broken my haiku. Grrrrr!

      p.p.s. Sounds like your release branch was relatively small (just a couple commits off of master in the first place). Yes, the “git pull -r” went and cherry-picked them all over to master. You’re correct. And then the subsequent merge-back joined the cherry-picked history to the original history. It’s harmless as far as the code is concerned, but it’s very confusing when doing any git history work. Sadly without having an admin go in to permanently rewrite master to remove the cherry-picks (and redo the merge-back), there’s no good way to clean this up. I recommend just moving on. One day I might write a blog post about this exact scenario.

      1. Hey Julius,

        thanks for the detailed answer!

        I wonder if there’s a “git pull --do-what-i-mean” switch. Maybe --rebase=preserve already is it? Does it behave equivalent to --rebase in every case except for the situation after a merge? Or maybe an alias which checks if HEAD is a merge commit and then does git pull --rebase=preserve and just --rebase otherwise?

        And I still don’t get what --rebase=merges actually does compared to --rebase and --rebase=preserve…

        1. That’s a good idea except one of my favourite things about “git pull -r” is that it drops merges! (And it will never drop merges if “git pull --rebase=preserve” is what you always do).

          However, if you’ve set up config to make sure the default “git pull” behaviour never happens, then perhaps “git pull --rebase=preserve” is the answer.

          I think “merges” and “preserve” options are essentially the same thing. From my reading of the docs, looks like “merges” is pretty new and uses a completely rewritten merge-replay mechanism under the hood, and for now both mechanisms are available, and both have the same end goal.

          1. From which actions would I get merge commits which were good to be removed by git pull -r?

            I mean, except when I explicitly merge between long-lived branches, I never produce merge commits in my daily workflow. And I assume that --rebase=preserve doesn’t create merge commits itself but just, well, preserves or recreates already existing merge commits.

          2. Default “git pull” is the only action I can think of that does this. More of an issue when two or more people work together on the same topic branch, and the other person sometimes uses default “git pull”. Or if you sometimes work from computers that are not configured the same as your main workstation and so your normal “git pull” defaults (e.g., “rebase=preserve”) aren’t present.

            git pull --rebase=preserve (or “merges”) should never create a *new* merge commit!

  2. I agree on the wrong-config-on-foreign-workstation argument but I don’t get the more-people-on-topic-branch argument.

    Let’s assume you and me were working on a topic branch and you sometimes “git pull” without --rebase whereas I always --rebase. When I pull --rebase, it will only rebase my own unpushed commits (none of which are merge commits) on top of yours (possibly including merge commits). But it cannot remove the merge commits you’ve introduced. Or do I get something wrong?

  3. After reading your great Git V post it occurred to me that “git merge” on master (or whatever) after a rejected push of a “git merge other-branch” would create a foxtrot merge because my master is first-parent whereas the already published origin/master is second-parent, right? I wonder how “git pull --rebase=merges” or “git pull --rebase=preserve” handle that case.

    And can I somehow omit the foxtrot merge without rebase?

    git checkout origin/master
    git merge master
    git push origin HEAD:master

    Is that the right way?
