DOING GIT PULL WRONG


Too much fun with “git pull --rebase”

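Note: this article refers to “git pull -r” and “git pull --rebase” interchangeably. They are the same command, except that the merge-preserving variation can only be specified via the long form: git pull --rebase=preserve (newer Git releases replace this with --rebase=merges; the ‘preserve’ option was removed in Git 2.34).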


Introduction

I’ve long known that “git pull --rebase” reconciles the local branch correctly against upstream amends, rebases, and reorderings. The official “git rebase” documentation attests to this:

‘Note that any commits in HEAD which introduce the same textual changes as a commit in HEAD..<upstream> are omitted (i.e., a patch already accepted upstream with a different commit message or timestamp will be skipped).’

Thanks to the git patch-id command it’s easy to imagine how this mechanism might work. Take two commits, look at their patch-ids, and if they’re the same, drop the local one.
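To make that concrete, here is a minimal sketch of the patch-id comparison. It builds a throwaway repository in a temp directory, commits the same change twice with different messages, and compares the two patch-ids. The branch name ‘upstream’ and the demo identity are assumptions made up for illustration, and this only shows the comparison itself, not how “git rebase” implements it internally.

```shell
#!/bin/sh
# Sketch: two commits with different ids but the same patch-id.
# Repo location, branch name, and identity are illustrative assumptions.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

echo 'a' > a; git add .; git commit -q -m 'a'
base=$(git rev-parse HEAD)

# The "local" version of the change.
echo 'b' > b; git add .; git commit -q -m 'b'
local_commit=$(git rev-parse HEAD)

# The same textual change, committed again with a different message.
git checkout -q -b upstream "$base"
echo 'b' > b; git add .; git commit -q -m 'b (reworded upstream)'
upstream_commit=$(git rev-parse HEAD)

# patch-id hashes the diff only; message, author, and dates are ignored.
local_id=$(git show "$local_commit" | git patch-id --stable | cut -d' ' -f1)
upstream_id=$(git show "$upstream_commit" | git patch-id --stable | cut -d' ' -f1)

echo "commit ids: $local_commit vs $upstream_commit"
echo "patch ids:  $local_id vs $upstream_id"
```

The commit ids differ because the messages differ, while the patch-ids match because both commits add the same file with the same content.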

But what about squashes and other force-pushes where git patch-id won’t help? What does “git pull -r” do in those cases? I created a series of synthetic force-pushes to find out. I tried squashes, merge-squashes, dropped commits, merge-base adjustments, and all sorts of other force-push craziness.

I was unable to confuse “git pull --rebase,” no matter how hard I tried. It’s bulletproof, as far as I can tell.

Investigating ‘git pull --rebase’

The context here is not a master branch that’s advancing. The context is a feature branch that two people are working in parallel, where either person might force-push at any time. Something like this:

Git pull starting state

The starting state. Evangeline and Gabriel are working together on branch ‘feature’. Note: ‘evangeline/feature’ is actually Evangeline’s local ‘feature’ branch, and ‘gabriel/feature’ is Gabriel’s local ‘feature’ branch. Recreate it using the script below.

git init
echo 'a' > a; git add .; git commit -m 'a'
echo 'b' > b; git add .; git commit -m 'b'
echo 'c' > c; git add .; git commit -m 'c'
git checkout -b feature HEAD~1
echo 'd' > d; git add .; git commit -m 'd'
echo 'e' > e; git add .; git commit -m 'e'
echo 'f' > f; git add .; git commit -m 'f'
git checkout -b gabriel/feature
echo 'gf' > gf; git add .; git commit -m 'gf' --author='Gabriel <gabriel@mergebase.com>'
git checkout -b evangeline/feature HEAD~1
echo 'ef' > ef; git add .; git commit -m 'ef' --author='Evangeline <evangeline@mergebase.com>'
git push --mirror [url-to-an-empty-git-repo]

The Experiment

In each scenario, Evangeline rewrites the history of origin/feature with a force-push of some kind, usually incorporating her own ‘ef’ commit into her push. Meanwhile, Gabriel has already made his own ‘gf’ commit to his local feature branch. For each scenario, we want to see if Gabriel can use “git pull --rebase” to correctly reconcile his work (his ‘gf’ commit) against Evangeline’s most recent push.

Preconditions

  1. We assume Gabriel has correctly set up remote tracking for his local feature branch. This is a reasonable assumption since git sets this up by default when a user first types “git checkout feature”.
  2. We only tested Git v2.14.1 and Git v1.7.2 for this experiment. Perhaps “git pull --rebase” behaves differently in other versions.
  3. Important: we only use “git pull --rebase” (or -r). Some people claim “git fetch; git rebase origin/master” is equivalent to “git pull -r”, but it isn’t.
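The first precondition is easy to verify. The sketch below assumes a throwaway bare repository standing in for the central server (the paths, the ‘seed’ clone, and the demo identity are all hypothetical). It checks out ‘feature’ by name only and then asks Git which upstream the new local branch tracks.

```shell
#!/bin/sh
# Sketch: confirm that "git checkout feature" sets up remote tracking.
# Paths and identities are illustrative assumptions.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/origin.git"

# Seed the remote with a default branch and a 'feature' branch.
git clone -q "$work/origin.git" "$work/seed" 2>/dev/null
cd "$work/seed"
echo 'd' > d; git add .; git commit -q -m 'd'
git push -q origin HEAD HEAD:refs/heads/feature

# Gabriel clones, then checks out 'feature' without naming the remote.
git clone -q "$work/origin.git" "$work/gabriel" 2>/dev/null
cd "$work/gabriel"
git checkout -q feature

# Ask Git which upstream the local branch tracks.
upstream=$(git rev-parse --abbrev-ref 'feature@{upstream}')
echo "feature tracks: $upstream"
```

In a real clone, “git branch -vv” gives the same information for every local branch at once.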

Force-Push Scenarios

For each scenario, we are on Gabriel’s local branch feature. The graph on the left shows both the state of origin/feature (thanks to Evangeline’s force-push), as well as the state of Gabriel’s local feature and how it relates to Evangeline’s force-push.  The graph on the right shows the result of Gabriel typing “git pull -r.”

A scenario is deemed successful if “git pull -r” results in Gabriel’s ‘gf’ commit sitting on top of origin/feature. Since Gabriel does not push back in these scenarios, his ‘gf’ commit remains confined to his local feature branch.
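One way to express that success criterion as a check is sketched below. The function name gf_on_top is hypothetical, and the tiny repository built around it exists only so the sketch runs on its own. The check passes when exactly one local commit, with subject ‘gf’, sits directly on top of the upstream branch.

```shell
#!/bin/sh
# Sketch of the success check; repo layout and identities are assumptions.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Succeeds when HEAD is a single 'gf' commit whose parent is the upstream tip.
gf_on_top() {
  upstream=$1
  [ "$(git rev-list --count "$upstream"..HEAD)" = "1" ] &&
  [ "$(git rev-parse HEAD~1)" = "$(git rev-parse "$upstream")" ] &&
  [ "$(git log -1 --format=%s)" = "gf" ]
}

work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git clone -q "$work/origin.git" "$work/gabriel" 2>/dev/null
cd "$work/gabriel"

# Publish 'f', then add a local-only 'gf' on top of it.
echo 'f' > f; git add .; git commit -q -m 'f'
git push -q origin HEAD:refs/heads/feature
git fetch -q origin
echo 'gf' > gf; git add .; git commit -q -m 'gf'

if gf_on_top origin/feature; then result=success; else result=failure; fi
echo "Result: $result"
```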

  1. origin/feature rebased (against origin/master)
    This is the canonical example of why we prefer “git pull -r”. The rebase notices that older commits ‘d’, ‘e’, and ‘f’ on Gabriel’s feature branch are patch-identical to the rebased ones on origin/feature, and thus it only replays the final ‘gf’ commit.
Result: Success!

  2. origin/feature squash-merged (with origin/master)
    This is the rebase + squash combo meal. Evangeline takes all work on feature, squashes it down to a single commit, and rebases it on top of origin/master. She probably did this via “git merge --squash.” I did not expect “git pull -r” to be able to handle this, but I was wrong.
[Figures: squash merge; git pull after squash merge]
Result: Success!

  3. origin/feature squashed in-place
    This is the classic squash. Evangeline types “git rebase --interactive origin/master”. In the interactive screen she marks the first commit as “pick” and every other commit as “squash” or “fixup”. This squashes feature down to a single commit, but leaves the merge-base alone (commit ‘b’ in this case). I also did not expect “git pull -r” to handle this one, but I was wrong here, too.
Result: Success!

  4. origin/feature dropped a commit
    For some reason Evangeline decided she wanted to drop commit ‘e’ from origin/feature. She ran “git rebase --interactive origin/master” and marked every commit as “pick,” except commit ‘e’, which she marked with “drop”. I expected “git pull -r” to erroneously bring commit ‘e’ back. I was wrong. Running “git rebase” instead of “git pull -r” did bring commit ‘e’ back, and so there is obviously some deeper intelligence inside “git pull -r” enabling the correct behaviour here.
Result: Success!

  5. origin/feature lost their mind
    I have no idea what Evangeline was trying to do here. If you look closely, you’ll see she reversed her branch (‘ef’ is now the oldest commit), she squashed the middle two commits, and she adjusted the merge-base so that origin/feature emerges from commit ‘a’ on the mainline instead of commit ‘b’. This is one serious force-push! I had no idea what to expect here. I certainly did not expect “git pull -r” to nail it, but it did.
Result: Success!

  6. origin/feature went back to how things were (undoes the rewrite)
    Evangeline, either through her reflog or her photographic memory, happened to remember that origin/feature previously pointed to commit ‘325a76a’. Here she force-pushes origin/feature back to ‘325a76a’ to undo her push from scenario 5. The command to do that is useful to know: “git push --force origin 325a76a:refs/heads/feature”. Staring in awe at how “git pull -r” did the right thing for scenario 5, all I could do was continue to stare when it did the same here. (Note: Gabriel’s start-state here is scenario 5, not the original start-state).
Result: Success!
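The dropped-commit scenario is the easiest one to reproduce, and it also shows why “git fetch” followed by a plain “git rebase” is not the same thing as “git pull -r”. The sketch below is a simplified stand-in for the experiment above, not the original test setup. It uses a local bare repository as ‘origin’, skips the mainline entirely, and drops commit ‘e’ non-interactively with “git rebase --onto”. The directory names and the demo identity are assumptions.

```shell
#!/bin/sh
# Sketch: remote drops commit 'e'; compare "git pull --rebase" with
# "git fetch; git rebase origin/feature". Paths/identities are assumptions.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/origin.git"

# Evangeline publishes d, e, f on 'feature'.
git clone -q "$work/origin.git" "$work/evangeline" 2>/dev/null
cd "$work/evangeline"
for name in d e f; do
  echo "$name" > "$name"; git add .; git commit -q -m "$name"
done
git push -q origin HEAD:refs/heads/feature

# Two copies of Gabriel's clone, each with his local 'gf' commit.
for who in gabriel_pull gabriel_rebase; do
  git clone -q -b feature "$work/origin.git" "$work/$who"
  cd "$work/$who"
  echo 'gf' > gf; git add .; git commit -q -m 'gf'
done

# Evangeline drops 'e' (stand-in for marking it "drop" in rebase -i)
# by replaying only 'f' onto 'd', then force-pushes.
cd "$work/evangeline"
git rebase -q --onto HEAD~2 HEAD~1
git push -q --force origin HEAD:refs/heads/feature

# Gabriel, variant 1: git pull --rebase
cd "$work/gabriel_pull"
git pull -q --rebase
pull_log=$(git log --reverse --format=%s | paste -sd' ' -)

# Gabriel, variant 2: fetch, then rebase with an explicit upstream
cd "$work/gabriel_rebase"
git fetch -q
git rebase -q origin/feature
rebase_log=$(git log --reverse --format=%s | paste -sd' ' -)

echo "git pull --rebase:                    $pull_log"
echo "git fetch; git rebase origin/feature: $rebase_log"
```

If Git behaves as described in scenario 4, the first line reads “d f gf” and the second reads “d f e gf”. The plain rebase replays ‘e’ because it is no longer upstream. Git’s documentation attributes the difference to the history of the remote-tracking branch (its reflog), which “git pull --rebase” consults and which “git rebase” ignores by default when an upstream is named on the command line (see the --fork-point option).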

Conclusion: Time To Revise The Golden Rule

Supposedly, the golden rule of git is to never force-push public branches.

Of course I would never force-push against ‘master’ and ‘release/*’.  As a git admin, that’s always the first config I set for a new repo:  disallow all rewrites for ‘master’ and ‘release/*’.
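Hosted servers such as Bitbucket expose this as a branch-permission setting. On a bare Git server the same policy can be approximated with a server-side hook. The following is a hypothetical ‘update’ hook, offered as a sketch rather than a hardened policy: the function names are made up for illustration, and it assumes the protected branches are exactly ‘master’ and ‘release/*’.

```shell
#!/bin/sh
# Hypothetical server-side 'update' hook: a sketch, not a hardened policy.
# Git invokes hooks/update with: <refname> <old-sha> <new-sha>

# Branches whose history must never be rewritten.
is_protected() {
  case "$1" in
    refs/heads/master|refs/heads/release/*) return 0 ;;
    *) return 1 ;;
  esac
}

# An all-zero object id means the ref is being created or deleted.
is_zero() {
  case "$1" in
    *[!0]*) return 1 ;;
    *) return 0 ;;
  esac
}

allow_update() {
  refname=$1 old=$2 new=$3
  is_protected "$refname" || return 0   # feature branches: rewrites allowed
  is_zero "$old" && return 0            # creating a protected branch is fine
  is_zero "$new" && return 1            # deleting one is not
  # Fast-forward only: the old tip must be an ancestor of the new tip.
  git merge-base --is-ancestor "$old" "$new"
}

# Only act when called the way Git calls an update hook.
if [ "$#" -eq 3 ]; then
  if ! allow_update "$1" "$2" "$3"; then
    echo "Rewrites of $1 are not allowed" >&2
    exit 1
  fi
fi
```

Installed as hooks/update in the central repository, a script along these lines rejects force-pushes and deletions on the mainlines while leaving feature branches free to be rewritten.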

But all public branches?  I find force-pushing feature branches incredibly useful.

Industry has arrived at a compromise: defer the rewrite to the final merge. Bitbucket, GitLab, and GitHub now offer “rebase” and “squash” flavours of PR merge. But it’s a silly compromise, because the golden rule itself is silly. Instead of building complex merge machinery to dance around the golden rule, I think we’d be better served by reworking the rule itself. Three reasons:

  1. Force-pushes are useful! Public amends, squashes, and rebases help us make better PRs for code review.
  2. What is the actual point of the golden rule?  Are we trying to prevent lost work on the mainlines (e.g., ‘master’ and ‘release/*’)?  If that’s the point, then we’re much better off setting appropriate branch permissions on our central git server for those branches.
  3. Is the point to prevent the spaghetti graphs caused by default “git pull” behaviour? In that case, a better golden rule would be never use default “git pull” and always use “git pull --rebase”, since it avoids spaghetti graphs while allowing history rewrites.

I propose a new golden git rule (in haiku form):

We never force-push master
or release. But always,
for all branches: git pull -r

Alternatively, you can make “git pull -r” the default behaviour:

git config --global pull.rebase true

Git graphs in this article were generated using the Bit-Booster – Rebase Squash Amend plugin for Bitbucket Server.


Julius Musseau

About the Author

Julius Musseau

Co-founder & Advisor. Senior architect and developer with strong academic background and roots in the open source community. Contributor to a number of important open source projects.