Cleaning up a messy branch

Saturday 21 September 2024

Let’s say you have a long-lived git branch. Most of the changes should be merged back to main, but some of the changes were already cherry-picked from main, and some of the changes shouldn’t be put onto main at all. How do you review the branch and merge it?

Here’s a diagram of a simple example. The main branch at the top has seven commits. Beneath that is our work branch with three commits, of the three different kinds: W is important work we need to end up on main, M is a commit we cherry-picked from main, and X is a temporary tweak that we don’t want to end up on main:

main:  A B M C D E F
work:  W M X

If we make a pull request from our work branch, GitHub will show a diff that includes all three commits W, M, and X. It was a surprise to me that M was included: it’s not a change that will happen if we merge the work branch, because M is already on main. GitHub doesn’t show you a diff between your branch and main. It shows the diff since your branch diverged from main, so it includes every commit on your branch. This makes it hard to assess what a merge will do if the branch has cherry-picked commits.
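To see the difference between those two kinds of diff, here is a rough sketch that builds a throwaway repo shaped like the example and compares them. The one-file-per-commit layout, the `commit` helper, and the choice to branch work off after B are all assumptions made for illustration; the post doesn't say where the branch diverged.

```shell
#!/bin/sh
# Sketch only: a throwaway repo shaped like the example above.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

# Hypothetical helper: one commit per letter, each adding its own file.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A; commit B
git branch work                     # assumption: work diverges after B
commit M; m_main=$(git rev-parse HEAD)
commit C; commit D; commit E; commit F

git checkout -q work
commit W
git cherry-pick "$m_main" >/dev/null
commit X

# Three dots: what changed on work since it diverged from main.
three_dot=$(git diff --name-only main...work)
# Two revisions: the tip of main compared directly with the tip of work.
two_dot=$(git diff --name-only main work)

echo "since divergence:"; echo "$three_dot"
echo "tip to tip:";       echo "$two_dot"
```

The three-dot diff lists M.txt along with W.txt and X.txt, which matches what the pull request shows. The tip-to-tip diff leaves M.txt out, but it also lists C through F, because it sees everything main has that work lacks. Neither one is a preview of what the merge will do.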

And of course the pull request diff includes X, since that would be a change to main if we merge the work branch. But we don’t want X in the merge, and we don’t want to be distracted by M when reviewing the pull request. What should we do?

The answer is to use the “git revert” command to add commits to the branch that undo M and undo X. We show those as –M and –X:

main:  A B M C D E F
work:  W M X –M –X
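As a sketch of what those reverts look like in practice, the script below rebuilds the same kind of throwaway repo and reverts the two commits by hash. The repo layout, the `commit` helper, and the variable names are illustrative assumptions. Note that the revert targets the cherry-picked copy of M on the work branch, not the original M on main.

```shell
#!/bin/sh
# Sketch only: revert the cherry-picked M and the temporary X on work.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

# Hypothetical helper: one commit per letter, each adding its own file.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A; commit B
git branch work                     # assumption: work diverges after B
commit M; m_main=$(git rev-parse HEAD)
commit C; commit D; commit E; commit F

git checkout -q work
commit W
git cherry-pick "$m_main" >/dev/null; m_work=$(git rev-parse HEAD)
commit X;                             x_work=$(git rev-parse HEAD)

# Add the –M and –X commits to the work branch.
git revert --no-edit "$m_work" >/dev/null
git revert --no-edit "$x_work" >/dev/null

# The commits on work that are not on main, newest first.
git log --format=%s main..work

# The same kind of diff a pull request would show.
pr_diff=$(git diff --name-only main...work)
echo "$pr_diff"
```

In a real branch you would look up the hashes with git log rather than capturing them in variables as this sketch does.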

Now the diff will show only W, great! The –X commit is perfect, it will prevent X from merging to main. But what about –M? What will happen when we merge that? I was concerned that it would undo the M commit on main. But it doesn’t.

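A git merge is a three-way merge: it compares the snapshot at the tip of each branch against their common ancestor and combines the changes. In this case, the changes from M are on the main branch, and after the revert no trace of them is left on the work branch, so relative to the common ancestor the work branch hasn’t touched M at all. M is fine, and remains on main after the merge. The merge does just what we want. It brings the W changes onto main, and I’ve named it mW to indicate that: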

main:  A B M C D E F mW
work:  W M X –M –X
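If you want to convince yourself of this, here is a small experiment under the same assumptions as before (a throwaway repo, one file per commit, a hypothetical `commit` helper, work branching off after B). It adds the reverts, merges work into main, and lists what ends up on main.

```shell
#!/bin/sh
# Sketch only: merge the reverted work branch and check what lands on main.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

# Hypothetical helper: one commit per letter, each adding its own file.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A; commit B
git branch work                     # assumption: work diverges after B
commit M; m_main=$(git rev-parse HEAD)
commit C; commit D; commit E; commit F

git checkout -q work
commit W
git cherry-pick "$m_main" >/dev/null; m_work=$(git rev-parse HEAD)
commit X;                             x_work=$(git rev-parse HEAD)
git revert --no-edit "$m_work" >/dev/null
git revert --no-edit "$x_work" >/dev/null

# Merge work into main; the merge commit is labeled mW as in the diagram.
git checkout -q main
git merge --no-ff -m "mW" work >/dev/null

# Files tracked on main after the merge.
merged=$(git ls-files)
echo "$merged"
```

M.txt is still there after the merge, W.txt has arrived, and X.txt never shows up.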

Some other points here:

  • Why not just merge the branch after the W commit? This is a simplified example for illustration. The real branch that sent me down this path has dozens of commits intermixed.
  • GitHub has three different ways to finish a pull request (merge, squash, rebase). This technique of using reverts to hide cherry-picked changes and avoid unwanted changes applies to all of them.
  • Although our merge only adds the W changes to main, the history will show the complete work branch, including our revert commits. If you wanted it a little cleaner, you could leave out the –M reverts before merging. The result will be the same with or without them.
  • If you want you can also make a new branch for the revert commits to keep the work branch pristine:

      main:        A B M C D E F W
      work:        W M X
      new branch:  –M –X
  • Finally, the way to get the cleanest history is to create a new branch and rebase the commits we want before merging. This could be a lot of work, and some people will object to misrepresenting the actual history of commits. Git gives you plenty of tools to do it as you prefer.
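
For the last point, here is one way the clean-branch approach might look. This sketch uses git cherry-pick as a stand-in for the rebase mentioned above, since it is easier to script; an interactive rebase that drops M and X should give the same result. The branch name `clean`, the `commit` helper, and the repo layout are all hypothetical.

```shell
#!/bin/sh
# Sketch only: build a clean branch holding just the commit we want.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

# Hypothetical helper: one commit per letter, each adding its own file.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A; commit B
git branch work                     # assumption: work diverges after B
commit M; m_main=$(git rev-parse HEAD)
commit C; commit D; commit E; commit F

git checkout -q work
commit W; w_work=$(git rev-parse HEAD)
git cherry-pick "$m_main" >/dev/null
commit X

# Start a new branch from main and copy over only W.
git checkout -q -b clean main
git cherry-pick "$w_work" >/dev/null

# Commits on clean that are not on main.
clean_log=$(git log --format=%s main..clean)
echo "$clean_log"
```

With dozens of intermixed commits you would cherry-pick each wanted commit in order, and conflicts are possible wherever a wanted commit depends on a skipped one.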
