I had to clean up a messy git branch. Revert commits helped.

Let’s say you have a long-lived git branch. Most of the changes should be merged back to main, but some of the changes were already cherry-picked from main, and some of the changes shouldn’t be put onto main at all. How do you review the branch and merge it?

Here’s a diagram of a simple example. The main branch at the top has seven commits. Beneath that is our work branch with three commits, of the three different kinds: W is important work we need to end up on main, M is a commit we cherry-picked from main, and X is a temporary tweak that we don’t want to end up on main:

If we make a pull request from our work branch, GitHub will show a diff that includes all three commits W, M, and X. It was a surprise to me that M was included: it’s not a change that will happen if we merge the work branch, because M is already on main. GitHub doesn’t show you a diff between your branch and the current tip of main. It shows the diff since your branch diverged from main, which covers all of the commits on your branch. This makes it hard to assess what a merge will do if the branch has cherry-picked commits.
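To see the mismatch concretely, here is a minimal sketch in a throwaway repository. The file names and the `commit` helper are made up for illustration, and it assumes the pull request view corresponds to git's three-dot diff (`git diff main...work`). It compares that diff with what a trial merge would actually change on main.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name demo
git config user.email demo@example.com

# commit FILE MESSAGE: write MESSAGE into FILE and commit it (hypothetical helper)
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt base                       # shared history
git branch work                            # the work branch diverges here
commit m.txt M                             # M lands on main
git checkout -q work
commit w.txt W                             # work we want on main
git cherry-pick main >/dev/null            # copy of M, cherry-picked from main
commit x.txt X                             # temporary tweak

# Three-dot diff: everything on work since it diverged from main.
pr_files=$(git diff --name-only main...work | tr '\n' ' ')

# Trial merge on a detached HEAD, then compare the result with main's tip.
git checkout -q --detach main
git merge -q --no-edit work >/dev/null
merge_files=$(git diff --name-only main HEAD | tr '\n' ' ')

echo "pull request diff: $pr_files"
echo "effect of merging: $merge_files"
```

In this sketch the first list includes `m.txt` and the second does not, which is the surprise described above.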

And of course the pull request diff includes X, since that would be a change to main if we merge the work branch. But we don’t want X in the merge, and we don’t want to be distracted by M when reviewing the pull request. What should we do?

The answer is to use the “git revert” command to add commits to the branch that undo M and undo X. We show those as -M and -X:
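A sketch of the revert step in a scratch repository. The file names, the `commit` helper, and the `m_copy` and `x` variables are hypothetical; in a real branch you would pass the actual commit hashes to `git revert`.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name demo
git config user.email demo@example.com

# commit FILE MESSAGE: write MESSAGE into FILE and commit it (hypothetical helper)
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt base
git branch work
commit m.txt M                             # M on main
git checkout -q work
commit w.txt W
git cherry-pick main >/dev/null            # cherry-picked copy of M
m_copy=$(git rev-parse HEAD)
commit x.txt X
x=$(git rev-parse HEAD)

# Add the two undo commits to the work branch.
git revert --no-edit "$m_copy" >/dev/null  # -M
git revert --no-edit "$x" >/dev/null       # -X

git log --format=%s main..work             # the branch now has five commits
git diff --name-only main...work           # only the W change is left in the diff
```

The branch history keeps all five commits, but the diff since the branch point is reduced to the W change.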

Now the diff will show only W, great! The -X commit is perfect: it will prevent X from merging to main. But what about -M? What will happen when we merge that? I was concerned that it would undo the M commit on main. But it doesn’t.

A git merge combines the changes each branch has made since their common ancestor. In this case, the changes from M are on the main branch, and the work branch has no net change for M because the cherry-pick and its revert cancel out. So M is untouched, and remains on main after the merge. The merge does just what we want. It brings the W changes onto main, and I’ve named it wM to indicate that:
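The claim that -M leaves M alone on main can be checked in a scratch repository. This is a sketch under the same made-up setup (hypothetical file names and `commit` helper), not the real branch from the post.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name demo
git config user.email demo@example.com

# commit FILE MESSAGE: write MESSAGE into FILE and commit it (hypothetical helper)
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt base
git branch work
commit m.txt M                             # M on main
git checkout -q work
commit w.txt W
git cherry-pick main >/dev/null            # cherry-picked copy of M
m_copy=$(git rev-parse HEAD)
commit x.txt X
x=$(git rev-parse HEAD)
git revert --no-edit "$m_copy" >/dev/null  # -M
git revert --no-edit "$x" >/dev/null       # -X

# Merge the cleaned-up work branch into main.
git checkout -q main
git merge -q --no-edit work >/dev/null

ls                                         # files on main after the merge
```

After the merge, main has `m.txt` and `w.txt` but no `x.txt`: M survived, W arrived, and X stayed out.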

Some other points here:

Why not just merge the branch after the W commit? This is a simplified example for illustration. The real branch that sent me down this path has dozens of commits intermixed.
GitHub has three different ways to finish a pull request (merge, squash, rebase). This technique of using reverts to hide cherry-picked changes and avoid unwanted changes applies to all of them.
Although our merge only adds the W changes to main, the history will show the complete work branch, including our revert commits. If you wanted it a little cleaner, you could leave out the -M reverts before merging. The result will be the same with or without them.
If you want, you can also make a new branch for the revert commits to keep the work branch pristine:

Finally, the way to get the cleanest history is to create a new branch and rebase the commits we want before merging. This could be a lot of work, and some people will object to misrepresenting the actual history of commits. Git gives you plenty of tools to do it as you prefer.
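One way to sketch the clean-history option is below. It uses `git cherry-pick` onto a fresh branch rather than an interactive rebase, because that is easier to script; the effect on this small example is the same. The branch name `clean`, the file names, and the `commit` helper are hypothetical.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name demo
git config user.email demo@example.com

# commit FILE MESSAGE: write MESSAGE into FILE and commit it (hypothetical helper)
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt base
git branch work
commit m.txt M                             # M on main
git checkout -q work
commit w.txt W
w=$(git rev-parse HEAD)                    # remember the commit we want
git cherry-pick main >/dev/null            # cherry-picked copy of M
commit x.txt X

# Start a new branch from main and copy over only the wanted commit.
git checkout -q -b clean main
git cherry-pick "$w" >/dev/null

git log --format=%s main..clean            # the new branch carries only W
```

With dozens of intermixed commits, each wanted commit has to be picked (and any conflicts resolved) in order, which is where the extra work comes from.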

Comments

Adam Silkey 11:06 PM on 15 Oct 2024

Finally, the way to get the cleanest history is to create a new branch and rebase the commits we want before merging. This could be a lot of work, and some people will object to misrepresenting the actual history of commits.

This is such an interesting comment, because I think that people who object to ‘misrepresenting the actual history of commits’ are not using source control to its true potential - particularly with Git, where rebasing is such a powerful tool.

And, before you think I’ve started another talk based on something you’ve brought up, I’ve been advocating for and teaching better git usage for years.

Git 201 - Moving past -am "Really REALLY fix" coming soon to a blogging/speaking/something platform near you!

Ned Batchelder 5:59 AM on 16 Oct 2024

@Adam: I look forward to it!

Cleaning up a messy branch
