Tuesday 15 November 2011 — This is over 14 years old. Be careful.

A Mercurial challenge today: we have a repo (A) that we cloned (B) a year ago for a client. Work has been progressing in both A and B. Now we wanted to take all the work in repo A, and make it available in repo B as a new branch. I took a few stabs at it myself, but kept getting confused, so I asked in the #mercurial IRC channel on freenode.

The answers were enlightening to me.

Initially I had tried pulling from repo A into repo B. This was disconcerting, because all of the changes from A appeared in the default branch in repo B. I wanted them to be sequestered off into a new branch.

I learned that just because a changeset is in the default branch, that doesn’t mean that its effect will be in your tree if you update to default. A branch name in Mercurial is nothing more than a string label associated with a changeset. What we developers think of as “a branch” is a line of development separate from other lines. But in Mercurial, a “branch” is simply the set of changesets labelled with the same branch name. There’s no requirement that this set of changesets form a line, or in fact, that the changesets have any relationship to each other.
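To make that concrete for myself, here is a minimal sketch of the idea in Python. It is a toy model, not Mercurial's real data structures: the `Changeset` class and the `branch_members` helper are names I made up for illustration. The point is only that "the default branch" is whatever carries the label, related or not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Changeset:
    rev: int              # local revision number (hypothetical field)
    branch: str           # the branch label stored on the changeset
    parents: tuple = ()   # revision numbers of the parent changesets


def branch_members(changesets, name):
    """Return the revs of every changeset labelled with `name`."""
    return [c.rev for c in changesets if c.branch == name]


# Two root changesets with no common history, both labelled "default".
repo = [
    Changeset(0, "default"),
    Changeset(1, "default", parents=(0,)),
    Changeset(2, "default"),               # unrelated root, same label
    Changeset(3, "feature", parents=(2,)),
]

default_revs = branch_members(repo, "default")
print(default_revs)  # [0, 1, 2]
```

Revision 2 has nothing to do with revisions 0 and 1, yet all three are "in default" in this model, because membership is decided by the label alone.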

The Mercurial branch concept maps well onto the common meaning of branch because changesets inherit branch names from their parent. Once you set a branch name on a changeset, it tends to trickle down the lineage of changesets, naturally labelling a line of development. But in strict terms, this is a convenient coincidence: the Mercurial concept of branch (set of changesets with the same branch label) and the common meaning (a line of changesets deriving from each other) coincide.
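The inheritance rule can be sketched the same way. This is an illustrative model only, assuming a repo is just a list of dicts; the `commit` function is hypothetical and stands in for what happens when you commit without changing the branch name.

```python
def commit(repo, parent=None, branch=None):
    """Append a changeset to `repo` (a list of dicts) and return its rev.

    With no explicit `branch`, the label is copied from the parent,
    or set to "default" for a root changeset.
    """
    if branch is None:
        branch = repo[parent]["branch"] if parent is not None else "default"
    repo.append({"branch": branch, "parent": parent})
    return len(repo) - 1


repo = []
r0 = commit(repo)                               # root, labelled "default"
r1 = commit(repo, parent=r0)                    # inherits "default"
r2 = commit(repo, parent=r1, branch="stable")   # label set explicitly, once
r3 = commit(repo, parent=r2)                    # inherits "stable"
r4 = commit(repo, parent=r3)                    # still "stable"

labels = [c["branch"] for c in repo]
print(labels)
```

Setting the name once on `r2` is enough: every descendant picks it up, which is why a label ends up looking like a line of development.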

Another important thing: Mercurial changesets are immutable, and the branch label is part of the changeset. So there’s no way to take a changeset from A’s default, and put it into another branch in B.
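One way to picture why the label can't be changed: if a changeset's identifier is a hash over everything recorded in it, then a different branch label means a different changeset. The sketch below is a deliberately simplified toy, not Mercurial's actual on-disk format or hashing scheme; `changeset_id` and its three inputs are assumptions for illustration.

```python
import hashlib


def changeset_id(parent_id, branch, description):
    """Toy identifier: a hash over everything recorded in the changeset."""
    payload = "\n".join([parent_id, branch, description]).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()[:12]


root = "0" * 12
original = changeset_id(root, "default", "Fix the parser")
relabelled = changeset_id(root, "from-repo-a", "Fix the parser")
same_again = changeset_id(root, "default", "Fix the parser")

print(original == same_again)   # True: same content, same identifier
print(original == relabelled)   # False: a new label is a new changeset
```

In this model, "moving" a changeset to another branch would really mean creating a different changeset with a different identifier, and the original would still exist unchanged.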

When I pulled changesets from A into B, since the changesets in A had been part of “default”, when they landed in B, they were also in “default”. But that doesn’t mean they were magically part of the ancestry of B’s tip. When I looked at the log, I saw recent changesets on “default”, but there were really two distinct lines of development, both labelled “default”.
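Here is a small sketch of that situation, again as a toy model with made-up revision numbers: revisions 1 and 2 stand for B's own work, and revisions 3 and 4 for the changesets pulled from A. The helpers `ancestors` and `branch_heads` are hypothetical names, not Mercurial APIs.

```python
def ancestors(parents, rev):
    """Return `rev` plus every changeset reachable through parent links."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


def branch_heads(parents, branches, name):
    """Revs labelled `name` that have no child labelled `name`."""
    has_child = {p for r, ps in parents.items() if branches[r] == name
                 for p in ps}
    return sorted(r for r in parents
                  if branches[r] == name and r not in has_child)


# rev -> parent revs. Rev 0 is the shared starting point.
parents = {0: [], 1: [0], 2: [1], 3: [0], 4: [3]}
branches = {rev: "default" for rev in parents}   # every label is "default"

tip_b = 2
heads = branch_heads(parents, branches, "default")
pulled_in_b_line = {3, 4} & ancestors(parents, tip_b)

print(heads)             # [2, 4]
print(pulled_in_b_line)  # set()
```

One label, two heads: the pulled changesets are in "default" but none of them is an ancestor of B's tip, so they have no effect on the files you get when you update to B's tip.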

Here are the directions for the right way to accomplish this merge. To prepare, note the current tip revisions of repo A and B, call them Tip_A and Tip_B. Then, all the actual work happens in repo B. First, pull all of the changes from repo A into repo B, and update to the tip of A:

$ hg pull Repo_A
$ hg update Tip_A

At this point, we’re in the state that scared me: a pile of new changes has appeared in the “default” branch. Don’t panic: they aren’t really in the main line of development.

To make the repo make sense, we’ll label A’s default line as a new branch:

$ hg branch New_Branch_Name
$ hg ci -m "Created New_Branch_Name from latest on Repo_A"

We’ve labelled the tip with a new branch name and checked it in. This means people who want to work with the newly-pulled changes from A can use the branch name, while others can stay with B’s default. Notice that this doesn’t change the branch name on all of those A changes (nothing can, changesets are immutable), but at least we have a name for the tip of that line of development.

We’re all set now, except for one thing: if someone updates to “default” now, they won’t get what they expect. When updating to a branch name, the changeset used is the newest one carrying the branch name. Newest doesn’t mean the most recent date; it means the last one added to the repo. Since we just pulled in all those A changes on “default”, one of them will be the tip of “default”, even though we just created a new branch to represent them.
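The difference between "newest by date" and "last one added" is easy to show with a sketch. This models the rule as described above, with a branch name resolving to the highest revision number carrying that label. The dates and revision numbers are invented for illustration, and `resolve_branch` and `newest_by_date` are hypothetical helpers.

```python
from datetime import date

# (rev, branch, commit date), in the order changesets were added to repo B.
changesets = [
    (0, "default", date(2010, 11, 1)),           # shared history
    (1, "default", date(2011, 11, 14)),          # B's latest work (Tip_B)
    (2, "default", date(2011, 6, 1)),            # pulled from A
    (3, "default", date(2011, 9, 1)),            # pulled from A (Tip_A)
    (4, "New_Branch_Name", date(2011, 11, 15)),  # the branch-labelling commit
]


def resolve_branch(changesets, name):
    """Pick the highest rev among the changesets labelled `name`."""
    return max(rev for rev, branch, _ in changesets if branch == name)


def newest_by_date(changesets, name):
    """Pick the rev with the latest commit date among those labelled `name`."""
    return max((d, rev) for rev, branch, d in changesets if branch == name)[1]


by_order = resolve_branch(changesets, "default")
by_date = newest_by_date(changesets, "default")

print(by_order)  # 3
print(by_date)   # 1
```

Revision 1 is B's most recent work by date, but revision 3 arrived in the repo later, so in this model "default" resolves to one of A's changesets.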

To fix this, we switch back to B’s default branch, and commit a change there:

$ hg update Tip_B
(.. edit something ..)
$ hg ci -m "The new tip to work from after all those Repo_A changes"

Now this change is the most recent “default” changeset, so all is good: B’s default line of development is still “default”, and A’s changes are available on a new branch.
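To check my understanding of the whole procedure, here is a rough simulation under the same simplifying assumption that a branch name resolves to the highest revision number carrying it. The `add`, `resolve`, and `lineage` helpers are hypothetical, and each changeset has at most one parent to keep the model small.

```python
repo = []  # each entry is (branch, parent_rev); the list index is the rev


def add(branch, parent):
    """Append a changeset and return its revision number."""
    repo.append((branch, parent))
    return len(repo) - 1


def resolve(name):
    """Highest rev carrying the branch label `name`."""
    return max(r for r, (b, _) in enumerate(repo) if b == name)


def lineage(rev):
    """Follow parent links from `rev` back to the root."""
    out = []
    while rev is not None:
        out.append(rev)
        rev = repo[rev][1]
    return out


root = add("default", None)
tip_b = add("default", root)                 # B's own work
a1 = add("default", root)                    # hg pull Repo_A ...
tip_a = add("default", a1)                   # ... brings in A's line
new_branch = add("New_Branch_Name", tip_a)   # hg branch + hg ci

before_fix = resolve("default")              # still one of A's changesets
fix = add("default", tip_b)                  # hg update Tip_B, edit, hg ci
after_fix = resolve("default")

print(before_fix == tip_a)                   # True
print(lineage(after_fix))                    # [5, 1, 0]
print(lineage(resolve("New_Branch_Name")))   # [4, 3, 2, 0]
```

After the extra commit, "default" resolves to a changeset whose lineage contains only B's work, and "New_Branch_Name" resolves to one whose lineage runs through A's changes.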

I’ve been using Mercurial for a long time, but never understood things at this level. What had seemed bizarre and confusing makes so much more sense when the fundamentals are clear! Thanks, Ry4an and timeless in #mercurial.

Comments

Ed Davies 4:58 AM on 16 Nov 2011

Thanks for this. I've read various descriptions of branches (as in branch names rather than just different lines of development) in Mercurial and not been able to make sense of them so never used them. Perhaps realizing they're just tags on arbitrary collections of commits will help if I go back and look again.

John 7:44 AM on 16 Nov 2011

Good post. Another thing to note is that you could have made that new branch in repo A before you pulled it. For example "cd repo_A" "hg branch New_Branch_Name" "hg commit" all before pulling into repo_B.

If you plan on doing more work in repo_A this may in fact be better because you'll know all your new changes will be marked with the proper branch name with respect to repo_B. Otherwise, you'll end up making new changesets on the default branch, and you'll have to go through this whole process again. Pulling from repo_B into repo_A and then updating to New_Branch_Name before making any new changes there will also have a similar effect.

Aron Griffis 11:16 AM on 16 Nov 2011

Ned, have you seen this article? That was helpful to me, though the redundancy between bookmarks and named branches is just annoying. It's one of the few areas where I think git is actually less confusing than mercurial.

Remigiusz 'lRem' Modrzejewski 11:50 AM on 16 Nov 2011

On the other hand, in Fossil branch names are not part of check-ins, but of tags associated with them. While check-ins are immutable, you can edit tags at any point (although a trace of that stays in history forever). Thus in your case you would simply rename the branches at the point of branching and get a tree that's what you wanted.

Advanced Mercurial branches
