I use a lot of git aliases because I work in the terminal and aliases give me
short commands for common operations. They are defined in my
global git config file and range from simple to powerful
but twisty.
First, some basic aliases for operations I do often:
[alias]
br = branch
co = checkout
sw = switch
d = diff
di = diff
s = status -s -b --show-stash
These are simple, but others could use some explanation.
Committing
I have a few aliases for committing code. The “ci” alias provides the
default option “--edit” so that even if I provide a message on the command line
with “git ci -m”, it will pop me into the editor to provide more detail. “git
amend” is for updating the last commit with the latest file edits I’ve made, and
“git edit” is for updating the commit message on the latest commit:
[alias]
ci = commit --edit
amend = commit --amend --no-edit
edit = commit --amend --only
Returning home
I work in many repos. Many have a primary branch called “main” but in some
it’s called “master”. I don’t want to have to remember which is which, so I
have an alias “git ma” that returns me to the primary branch however it’s
named. It uses a helper alias to find the name of the primary branch:
[alias]
# Find the name of the primary branch, either "main" or "master".
primary = "!f() { \
git branch -a | \
sed -n -E -e '/remotes.origin.ma(in|ster)$/s@remotes/origin/@@p'; \
}; f"
If you haven’t seen this style of alias before, the initial exclamation point
means it’s a shell command, not a git command. Then we use shell
“f() {···}; f” syntax to define a function and immediately invoke
it. This lets us use shell commands in a pipeline, access arguments with
$1, and so on. (Fetching GitHub pull requests has
more about this technique.)
This alias uses the “git branch -a” command to list all the branches, then
pipes it into the Unix sed command to find the remote one named either “main” or
“master”.
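To make the sed filter easier to follow, here is a minimal Python sketch of the same selection logic. It is only an illustration: the function name primary_branch is made up for this example, and the alias itself uses sed, not Python.

```python
import re

# Same idea as the sed pattern: a line ending in remotes/origin/main or
# remotes/origin/master. The unescaped dots mirror the sed expression.
_PRIMARY = re.compile(r"remotes.origin.(ma(?:in|ster))$")


def primary_branch(branch_listing):
    """Return "main" or "master" from `git branch -a` output, else None.

    Hypothetical helper for illustration only.
    """
    for line in branch_listing.splitlines():
        match = _PRIMARY.search(line.rstrip())
        if match:
            return match.group(1)
    return None


sample = """\
* feature
  main
  remotes/origin/HEAD -> origin/main
  remotes/origin/feature
  remotes/origin/main
"""
print(primary_branch(sample))
```

The "HEAD -> origin/main" line is skipped because the pattern needs "main" or "master" directly after "remotes/origin/".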
With “git primary” defined, we can define the “ma” alias to switch to the
primary branch and pull the latest code. I like “ma” because it’s short for
both main and master, and because it feels like coming home (“Hi ma!”):
[alias]
# Switch to main or master, whichever exists, and update it.
ma = "!f() { \
git checkout $(git primary) && \
git pull; \
}; f"
For repos with an upstream, I need to pull their latest code and also push to
my fork to get everything in sync. For that I have “git mma” (like ma but
more):
[alias]
# Pull the upstream main/master branch and update our fork.
mma = "!f() { \
git ma && \
git pull upstream $(git primary) --ff-only && \
git push; \
}; f"
Merging and finishing branches
For personal projects, I don’t use pull requests to make changes. I work on
a branch and then merge it to main. The “brmerge” alias merges a branch and
then deletes the merged branch:
[alias]
# Merge a branch, and delete it here and on the origin.
brmerge = "!f() { \
: git show; \
git merge $1 && \
git branch -d $1 && \
git push origin --delete $1; \
}; f"
This shows another technique: the “: git show;” command does
nothing but instructs zsh’s tab completion that this command takes the same
arguments as “git show”. In other words, the name of a branch. That argument
is available as $1 so we can use it in the aliased shell commands.
Often what I want to do is switch from my branch to main, then merge the
branch. The “brmerge-” alias does that. The “-” is similar to “git switch -”
which switches to the branch you last left:
[alias]
# Merge the branch we just switched from.
brmerge- = "!f() { \
git brmerge $(git rev-parse --abbrev-ref @{-1}); \
}; f"
Finally, “git brdone” is what I use from a branch that has already been
merged in a pull request. I return to the main branch, and delete the work
branch:
[alias]
# I'm done with this merged branch, ready to switch back to another one.
brdone = "!f() { \
: git show; \
local brname=\"$(git symbolic-ref --short HEAD)\" && \
local primary=\"$(git primary)\" && \
git checkout ${1:-$primary} && \
git pull && \
git branch -d $brname && \
git push origin --delete $brname; \
}; f"
This one is a monster, and uses “local” to define shell variables I can use
in a few places.
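If the shell syntax is hard to read, the sequence of steps in brdone can be written out as plain data. This is a hedged Python sketch, not something from my config: brdone_plan and run_plan are hypothetical names, and the "target or primary" expression stands in for the shell's ${1:-$primary} default.

```python
import subprocess


def brdone_plan(current_branch, primary, target=None):
    """List the git commands brdone would run, in order (illustrative)."""
    # ${1:-$primary} uses $1 unless it is unset or empty.
    destination = target or primary
    return [
        ["git", "checkout", destination],
        ["git", "pull"],
        ["git", "branch", "-d", current_branch],
        ["git", "push", "origin", "--delete", current_branch],
    ]


def run_plan(plan):
    """Run each command, stopping at the first failure like a && chain."""
    for command in plan:
        subprocess.run(command, check=True)


for command in brdone_plan("fix-typo", "main"):
    print(" ".join(command))
```

Both branch names are captured before the checkout, which is why the alias needs variables at all: after switching away, HEAD no longer names the work branch.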
There are other aliases in my git config file, some
of which I’d even forgotten I had. Maybe you’ll find other useful pieces
there.
I have two main approaches for producing changelogs, but both are based on
the same principles: make it convenient for the author to create them, then make
it possible to use the information automatically to benefit the readers.
The first way is with a tool such as scriv, which I
wrote, but which was inspired by previous similar tools like
towncrier and CPython’s
blurb. They let you write your
changelog one entry at a time in the same pull request as the product change
itself. The entries are individual uniquely named files that are collected
together when a release is made. This avoids merge conflicts that will happen
if a number of developers have to all edit the same changelog file.
The second way I maintain a changelog is how I do it for
coverage.py. This predates scriv, and is more
custom-coded, so I’ll walk through the steps. Maybe you will be inspired to add
bits to other tooling.
I hand-edit a CHANGES.rst file. An entry there
might look like this:
CHANGES.rst
- Fix: we failed calling
  :func:`runpy.run_path <python:runpy.run_path>`, as described
  in `issue 1234`_. This is now fixed, thanks to `Debbie Developer
  <pull 2345_>`_. Details are on the :ref:`configuration page
  <config_report_format>`.

.. _issue 1234: https://github.com/nedbat/coveragepy/issues/1234
.. _pull 2345: https://github.com/nedbat/coveragepy/pull/2345
This lets me use semantic linking mechanisms. GitHub displays .rst files,
but unfortunately doesn’t understand the :ref: style of links.
The changelog is part of the docs for the project, pulled into the docs/ tree
with a Sphinx directive. The :end-before: option lets me have end-of-page
content in CHANGES.rst that doesn’t appear in the docs:
doc/changes.rst
.. include:: ../CHANGES.rst
    :end-before: scriv-end-here
It’s great when researching a bug fix in other projects to see an issue
closed with a comment about the commit that fixed it. Even better is when the
issue mentions what release first had the fix. I automate that process for
coverage.py.
To do that and a few other things, I have some custom tooling. It’s a bit
baroque because it grew over time, but it suits my purposes. First I need to get
the changelog into a more easily understood form. Sphinx has a little-known
feature to produce .rst files as output. It sounds paradoxical, but the benefit
is that all links are reduced to their simplest form. The entry above
becomes:
tmp/changes.rst
* Fix: we failed calling
  https://docs.python.org/3/library/runpy.html#runpy.run_path, as
  described in `issue 1234
  <https://github.com/nedbat/coveragepy/issues/1234>`_. This is now
  fixed, thanks to `Debbie Developer
  <https://github.com/nedbat/coveragepy/pull/2345>`_. Details are on
  the `configuration page <config.rst#config-report-format>`_.
Then pandoc converts it to Markdown
and my parse_relnotes.py creates a JSON file to
make it easy to find entries for each version:
[
{
"version": "7.6.1",
"text": "- Fix: coverage used to fail when measuring code using ...",
"prerelease": false,
"when": "2024-08-04"
},
...
Finally(!) comment_on_fixes.py gets the
latest release from the JSON file, regexes it for GitHub URLs in the text, and
adds a comment to closed issues and merged pull requests:
This is now released as part of [coverage 7.x.y](https://pypi.org/project/coverage/7.x.y).
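The URL-finding step could be sketched roughly like this. To be clear, this is not the code in comment_on_fixes.py: the regex and the names github_refs and release_comment are assumptions made for illustration. Only the comment wording comes from the example above.

```python
import re

# Assumed pattern for issue and pull request URLs; the real script may differ.
GITHUB_REF = re.compile(
    r"https://github\.com/([\w.-]+/[\w.-]+)/(issues|pull)/(\d+)"
)


def github_refs(text):
    """Return unique (repo, kind, number) tuples in order of appearance."""
    seen = []
    for match in GITHUB_REF.finditer(text):
        ref = (match.group(1), match.group(2), int(match.group(3)))
        if ref not in seen:
            seen.append(ref)
    return seen


def release_comment(version):
    """Build the comment text to post on each issue or pull request."""
    return (
        "This is now released as part of "
        f"[coverage {version}](https://pypi.org/project/coverage/{version})."
    )


entry_text = (
    "- Fix: described in [issue 1234]"
    "(https://github.com/nedbat/coveragepy/issues/1234), thanks to "
    "[Debbie Developer](https://github.com/nedbat/coveragepy/pull/2345)."
)
for repo, kind, number in github_refs(entry_text):
    print(repo, kind, number)
```

Posting the comment itself would go through the GitHub API, which is left out here.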
The other automated output from my CHANGES.rst file is a GitHub release.
GitHub releases are both convenient and problematic. I don’t like the idea of
authoring content on GitHub that is only available on GitHub. The history of my
project is an important part of my project, so I want the source of truth to be
a version-controlled text file in my source distribution. But people want to
see GitHub releases. So I author in CHANGES.rst, but publish to GitHub
releases.
Using github_releases.py I automatically
generate a GitHub release from the JSON file. This was useful enough that I
added a github-release command to scriv to do a similar thing, but coverage.py
still has the custom code to take advantage of the rst link simplifications I
showed above.
One of the things I don’t like about GitHub releases is that they always have
“Assets” appended to the end, with links to .zip and .tar.gz snapshots of the
repo. Those aren’t the right way to get the package, so I include the link to
the PyPI page and the correct command to install the package.
Described like this, it all sounds complicated, and I guess it is. I like being
able to publish information to people who want it, and this automation
accomplishes that.
I playfully quipped about changelogs, and Sumana
Harihareswara thoughtfully responded with Changelogs and
Release Notes. I agree with her on some things, and disagree on others.
My point with the meme was that people should put effort into a hand-crafted
description of what has changed in each release of their product. It should be
focused on what users need to know, and not include internal changes, which can
be found in the git commits or pull requests. It’s easy to publish a list of
commits or pull requests and call it a changelog, but it’s not that helpful to
your users trying to understand what has changed for them. That was the point
of the meme.
But Sumana raised the stakes, explaining why projects should produce
two hand-crafted descriptions. The first is a changelog which mentions
every non-trivial change. The second is release notes, which should be
user-focused with more details.
I liked the reasons Sumana gave:
- Release notes can include project-level information that doesn’t correspond
to a particular change in a release. Maybe you started a new discussion forum,
or there’s a shift in maintainer attention, plans for upcoming work, and so
on.
- If the release notes are user-focused, then the changelog can be more
comprehensive, giving people a fuller picture of the work that goes into
producing the project. This can pull back the curtain, helping people understand
the inner workings of the project and perhaps find a way to help out.
My problem with separating the changelog and release notes is that I have
limited energy to produce them, and perhaps more importantly, people have
limited attention to read them. For my projects, I opt instead for a middle
ground: my changelogs lean more toward Sumana’s
ideal of release notes. They are hand-written, focused on what users of the
project need to know, and do not include things like build changes and
refactorings.
For large projects like Python and Linux, there are many maintainers and many
types of information, so it makes sense to have multiple views of “what’s
changed.” For single-maintainer projects, it feels like too much. I applaud
people who can do it, but I don’t think I can, and I won’t expect it from
others.
Ultimately, each project has to decide for themselves how to balance the
effort and the benefit. They know their audience(s), and what resources they
have to do the work. Open source is already difficult; the last thing I want to
do is add a giant SHOULD to a project.
There’s an inexact nested ratio at work in projects: Most users (say 90%)
will only consume; you will never hear from them. You hear from the remaining
10%, but only 10% of those will do something you consider a contribution. For
widely used projects like coverage.py, I think the ratio might be more like 1%
of 1% instead of 10% of 10%. How does this affect your communication approach?
You could look at it two ways: either write for the audience you have (focus on
the 90%), or write for the audience you want (focus on the 10%).
In my changelogs now, for fixes I try to describe the bad thing that used to
happen and any important changes in behavior. For features, I link to the new
docs. I include links to issues and pull requests, and I name the contributors
who helped.
So I guess my approach is to write changelogs for the 90%. But I like
Sumana’s idea of making the full picture of maintenance more visible to people,
so I’m thinking about how to add that without changing the essential character
of my changelog. Perhaps something at the end summarizing the changes that
aren’t yet mentioned, with a link to the git history? I’m not sure I can
automate collecting that information, but I’ll have to play with it.
Let’s say you have a long-lived git branch. Most of the changes should be
merged back to main, but some of the changes were already cherry-picked from
main, and some of the changes shouldn’t be put onto main at all. How do you
review the branch and merge it?
Here’s a diagram of a simple example. The main branch at the top has seven
commits. Beneath that is our work branch with three commits, of the three
different kinds: W is important work we need to end up on main, M is a commit we
cherry-picked from main, and X is a temporary tweak that we don’t want to end up
on main:
If we make a pull request from our work branch, GitHub will show a diff that
includes all three commits W, M, and X. It was a surprise to me that M was
included: it’s not a change that will happen if we merge the work branch,
because M is already on main. GitHub doesn’t show you a diff between your
branch and main, it shows the diff since your branch diverged from main: it
shows all of the commits on your branch. This makes it hard to assess what a
merge will do if the branch has cherry-picked commits.
And of course the pull request diff includes X, since that would be a change
to main if we merge the work branch. But we don’t want X in the merge, and we
don’t want to be distracted by M when reviewing the pull request. What should
we do?
The answer is to use the “git revert” command to add commits to the branch
that undo M and undo X. We show those as –M and –X:
Now the diff will show only W, great! The –X commit is perfect, it will
prevent X from merging to main. But what about –M? What will happen when we
merge that? I was concerned that it would undo the M commit on main. But it
doesn’t.
A git merge compares two snapshots of the repo and combines them. In this
case, the changes from M are on the main branch, and no trace of them is on the
work branch, so M is fine, and remains on main after the merge. The merge does
just what we want. It brings the W changes onto main, and I’ve named it wM to
indicate that:
Some other points here:
- Why not just merge the branch after the W commit? This is a simplified
example for illustration. The real branch that sent me down this path has
dozens of commits intermixed.
- GitHub has three different ways to finish a pull request (merge, squash,
rebase). This technique of using reverts to hide cherry-picked changes and
avoid unwanted changes applies to all of them.
- Although our merge only adds the W changes to main, the history will show
the complete work branch, including our revert commits. If you wanted it a
little cleaner, you could leave out the –M reverts before merging. The result
will be the same with or without them.
- If you want you can also make a new branch for the revert commits to
keep the work branch pristine:
- Finally, the way to get the cleanest history is to create a new branch and
rebase the commits we want before merging. This could be a lot of work, and
some people will object to misrepresenting the actual history of commits. Git
gives you plenty of tools to do it as you prefer.
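Returning to why merging the –M revert doesn’t undo M on main: a toy model of a merge makes it concrete. This is a deliberately simplified sketch that works on whole files and uses the point where the branches diverged as a base. Real git merges work line by line within files, and merge_snapshots is a made-up name, not a git API.

```python
def merge_snapshots(base, ours, theirs):
    """File-level three-way merge of {path: content} snapshots."""
    merged = {}
    conflicts = []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o  # both sides agree
        elif t == b:
            result = o  # only our side changed this file
        elif o == b:
            result = t  # only their side changed this file
        else:
            conflicts.append(path)
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


# Snapshot where the work branch diverged from main.
base = {"a.py": "a0", "m.py": "m0", "w.py": "w0", "x.py": "x0"}
# main has moved on and includes the M change.
main = {"a.py": "a1", "m.py": "m1", "w.py": "w0", "x.py": "x0"}
# Work branch after W, M, X and then the reverts: only W remains.
work_reverted = {"a.py": "a0", "m.py": "m0", "w.py": "w1", "x.py": "x0"}
# Work branch with X reverted but the cherry-picked M left in place.
work_keeping_m = {"a.py": "a0", "m.py": "m1", "w.py": "w1", "x.py": "x0"}

print(merge_snapshots(base, main, work_reverted))
print(merge_snapshots(base, main, work_keeping_m))
```

In both cases the merged snapshot keeps M from main and gains W, which matches the claim that the result is the same with or without the –M revert.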
Cog is my tool for using bits of Python to generate
content inside an otherwise static file. I used it in extreme ways to generate
my GitHub profile page.
If you haven’t seen it before, you can customize your GitHub profile by
creating a README.md in a repo named the same as your username. So
my profile is rendered from
nedbat/nedbat/README.md.
My profile has a bit of static text, but much of it is badges, blog posts,
links to PyPI projects, and so on. The README.md is
literally a Markdown file that can be displayed by GitHub, but it’s full of HTML
comments containing Python code that generates the content. The generation
happens once a day in a GitHub action.
There are three kinds of lines in a file run through cog: static content,
code that will generate content, and generated content. My README.md is
lop-sided: it has 225 lines of code, 38 of static content, and 43 of generated
content.
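To show what those three kinds of lines look like, here is a small sketch that classifies the lines of a cog-style file using cog’s default markers ([[[cog, ]]] and [[[end]]]). It is not how cog itself is implemented, classify_lines is a hypothetical name, and counting the marker lines as code is my assumption here. It also ignores cog’s single-line form.

```python
def classify_lines(text):
    """Count static, code, and generated lines in cog-style text."""
    counts = {"static": 0, "code": 0, "generated": 0}
    state = "static"
    for line in text.splitlines():
        if "[[[end]]]" in line:
            counts["code"] += 1  # end marker closes the generated section
            state = "static"
        elif "[[[cog" in line:
            counts["code"] += 1  # start marker opens the code section
            state = "code"
        elif state == "code" and "]]]" in line:
            counts["code"] += 1  # closing marker: output follows
            state = "generated"
        else:
            counts[state] += 1
    return counts


sample = """\
# Hello
<!-- [[[cog
print("a badge")
]]] -->
a badge
<!-- [[[end]]] -->
Bye
"""
print(classify_lines(sample))
```

Wrapping the markers in HTML comments is what keeps the code invisible when GitHub renders the Markdown.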
The badges are made with shields.io image
URLs. To make this easier, there are Python functions for Markdown image
syntax, for building shields.io badge URLs, and so on.
I can’t walk through all of the code, but I can show a few simplified
versions to convey the idea. Read the file itself if
you are interested in the full details.
This makes a shields.io URL:
from urllib.parse import quote, urlencode

def shields_url(
    label=None,
    message=None,
    color=None,
    label_color=None,
    logo=None,
    logo_color=None,
):
    params = {"style": "flat"}
    url = "".join([
        "/badge/",
        quote(label or ""),
        "-",
        quote(message),
        "-",
        color,
    ])
    url = "https://img.shields.io" + url
    if label_color:
        params["labelColor"] = label_color
    if logo:
        params["logo"] = logo
    if logo_color:
        params["logoColor"] = logo_color
    return url + "?" + urlencode(params)
This makes a Markdown image:
def md_image(image_url, text, link):
    return f'[![{text}]({image_url} "{text}")]({link})'
Now we can make a Markdown badge:
def badge(text=None, link=None, **kwargs):
    return md_image(image_url=shields_url(**kwargs), text=text, link=link)
Anything print’ed will become part of the generated portions of the file.
We can add a badge to the page with:
print(badge(
    logo="discord", logo_color="white", label_color="7289da",
    message="Discord", color="ffe97c",
    text="Python Discord", link="https://discord.gg/python",
))
There are other functions built on top of these to make Mastodon badges,
Stack Overflow badges, a row of badges for a PyPI project, and so on.
Building the page ends up pulling data from 10 URLs, including a JSON summary
of my blog for including blog posts. It’s satisfying to be able to have this
update automatically instead of having to copy data around.
The result is a convenient mix of static and
generated, and it was a fun exercise in light-touch automation.
As I mentioned in a few recent posts, I’ve been making
some significant changes in coverage.py to take advantage of new capabilities in
Python.
Mark Shannon has been improving the sys.monitoring
API so that branch coverage can be done with low overhead. I want to take
advantage of that in coverage.py, but I needed to do some refactoring work
first. The tests were focused on mapping the complete set of code pathways
(which I called arcs), but using low-overhead branch monitoring won’t provide
those complete pathways. If the tests continued to focus on them, they would
fail with sys.monitoring.
But the complete pathways aren’t actually needed. The useful information is
where the branches are, and which branches were taken. That can be measured
with sys.monitoring. So a first step was to refactor the tests to focus on
branches instead of arcs. That took a while, but is now done.
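The difference between arcs and branches can be shown with a small sketch. This is an illustration of the idea only, not coverage.py’s code: an arc is modeled as a (from_line, to_line) pair, a branch is a line with more than one possible destination, and both function names are made up.

```python
from collections import defaultdict


def branch_destinations(possible_arcs):
    """Map each line with more than one possible exit to its destinations."""
    exits = defaultdict(set)
    for src, dst in possible_arcs:
        exits[src].add(dst)
    return {src: dsts for src, dsts in exits.items() if len(dsts) > 1}


def missing_branches(possible_arcs, executed_arcs):
    """For each branch line, list the destinations that were never taken."""
    executed = set(executed_arcs)
    missing = {}
    for src, dsts in branch_destinations(possible_arcs).items():
        not_taken = sorted(d for d in dsts if (src, d) not in executed)
        if not_taken:
            missing[src] = not_taken
    return missing


# Line 2 is an "if": it can go to line 3 or skip ahead to line 5.
possible = [(1, 2), (2, 3), (2, 5), (3, 5), (5, 6)]
executed = [(1, 2), (2, 3), (3, 5), (5, 6)]
print(missing_branches(possible, executed))
```

Only the arcs leaving line 2 matter for branch coverage here. The other arcs describe straight-line flow, which is why the complete set of pathways isn’t needed.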
Not needing all those arcs also meant I could simplify the AST-based parser
that found the arcs, removing about 150 lines. I suspect there’s more that
could be removed. Maybe it will happen over time. Also, the new
code.co_branches() method might make it all obsolete over time.
If you read Coverage at a crossroads on this blog, I
talked about using ideas from SlipCover like inserting fake lines with an import
hook. Those exotic ideas were appealing in their way, but are no longer needed,
and they would have brought a bunch of complexity. With the two new
sys.monitoring events, we can get the branch information directly without
advanced shenanigans.
There’s more work to do, including attending to incoming bug reports. If
you’d like to help, or learn more about any of this, we have a
#coverage-py channel in the Python Discord.