I’ve released another alpha of coverage.py 5.0:
There are some design decisions ahead that I could use feedback on.
The big feature in 5.0 is “contexts”: recording data for varying execution
context, also known as Who Tests
What. The idea is to record not just that a line was executed, but also
which tests ran each line.
Some of the changes in alpha 6 were driven by a hackathon project at work:
using who-tests-what on the large Open edX codebase. We wanted to collect
context information, and then for each new pull request, run only the subset of
tests that touched the lines you changed. Initial experiments indicate this
could be a huge time savings.
Big changes in this alpha:
Support for contexts when reporting. The --show-contexts option annotates
lines with the names of contexts recorded for the line. The --contexts option
lets you filter the report to only certain contexts. Big thanks to Stephan
Richter and Albertas Agejevas for the contribution.
Our largest test suite at work has 29k tests. The .coverage SQLite data
file was 659 MB, which was too large to work with. I changed the database format
to use a compact bitmap representation for line numbers, which reduced the data
file to 69 MB, a huge win.
The API to the CoverageData object has changed.
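The bitmap representation mentioned above can be sketched roughly as follows. This illustrates the general technique (one bit per line number, packed into bytes) and is not a copy of the format coverage.py actually stores; the function names are made up for the example.

```python
def lines_to_bitmap(linenos):
    """Pack line numbers into bytes, using one bit per line number."""
    linenos = list(linenos)
    if not linenos:
        return b""
    # Enough bytes to hold the bit for the largest line number.
    bits = bytearray(max(linenos) // 8 + 1)
    for n in linenos:
        bits[n // 8] |= 1 << (n % 8)
    return bytes(bits)


def bitmap_to_lines(bitmap):
    """Unpack a bitmap back into a sorted list of line numbers."""
    lines = []
    for byte_index, byte in enumerate(bitmap):
        for bit in range(8):
            if byte & (1 << bit):
                lines.append(byte_index * 8 + bit)
    return lines


executed = [1, 2, 3, 10, 17]
packed = lines_to_bitmap(executed)
print(len(packed), bitmap_to_lines(packed))
```

Five line numbers fit in three bytes here, and the saving grows with dense runs of executed lines. The trade-off is that individual line numbers are no longer visible as rows that a plain SQL query can select.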
Some implications of these changes:
The HTML reporting on contexts is good for small test suites, but very
quickly becomes unwieldy if you have more than 100 tests. Please try using it
and let me know what kind of reporting would be helpful.
The new more-compact data file is harder to query. The larger data file had
a schema designed to be useful for ad-hoc querying. It was a classic
third-normal form representation of the data. Now I consider the database
schema to be a private implementation detail. Should we have a new “coverage
sql” report command that exports the data to a convenient SQLite file?
Because CoverageData has changed, you will need an updated version of
pytest-cov if you use that plugin. The future of the plugin is somewhat up in
the air. If you would like to help maintain it, get in touch. You can install
the up-to-date code with:
To support our hackathon project, we wrote a new pytest plugin: it uses
pytest hooks to indicate the test boundaries, and can read the database and the
code diff to choose the subset of tests to run. This plugin is in very
rough shape (as in, it hasn’t yet fully worked), but if you are interested
in participating in this experiment, get in touch. The code is here
I don’t think this will remain as an independent plugin, so again, if you want
to help with future maintenance or direction, let me know.
One of Jerod’s questions was unexpected: what other open source maintainers
do I appreciate? Two people that came to mind were
Daniel Hahler and
Julian Berman. Some people are
well-known in the Python community because they are the face of large widely
used projects. Daniel and Julian are known to me for a different reason: they
seem to make small contributions to many projects. I see their names in the
commits or issues of many repos I wander through, including my own.
This is a different kind of maintainership: not guiding large efforts, but
providing little pushes in lots of places. If I had had the presence of mind, I
would have also mentioned Anthony Sottile
for the same reason.
And I would have mentioned Mariatta,
for a different reason: her efforts are focused on CPython, but on the
contribution process and tooling around it, rather than the core code itself. A
point I made in the podcast was that people and process challenges are often the
limiting factor to contribution, not technical challenges. Mariatta has been at
the forefront of the efforts to open up CPython contribution, and I wish I had
mentioned her in the podcast.
And I am sure there are people I am overlooking who should be mentioned in
these appreciations. My apologies to you if you are in that category...
Friday was the summer solstice, when day and night are at their most extreme
imbalance. It reminded me of the last summer solstice — and the year of light
and dark, ups and downs, since then — all revolving around Nat, my 29-year-old
Last year on the solstice we were at a block party in a neighborhood in
Boston. We had become close friends with a young couple, and planned for Nat to
move in with them, and for them to be his caregivers.
The couple was eager to extend their family from two to three. We had had
long serious discussions with them about the challenges involved. They knew Nat
pretty well, and had deep experience with similar disabled populations. They
had even moved to a new apartment in order to have space for Nat.
The new apartment was on this quirky cul-de-sac on a hill, a small tight-knit
community, complete with a summer solstice party. It seemed magical, like an
entirely new experience opening up to us. Nat would be moving in with a couple
his own age, with young enthusiasms, and an anything-is-possible approach to the
world. The neighborhood only added to the sense of expanding possibilities. It
seemed like a good plan, almost too good to be true.
We planned for Nat to move at the end of August. We spent lots of time with
his new caregivers over the summer, doing new things from their world. This
helped them understand Nat’s full-day routines better, and was exciting for
The move went great. But over the course of the fall, things started not
going well. Nat has always had periods of anxiety, but it’s hard to pinpoint
the causes. He was going through a bad time, with alarming head-hitting. The
caregivers were having health issues of their own, which made it difficult for
them to give Nat the routine and stability he needs.
We tried to support the new arrangement by having Nat on weekends, and
generally being there for everyone. For reasons that are still not clear to me,
it wasn’t enough, and things just kept going downhill, including our
interactions with the couple. By March, the arrangement that seemed too good to
be true proved to be exactly that. Nat moved back with us.
This was a hard time. Everyone involved reacted in their own ways to the
stress, which caused conflict between us and the caregivers. I think Nat
overall was happy to be back in our house, but his anxieties had not lessened.
Was the change of home part of the cause? We’ll never know.
Parenting Nat has involved a long series of choices for him: where he’ll be
schooled, where he will live, what he will do during his days and nights. These
choices often fall into two broad categories: the exciting but risky, and the
safe but underwhelming — another kind of light and dark. And underlying those
decisions is always the impossible question: are we doing enough?
Now that he was back with us, we had to decide where he would go next. He
could stay with us permanently, but we know that we are perhaps the least
stimulating place for him. We get caught up in our own activities and
interests, and he is passive enough that lots of time passes doing nothing. He
might be fine with it, but it makes us wonder: are we doing enough?
And looming over all of our planning for him is what will happen at the end
of our lives? Now he is 29 and we are 57, but when he is 49 and we are 77 (or
87 or 97!), living together will be a very different story. We want him to have
a life separate from us. We think it will be better for him.
The arrangement we had with the couple is known as “shared living,” and we
thought about whether we wanted to try that again. Nat had been in two shared
living situations by now, and our feeling was that it was too reliant on too few
people. We know shared living has worked for other people, but that’s another
constant in parenting Nat: just because something works for one autism family
doesn’t mean it will work for us. Shared living didn’t seem right for Nat.
We talked with other families we know about what they were planning to do.
But most of them had younger guys, or far more resources, or were making
decisions on longer timescales than us for other reasons. And honestly, housing
together with families we’re already friends with could be like going into
business with friends: a good way to strain or ruin the friendship. We didn’t
want to do that again.
We asked for a new placement in a group home, figuring we’d take a look at
what opened up and see how we felt about it. Two months later we were offered a
placement, in a house run by the same organization as the group home Nat had
moved out of the year before.
The residents are a much better fit with Nat this time, and the staff seems
eager and energetic. It’s hard to know whether we are getting accurate answers
from Nat when asked his opinion, but he has been nothing but positive about
moving to this new house. Being in the same organization means we are familiar
with some of the logistics, and Nat will know some of the residents from other
houses when they do things together.
Although a group home generally falls into the safe category rather than the
risky, it feels like this one might be safe without being underwhelming. We
moved him in yesterday, and all seems good. We’ve been through this enough to
know that it won’t be perfect. There will be miscommunications with the
rotating staff, and he’ll come home wearing another resident’s shirt, but
nothing is perfect.
We are still connected to the couple, through other circles. But it is
awkward now, because we have never directly talked about the strains from the
move-out. I hope that we can do that some day.
As I have said before, I
know this is not the last time we will have to make big decisions for Nat. This
one feels good, but others have felt good in the past too. I’m optimistic but
The Heinz company has been using the slogan “57 varieties”
for more than 120 years. But it was never factual. When they introduced the
slogan in 1896, the company already had more than 60 products. The number was
chosen for its sound and “psychological effect.”
It’s hard to know the exact number, but today Heinz has thousands of products,
including at least
Here’s a really simplistic model: if you want someone to do something, you
have to give them a compelling reason to do it, and you have to make it as easy
as possible for them to do it. That is, you need to have good answers to
Why? and How? (I don’t know much about
marketing, but I think these are the value proposition and the call to action.)
Let’s look at the Why and How model as it applies to corporations funding
open source. They don’t do it because the answers to Why and How are really bad.
Why should a corporation fund open source? As much as I wish it
were different for all sorts of reasons, corporations act only in purely
selfish ways. In order to spend money, they need to see some positive benefit
to them that wouldn’t happen if they didn’t spend the money.
This frustrates me because a corporation is a collection of people, none of
whom would act this way. I could say much more about this, but we aren’t going
to be able to change corporations.
Companies only spend money if doing so will bring them a (perceived)
benefit. Funding open source would make it stronger and better, but that is a
very long-term effect, and not one that accrues directly to the funder. This is the
famous Tragedy of the Commons. It’s a fair question for companies to ask: if
they fund open source, what do they get for their money?
That’s the difficulty with Why, but let’s imagine for a moment that we could
somehow convince someone to spend their company’s money funding open source:
now what? How do they do it? A significant Python project
could have a hundred library dependencies. How do they decide how to allocate
the funding budget among them? Once that decision is made, how does the money
get delivered? Very few open source projects are equipped to receive funds. If
even 10% of the projects have a clear path for funding, now there are 10 checks
to write, or 10 PayPal links to click through, or whatever. Some of that money
will need to be sent internationally, and it has to be considered at tax time.
Does it have to be done again next year, and the year after that? It’s a
So when we try to convince companies to fund open source, we don’t have good
answers for either Why? or How? It’s no
wonder it doesn’t happen.
This is one of the reasons I am optimistic about Tidelift:
they have good answers for both of these questions. The
Tidelift subscription gives
companies information and services around their open source dependencies, which
answers the why. And the payment to Tidelift solves the how: Tidelift looks at
the list of dependencies, decides an allocation, and distributes the money to
Sure, there are still lots of questions to be answered:
is the allocation
algorithm right? Will enough companies subscribe to make Tidelift itself
sustainable? And even larger questions, like: if an interesting amount of money
does flow to open source maintainers, what will be the cultural change in open
I don’t know the answers to those questions, but Tidelift seems like the
most promising answer to how to support open source. I’m a
participant. You should be too.
If you’ve used any programming language for a long enough time, you’ve found
things about it that you wish were different. It’s true for me with Python.
I have ideas about a number of things I would change about Python if I could.
I’ll bore you with just one of them: the syntax of class definitions.
But let’s start with the syntax for defining functions. It has this really
nice property: function definitions look like their corresponding function
calls. A function is defined like this:
When you call the function, you use similar syntax: the name of the
function, and a comma-separated list of arguments in parentheses:
Just by lining up the punctuation in the call with the same bits of the
definition, you can see that arg1 will be 12, and arg2 will be 34. Nice.
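As a concrete illustration of that parallel, here is a minimal sketch. The name `func_name` and the function body are invented for the example; only `arg1`, `arg2`, 12, and 34 come from the text.

```python
# `func_name` is a hypothetical name chosen for this illustration.
def func_name(arg1, arg2):
    # Return the arguments so the binding is easy to inspect.
    return arg1, arg2


# The call mirrors the definition: same name, same parentheses, same commas.
result = func_name(12, 34)
print(result)
```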
OK, so now let’s look at how a class with base classes is defined:
To create an instance of this class, you use the name of the class, and
parens, but now the parallelism is gone. You don’t pass a BaseClass to
construct a MyClass:
Just looking at the class line, you can’t tell what has to go in the parens
to make a MyClass object. So “def” and “class” have very similar syntax, and
function calls and object creation have very similar syntax, but the mimicry in
function calls that can guide you to the right incantation will throw you off
completely when creating objects.
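A small sketch of the mismatch, assuming a made-up constructor: the `__init__` arguments (`name`, `size`) are hypothetical and appear nowhere on the class line, which is the point.

```python
class BaseClass:
    pass


class MyClass(BaseClass):
    # These constructor arguments are invented for illustration; the
    # "class" line above gives no hint that they are required.
    def __init__(self, name, size):
        self.name = name
        self.size = size


# What actually goes in the parens comes from __init__, not the class line.
obj = MyClass("widget", 3)

# Mimicking the class line, as one would for a function, fails.
mimic_error = None
try:
    MyClass(BaseClass)
except TypeError as exc:
    mimic_error = exc
print(type(mimic_error).__name__)
```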
This is the sort of thing that experts glide right past without slowing
down. They are used to arcane syntax, and similar things having different
meanings in subtly different contexts. And a lot of that is inescapable in
programming languages: there are only so many symbols, and many many more
concepts. There are bound to be overlaps.
But we could do better. Why use parentheses that look like a function call
to indicate base classes? Here’s a better syntax:
Not only does this avoid the misleading punctuation parallelism, but it even
borrows from the English we use to talk about classes: MyClass derives
from BaseClass and AnotherBase. And “from” is already a keyword in Python.
BTW, even experts occasionally make the mistake of typing “def” where they
meant “class”, and the similar syntax means the code is valid. The error isn’t
discovered until the traceback, which can be baffling.
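Here is a minimal sketch of that mistake. `Point` and its arguments are hypothetical names for the example.

```python
# Intended: class Point(object):
# With "def" instead, this is a valid function named Point that takes
# one parameter called "object" and defines an unused inner function.
def Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Nothing complains until the "class" is used.
message = None
try:
    p = Point(1, 2)
except TypeError as exc:
    # The error is about argument counts, not about def vs. class.
    message = str(exc)
print(message)
```

The definition runs without complaint, and the eventual error talks about how many arguments were passed, which points away from the real typo.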
I’m not seriously proposing to change Python. Not because this wouldn’t be
better (it would), but because a change like this is impractical at this
late date. I guess it could be added as an alternative syntax, but it would be
hard to argue that having two syntaxes for classes would be better for
But I think it is helpful to try to see our familiar landscape as confused
beginners do. It can only help with explaining it to them, and maybe help us
make better choices in the future.