Let’s say you are running a conference,
and let’s say your Call for Proposals
is open, and is saving people’s talk ideas into a spreadsheet.
I am in this situation. Reviewing those proposals is a pain, because there
are large paragraphs of text, and spreadsheets are a terrible way to read
them. I did the typical software engineer thing: I spent an hour writing
a tool to make reading them easier.
The result is csv_review.py.
It’s a terminal program that reads a CSV file (the exported proposals).
It displays a row at a time on the screen, wrapping text as needed.
It has commands for moving around the rows. It collects comments into a
second CSV file. That’s it.
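For flavor, here is a minimal sketch of that kind of tool. Everything in it (the function names, the command set, the comment-file format) is my own invention for illustration, not the actual csv_review.py:

```python
import csv
import textwrap

def format_row(row, width=72):
    """Render one CSV row (a dict of field -> text) as wrapped text."""
    lines = []
    for field, value in row.items():
        lines.append(f"== {field} ==")
        lines.extend(textwrap.wrap(value, width=width) or [""])
    return "\n".join(lines)

def review(csv_path, comments_path):
    """Page through rows one at a time, collecting comments into a second CSV."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    i = 0
    with open(comments_path, "a", newline="") as out:
        writer = csv.writer(out)
        while 0 <= i < len(rows):
            print(f"--- proposal {i + 1} of {len(rows)} ---")
            print(format_row(rows[i]))
            cmd = input("[n]ext, [p]rev, c <comment>, [q]uit> ").strip()
            if cmd == "n":
                i += 1
            elif cmd == "p":
                i = max(0, i - 1)
            elif cmd.startswith("c "):
                writer.writerow([i, cmd[2:]])
            elif cmd == "q":
                break
```

The whole trick is just textwrap plus a read-eval loop; the real tool surely differs in the details.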
There are probably already better ways to do this. Everyone knows that to
get answers from the internet, you don’t ask questions, instead you present
wrong answers. More people will correct you than will help you. So this
tool is my wrong answer to how to review CFP proposals. Correct me!
Another alpha of Coverage.py 5.0 is available: 5.0a4.
This fixes a few problems with the new SQLite-based storage. Please give
it a try, especially to experiment with the new dynamic contexts feature.
The challenge with this release was something that started as a seemingly
simple fix. Coverage.py tries to emulate how Python runs programs,
including how the first element of sys.path is set. A few people run
coverage with sys.path fully configured, and coverage’s setting of
sys.path was breaking their stuff.
The proposal was simple: delete the one line of code that set sys.path.
I tried that, and it seemed to work. Fixed!
Not so fast: the Windows builds failed. This started a multi-week adventure
of debugging and refactoring. The Windows builds were failing not because
of Windows itself, but because on Windows, I don’t use pytest-xdist, which
parallelizes tests into worker processes. With xdist, the tests were all
passing. Without xdist, a few sys.path-related tests were failing.
It turns out that
xdist manipulates sys.path
itself, which was masking the fact that I had removed an important step
from coverage.py. First thing to do was to adjust my test code so that
even with xdist, my tests didn’t get xdist’s path changes.
Then I had to re-think how to adjust sys.path. That required refactoring
how I ran the user’s Python code, so that I could apply the path changes a
little earlier than I used to. That made me look at how I was testing that
layer of code with mocks, and I changed it from explicit dependency
injection to implicit mock patching.
A few more little fixes were needed here and there along the way. All
told, the “one line” fix ended up spanning many files:
587 insertions and 427 deletions.
A cautionary tale about testing that things are unequal...
We had a test that was kind of like this:
data = "I am the data"
modified = change_the_thing(data)
modified = modify_another_way(modified)
assert data != modified
But someone refactored the test oh-so-slightly, like this:
data = "I am the data"
modified = modify_another_way(change_the_thing(data)),
assert data != modified
Now the test isn’t testing what it should be testing, and will pass even if
change_the_thing and modify_another_way both return their argument
unchanged. (I’ll explain why below.)
Negative tests (asserting that two things are unequal) are really tricky,
because there are infinitely many things unequal to your value. Your assertion
could pass because you accidentally produced a different one of those unequal
values than the one you expected.
Better would be to know what unequal value you are producing, and test that
you have produced that value, with an equality assertion. Then if
something unexpectedly shifts out from under you, you will find out.
Why the test was broken: the refactorer left the trailing comma on the
“modified =” line, so “modified” is a 1-element tuple. The comparison is
now between a tuple and a string, which are always unequal, even if the
first element of the tuple is the same as the string.
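The trap in miniature (standalone code, not the actual test):

```python
data = "I am the data"

# The trailing comma makes `modified` a 1-element tuple, not a string.
modified = data,

assert isinstance(modified, tuple)
assert modified != data       # a tuple is never equal to a string...
assert modified[0] == data    # ...even when its only element is
```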
I recently went through a process I’ve done many times before: I tried to
configure the Python warnings
module, and was mystified. I set the PYTHONWARNINGS environment variable,
and it doesn’t do what it seems like it should. I read the docs again, I
think I understand what they are telling me, but the code isn’t doing what
it seems like it should be doing. Frustrating.
I had some time today to dig into it, and now I understand better. The docs
are misleading and/or incomplete. The module is not designed for
maximum utility. Let me explain.
Here’s what the docs tell you: PYTHONWARNINGS (or the -W command-line option)
is a comma-separated list of filters. Each filter is a 5-tuple of fields
separated by colons. The third and fourth fields are of interest here,
they are category and module.
Let’s start with module: it’s the module that caused the warning. The docs
say it is a regular expression. This is false! Internally, this string is
used as part of a regex match operation, but first it is escaped, so if you
include an asterisk in your setting, you will be trying to match module names
that have a literal asterisk in them, which is impossible.
OK, so the module string is a literal string, not a regex, but the escaped
string is being used as part of re.match, so it should be possible to
suppress warnings from an entire package (like backports.*) just by
specifying “backports”, right? Nope! After being escaped,
a $ is added to the end, so your literal string must be an exact match on
the entire module name. Sigh.
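By contrast, the programmatic API does not escape: warnings.filterwarnings() compiles its module argument as a real regex, with no “$” appended. So a prefix match on a package is possible from code, if not from the environment variable. A sketch, using backports purely as an illustrative package name:

```python
import warnings

# Unlike PYTHONWARNINGS/-W (which escapes the module field and appends "$"),
# filterwarnings() treats `module` as a true regex, so a prefix can match
# every module in a package:
warnings.filterwarnings(
    "ignore",
    category=DeprecationWarning,
    module=r"backports\.",   # matches backports.ssl, backports.csv, ...
)
```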
Just to add to the confusion, the docs have long included this example,
which isn’t even a sensible regex, never mind that regexes aren’t usable here:
error:::mymodule[.*] # Convert warnings to errors in "mymodule"
# and any subpackages of "mymodule"
These concerns about the regex behavior are the topic of
bpo 34624, BTW.
On to category: this is actually the class of warning exception being used,
so you can say (for example) DeprecationWarning here. In my case, I wanted
to suppress the deprecation warnings that pytest raises. Pytest helpfully
uses a base class, pytest.PytestDeprecationWarning, so I used that as the
category. But this causes an error message at startup:
Invalid -W option ignored: invalid module name: 'pytest'
Huh? pytest is an importable module! Nope: these category
names are imported early in the startup sequence, before your real sys.path
is built, so you cannot name third-party modules here...!
There are probably other things about warnings that confuse people. These
are the ones I uncovered today after a long rage-fueled debugging session.
An important developer skill is an irrational belief that things can be
understood, and be made to make sense. Another is knowing when to give up
and just accept the confusion. This morning I fully embraced the first.
In my case, I was trying to suppress warnings reported while running tests
with pytest. Pytest has its own setting for warnings filters, and it uses
its own copy of the warnings.py code for reading them, so that the regexes
are not escaped! This is very useful, but could also add to the mystery,
since the pytest docs don’t mention the difference.
And since pytest interprets its settings after sys.path has been configured,
I can use third-party warning categories there. So this works perfectly:
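For concreteness, the kind of pytest setting meant here looks something like this sketch of a pytest.ini (the same lines go in setup.cfg under a [tool:pytest] header):

```ini
[pytest]
filterwarnings =
    ignore::pytest.PytestDeprecationWarning
```

Because pytest interprets this after sys.path is built, the third-party category name imports cleanly.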
It’s very satisfying to have some mysteries solved.
A long-awaited feature of coverage.py is now available in a rough form:
Who Tests What annotates coverage data with the
name of the test function that ran the code.
To try it out:
- Install coverage.py v5.0a3.
- Add this line literally to the [run] section of your .coveragerc file:
dynamic_context = test_function
- Run your tests.
- The .coverage file is now a SQLite database. There is no change to
reporting yet, so you will need to do your own querying of the SQLite
database to get information out. See below for a description of the schema.
The database can be accessed in any SQLite-compatible way you like. Note
that the schema is not (yet) part of the public API. That is, it may not
be guaranteed to stay the same. This is one of the things yet to be
decided. For now though, the database has these tables:
- file: maps full file paths to file_ids: id, path
- context: maps contexts (test function names) to context_ids: id, context
- line: the line execution data: file_id, context_id, lineno
- arc: similar to line, but for branch coverage: file_id, context_id, fromno, tono
It’s not the most convenient, but the information is all there. If you used
branch coverage, then the important data is in the “arc” table, and “line”
is empty. If you didn’t use branch coverage, then “line” has data and
“arc” is empty. For example, using the sqlite3 command-line tool, here’s a
query to see which tests ran a particular line:
sqlite> select distinct context.context from arc, file, context
...> where arc.file_id = file.id
...> and arc.context_id = context.id
...> and file.path like '%/xmlreport.py'
...> and arc.tono = 122;
BTW, there are also “static contexts” if you are interested in keeping
coverage data from different test runs separate: see
Contexts in the docs for details.
Some things to note and think about:
- The test function name recorded includes the test class if we can
figure it out. Sometimes this isn’t possible. Would it be better to
record the filename and line number?
- Is test_function too fine-grained for some people? Maybe chunking to
the test class or even the test file would be enough?
- Better would be to have test runner plugins that could tell us the
test identifier. Anyone want to help with that?
- What other kinds of dynamic contexts might be useful?
- What would be good ways to report on this data? How are you navigating
the data to get useful information from it?
- How is the performance?
- We could have a “coverage extract” command that would be like the
opposite of “coverage combine”: it could pull out a subset of the data
so a readable report could be made from it.
Please try this out, and let me know how it goes. Thanks.
I was the guest on the most recent episode of the Talk Python To Me podcast:
It was a really fun conversation. I liked at one point being able to say,
“I’ll answer the question, and then I will challenge the premise of the
question.” Michael does a great job running the show, including providing a
full transcript of the episode!
We talked mostly about coverage.py, but there was a small digression into
what I do for work, which is the Open edX
team at edX. EdX has thousands of courses
from hundreds of institutions. But the software we write is open source,
and is used by a thousand other sites to do online education. It’s really
gratifying to see them doing the kinds of education that edX itself will
never do.
The example I gave on the show is UKK Guru,
an Indonesian site running hundreds of small vocational courses that will
mean a lot to its learners.
Sorry to repeat myself! Mostly the show is about coverage.py; it’s good. :)