Coverage branches instead of arcs

Monday 26 August 2024

As I mentioned in a few recent posts, I’ve been working on some significant work in coverage.py to take advantage of new capabilities in Python.

Mark Shannon has been improving the sys.monitoring API so that branch coverage can be done with low overhead. I want to take advantage of that in coverage.py, but I needed to do some refactoring work first. The tests were focused on mapping the complete set of code pathways (which I called arcs), but using low-overhead branch monitoring won’t provide those complete pathways. If the tests continued to focus on them, they would fail with sys.monitoring.

But the complete pathways aren’t actually needed. The useful information is where the branches are, and which branches were taken. That can be measured with sys.monitoring. So a first step was to refactor the tests to focus on branches instead of arcs. That took a while, but is now done.

Not needing all those arcs also meant I could simplify the AST-based parser that found the arcs, removing about 150 lines. I suspect there’s more that could be removed. Maybe it will happen over time. Also, the new code.co_branches() method might make it all obsolete over time.

If you read Coverage at a crossroads on this blog, I talked about using ideas from SlipCover like inserting fake lines with an import hook. Those exotic ideas were appealing in their way, but are no longer needed, and they would have brought a bunch of complexity. With the two new sys.monitoring events, we can get the branch information directly without advanced shenanigans.

There’s more work to do, including attending to incoming bug reports. If you’d like to help, or learn more about any of this, we have a #coverage-py channel in the Python Discord.

Comments

[gravatar]

I don’t understand what you mean by “arc” vs “branch”. Is an arc a branch if there’s more than one for the same destination?

[gravatar]

I should have explained more: an arc is a pair of line numbers, two lines that execute in sequence, like a prev/next pair. The problem with the old way was that it collected every arc, including the obvious ones (line 1 is followed by line 2, line 2 is followed by line 3, etc).

What I mean by branches is the subset of all arcs, narrowed down to the user-visible choice points in the code (if statements, etc).

Add a comment:

Ignore this:
Leave this empty:
Name is required. Either email or web are required. Email won't be displayed and I won't spam you. Your web site won't be indexed by search engines.
Don't put anything here:
Leave this empty:
Comment text is Markdown.