Stuck

Sunday 5 December 2004

I’ve been hacking on code this weekend, and I’ve managed to get myself stuck. I started using Gareth Rees’ coverage.py to measure the code coverage of some unit tests. It works well, but counts docstrings as missed lines of code. I have a lot of docstrings, so the results were too noisy to be useful. I wanted to fix it so that it understood docstrings for what they were. I also wanted to add some other features, for example, the ability to mark lines of code as not expected to be run, so that they would not be counted as missed lines.

So I dug into the code. It works by using the debugging trace hook to record which lines of code are executed, and parsing the source of the modules to understand which lines are executable. I decided to switch it from the parser module to the compiler module for the parsing half of the job. The parser module returns a low-level, grammar-centric representation of the source text, making it difficult to distinguish a docstring from any other expression statement. The compiler module is higher-level, returning a tree of nodes that corresponds more to the semantics of the program. It seemed like a no-brainer, and it all went very well.
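
The recording half is conceptually simple: install a trace function and note every line the interpreter reports. A minimal sketch of the idea (not coverage.py's actual code):

    import sys

    executed = {}   # maps (filename, line number) -> True for every line run

    def tracer(frame, event, arg):
        # The interpreter calls this hook as code runs; each 'line'
        # event means the line at frame.f_lineno is about to execute.
        if event == 'line':
            executed[(frame.f_code.co_filename, frame.f_lineno)] = True
        return tracer

    sys.settrace(tracer)
    # ... import and exercise the code being measured ...
    sys.settrace(None)

The interesting work is in the other half: deciding which lines are executable in the first place.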

Then I wanted to add another feature, so that an entire suite of statements (the Python equivalent of a block in other languages) could be marked as “not expected to be run”. For example, if a module has a chunk of code at the end to allow it to be run from the command line, it would be nice if the entire suite could be marked without having to put a marker comment on every line. So a single marker could do it like this:

if __name__ == '__main__':   #-notrun-
    # blah
    # blah

And if we’re going to do that, then it should work uniformly for all suites:

if alwaystrue:
    # this code
    #   will be run
else:           #-notrun-
    # this code
    #   won't be run
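
Finding the markers themselves is the easy, line-oriented half of the job; a sketch (the #-notrun- spelling is just my working syntax from the examples above):

    import re

    marker = re.compile(r'#\s*-notrun-')

    def marked_lines(source):
        # Return the 1-based numbers of lines carrying the exclusion marker.
        hits = []
        for i, line in enumerate(source.splitlines()):
            if marker.search(line):
                hits.append(i + 1)
        return hits

The hard part is mapping each marked line onto the suite it should exclude.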

And there’s the problem: the compiler module completely discards any trace of the else. It has both suites of code, but the actual line with the “else:” on it isn’t represented in the parse tree at all. So it’s impossible to match up line-oriented regular expression results (finding the markers) with the parse tree results (what code does it apply to?).

So I’m pondering my options. Go back to the old parser-module technique, and hack up something to exclude docstrings? Disallow excluding “else” suites? Do something else completely? It had all been going so well. Sigh.

Comments

Can you count back from the first line of the else block? I assume that would be available through the parse tree.

Do you know about trace.py in the standard library? Does the coverage.py you linked to have any advantages over it? The date on the web page (2001) seems to indicate that coverage.py was created when Python did not have a code coverage analysis tool in the standard library.

I've used trace.py from the Python standard library for checking unit test coverage. It knows about docstrings. It marks lines containing 'finally:' as not executed, even when the finally block itself was executed, but that bug was easy to fix. I still haven't gotten around to submitting this bug to Python's bug tracker, bad me.

The SchoolTool test runner has an option (--coverage) to produce code coverage reports while running unit tests. You can find it at http://source.schooltool.org/; note, however, that it is GPLed. IIRC the Zope 3 test runner also supports coverage analysis with trace.py.

Oh, and before you ask -- I do not think trace.py allows you to mark entire code blocks as expected not to be executed. It lets you mark individual lines with #pragma: NO COVER, though.

trace.py: Interesting, I hadn't found it. It doesn't seem to be mentioned in the Python documentation, and it doesn't appear in a Google search for "python code coverage"!

I'm sick of the whole thing today, but I'll probably investigate the options later this week. There's also sancho, a unit test framework that includes coverage analysis.

The problem, as far as I know, is that trace.py doesn't use hotshot for timings. That means it is dog slow, probably 10x slower than running without tracing.

I would love to see a decent test runner with code coverage reports (hopefully using hotshot to avoid the time penalties).

If you do make progress on this, let me know and I'll pass it on to a few folks on my team for testing or co-development opportunities.

--
John C

I'm not familiar with coverage.py, so this might not be a useful comment at all... but does python -OO help in any way?

$ python -h
...
-OO : remove doc-strings in addition to the -O optimizations
...

John: I don't know what the timing implications of using hotshot are. The code coverage tools typically don't use the profiling modules, but install a debug trace hook function, which is called for each line executed. Maybe there are internals of hotshot that make calling the hook faster?

Levi: interesting idea. As it happens, I've managed to properly ignore the docstrings, and am now trying to chase down excluding entire suites of code from the coverage.

Hotshot provides a coverage capability. Basically, instead of logging information for each line (timing, number of executions, etc.), it just marks the line as visited or not.
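
For what it's worth, that looks reachable from Python; here's a sketch pieced together from hotshot's public API (Profile's lineevents flag, and hotshot.log.LogReader to replay the log) -- I haven't verified the details:

    import hotshot, hotshot.log

    def code_under_test():
        return sum(range(10))

    prof = hotshot.Profile("cover.log", lineevents=1, linetimings=0)
    prof.runcall(code_under_test)
    prof.close()

    visited = {}
    for what, (filename, lineno, funcname), tdelta in hotshot.log.LogReader("cover.log"):
        if what == hotshot.log.LINE:    # a line-visited event, no timing data
            visited[(filename, lineno)] = True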

There is a slide set on hotshot from one of the Zope guys; I can't seem to find the link at the moment, but it does mention the coverage capabilities.

--
John C

"HotShot: The New Python Profiler" -- a 2002 talk by Fred Drake:
http://starship.python.net/crew/fdrake/talks/IPC10-HotShot-2002-Feb-06.ppt

(I am currently trying to use hotshot for coverage analysis, but hotshot's coverage support is almost entirely undocumented and low-level.)
