
Coverage.py 3.6b1

Thursday 29 November 2012

The latest beta version of coverage.py is available: I give you coverage.py v3.6b1. There are lots of changes here. Somehow I got on a roll, and fixed 22 tickets. The full list of changes is below.

But before we get to that, there are two bugs I'd like to fix but need some help with:

  • Coverage xml doesn't produce <sources> element: The XML report is meant to be consumed by Cobertura, particularly as a plugin in Jenkins. This bug says that the links to the source files don't work, but in my tests they do, so I don't understand the conditions that make it fail. If you have a publicly available repo that demonstrates this problem, let me know so I can use it as a test case.
  • On Linux, packages get installed in places coverage.py doesn't ignore: If you have any good ideas for determining which directories contain "third-party" packages, I'd like to make coverage.py smarter about it. It doesn't have to be perfect, because users can always override the defaults, but I'd like it to start with a better guess.

Other than the 22 bugs fixed, big changes include:

  • For continuous integration users, coverage.py can now easily indicate whether the total coverage percentage exceeds a given threshold, with the --fail-under switch. Similar information is available through the API as well.
  • HTML reports can now be titled, which helps in multi-project environments.
  • Configuration files now support environment variable substitution.
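For CI scripts that drive coverage.py through the API rather than the command line, the threshold check can be sketched roughly like this. The function name `fail_under_status` and the non-zero status value are hypothetical, chosen only for illustration; the sketch relies only on the changelog's statement that the reporting functions return the total percentage as a float.

```python
def fail_under_status(total_percent, fail_under):
    """Return 0 if coverage meets the threshold, a non-zero status otherwise.

    total_percent: the float returned by a reporting function,
                   e.g. total = cov.report() on a coverage object.
    fail_under:    the minimum acceptable percentage.
    The value 2 is an arbitrary non-zero status picked for this sketch.
    """
    # Only a total strictly below the threshold counts as a failure.
    if total_percent < fail_under:
        return 2
    return 0


if __name__ == "__main__":
    import sys
    # Hypothetical numbers standing in for a real report() result.
    sys.exit(fail_under_status(total_percent=87.5, fail_under=80))
```

A build script could pass the returned status straight to `sys.exit()`, so the CI job fails whenever coverage drops below the chosen value.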

Try it out (or "give it a burl" as they say down under), and let me know if anything is amiss.

Full changes:

  • Wildcards in include= and omit= arguments were not handled properly in reporting functions, though they were when running. Now they are handled uniformly, closing issue 143 and issue 163. NOTE: it is possible that your configurations may now be incorrect. If you use include or omit during reporting, whether on the command line, through the API, or in a configuration file, please check carefully that you were not relying on the old broken behavior.
  • The report, html, and xml commands now accept a --fail-under switch that indicates in the exit status whether the coverage percentage was less than a particular value. Closes issue 139.
  • The reporting functions coverage.report(), coverage.html_report(), and coverage.xml_report() now all return a float, the total percentage covered measurement.
  • The HTML report’s title can now be set in the configuration file, with the --title switch on the command line, or via the API.
  • Configuration files now support substitution of environment variables, using syntax like ${WORD}. Closes issue 97.
  • Embarrassingly, the [xml] output= setting in the .coveragerc file simply didn’t work. Now it does.
  • The XML report now consistently uses filenames for the filename attribute, rather than sometimes using module names. Fixes issue 67. Thanks, Marcus Cobden.
  • Coverage percentage metrics are now computed slightly differently under branch coverage. This means that completely unexecuted files will now correctly have 0% coverage, fixing issue 156. This also means that your total coverage numbers will generally now be lower if you are measuring branch coverage.
  • Installation now creates two new aliases in addition to the “coverage” command. A “coverage2” or “coverage3” command will be created, depending on whether you are installing in Python 2.x or 3.x. A “coverage-X.Y” command will also be created corresponding to your specific version of Python. Closes issue 111.
  • The coverage.py installer no longer tries to bootstrap setuptools or Distribute. You must have one of them installed first, as issue 202 recommended.
  • The coverage.py kit now includes docs (closing issue 137) and tests.
  • On Windows, files are now reported in their correct case, fixing issue 89 and issue 203.
  • If a file is missing during reporting, the path shown in the error message is now correct, rather than an incorrect path in the current directory. Fixes issue 60.
  • Running an HTML report in Python 3 in the same directory as an old Python 2 HTML report would fail with a UnicodeDecodeError. This issue (issue 193) is now fixed.
  • Fixed yet another error trying to parse non-Python files as Python, this time an IndentationError, closing issue 82 for the fourth time...
  • If coverage xml fails because there is no data to report, it used to create a zero-length XML file. Now it doesn’t, fixing issue 210.
  • Jython files now work with the --source option, fixing issue 100.
  • Running coverage under a debugger is unlikely to work, but it shouldn’t fail with “TypeError: ‘NoneType’ object is not iterable”. Fixes issue 201.
  • On some Linux distributions, when installed with the OS package manager, coverage.py would report its own code as part of the results. Now it won’t, fixing issue 214, though this will take some time to be repackaged by the operating systems.
  • Docstrings for the legacy singleton methods are more helpful. Thanks Marius Gedminas. Closes issue 205.
  • The pydoc tool can now show documentation for the class coverage.coverage. Closes issue 206.
  • Added a page to the docs about contributing to coverage.py, closing issue 171.
  • When coverage.py ended unsuccessfully, it may have reported odd errors like 'NoneType' object has no attribute 'isabs'. It no longer does, so kiss issue 153 goodbye.
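To make the ${WORD} substitution in configuration files concrete, here is a minimal sketch of the idea. It is not coverage.py's actual implementation: the function name `substitute_env` is hypothetical, and replacing an undefined variable with an empty string is an assumption, since the changelog doesn't say how that case is handled.

```python
import os
import re


def substitute_env(text, environ=None):
    """Replace each ${WORD} in text with the value of environment variable WORD."""
    if environ is None:
        environ = os.environ

    def replace(match):
        # Assumption: an undefined variable becomes an empty string.
        return environ.get(match.group(1), "")

    return re.sub(r"\$\{(\w+)\}", replace, text)


if __name__ == "__main__":
    # A hypothetical configuration line and environment.
    line = "data_file = ${BUILD_DIR}/.coverage"
    print(substitute_env(line, {"BUILD_DIR": "/tmp/build"}))
```

Passing the environment as a parameter keeps the sketch easy to try without touching the real process environment.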

Tricky locals()

Tuesday 13 November 2012

One of the perks of maintaining coverage.py is that you get some really interesting bug reports. Digging into them can be a good way to learn about some obscure corners of Python.

Today's bug was that a piece of product code succeeded when run without coverage.py, and succeeded when run under the C tracer, but failed when run under the Python tracer. I should explain: the heart of coverage.py is the trace function invoked by CPython on every line of execution. Coverage.py has two implementations of its trace function: one in C for speed, and another simpler one in Python for maximum flexibility. The bug report was that one of the implementations caused the product code to fail, and the other did not.

The product code in question looked like this:

def wacky(x, y):
    args = locals()
    args_keys = args.keys()
    #.. do something with args_keys ..

The intent of this code was that args_keys would be the list ['x', 'y']. The code failed because the list was actually ['x', 'y', 'args']. At the moment locals() is called, there are only two local names, x and y, and running under Python gives us the answer we expected. How could running the code under coverage.py cause this change in behavior?

Playing around with it some more, it became clear that coverage.py in particular wasn't to blame. The presence of a trace function, any trace function, would cause the change:

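import sys

def trace(frame, event, arg):
    # A do-nothing trace function: returning itself keeps tracing enabled.
    return trace

def wacky():
    x = y = 1
    args = locals()
    print(list(args.keys()))

wacky()                 # no trace function registered yet
sys.settrace(trace)
wacky()                 # trace function in effect
sys.settrace(None)
wacky()                 # trace function un-registered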


Running this code on any version of Python produces:

['y', 'x']
['y', 'x', 'args']
['y', 'x']

Without the trace function, either before one was registered, or after it was un-registered, the list is ['y', 'x']. But with a trace function, it's ['y', 'x', 'args'].

Thinking it might be a bug in CPython, I searched the bug database, and found ticket 7083, which explained what's going on.

The locals() function is trickier than it appears at first glance. The returned value is a dictionary which is a copy of the local symbol table. This is why changing the dict might not actually change the local variables.

The copy is made when locals() is called, so in our code, the dict has keys 'x' and 'y'. In fact, the same dict is returned every time you call locals(), updated each time to the new contents of the local symbols.

Here's the important (subtle) fact about how CPython works:

When a trace function is in effect, the local symbol table is copied into the locals() dictionary after every statement.

This means that when "args = locals()" is executed, args is simply a reference to the locals() dictionary. Without a trace function, that dictionary is updated only when locals() is called. So the assignment to args isn't reflected in the dictionary.

But with a trace function, after executing "args = locals()", the locals() dict is updated again, copying the name "args" into it. As with all mutable values in Python, when the value is changed in-place, all references see the changed value, so now "args" refers to a dict with the keys "x", "y", and "args".

The reason the locals are copied after every statement is simple: the trace function is executed after every statement, and to make building debuggers and other tools possible, the locals dict is updated so that the trace function has an accurate view of the current state. But that updating is expensive, and without a trace function, unnecessary. So it's only done when a trace function is registered.

Most Python programs have no trace function registered, but coverage of course uses one to collect data. So the program behaves differently under coverage than without it.

The fix is simple: make a copy of the locals() dict instead of using it directly:

def wacky(x, y):
    args = dict(locals())
    args_keys = args.keys()
    #.. do something with args_keys ..

By copying the dictionary with dict(), we get an independent copy that won't see the changes when the locals dict is updated for the trace function.
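One way to convince yourself that the copy is immune is to run the fixed function with and without a trace function and compare the keys. This is only a sketch: `noop_trace`, `fixed_wacky`, and `keys_with_and_without_trace` are made-up names for the experiment, and the keys are sorted so the comparison doesn't depend on dict ordering.

```python
import sys


def noop_trace(frame, event, arg):
    # A do-nothing trace function; returning itself keeps line tracing on.
    return noop_trace


def fixed_wacky(x, y):
    # The snapshot is taken before the name 'args' is bound, and later
    # updates to the locals dict can't reach this independent copy.
    args = dict(locals())
    return sorted(args.keys())


def keys_with_and_without_trace():
    without_trace = fixed_wacky(1, 2)
    previous = sys.gettrace()
    sys.settrace(noop_trace)
    try:
        with_trace = fixed_wacky(1, 2)
    finally:
        # Restore whatever trace function was there before (often None).
        sys.settrace(previous)
    return without_trace, with_trace


if __name__ == "__main__":
    print(keys_with_and_without_trace())
```

Both halves of the result should be the same two names, whether or not a trace function was registered.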

BTW: a remaining mystery is why the original bug report said that one trace function worked, but the other didn't. I'm still trying to track that down, but I think perhaps coverage.py wasn't really in effect for the case that worked.

One last question: is there a way to explain this in the docs that makes the point without going into too much detail?

I fixed Python!

Sunday 4 November 2012

About a month ago, I found a bad-behavior bug in the tokenize standard library module, and with help from Aron Griffis, submitted a patch to fix it. Yesterday was a Python bug day, and Ezio Melotti merged my change, so I have officially contributed to CPython!

The bug in tokenize was an obscure case: if the code ends with a line that starts with non-space, then ends with many spaces, and no newline, then the tokenizer falls into N² run-time behavior, where N is the number of spaces. The problem is that each space is tokenized as an error token (because it precedes no good token), so N tokens are produced, but each token takes linear time for the regex to see that there's no good token following it, leading to N² behavior.

I discovered this working on code that grades student submissions at edX. For some reason there was a submission ending with 40,000 spaces and no newline, and it was taking 20 minutes to tokenize!

Simple demonstration:

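import tokenize
import time
from cStringIO import StringIO

def time_to_tokenize_trailing(spaces):
    # One non-space character, then trailing spaces and no newline.
    source = StringIO("@" + " "*spaces)
    start = time.time()
    # Consume the generator so the tokenizing happens inside the timing.
    list(tokenize.generate_tokens(source.readline))
    end = time.time()
    return end - start

for spaces in xrange(1000, 15000+1, 1000):
    print "%5d: %.2fs" % (spaces, time_to_tokenize_trailing(spaces))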


 1000: 0.71s
 2000: 2.83s
 3000: 6.47s
 4000: 11.52s
 5000: 17.68s
 6000: 26.16s
 7000: 35.35s
 8000: 46.65s
 9000: 58.35s
10000: 72.80s
11000: 89.53s
12000: 107.27s
13000: 126.44s
14000: 147.60s
15000: 166.81s

If you are running a server that tokenizes untrusted Python code, you might want to throw an .rstrip() into it to prevent this case...
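Here is roughly what that guard could look like. It's a sketch, not a hardened sandbox: `tokenize_untrusted` is a made-up name, the code is written for Python 3 (unlike the timing script above), and adding a single newline back after stripping is my own assumption about what a caller would want.

```python
import io
import tokenize


def tokenize_untrusted(source_text):
    """Tokenize source text after removing trailing whitespace."""
    # rstrip() removes the long run of trailing spaces before the
    # tokenizer ever sees it; one newline is added back so the last
    # line is properly terminated (an assumption of this sketch).
    cleaned = source_text.rstrip() + "\n"
    readline = io.StringIO(cleaned).readline
    return list(tokenize.generate_tokens(readline))


if __name__ == "__main__":
    # The same shape as the bad submission: code, then many spaces, no newline.
    tokens = tokenize_untrusted("x = 1" + " " * 40000)
    print([tok[1] for tok in tokens[:3]])
```

Stripping first means the trailing spaces never turn into error tokens at all, so the quadratic case can't be triggered by that input.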
