
A tale of two exceptions

Sunday 22 January 2017

It was the best of times, it was the worst of times...

This week saw the release of three different versions of Coverage.py. This is not what I intended. Clearly something was getting tangled up. It had to do with some tricky exception handling. The story is kind of long and intricate, but has a number of chewy nuggets that fascinate me. Your mileage may vary.

Writing it all out, many of these missteps seem obvious and stupid. If you take nothing else from this, know that everyone makes mistakes, and we are all still trying to figure out the best way to solve some problems.

It started because I wanted to get the test suite running well on Jython. Jython is hard to support in Coverage.py: it can do "coverage run", but because it doesn't have the same internals as CPython, it can't do "coverage report" or any of the other reporting code. Internally, there's one place in the common reporting code where we detect this, and raise an exception. Before all the changes I'm about to describe, that code looked like this:

for attr in ['co_lnotab', 'co_firstlineno']:
    if not hasattr(self.code, attr):
        raise CoverageException(
            "This implementation of Python doesn't support code analysis.\n"
            "Run coverage.py under CPython for this command."
        )

The CoverageException class is derived from Exception. Inside of Coverage.py, all exceptions raised are derived from CoverageException. This is a good practice for any library. For the coverage command-line tool, it means we can catch CoverageException at the top of main() so that we can print the message without an ugly traceback from the internals of Coverage.py.
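This pattern can be sketched in a few lines (a stand-in `CoverageException` and a hypothetical `run_command`, not coverage.py's actual code): one catch of the library's base exception at the top of main() turns any internal error into a clean message instead of a traceback.

```python
import sys

class CoverageException(Exception):
    """Base class for all of a library's own exceptions."""

def run_command():
    # Hypothetical operation that fails somewhere deep inside the library.
    raise CoverageException("Nothing to report.")

def main():
    try:
        run_command()
    except CoverageException as err:
        # One catch at the top: the user sees the message, not a traceback.
        print(err, file=sys.stderr)
        return 1
    return 0

exit_code = main()
```

Any exception that derives from the base class gets the same friendly treatment, no matter where in the library it was raised.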

The problem with running the test suite under Jython is that this "can't support code analysis" exception was being raised from hundreds of tests. I wanted to get to zero failures or errors, either by making the tests pass (where the operations were supported on Jython) or skipping the tests (where the operations were unsupported).

There are lots of tests in the Coverage.py test suite that are skipped for all sorts of reasons. But I didn't want to add decorators or conditionals to hundreds of tests for the Jython case. First, it would be a lot of noise in the tests. Second, it's not always immediately clear from a test that it is going to touch the analysis code. Lastly and most importantly, if someday in the future I figured out how to do analysis on Jython, or if it grew the features to make the current code work, I didn't want to have to then remove all that test-skipping noise.

So I wanted to somehow automatically skip tests when this particular exception was raised. The unittest module already has a way to do this: tests are skipped by raising a unittest.SkipTest exception. If the exception raised for "can't support code analysis" derived from SkipTest, then the tests would be skipped automatically. Genius idea!
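The mechanism is easy to demonstrate: any exception deriving from unittest.SkipTest that escapes a test method marks the test as skipped rather than failed. This sketch uses an invented `AnalysisError` class to stand in for the real exception:

```python
import unittest

class AnalysisError(unittest.SkipTest):
    """A product exception deriving from SkipTest (the trick in question)."""

class ExampleTest(unittest.TestCase):
    def test_analysis(self):
        # Any test that springs this trap is skipped, not failed.
        raise AnalysisError("can't analyze on this implementation")

result = unittest.TestResult()
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
suite.run(result)
```

Running this suite records one skip and no failures or errors, with no decorator anywhere in sight.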

So in 4.3.2, the code changed to this (spread across a few files):

from coverage.backunittest import unittest

class StopEverything(unittest.SkipTest):
    """An exception that means everything should stop.

    This derives from SkipTest so that tests that spring this trap will be
    skipped automatically, without a lot of boilerplate all over the place.

    """


class IncapablePython(CoverageException, StopEverything):
    """An operation is attempted that this version of Python cannot do."""


# Alternative Python implementations don't always provide all the
# attributes on code objects that we need to do the analysis.
for attr in ['co_lnotab', 'co_firstlineno']:
    if not hasattr(self.code, attr):
        raise IncapablePython(
            "This implementation of Python doesn't support code analysis.\n"
            "Run coverage.py under another Python for this command."
        )

It felt a little off to derive a product exception (StopEverything) from a testing exception (SkipTest), but that seemed acceptable. One place in the code, I had to deal specifically with StopEverything. In an inner loop of reporting, we catch exceptions that might happen on individual files being reported. But if this exception happens once, it will happen for all the files, so we wanted to end the report, not show this failure for every file. In pseudo-code, the loop looked like this:

for f in files_to_report:
    try:
        report_one_file(f)
    except StopEverything:
        # Don't report this on single files, it's a systemic problem.
        raise
    except Exception as ex:
        record_exception_for_file(f, ex)

This all seemed to work well: the tests skipped properly, without a ton of noise all over the place. There were no test failures in any supported environment. Ship it!

Uh-oh: very quickly, reports came in that coverage didn't work on Python 2.6 any more. In retrospect, it was obvious: the whole point of the "from coverage.backunittest" line in the code above was that Python 2.6 doesn't have unittest.SkipTest. For the Coverage.py tests on 2.6, I install unittest2 to get a backport of the things 2.6 is missing, and that gave me SkipTest. But without my test requirements installed, it doesn't exist.

So my tests passed on 2.6 because I installed a package that provided what was missing, but in the real world, unittest.SkipTest is truly missing.

This is a conundrum that I don't have a good answer to:

How can you test your code to be sure it works properly when the testing requirements aren't installed?

To fix the problem, I changed the definition of StopEverything. Coverage.py 4.3.3 went out the door with this:

class StopEverything(unittest.SkipTest if env.TESTING else object):
    """An exception that means everything should stop."""

The env.TESTING setting was a pre-existing variable: it's true if we are running the coverage.py test suite. This also made me uncomfortable: as soon as you start conditionalizing on whether you are running tests or not, you have a very slippery slope. In this case it seemed OK, but it wasn't: it hid the fact that deriving an exception from object is a dumb thing to do.

So 4.3.3 failed also, and not just on Python 2.6. As soon as an exception was raised inside that reporting loop that I showed above, Python noticed that I was trying to catch a class that doesn't derive from Exception. Of course, my test suite didn't catch this, because when I was running my tests, my exception derived from SkipTest.
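The failure is easy to reproduce in miniature (this is my assumed shape of the bug, not the actual coverage.py code): Python refuses, at the `except` clause itself, to catch a class that doesn't derive from BaseException.

```python
class StopEverything(object):
    """What the 4.3.3 class effectively was when env.TESTING was false."""

def report():
    try:
        raise ValueError("failure while reporting one file")
    except StopEverything:      # TypeError: not a BaseException subclass
        pass

try:
    report()
    message = "no error"
except TypeError as err:
    message = str(err)
```

Note that the TypeError appears as soon as any exception reaches that `except` clause, even an unrelated one, which is why the bug surfaced on every Python version, not just 2.6.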

Changing "object" to "Exception" would fix the problem, but I didn't like the test of env.TESTING anyway. So for 4.3.4, the code is:

class StopEverything(getattr(unittest, 'SkipTest', Exception)):
    """An exception that means everything should stop."""

This is better, first because it uses Exception rather than object. But also, it's duck-typing the base class rather than depending on env.TESTING.
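The fallback behavior can be shown directly by probing two modules (the bare module here is an invented stand-in for Python 2.6's unittest): with a full unittest the chosen base is SkipTest, and with a module lacking SkipTest it quietly falls back to Exception.

```python
import types
import unittest

# A hypothetical 2.6-like unittest module, with no SkipTest attribute:
bare_unittest = types.ModuleType("unittest")

full_base = getattr(unittest, 'SkipTest', Exception)
bare_base = getattr(bare_unittest, 'SkipTest', Exception)
```

Either way, the result is a legitimate exception class, so the broken `except` clause from 4.3.3 can't recur.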

But as I kept working on getting rid of test failures on Jython, I got to this test failure (pseudo-code):

def test_sort_report_by_invalid_option(self):
    msg = "Invalid sorting option: 'Xyzzy'"
    with self.assertRaisesRegex(CoverageException, msg):
        produce_a_report(sort="Xyzzy")

This is a reporting operation, so Jython will fail with a StopEverything exception saying, "This implementation of Python doesn't support code analysis." StopEverything is a CoverageException, so the assertRaisesRegex will catch it, but it will fail because the messages don't match.

StopEverything is both a CoverageException and a SkipTest, but the SkipTest is the more important aspect. To fix the problem, I did this, but felt silly:

def test_sort_report_by_invalid_option(self):
    msg = "Invalid sorting option: 'Xyzzy'"
    with self.assertRaisesRegex(CoverageException, msg):
        try:
            produce_a_report(sort="Xyzzy")
        except SkipTest:
            # Re-raise as a plain SkipTest, which is not a
            # CoverageException, so assertRaisesRegex lets it
            # propagate, and the test is skipped.
            raise SkipTest()

I knew this couldn't be the right solution. Talking it over with some co-workers (OK, I was griping and whining), we came up with the better solution. I realized that CoverageException is used in the code base to mean, "an ordinary problem from inside Coverage.py." StopEverything is not an ordinary problem. It reminded me of typical mature exception hierarchies, where the main base class, like Exception, isn't actually the root of the hierarchy. There are always a few special-case classes that derive from a real root higher up.

For example, in Python, the classes Exception, SystemExit, and KeyboardInterrupt all derive from BaseException. This is so "except Exception" won't interfere with SystemExit and KeyboardInterrupt, two exceptions meant to forcefully end the program.
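A quick check makes the rule concrete: "except Exception" catches ordinary errors but lets SystemExit and KeyboardInterrupt pass through to an outer handler.

```python
caught_by_except_exception = []

for exc_class in (ValueError, KeyboardInterrupt, SystemExit):
    try:
        try:
            raise exc_class()
        except Exception:
            caught_by_except_exception.append(exc_class.__name__)
    except BaseException:
        pass    # only reached by the two that escaped "except Exception"
```

Only ValueError ends up in the list; the two program-ending exceptions sail past the inner handler, exactly as designed.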

I needed the same thing here, for the same reason. I want to have a way to catch "all" exceptions without interfering with the exceptions that mean "end now!" I adjusted my exception hierarchy, and now the code looks like this:

class BaseCoverageException(Exception):
    """The base of all Coverage exceptions."""

class CoverageException(BaseCoverageException):
    """A run-of-the-mill exception specific to coverage.py."""

class StopEverything(
        BaseCoverageException,
        getattr(unittest, 'SkipTest', Exception)
    ):
    """An exception that means everything should stop."""

Now I could remove the weird SkipTest dance in that test. The catch clause in my main() function changes from CoverageException to BaseCoverageException, and things work great. The end...?
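The relationships the new hierarchy buys can be checked with a couple of issubclass tests (a sketch with the class bodies elided):

```python
import unittest

class BaseCoverageException(Exception):
    """The base of all Coverage exceptions."""

class CoverageException(BaseCoverageException):
    """A run-of-the-mill exception specific to coverage.py."""

class StopEverything(
        BaseCoverageException,
        getattr(unittest, 'SkipTest', Exception)
    ):
    """An exception that means everything should stop."""

# "except CoverageException" no longer intercepts the end-it-all exception:
ordinary = issubclass(StopEverything, CoverageException)        # False
# ...but one "except BaseCoverageException" in main() still handles both:
catchable = issubclass(StopEverything, BaseCoverageException)   # True
```

So assertRaisesRegex(CoverageException, ...) won't swallow StopEverything any more, and the test runner sees the SkipTest aspect as intended.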

One of the reasons I write this stuff down is because I'm hoping to get feedback that will improve my solution, or advance my understanding. As I lay out this story, I can imagine points of divergence: places in this narrative where a reader might object and say, "you should blah blah blah." For example:

  • "You shouldn't bother supporting 2.6." Perhaps not, but that doesn't change the issues explored here, just makes them less likely.
  • "You shouldn't bother supporting Jython." Ditto.
  • "You should just have dependencies for the things you need, like unittest2." Coverage.py has a long-standing tradition of having no dependencies. This is driven by a desire to be available to people porting to new platforms, without having to wait for the dependencies to be ported.
  • "You should have more realistic integration testing." I agree. I'm looking for ideas about how to test the scenario of having no test dependencies installed.

That's my whole tale. Ideas are welcome.

Evil ninja module initialization

Tuesday 10 January 2017

A question about import styles on the Python-Dev mailing list asked about imports like this:

import os as _os

Understanding why people do this is an interesting lesson in how modules work. A module is nothing more than a collection of names. When you define a name in a .py file, it becomes an attribute of the module, and is then importable from the module.

An underlying simplicity in Python is that many statements are really just assignment statements in disguise. All of these define the name X:

X = 17
def X(): print("look!")
import X

When you create a module, you can make the name "X" importable from that module by assigning to it, or defining it as a function. You can also make it importable by importing it yourself.

Suppose your module looks like this:

# yourmodule.py
import os

def doit():
    ...

This module has two names defined in it: "doit", and "os". Someone else can now do this:

# someone.py
from yourmodule import os

# or worse, this imports os and doit:
from yourmodule import *

This bothers some people. "os" is not part of the actual interface of yourmodule. That first import I showed prevents this leaking of your imports into your interface. Importing star doesn't pull in names starting with underscores. (Another solution is to define __all__ in your module.)
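The effect of __all__ is easy to see by building a throwaway module on the fly (the module name and its contents here are invented for the demo): import-star copies only the names listed in __all__.

```python
import sys
import types

source = (
    "import os\n"
    "__all__ = ['doit']\n"
    "def doit():\n"
    "    return 'done'\n"
)
yourmodule = types.ModuleType("yourmodule")
exec(source, yourmodule.__dict__)
sys.modules["yourmodule"] = yourmodule

# What "from yourmodule import *" actually copies into a namespace:
namespace = {}
exec("from yourmodule import *", namespace)
star_imported = sorted(n for n in namespace if not n.startswith("__"))
```

Only "doit" comes across; "os" stays behind as an implementation detail, even though it is a perfectly ordinary attribute of the module.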

Most people, though, don't worry about this kind of name leaking. Import-star is discouraged anyway, and people know not to import os from other modules. The solution of renaming os to _os just makes your code ugly for little benefit.

The part of the discussion thread that really caught my eye was Daniel Holth's winking suggestion of the "evil ninja mode pattern" of module initialization:

def ninja():
    global exported
    import os
    def exported():
        # "os" is visible here as a closure over ninja's local.
        return os.name

ninja()
del ninja

What's going on here!? Remember that def is an assignment statement like any other. When used inside a function, it defines a local name, as assignment always does. But an assignment in a function can define a global name if the name is declared as global. It's a little unusual to see a global statement without an explicit assignment at the top-level, but it works just fine. The def statement defines a global "exported" function, because the global statement told it to. "os" is now a local in our function, because again, the import statement is just another form of assignment.

So we define ninja(), and then execute it immediately. This defines the global "exported", and doesn't define a global "os". The only problem is the name "ninja" has been defined, which we can clean up with a del statement.
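Running the pattern inside a scratch module (built on the fly, so nothing here pollutes our own namespace) shows exactly which names survive: "exported" is there, while "os" and "ninja" are not.

```python
import types

source = (
    "def ninja():\n"
    "    global exported\n"
    "    import os\n"
    "    def exported():\n"
    "        return os.name\n"
    "\n"
    "ninja()\n"
    "del ninja\n"
)
scratch = types.ModuleType("scratch")
exec(source, scratch.__dict__)

names = set(vars(scratch))
```

The closure keeps "os" alive for exported() to use, even though the name is gone from the module, so the function still works after the cleanup.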

Please don't ever write code this way. It's a kind of over-defensiveness that isn't needed in typical Python code. But understanding what it does, and why it does it, is a good way to flex your understanding of Python workings.

For more about how names (and values) work in Python, people seem to like my PyCon talk, Python Names and Values.

No PyCon for me this year

Thursday 5 January 2017

2017 will be different for me in one specific way: I won't be attending PyCon. I've been to ten in a row:

Ten consecutive PyCon badges

This year, Open edX con is in Madrid two days after PyCon, actually overlapping with the sprints. I'm not a good enough traveler to do both. Crossing nine timezones is not something to be taken lightly.

I'll miss the usual love-fest at PyCon, but after ten in a row, it should be OK to miss one. I can say that now, but probably in May I will feel like I am missing the party. Maybe I really will watch talks on video for a change.

I usually would be working on a presentation to give. I like making presentations, but it is a lot of work. This spring I'll have that time back.

In any case, this will be a new way to experience the Python community. See you all in 2018 in Cleveland!
