Impoverished exceptions

Wednesday 6 September 2006

Yesterday's post about testing for exception details turned into a surprisingly lively discussion about which details of an exception are interesting, and how to test for them.

My string comparison was quickly upped to substring comparison, and then to regex matching. Then it was pointed out that in different locales, the exception messages can be completely different.

As it happens, this came up again for me in a different context. One of the things test runners need to do is find test code nestled in among the application code under test. Here's sample code for doing this (from Russ Magee's new test framework for Django):

# Check to see if a separate 'tests' module exists parallel to the 
# models module
TEST_MODULE = 'tests'
try:
    app_path = app_module.__name__.split('.')[:-1]
    test_module = __import__('.'.join(app_path + [TEST_MODULE]), [], [], TEST_MODULE)
    suite.addTest(testLoader.loadTestsFromModule(test_module))
except ImportError:
    # No tests.py file for application
    pass

This works great: if there's a tests.py file, it will be imported and added to the test suite. If there isn't, it will silently move on. But what if you have a tests.py file, but it has an import error in it? I mis-type imports all the time, or move code between files and let the interpreter find the imports I need to bring along. If my tests.py file has an import error, it is suppressed, the tests aren't found, and I don't get told about it.

I changed the code to this:

# Check to see if a separate 'tests' module exists parallel to the 
# models module
TEST_MODULE = 'tests'
try:
    app_path = app_module.__name__.split('.')[:-1]
    test_module = __import__('.'.join(app_path + [TEST_MODULE]), [], [], TEST_MODULE)
    suite.addTest(testLoader.loadTestsFromModule(test_module))
except ImportError, exc:
    # No tests.py file for application, or some other import error.
    if str(exc) != 'No module named %s' % TEST_MODULE:
        # It's something other than a missing tests module, probably a real
        # error, so show the user.
        import traceback
        traceback.print_exc()

Now if tests.py doesn't exist, it will still be silent, but any other error will be displayed to the user. Better: when my tests aren't found, I can see why. But there we are testing exception messages. If this code were run on a non-English installation, the logic would be all wrong. One more edge case: suppose deep in my tests.py, there's an import for another module named "tests"? If that import fails, the exception will still be swallowed.
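
To make the fragility concrete, here is a small sketch that isolates the message comparison. It is written in modern Python 3 syntax, and looks_like_missing_tests is a made-up helper name, not part of any framework. The exception objects are built by hand to stand in for three situations. Current Python 3 releases put quotes around the module name in the message, which is already enough to break the comparison even in an English locale.

```python
TEST_MODULE = 'tests'

def looks_like_missing_tests(exc):
    # Same comparison as in the except clause above.
    return str(exc) == 'No module named %s' % TEST_MODULE

# Hand-built stand-ins for three different failures (assumed for illustration).
no_tests_file = ImportError('No module named tests')         # the app has no tests.py
nested_import_failed = ImportError('No module named tests')  # tests.py exists, but imports another missing "tests"
reworded = ImportError("No module named 'tests'")            # same cause, different wording

print(looks_like_missing_tests(no_tests_file))         # True
print(looks_like_missing_tests(nested_import_failed))  # True: a real error gets swallowed
print(looks_like_missing_tests(reworded))              # False: a harmless case gets reported
```

The first two cases are indistinguishable by message alone, and the third shows that any change in wording flips the result.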

This is one of the difficulties in error handling in general: when something goes wrong, how do you express it richly enough so that the caller can understand the problem well enough to do something about it?

Clearly, just knowing that an ImportError happened is not enough. Here we want to distinguish between two different causes of ImportError. The exception itself only has one piece of data on it: a message. Wouldn't it be better if it had some structured information as well? As Calvin Spealman pointed out on yesterday's post, it would be great if the details of the exception were available without trying to parse a human-readable message.

Why doesn't ImportError have the name of the module that couldn't be imported? Then we could do something like this:

...
except ImportError, exc:
    # No tests.py file for application, or some other import error.
    test_path = '.'.join(app_path + [TEST_MODULE])
    if exc.module_path != test_path:
        # It's something other than a missing tests module, probably a real
        # error, so show the user.
        import traceback
        traceback.print_exc()

Now we can do the test we want to do. We aren't beholden to any particular locale, and we can distinguish between different tests modules that couldn't be imported.

All sorts of Python exceptions could be extended this way. Calvin's original example was AttributeError carrying the object and the name of the attribute that couldn't be found on it. The good news is that this could be added to Python at any time.
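
For readers on a newer interpreter: Python 3.3 and later attach a name attribute to ImportError, and the import machinery sets it to the full dotted name of the module it could not find. That is close to the module_path idea above. The following is only a sketch under that assumption; load_tests_module is a hypothetical helper name, not the Django code.

```python
import importlib

def load_tests_module(app_package, test_module='tests'):
    """Return the app's tests module, or None if the app simply has none."""
    test_path = app_package + '.' + test_module
    try:
        return importlib.import_module(test_path)
    except ImportError as exc:
        # exc.name is the dotted name of the module that could not be found
        # (assumes Python 3.3+; getattr keeps the check harmless elsewhere).
        if getattr(exc, 'name', None) == test_path:
            return None   # no tests module for this application
        raise             # some other import failed: let the user see it

# The standard library's json package has no json.tests submodule.
print(load_tests_module('json'))  # None
```

Because the comparison uses the full dotted path, a failed import of some other module named "tests" from inside tests.py is no longer mistaken for a missing tests.py.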

Comments

Bill Mill 7:24 AM on 6 Sep 2006

+1 on this for py3k.

Brandon Corfman 9:02 AM on 6 Sep 2006

I don't see how you can have a good discussion of exception handling without discussing an overall strategy.

From your example, it looks like your strategy is that you want to pick up an exception at any point in the call stack and read metadata that tells you where the exception came from.

My strategy would be "handle the exception as close to the point of failure as possible". If an exception occurs inside a particular function, that function is best suited to deal with it. So I would have one try-except in the main file and one try-except in tests.py to handle any ImportError in there. My belief is that the caller inside your main.py file shouldn't see exceptions that weren't meant for it. If the caller DOES see an exception, it should only be a result of a re-throw by the callee that says "I can't handle this myself".

These ideas are borrowed from Herb Sutter's writings. Even though he doesn't write specifically for Python, Herb Sutter's Exceptional C++ has the best discussion of exception safety and exception semantics I've ever seen. His ideas are widely applicable to any language that uses exceptions.

Michael Chermside 1:08 PM on 6 Sep 2006

A big rousing YES! from me. Yes we should do this, and yes we can start doing it any time. There doesn't need to be any particular rhyme or reason about it... just a broad consensus that placing rich data on the exception is wise, and then it can be implemented one by one.

Within application frameworks that have their own exception types, I have tried implementing things like this before. I have rarely gotten very far before giving up, though, and there's one particular problem I have run into frequently, which I will try to describe. The problem is that the exception is raised in a low-level piece of code that does not have access to the information that we want to place in the exception. For instance, in function foo() I may open a file, then invoke bar(), which reads from the file handle I get back. When bar() encounters an I/O error it cannot include the filename in the exception, because bar() only has a file handle and doesn't know the filename. I usually solve this by having bar() raise the exception without a filename, then having foo() catch the exception, add the additional information (the filename), and re-raise it. But while that makes sense in an application framework that I control completely, I'm not sure the approach makes sense within Python libraries.

I'm not offering a solution here, just describing a potential difficulty.
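
The catch, decorate, and re-raise workaround described above might look roughly like this. It is a sketch in modern Python 3, not code from any real framework: ReadError is a hypothetical application exception, and the opener parameter on foo() is an assumption added only so the failure path can be exercised without a real disk error.

```python
class ReadError(Exception):
    """Hypothetical application exception that can carry a filename."""
    def __init__(self, message, filename=None):
        super().__init__(message)
        self.filename = filename

def bar(handle):
    # Low-level code: it has a handle but no idea which file it belongs to.
    try:
        return handle.read()
    except OSError as exc:
        raise ReadError(str(exc)) from exc

def foo(filename, opener=open):
    # Higher-level code: it knows the filename, so it adds it and re-raises.
    with opener(filename) as handle:
        try:
            return bar(handle)
        except ReadError as exc:
            exc.filename = filename
            raise
```

A caller further up can then read exc.filename instead of parsing the message, while the original OSError stays reachable through exc.__cause__.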

--------

Brandon Corfman:

I think I understand what you are trying to say with "handle the exception as close to the point of failure as possible", and I disagree with you. I hold instead to the Spartan Principle of exception handling: "Return with your shield or on it." My approach is that any function should either complete successfully or raise an exception -- but that raising an exception IS an acceptable behavior: it's just the function's way of reporting failure. I believe that callers of a failing function should not catch the exception unless (1) they can FIX the problem (in which case, they should catch it and fix things), or (2) they want to do something in the presence of the exception (like write to the log or decorate the exception as above) then re-raise it, or (3) they want to wrap the exception. But (3) is *WAY* overused... appropriate use might be for a math library to promise that it will only raise subclasses of MathLibException so at the top-level entry points to the library it catches things like IOException or ValueError and wraps them in MathLibException so the user won't have to wonder where this odd ValueError came from. Wrapping just for the sake of wrapping gains nothing and loses a GREAT deal in readability.
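
As a rough illustration of case (3), wrapping only at a library's public entry point could be sketched like this in modern Python 3. The names safe_sqrt and _sqrt_impl are invented for the example, and MathLibException is just the hypothetical class named above.

```python
import math

class MathLibException(Exception):
    """Base class for everything this hypothetical library raises."""

def _sqrt_impl(x):
    # Internal helper: lets the built-in ValueError propagate untouched.
    return math.sqrt(x)

def safe_sqrt(x):
    """Public entry point: only MathLibException escapes from here."""
    try:
        return _sqrt_impl(x)
    except ValueError as exc:
        # "raise ... from exc" keeps the original error attached as __cause__.
        raise MathLibException('sqrt failed for %r' % (x,)) from exc
```

Internal helpers stay unwrapped, so the wrapping happens once at the boundary and the original traceback is still available to whoever needs it.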

Chris McDonough 1:21 PM on 6 Sep 2006

Brandon, I'm not sure I agree 100%.

In Ned's case, he's not writing library code at all; his code is pretty much the top-level code for a driver that is finding a bunch of test modules. Essentially, with this hat on, he cannot change the code he is importing (tests.py). But in order to make his life easier, the only thing he can do is to guess about what it means for an import statement to raise an ImportError. This is purely practical even though (as he says later in the entry), it's more guesswork than he'd like.

But even in application and library code, the case for *not* handling exceptions is pretty strong. When you want to break out of a recursive bit of code, exceptions are wonderful, even when their handlers are nowhere near the executing function. Likewise, in library code, handling edge-case exceptions in every function is typically ill-advised. It's usually better for programmers who use your library to get a traceback with the real stack than an error message (or custom exception) that you manufacture.

I say these things as a person who used to believe that catching every possible exception and reraising it with a context-appropriate error message was a good thing. But having been burned by other people's code that did the same thing, I no longer think that.

Brandon Corfman 2:32 PM on 6 Sep 2006

Michael: Not saying that my exception semantics were the One True Way, but it's very hard to have a discussion on exception handling without knowing what strategy Ned was assuming in the first place.

Chris: Thanks, I think I understand what you and Ned were driving at now. Maybe the Python developers can forge some new ground here.

The only other way out I can think of, besides metadata, is to get language support for checked exceptions (like Java) in imported modules, but that seems very un-Pythonic.

Ned Batchelder 5:23 AM on 7 Sep 2006

Brandon: you are right about the need to talk about exception strategy as a larger topic. My own problem is that when I start writing about exceptions, it's hard to not leap into a full-blown discussion of strategy, and then it becomes a week-long writing project, and the blog post never gets done! I've got two longer pieces already on this topic, which I should have linked to, the main one being Exceptions in the Rainforest.

In my current case, the exceptions I'm talking about catching are failures to import modules. These are the Python equivalent of compile-time errors. I'm not going to wrap all my imports with a try/except just in case I've mis-typed the module name!

Michael Chermside 7:54 AM on 7 Sep 2006

Brandon:

One of the most telling points for me is that Java's checked exceptions (so far as I know it's the only language to use this concept) seem like a very good idea -- I thought they were brilliant when I first saw them -- but most experienced users who are independent thinkers seem to agree that the experiment has been a failure and checked exceptions are better avoided. (It's kind of like C++'s const in this regard... seems brilliant, but not actually so nice in practice.)

What ends up happening is that 90% of the methods you write wind up declaring "throws MySystemException". Or if your programmers aren't so good you wind up with everything declaring "throws Exception", or lots of clauses that say "catch Exception { }" that catch and ignore the exceptions. And it doesn't add anything. What *WOULD* be useful is good documentation of what exceptions can be thrown... but we certainly can't trust untested documentation, so we'll never really be able to believe the docs. I think the only TRUE solution is good tools (IDEs and code analyzers) that determine for you what exceptions are possible at various places... we're not there yet, but given the rate at which programming tools are advancing lately I think it's coming soon.

So to summarize, I guess I object to both checked exceptions and metadata, and instead I'm asking for you to hope for future improvements in tools. I guess that isn't very helpful today. Sorry.

Oh, and "Exceptions in the Rainforest" is one of Ned's more brilliant pieces -- well worth reading.

Robert Brewer 11:12 AM on 8 Sep 2006

Chris said:
> I say these things as a person who used
> to believe that catching every possible
> exception and reraising it with a context-
> appropriate error message was a good thing.
> But having been burned by other people's
> code that did the same thing, I no longer
> think that.

Yeah; there's only so many times you can receive the exception, dig into the code, comment out the try/except, run the code again, see the *real* error, fix it, and then uncomment the try/except again. That gets old real quick. In other people's code, doubly so.
