Exceptions in the rainforest

Ned Batchelder
Created 16 October 2003

As part of a debate about exceptions and status returns, Joel asked for an example of exception handling using a particular chunk of code. Before jumping to the code, I want to talk about rainforests for a little bit. If you haven't read my previous article about exceptions and status returns, you might want to start there.

Rainforests

If you've ever studied the rainforest, you know that it is not a simple place. A simplistic model of it would be that there are lots of trees, and lots of animals, and they all live together. It's more interesting than that: The forest is divided horizontally into layers, and each layer has its own ecosystem, with different inhabitants. To understand how the rainforest works, you have to consider the layers separately, and see how they differ from each other.

Complex software is the same way: there are different layers, and the error handling they perform is different. If we want to discuss what exception handling looks like in real code, we have to talk about the layers.

Three layers of code

In my experience, there are three layers to real code (from bottom to top, so this list might look upside-down):

Adapting the software beneath you (the A-layer)
Building pieces of your system (the B-layer)
Combining it all together (the C-layer)
Keep in mind, this is a simple model, and real software is fractal in most of its aspects. A 100,000-line system will have layers within layers within layers. But this three-layer model closely matches the way I've seen a number of real systems evolve. Let's look at each of these layers in detail.

Adapting the software beneath you

Beneath every piece of software is more software. Your Windows application sits on top of the Win32 API, or ATL. Your PHP web site sits on top of MySQL calls, and PHP primitives. Your Java system sits on top of the JDK, the J2EE facilities. Even if you are writing a device driver, your code is sitting on top of the actual I/O operations that write bits to the disk, or whatever it is your driver does.

At the lowest layer of your system, your code deals with your particular underlying software. It makes its calls, and interprets the results. This layer is where you convert cultures, making the underlying software more the way you'd like it to be: operations become more convenient, concepts are presented more palatably to the rest of the system, ugly workarounds are hidden.

Building pieces of your system

The middle layer of your code is where you construct the pieces of your world. Are you writing a spreadsheet? You'll need a cell engine, and some way to read and write data files, and connectors to databases, and charting modules. In some worlds this is called business logic.

This is where the bulk of the code will be, and where you are likely to be adding value. Few applications compete on how well they read and write the registry. The interesting technology is in the cell engines, or drawing paradigms, or database intelligence, or logical inference algorithms. This is the interesting part. The more time you can spend here productively, the better off you will be.

Combining it all together

At the top of your system is the big picture.
For example: when the application starts, we need to create an empty document, initialize the database layer, and show the GUI. This is where you can see the main flow of the application. If you had to explain what your system did in detail to a knowledgeable user, this layer is the one you'd be talking about. This is the stage manager layer, coordinating pieces, making the whole thing hang together into a cohesive whole.

How exceptions are used in the layers

At the bottom layer (Adapting), there's a lot of throwing exceptions. Unless you are coding in Java or C#, where the system toolkits throw exceptions (in which case, I'm preaching to the choir), the layer beneath you more than likely is returning statuses to you. Each call will have its return value checked, and converted into an appropriate exception, and thrown. Sometimes, error values will be dealt with immediately. For example, this layer may implement some simple retrying mechanism for some operations, or it may decide that some error returns are really not errors.

At the middle layer (Building), things are flowing pretty smoothly. Typically, there's not a lot of exceptions being thrown, and not a lot being caught either. This is where you often get to just think about the ideal case, and focus on the algorithms and data structures at hand. Of course, exceptions can happen, especially in the A-layer calls you make. But for the most part, you can let those exceptions fly. An upper layer will deal with them.

At the top layer (Combining), there's a lot of catching exceptions happening. Couldn't open a file? Now you have to decide what to do about it. You can alert the user, try a different file name, exit the application, whatever you as the system designer decide is the best approach. This C-layer code can actually be quite pre-occupied with dealing with exceptions. This makes sense: this is the layer where the code really knows what's going on.
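The division of labor described above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the FileError type, the FilePtr alias, and the functions OpenForRead, CountLines, and LinesOrZero are hypothetical names invented for this sketch, and the standard C call std::fopen stands in for "the software beneath you".

```cpp
#include <cerrno>
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical exception type; it carries the underlying status (errno) as data.
class FileError : public std::runtime_error {
public:
    FileError(const std::string& what, int status)
        : std::runtime_error(what), status_(status) {}
    int status() const { return status_; }
private:
    int status_;
};

// Closes the FILE* automatically when the owning pointer goes out of scope.
struct FileCloser {
    void operator()(std::FILE* f) const { std::fclose(f); }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// A-layer (Adapting): check the status return and convert it into an exception.
FilePtr OpenForRead(const std::string& path) {
    errno = 0;
    std::FILE* f = std::fopen(path.c_str(), "r");
    if (f == nullptr) {
        throw FileError("cannot open " + path, errno);
    }
    return FilePtr(f);
}

// B-layer (Building): focuses on the ideal case; exceptions pass straight through.
int CountLines(const std::string& path) {
    FilePtr file = OpenForRead(path);
    int lines = 0;
    for (int c; (c = std::fgetc(file.get())) != EOF; ) {
        if (c == '\n') ++lines;
    }
    return lines;
}

// C-layer (Combining): knows the big picture, so it decides what a failure means.
// In this sketch a file that cannot be opened simply counts as having no lines.
int LinesOrZero(const std::string& path) {
    try {
        return CountLines(path);
    } catch (const FileError&) {
        return 0;
    }
}
```

Only the top function contains a handler. A different C-layer caller could just as well alert the user or try another file name without any change to the two lower functions.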
If you have an A-layer function to open a file, what should it do when the file can't be opened? How can you possibly say? This function will be used to open all sorts of files for all sorts of reasons. Maybe the C-layer caller knows that the file could be missing, and has a plan for what to do in that case, so alerting the user would be wrong. It's the C-layer that understands the big picture, so it's the C-layer that should be dealing with the exceptions.

Exceptions vs. status returns again

Now for Joel's example. He asked that we discuss this code:

void InstallSoftware()

Using the three-layer model above, this is clearly C-layer code. I know Joel asked for this example because he knew that even with exceptions the code would be cluttered with error handling, just as it would be with status returns. He's right. It's C-layer code, so it will have to deal with unusual cases. There's no way around that.

Others have taken up this challenge, and come up with some nice ways to deal with it cleanly, using C++ destructor semantics to ensure that operations are rolled back. To be perfectly honest, I don't know that I would have been as clever as these writers, though they have given me some good ideas. I might have done it like this:

void InstallSoftware()

This function either succeeds, in which case the files are copied and the registry entries are written, or it throws an exception, and the files and registry entries are cleaned up. Is this sufficient? I don't really know, and in a real implementation I can imagine it getting much hairier than this.

The status return folks may well be crowing about this code, that it is either not handling the problems completely, or that it is just as ugly as status return code. They're missing the point. I'm not claiming that exceptions make all code prettier, or that they somehow remove the burden of thinking through what should happen when something goes wrong.
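The "succeed, or clean up and throw" behavior described here (a commenter quotes the try/catch version further down the page) can be made runnable as a sketch like the one below. The Machine struct, the InstallError type, and all of the function bodies are assumptions added purely so the example can run; only the function names come from the discussion.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the exception type used in the discussion.
struct InstallError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical state, so the sketch runs without touching a real system.
struct Machine {
    bool files_copied = false;
    bool registry_written = false;
    bool registry_fails = false;  // set to true to simulate a failure
};

void CopyFiles(Machine& m) { m.files_copied = true; }
void RemoveFiles(Machine& m) { m.files_copied = false; }
void DeleteRegistryEntries(Machine& m) { m.registry_written = false; }
void MakeRegistryEntries(Machine& m) {
    if (m.registry_fails) throw InstallError("registry write failed");
    m.registry_written = true;
}

// C-layer: either both steps succeed, or both are undone and the error goes up.
void InstallSoftware(Machine& m) {
    try {
        CopyFiles(m);
        MakeRegistryEntries(m);
    } catch (const InstallError&) {
        RemoveFiles(m);
        DeleteRegistryEntries(m);
        throw;  // rethrow the same exception object to the caller
    }
}
```

A bare `throw;` is used in the handler so that the caller sees the original exception object rather than a copy of it.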
The debate over exceptions and status returns is not about whether error handling is hard to do well. We all agree on that. It's not about whether exceptions make it magically better. They don't, and if someone says they do, they haven't written large systems in the real world. The debate is about how errors should be communicated through the code. The C-layer code we're talking about is going to be complicated no matter which technique you use to communicate errors around.

But what does the B-layer code look like?

void MakeRegistryEntries()

Here at the B-layer, we can get into the zone and just write registry entries. How would this look with status returns? Either cluttered with if statements, or hidden behind macros that simply pull your code into the "hidden function return" camp that is supposed to make exceptions evil.

The A-layer code looks like this:

void CRegistry::WriteString(

Here we're adapting to the Win32 registry functions, converting their status returns into exceptions (which carry the actual status return as data so that it can be used for error messages, or analysis).

These examples are all too brief to be real code, but they demonstrate the concepts. Broadly speaking:
Exceptions are better at communicating errors

The challenge in building a large system is making sure errors get communicated around. Exceptions are a better way to do that than status returns:
Comments
Bob gave this some treatment over at his place: http://www.bobcongdon.net/blog/
(Bob's blog apparently has no permalinks)
Geez, everybody is piling on Joel.
Also, in your example to Joel, don't forget our "real world" status (or should I say STATUS) experience:
STATUS DoSomething(int a, int b)
{
    STATUS st;
    if (st != DoSomethingEx(a,b))
        goto error;
    if (st != DoSomethingEx2(a,b))
        goto error;
error:
    return st;
}
I know it's an example and that it's hypothetical BUT
Using the following example (taken from the article), what happens if anything in the catch{} block throws an exception (as well it might if CopyFiles() threw the exception and then you called DeleteRegistryEntries())?
What is the solution? A try{}catch{} in the catch{} block?
void InstallSoftware()
{
    try {
        CopyFiles();
        MakeRegistryEntries();
    }
    catch (CException & ex) {
        RemoveFiles();
        DeleteRegistryEntries();
        throw;  // rethrow the original exception (throw ex; would throw a sliced copy)
    }
}
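One possible answer to this question, offered only as a sketch: run each cleanup step inside its own small try/catch, so that a failing cleanup is recorded but cannot replace the original error. The Steps struct, the BestEffort helper, and the stubbed function bodies below are hypothetical and exist only to make the example runnable.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical state; the flags simulate failures and the log records what ran.
struct Steps {
    bool registry_fails = false;
    bool remove_fails = false;
    std::vector<std::string> log;
};

void CopyFiles(Steps& s) { s.log.push_back("copy"); }
void MakeRegistryEntries(Steps& s) {
    if (s.registry_fails) throw std::runtime_error("registry");
    s.log.push_back("registry");
}
void RemoveFiles(Steps& s) {
    if (s.remove_fails) throw std::runtime_error("remove");
    s.log.push_back("remove");
}
void DeleteRegistryEntries(Steps& s) { s.log.push_back("unregister"); }

// Runs one cleanup step; a failure is logged instead of escaping.
template <typename Fn>
void BestEffort(Steps& s, Fn cleanup) {
    try {
        cleanup(s);
    } catch (const std::exception& e) {
        s.log.push_back(std::string("cleanup failed: ") + e.what());
    }
}

void InstallSoftware(Steps& s) {
    try {
        CopyFiles(s);
        MakeRegistryEntries(s);
    } catch (...) {
        BestEffort(s, RemoveFiles);
        BestEffort(s, DeleteRegistryEntries);
        throw;  // the original failure still reaches the caller
    }
}
```

Whether swallowing a cleanup failure is acceptable depends on the application; the sketch only shows that the nested handlers do not disturb the rethrow of the first exception.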
Ross, taking your point to an even higher level, ultimately there are errors that software just can't fix or clean up from. For example, what if during the RemoveFiles() function call in the example there was a permanent hard disk failure?
The best that a program can do here is to attempt to accurately inform the user of the issue and exit.
That's why exceptions are cleaner in that you can wrap one highest level exception around all your code to catch anything, report it to the user, and exit. Doing that with status returns can get extremely messy. I know, because it's all I used to do before exceptions were made mainstream.
But it's Joel's point that this type of catch-all exception handling leads to a higher probability of being sloppy and missing exceptions that could be readily recovered from.
BTW, the forest is divided into *vertical*, not horizontal layers :)
Just to be pedantic, the rainforest is divided vertically into horizontal layers.
Straight from the horse's mouth:
"Primary tropical rainforest is vertically divided into at least five layers..."
Error codes are 'opt-in', while Exceptions are 'opt-out'. That is, you have to explicitly code to allow error codes to propagate (opting in), while with exceptions, you can have gobs of code that is not cluttered with the exception handling code, and only write handling code in places where you can actually do something about it. Most code is unable to do much about the exception - imagine trying to handle a FileNotFoundException in middle-layer code, how would you prompt the user? You wouldn't, so it is a UI-layer problem. Exception handling is usually cleaner overall, because only code that cares to handle and can handle the exception needs to get 'cluttered' to actually handle it. All the other code remains uncluttered. Thus, the opt-in nature of exception handling tends to result in more readable code and more cohesive programs.
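The opt-in versus opt-out contrast in this comment can be illustrated with a small sketch. Everything in it is hypothetical (the Status enum, WriteEntry, WriteEntryStatus, and the two WriteAll variants); it only shows how much propagation code the middle layer has to write in each style.

```cpp
#include <stdexcept>

enum class Status { Ok, Failed };  // hypothetical status codes

// Hypothetical low-level operations: one reports by status, one by exception.
Status WriteEntryStatus(int value) {
    return value < 0 ? Status::Failed : Status::Ok;
}
void WriteEntry(int value) {
    if (value < 0) throw std::invalid_argument("negative value");
}

// Status returns: the middle layer must check every call to pass errors up.
Status WriteAllStatus(int a, int b, int c) {
    Status st = WriteEntryStatus(a);
    if (st != Status::Ok) return st;
    st = WriteEntryStatus(b);
    if (st != Status::Ok) return st;
    return WriteEntryStatus(c);
}

// Exceptions: the middle layer propagates errors without any code of its own.
void WriteAll(int a, int b, int c) {
    WriteEntry(a);
    WriteEntry(b);
    WriteEntry(c);
}
```

If one of the checks in WriteAllStatus is forgotten, the error is silently dropped; in WriteAll there is nothing to forget.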
Hi all, Joel came up with an absolutely wrong sample, and the answer to Joel should be:
void InstallSoftware()
{
    Log log;
    try {
        CopyFiles(log);
        MakeRegistryEntries(log);
    }
    catch (CException & ex) {
        Rollback(log);
        throw;  // rethrow the original exception (throw ex; would throw a sliced copy)
    }
}
Just use the right techniques in the right places, guys.
Ned, aren't you ignoring a big issue? Non-trivial code at the B level, which just lets exceptions fly by, has to be equipped with declarations of objects having destructors tending towards nightmarish complexity.
From the point of view of B level code, any function you call might return straight into your destructors, which have to tidy everything up without knowing how far through the main body of code you got.
(Unless you sprinkle flag-settings through the body of the code. But B level code is supposed to be clean.)
Or is this just an argument against trying to write non-trivial code in the first place? I know it's not pretty with status returns; but with exceptions, isn't it even worse? i.e. the body of the code looks very pretty, and the destructors are horrible.
In C++, this is dealt with by acquiring resources in object constructors and releasing them in destructors. Objects are instantiated on the stack, and as they go out of scope, resources are freed. The nice thing about this approach is that there's no need to worry about how the function exits. See Bjarne's "appendix e" to his book (google for "c++ appendix e"). I have implemented this consistently through a large system and it works extremely well.
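A minimal sketch of the constructor/destructor approach described in this comment, assuming a hypothetical Handle class whose "resource" is just a counter so that the effect is easy to observe. The Process function and its fail_after parameter are likewise invented for illustration.

```cpp
#include <stdexcept>

// Hypothetical resource: "acquired" in the constructor, "released" in the destructor.
class Handle {
public:
    explicit Handle(int& open_count) : open_count_(open_count) { ++open_count_; }
    ~Handle() { --open_count_; }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
private:
    int& open_count_;
};

// B-layer style function: no try/catch and no progress flags.
// fail_after says how many handles are acquired before failing (0 = never fail).
void Process(int& open_count, int fail_after) {
    Handle first(open_count);
    if (fail_after == 1) throw std::runtime_error("failed after first");
    Handle second(open_count);
    if (fail_after == 2) throw std::runtime_error("failed after second");
    // ... normal work with both handles ...
}
```

Only objects whose constructors have completed are destroyed during unwinding, so the function does not need to track how far it got: whichever handles exist at the point of the throw are the ones that get released.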
IMHO exceptions are bad because they make execution flow a little bit unpredictable. It's like the "goto" which we avoid.
Has anyone tried to debug large systems with exceptions? We started at A, then down to B, then down to C, and then, bum-bang, voila, back to A! I hate these "jumps"!!!
Sorry, wrong sequence of A-B-C; it should be: start from C, down to B, etc.
Can anybody give me Java code and an explanation of the rainforest algorithm related to decision trees in machine learning?
Thanks in advance.