Thursday 4 August 2016 — This is more than eight years old. Be careful.
A common question is, how do I break out of two nested loops at once? For example, how can I examine pairs of characters in a string, stopping when I find an equal pair? The classic way to do this is to write two nested loops that iterate over the indexes of the string:
s = "a string to examine"
for i in range(len(s)):
for j in range(i+1, len(s)):
if s[i] == s[j]:
answer = (i, j)
break # How to break twice???
Here we are using two loops to generate the two indexes that we want to examine. When we find the condition we’re looking for, we want to end both loops.
There are a few common answers to this. But I don’t like them much:
- Put the loops into a function, and return from the function to break the loops. This is unsatisfying because the loops might not be a natural place to refactor into a new function, and maybe you need access to other locals during the loops.
- Raise an exception and catch it outside the double loop. This is using exceptions as a form of goto. There’s no exceptional condition here, you’re just taking advantage of exceptions’ action at a distance.
- Use boolean variables to note that the loop is done, and check the variable in the outer loop to execute a second break. This is a low-tech solution, and may be right for some cases, but is mostly just extra noise and bookkeeping.
My preferred answer, and one that I covered in my PyCon 2013 talk, Loop Like A Native, is to make the double loop into a single loop, and then just use a simple break.
This requires putting a little more work into the loops, but is a good exercise in abstracting your iteration. This is something Python is very good at, but it is easy to use Python as if it were a less capable language, and not take advantage of the loop abstractions available.
Let’s consider the problem again. Is this really two loops? Before you write any code, listen to the English description again:
How can I examine pairs of characters in a string, stopping when I find an equal pair?
I don’t hear two loops in that description. There’s a single loop, over pairs. So let’s write it that way:
def unique_pairs(n):
"""Produce pairs of indexes in range(n)"""
for i in range(n):
for j in range(i+1, n):
yield i, j
s = "a string to examine"
for i, j in unique_pairs(len(s)):
if s[i] == s[j]:
answer = (i, j)
break
Here we’ve written a generator to produce the pairs of indexes we need. Now our loop is a single loop over pairs, rather than a double loop over indexes. The double loop is still there, but abstracted away inside the unique_pairs generator.
This makes our code nicely match our English. And notice we no longer have to write len(s) twice, another sign that the original code wanted refactoring. The unique_pairs generator can be reused if we find other places we want to iterate like this, though remember that multiple uses is not a requirement for writing a function.
I know this technique seems exotic. But it really is the best solution. If you still feel tied to the double loops, think more about how you imagine the structure of your program. The very fact that you are trying to break out of both loops at once means that in some sense they are one thing, not two. Hide the two-ness inside one generator, and you can structure your code the way you really think about it.
Python has powerful tools for abstraction, including generators and other techniques for abstracting iteration. My Loop Like A Native talk has more detail (and one egregious joke) if you want to hear more about it.
Comments
However I'm confused as to how a generator function is less confusing than simply putting the nested loops in a function and returning it.
@gareth love this itertoolss example!
Certainly not code I would expect to see in Python (although I am almost certainly guilty of writing similar snippets).
It looks like a different case, when such iterator itself needs a docstring and tests.
> Something like itertools.combinations would be nice but it doesn't do what the author wants when the two lists contain arbitrary objects.
I would very much like to know what the author wants in this case. :-) It seems to me it really should work no matter what objects the list(s??) contains. It doesn't inspect the objects at all.
@Ned:
> maybe you need access to other locals during the loops.
Then put it in a local function, of course. Python has had nested scopes for about 16 years now. It's not rocket science. :-P
> This is using exceptions as a form of goto. There's no exceptional condition here, you're just taking advantage of exceptions' action at a distance.
I didn't believe I'd hear this from you. I thought Python has gotten over the "Exceptions should only be used for exceptional conditions" bullsh*t.
Exceptions _are_ a (structured) form of goto. Read Knuth sometime. ;-P
Both "local function" and "use exceptions" seem to me to be using Python features not for their intended use (functions are for abstracting a series of statements into a named unit, and exceptions are for dealing with if not exceptional, then out-of-band conditions), but as a trick taking advantage of their execution semantics (functions give you return, and exceptions give you raise/except).
Of course, people will disagree. If they didn't, this wouldn't be an interesting blog post. :)
I still believe abstracting the iteration is easily the better solution, as it gives you an abstraction in a useful and conceptual place, not just where it happens to be a convenient execution semantic.
If functions were modelling mathematical functions (ALGOL functions), you'd be right. And then the SESE philosophy would apply, and "premature return" would be an antipattern. But Python functions are not that. They model C "functions" (Pascal procedures), in fact. They control flow. And it's definitely a feature.
Second, exceptions are exactly the form of structured goto that Guido has hoped would alleviate the need for raw goto (remember that at the time Python was introduced, it was really not clear that the language can survive without raw goto). It is not a sideeffect. It is an explicit structured out-of-band communication of control flow (in-band meaning function calls and returns). Breaking out of the loop is as out-of-band as StopIteration is. :-)
I don't know what here reminds you of Java. I am advocating abstracting the iteration, which the last I looked (admittedly, very long ago) wasn't something Java provided.
I was referring (in both my comments) _only_ to that unfortunate sentence of yours, which to me sounded too much like the famous Java dogma, "Exceptions should only be used for exceptional conditions" (see e.g. http://www.informit.com/articles/article.aspx?p=2133373&seqNum=6). In Python, I'm sure I don't have to convince you too hard, _that doesn't apply_.
There was, IIRC, recently a Python-dev discussion about whether except should use issubclass instead of just walking through the MRO, and one of main reasons Guido quoted as being against it was that it would make exception handling too slow for normal Python usage, and then we would have to invent faster LBYL patterns for many things we currently do in EAFP way, which would change language too much.
Python embraces exceptions. They are an integral part of the language, not an afterthought. Somebody could probably argue that idiomatic C++ or Java in the ideal world (smart users, enough memory, responsive peripherals) woudn't need exceptions at all. Python would still need them.
I don't think this view falls under the category of:
> Java dogma, "Exceptions should only be used for exceptional conditions"
And in any case, my point has never been "don't do these three ways of breaking out of two loops," but instead, "I think this fourth way is better than those three ways."
[Of course I never said that fourth way shouldn't be considered, and it's nice that Ned wrote this post (although it's written already in many places on the net, it's always nice when one of Python "inner circle" people writes it). I'm just saying those three ways are not that bad, and it depends on the concrete situation whether each of them would be better than the fourth way.]
s = "a string to examine"
y = filter(lambda x: x is not None, [(key,s[key+1:].find(val)+1) if s[key+1:].find(val)> -1 else None for key,val in enumerate(s)])
answer = y[0] if y else None
I am a bit surprised that nobody is addressing the elephant in the room: labels. Using “break label” seems like the most obvious and clear solution. I wonder if the new parser makes this now possible or if there was some particular aversion to using labels all along. I mean, it’s not like I’m suggesting adding GOTOs!
Add a comment: