Breaking out of two loops

Thursday 4 August 2016This is eight years old. Be careful.

A common question is, how do I break out of two nested loops at once? For example, how can I examine pairs of characters in a string, stopping when I find an equal pair? The classic way to do this is to write two nested loops that iterate over the indexes of the string:

s = "a string to examine"
for i in range(len(s)):
    for j in range(i+1, len(s)):
        if s[i] == s[j]:
            answer = (i, j)
            break   # How to break twice???

Here we are using two loops to generate the two indexes that we want to examine. When we find the condition we’re looking for, we want to end both loops.

There are a few common answers to this. But I don’t like them much:

  • Put the loops into a function, and return from the function to break the loops. This is unsatisfying because the loops might not be a natural place to refactor into a new function, and maybe you need access to other locals during the loops.
  • Raise an exception and catch it outside the double loop. This is using exceptions as a form of goto. There’s no exceptional condition here, you’re just taking advantage of exceptions’ action at a distance.
  • Use boolean variables to note that the loop is done, and check the variable in the outer loop to execute a second break. This is a low-tech solution, and may be right for some cases, but is mostly just extra noise and bookkeeping.

My preferred answer, and one that I covered in my PyCon 2013 talk, Loop Like A Native, is to make the double loop into a single loop, and then just use a simple break.

This requires putting a little more work into the loops, but is a good exercise in abstracting your iteration. This is something Python is very good at, but it is easy to use Python as if it were a less capable language, and not take advantage of the loop abstractions available.

Let’s consider the problem again. Is this really two loops? Before you write any code, listen to the English description again:

How can I examine pairs of characters in a string, stopping when I find an equal pair?

I don’t hear two loops in that description. There’s a single loop, over pairs. So let’s write it that way:

def unique_pairs(n):
    """Produce pairs of indexes in range(n)"""
    for i in range(n):
        for j in range(i+1, n):
            yield i, j

s = "a string to examine"
for i, j in unique_pairs(len(s)):
    if s[i] == s[j]:
        answer = (i, j)
        break

Here we’ve written a generator to produce the pairs of indexes we need. Now our loop is a single loop over pairs, rather than a double loop over indexes. The double loop is still there, but abstracted away inside the unique_pairs generator.

This makes our code nicely match our English. And notice we no longer have to write len(s) twice, another sign that the original code wanted refactoring. The unique_pairs generator can be reused if we find other places we want to iterate like this, though remember that multiple uses is not a requirement for writing a function.

I know this technique seems exotic. But it really is the best solution. If you still feel tied to the double loops, think more about how you imagine the structure of your program. The very fact that you are trying to break out of both loops at once means that in some sense they are one thing, not two. Hide the two-ness inside one generator, and you can structure your code the way you really think about it.

Python has powerful tools for abstraction, including generators and other techniques for abstracting iteration. My Loop Like A Native talk has more detail (and one egregious joke) if you want to hear more about it.

Comments

[gravatar]
How about this
for i in range(len(s)):
    for j in range(i+1, len(s)):
        if s[i] == s[j]:
            answer = (i, j)
            break
    else:
        continue
    break
[gravatar]
When you have multiple nested loops you can often reduce them to a single loop using the tools in the itertools module. When you are looping over distinct pairs (or triples, quadruples, etc.), use itertools.combinations. In this case the code becomes:
from itertools import combinations
for (i, a), (j, b) in combinations(enumerate(s), 2):
    if a == b:
        answer = i, j
        break
else:
    # Failed to find a matching pair — what now?
Alternatively, the function unique_pairs in the post could be rewritten like this:
def unique_pairs(n):
    """Produce pairs of indexes in range(n)"""
    return combinations(range(n), 2)
Similarly, when you are looping with repetition, use itertools.product.
[gravatar]
@ionelmc: break/else/continue/break seems very confusing to me.
[gravatar]
@Gareth: Thanks! itertools does have lots of goodies. In this case, I was trying to focus on the refactoring from two loops to one, but yes, often you can use itertools to avoid implementing your own logic.
[gravatar]
Does someone could calculate the computational complexity of all the different refactoring? What about profiling with millions of elements, which performs better?
[gravatar]
Balazs Tothfalussy 8:29 PM on 4 Aug 2016
How about avoiding the for loop and breaking altogether?
s = "a string to examine"
i,j = 0, 1
while i < len(s) - 1 and s[i] != s[j]:
    while j < len(s) and s[i] != s[j]:
        j = j + 1
    if j >= len(s):
        i,j = i + 1, i + 2

if i < len(s) - 1 and j < len(s):
    answer = (i, j)
[gravatar]
Cool solution using a generator!

However I'm confused as to how a generator function is less confusing than simply putting the nested loops in a function and returning it.

@gareth love this itertoolss example!
[gravatar]
@Balazs: I admire the work you put into a while-loop solution, but this is definitely not clearer than the for-loop.
[gravatar]
@Tom: my problem with putting the loops in a function is that you don't get the power of abstracting the iteration from the condition checking. And in a real program, it could be awkward to move the double loop into a separate function, depending on what happened in the loop with the other local variables.
[gravatar]
Something like itertools.combinations would be nice but it doesn't do what the author wants when the two lists contain arbitrary objects. unique_pairs is really useful and IMO should be part of itertools. Am I missing something or are there any good reasons why it isn't (yet)?
[gravatar]
Andrew Grigorev 10:08 PM on 4 Aug 2016
I'd prefer to transform the nested loop to a single generator without defining a generator function:
it = ((i, j) for i in range(len(s)) for j in range(i + 1, len(s)))

for i, j in it:
    if s[i] == s[j]:
        answer = (i, j)
        break
Also, there is a specific solution to get first matching element from such generator:
answer = next(
    (i, j)
    for i in range(len(s))
    for j in range(i + 1, len(s))
    if s[i] == s[j]
)
[gravatar]
Andrew Grigorev 10:17 PM on 4 Aug 2016
Btw, thank you for your 2003 year talk, found a good point in "Abstracting your iterators" slide! Other stuff also looks pretty useful and complete :-).
[gravatar]
@Andrew: I'm not sure why it's better to write such compact generators. I prefer giving it a name, and a docstring. That way, I can write a test of the generator as a unit. It can be understood as a unit, and reused in the code.
[gravatar]
No offense intended but @Balazs answer looks like a C implementation to me - iterating through lists using the index and range checking against the length (Although a real C implementation would check for the `\0` character).

Certainly not code I would expect to see in Python (although I am almost certainly guilty of writing similar snippets).
[gravatar]
Balazs Tothfalussy 10:05 AM on 5 Aug 2016
@Tony Flury - no offense taken, I was curios what others think about such an approach, as it was not listed as an alternative.
[gravatar]
Andrew Grigorev 2:36 PM on 5 Aug 2016
@Ned: It's really a matter of taste :-). Altough, if it worth a docstring, why not then just move all stuff into a function and just return?

It looks like a different case, when such iterator itself needs a docstring and tests.
[gravatar]
@Andrew: One of the points I'm trying to make with this post is that the iteration over pairs is a separate idea from what you do with those pairs. If you move the entire double loop into a new function, you aren't separating those two ideas. The way my code is written, we have a named unit that encapsulates the idea of iterating over pairs, independent of what you do with those pairs. That unit can be tested, and reused.
[gravatar]
Andrew Grigorev 3:39 PM on 5 Aug 2016
@Ned: Ok, got it :-).
[gravatar]
@Tobi:

> Something like itertools.combinations would be nice but it doesn't do what the author wants when the two lists contain arbitrary objects.

I would very much like to know what the author wants in this case. :-) It seems to me it really should work no matter what objects the list(s??) contains. It doesn't inspect the objects at all.

@Ned:

> maybe you need access to other locals during the loops.

Then put it in a local function, of course. Python has had nested scopes for about 16 years now. It's not rocket science. :-P

> This is using exceptions as a form of goto. There's no exceptional condition here, you're just taking advantage of exceptions' action at a distance.

I didn't believe I'd hear this from you. I thought Python has gotten over the "Exceptions should only be used for exceptional conditions" bullsh*t.

Exceptions _are_ a (structured) form of goto. Read Knuth sometime. ;-P
[gravatar]
@Veky: maybe tone down the attitude a bit?

Both "local function" and "use exceptions" seem to me to be using Python features not for their intended use (functions are for abstracting a series of statements into a named unit, and exceptions are for dealing with if not exceptional, then out-of-band conditions), but as a trick taking advantage of their execution semantics (functions give you return, and exceptions give you raise/except).

Of course, people will disagree. If they didn't, this wouldn't be an interesting blog post. :)

I still believe abstracting the iteration is easily the better solution, as it gives you an abstraction in a useful and conceptual place, not just where it happens to be a convenient execution semantic.
[gravatar]
Sorry about the tone. I just thought we were more mature than flaunting Java-style dogmatic patterns around.

If functions were modelling mathematical functions (ALGOL functions), you'd be right. And then the SESE philosophy would apply, and "premature return" would be an antipattern. But Python functions are not that. They model C "functions" (Pascal procedures), in fact. They control flow. And it's definitely a feature.

Second, exceptions are exactly the form of structured goto that Guido has hoped would alleviate the need for raw goto (remember that at the time Python was introduced, it was really not clear that the language can survive without raw goto). It is not a sideeffect. It is an explicit structured out-of-band communication of control flow (in-band meaning function calls and returns). Breaking out of the loop is as out-of-band as StopIteration is. :-)
[gravatar]
@Veky: your improved attitude is to imply that I am immature, and dogmatic? :)

I don't know what here reminds you of Java. I am advocating abstracting the iteration, which the last I looked (admittedly, very long ago) wasn't something Java provided.
[gravatar]
Hm. I don't know if you're messing with me (I probably deserved it:), but I'll try one more time.

I was referring (in both my comments) _only_ to that unfortunate sentence of yours, which to me sounded too much like the famous Java dogma, "Exceptions should only be used for exceptional conditions" (see e.g. http://www.informit.com/articles/article.aspx?p=2133373&seqNum=6). In Python, I'm sure I don't have to convince you too hard, _that doesn't apply_.

There was, IIRC, recently a Python-dev discussion about whether except should use issubclass instead of just walking through the MRO, and one of main reasons Guido quoted as being against it was that it would make exception handling too slow for normal Python usage, and then we would have to invent faster LBYL patterns for many things we currently do in EAFP way, which would change language too much.

Python embraces exceptions. They are an integral part of the language, not an afterthought. Somebody could probably argue that idiomatic C++ or Java in the ideal world (smart users, enough memory, responsive peripherals) woudn't need exceptions at all. Python would still need them.
[gravatar]
@Veky I think using exceptions as an adhoc go-to is odd and not normally how exceptions are used in my experience. The use of generators and StopIteration exceptions, on the other hand, are ubiquitous in python code and easier to grok by python programmers.

I don't think this view falls under the category of:

> Java dogma, "Exceptions should only be used for exceptional conditions"
[gravatar]
=D i came up with this:
res = ((a,i,j+i+1) for i,a in enumerate(t) for j,b in enumerate(t[i+1:]) if a==b and a!=' ')

print(next(res))
[gravatar]
or in compliance with the article =D:
s = "a string to examine"
answer = ((i,j+i+1) for i,a in enumerate(s) for j,b in enumerate(s[i+1:]) if a==b)
[gravatar]
Oh! Andrew Grigorev did that already =)))
[gravatar]
@Veky: yes, Python embraces exceptions, but that doesn't mean you can use them for anything at all. There are still cases where exceptions will work, but will be frowned upon. Rather than repeat the bad Java words, I'll say this: I think it is a code smell to have an exception raised and then caught entirely within one function. I'm not saying you can't do it, but it seems outside the spirit of the feature to me.

And in any case, my point has never been "don't do these three ways of breaking out of two loops," but instead, "I think this fourth way is better than those three ways."
[gravatar]
Now with Tom's comment too, it really seems people don't understand what I'm writing about. It's probably my fault then, for not expressing myself clearly. Unfortunately, since I already said I took issue _only_ with the sentence part "There's no exceptional condition here" in the original post, I don't know what else I can say to clarify my position.

[Of course I never said that fourth way shouldn't be considered, and it's nice that Ned wrote this post (although it's written already in many places on the net, it's always nice when one of Python "inner circle" people writes it). I'm just saying those three ways are not that bad, and it depends on the concrete situation whether each of them would be better than the fourth way.]
[gravatar]
I agree, the approach you show is the best one I can think of to solve this problem.
[gravatar]
I do like Andrew Grigorev's solution,
answer = next(
    (i, j)
    for i in range(len(s))
    for j in range(i + 1, len(s))
    if s[i] == s[j]
)
but that's probably because I've written way too much SQL
SELECT i,j
FROM some_table_involving_s
WHERE s[i] = s[j]
These days I tend to favor declarative over procedural, and I think keeping the iterators and conditionals together is actually easier to understand. But opinions differ ...
[gravatar]
Something I don't see mentioned in the discussion are the performance implications of using exceptions as a goto. I haven't analyzed it myself, but my understanding is that -- aside from design considerations -- another good reason to only use exceptions for exceptional cases is that they're slow.
[gravatar]
good strategy is to avoid nested loops ;)

s = "a string to examine"
y = filter(lambda x: x is not None, [(key,s[key+1:].find(val)+1) if s[key+1:].find(val)> -1 else None for key,val in enumerate(s)])
answer = y[0] if y else None
[gravatar]
Ned, this post and your PyCon talk were super useful. Elegant and fun explanation. Thank you!
[gravatar]

I am a bit surprised that nobody is addressing the elephant in the room: labels. Using “break label” seems like the most obvious and clear solution. I wonder if the new parser makes this now possible or if there was some particular aversion to using labels all along. I mean, it’s not like I’m suggesting adding GOTOs!

Add a comment:

Ignore this:
Leave this empty:
Name is required. Either email or web are required. Email won't be displayed and I won't spam you. Your web site won't be indexed by search engines.
Don't put anything here:
Leave this empty:
Comment text is Markdown.