Recursive dogma

Monday 28 May 2012This is close to 13 years old. Be careful.

This morning in the #python IRC channel, someone needed help with a recursive function:

Questioner: i have a problem with a little recursive function
Questioner: i create a list recursive of length 2**30 but i get a memory problem
Naysayer: recursion in Python is almost always bad news
Questioner: i have probs with high memory because i create too many too large list when i enter a value of 2**30 for N
RecursionFan: recursion is awesome
RecursionFan: it sounds like you’re doing it wrong
Naysayer: you should avoid recursion in python
RecursionFan: why?
Naysayer: There’s this little thing called the stack
Questioner: yeah i now the depth of my recursion, its just log(N)
Questioner: but can somebody help me so this function can work for parameters of 2**30
RecursionFan: there’s this little thing called http://en.wikipedia.org/wiki/Tail_call
Naysayer: yeah, and python doesn’t have it
Onlooker: RecursionFan, does python optimise tail-recursion?
nedbat: RecursionFan: Python doesn’t do tail-call removal.
Naysayer: Onlooker: No
nedbat: Naysayer: “you should avoid recursion” is a bit strong. You need to understand the limitation.
nedbat: Naysayer: Python is as good at recursion as C is at integer arithmetic.
Naysayer: nedbat: It is good practice to avoid recursion in Python if you don’t understand how it works.
nedbat: Naysayer: it is good practice to avoid _____ in Python if you don’t understand how it works.

(Names other than mine, nedbat, are changed.) Before anyone’s read Questioner’s code, or even listened fully to his question, we’re off on the usual rant against recursion in Python, and debates about tail-call elimination. As it happens, and as Questioner rightly understood, the problem is not about deep recursion:

def computeHW(N):
    if N == 1:
        return [1]
    else:
        L = computeHW(N/2)
        L.extend([2*i for i in L])
        return L

Questioner wanted to call this function with 2**30, which is quite modest in terms of depth, the stack will only get 30 levels deep. But it’s crazy in memory terms: it tries to allocate a list with a billion elements, but not before it’s allocated and deallocated many many others, up to that size.

So yes, this function isn’t going to work, but not because of tail-call optimization or its lack. We can change the function into a generator that still works recursively, but won’t allocate any memory:

def computeHW2(N):
    if N == 1:
        yield 1
    else:
        for i in computeHW2(N/2):
            yield i
        for i in computeHW2(N/2):
            yield 2 * i

Talking more with Questioner, it turns out he didn’t want the sequence, he wanted to be able to randomly access elements of his list. We can turn this function into a closed-form solution that computes the n’th element of this list:

def closedHW(N):
    if N == 0:
        return 1
    else:
        bits = N.bit_length()
        return 2 * closedHW(N - 2**(bits-1))

This function is still recursive, and so will use up to 30 stack frames for numbers up to 2**30, but doesn’t allocate any lists at all. This function is also in a simple tail-call form, so we can easily make it iterative:

def closedHW_iterative(N):
    hw = 1
    while N:
        bits = N.bit_length()
        N -= 2**(bits-1)
        hw *= 2
    return hw

BTW, this code gets to use one of the few methods on integers, bit_length(), which is new in 2.7.

In the end, we eliminated the recursion, but not for any of the reasons Naysayer was darkly hinting at.

When people say Python is bad at recursion, they are referring to the fact that the stack can’t grow beyond 1000 frames. This makes it bad at the kind of recursion that languages like Scheme and Haskell encourage. This school of thought uses recursive procedures for iterative processes, to put it in terms from Structure and Interpretation of Computer Programs. Python is not good for this sort of recursion, because it won’t work for more than 1000 iterations, which is far too low a limit.

But truly recursive processes don’t often need that much stack, so using recursive procedures for recursive processes is fine in Python. Blanket statements like “recursion is almost always bad in Python” are just simplistic.

Comments

[gravatar]
I like the logic here. It is also a sad fact that far too often people try to solve simplistic problems using recursive techniques (usually badly expressed) when they are unnecessary and wasteful, as you have also shown.I don't have anything against recursion, but it pains me to see it used when people think it makes their code more 'elegant' or 'simpler'. It may make it a few lines shorter, its just sad that it is also going to eat all your memory and utilize 100 times as much cpu as it needs to unless you write it properly. So Python being 'bad' at implementing flabby, wasteful code is a bonus in my book

The sys.setrecursionlimit() call can set the stack size to greater depth if you want. The docs state there are hard limits depending on platform, but I have never had any trouble raising it when needed.
[gravatar]
@Nick: the risk with sys.setrecursionlimit() is that if you set it too high, your stack will grow to the point of segfaulting.
[gravatar]
I don't deliberately eschew recursion up front but I do occasionally find myself doing things like http://jtauber.com/blog/2007/12/15/avoiding_recursion/

And of course, there's http://jtauber.com/blog/2008/03/30/thunks,_trampolines_and_continuation_passing/ (but I'm not seriously suggesting anyone write in continuation passing style in Python :-)
[gravatar]
It remembers of my last CTO disliking when I told him its supposed expertise in python was quite a fraud.

He was very proud to have recompiled python with a 10000 frames recursion limit, and was still trying to figure out why it was segfaulting.

The coder is a link in a chain which is as strong as its weakest link, and knowledge which strengthen it is very hard to share, since it may conflict with politics.
[gravatar]
Sometimes the iterative version of a routine is easiest to create. When solving this task at http://rosettacode.org/wiki/Set_consolidation I created the new task and the iterative Python solution. The recursive version came from someone else, at a later date.

I just wanted to dispel any impression that iterative routines were necessarily more 'intuitive' or easiest.
[gravatar]
According to the docs (http://docs.python.org/dev/library/stdtypes.html#additional-methods-on-integer-types) bit_length() is new in 3.1, or is it also backported to 2.7?
[gravatar]
@Jeroen: The Python docs can be a bit confusing on this point. Think of the Python2/Python3 split as a fork. The Python 3 docs list what version got the feature on the Python 3 side of the fork, while the Python 2 docs list what version got it on the 2 side of the fork. If you look at the same page in the Python 2 docs (http://docs.python.org/2/library/stdtypes.html#additional-methods-on-integer-types), it says, "new in 2.7". The bit_length method was added in 2.7 and 3.1.
[gravatar]
@Nick Veitch
There is absolutely nothing inherently wasteful about recursion. In languages with support for tail call optimisation, tail calls compile to the same code iteration would compile to.

Add a comment:

Ignore this:
Leave this empty:
Name is required. Either email or web are required. Email won't be displayed and I won't spam you. Your web site won't be indexed by search engines.
Don't put anything here:
Leave this empty:
Comment text is Markdown.