Running the same code on Python 2.x and 3.x

Saturday 3 October 2009

This is more than 15 years old. Be careful.

Last weekend I released a version of Coverage.py for Python 3.x. Getting to that point took a while because 3.x was new to me, and, it seems, everyone is still figuring out how to support it.

I experimented with using 2to3 to create my 3.x code from my 2.x code base, and that worked really well; see “Coverage.py on Python 3.x” for some details. For a while, I developed like this, with 3.x code translated by 2to3 so that I could run the tests under Python 3.1. But then I had to figure out how to package it.

I didn’t want to have to create a separate package in PyPI for the 3.x support. I tried for a while to make one source package with two distinct trees of code in it, but I never got setup.py to be comfortable with that. Setup.py is run during kitting, and building, and installation, and the logic to get it to pick the right tree at all times became twisted and confusing.

(As an aside, setuptools has forked to become Distribute, and they’ve just released their Python 3 support which includes being able to run 2to3 as part of build and install. That may have been a way to go, but I didn’t know it at the time.)

Something, I forget what, made me consider having one source tree that ran on both Python 2 and Python 3. When I looked at the changes 2to3 was making, it seemed doable. I adapted my code to a 2-and-3 idiomatic style, and now the source runs on both.

Changes I had to make:

¶   I already had a file called backward.py that defined 2.5 features for 2.3; now I also use it to deal with import differences between 2 and 3. For example:

try:
    from cStringIO import StringIO
except ImportError:
    from io import StringIO

and then in another file:

from backward import StringIO

¶   exec changed from a statement to a function. Syntax changes like this are the hardest to deal with because code won’t even compile if the syntax is wrong. For the exec issue, I used this (perhaps too) clever conditional code:

# Exec is a statement in Py2, a function in Py3
import sys

if sys.hexversion >= 0x03000000:
    def exec_function(source, filename, global_map):
        """A wrapper around exec()."""
        exec(compile(source, filename, "exec"), global_map)
else:
    # OK, this is pretty gross.  In Py2, exec was a statement, but that will
    # be a syntax error if we try to put it in a Py3 file, even if it isn't
    # executed.  So hide it inside an evaluated string literal instead.
    eval(compile("""\
def exec_function(source, filename, global_map):
    exec compile(source, filename, "exec") in global_map
""",
    "<exec_function>", "exec"
    ))
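
To show what the global_map argument is for, here is a minimal usage sketch. It repeats only the Python 3 branch so that it runs on its own, and the source string, the "<example>" filename, and the namespace name are made up for illustration:

```python
def exec_function(source, filename, global_map):
    """A wrapper around exec() (Python 3 branch only, for brevity)."""
    exec(compile(source, filename, "exec"), global_map)

# The executed code reads and writes names in this dict,
# not in the caller's own globals.
namespace = {"factor": 6}
exec_function("answer = factor * 7", "<example>", namespace)

result = namespace["answer"]
print(result)
```

The filename is only used in tracebacks, so any descriptive label works there.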

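¶   All print statements have to adopt a print(s) syntax that means the same thing in both versions: in Python 2 it is a print statement with a parenthesized expression, and in Python 3 it is a function call. The string to be printed has to be a single string, so some comma-separated lists turned into formatted strings.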
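
As a rough illustration of the print change, the sketch below builds one formatted string and passes that single value to print. The function name and the report wording are hypothetical, not taken from Coverage.py:

```python
def format_report_line(name, hits, total):
    """Build the whole output line as one string before printing it."""
    return "%s: %d of %d lines" % (name, hits, total)

line = format_report_line("backward.py", 12, 15)

# One parenthesized string prints the same text in Python 2 and Python 3.
# print(name, hits, total) would not: Python 2 would print a tuple.
print(line)
```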

¶   2to3 is obsessive about converting any d.keys() use into list(d.keys()), since in Python 3 keys() returns a dictionary view object rather than a list. If the dict isn’t being modified, you can just loop over it without the list(), but in a few places, I really was returning a list, so I included the list() call.
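
A small sketch of the two cases, using a made-up dict called counts: a read-only loop can iterate the dict directly, while a loop that deletes entries needs a real list as a snapshot of the keys.

```python
counts = {"a": 1, "b": 0, "c": 3}

# Read-only loop: iterating the dict itself works in both versions.
total = 0
for key in counts:
    total += counts[key]

# Deleting while looping needs a snapshot of the keys, so list() stays.
for key in list(counts.keys()):
    if counts[key] == 0:
        del counts[key]

remaining = sorted(counts)
print(total, remaining)
```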

¶   A few 2to3 changes are fine to run on both, so these:

d.has_key(k)
d.itervalues()
callable(o)
xrange(limit)

became:

k in d
d.values()
hasattr(o, '__call__')
range(limit)
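
Here is a quick check that the replacement spellings behave as expected, with throwaway values chosen for the example. One caveat that is easy to miss: on Python 2, d.values() and range() build full lists, where itervalues() and xrange() were lazy, so this style suits small collections best.

```python
d = {"x": 1, "y": 2}

has_x = "x" in d                          # instead of d.has_key("x")
values = sorted(d.values())               # instead of d.itervalues()
len_is_callable = hasattr(len, "__call__")  # instead of callable(len)
squares = [n * n for n in range(4)]       # instead of xrange(4)

print(has_x, values, len_is_callable, squares)
```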

¶   Exception handling has changed when you want to get a reference to the exception. This is one of those syntax differences, and it’s structural, so a tricky function definition isn’t going to bridge the gap.

Where Python 2 had this:

try:
    # .. blah blah ..
except SomeErrorClass, err:
    # use err

now Python 3 wants:

try:
    # .. blah blah ..
except SomeErrorClass as err:
    # use err

The as form also compiles on Python 2.6 and later, but the only way to make older 2.x versions and Python 3 both happy is to use the more cumbersome:

try:
    # .. blah blah ..
except SomeErrorClass:
    _, err, _ = sys.exc_info()
    # use err

This is uglier, but there were only a few places I needed it, so it’s not too bad.
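
For a version that can be run directly, here is a minimal sketch of the sys.exc_info() pattern. The parse_int helper is hypothetical and exists only to give the except clause something to catch:

```python
import sys

def parse_int(text):
    """Return (value, None) on success or (None, message) on failure."""
    try:
        return int(text), None
    except ValueError:
        # No "as err" or ", err" here, so this line compiles everywhere.
        _, err, _ = sys.exc_info()
        return None, str(err)

value, message = parse_int("12")
bad_value, bad_message = parse_int("twelve")
print(value, bad_message)
```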

¶   Simple imports are relative or absolute in Python 2, but only absolute in Python 3. The explicit relative import syntax (from . import name) only arrived in Python 2.5, so it won’t compile in the older Python 2 versions I support, and I can’t use it. I was only using relative imports in my test modules, so I used this hack to make them work:

sys.path.insert(0, os.path.split(__file__)[0]) # Force relative import 
from myotherfile import MyClass

Explicitly adding this file’s directory to the path lets Python 3’s absolute-only importer find the module that sits alongside this one.
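
The mechanism can be tried in isolation with a throwaway directory. In this sketch the temporary directory stands in for the test module’s own directory, and the module name myotherfile_demo is invented for the example:

```python
import os
import sys
import tempfile

# Stand-in for the directory that holds the test module and its neighbor.
module_dir = tempfile.mkdtemp()

# Write a hypothetical sibling module into that directory.
f = open(os.path.join(module_dir, "myotherfile_demo.py"), "w")
f.write("class MyClass(object):\n    greeting = 'hello'\n")
f.close()

# With the directory on sys.path, a plain absolute import finds the module.
sys.path.insert(0, module_dir)
from myotherfile_demo import MyClass

print(MyClass.greeting)
```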

¶   One area that still tangles me up is str/unicode and bytes/str. Python 3 is making a good change here, but it feels like we’re still in transition. The docs aren’t always clear on what will be returned, and trying to get the same code to do the right thing under both versions still seems to require experiments with decode and encode.
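
One way to cut down on the experimenting is to funnel conversions through a pair of small helpers. This is only a sketch: to_bytes, to_text, and text_type are hypothetical names, UTF-8 is an assumed default, and the bytes name requires Python 2.6 or later.

```python
import sys

if sys.version_info >= (3, 0):
    text_type = str
else:
    text_type = unicode  # only evaluated on Python 2

def to_bytes(value, encoding="utf-8"):
    """Encode text to bytes; pass anything else through unchanged."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value

def to_text(value, encoding="utf-8"):
    """Decode bytes to text; pass anything else through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

encoded = to_bytes(to_text(b"caf\xc3\xa9"))
print(encoded == b"caf\xc3\xa9")
```

Decoding at the point where data enters the program and encoding where it leaves keeps the version-specific decisions in one place.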

After making all of these changes, I had a single code base that ran on both Python versions, without too much strangeness. It’s way better than having to maintain two packages at PyPI, or trying to trick setup.py into installing different code on different versions.

Others have written about the same challenge:

Comments

I don't even know py (even though I am dying to find a reason to learn it), but I love reading this type of code dorkage.

The exec example makes it clear that there's always a solution, no matter how dirty. I for sure hope that this dirtiness won't be prevalent in all future python code. I have only experimented a little with such cross-platformness myself, for a very simple module (see link in my name in signature).

The good news about the dirtiness is that it's mostly encapsulated in backward.py.

Instead of the code-in-a-string approach, you might consider deploying separate modules for Py2 and Py3, and only importing the one appropriate for the current platform inside your 'backward' module. That approach worked fairly well for CherryPy back when we wanted to test decorator syntax, but not require it.

@Robert: you are right, that's a technique I hadn't considered. In the case of backward.py, it started with a few 2.3-to-2.4 and 2.3-to-2.5 compatibility imports, so I was used to using it for everything. If the exec hack grows to encompass more ugliness, I'll think about separate modules.

@Ned: I would also do the following:
if 'xrange' not in dir(__builtins__):
  xrange = range
This way you can use xrange on 2 and 3 and stop creating a bunch of lists. A similar solution might exist for itervalues.

@Andre, that's a good idea. The place where I replaced xrange with range I knew was going to be a very small list, so I just used range, but avoiding a list creation is good even for small lists.

Even better, Fabio shows a way to find out if we have a builtin without creating a key list with dir() - a solution that, in my opinion, is more pythonic:
try:
   xrange = xrange
except:
   xrange = range

Ned, in the exec_function code sample you're missing two closing parens at the end. Thanks for the great resource.

Thanks, Eli, I've fixed it.

I *never* liked xrange, it was just such an obvious hacked up form of range to provide what range should have done in the first place. Since Py3 finally cleans this up, why perpetuate xrange's ugly legacy? In my Py2-3 cross code, I define range = xrange in the Py2 code, and just use range throughout. Good riddance, xrange!

Please fix the exec example, `exec('code', globals(), locals())` has *always* worked in CPython and also works in Jython 2.5.2 (and now is documented for Python 2.7).
